Wikipedia:Wikipedia Signpost/2011-11-07/Technology report

October Engineering Report published
The Wikimedia Foundation's Engineering Report for October was published last week on the Wikimedia Techblog and on the MediaWiki wiki, giving an overview of all Foundation-sponsored technical operations in that month. Many of the projects mentioned have been covered in The Signpost, including the New Orleans hackathon, the introduction of native HTTPS support for all Wikimedia wikis, and the local deployment of MediaWiki 1.18. Also included among the headlines on the report were the deployment of the Translate extension to Meta-Wiki and the completion of the first revision of the MediaWiki architecture document.

As described in brief in last week's "Technology report", progress is being made on a new parser and visual editor combination. The official engineering report documented exactly where that progress was coming from, with at least six developers (Trevor Parscal, Inez Korczynski, Roan Kattouw, Neil Kandalgaonkar, Brion Vibber and Gabriel Wicke, who only joined the team recently) each working on different elements of it concurrently.

Progress in other areas was slower but continued; for example, developer Andrew Garrett worked on a script to convert existing LiquidThreads installations to the new, revised schema. Likewise, the "last critical bugs" in version 0.9 of WMF-supported offline reader Kiwix were fixed, with the release candidate cycle expected to begin shortly. There was also some bad news in the report, however: data analyst Erik Zachte had discovered inconsistencies in his report card numbers, which were investigated and attributed to packet loss of up to 25%, rendering several figures unreliable.

Scheduled for November are substantive work on the Git conversion and HTTPS support on the mobile platform to mirror that available on the non-mobile site.

User script writers grapple with protocol-relative URLs
With the switchover to native HTTPS slowly fading into history, the baton for ensuring total security has been passed on to script writers. Although all interface images were switched over to protocol-relative URLs (addresses that begin with // and so inherit the protocol of the page that loads them), many user scripts still hard-code http:// addresses and will have to be updated to use the new format.

Forcing use of insecure images or dependency scripts negates much of the benefit of using a secure site; as a consequence, browsers are right to show warnings, Ryan Lane explained. And as Brion Vibber described, the warnings are often very obvious: "Firefox can throw up a scary dialog box on every page view... Chrome does the big scary X-ing out of the 'https'... IE in latest versions just ignores any of the content that came over HTTP unless you opt back into it by clicking on a little bar at the bottom of the window" (Words and what not blog).

And so, with increasing numbers of users expected to switch to the HTTPS version of the site, more and more script developers have been working to clear up any warnings; nonetheless, help will be needed within smaller sites to fix code copied and pasted from larger wikis months or even years before HTTPS support went native.
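
For maintainers auditing copied scripts, the rewrite itself is mechanical. The following is a minimal sketch in Python, not a tool used by the WMF: the function name `to_protocol_relative` and the host list `DUAL_PROTOCOL_HOSTS` are hypothetical, and the list is an assumption about which hosts serve the same content over both HTTP and HTTPS. It replaces hard-coded `http://` prefixes for those hosts with `//` and leaves every other URL untouched.

```python
import re

# Assumption for illustration: hosts believed to serve identical content
# over both HTTP and HTTPS. Adjust this list for the wiki being audited.
DUAL_PROTOCOL_HOSTS = (
    "upload.wikimedia.org",
    "bits.wikimedia.org",
    "en.wikipedia.org",
)

# Matches "http://" followed by a host name; "https://" never matches
# because the literal "http://" requires the colon right after "http".
_HTTP_URL = re.compile(r"http://([A-Za-z0-9.-]+)")


def to_protocol_relative(source, hosts=DUAL_PROTOCOL_HOSTS):
    """Return script source with http:// URLs for known hosts made protocol-relative."""

    def rewrite(match):
        host = match.group(1)
        if host.lower() in hosts:
            # Drop the scheme so the browser reuses the page's own protocol.
            return "//" + host
        # Unknown host: leave the URL as written rather than risk breaking it.
        return match.group(0)

    return _HTTP_URL.sub(rewrite, source)


if __name__ == "__main__":
    sample = 'img.src = "http://upload.wikimedia.org/a.png";'
    print(to_protocol_relative(sample))
```

Restricting the rewrite to a known host list is a deliberate choice in this sketch: a protocol-relative URL only works if the target host actually answers over HTTPS, so third-party addresses are better reviewed by hand.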

In brief
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.


 * 1.18 beta 1 released: following the successful deployment of MediaWiki 1.18 to Wikimedia wikis earlier this month (Signpost coverage), the first "tarball" version of the software has been released (wikitech-l mailing list). Although tarball releases enable non-Wikimedia wikis to utilise MediaWiki's latest features, the advice for this beta version is to avoid "run[ning] it on any wikis that you really care about". The release is the first to come packaged with extensions, which enables a cleaner "straight out of the box" experience.
 * CAPTCHA vulnerabilities exposed: Several publications, among them PC World and CRN.com, picked up on a recent report of efforts by researchers at Stanford University to break the CAPTCHA security measures which form part of major websites' protection against spamming. One in four attempts at cracking Wikipedia's CAPTCHA defences was successful, compared with 43% for eBay, 20% for Digg, 16% for CNN.com and 5% for Baidu. Only Google and reCAPTCHA were invulnerable to infiltration attempts, in an experiment which may raise eyebrows among WMF developers keen to prevent abuse of Wikimedia wikis. Nonetheless, Wikimedians can no doubt take pride in the fact that obvious spamming of the sort orchestrated by bots and normally prevented by CAPTCHAs is, in any case, unlikely to go unnoticed for long.
 * Data analytics staffing explained, hiring: Director of Platform Engineering Rob Lanphier this week posted a description of the WMF's data analytics team on the Wikimedia blog. Lanphier explained that the team is responsible for "building out our logging and data mining infrastructure, and for making Wikimedia-related statistics useful to other parts of the Foundation and the movement". He added that it was "hiring for two full-time analytics positions right now, plus a contract opportunity".
 * Puppet, LDAP, autofs, and Nova: Ryan Lane, Operations Engineer at the WMF, used two posts on his personal blog to explore the work he has recently been doing for the Foundation. Both "Sharing home directories to instances within a project using puppet, LDAP, autofs, and Nova" and "A process for puppetization of a service using Nova" were targeted at sharing best practice among those responsible for the performance of large websites; in related news, this week's Engineering report highlighted how the WMF were working with the JuJu project to enable the sharing of DevOps practices.
 * SMWCon presentation videos available: According to a post on the Semantic MediaWiki website, videos of talks and presentations from SMWCon Fall 2011 have been uploaded to YouTube. The conference was held on September 21–23 in Berlin, Germany, with approximately forty attendees.