Wikipedia:Wikipedia Signpost/2012-05-07/Technology report

April engineering report published
The Wikimedia Foundation's engineering report for April 2012 was published this week on the Wikimedia Techblog and on the MediaWiki wiki, giving an overview of all Foundation-sponsored technical operations in that month (as well as brief coverage of Wikimedia Deutschland's Wikidata project). Three of the headlines for the month have already received coverage in previous issues of the Signpost: the selection of nine Google Summer of Code students (explored in more detail below), the shift to a rapid deployment cycle with the deployment of 1.20wmf1, and a new version of the Wikimedia iOS app. Of the two others, one relates to work on a document detailing Wikimedia engineering’s goals for the next fiscal year, which will be featured in the "Technology report" as soon as it becomes official; the other is the deployment of a new mobile skin, which occurred after the publication of last week's issue. The skin update has since received broadly positive commentary among users; it provides a rival to the more native experience of an Android or iOS app.

Elsewhere, the roundup contained details of a massive reduction in the time taken by searches using Wikimedia's Lucene-based internal search engine, after "months of preparation and refactoring work". The difference reported was "quite amazing": the actual search component of 99% of search requests now takes under a second, down from nine seconds before; and the average search time is now 100 ms, down from 700 ms. Among the other updates included in the report was the news that the localisation team had started coding for a universal language selector, over 18 months after the feature was first proposed.

Coding has begun on a full system of Lua scripting, while April also saw a more than 20-fold improvement in the processing speed of the new parser on template-heavy pages, suggesting that preparations for its rollout (a prerequisite for deployment of the new Visual Editor) will begin shortly. Among the relative failures of the month was the deployment of a new media caching layer ("Varnish"), which, while it has the potential to improve performance and scalability, seems to be preventing users from downloading large files successfully (bug #36577).

Google Summer of Code students and their projects
As announced a fortnight ago, nine students have now been selected to work on MediaWiki this year, supported by Google stipends and WMF mentors (Wikimedia blog). The projects they represent fall across a broad spectrum: some, like Aaron Pramana's project to rethink the display and functionality of Wikimedians' watchlists, involve highly visible changes; others will have a more indirect effect on the average Wikimedian user experience (such as Suhas HS's project to improve the OpenStackManager extension that underpins the virtualisation functionality of Wikimedia Labs and Robin Pepermans' attempts to improve the usability, performance, and coverage of Wikimedia's Incubator for nascent language editions). It will be the first time many of the students have undertaken such ambitious projects in the name of open-source development; they join hundreds of other students worldwide, each working on different projects for different open-source initiatives (several countries are represented even among the WMF's nine students).

In terms of focus, three of the nine projects selected for WMF mentorship relate to media handling: Ankur Anand will work on integrating Flickr upload and geolocation into Wikimedia Commons' UploadWizard; Platonides on a new cross-platform mass media uploader with a broader function set than existing tools, particularly with regard to image upload "campaigns" such as Wiki Loves Monuments; and Harry Burt (the author of this Signpost report) on an upgrade to the Translate extension to allow it to translate Scalable Vector Graphics (SVG) files, with a view to eventual deployment to Wikimedia Commons.

Other projects defy such easy categorisation by topic; these include Akshay Chugh's work on a convention/conference extension for MediaWiki that will ease the job of meetup organisers, Ashish Dubey's attempts to get real-time collaboration integrated into the upcoming visual editor (a topic that hit mailing lists again this week), and last but not least Nischay Nahata's attempts to optimise the performance of the Semantic MediaWiki (SMW) extension. If particularly successful, the last project could clear the way for its use on a test Wikimedia wiki (see previous Signpost coverage).

Students will officially start coding later in the month, although they may begin when they wish. They must present their final work in late August for evaluation.

In brief
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.
 * 1.20wmf2 begins deployment process, 1.19 released: 1.20wmf2, the second "rapid deployment" to Wikimedia sites, began its process of deployment on April 30, 2012, when MediaWiki.org was switched over to the latest version, 20 days after the wiki received 1.20wmf1. It was then deployed to Wikimedia Commons on May 2, 2012, where it will have its greatest impact: the most significant feature of 1.20wmf2 is the introduction of so-called "chunked" uploads, which prevent intermittent connection failures from restarting an upload in its entirety. Representing just those 20 days of development, the deployment (which also hit the English Wikipedia on May 7, 2012, and will be introduced to other Wikipedias on May 9, 2012) includes some 47 bugfixes, among them fixes allowing the translation of edit notices and better display of the Vector skin on "high definition" devices. This week also saw the official release of MediaWiki 1.19 to external sites, allowing their wikis to benefit from a similar range of improvements to those that came to Wikimedia wikis in late February (wikitech-l mailing list).
 * Gerrit upgraded: Gerrit, MediaWiki's new code review system, has been upgraded to version 2.3, incorporating "lots of fixes and various other improvements". However, the most notable feature is the internalisation of Git's submodule system, which allows developers to copy ("clone") and update multiple related repositories at the same time; as developer Chad Horohoe explained, the primary use case is to allow developers to replicate their pre-switchover ability to grab the code from all WMF-hosted extensions simultaneously. The update has not prevented work on a competitor review system: named "Gareth" and developed by long-time volunteer developer Daniel Friesen, the code review system is an attempt to provide a more homegrown, MediaWiki-compatible review experience (also wikitech-l).
 * Two billion mobile pageviews: Mobile devices accounted for some 2.089 billion pageviews in April 2012, as reported on Thursday via the Wikimedia blog. The monthly figure, the first to break the self-set two billion pageview target, represents a 187% year-on-year increase, WMF Senior Manager Amit Kapoor reported on behalf of the WMF mobile team in his blogpost addressing the milestone – a jump which he attributed to improvements in both the standard mobile site and device-specific "apps" during the last calendar year. The figures for non-English Wikipedias look particularly encouraging: among those wikis seeing enormous jumps in mobile device browsing, the Portuguese Wikipedia clocks in with 600% growth, Arabic with 500% growth, and Turkish with 800% growth. Several less-developed countries are also set to gain in the second half of this year with the wider rollout of Wikipedia Zero, although the list of countries is protected by commercial confidentiality agreements.
 * Two performance and two preference problems: Wikimedia wikis suffered two performance issues in the past seven days, one prompting "failed images, scripts, and other static resources" for almost an hour on April 30, 2012 (Wikitech incident documentation), the other causing image corruption and various related and unrelated issues for at least four minutes on May 3, 2012 (Wikimedia Commons Village Pump). Two changes to default preferences (enabling email watchlist notifications and edits-add-to-watchlist by default) also went amiss during the week, the result of a problem in distinguishing between the default preferences of new users and those of older users who had simply never overridden the defaults; consequently, users registered more than a couple of years ago found that their preferences had mysteriously changed overnight, until both changes were reverted pending a proper fix for the issue.
 * Mixed news for New Pages Feed project: As reported in last week's "Technology report", this week should have seen the deployment of one of the two halves of the New Pages Feed (formerly "New Page Triage") project. Bugs encountered during the first attempted deployment of this new "list view" – essentially a successor to Special:NewPages – forced a second attempt to be scheduled later in the week. That attempt was also unsuccessful, prompting a third attempt to be scheduled for May 7, 2012 (PDT). The second half, community liaison Oliver Keyes reported in a newsletter this week, will take the form of a new "toolbar, which we're calling the 'curation bar'; you can see a mockup here. A stripped-down version of this should be ready to deploy fairly soon after the list view is [deployed]". When fully released, the 'curation bar' will contain many options to interact with the page including patrolling it, adding maintenance tags and nominating it for deletion, Keyes says.
 * MathJax preference goes WMF-wide: After a trial installation on the MediaWiki wiki, MathJax has been enabled as a math-rendering preference for all Wikimedia wikis (wikitech-l mailing list). The JavaScript-based system, which has been available as a user script for some time, replaces the usual PNG renderings of mathematical formulae with asynchronously typeset non-image-based representations. Although MathJax has been well received, the escaping of HTML tags (bug #36059) has continued to be an annoyance for its users; indeed, a dozen MathJax-related bugs of varying severity have been filed in the past two weeks.
 * Four bots approved: 4 BRFAs were recently approved for use on the English Wikipedia:
   * Lowercase sigmabot II's 1st BRFA, clearing the Sandbox and reinserting its header;
   * 28bot's 4th BRFA, reverting test edits and removing signatures from articles;
   * Xqbot's 4th BRFA, adding GA and FA signs to interwiki links;
   * Justincheng12345-bot's 1st BRFA, adding, removing and modifying interwiki links.
   At the time of writing, 14 BRFAs are active. As usual, community input is encouraged.
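
For readers wondering how a change meant for new accounts could alter the settings of long-standing users, the preference problem noted above can be sketched in a few lines of Python. This is an illustration only, built on an assumption: that a user's preference is stored only when it has been explicitly set, with everyone else falling back to a shared default. The function and key names are hypothetical and are not taken from the MediaWiki code.

```python
from typing import Dict


def effective_preference(defaults: Dict[str, bool],
                         overrides: Dict[str, bool],
                         key: str) -> bool:
    """Return the user's stored value if one exists, else the shared default."""
    return overrides.get(key, defaults[key])


# Hypothetical preference key, named for illustration only.
KEY = "email_watchlist_notifications"

defaults = {KEY: False}

# An older user who never touched the setting has nothing stored.
untouched_user: Dict[str, bool] = {}
# A user who explicitly switched the setting off has a stored value.
explicit_user = {KEY: False}

before = effective_preference(defaults, untouched_user, KEY)

# The shared default is flipped, with new accounts in mind.
defaults[KEY] = True

after_untouched = effective_preference(defaults, untouched_user, KEY)
after_explicit = effective_preference(defaults, explicit_user, KEY)

print(before, after_untouched, after_explicit)
```

Under this assumption, the user who never touched the setting is indistinguishable from a brand-new account, so flipping the default changes their behaviour too; only the user with an explicitly stored value is unaffected.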