Wikipedia:Wikipedia Signpost/2010-09-06/Technology report

Development team to start publishing updates regularly
As the MediaWiki software behind Wikimedia sites grows and matures, it becomes more complicated to manage and to oversee major changes. As such, the Foundation has begun to bring in more paid contractors and employees (though not many for such a large and popular set of websites), each with their own project. The first in a soon-to-be-monthly series of posts outlining these projects was posted this week on the Wikimedia Techblog. The projects that receive some sort of paid support, rather than being left entirely to the community to develop, include the following. This is not a complete list, and the items are numbered only for convenience:


 * 1) Virginia Data Center – To set up a world-class primary data center for Wikimedia Foundation properties.
 * 2) Media storage – To re-vamp our media storage architecture to accommodate expected increase in media uploads.
 * 3) Monitoring – To enhance both ops and public monitoring to (a) notice potential outages sooner, (b) increase transparency to the community, and (c) support the progress-tracking required in the five-year plan.
 * 4) Article assessment – To collaboratively assess article quality and incorporate reader ratings on Wikipedia.
 * 5) Pending changes – The enwiki trial.
 * 6) Liquid threads – An extension that brings threaded discussion capabilities to Wikimedia projects and MediaWiki.
 * 7) Upload wizard – An extension for MediaWiki that provides an easier way to upload files to Wikimedia Commons, the media library associated with Wikipedia.
 * 8) Add-media wizard – A gadget to facilitate the insertion of media files into wiki pages. Its development is supported by Kaltura.
 * 9) Resource loader – To improve the load times for JavaScript and CSS components on any wiki page.
 * 10) Central notice – CentralNotice is a banner system used for global messaging across Wikimedia projects.
 * 11) Analytics revamp – To incorporate an analytics solution that can grow and answer questions posed by the Wikimedia movement.
 * 12) Selenium deployment – To build an automated browser-testing environment for MediaWiki.
 * 13) Fraud prevention – To focus on integrating new fraud prevention schemes in our credit-card donation pipeline.
 * 14) CiviCRM upgrade – To upgrade our heavily customized CiviCRMv2 install to a mostly stock CiviCRMv3 install.
 * 15) Process improvement – To increase transparency and generally organize the Foundation’s engineering efforts more efficiently.

Further information on each project, including its current status, is available in the original post. Updates on each should be more accessible in future.

ResourceLoader coming "soon"
Developer Trevor Parscal announced on Twitter that his work on a new ResourceLoader to improve loading speeds on Wikimedia sites had progressed and could now be expected "soon". He went into more detail on the Wikitech-l mailing list, explaining the main features to expect:
 * Combines resources together. Multiple scripts, styles, messages to be delivered in a single request, either at initial page load or dynamically; in both cases resolving dependencies automatically.
 * Allows minification of JavaScript and CSS.
 * Dramatically reduces the number of requests for small images. Small images linked to from CSS code can be automatically in-lined as data URLs (when the developer marks it with a special comment), and it's done automatically as the file is served without requiring the developer to do such steps manually.
 * Allows deployment changes to all pages for all users within minutes, without purging any HTML.
 * Provides a standard way to deliver translated messages to the client, bundling them together with the code that uses them.
 * Performs automatic left-to-right/right-to-left flipping for CSS files. In most cases the developer won't have to do anything before deploying.
 * Does all kinds of other cool tricks, which should soon make everyone's lives better.
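
The image in-lining feature in the list above can be illustrated with a minimal sketch. This is not ResourceLoader's actual code: the `@embed` marker name, the size limit, the supported file types and the `read_file` callback are all assumptions made for illustration, since the announcement only says that the developer marks the rule "with a special comment".

```python
import base64
import re

# Assumed marker: the announcement only mentions "a special comment".
# Matches e.g.  /* @embed */ background: url(icon.png);
EMBED_RULE = re.compile(
    r"/\*\s*@embed\s*\*/\s*(?P<prop>[\w-]+)\s*:\s*url\((?P<path>[^)]+)\)\s*;"
)

# Hypothetical, deliberately short list of supported image types.
MIME_TYPES = {".png": "image/png", ".gif": "image/gif"}


def embed_images(css, read_file, max_bytes=1024):
    """Replace marked url(...) references with base64 data URLs.

    read_file is a caller-supplied function mapping a path to bytes.
    Images that are too large or of an unknown type keep their
    ordinary url(...) reference; only the marker comment is dropped.
    """

    def replace(match):
        prop = match.group("prop")
        path = match.group("path").strip("'\" ")
        ext = path[path.rfind("."):].lower() if "." in path else ""
        mime = MIME_TYPES.get(ext)
        data = read_file(path)
        if mime is None or len(data) > max_bytes:
            return f"{prop}: url({path});"
        encoded = base64.b64encode(data).decode("ascii")
        return f"{prop}: url(data:{mime};base64,{encoded});"

    return EMBED_RULE.sub(replace, css)
```

Because the substitution happens when the stylesheet is served, the source CSS keeps its readable file reference while the browser receives the image in the same response as the stylesheet.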

He gave the example of a page that previously required 35 requests (totalling 30kB) now needing just one request of 9.4kB. Gains for users on older hardware or mobile devices might be even greater, he said, since they had previously been served whole scripts they could do nothing with.
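
The drop from many requests to one follows from combining resources and resolving their dependencies automatically. A rough sketch of that idea is given below, using Python's standard `graphlib` module. The module names, the registry layout and the `bundle` function are hypothetical and do not reflect ResourceLoader's real data structures.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: module name -> (dependencies, source text).
MODULES = {
    "jquery": ([], "/* jquery */"),
    "ui.core": (["jquery"], "/* ui.core */"),
    "ui.dialog": (["ui.core", "jquery"], "/* ui.dialog */"),
}


def bundle(requested, modules):
    """Return (load order, combined source) for the requested modules.

    Dependencies are collected transitively, then sorted so that every
    module appears after the modules it depends on.
    """
    needed = {}
    stack = list(requested)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        deps, _source = modules[name]
        needed[name] = set(deps)
        stack.extend(deps)

    # static_order() yields each node after all of its predecessors.
    order = list(TopologicalSorter(needed).static_order())
    combined = "\n".join(modules[name][1] for name in order)
    return order, combined
```

Requesting only `ui.dialog` in this sketch yields one combined payload that also contains `jquery` and `ui.core`, which is the kind of saving the 35-to-1 example describes.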

Google Summer of Code: Jeroen De Dauw
We continue a series of articles about this year's Google Summer of Code (GSoC) with student Jeroen De Dauw, who describes his project to develop a system for managing the extensions installed on a wiki (read full blog post):

My initial proposal was to create an awesome extension management platform for MediaWiki that would allow for functionality similar to what you have in the WordPress admin panel ... I started with porting the filesystem abstraction classes from WordPress, which are needed for doing any upgrade or installation operations that include changes to the codebase. (The current MediaWiki installer can do upgrades, but only to the database.) I created a new extension called Deployment, where I put in this code ... but it turned out that doing filesystem upgrades securely is not an easy task, so after finishing the port, I stopped work on this temporarily. I then poked somewhat at the new MediaWiki installer [due to ship with MediaWiki version 1.17], which is a complete rewrite of the current installer. I made some minor improvements there, and split the Installer class, which held core installer functionality, into a more generic Installer class and a CoreInstaller. This allows for creating an ExtensionInstaller that uses the same base code ...

I decided to create the package repository, from which MediaWiki and extensions could get updates and new extensions, from scratch, and started working on another extension, titled Distribution, for this purpose. I merged it together with a rewritten version of the MWReleases extension written by Chad, which already had core update detection functionality. After the Distribution APIs were working decently I started work on the Special pages in Distribution that would serve as the equivalent of the WordPress admin panel. As I had put off the configuration work, and also the file-system manipulation for the initial version, this came down to simply listing currently installed software, update detection and browsing through extensions available in the repository ...

Special-extensions.png

So, what is the state of the code at the moment? The interfaces that are finished to some extent are:
 * Special:Extensions (pictured): This page lists all installed extensions and allows you to filter on extension type. It’s based on the WordPress "plugins" page and is currently only an improved version of the extension list in Special:Version. It's the only special page added by Deployment that can be viewed by non-administrators. When logged in, however, every extension has a list of links allowing you to do various actions ... A planned feature for this special page is showing update notifications in each extension row.
 * Special:Install: This page allows you to search through available extensions in the repository. The interface is based on the "plugin-install" page of WordPress and allows for searching extensions based on term, tag or author. After performing a search you get a list of matching extensions showing their name, version, authors, description, link to the documentation, and a link to download them. Later on this download link will be replaced by an "Install" one.
 * Special:Update: This page will inform you of any updates to both core and extensions. It behaves basically identically to the WordPress "update" page.

Although some very basic functionality is working, quite some work still needs to be done to get this to the WordPress-awesomeness level.

In terms of Wikimedia sites, developments in this field could improve the turnaround time for extension deployment, but the significant gains will be for spreading extensions to and from other MediaWiki-based sites.

In brief
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.
 * The cause of disappearing images has been identified as a code flaw that allowed the upload to stick in some places on the servers but not in others, where, for extraordinary reasons (such as the recent server transition), some but not all actions were impossible. The upload process will now be consistent in its approach (revision #72021).
 * A two-day public "hackathon" to bring together WMF tech staff and volunteer engineers for "an old-fashioned face-to-face coding sprint" is in the pipeline. This meeting will be formally announced in September and will take place in October on the east coast of the US (Wikitech-l mailing list).
 * Also in the planning stage is an invitational meeting to discuss plans related to "how data is organized, displayed, captured and analyzed on WMF properties" ("Data Summit"). The invite list is to be kept small, but interested parties should apply to  for an invite (Wikitech-l mailing list).
 * Systems Administrator Domas Mituzas blogged about how the Wikipedia database system compares with those of other websites, and about the upgrade from MySQL version 4.0 to 5.1.
 * The last of the Wikimedia Foundation wikis have been switched over to the Vector skin. At the same time, a new "global opt-out" was enabled, allowing users to easily switch back to Monobook for multiple projects simultaneously.
 * MediaWiki developer Yaron Koren published a proposal for Making Wikipedia into a database (cf. Signpost coverage of related proposals). The article expanded on statements by Koren that had been quoted in an article in Technology Review ("Wikipedia to Add Meaning to Its Pages"), and was written for Hatilda Harevi’it ("The Fourth Tilde"), the Signpost's sister publication in the Hebrew Wikipedia, where it appeared in Hebrew translation last week. Although Koren's consultancy company "WikiWorks" specializes in Semantic MediaWiki, he uses a simple CSV data format in the proposal.