Wikipedia:Wikipedia Signpost/2010-09-27/Technology report

CTO Danese Cooper does Office Hours
Wikimedia's Chief Technical Officer Danese Cooper last week took part in an "office hours" discussion on IRC. She spoke about her own role and the state of Wikimedia's technical departments, answered questions from the community, and gave insights into what the future might hold for Wikimedia (public logs). Although discussion was fragmented, a number of important points were touched on:
 * the hiring of Zak Greant, a technology writer, and of a volunteer engineering coordinator later this year, to help bridge the communication gap between paid staff and the development community;
 * that the future of the Toolserver (currently paid for and hosted in Germany by Wikimedia Deutschland) has been discussed with WMDE, although the Foundation is "not ready" to disclose the conclusions;
 * that plans for the "Data Summit" announced in July, which is to include discussion on structured data, analytics and research, have not been finalised;
 * a reiteration of her priorities, as outlined earlier this month, particularly uptime; and
 * the news that the Foundation does intend to announce a financial assistance program to help bring developers together at events such as next month's "Hack-a-ton" (see last week's Signpost).

Video developments for Wikimedia Commons
Michael Dale, a Kaltura employee working with the Wikimedia Foundation to build easier ways of using the power of video in Wikimedia projects, this week announced the creation of a free "video sequencer" for Commons (Wikimedia techblog). The sequencer, which allows users to remix existing and new video, audio, text and images into single video sections (see example, right), was described by Kaltura as "a stepping stone in the world of online media". It requires a modern browser, and performs best in the Firefox 4 betas. It is hoped the sequencer will bridge the gap between images, which are relatively common on Wikimedia projects, and videos, which are relatively rare, and that it will be used to create documentary-style overview introductions to topics on Wikipedia, among other uses.

As the capabilities around video are refined and expanded, a worry has been that increased usage of video would impose a significant additional cost on the Foundation, especially due to bandwidth usage. Michael Dale announced a cooperation with P2P-Next, who presented at Wikimania this year. Their technology makes it possible to download the videos over a peer-to-peer network. All you need to do is enter the mwEmbed video pilot and install the P2P-Next Swarmplayer Firefox plugin (a plugin for Internet Explorer and a MacOS version of Swarmplayer are still in development). After you have viewed a video, your browser will share it with other viewers and thus alleviate the strain on the Foundation's resources. It is claimed that the sharing is configurable and will not get in the way of your browsing experience.

Google Summer of Code: Stephen LaPorte
We conclude a series of articles about this year's Google Summer of Code (GSoC) with Stephen LaPorte, a law school student, who describes his project to create a tool to format judicial decisions, legal scholarship, and statutes for Wikipedia's sister project Wikisource: Wikisource should be a repository of statutory law, judicial decisions, and legal scholarship. Prof. Timothy K. Armstrong identified Wikisource as a solution to the architectural limitations of existing repositories for judicial decisions and legal scholarship. Prof. Armstrong listed three obstacles for Wikisource: legal, content, and cultural issues. The legal and cultural issues can be addressed through education and outreach. This project addresses the problem of content.

A tool to format judicial decisions and statutes will help users move text that is already electronically available and in the public domain to Wikisource, solving the "chicken-and-egg" problem that Wikisource currently faces. Once Wikisource has a substantial body of legal sources, users will gain value from and improve the coverage of those legal sources.

Stephen worked on four such tools: importing U.S. Supreme Court cases (example), importing the current U.S. legal code (example), wikifying legal citations (tool) and helping categorise U.S. Supreme Court cases (tool).

In brief
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.
 * Since Brion Vibber, former Wikimedia CTO and general MediaWiki guru, departed almost a year ago, the code review workflow has been continuously playing catch-up. The process, which determines how long it takes to get code from the sandbox to live Wikimedia sites, has slowed down considerably, with only Tim Starling responsible for monitoring it. The amount of work is only increasing, while Tim will have less time available next month because he expects to become a father. For this reason the Foundation has rehired Brion under a temporary two-hour-per-day contract, while continuing to look for a more permanent solution to this problem (Wikimedia announcement).
 * The MediaWiki API's " " module was briefly deactivated over performance concerns stemming from the Extension:ImageAnnotator gadget, before being re-enabled.
 * In an unrelated development, the ability to flag parameters as "required" on the API was added, causing an error to be triggered if they are left unset.
 * Adding the text  to a page will now prevent bots running on the popular pywikipedia framework from moving interwikis to the bottom of the page (as is the norm for the vast majority of pages).
 * There was a discussion on the wikitech-l mailing list about when non-WMF MediaWiki users could expect version 1.17 of the software. No firm date emerged, but User:RobLa summed up the mood: "I think we can all agree that it doesn't make sense to cut a new release of MediaWiki until the ResourceLoader stabilizes". That point may be some time off yet. Neil Kandalgaonkar questioned whether fixed releases were required at all.
 * User:RobLa has invited collaboration on October's WMF Engineering Overview (wikitech-l mailing list).
 * Developer Trevor Parscal opened a discussion on the idea of more clearly defining which code should be in the "core" MediaWiki setup and which should be classed as optional extensions.
 * With the resolution of bug #16574, it will eventually be possible to exclude specific IP addresses from the account creation limit. This would be useful, for example, when Wikipedia is used in an educational or event context, where at present everyone after the first six users on a shared address is unable to create an account.
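
To illustrate the idea behind the last item, here is a minimal sketch in Python of a per-address account creation limit with an exemption list. It is not MediaWiki's implementation (MediaWiki is written in PHP); the class name, method names and example addresses are hypothetical, and the limit of six per address is simply the figure mentioned above.

```python
import ipaddress
from collections import Counter


class AccountCreationThrottle:
    """Hypothetical per-IP account creation limit with an exemption list."""

    def __init__(self, limit=6, exempt=()):
        self.limit = limit
        # Exempt entries may be single addresses or CIDR ranges.
        self.exempt = [ipaddress.ip_network(e, strict=False) for e in exempt]
        # Simplification: the counter never resets here, whereas a real
        # limit would apply per time window.
        self.created = Counter()

    def is_exempt(self, ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.exempt)

    def try_create(self, ip):
        """Return True if an account may be created from this address."""
        if self.is_exempt(ip):
            return True
        if self.created[ip] >= self.limit:
            return False
        self.created[ip] += 1
        return True


# A workshop network (documentation range 192.0.2.0/24) is exempted.
throttle = AccountCreationThrottle(limit=6, exempt=["192.0.2.0/24"])

# Eight attempts from an ordinary shared address: the last two are refused.
classroom = [throttle.try_create("198.51.100.7") for _ in range(8)]

# Eight attempts from the exempted network: all are allowed.
workshop = [throttle.try_create("192.0.2.15") for _ in range(8)]
```

In this sketch the exemption is checked before the counter, so requests from an exempted address never count towards the limit.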