Wikipedia:Wikipedia Signpost/2010-08-02/Technology report

MediaWiki 1.16 released
The release of the latest version of MediaWiki (Version 1.16) was announced this week (Wikimedia Techblog); a separate, minor update (Version 1.15.5) was also released for operators unwilling or unable to upgrade fully. Both versions were billed as fixing an important "data leakage vulnerability" (wikitech-l mailing list). The milestone has little inherent significance for Wikimedians, since Wikimedia Foundation wikis run their own version of the MediaWiki software, which is usually well ahead of the official release. MediaWiki was originally developed with Wikipedia in mind but is now in use in some form or other on a number of other popular sites, including the commercial wiki host Wikia. System administrators of these other installations are encouraged to upgrade, both for security reasons and to take advantage of features introduced since the 1.15 milestone, reached more than a year ago. However, Wikimedians can be hopeful that the release is a sign of strength in the development community.

Study of web passwords includes Wikipedia
The handling of user passwords on 150 websites was analysed in a recent study. Joseph Bonneau and Sören Preibusch, researchers from the University of Cambridge who conducted the study (The password thicket: technical and market failures in human authentication on the web, see also blog post and downloadable data), called it "the first large-scale empirical analysis of password implementations deployed on the Internet". Wikipedia received a "password security score" of 4 out of 10, falling short of the optimal score with respect to several evaluation criteria: the password selection advice does not prohibit dictionary words, a minimum length (>1) is not required, the use of numbers or symbols in the password is not enforced, federated identity services are not supported (although a MediaWiki extension for OpenID exists), the user list is not protected from probing (the list is intentionally available), and TLS is normally not used to protect password submissions (the password is sent in cleartext when logging in. However, the secure server provides encrypted connections).

Asked by The Signpost for comment, Sören Preibusch said: Wikipedia exhibits a unique set of password practices [see "Clustering" on p. 28]. The site is doing a decent job in preventing password guessing by requiring captcha-solving after three attempts -- one of the lowest limits observed in the market. Creating a random new password instead of sending out the old password during password reset is another positive feature. However, Wikipedia makes it easy to probe usernames through the enrolment, log-in, and reset forms. Whilst this is a deliberate and documented practice, and usernames associated with administrative privileges are also available through published lists, it leads to a lower password score in our survey.

Much security could be gained by making encrypted transmission of the password the default. Imposing a minimum length is another low-hanging fruit. Similarly, a graphical password strength indicator could complement the ample password advice already available on the sign-up page. Given the technology-savvy population of Wikipedia account holders, HTTP Digest authentication may improve security without making TLS the default.

Wikipedia's threat model and its specific motivations for deploying passwords, such as reputation-building and persistent display preferences, would seem to make OpenID a viable alternative to passwords. I think it is unfortunate that Wikipedia is not yet OpenID-enabled.

See also past Signpost coverage about password security on Wikipedia: Four administrator accounts desysopped after hijacking, vandalism, Administrator status restored to five accounts after emergency desysopping (about a 2007 incident which led to some changes in MediaWiki and the start of the page Security), Blank passwords eliminated for security reasons (2006), Password security upgraded after Slashdot furor (2005, about an incident after which salted passwords were introduced).

95% of MediaWiki installations said to have a "serious vulnerability"
In an unrelated announcement, research published by Qualys – a private software security firm – has shown that 19 in every 20 MediaWiki installations are running software old enough to include "serious vulnerabilities", compared with fewer than 1 in 20 Wordpress installations (Wikimedia Techblog). Developer Tim Starling (one of only a handful of paid MediaWiki programmers) explained the startling figure: While WordPress's web-based upgrade utility certainly has a positive impact on security, I feel I should point out that what WordPress counts as a serious vulnerability does not align with MediaWiki’s definition of the same term. For instance, if a web-based user could execute arbitrary PHP code on the server, compromising all data and user accounts, we would count that as the most serious sort of vulnerability, and we would do an immediate release to fix it.... in WordPress, they count this as a feature, and all administrators can [execute such code].... If you are running MediaWiki in a CMS-like mode, with whitelist edit and account creation restricted, then I think it's fair to say that in terms of security, you're better off with MediaWiki.

However, the statistics presented by Qualys show that an alarming number of people are running versions of MediaWiki older than 1.14.1, which was the most recent fix for an XSS vulnerability exploitable without special privileges. There is certainly room for us to do better.

In brief
Note: not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.
 * Daniel Kinzler (User:Duesentrieb, a MediaWiki developer employed by Wikimedia Germany) has written a new program to work with Wikipedia's category structure (CatGraph), using the Neo4j graph database. The long-term goal is to provide category-based search (especially deep category intersection), replacing his own CatScan tool.
 * Erik Zachte, WMF data analyst, notes that the underreporting of pageview counts has been repaired for recent months (and the cause of the problem has been identified and removed) and that there now exists a new summary report for Wikimedia page views that "presents trends for nearly all projects on a single page".
 * Bug #24564 has been fixed, restoring use of " " in the API, broken by recent updates.