Wikipedia:Wikipedia Signpost/2005-03-14/Power outage redux

Wikipedia suffered another power outage last Monday, but unlike the previous incident (see archived story), it proved to be a relatively simple problem, and service was restored after a couple of hours.

Developer Kate Turner reported that the Wikimedia Foundation's websites were inaccessible for about two hours (around 13:00 to 15:00 UTC), as a result of a power strip blowing out. Unlike the power outage two weeks earlier, it was not caused by any problems at the colocation facility, but was simply the result of faulty equipment.

The outage affected the switch that connects the server network to the internet, meaning that access to Wikipedia was cut off entirely. Several machines also lost power, but the database server did not, which avoided the extended delay needed last time to restore the database to service.

After power was restored, performance remained slow for a while as the affected machines were brought back online. Grunt reported experiencing technical glitches and said the site "moves about as fast as a zombie". As things began to improve, the site briefly went back into read-only mode on Tuesday. Complaints about performance continued to be registered periodically over the next few days at the OpenFacts status page.

Image server overloaded
Another problem cropped up due to the image server being overloaded. Developer Jamesday reported on Tuesday that the main Wikimedia image server, which hosts images for all Wikimedia projects including the Commons image repository, was experiencing more read requests than it could keep up with.

As a result, images were removed from a number of frequently used templates as a temporary measure to reduce the load. To deal with the problem, a new server with more disk space and memory is being purchased. Jamesday indicated that it would either replace the image server if necessary or else be used as a database server. Additional technical measures are also being designed in the hope of reducing the amount of work the image server needs to do.