Wikipedia:Bots/Requests for approval/Chartbot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Chartbot 2
Operator:

Time filed: 15:34, Friday March 15, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available:

Function overview: Replacement of obsolete links to weekly Billboard charts

Links to relevant discussions (where appropriate):

Edit period(s): one time run

Estimated number of pages affected: 7000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Billboard has changed its URL format

The previous site formatted a URL for a specific chart on a specific date like
 * http://www.billboard.com/charts/r-b-hip-hop-songs?chartDate=2008-01-05

The new site formats the link as
 * http://www.billboard.com/charts/2008-01-05/r-b-hip-hop-songs

Billboard has put a redirect system in place so that links to the old locations redirect to the new ones. Unfortunately, for many years Billboard had a bug that corrupted the URLs it presented, so we have many links like
 * http://www.billboard.com/charts/r-b-hip-hop-songs#/charts/r-b-hip-hop-songs?chartDate=2008-01-05
 * http://www.billboard.com/charts/pop-songs#/charts/pop-songs?chartDate=2012-03-24
 * http://www.billboard.com/column/the-juice/kanye-west-s-g-o-o-d-music-announces-new-1006672952.story#/charts/rap-albums?chartDate=2012-04-21

For the technically minded, the bug was that Billboard would always present the location within Billboard where the session began as the path, and the current location as a fragment. Note the last example above, where the person entered the site to read a column about Kanye West, proceeded to a chart page, and then pasted a link to the chart page into Wikipedia. When Billboard's redirection system encounters such a link, it redirects to the contents of the entry URL, not to the chart named in the fragment, which is what the reference actually points at.
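To make the path/fragment split concrete, here is a minimal Python sketch (not Chartbot's own code, which is written in PHP and not shown here) that takes one of the corrupted links apart with the standard library. Browsers do not send the fragment to the server, which would be consistent with the redirect only ever seeing the entry path.

```python
from urllib.parse import urlsplit

# One of the corrupted links quoted above.
url = (
    "http://www.billboard.com/column/the-juice/"
    "kanye-west-s-g-o-o-d-music-announces-new-1006672952.story"
    "#/charts/rap-albums?chartDate=2012-04-21"
)

parts = urlsplit(url)

# Where the session began: this is all a server-side redirect can act on.
entry_path = parts.path

# Where the editor actually was: the chart and date the reference cites.
# Everything after '#' is fragment, including the '?chartDate=...' text.
cited_location = parts.fragment

print(entry_path)
print(cited_location)
```

Because the `?chartDate=` text sits after the `#`, it is part of the fragment and not a real query string, so `parts.query` is empty for links like this.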

For this task, Chartbot will look for links and fragments of the form ?chartDate=, and replace the URL with the modern form.
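The rewrite described above might be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the bot's actual PHP implementation: the function name `modernize_chart_url`, the regular expression, and the accepted host names are all hypothetical. It checks the fragment first and then the ordinary path and query, and it declines to rewrite when the date is not a real calendar date, on the assumption that a well-formed URL with an impossible date would not resolve.

```python
import re
from datetime import datetime
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical pattern: "/charts/<slug>?chartDate=YYYY-MM-DD" at the end of the text.
CHART_RE = re.compile(
    r"/charts/(?P<slug>[a-z0-9-]+)\?chartDate=(?P<date>\d{4}-\d{2}-\d{2})$"
)

# Assumption: only these host names are treated as Billboard links.
BILLBOARD_HOSTS = {"www.billboard.com", "billboard.com"}


def modernize_chart_url(url: str) -> Optional[str]:
    """Return the new-style weekly chart URL, or None if the link is left alone."""
    parts = urlsplit(url)
    if parts.hostname not in BILLBOARD_HOSTS:
        return None

    # Corrupted links carry the chart in the fragment; clean old-style
    # links carry it in the path and query. Try the fragment first.
    candidates = [parts.fragment, f"{parts.path}?{parts.query}"]
    for candidate in candidates:
        match = CHART_RE.search(candidate)
        if match is None:
            continue
        try:
            # Reject impossible dates such as 2012-02-31.
            datetime.strptime(match.group("date"), "%Y-%m-%d")
        except ValueError:
            return None
        return (
            "http://www.billboard.com/charts/"
            f"{match.group('date')}/{match.group('slug')}"
        )
    return None
```

For example, `modernize_chart_url("http://www.billboard.com/charts/pop-songs#/charts/pop-songs?chartDate=2012-03-24")` returns `http://www.billboard.com/charts/2012-03-24/pop-songs`. A link with extra query parameters after the date is not matched by this pattern and is left untouched.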

Discussion
 MBisanz  talk 16:31, 16 March 2013 (UTC)

Had to halt the trial and repair the bot after this disastrous edit, which turned out to be caused by a bug in a rarely executed branch: the bot had formed a URL that parsed correctly but returned a 404 error from Billboard. The 404 came about because the original URL contained an invalid date. Restarted the repaired bot, and it functioned fine from that point onward. For anyone who reviews the diffs, edits that look like this one are correct: all the spurious text in the old URL was residue from the bug in the Billboard URLs discussed above.&mdash;Kww(talk) 02:30, 17 March 2013 (UTC)
 * While I'm here, I would like to see if people think it's reasonable to go ahead and change links like http://www.billboard.com/#/charts-year-end/hot-pop-songs?year=2011 to http://www.billboard.com/charts/year-end/2011/hot-pop-songs under this same request or if I need to do a new one. It's nearly the same code, and there are thousands of instances.&mdash;Kww(talk) 17:01, 17 March 2013 (UTC)
 * A dry-run of a version that includes the annual charts completed successfully, identifying 14795 links that the bot can successfully repair.&mdash;Kww(talk) 19:00, 18 March 2013 (UTC)
 * You don't need a new one, you can do the extra 14,795 with this approval.  MBisanz  talk 11:50, 19 March 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.