Wikipedia:Bots/Requests for approval/Chartbot 5


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Chartbot 5
Operator:

Time filed: 18:04, Thursday March 28, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available:

Function overview: Updating of obsolete dead links to Billboard.com

Links to relevant discussions (where appropriate):

Edit period(s): one-time plus clean-up run

Estimated number of pages affected: 2000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This is a mild rework of Chartbot 2, picking up some different old URLs, specifically URLs such as:
 * http://www.billboard.com/bbcom/esearch/chart_display.jsp?cfi=417&cfgn=Singles&cfn=Global+Dance+Tracks&ci=3084943&cdi=9307737&cid=07%2F21%2F2007
 * This form can be readily mapped to the current format: the cfi values are all researched and the supported ones are stored in BillboardChartNum. cid converts to a date, but needs its fields rotated.


 * http://www.billboard.com/bbcom/charts/yearend_chart_display.jsp?f=The+Billboard+Hot+100&g=Year-end+Singles&year=1947
 * This form needs only some mild name wizardry on the value of the "f" parameter, after which it maps easily to the current format.


 * http://www.billboard.com/bbcom/yearend/chart_display.jsp?f=Top+Latin+Albums&g=Year-end+Albums
 * This special one-year-only link format linked to the 2008 annual charts and was only live during early 2009. The name wizardry is the same as for the previous form.
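
As an illustration of the first URL form above, here is a minimal Python sketch of pulling out the cfi and cid values and rotating the date fields. It is not Chartbot's actual PHP code. The dictionary is a hypothetical stand-in for BillboardChartNum that holds only the one pair visible in the example URL, the function names are invented for this sketch, and the year-first output order is an assumption, since the current URL format is not spelled out here.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical stand-in for the BillboardChartNum table: legacy cfi -> chart name.
# Only the pair visible in the example URL is included.
BILLBOARD_CHART_NUM = {417: "Global Dance Tracks"}


def rotate_cid(cid):
    """Rotate a legacy MM/DD/YYYY date into YYYY-MM-DD (target order is an assumption)."""
    month, day, year = cid.split("/")
    return f"{year}-{month}-{day}"


def parse_legacy_chart_url(url):
    """Return (chart_name, rotated_date) for a supported legacy URL, else None."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also decodes %2F and '+'
    try:
        cfi = int(query["cfi"][0])
        date = rotate_cid(query["cid"][0])
    except (KeyError, ValueError):
        return None  # a parameter is missing or malformed
    chart = BILLBOARD_CHART_NUM.get(cfi)
    if chart is None:
        return None  # unsupported chart: leave the link alone
    return chart, date
```

Returning None for anything unrecognised keeps the sketch conservative: a link is only rewritten when both the chart number and the date are understood.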

For those watching, I pretty much know what Chartbot 6 and Chartbot 7 are going to do, and then this will be done. The Chartbot framework is best suited for picking up groups of URLs that perform a similar function and replacing them with one kind of new URL, which makes it easier to process these things in batches. &mdash;Kww(talk) 18:04, 28 March 2013 (UTC)

Discussion
 MBisanz  talk 04:49, 29 March 2013 (UTC)
 * This was far more work than I had anticipated, primarily due to information rot. I was able to recover the syntax of earlier links quite readily, but found that due to changes in the site over the last decade, much of the data is now stored in different places. The big change is that many charts no longer have weekly data stored for them, but do have the peaks stored if the artist is provided as a key. Chartbot 5 had to incorporate infobox scanning: if a URL that points to a weekly chart entry can't be constructed, the bot retrieves all the infoboxes from the article. If there is at least one infobox of the following types:
 * person
 * musical artist
 * musician awards
 * album
 * single
 * song
 * artist discography
 * then Chartbot will validate that all the infoboxes in the article refer to the same artist. If so, it constructs a URL that points to the appropriate chart type entry for the artist. The trial ran from 31 March 2013 18:57 to 31 March 2013 19:39.&mdash;Kww(talk) 20:00, 31 March 2013 (UTC)
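
The infobox check described above could look roughly like the following Python sketch. It is an illustration only, not the bot's PHP implementation. It assumes each infobox has already been parsed into a dict with hypothetical "type" and "artist" keys, and it compares artist names after trimming and case-folding, which is also an assumption.

```python
# Infobox types named in the discussion above.
SUPPORTED_TYPES = {
    "person", "musical artist", "musician awards",
    "album", "single", "song", "artist discography",
}


def single_artist(infoboxes):
    """Return the one artist shared by all infoboxes, or None.

    Each infobox is a dict with the hypothetical keys "type" and "artist".
    """
    if not any(box.get("type") in SUPPORTED_TYPES for box in infoboxes):
        return None  # no infobox of a recognised type
    artists = {(box.get("artist") or "").strip().casefold() for box in infoboxes}
    if len(artists) != 1 or "" in artists:
        return None  # missing or conflicting artists
    return infoboxes[0]["artist"].strip()
```

A None result would mean the article is skipped rather than guessed at, in line with the validation step described above.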
 * Do you need another test or do you think the validation code is solid?  MBisanz  talk 21:13, 31 March 2013 (UTC)
 * I've done a dry run against 1400 articles, and yes, it's solid. I'm ready to go. Chartbot has a special "dry" mode, where it redirects the final article output to my local drive, and that permits me to test things pretty well.&mdash;Kww(talk) 21:17, 31 March 2013 (UTC)
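
For readers unfamiliar with the idea, a "dry" mode of this kind can be sketched in a few lines of Python. This is not Chartbot's code: the function name, the save_live callback, and the output directory are all hypothetical.

```python
from pathlib import Path


def publish(title, text, save_live, dry_run=True, out_dir="dry_run_output"):
    """Save the article live, or in dry mode write it to a local file instead."""
    if not dry_run:
        save_live(title, text)  # hypothetical callback that performs the real edit
        return None
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # Replace path separators so a title such as "AC/DC" stays a single file name.
    target = folder / (title.replace("/", "_") + ".txt")
    target.write_text(text, encoding="utf-8")
    return target
```

Defaulting dry_run to True means a live edit only happens when it is requested explicitly, and the local files can be compared against the current article text before the real run.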
 *  MBisanz  talk 21:58, 31 March 2013 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.