Wikipedia:Bots/Requests for approval/KSFT bot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

KSFT bot
Operator:

Time filed: 19:22, Thursday, June 23, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: https://github.com/KSFTmh/football-bot

Function overview: This bot should add football player articles to team categories.

Links to relevant discussions (where appropriate): WP:BOTREQ

Edit period(s): One-time run

Estimated number of pages affected: I plan to try to run it on as many football player biography articles as I can. Based on Category:Association football players by nationality, which I could use to find all of them, there appear to be on the order of tens of thousands.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: This bot should go through articles about football players, check the infobox parameters "clubs1" through "clubs40" and "currentclub", then add the article to categories of the format "Category:[team] players" for each team the person has played on. This is the first time I've written a Wikipedia bot, so I expect that it will take a while for me to get it working correctly.
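The parameter scan described above might look roughly like the sketch below. It is an illustration only, not the code in the linked repository: the function names `club_targets` and `category_for` are hypothetical, and it assumes each infobox parameter sits on its own line and that the club appears as a wikilink.

```python
import re

# Matches lines such as "| clubs1 = [[Manchester United F.C.|Manchester United]]".
# Assumption: one parameter per line, as in most football infoboxes.
PARAM_RE = re.compile(
    r"^[ \t]*\|[ \t]*(currentclub|clubs(\d+))[ \t]*=[ \t]*(.*)$", re.MULTILINE
)
# Captures the link target, i.e. the part before any "|" or "#".
LINK_RE = re.compile(r"\[\[([^\]|#]+)")


def club_targets(wikitext):
    """Return linked club titles from currentclub and clubs1..clubs40, in order."""
    clubs = []
    for match in PARAM_RE.finditer(wikitext):
        number = match.group(2)
        if number is not None and not 1 <= int(number) <= 40:
            continue  # outside the clubs1..clubs40 range named in the request
        link = LINK_RE.search(match.group(3))
        if link:
            target = link.group(1).strip()
            if target not in clubs:
                clubs.append(target)
    return clubs


def category_for(club):
    """Build the category title in the "Category:[team] players" format."""
    return f"Category:{club} players"
```

Parameters with no wikilink in their value are skipped, which is one of the places where a real run would need extra handling.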

Discussion
Is this something that there has been a discussion on elsewhere? (Are you the sole "requester" of this task?) — xaosflux  Talk 19:34, 23 June 2016 (UTC)
 * It was requested on WP:BOTREQ. Is this something that needs consensus? It seems like uncontroversial maintenance to me, but, again, I have no other experience with bots on Wikipedia. I wanted to learn more about writing bots, and this request looked easyish, so I took it. KSF  T C 19:41, 23 June 2016 (UTC)
 * I just noticed you left the links to discussion part blank. — xaosflux  Talk 19:43, 23 June 2016 (UTC)
 * I didn't think that was a "discussion", but I've added it now. KSF  T C 19:57, 23 June 2016 (UTC)
 * For adding to Category:[team] players - will you only be adding if the category exists? — xaosflux  Talk 19:46, 23 June 2016 (UTC)
 * I'm...not sure. Should I? KSF  T C 19:57, 23 June 2016 (UTC)
 * WP:REDNOT says no. So you will need to check if the category exists each time, prior to adding it to articles. —  xaosflux  Talk 01:26, 24 June 2016 (UTC)
 * I just changed it. It should now only add categories that exist. KSF  T C 01:37, 24 June 2016 (UTC)
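As a rough sketch of the existence check discussed above (not the bot's actual change), the MediaWiki API can be asked about several titles at once; with `formatversion=2`, a page that does not exist carries a `missing` flag in the response. The helper names `build_exists_query` and `existing_titles` are hypothetical, and fetching the URL is left out so the parsing step can be seen on its own.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_exists_query(titles):
    """Build a query URL asking the API about a batch of page titles."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # pages come back as a list, flags as booleans
        "titles": "|".join(titles),
    }
    return API_URL + "?" + urlencode(params)


def existing_titles(response):
    """Return the titles from a decoded API response that actually exist."""
    pages = response.get("query", {}).get("pages", [])
    return {
        page["title"]
        for page in pages
        if not page.get("missing") and not page.get("invalid")
    }
```

Only categories returned by `existing_titles` would then be added to an article, so no red-linked categories are created.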

As the person who raised this at the Bot requests page, I didn't want the bot to add the categories; I wanted it to identify missing categories and produce a list. There are going to be lots of complications with club identification which will need further sorting, so I thought it would be good for the bot to produce an output list which we can work on. The specific issues are:
 * 1) Missing categories (which has already been mentioned). I would like to create many of these, so having the bot skip them won't help.
 * 2) Renamed clubs – many clubs have changed names over the years (e.g. Newton Heath became Manchester United), so whilst the categories are at the current names, players who played for the clubs at the time of their former names would have that in the infobox, and so a matching category to add players to will not exist.
 * 3) Misnamed clubs – many articles do not have exactly the correct name of the club in the infobox (for instance linking to Manchester United rather than Manchester United F.C. or Wrexham F.C. rather than Wrexham A.F.C.), so the bot adding categories would in theory miss these out as there is no category for the incorrect link.
Happy to answer any more questions. Cheers, Number 57 12:05, 24 June 2016 (UTC)
 * Oh, I misunderstood the request. Why do you want a list when a bot can fix it automatically instead? KSF  T C 12:50, 24 June 2016 (UTC)
 * Because I don't think it's possible to programme it to spot all the potential issues mentioned in points 2 and 3 above (there will be thousands of exceptions). Plus you've said it won't be creating new categories, so there needs to be a list of the missing ones to be created anyway. Number   5  7  15:39, 24 June 2016 (UTC)
 * It currently doesn't create categories only because I thought that wasn't what you wanted; it would be easy to change that. If you just want a list of articles with discrepancies between the infobox and the categories, we wouldn't need a bot that edits pages, so we don't need this BRFA. I can generate a list like that, but I'm not sure how a bot would be able to tell whether a team had changed its name. KSF  T C 19:19, 24 June 2016 (UTC)
 * If you can generate a list, that would be great – I can use Excel or something to sort it. How do you generate such a list out of interest? Number   5  7  16:23, 26 June 2016 (UTC)
 * I can use the API to get all the articles in a category, like Category:English footballers, and then check the "currentclub" and "clubs1" through "clubs40" parameters of the infobox, then check category links and compare them. I can check (imperfectly) for misnamed clubs by comparing the first words of the names; if they're the same, but the whole text is different, there's probably a mistake. There appear to be thousands to tens of thousands of articles just in Category:English footballers with discrepancies. If you're sure you don't just want them fixed automatically, I can withdraw this BRFA and give you a list. KSF  T C 16:29, 26 June 2016 (UTC)
 * I'd rather have the list if possible – I am worried there will be too many exceptions to make a bot workable. Cheers, Number   5  7  16:33, 26 June 2016 (UTC)
 * I'm now running it on that category. I'll put the list somewhere in my userspace and update here when it's done, which should be in a few minutes. Once we've confirmed that it's mostly working, I'll run it on more articles. Let's continue this discussion on my talk page so I can withdraw this request and it can be closed. KSF  T C 16:44, 26 June 2016 (UTC)
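The comparison described in this thread could be sketched as follows. This is only an illustration of the first-word heuristic, under the assumption that the clubs from the infobox and the article's categories have already been collected; `classify` and `discrepancies` are hypothetical names, and the status labels are invented for the example.

```python
PREFIX = "Category:"
SUFFIX = " players"


def classify(club, categories):
    """Compare one infobox club against an article's categories.

    Returns ("ok", category) on an exact match, ("possible_misname", category)
    when a "[team] players" category shares the club's first word, and
    ("missing", None) otherwise. The first-word test is deliberately crude.
    """
    expected = f"{PREFIX}{club}{SUFFIX}"
    if expected in categories:
        return ("ok", expected)
    words = club.split()
    if words:
        first = words[0].casefold()
        for cat in sorted(categories):
            if cat.startswith(PREFIX) and cat.endswith(SUFFIX):
                team = cat[len(PREFIX):-len(SUFFIX)].split()
                if team and team[0].casefold() == first:
                    return ("possible_misname", cat)
    return ("missing", None)


def discrepancies(article, clubs, categories):
    """List (article, club, status, candidate) rows for every non-matching club."""
    rows = []
    for club in clubs:
        status, candidate = classify(club, categories)
        if status != "ok":
            rows.append((article, club, status, candidate))
    return rows
```

The heuristic cannot tell Manchester United from Manchester City, and it does nothing for renamed clubs such as Newton Heath, which is why the output is a list for manual review rather than a set of automatic edits.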

Per the above discussion, this bot is withdrawn by operator. KSF  T C 16:46, 26 June 2016 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.