Wikipedia:Bots/Requests for approval/SportsStatsBot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

SportsStatsBot
Operator: and co-botop

Time filed: 00:12, Saturday, March 18, 2017 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: User:DatBot/footycode

Function overview: Bot automatically updates football (soccer) league tables

Links to relevant discussions (where appropriate): Special:Permalink/770569052

Edit period(s): Checks every 30 mins

Estimated number of pages affected: Minimum 2 templates, excluding transclusions. Minimum 53 transclusions.

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: The bot automatically updates league tables. It takes input from the sites configured at User:SportsStatsBot/footyconfig and edits the template directly. I believe the only manual step would be editing the bot's settings for relegations and promotions. It would also be possible to turn off any league the bot manages if some unusual event caused the source to update incorrectly.

Discussion

 * This seems like a very worthwhile task for a bot - I can't see any policy reason for objection. Would the bot automatically shut off 30 minutes after the last game in the season has been played to avoid any errors in the edits regarding promotion/relegation? TheMagikCow (T) (C) 09:43, 18 March 2017 (UTC)
 * I'm not sure whether the page would have SEASON ENDED that I could look for in the html, but I believe it shan't edit if there have been no matches. Dat GuyTalkContribs 11:58, 18 March 2017 (UTC)
 * I'm not sure about looking in the html, but perhaps if the bot has not edited a season in - say 4 weeks - it would stop editing that template until it has been restarted with the updated information? I don't think it's a huge problem, though. TheMagikCow (T) (C) 19:59, 18 March 2017 (UTC)
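
The inactivity cut-off suggested above could look roughly like the following. This is only a minimal sketch: the function name `should_pause` and the constant `STALE_AFTER` are hypothetical and are not taken from the bot's source, and the four-week threshold is simply the figure floated in the comment.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold, taken from the "say 4 weeks" suggestion in the thread.
STALE_AFTER = timedelta(weeks=4)


def should_pause(last_edit: datetime, now: datetime,
                 stale_after: timedelta = STALE_AFTER) -> bool:
    """Return True when the last bot edit is older than stale_after.

    A paused template would then stay untouched until an operator
    re-enables it with the new season's settings.
    """
    return now - last_edit > stale_after


# Example call with made-up timestamps.
now = datetime(2017, 6, 30, tzinfo=timezone.utc)
paused = should_pause(datetime(2017, 5, 21, tzinfo=timezone.utc), now)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.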


 * Note to any BAG member reading this: I plan on expanding this to include infoboxes. Should I make a new bot account? Dat GuyTalkContribs 11:58, 18 March 2017 (UTC)
 * I've created a new account. cc . Dat GuyTalkContribs 18:41, 25 March 2017 (UTC)

I could make an option called 'dryrun' which outputs the result of the dry run to a subpage such as User:SportsStatsBot/dry/[leaguename]x (x = Number of dryrun). Dat GuyTalkContribs 14:31, 29 March 2017 (UTC)
 * BAGAssistanceNeeded Comments? Dat GuyTalkContribs 18:30, 24 March 2017 (UTC)
 * Seems pretty nifty! What templates will the bot be editing, and how many transclusions do they have? If we're talking just a handful of pages then it's no biggie, but if there are hundreds or thousands, then that changes everything. The config page will have to at least be semi-protected, and if the bot is going to affect a lot of pages we may have to consider moving it to .js so only the botop and admins can edit it. Another important thing for this task is to properly handle edit conflicts, otherwise you may overwrite someone else's changes. Your code doesn't seem to do this, but then again as you know I don't speak Python very well :) I would also recommend that this task be exclusion compliant, especially if we're going to be editing in the mainspace. Just so you know, I'm off to WMCON tomorrow and won't be back till 3 April, so pre-apologies if I'm not very responsive. No need to wait for me, other BAGgers feel free to take over &mdash; MusikAnimal  talk  04:59, 25 March 2017 (UTC)
 * It'll edit the templates themselves. For example, it will edit Template:2016-17 Premier League table and not the pages it is transcluded on. It isn't currently added, but I could make it not edit if there's a conflict. Hope you have a great time in Berlin. Dat GuyTalkContribs 14:26, 25 March 2017 (UTC)
 * That template is transcluded 27 times, so editing it will affect 27 pages. This is very important information. Please update the "Estimated number of pages affected" (keyword affected), and make note of all templates the bot will edit. Thank you! &mdash; MusikAnimal  talk  10:02, 26 March 2017 (UTC)
 * Well, 53 pages would be a minimum since the bot is currently set up to edit Bundesliga and premier league. I've updated the "estimated number of pages affected." Is this ready for trial? Dat GuyTalkContribs 10:08, 26 March 2017 (UTC)
 * I feel that semi-protection on the config page is advisable, to prevent the potential of vandalism. TheMagikCow (T) (C) 17:37, 26 March 2017 (UTC)
 * I see 27 transclusions for 2016–17 Premier League table and 26 for 2016–17 Bundesliga table. Not sure if the pages that transclude them overlap, but if not we're up to 53 pages. It is really nice that you can simply add more templates and regex to the config, but the problem I see with that is that you'd need to somehow do testing first. I consider myself quite fluent with regex but I still wouldn't change it without doing a dry run with the bot. Obviously the room for error is much greater when you are not the bot operator, or very good with regex, so maybe a config isn't the best idea? What do you think? &mdash; MusikAnimal  talk  09:47, 29 March 2017 (UTC)
 * A dry config page and a dry output page sound like really useful testing tools that anyone could employ before passing the config to the live page. If you are willing to code those. — HELL KNOWZ  ▎TALK 15:53, 29 March 2017 (UTC)
 * It's been implemented. Dat GuyTalkContribs 16:17, 30 March 2017 (UTC)
 * I like it a lot! Just need to make sure it is well documented how to do a dry run, and encourage it before instructing the bot to update the actual template. On to the next question – what's up with User:SportsStatsBot/nbaconfig? Are we planning on doing NBA as well? &mdash; MusikAnimal  talk  21:40, 5 April 2017 (UTC)
 * Planning is the key word. Currently, there's enough for the statistics, but I've had difficulty working out how to transfer them onto the template and how to determine whether a team has clinched a playoff spot. Dat GuyTalkContribs 13:55, 6 April 2017 (UTC)
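
For the exclusion-compliance recommendation raised in this thread, a check along the following lines is one possibility. It is a simplified sketch and not the bot's code: `allowed_to_edit` is a hypothetical helper, it handles only `{{nobots}}` and the `allow=`/`deny=` parameters of `{{bots}}`, and it ignores other parameters such as `optout`.

```python
import re

# Matches {{nobots}} and {{bots|...}}; any parameters are captured as raw text.
_BOTS_RE = re.compile(r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE)


def allowed_to_edit(wikitext: str, bot_name: str) -> bool:
    """Simplified {{bots}}/{{nobots}} check; returns False if the bot is excluded."""
    bot = bot_name.strip().lower()
    for match in _BOTS_RE.finditer(wikitext):
        template = match.group(1).lower()
        params = (match.group(2) or "").strip()
        if template == "nobots" and not params:
            return False  # a bare {{nobots}} excludes every bot
        for part in params.split("|"):
            key, _, value = part.partition("=")
            key = key.strip().lower()
            names = {v.strip().lower() for v in value.split(",") if v.strip()}
            if key == "deny" and ("all" in names or bot in names):
                return False
            if key == "allow" and names and "all" not in names and bot not in names:
                return False
    return True


# Example call on made-up wikitext.
ok = allowed_to_edit("{{bots|deny=SomeOtherBot}}\n{| class=wikitable", "SportsStatsBot")
```

The bot would call this on the current wikitext of the template just before saving and skip the edit when it returns False.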

It seems for this bot the dry run functionality is important, so let's do a trial of that first. The other major component missing here is documentation – User:SportsStatsBot currently only states that the account is a bot, nothing more. It would be good to explain what the bot does, and for highly configurable bots like this you should also explain all the available config options, and also how to do a dry run, etc. &mdash; MusikAnimal  talk  01:36, 10 April 2017 (UTC)
 * Conclusions from mini-run for a week or so (please don't consider that a full trial):


 * I let it run while on vacation, not a very good idea.
 * Bot took content from the template, and put it in the dryrun page. This made the trial effectively useless.
 * I'll change the code so that if the dry run page has not been created yet, the bot takes its content from the template page. If the page already exists, the bot will update the dry run page itself and leave the template out of it.
 * Documentation has started
 * Thanks, Dat GuyTalkContribs 09:42, 23 April 2017 (UTC)
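
The dry-run behaviour described in this thread can be sketched as follows. Everything here is an assumption made for illustration: the function names are hypothetical, a plain dict stands in for the wiki, and only the subpage prefix `User:SportsStatsBot/dry/` comes from the discussion.

```python
DRY_RUN_PREFIX = "User:SportsStatsBot/dry/"  # subpage prefix proposed in the discussion


def dry_run_title(league_name: str, run_number: int) -> str:
    """Build the dry-run subpage title: prefix + league name + run number."""
    return f"{DRY_RUN_PREFIX}{league_name}{run_number}"


def run_update(pages: dict, template_title: str, update, dry_title: str = None) -> str:
    """Apply update(wikitext) and return the title that was written.

    pages is a dict of title -> wikitext standing in for the wiki.
    Live run: the template itself is rewritten.
    Dry run: the first run starts from the template's text; later runs
    start from the existing dry-run page. The template is never written.
    """
    if dry_title is None:
        pages[template_title] = update(pages[template_title])
        return template_title
    if dry_title in pages:
        base = pages[dry_title]
    else:
        base = pages[template_title]
    pages[dry_title] = update(base)
    return dry_title


# Example with made-up page content.
pages = {"Template:Example table": "points=1"}
written = run_update(pages, "Template:Example table",
                     lambda text: text.replace("1", "2"),
                     dry_title=dry_run_title("example", 1))
```

In the example only the dry-run subpage is written, so a config change can be vetted before it reaches the live template.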

I've made a pretty simple fix at which should work for all normal template runs and most dry runs. Think it is time for maybe a live run. Dat GuyTalkContribs 21:32, 24 June 2017 (UTC)
 * I suppose I tried to debug stuff on the way, and that's why it took such a long time. Everything works well aside from the function of determining when it was updated. The diff system is by bytes, and the BBC page has Last updated 13 hours ago at the bottom. Every time there is a new digit, the bot takes it as a change. If anyone knows how to fix it, it will be appreciated. Dat GuyTalkContribs 10:09, 7 May 2017 (UTC)
 * How about parsing the integer out of that text? So with regex grouping you could do Last updated (\d+), which will return the number as a string, which you can parse into an integer. If it is different than what you have stored, then the bot makes the updates. &mdash; MusikAnimal  talk  23:55, 13 May 2017 (UTC)
 * SQL Query me! 04:02, 22 May 2017 (UTC)
 * Impossible since I use response.info["Content-Length"]. It doesn't get the content of the page directly, but gets specific attributes about it. Also, I tried removing Last updated[^a]*ago with Regex in a file but the length is still different for some reason. I thought about using a module named difflib, but I've never used it before. Dat GuyTalkContribs 17:47, 22 May 2017 (UTC)
 * I've changed a bit of the code. I believe it should work now. See . Dat GuyTalkContribs 15:56, 14 June 2017 (UTC)
 * Have you tried looking at the other headers? There is an ETag that I think will be updated when the content changes. You could keep track of that instead. Also, where is the dry run page? &mdash; MusikAnimal  talk  16:11, 16 June 2017 (UTC)
 * ETag fails. Trying to get hold of kees08 on IRC. If we can't find a way, I'll have to find another site since it seems like I've exhausted all the options. Dat GuyTalkContribs 10:38, 22 June 2017 (UTC)
 * I will keep fixing it up, but is there any content that is missing from the bot documentation page or the user page that I can add?  Kees08  (Talk)   01:17, 7 July 2017 (UTC)
 * Sorry for the very, very long delay! I think we can move forward with a live trial. Let me first make sure I've got this right: Based on the config, the bot would be editing Template:2017 League of Ireland Premier Division table, Template:2016–17 Bundesliga table, Template:2016–17 Premier League table, correct? Next, I think we should put a notice up on these template pages saying they will be automatically updated by a bot (and link to the bot userpage). You might also write a note to the primary maintainers of those templates, so we don't catch them off-guard. They might be willing to help vet the data, too. Let me know when we've done these things, and we'll get a trial going :) &mdash; MusikAnimal  talk  17:18, 13 August 2017 (UTC)
 * I'd suggest starting with only the Irish one since it's much less popular than the Premier League (which started two days ago) and the Bundesliga (which hasn't started yet). Also, BBC have changed their format, so it's either impossible or more difficult to adapt. Haven't tested it yet, since I've had some problems catching Kees due to our different timezones. Dat GuyTalkContribs 14:22, 14 August 2017 (UTC)
 * Very well then. I guess let me know when you've adapted the code to work with the new format. I saw the bot was still doing test runs and they looked OK (I think), which is why I was ready to start a trial. I would make sure your bot looks for the format it expects, and if it detects it's some other format, abort entirely rather than try to parse and potentially make incorrect edits. &mdash; MusikAnimal  talk  16:57, 16 August 2017 (UTC)
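
The change-detection ideas in this thread can be sketched as follows. This is not the approach the operator described: comparing Content-Length only needs the response headers, whereas this sketch assumes the page body has been downloaded. It combines the suggested `Last updated (\d+)` pattern with a different technique, hashing the body after the footer text is stripped. The function names and sample HTML are hypothetical.

```python
import hashlib
import re

# Pattern suggested in the discussion for reading the number out of the footer.
LAST_UPDATED_RE = re.compile(r"Last updated (\d+)")
# Pattern the operator tried for removing the footer. It only matches when
# the text between "Last updated" and "ago" contains no letter "a".
VOLATILE_RE = re.compile(r"Last updated[^a]*ago")


def last_updated_value(html: str):
    """Return the integer after 'Last updated', or None if the text is absent."""
    match = LAST_UPDATED_RE.search(html)
    return int(match.group(1)) if match else None


def content_fingerprint(html: str) -> str:
    """Hash the page body with the 'Last updated ... ago' text removed."""
    stable = VOLATILE_RE.sub("", html)
    return hashlib.sha256(stable.encode("utf-8")).hexdigest()


# Made-up sample pages: same table, different footer.
earlier = "<table>Team A 10</table><p>Last updated 9 hours ago</p>"
later = "<table>Team A 10</table><p>Last updated 13 hours ago</p>"
unchanged = content_fingerprint(earlier) == content_fingerprint(later)
```

Storing the fingerprint between runs and editing only when it differs avoids the problem of the footer gaining a digit, at the cost of fetching the full page every time.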

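The advice to abort on an unexpected format could be implemented with a validation step like the one below. It is a sketch under assumptions: the eight-column row layout (team, played, won, drawn, lost, goals for, goals against, points) and the names `validate_table` and `UnexpectedFormatError` are hypothetical and do not describe the bot's actual parser.

```python
import re

_INT_RE = re.compile(r"[0-9]+")


class UnexpectedFormatError(Exception):
    """Raised when scraped data does not look like the expected league table."""


def validate_table(rows, expected_teams):
    """Check scraped rows and return them as typed tuples, or raise.

    Each row is assumed to be a list of 8 strings:
    team, played, won, drawn, lost, goals for, goals against, points.
    """
    if len(rows) != expected_teams:
        raise UnexpectedFormatError(f"expected {expected_teams} rows, got {len(rows)}")
    validated = []
    for row in rows:
        if len(row) != 8:
            raise UnexpectedFormatError(f"expected 8 columns, got {len(row)}")
        team, *stats = row
        if not team.strip() or not all(_INT_RE.fullmatch(s) for s in stats):
            raise UnexpectedFormatError(f"malformed row: {row!r}")
        played, won, drawn, lost, gf, ga, points = map(int, stats)
        if played != won + drawn + lost:
            raise UnexpectedFormatError(f"games do not add up for {team!r}")
        validated.append((team.strip(), played, won, drawn, lost, gf, ga, points))
    return validated


# Made-up two-team table.
sample = [["Team A", "2", "1", "1", "0", "3", "1", "4"],
          ["Team B", "2", "0", "1", "1", "1", "3", "1"]]
table = validate_table(sample, expected_teams=2)
```

Because every failure raises before any wikitext is built, a source redesign would stop the run instead of producing a wrong edit.
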

 * OperatorAssistanceNeeded Is there any update on this?— CYBERPOWER  ( Chat ) 11:42, 31 August 2017 (UTC)
 * The Airtricity league ends on October 27. That's 7 'match days,' which are sometimes more than one day. That should be about 21 edits, if we start soon and do follow through until the end of the season (one long BRFA, eh?). That might be good for a trial? I don't think we should worry about the Premier League and Bundesliga because they won't be edited by the bot. If we do choose to do another trial of one of the aforementioned leagues, then we could use soccerway. If not, then I'm going to keep trying to catch up to and figure out a way to maybe use BBC once more. Dat GuyTalkContribs 17:47, 31 August 2017 (UTC)


 * This is probably the longest trial I will ever approve. Let's get started and report your results when the trials end.— CYBERPOWER  ( Chat ) 16:26, 3 September 2017 (UTC)
 * can you please post the results?— CYBER POWER  ( Trick or Treat) 18:18, 28 October 2017 (UTC)
 * Whoops, forgot to do that. Anyways, there weren't any errors. There were a few times where an IP and another user updated the template, but there were still 16 edits. See contribs. One time, the IP accidentally changed the first digit of one of the statistics to an incorrect value, and then the bot fixed that. Dat GuyTalkContribs 18:22, 28 October 2017 (UTC)


 * — CYBER POWER  ( Trick or Treat) 18:34, 28 October 2017 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.