Wikipedia:Bots/Requests for approval/MacMedBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Withdrawn by operator.

MacMedBot
Operator: MacMed

Automatic or Manually assisted: Automatic

Programming language(s): Python

Source code available: Yes

Function overview: MacMedBot finds pages that have been tagged with PROD in the past, but don't have the oldprodfull tag on their talk page.

Edit period(s): Continuous to get the majority of what is missing now, then probably once a week or so.

Estimated number of pages affected: Any page missing, could be hundreds or thousands, I am not really sure.

Exclusion compliant (Y/N): N.

Already has a bot flag (Y/N): N.

Function details: MacMedBot searches pages per category (including subcats). It looks through past revisions for "{{datedprod", and if it finds one, it toggles to the talk page. It then checks the talk page for an existing tag, and adds one if it cannot be found.
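As a rough illustration of the check described above, here is a minimal sketch in Python. It is not the bot's actual source: the helper names `has_template` and `needs_tag` are hypothetical, and the case-insensitive matching is a simplifying assumption (redirects and spacing variants of a template name would have to be listed separately).

```python
import re

def has_template(wikitext, name):
    """Return True if wikitext appears to transclude the template `name`.

    Assumption: matching is case-insensitive and the name must be followed
    by "|" or "}" so that longer template names do not match by prefix.
    """
    pattern = r"\{\{\s*" + re.escape(name) + r"(?=\s*[|}])"
    return re.search(pattern, wikitext, flags=re.IGNORECASE) is not None

def needs_tag(revision_texts, talk_text):
    """True if any past revision carried the PROD tag and the talk page lacks oldprodfull."""
    was_prodded = any(has_template(text, "datedprod") for text in revision_texts)
    return was_prodded and not has_template(talk_text, "oldprodfull")

# Example with made-up wikitext: an old revision was prodded, the talk page is empty.
print(needs_tag(["Clean article text", "{{datedprod|concern=x}} Article text"], ""))
```

Working on plain strings keeps the matching logic separate from however the revisions are actually fetched.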

Discussion
Essentially the idea here is that you would specify a (preferably large) category for the bot to work on when starting the script, such as Category:Living people, Category:Medicine, or Category:Arthropods, and the bot would process every page in that category and its subcategories, then shut off when done. I hope this clears this task up a little, which was probably rather ambiguous at first. The Earwig (Talk | Contribs) 19:18, 6 September 2009 (UTC)
 * I had a short look through the code, although I don't know Python. But looking at it, it seems the bot will just add oldprodfull, without any parameters. Would it be possible to add parameters, such as what the concern was, what date it was PRODed, who contested, what date it was contested, etc.? Also, will the bot check for Template:OldPRODfull? - Kingpin13 (talk) 11:29, 7 September 2009 (UTC)


 * As it stands, I think this is far too wasteful for the minimal benefit gained. Some performance issues I see:
 * If you're going to use categories, you should make a list somewhere of what pages it's already done, else you're going to end up checking a lot of pages multiple times.
 * It doesn't check for an existing oldprodfull until after it gets the content of 50 revisions. You could check for oldprodfull by just getting the text of 1 revision, or better yet, just the list of templates on the talk page
 * It gets the last 50 revisions unconditionally. If you absolutely must check the revision text, you should get them one at a time. A) you can better throttle your requests, B) You won't download 49 unnecessary revisions if the page is currently prod'ed.
 * Some general code review comments:
 * The way it searches for prod templates seems rather odd. You cut off the current revision from the revision list, then return the last revision as "check" and if "check" contains the dated prod template, you assume the prod is still active. What if the current revision was the one that removed it?


 * What on earth is this? If you're only dealing with articles, getting the talk page title should be as simple as prepending "Talk:" to the page title.


 * For one, this doesn't work, it should be  (datetime is a class inside the datetime module), second, this is going to give you dates like "2009-09-07 21:37:05.127097".


 * One of those comments is wrong...
 * It only seems to log successful actions - that's generally the one thing you don't need to log, as MediaWiki does that for you. If you need to log something, it should probably log errors.
 * -- Mr.Z-man 21:59, 7 September 2009 (UTC)
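Two of the review points above (the talk page title and the timestamp) can be illustrated with a short sketch. The snippets under review are not preserved in this archive, so this is not a correction of the original code: the helper names `talk_title` and `log_stamp` and the chosen date format are assumptions made for illustration.

```python
import datetime

def talk_title(article_title):
    # Assumes a main-namespace article title with no namespace prefix.
    return "Talk:" + article_title

def log_stamp(moment):
    # Explicit format instead of str(moment), which includes microseconds.
    return moment.strftime("%Y-%m-%d %H:%M")

# datetime is a class inside the datetime module, hence datetime.datetime.
moment = datetime.datetime(2009, 9, 7, 21, 37, 5, 127097)

print(talk_title("Arthropod"))  # Talk:Arthropod
print(str(moment))              # 2009-09-07 21:37:05.127097
print(log_stamp(moment))        # 2009-09-07 21:37
```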


 * MacMedtalk stalk 22:26, 8 September 2009 (UTC)


 * ✅. I have added the nomreason parameter, and fixed the problems pointed out by Mr. Z-man. The code should run without a problem now. MacMedtalk stalk 00:52, 12 September 2009 (UTC)

Is the bot still planning on looking through every revision of every page to find dated prod? I agree that is a massive waste of resources (and would also take ages to get anywhere). --ThaddeusB (talk) 20:50, 12 September 2009 (UTC)
 * No, it only searches the last 50 revisions of the page. MacMedtalk stalk 21:28, 12 September 2009 (UTC)
 * OK, but surely there is a better way to accomplish this check than loading the talk page and then (if necessary) the last 50 revisions for all 3 million articles? --ThaddeusB (talk) 02:13, 14 September 2009 (UTC)
 * The point of loading the talk page is so that we don't waste time looking through a page, then discovering that it has already been tagged with oldprodfull. And the search through the revisions goes one by one through each revision (via the API), so that it stops as soon as it finds the PROD. This is as optimized as it can get, and the point is that a bot will take that time to look through the revision history. Yes, it could take some time, but without this bot, a human would take much, much longer. MacMedtalk stalk 03:59, 14 September 2009 (UTC)
 * I fully understand the point of checking the talk page first. I think, perhaps, you are missing my point. 3,000,000 x 40 (an estimate: not 50, since some pages have fewer than 50 total revisions) is an awful lot of queries to find the <1% of articles that have been previously PRODed. At one query a second (which is the absolute maximum it should be pulling) that would take roughly 1,400 days to go through them all. Anyway, it's not up to me to decide if it is worthwhile - that is BAG's job. --ThaddeusB (talk) 21:06, 17 September 2009 (UTC)
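The estimate in the comment above can be checked with a few lines of arithmetic. The inputs are the figures quoted in the thread (3 million articles, about 40 revisions each, one query per second); the variable names are only for illustration.

```python
ARTICLES = 3_000_000           # article count quoted in the thread
REVISIONS_PER_ARTICLE = 40     # estimated average revisions fetched per article
QUERIES_PER_SECOND = 1         # stated maximum request rate
SECONDS_PER_DAY = 24 * 60 * 60

total_queries = ARTICLES * REVISIONS_PER_ARTICLE
days = total_queries / QUERIES_PER_SECOND / SECONDS_PER_DAY

print(f"{total_queries:,} queries, about {days:,.0f} days")
# 120,000,000 queries, about 1,389 days
```

That comes out just under 1,400 days, or a little under four years of continuous running.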

{{BAGAssistanceNeeded}} Perhaps I could run a trial on a small cat? MacMedtalk stalk 20:55, 15 September 2009 (UTC)
 * So are you now checking the revisions one at a time, or still loading all 50 in one go? Also, there is a bot which currently does something similar to this, but it patrols the PROD category, and the op says he may switch it off should this bot become active (see Wikipedia_talk:BRFA). - Kingpin13 (talk) 08:39, 16 September 2009 (UTC)
 * Yes. The bot searches the talk page for oldprodfull first, then processes the page if oldprodfull is not found. After that it goes through revisions one at a time, stopping once it finds one. It will also place the concern of the prod in the |nomreason= parameter of oldprodfull. Regards, MacMedtalk stalk 13:01, 16 September 2009 (UTC)
 * Okay, this cat should have at least one - Kingpin13 (talk) 14:52, 16 September 2009 (UTC)
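The revised flow described in this exchange (talk page first, then revisions one at a time with an early stop) could look roughly like the sketch below. It is not the bot's source: the function name `find_prod_concern` is hypothetical, the `concern` parameter name of the dated prod template is an assumption, and revisions are passed in as a lazy iterable of wikitext strings (newest first) so that no API library has to be assumed.

```python
import re

PROD_RE = re.compile(r"\{\{\s*dated\s*prod\b", re.IGNORECASE)
TAGGED_RE = re.compile(r"\{\{\s*oldprodfull\b", re.IGNORECASE)
# Assumption: the PROD reason is stored as "|concern = ...". A reason that
# itself contains "|" (for example a piped link) would be cut short here.
CONCERN_RE = re.compile(r"\|\s*concern\s*=\s*([^|}]*)", re.IGNORECASE)

def find_prod_concern(talk_text, revisions):
    """Return the concern of a past PROD, or None if no tag is needed.

    `revisions` yields wikitext newest first and is consumed lazily.
    """
    if TAGGED_RE.search(talk_text):
        return None  # talk page already tagged: no revision is fetched
    for index, text in enumerate(revisions):
        if PROD_RE.search(text):
            if index == 0:
                return None  # the current revision still carries the PROD
            match = CONCERN_RE.search(text)
            return match.group(1).strip() if match else ""
    return None

# Stand-in for one-at-a-time API requests; records what was "fetched".
fetched = []
def fake_revisions():
    history = [
        "Clean article text",
        "{{dated prod|concern = Not notable|month = September}} Article text",
        "Older text",
        "Oldest text",
    ]
    for text in history:
        fetched.append(text)
        yield text

concern = find_prod_concern("", fake_revisions())
print(concern, len(fetched))  # Not notable 2
print("{{oldprodfull|nomreason=" + concern + "}}")
```

Because the revisions come from a generator, the two older revisions in the example are never requested once the PROD has been found.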

(outdent)
 * There were a few bugs in the code, so if you'd like I can test on another small cat before getting started for real. (Note: The bugs were caused by the software update.) MacMedtalk stalk 21:10, 17 September 2009 (UTC)
 * Couple of suggestions:


 * 1) oldprodfull offers a number of additional parameters. You shouldn't have any problem getting at least the nom date (it's right in the dated prod code) and the person who removed it isn't all that difficult to pull either - it is simply the person who made the revision immediately after the last one with the template (i.e. the revision the bot pulled before the matching one.)
 * 2) If the talk page has "{{oldafd" I'd skip the page. Yes, it may also have been prodded, but AfD always overrides prod (an article sent to AfD that survives can never be prodded again) so there is no need to have both templates on the talk page.
 * --ThaddeusB (talk) 21:20, 17 September 2009 (UTC)
 * What's the status of this? I can find another category for the bot to do a trial on, but I'd like to at least see ThaddeusB's first suggestion implemented first; it shouldn't be too hard to identify more info. - Kingpin13 (talk) 04:43, 30 September 2009 (UTC)
 * {{BotWithdrawn}} for now. I have a lot of stuff going on in RL, maybe I'll come back to this later. MacMedtalk stalk 01:59, 2 October 2009 (UTC)
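For reference, ThaddeusB's two suggestions earlier in this thread (identify who removed the PROD, and skip pages whose talk page carries "{{oldafd") could be sketched as follows. This is an illustration under stated assumptions, not code from the bot: the function name `prod_removal` and the dictionary keys `user`, `timestamp` and `text` are hypothetical, and revisions are assumed to arrive newest first. Parsing the nomination date out of the template is left out because the thread does not give the template's date parameters.

```python
import re

PROD_RE = re.compile(r"\{\{\s*dated\s*prod\b", re.IGNORECASE)
OLDAFD_RE = re.compile(r"\{\{\s*oldafd", re.IGNORECASE)

def prod_removal(talk_text, revisions):
    """Return who removed the PROD and when, or None if the page is skipped.

    `revisions` is newest first; each item is a dict with the hypothetical
    keys "user", "timestamp" and "text".
    """
    if OLDAFD_RE.search(talk_text):
        return None  # AfD overrides PROD, so no second template is needed
    previous = None  # the revision examined just before the current one
    for rev in revisions:
        if PROD_RE.search(rev["text"]):
            if previous is None:
                return None  # PROD is still on the current revision
            # The next-newer revision is the one that removed the tag.
            return {"removed_by": previous["user"],
                    "removed_on": previous["timestamp"]}
        previous = rev
    return None

# Made-up history, newest first.
history = [
    {"user": "EditorC", "timestamp": "2009-09-10T12:00:00Z", "text": "Clean text"},
    {"user": "EditorB", "timestamp": "2009-09-09T08:30:00Z", "text": "Clean text"},
    {"user": "EditorA", "timestamp": "2009-09-07T21:37:05Z",
     "text": "{{dated prod|concern = Not notable}} Text"},
]
print(prod_removal("", history))
print(prod_removal("{{oldafdfull|result=keep}}", history))  # None
```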

'kay, I've marked it as such for now, feel free to reopen anytime (let me know if you need help reopening the requests). Best, - Kingpin13 (talk) 07:58, 2 October 2009 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.