Wikipedia:Bots/Requests for approval/Orphaned talkpage deletion bot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

Orphaned talkpage deletion bot
Operator: Chris G

Automatic or Manually Assisted: Auto

Programming Language(s): PHP using my classes

Function Overview: Deletes orphaned talkpages per WP:CSD

Edit period(s): Weekly

Already has a bot flag (Y/N): N (needs +sysop as well)

Function Details: Gets a list of orphaned talkpages from the toolserver. Deletes pages that match the following conditions:
 * have not been edited in the last seven days
 * don't transclude the go away or G8-exempt templates (and are not in Category:Wikipedia orphaned talk pages that should not be speedily deleted)
 * are not redirects
 * don't have 'archive' in their title
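
The conditions listed above can be sketched as a single filter function. This is only an illustrative Python sketch, not the bot's actual code (which is written in PHP using the operator's own classes). The `TalkPage` record, the `is_deletion_candidate` function, and the case-insensitive matching of titles and template names are all assumptions made for the example; only the template names, the category name, the seven-day window and the 'archive' check come from the list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Names taken from the conditions above; matching them case-insensitively
# is an assumption of this sketch.
EXEMPT_TEMPLATES = {"go away", "g8-exempt"}
EXEMPT_CATEGORY = "Wikipedia orphaned talk pages that should not be speedily deleted"
MIN_AGE = timedelta(days=7)


@dataclass
class TalkPage:
    """Hypothetical record for one orphaned talk page (not a real API type)."""
    title: str
    last_edited: datetime
    is_redirect: bool = False
    templates: set = field(default_factory=set)
    categories: set = field(default_factory=set)


def is_deletion_candidate(page: TalkPage, now: datetime) -> bool:
    """Return True only if the page passes every listed condition."""
    if now - page.last_edited < MIN_AGE:
        return False  # edited within the last seven days
    if {t.lower() for t in page.templates} & EXEMPT_TEMPLATES:
        return False  # transcludes an exemption template
    if EXEMPT_CATEGORY in page.categories:
        return False  # explicitly categorised as not to be speedily deleted
    if page.is_redirect:
        return False  # redirects are left alone
    if "archive" in page.title.lower():
        return False  # title looks like a talk page archive
    return True


# Usage: filter a (made-up) list of orphaned talk pages.
now = datetime(2009, 5, 15, tzinfo=timezone.utc)
pages = [
    TalkPage("Talk:Example", now - timedelta(days=30)),
    TalkPage("Talk:Example/Archive 1", now - timedelta(days=30)),
]
candidates = [p.title for p in pages if is_deletion_candidate(p, now)]
```

Each check returns early, so a page is kept as soon as any one exemption applies. Deletion is considered only when none of them do.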

Discussion
source -- Chris  12:12, 15 May 2009 (UTC)


 * How many such pages are there currently? Are they all left-overs from deleted articles? -- User:Docu
 * I suppose the answer to my first question is at Database_reports/Orphaned_talk_pages. -- User:Docu


 * Are you considering deleting image talk pages? Because when an image is moved to Commons, sometimes the talkpage is necessary. Also, in userspace and Wikipediaspace, sometimes it's intentional to have a talkpage without a mainpage. – Quadell (talk) 12:26, 16 May 2009 (UTC)
 * Also, "Editnotice" and "Comments" shouldn't usually be deleted when they are in the name. –Drilnoth (T • C • L) 18:55, 16 May 2009 (UTC)

Just had an idea. The bot will only delete pages where there was a mainpage but it was deleted. This should avoid a lot of the possible false positives. Also the bot ignores the user space. -- Chris  12:59, 17 May 2009 (UTC)
 * What about Wikipediaspace? Imagespace? – Quadell (talk) 13:10, 17 May 2009 (UTC)
 * It will only delete pages in the Talk, Wikipedia talk, File talk, Template talk, Help talk, Category talk or Portal talk namespaces. -- Chris  12:40, 22 May 2009 (UTC)
 * CSD G8 expressly "excludes any page that is useful to the project, and in particular: deletion discussions that are not logged elsewhere, user and user talk pages, talk page archives, plausible redirects that can be changed to valid targets, and image pages or talk pages for images that exist on Wikimedia Commons"
 * How will the proposed bot task exclude the following pages from deletion?
 * any page that is useful to the project
 * deletion discussions that are not logged elsewhere
 * Erik9 (talk) 02:11, 18 May 2009 (UTC)
 * No bot like this can get things 100% correct; however, I feel that the checks mentioned (must have had a deleted mainpage, title can't contain words like 'archive', 'editnotice', etc.) should reduce the chances of false positives and make the bot worth running. - Chris  12:40, 22 May 2009 (UTC)

Not deleting redirects seems silly. But you do need to avoid subpages (/Comments, /Archive, /Editnotice, etc.). --MZMcBride (talk) 03:26, 18 May 2009 (UTC)
 * How so? Just because a redirect does not have a mainpage does not make it any less valid. -- Chris  12:40, 22 May 2009 (UTC)

The query you copied is actually outdated. The better query is available here: http://en.wikipedia.org/w/index.php?oldid=290842965 It JOINs against Commons appropriately and such to get File_talk: orphaned talk pages and has some other fixes in it. --MZMcBride (talk) 01:11, 19 May 2009 (UTC)
 * Thanks :) I've updated the query. -- Chris  12:40, 22 May 2009 (UTC)
 * Will using this query prevent the deletion of file talk pages where the image was transwikied to Commons? – Quadell (talk) 13:05, 22 May 2009 (UTC)
 * Yes. -- Chris  13:23, 22 May 2009 (UTC)

Would you be willing to tag the pages db-g8 for a trial, instead of deleting them, so we can better see how it works? If so, – Quadell (talk) 17:36, 22 May 2009 (UTC)
 * Results can be seen here. I've decided to remove the mainpage log check as it doesn't seem to prevent any false positives and causes some false negatives. The one problem that I did pick up was images like File talk:Fieldscape.jpg, where the image has been moved to Commons but under a different name and thus wasn't excluded by the SQL query. Thoughts? -- Chris  05:37, 23 May 2009 (UTC)
 * What do you mean by "the mainpage log check"? – Quadell (talk) 13:17, 23 May 2009 (UTC)
 * From above: "Just had an idea. The bot will only delete pages where there was a mainpage but it was deleted." -- Chris  01:22, 24 May 2009 (UTC)
 * That concerns me. It seemed like a great way to prevent false positives. How else can you be sure that Talk:Article/ObscureArchiveName actually needs to be deleted? – Quadell (talk) 14:38, 24 May 2009 (UTC)
 * The bot avoids subpages. -- Chris  08:09, 25 May 2009 (UTC)
 * Okay, that's fine then. – Quadell (talk) 00:21, 26 May 2009 (UTC)
 * When an image is moved to Commons under a different name, that can be a serious problem. Most of the time it should be okay to delete the talkpage, but not all the time. Most of the time it should be evident from the deletion log that the image was moved to Commons, but not always, and you can't count on that. I'd recommend not deleting files in filespace, but instead tagging them with That way a human can see and make that decision. What do you think? – Quadell (talk) 00:21, 26 May 2009 (UTC)
 * Sounds good, I've updated the code to do this. -- Chris  08:41, 26 May 2009 (UTC)
 * The bot currently admin-shops such pages since it doesn't appear to check whether it tagged them before or not. Amalthea  14:06, 26 May 2009 (UTC)
 * In the event a page shouldn't be tagged all the admin has to do is add go away -- Chris  10:15, 27 May 2009 (UTC)

Looking through the test results, I don't see any false positives. I appreciate that you tested a number of pages in all namespaces. My only real question is, where are the bot's edits? It appears that most of these were deleted by MBisanz for CSD G8 reasons, but there is no record of the bot adding the g8 tag. This is also true of Template talk:Adirondack Phantoms and Talk:Slaters, which were not deleted. What happened here? – Quadell (talk) 13:39, 26 May 2009 (UTC)
 * Basically I did a run of 50 edits which can be seen here. When it became clear that (a) 50 edits was a tad limiting for fully testing the bot and (b) since most of the pages would be deleted, it would be hard for non-admins to see which pages were tagged, I did a second dry run in which the bot just listed the pages it would delete here. -- Chris  10:15, 27 May 2009 (UTC)

But with a caution on the bot-op's talkpage. – Quadell (talk) 18:30, 29 May 2009 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.