Wikipedia:Bots/Requests for approval/Qbugbot 3


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

qbugbot 3
Operator:

Time filed: 05:01, Friday, March 29, 2019 (UTC)

This will edit pages created by qbugbot 2, updating references, photos, and common names, and making a few minor edits. Not all changes will be made to all pages, and some pages will not be changed.

Automatic, Supervised, or Manual: Automatic

Programming language(s): vb.net

Source code available: Yes. I will update User:Qbugbot/source before the first test.

Links to relevant discussions (where appropriate): There have been some comments, requests, and edits over the past year that have motivated me to do this, but I have not requested a consensus on ToL. I think it will be non-controversial.

Edit period(s): 8-24 hours per day.

Estimated number of pages affected: 17,000

Namespace(s): Mainspace

Exclusion compliant (Yes/No): Yes

Function details:

Qbugbot2 created around 18,000 pages about a year ago. I'd like to make corrections and updates to these pages. These changes are a result of comments and page edits. Edits made to these pages since they were created will be preserved. The first 100+ edits by this bot will be reviewed manually.

1. "Further reading" and "External links" references will be updated, and in most cases cut back or eliminated. Any references in Further reading and External links that were created with the page will be removed and replaced with the new references from the current qbugbot database. This will provide fewer and more specific references in these areas. Any reference added by other editors will be retained as is. References are matched by title, or by authors and year. This item will affect most pages, and has been the source of most negative comments about qbugbot articles.

2. If the prose, infobox, and inline references have not been edited since an article was created, it will be updated with the following changes:
 * Wording in the prose may be updated, usually for the distribution range or common names, sometimes to correct errors.
 * Inline references will be updated. Sometimes more specific references will be added, and sometimes non-specific references may be removed (such as EOL, some redundant database references, and some database references without specific data on the article.)
 * The database sources for lists of taxonomic children (species list, etc.) will be removed. While this information might be handy, it makes it difficult for people to update the list. When a list is edited, the source database information tends to be omitted.
 * Occasionally, the taxonomic information and children will be updated.

3. Photos will be added if they are available and not already on the page. This will affect a minority of pages. The photos have been manually reviewed.

4. Unnecessary orphan and underlinked tags will be removed.

5. External links to Wikimedia Commons will be updated to handle disambiguated titles properly, without displaying the "(beetle)" in something like "Adelina (beetle)".

6. The formatting of many references has been improved, correcting errors, adding DOIs, etc. These will be updated in most cases. If a reference has been edited since creation, it will not be changed.
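
The matching rule in item 1 (references are matched by title, or by authors and year; bot-created entries are replaced, editor-added entries are retained) can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual vb.net source: the `Ref` type, the normalisation in `_norm`, the function names, and the choice to list the new bot references first are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical, simplified reference record (not the bot's real data model)."""
    title: str
    authors: str = ""
    year: str = ""


def _norm(text: str) -> str:
    # Assumed normalisation: lowercase, and treat any run of punctuation
    # or whitespace as a single space.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def same_ref(a: Ref, b: Ref) -> bool:
    """Two references match by title, or by authors and year."""
    if _norm(a.title) and _norm(a.title) == _norm(b.title):
        return True
    return bool(
        a.year
        and _norm(a.authors)
        and a.year == b.year
        and _norm(a.authors) == _norm(b.authors)
    )


def merge_further_reading(current, original_bot_refs, new_bot_refs):
    """Drop references the bot created, keep editor additions, add the new set."""
    # Anything on the page that matches an originally generated reference
    # is treated as bot-created and removed.
    kept = [r for r in current
            if not any(same_ref(r, o) for o in original_bot_refs)]
    # Ordering (new bot references first) is an assumption of this sketch.
    merged = list(new_bot_refs)
    for r in kept:
        if not any(same_ref(r, n) for n in merged):
            merged.append(r)
    return merged
```

Under these assumptions, a page whose Further reading holds one bot-created entry and one editor-added entry ends up with the new database entries plus the editor's entry, and nothing is duplicated.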

Here is an example of a page edited manually using qbugbot 3 content: Muellerianella
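
For item 5, the display text for a Commons link can be derived by dropping a trailing parenthesised disambiguator from the page title. The following is a minimal Python sketch of that idea only; the function name `display_title` is hypothetical, and how the bot actually builds the link is not shown in this request.

```python
import re


def display_title(page_title: str) -> str:
    """Return the page title without a trailing "(...)" disambiguator."""
    # Strip one parenthesised suffix at the end of the title, together
    # with the whitespace before it; titles without a suffix are unchanged.
    return re.sub(r"\s*\([^()]*\)$", "", page_title)
```

With this sketch, "Adelina (beetle)" would be displayed as "Adelina", while a title such as "Muellerianella" would pass through unchanged.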

Discussion

 * You say that this will edit around "17,000" pages, despite creating ~18,000 - why not edit the other 1,000? --DannyS712 (talk) 20:29, 29 March 2019 (UTC)
 * Some pages have been changed so much that the bot can't successfully revise them without altering other people's edits, something I'd rather not do automatically and something that's probably not necessary in pages with significant additions. Some other pages won't need any of these changes, either because the changes have already been made through manual edits, or because the original pages happened not to need them. I am just estimating the 1,000 pages. It could be more or less than that. Bob Webster (talk) 00:38, 30 March 2019 (UTC)


 * Will you also be categorizing articles? Looking briefly at the pages the bot created where the bot's edit is not the current version, a theme I see is that the pages have been categorized by the year the species was described. (eg Special:Diff/840205159, Special:Diff/840205191, Special:Diff/840205216) Thanks, --DannyS712 (talk) 06:09, 30 March 2019 (UTC)
 * Also pinging who seems to have done most of that categorization --DannyS712 (talk) 06:11, 30 March 2019 (UTC)
 * I looked at this and decided to postpone it for another update. The main problem is that I could see no easy way to determine what was described in 1956 (or any year) -- insects? moths? spiders? animals? beetles? North American millipedes? I was also considering narrowing down some of the categories (bees to sweat-bees, etc.) as some editors have been doing, but I haven't found a reliable list of categories to use. The same thing applies to the -stub templates. I would prefer to do these three tasks in another bot session. Bob Webster (talk) 03:00, 31 March 2019 (UTC)
 * Having done some of this categorisation [caveat: not so much recently], I have to agree that this problem exists, and there are various schemes of parent categories that are in use if the category you are assigning needs to be created. One could put everything into a higher level category, to await sorting, but I see no great advantage. I would accept it as WP:WORKINPROGRESS. William Avery (talk) 08:56, 1 April 2019 (UTC)


 * BAG assistance needed Bob Webster (talk) 04:18, 5 April 2019 (UTC)
 * I assume these edits will not be marked as minor? Primefac (talk) 14:56, 7 April 2019 (UTC)
 * That's correct. Bob Webster (talk) 22:02, 7 April 2019 (UTC)

Since I fixed a crapton of those citations myself, I'm rather enthusiastic about qbugbot cleaning up after its own mess. Headbomb {t · c · p · b} 05:25, 9 April 2019 (UTC)


 * 50 pages were updated, and are listed on the bot talk page. I found and fixed a couple of bugs. One prevented the introduction from being updated sometimes, and the other was a minor line spacing error. Bob Webster (talk) 23:04, 10 April 2019 (UTC)
 * BAG assistance needed Bob Webster (talk) 04:04, 17 April 2019 (UTC)
 * What exactly are the criteria for removal/addition here? Headbomb {t · c · p · b} 05:54, 17 April 2019 (UTC)

Also, see... Headbomb {t · c · p · b} 05:42, 17 April 2019 (UTC)


 * I've significantly reduced the number of references in further reading in new pages created by qbugbot. This was a manual, subjective process. The pages edited in qbugbot3 will have the original further reading references replaced with this new set. If a further reading reference has been added by an editor since page creation, it will be included in the edited page. The inline citations of qbugbot have also been updated. If the text of a page has not been edited, the original set of inline citations will be replaced.
 * I've corrected the references you listed, and fixed the problem of ending up with the same references in both inline citations and further reading.
 * Bob Webster (talk) 14:53, 17 April 2019 (UTC)
 * Also, EOL inline citations are removed even if the text has been edited. Bob Webster (talk) 14:56, 17 April 2019 (UTC)
 * I think that given the number of articles/citations affected, it would be a good idea to have a sandbox version of all references that will be used. Then you (or I, if you don't know how) could run citation bot on them, and see what the improvements are, and those could get implemented, reducing the future cleanup load. Headbomb {t · c · p · b} 15:12, 17 April 2019 (UTC)


 * I think that would be good. I don't know how to run the citation bot, but I've copied all the citations to these sandbox pages. Can you run the bot on them? (A few of the citations are leftover and will never be used. It's easier to fix them all than sort them out, so don't worry if you see a few weird titles and dates.) Thanks!
 * User:Edibobb/sandbox/ref1
 * User:Edibobb/sandbox/ref2
 * User:Edibobb/sandbox/ref3


 * Bob Webster (talk) 15:56, 17 April 2019 (UTC)
 * User:Citation bot/use explains the various methods. Right now the bot is blocked, so only the WP:Citation expander gadget works. I'll run the bot on these pages though. There's an annoying bug concerning italics and titles though, so just ignore that part of the diffs that will result. Headbomb {t · c · p · b} 16:01, 17 April 2019 (UTC)
 * Could you upload in batches of 250 citations? The bot chokes on pages that massive. Headbomb {t · c · p · b} 16:15, 17 April 2019 (UTC)
 * No problem, they're up now on User:Edibobb/sandbox/ref1 through User:Edibobb/sandbox/ref15 Bob Webster (talk) 17:36, 17 April 2019 (UTC)
 * The bot still crashes. Could you do 100 per page? Headbomb {t · c · p · b} 19:58, 17 April 2019 (UTC)
 * They're up now, 100 per page, User:Edibobb/sandbox/ref1 to User:Edibobb/sandbox/ref36 Bob Webster (talk) 23:09, 17 April 2019 (UTC)
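
The batching discussed in this thread (splitting one long citation list into sandbox pages of at most 100 citations each) could look something like the Python sketch below. It is illustrative only: the helper names, the one-bullet-per-citation page layout, and the use of plain strings for citations are assumptions, and the page-title prefix is taken from the sandbox names mentioned above.

```python
def batches(citations, size=100):
    """Split a list of citations into consecutive chunks of at most `size`."""
    return [citations[i:i + size] for i in range(0, len(citations), size)]


def sandbox_pages(citations, size=100, prefix="User:Edibobb/sandbox/ref"):
    """Map numbered sandbox page titles to wikitext, one bullet per citation."""
    return {
        f"{prefix}{n}": "\n".join(f"* {c}" for c in chunk)
        for n, chunk in enumerate(batches(citations, size), start=1)
    }
```

A smaller `size` means more pages but less work per bot run, which is the trade-off that led from 250 to 100 citations per page here.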

What's the status on this? Headbomb {t · c · p · b} 00:55, 22 August 2019 (UTC)
 * I'm satisfied with the references and am ready to proceed. Bob Webster (talk) 06:05, 22 August 2019 (UTC)
 * Headbomb {t · c · p · b} 13:37, 22 August 2019 (UTC)
 * 50 pages were updated, and are listed on the bot talk page. I found and fixed these things:
 * If an editor had added multiple columns to the Further Reading section, the section would not be changed by the bot.
 * Inline commons tag was being added even if there was already a commons tag.
 * Subdivisions were being added to the taxobox for species and subspecies.
 * Some corrections were made to the edit summaries.
 * Bob Webster (talk) 23:08, 23 August 2019 (UTC)

BAG assistance needed
 * My bad, I thought I gave another extended trial for this, but the edit must have been lost somehow. Anyway, here goes another 100 edits to make sure all kinks are worked out. Headbomb {t · c · p · b} 17:39, 19 September 2019 (UTC)
 * 100 pages were updated, and are listed on the bot talk page. I didn't find any problems on any of the updated pages. (No problem on the delay -- it actually fit my schedule better)
 * Bob Webster (talk) 22:54, 19 September 2019 (UTC)

Headbomb {t · c · p · b} 17:47, 22 September 2019 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.