Wikipedia:Bots/Requests for approval/Dreamy Jazz Bot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

Dreamy Jazz Bot 2
Operator:

Time filed: 23:33, Wednesday, January 16, 2019 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: On request. An understanding of Python and regex would be helpful if a BAG member wanted to check the code.

Function overview: Link the root article and relevant root categories to every portal when they are not already linked

Links to relevant discussions (where appropriate): Portal guidelines, Wikipedia talk:WikiProject Portals/Design and Wikipedia talk:Portal guidelines (specifically Wikipedia talk:Portal guidelines)

Edit period(s): daily using a petscan for new portals and every 15 days for all portals

Estimated number of pages affected: On the first run the bot will check all portals, which would amount to ~1500 pages affected. Subsequent runs should normally affect fewer than 20 pages.

Exclusion compliant (Yes/No): Yes. The bot will respect the templates.

Already has a bot flag (Yes/No): No

Function details: This bot will link the root categories and root article to the portal. The steps involved are:
 * 1) The bot will generate a list of portals to check, either:
 ** using a list of new portals through petscan (new portals means all portals created since the bot was last run) which are in Category:All portals, or
 ** using all portals in Category:All portals
 * 2) For each portal, find the root categories and root article by scanning the wikitext with regex
 * 3) For each root article/root category, check whether that page already links to the portal. If not:
 * 4) If the page already contains a "portal linking template" (e.g. Portal), the bot will add the associated portal as a parameter to this template (except for templates which only take one parameter, like Portal-inline)
 * 5) If the page does not contain a "portal linking template" (e.g. Portal), the bot will add Portal to the top of category pages and Portal-inline to the See also section of a mainspace page. If no See also section is found the bot will attempt to add one; if it cannot, it will not link the portal
 * 6) The bot will also perform other checks/tasks on the portal once the links are added. More tasks will go to a BRFA first, unless the only page affected is in the bot's or my userspace. The tasks which the bot will carry out for now are:
 ** Check for and list empty portals on a subpage of the bot. Separately, if the bot detects an empty portal it will not link the portal, as it is almost certainly an incomplete portal. See User:Dreamy Jazz Bot/Empty portals for the current list in bot userspace.
 ** Place Category:Portals needing placement of incoming links on portals which do not have the links wanted by Portal guidelines. When checking, the bot will purge the template(s) used (unless the portal is already linked), as several navbox templates automatically detect the existence of a portal and then add the link.
 *** The bot will output a page of statistics in the portals wikiproject space to show which portals need which links. The page will list each portal with links to the pages needing the links.
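The linking steps above (extend an existing Portal template, otherwise add Portal to a category page or Portal-inline to a See also section) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the bot's actual code: the function name `link_portal`, both regular expressions, and the `* ` list prefix before Portal-inline are hypothetical, and the sketch leaves the page unchanged when there is no See also section instead of trying to create one.

```python
import re

# Hypothetical patterns; the bot's real regexes are not published in this request.
PORTAL_TEMPLATE = re.compile(r"\{\{\s*[Pp]ortal\s*(\|[^{}]*)?\}\}")
SEE_ALSO_HEADING = re.compile(
    r"^==\s*See also\s*==[ \t]*$", re.IGNORECASE | re.MULTILINE
)


def link_portal(wikitext: str, portal: str, is_category: bool) -> str:
    """Return wikitext with `portal` linked, or unchanged if nothing can be done."""
    match = PORTAL_TEMPLATE.search(wikitext)
    if match:
        # group(1) looks like "|Science|History"; drop the empty piece before the first "|".
        params = [p.strip() for p in (match.group(1) or "").split("|")[1:]]
        if portal in params:
            return wikitext  # already linked
        # Append the portal as one more parameter of the existing template.
        extended = match.group(0)[:-2] + "|" + portal + "}}"
        return wikitext[:match.start()] + extended + wikitext[match.end():]
    if is_category:
        # Category pages get {{Portal}} at the very top.
        return "{{Portal|" + portal + "}}\n" + wikitext
    heading = SEE_ALSO_HEADING.search(wikitext)
    if heading is None:
        return wikitext  # this sketch does not attempt to create the section
    # Articles get {{Portal-inline}} as the first entry under "See also".
    entry = "\n* {{Portal-inline|" + portal + "}}"
    return wikitext[:heading.end()] + entry + wikitext[heading.end():]
```

For example, `link_portal("{{Portal|Science}}\ntext", "Mathematics", False)` extends the existing template in place, while a page with neither a Portal template nor a See also section is returned untouched.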

The bot will run to check links to new portals (through petscan) every day and will check all portals (using all portals in the category) every 15 days. The links should be added per Portal guidelines. I, however, am not adding links to navigational boxes due to the variation in how the templates are built/coded and sometimes templates will automatically link to portals. The bot flag would be useful, as the task of linking portals is a small and gnomish task which would unnecessarily clog up watchlists. For statistics which require editor attention I will ensure the bot does not use the bot flag.

See User:Dreamy Jazz Bot/Task 2/data for the console output from the script when it ran through some of the categories on Portal:Mathematics.

Discussion
BAG assistance needed. Dreamy Jazz 🎷 talk to me | my contributions 11:10, 27 January 2019 (UTC)
 * Should this get approved, be sure edits under task 1 do not use the 'bot' designation, but this one should use it. — xaosflux  Talk 20:53, 27 January 2019 (UTC)
 * I will. Thanks, Dreamy Jazz 🎷 talk to me | my contributions 20:54, 27 January 2019 (UTC)


 * Please post the results of your trial here. This may have multiple trial runs, but an initial run will help demonstrate the edits. —  xaosflux  Talk 20:56, 27 January 2019 (UTC)
 * Thanks. I will get the editing part coded in for the trial. Dreamy Jazz 🎷 talk to me | my contributions 20:57, 27 January 2019 (UTC)
 * Just to let you know, as part of the bot's functions, the bot makes touch edits to ensure that the page info is fully purged. This should never produce an actual saved edit, but there seems to be a bug in pywikibot which removes whitespace at the end of a page when touching it. I have reported it, and the page affected was Portal:County Durham. This edit was made before the trial period was granted above, so I wanted to report it here in case it was noticed by others. I can assure you that it is an issue with pywikibot and not the bot. Dreamy Jazz 🎷 talk to me | my contributions 22:03, 27 January 2019 (UTC)
 * Two problems were encountered:
 * The category Category:Portals needing placement of incoming links was added when it was already used on the portal. This was fixed halfway through the run and I was able to confirm in the second half that the fix resolved the problem. See this diff for the problem.
 * The Portal-inline template was always added to the top of the See also section, but sometimes this is not what is needed. To fix this, the bot now tries to add Portal-inline after the first occurrence of the same template on the page. This would fix the case with the Dragon article. In the 25 edits this was a problem only once and is a minor cosmetic issue. If the Portal-inline template is not found on the page, it is added as usual to the top of the See also section.
 * Apart from these issues the run went smoothly. I will manually fix the pages affected by these fixed issues. Dreamy Jazz 🎷 talk to me | my contributions 12:42, 28 January 2019 (UTC)
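The two fixes described above could look something like the following minimal sketch. It assumes the tracking category appears as a plain category link in the wikitext and that Portal-inline entries sit on their own list lines; the function names and patterns are hypothetical and are not taken from the bot's source.

```python
import re

TRACKING_CATEGORY = "[[Category:Portals needing placement of incoming links]]"
# Hypothetical patterns, for illustration only.
PORTAL_INLINE_LINE = re.compile(
    r"^.*\{\{\s*[Pp]ortal-inline\s*\|[^{}]*\}\}.*$", re.MULTILINE
)
SEE_ALSO_HEADING = re.compile(
    r"^==\s*See also\s*==[ \t]*$", re.IGNORECASE | re.MULTILINE
)


def add_tracking_category(wikitext: str) -> str:
    """Append the tracking category only if the page does not already carry it."""
    if TRACKING_CATEGORY in wikitext:
        return wikitext
    return wikitext.rstrip("\n") + "\n" + TRACKING_CATEGORY + "\n"


def place_portal_inline(wikitext: str, portal: str) -> str:
    """Insert a Portal-inline entry after the first existing one, else under See also."""
    # Prefer the line holding the first existing Portal-inline; fall back to the heading.
    anchor = PORTAL_INLINE_LINE.search(wikitext) or SEE_ALSO_HEADING.search(wikitext)
    if anchor is None:
        return wikitext  # nothing to attach to in this sketch
    entry = "\n* {{Portal-inline|" + portal + "}}"
    return wikitext[:anchor.end()] + entry + wikitext[anchor.end():]
```

Running `add_tracking_category` twice on the same text gives the same result as running it once, which is the property the first fix is after. The exact-string check is a simplification: it would miss a category link written with different spacing or a sort key.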


 * Is Task 2 supposed to make errors like this one? w umbolo   ^^^  13:32, 28 January 2019 (UTC)
 * That is not an error with the bot. The portal Portal:Corals uses Category:Mutation as its root category. Therefore, the bot has to assume that this is a "root category". I think it would be unreasonable to expect the bot to work out when editors have made mistakes, as this would then violate WP:CONTEXTBOT. Dreamy Jazz 🎷 talk to me | my contributions 13:47, 28 January 2019 (UTC)
 * I do agree that this is not an appropriate link, but the editor used this as the root category for the portal. I will change the root category to an appropriate one. Dreamy Jazz 🎷 talk to me | my contributions 13:53, 28 January 2019 (UTC)
 * I think these errors should be rare and are down to editor mistakes. I will output a file listing the links added to pages, which I can review afterwards to ensure that the links are correct. The automated portal design uses the pagename of the portal to get the portal root article and category (through PAGENAME), so for this to happen an editor has to deliberately copy and paste in wrong wikitext. Because the portal namespace gets fewer than 20 new pages each day, checking that the links are appropriate should be easy for me. Dreamy Jazz 🎷 talk to me | my contributions 14:14, 28 January 2019 (UTC)
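To illustrate the PAGENAME point, here is a rough sketch of how the default root article and root category follow from a portal's title. It assumes the root category is simply the page name in the Category namespace and handles only top-level portal titles; `default_roots` is a hypothetical helper, not part of the bot or of the portal templates.

```python
def default_roots(portal_title):
    """Return (root article, root category) implied by a portal title.

    Mirrors what {{PAGENAME}} yields: the title without its namespace prefix.
    Assumption: the root category is "Category:" plus that page name.
    """
    prefix = "Portal:"
    name = portal_title[len(prefix):] if portal_title.startswith(prefix) else portal_title
    return name, "Category:" + name
```

Under these assumptions a portal only ends up with an unrelated root category, as in the Portal:Corals case above, when the wikitext has been edited to override the default.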
 * Please post results here and continue to link to this BRFA in the edit summaries. -- The SandDoctor Talk 20:35, 29 January 2019 (UTC)
 * OK, onto it now. Dreamy Jazz 🎷 talk to me | my contributions 21:39, 29 January 2019 (UTC)
 * Ran into a script error, which did not affect any pages. This has been fixed and I am now moving on. Dreamy Jazz 🎷 talk to me | my contributions 21:53, 29 January 2019 (UTC)
 * Fixed an issue with adding to Category:Shaquille O'Neal. See this diff. Dreamy Jazz 🎷 talk to me | my contributions 18:57, 30 January 2019 (UTC)
 * Ran into two minor issues. One did not affect pages and one was caused by a silly mistake in the regex, which is now fixed and only occurred once. There were no other issues and the run went well. I had to stop the bot editing a few times, but this (except as noted) wasn't because of any errors. The problems noted here are shown above. Dreamy Jazz 🎷 talk to me | my contributions 19:10, 30 January 2019 (UTC)
 * Why does the bot mark all edits as minor? SQL Query me!  00:54, 31 January 2019 (UTC)
 * Pywikibot automatically marks all edits as minor unless you say otherwise. I can tell pywikibot to mark edits as not minor if needed. Dreamy Jazz 🎷 talk to me | my contributions 07:09, 31 January 2019 (UTC)
 * Nah, I was just curious. I'm going to give this another day or two for input, then I will likely approve it. SQL Query me!  18:42, 31 January 2019 (UTC)
 * SQL Query me! 05:50, 2 February 2019 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.