Wikipedia:Bots/Requests for approval/STBot 2


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section.

STBot task 2
Operator: ST47

Automatic or Manually Assisted: Automatic, with some human interaction

Programming Language(s): AWB

Function Summary: Link repair - link -> link; links -> links; selfreflink -> selfreflink

Also requesting interwiki link harvesting using pywikipedia

Edit period(s) (e.g. Continuous, daily, one time run): probably for about 4 hours, once a week, running off the latest database dump
 * That may have been a lie. I did a manual test run and 43 out of 45 pages needed edits. I'm building a full list; it's at 25k with no sign of stopping. To catch up, I may have to limit it to the first 3000 edits a day (3000 at 6 a minute is 500 minutes, a little over 8 hours) and run it every day, then do maintenance every weekend after that.
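
A quick check of the run-time arithmetic, as a rough sketch. The 3000-edit daily cap, the 6-per-minute rate and the 25k list size are the figures quoted in this request; the function and variable names are made up for illustration, and the list was still growing, so the day count is only a lower bound.

```python
def run_hours(edits: int, edits_per_minute: float) -> float:
    """Hours needed to make `edits` edits at a steady rate."""
    return edits / edits_per_minute / 60


daily_cap = 3000   # proposed daily limit from the request
rate = 6           # requested edits per minute
backlog = 25_000   # size of the list so far (still growing)

hours_per_day = run_hours(daily_cap, rate)
days_to_clear = -(-backlog // daily_cap)  # ceiling division

print(f"{hours_per_day:.1f} hours per daily run")
print(f"at least {days_to_clear} daily runs to clear the current list")
```

At 6 edits per minute, a 3000-edit run is 500 minutes, so a daily run would take a little over 8 hours rather than 6.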

The interwiki run will probably be daily, for 1 hour, not concurrent with the link repair, at times when I'm up and vandals are asleep.

Edit rate requested: 6 per minute, enforced by a setting of 5 seconds in AWB to account for loading time. In non-peak hours, when loading is faster, the delay will be brought up. I will try to run it overnight.

Already has a bot flag (Y/N): Indeed it does

Function Details:

 * link -> link
 * links -> links
 * selfreflink -> selfreflink
 * link_link -> link link
 * italics -> italics
 * AWB's bad link fixing utility
 * Unicoding of articles
 * link bulleting
 * heading capitalization
 * < to &lt;
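
The arrows above appear to have lost their wiki markup in archiving. Assuming they described the usual piped-link simplifications (`[[link|link]]` to `[[link]]`, `[[link|links]]` to `[[link]]s`, `[[link_link]]` to `[[link link]]`), the idea can be sketched as below. This is not AWB's actual code; the regular expressions and function names are illustrative assumptions, and self-referential links, italics and the other items are not covered.

```python
import re

# [[target|label]]: a piped link with no nested brackets or extra pipes.
PIPED = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")
# [[target]]: a plain, unpiped link.
PLAIN = re.compile(r"\[\[([^\[\]|]+)\]\]")


def same_title(a: str, b: str) -> bool:
    """Compare titles, ignoring the case of the first character only."""
    return a[:1].lower() + a[1:] == b[:1].lower() + b[1:]


def simplify_piped(match: re.Match) -> str:
    target, label = match.group(1), match.group(2)
    target = target.replace("_", " ").strip()
    if same_title(label, target):
        return f"[[{label}]]"                 # [[link|link]] -> [[link]]
    head, suffix = label[:len(target)], label[len(target):]
    if same_title(head, target) and suffix.isalpha() and suffix.islower():
        return f"[[{head}]]{suffix}"          # [[link|links]] -> [[link]]s
    return f"[[{target}|{label}]]"            # leave genuinely piped links


def simplify_links(wikitext: str) -> str:
    text = PIPED.sub(simplify_piped, wikitext)
    # [[link_link]] -> [[link link]]
    return PLAIN.sub(lambda m: "[[" + m.group(1).replace("_", " ") + "]]", text)


print(simplify_links("A [[link|links]] to [[link_link]] and [[dog|canine]]."))
```

Links whose label really differs from the target, such as `[[dog|canine]]`, are left piped.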

Runs off a database dump that is manually downloaded, harvested, and started.

Interwiki link harvesting will follow links from en-wp and try to find links to Wikipedias that are not yet linked to.
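
As a rough illustration of the harvesting idea (not pywikipedia's actual implementation or API), the sketch below follows interlanguage links outward from an en-wp page and reports languages the starting page does not yet link to. The `langlinks` dictionary is a hypothetical in-memory stand-in for what a real bot would fetch from each wiki.

```python
from collections import deque


def harvest_interwiki(start, langlinks):
    """Return {lang: title} for languages reachable from `start`
    that `start` does not already link to.

    start:     (lang, title) tuple, e.g. ("en", "Cat")
    langlinks: hypothetical dict mapping (lang, title) to a list of
               (lang, title) interlanguage links found on that page
    """
    found = {}
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for linked in langlinks.get(page, []):
            if linked in seen:
                continue
            seen.add(linked)
            lang, title = linked
            found.setdefault(lang, title)  # keep the first title seen per language
            queue.append(linked)
    already_linked = {lang for lang, _ in langlinks.get(start, [])}
    already_linked.add(start[0])
    return {lang: t for lang, t in found.items() if lang not in already_linked}


# Hypothetical example data.
example = {
    ("en", "Cat"): [("de", "Katze")],
    ("de", "Katze"): [("fr", "Chat"), ("en", "Cat")],
    ("fr", "Chat"): [("es", "Gato")],
}
missing = harvest_interwiki(("en", "Cat"), example)
print(missing)
```

A real run would also need to handle conflicts, such as two different titles found for the same language, which this sketch sidesteps by keeping the first one.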

Discussion
I already run the bot, and it has a flag. It already runs daily with CFD/W work, and I do not believe the 2 tasks will interfere with one another. Though, I think the bot will demand more cookies in exchange for the extra work... ST47 Talk 14:00, 7 October 2006 (UTC)
 * I have one objection: you say "interwiki link sorting", but there is no consensus on how they are sorted, so I would say remove that function. Regarding the others, can you post some diffs and explain where and how you generate your lists? Betacommand (talk • contribs • Bot) 02:21, 8 October 2006 (UTC)
 * ok, noted above, thanks :) ST47 Talk 10:46, 8 October 2006 (UTC)

Ok, here's the stuff Beta asked for. The main point of the bot is link simplification, to unclutter the edit views and clean stuff up.
 * plurality and underscores
 * double piped link

And the list was generated using AWB from the most recent dump, using the parameter 'has links AWB will simplify' ST47 Talk 12:03, 8 October 2006 (UTC)

Also, the pywikipedia interwiki script - does that work well? ST47 Talk 12:43, 8 October 2006 (UTC)
 * Looks simple, go ahead. 50-edit trial approved. Betacommand (talk • contribs • Bot) 03:08, 9 October 2006 (UTC)
 * ok, trial will occur tonight once the new dump is downloaded and extracted ST47 Talk 18:43, 10 October 2006 (UTC)
 * Trial complete - of 50 edits
 * 47 (94%) changed links
 * 3 (6%) changed interwiki or typos (expected to decrease when using a fresh dump)
 * 0 (0%) were worthless
 * ST47 Talk 20:48, 10 October 2006 (UTC)
 * After a conversation with Beta, I changed some settings. Can I run another test? This one would start to eat into the list of needed changes: instead of getting 50 pages off the dump, I'll generate a list that I can remove edited pages from. ST47 Talk 19:13, 12 October 2006 (UTC)

I would object to this bot. The above are all trivial tasks, and I believe the bot will just add too many items on people's watchlists and recent changes without accomplishing anything significant. I would suggest these functions be used in the context of more general cleanup, like spell-checking, bringing articles to Wikipedia style, etc. Oleg Alexandrov (talk) 03:11, 12 October 2006 (UTC)
 * Well, I have a bot flag, and all edits will be minor, so this shouldn't bother watchlists too much and won't bother RC at all. Do you have an idea of how to focus a bot on the things you mentioned? Typos cannot be done with a bot, due to the messupability of a bot, whereas this is a fire-and-forget thing that will be running late at night. ST47 Talk 23:12, 12 October 2006 (UTC)
 * Given the changes that I asked for and the fact that this bot has a flag already, it will not affect RC or watchlists, as they can hide bot edits. I would approve this bot as its functionality is useful, but I would like other BAG input first. Betacommand (talk • contribs • Bot) 23:31, 12 October 2006 (UTC)
 * read ST47 Talk 23:36, 12 October 2006 (UTC)

While I agree that this bot is useful and good as per Betacommand, I also partially agree with Oleg Alexandrov, but not for the same reason. There are a number of potential changes: plurality, underscores, double-piped links, self-referential links, and interwiki links. I'll focus on the first 4. Plurality and double-piped links only serve to hurt the readability of the code behind the page, so there is a compelling reason to make these changes even though they are trivial and may enter people's watchlists. As mentioned above, bot edits to watchlists can be disabled, so this isn't a big deal. Self-referential links are less of a problem. They shouldn't happen, true, but they don't really hurt readability much. Nevertheless, I don't mind if those fixes occur. My problem is with changes that do nothing but change underscores. An underscore in a link is not a big deal and there is no compelling reason to make an edit to a page solely to change underscores to spaces. This adds database load without a clear benefit. So here is what I suggest. Do the plurality, double-piped link, and self-referential link fixes for any articles needing them, but only correct underscores if you are making one of the other 3 fixes. So basically only correct underscores if you were going to edit the page anyway. This will cut down on needless edit load to the databases. I don't mind if you process underscores during time periods where Wikipedia is experiencing low load. So if you do make underscore-only edits, make sure you do so at major off-peak times. -- RM 03:19, 13 October 2006 (UTC)
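
The rule suggested above can be sketched as a small decision function. This is only an illustration of the policy, not something AWB exposes; the `PendingFixes` fields and the `off_peak` flag are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PendingFixes:
    """Which kinds of fix a page needs (hypothetical structure)."""
    plurality: bool = False
    double_piped: bool = False
    self_referential: bool = False
    underscores: bool = False


def should_edit(fixes: PendingFixes, off_peak: bool = False) -> bool:
    """Edit if any substantive fix applies; underscore-only edits
    are allowed only during off-peak hours."""
    substantive = (
        fixes.plurality or fixes.double_piped or fixes.self_referential
    )
    if substantive:
        return True  # underscores get fixed along the way
    return fixes.underscores and off_peak


print(should_edit(PendingFixes(underscores=True)))                  # underscore-only, peak hours
print(should_edit(PendingFixes(plurality=True, underscores=True)))  # page is being edited anyway
```

Under this rule a page needing only underscore fixes is skipped during normal hours and picked up either in an off-peak run or whenever another fix touches the page.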

Approved per Betacommand's recommendations and with my restrictions on underscore-fix-only edits. Bot shall run with a bot flag, since the edits are fairly trivial and many. (For the record, if my restrictions are too burdensome, then Be Bold and ignore them.) -- RM 03:19, 13 October 2006 (UTC)

OK, though the underscore fix might not be removable; I'll have to check on that. Also, the list seems to have messed itself up, 'cause it hates me I guess, so I'll redo that. ST47 Talk 10:33, 13 October 2006 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.