Wikipedia:Bots/Requests for approval/SineBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

SineBot
Operator: slakr

Automatic or Manually Assisted: Automatic

Programming Language(s): PHP5 (cli).

Function Summary: Fill-in for HagermanBot, which has been missing in action.

Edit period(s) (e.g. Continuous, daily, one time run): continuous

Edit rate requested: 5 or so edits per minute during peak times, most likely much less during non-peak times.

Already has a bot flag (Y/N):N

Function Details:
 * I'll try to avoid transcluding the deja-vu details from User:SineBot, but it functionally aims to replicate HagermanBot, including exclusion lists and everything. It is also almost completely written/tested, with only two exceptions: first, I need to test out submitting the actual edit form using a sandbox, and second, I need to test out maintenance of the opt-out exclusion lists, since that also requires live edits.  Ideally, I would like to have the bot post would-be edit summaries to User:SineBot/Log for a few days so that others can notice any potential bugs/problems that I might miss and bring them to my attention.  Those edits would be queued to only occur, at max, every minute (or longer, if you would like).


 * I'm not sure what to do about the two categories HagermanBot previously monitored. I was originally planning on creating identical categories for "SineBot" instead of "HagermanBot."  However, I planned for this, so I can easily point the bot to any category/categories that I/others want it to monitor, so it can just as easily monitor the HagermanBot categories as well as its own, if that behavior is so desired.

If you have any questions/comments/concerns, please do not hesitate to ask.

-- slakr 19:13, 24 July 2007 (UTC)

Discussion
See Bots/Requests_for_approval/OverlordQBot. --Android Mouse 20:08, 24 July 2007 (UTC)
 * I figured that since there's been no activity on OverlordQBot for a month, it was more or less safe to submit this one, since it's already written and almost ready to go. -- slakr  04:45, 25 July 2007 (UTC)
 * Have you talked to OverlordQ? I asked him about this yesterday and he still appears to be working on it. --Android Mouse 06:42, 25 July 2007 (UTC)
 * Hmm... well, if that's the case, I say he should keep working on it, since there's really no reason to tell anyone to scrap either project. After all, if HagermanBot could disappear, any bot could; so, it would make sense to have a backup ready to go just in case anything happens in the future.  Likewise, if you'd rather his bot (whenever he finishes it) do the signing for now, that's totally cool as well.  Of course, if something doesn't work out-- no matter who the author is-- it would make sense that instead of waiting several months for a new bot to be developed we could instead have one ready to be activated on request.
 * Though, this does bring up an interesting point related to the discussion directly after this one: should we, instead of creating categories that are specific to certain bots, simply create more generic categories?  For example, instead of having "Non-talk pages automatically signed by SineBot/HagermanBot/WhateverBot," it might be a good idea to have, instead, a category named "Non-talk pages that are automatically signed" and "Non-talk pages with subpages that are automatically signed."  That way, the bot that's currently active would be the "primary" bot for handling that category.  If, for whatever reason, that primary bot goes down for an extended period of time/becomes inactive, the "backup" bot could be activated by its owner to assume the primary's duties without having to worry about changing category names and such.


 * Naturally, only one bot would be active at any given time, but another option would be splitting the bots' duties by namespace. Have one bot handle the "talk" namespaces while another handles non-talk pages and their subpages.  Long story short: they're all easy patches, and they wouldn't interfere with each other (excepting, perhaps, in placing tilde warnings on talk pages).  The "pain in the neck" would come with exemption lists, which technically could also be thrown into categories as well (i.e., instead of users using each individual bot's OptOut list, have users simply place a generic "opt out" category on their user pages that opts them out of autosigning). Honestly, that's how I would have designed it in the first place, but I wanted to replicate HagermanBot as closely as possible, so I stuck with the HagermanBot way of getting users to post to a bot subpage.  I actually wouldn't mind the category approach, though, because it'd actually make development easier, since populating categories from api.php is a quick'n'easy task that doesn't require extraneous edits. :P  -- slakr  08:51, 25 July 2007 (UTC)
 * I didn't mean to sound like I have a preference for him to run the bot rather than you. It honestly doesn't matter to me, I was just trying to make sure both of you knew you were working on the same thing. As for the categories, I like what you suggested of having a "Non-talk pages that are automatically signed" to avoid having to create yet another category if one of your bots should eventually need to be replaced for some reason. --Android Mouse 17:39, 25 July 2007 (UTC)


 * Would support the approval of this BRFA since his solution appears ready to go and I have an evil gremlin stuck in my code. Q  T C 19:59, 25 July 2007 (UTC)

The bot uses categories such as Category:Non-talk pages automatically signed by SineBot; will it also be compatible with HagermanBot's categories? — Madman bum and angel (talk – desk) 21:22, 24 July 2007 (UTC)
 * Yes. The two categories work just like HagermanBot's categories.  Pages added to Category:Non-talk pages automatically signed by SineBot will be monitored singly, while pages in category Category:Non-talk pages with subpages automatically signed by SineBot will have their children (i.e., any page created/active underneath them) monitored, but the parent, itself, will not be monitored unless it's added to the former category.  -- slakr  04:45, 25 July 2007 (UTC)
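The monitoring rule described above can be sketched roughly as follows. This is only an illustration in Python (the bot itself is written in PHP5); the function and argument names are hypothetical, and the sketch assumes a "child" page is any page whose title starts with the parent's title followed by a slash.

```python
def is_monitored(title, single_pages, parents_with_subpages):
    """Return True if `title` would be watched under the two-category scheme.

    single_pages: titles listed in the "automatically signed" category.
    parents_with_subpages: titles listed in the "with subpages" category.
    Both arguments are plain sets of page titles (an assumption of this sketch).
    """
    # Pages in the first category are monitored singly.
    if title in single_pages:
        return True
    # Pages in the second category have only their children monitored;
    # the parent itself is not matched here.
    return any(title.startswith(parent + "/") for parent in parents_with_subpages)
```

Under this sketch, a parent page is watched only if it also appears in the first set, which matches the behavior described in the reply above.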


 * Okay, so either you will need to include Category:Non-talk pages automatically signed by HagermanBot until users fix the category, or someone will need to do a run through and replace Category:Non-talk pages automatically signed by HagermanBot with Category:Non-talk pages automatically signed by SineBot and the like. I can replace the categories if it's needed. ~   Wi ki  her mit  06:38, 25 July 2007 (UTC)


 * Well, I think in that case, it should go the normal route since I'd personally prefer a more generic category rather than having two or three separate categories which serve exactly the same purpose. In any case, I'd definitely support a trial (and I'd obviously still be happy to help out in the testing department ;). It would also be interesting to see if any problems occur with multiple signing bots running simultaneously. There might be some issues with regard to race conditions but I don't anticipate that to be a major problem and, frankly, I'm not quite sure how the Mediawiki software handles these. -- S up? 08:24, 25 July 2007 (UTC)


 * Lol, you said what I was about to say: I can technically do both.  The inclusion array of categories inside the bot doesn't care if there are redundancies or duplicates (e.g., some people list talk pages redundantly in the above categories or list subpages despite a page already being listed with subpages).  It doesn't really matter; so, I would say it could watch both HagermanBot's old categories and the bot's new categories, because even if a certain page is found in both, it won't matter.  The increased server load is trivial, because the bot caches the contents of the categories every five minutes, so an extra category request to api.php is relatively simple.  I should point to the above discussion, however, to note a similar issue with categories.  We could simply convert the category names over to something more generic so that, heaven forbid, should anything happen to this bot, future bots won't have this tedious issue.  With regard to race conditions, I noted that above with regard to possibly having a primary and secondary bot, or splitting duties between varying namespaces. -- slakr  08:51, 25 July 2007 (UTC)
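The caching behavior mentioned above (several categories merged into one inclusion list, refreshed every five minutes, with duplicates being harmless) might look something like the sketch below. It is written in Python purely for illustration; `CategoryCache`, `fetch_members`, and the injectable `clock` are hypothetical names, and the real bot's request to api.php is stood in for by whatever callable is passed in.

```python
import time


class CategoryCache:
    """Merge the members of several categories and refresh them periodically."""

    def __init__(self, fetch_members, categories, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch_members          # callable: category name -> iterable of titles
        self._categories = list(categories)  # e.g. old and new category names
        self._ttl = ttl_seconds              # five minutes, per the discussion
        self._clock = clock
        self._pages = set()
        self._fetched_at = None

    def pages(self):
        """Return the merged set of titles, refetching only when the cache is stale."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            merged = set()
            for category in self._categories:
                # A set makes redundant listings across categories irrelevant.
                merged.update(self._fetch(category))
            self._pages = merged
            self._fetched_at = now
        return self._pages
```

With this shape, watching HagermanBot's old categories alongside the new ones costs one extra listing request per category per refresh, not one per edit.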


 * I'm a big fan of the proposal to make the category name more generic at this time, regardless of whether or not OverlordQBot and/or SineBot get/gets approved. Bot turnover happens.  And when it does, we don't want to deal with this again.  — Madman bum and angel (talk – desk) 13:30, 25 July 2007 (UTC)
 * I am also a big fan of both proposals. I'd also support a generic category so that any bot can go and utilize it.  I can move the categories with my bot if my bot request is approved if you'd like. ( [ →]O - RLY?) 17:16, 25 July 2007 (UTC)

Cool. Well, it seems from the replies that we agree that more general categories would be ideal for future-proofing. Tentatively, I'd like to propose these as replacements:
 * Category:Non-talk pages that are automatically signed
 * ...replaces Category:Non-talk pages automatically signed by HagermanBot
 * Category:Non-talk pages with subpages that are automatically signed
 * ...replaces Category:Non-talk pages with subpages automatically signed by HagermanBot
I also like the idea of adding a third category, which would deviate from the original HagermanBot, but would also help to future-proof: a category for users who opt out of autosigning. Essentially, a user would place the category on his User: page, and the bot would simply read category members of that category for User: namespace pages and stop signing comments from those users. It would ignore non-User: space pages added to that category. Thus, if it's okay with everyone else, I would like to propose, in addition to the two categories above, a third category:
 * Category:Users who have opted out of automatic signing
 * ... new bot-neutral category for "Opt out" users.
If anyone has any other ideas, please lemme know. :) -- slakr  17:29, 26 July 2007 (UTC)
 * Eek, I should also mention that opting back in would simply involve the user removing himself from the category by removing the category from his user page and waiting a couple of minutes for the bot to refresh its internal cache of category members. -- slakr 17:32, 26 July 2007 (UTC)
 * I don't think there is a need for an opt out category. Just use something like  or use Bots.  ~   Wi ki  her mit  21:50, 26 July 2007 (UTC)
 * I considered that, but best-case scenario, that significantly increases processing time and bandwidth usage. Consider:
 * User "John" makes an unsigned comment to Talk:Sharks.
 * If John does not want his comments signed and he uses the category/opt-out list method, the bot simply skips over this RC entry completely and moves on to more important things. If John does want his comments signed (i.e., he hasn't opted out), processing continues:
 * Bot has to download John's revision to Talk:Sharks (at least, until I/someone else code(s) diff support into api.php), or rely on kooky, possibly unpredictable calls to the wiki diff of the page. Under the category method, if it then determines the comment is unsigned, the bot adds the edit adding the signature and continues on to more important things.
 * BUT, under the method, once the bot has determined an edit is suitable for autosigning, an extra step is needed every single time.  That is, the bot is forced to surf on over to User:John, load his page completely, and check to see if there's a  tag or comment instructing the bot to ignore the page found anywhere on the page, and if so, discard all the work it did for nothing.  Bear in mind: the bot cannot cache this information, because there will be no way to know when/if John opts back in or opts back out again without first checking his user page; because, the bot would ignore future edits from John-- including edits to his User:John page.  So, in order to be fully-compliant, the bot would be forced to surf User:John every time John makes an autosignable unsigned edit, and it will also be forced to load every candidate talk page that John edits that might qualify for an autosign.  Once the bot has finally determined whether John still wants the bot to sign his comments, the bot can then add the new edit back to Talk:Sharks and move on to John's next edit to Talk:Dolphins and repeat the same process over again.


 * Long story short, the category (or per-bot opt out) methods are both significantly faster, easier on the servers, and bandwidth-friendly for both the maintainer of the bot and wikimedia when compared to the roundabout method.  Extrapolated further, the per-bot exclusion lists actually add a trivial amount of extra processing requirements, as they could be viewed as database clutter (each opt in or opt out is a new revision row; or, more simply put, a new entry in the "history" log).  The choice, then is not between the category and the  method, but instead, the real choice is between per-bot exclusion lists ("the HagermanBot way of doing it;" not future proof) and bot-independent exclusion lists ("the category method;" future proof). -- slakr  02:34, 27 July 2007 (UTC)
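The cost argument above can be made concrete with a rough request count. The Python sketch below is illustrative only: the function names are hypothetical, every edit in the input is assumed to be an unsigned candidate, and "one request" is a simplification of whatever the bot actually fetches.

```python
def opted_out_users(category_members):
    """Keep only User: namespace pages from the opt-out category, as bare user names."""
    prefix = "User:"
    return {m[len(prefix):] for m in category_members if m.startswith(prefix)}


def requests_category_method(edits, opted_out, refreshes=1):
    """One category listing per cache refresh, plus one revision download
    for each edit by a user who has not opted out."""
    return refreshes + sum(1 for user, _page in edits if user not in opted_out)


def requests_userpage_method(edits):
    """Every candidate edit costs a revision download plus a fresh load of the
    author's user page, because the opt-out state cannot be cached."""
    return 2 * len(edits)


edits = [("John", "Talk:Sharks"), ("John", "Talk:Dolphins"), ("Jane", "Talk:Sharks")]
opted_out = opted_out_users(["User:John", "Talk:Sharks"])  # the non-User: entry is ignored
print(requests_category_method(edits, opted_out), requests_userpage_method(edits))
```

In this toy case John's two edits are skipped outright under the category method, while the user-page method pays for both the revision and the user page on every edit.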


 * Clarify. Does the user opt out, or is the page opted out? ~   Wi ki  her mit  06:07, 27 July 2007 (UTC)
 * bots makes sense to opt-out a page. A user cat or centralised list makes the most sense to opt-out a user. --ais523 07:55, 27 July 2007 (UTC)
 * You read my mind-- that's exactly what I was trying to say. :P I was already using  for exempting pages normally monitored by the bot, so I was saying that the exempting users should be done using a bot-neutral user exemption category to avoid unneeded overhead.  I think I made it sound more complicated than it actually is :D  -- slakr  08:16, 27 July 2007 (UTC)
 * for the log page, to test lists. If that works, a non-userspace trial should be simple. Matt/TheFearow (Talk) (Contribs) (Bot) 08:38, 30 July 2007 (UTC)
 * Start running the trial in userspace once it's finished; I'm keen to see how well this works. Matt/TheFearow (Talk) (Contribs) (Bot) 01:50, 31 July 2007 (UTC)
 * Cool, will do. I'm finishing off the token stuff so that it can make edits right now.  I'll let you guys know when you can start playing. :)  -- slakr  02:27, 31 July 2007 (UTC)

Well, it should be safe to play with the bot now. It's only monitoring the default talk pages and new categories only (not hagermanbot's old ones). If you want to play in the sandbox, the bot will behave/edit exactly the way it would outside its namespace. If you would like to see a summary of what it would do if fully-enabled to do global edits, visit its log page, which is updated every five minutes or so with the edits it has done in the meantime. I included a summary on all signature edits showing where it would place the signature, plus a convenient little diff to point you to the change. Everything should be working, so if you encounter any problems, see any odd behavior, or simply have any questions, feel free to drop me a line. Cheers. :) -- slakr 19:46, 31 July 2007 (UTC)


 * A user signed with three tildes, only producing their name. You should place undated on these edits (undated). The same with dated only ones (five tildes), should use unsigned. Lastly, you should place tilde on the talk pages of users who have left unsigned comments three or more times. Once you get these sorted out, everything looks good. Matt/TheFearow (Talk) (Contribs) (Bot) 21:49, 31 July 2007 (UTC)
 * Technically, HagermanBot didn't mess with non-dated/dated-only type things. I'm somewhat iffy on it, though, seeing as there are tons of weird signatures out there.  However, I went ahead and added those features in just for you. :P  As for the  warnings, I'll make/test that tomorrow (and obviously before making the bot live).  My brain's fried from work+an exam earlier today. :\ -- slakr  05:17, 1 August 2007 (UTC)
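The name-only and date-only cases raised above could be told apart along the lines of the following Python sketch. It is an assumption-laden illustration, not the bot's actual logic: the regular expressions, the function name, and the returned action labels are all hypothetical, and real signatures are far more varied than these patterns allow.

```python
import re

# Hypothetical patterns: a link into the User or User talk namespace,
# and a timestamp in the form "19:13, 24 July 2007 (UTC)".
USER_LINK = re.compile(r"\[\[\s*User(?: talk)?:", re.IGNORECASE)
TIMESTAMP = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")


def signature_action(comment):
    """Classify a comment by which parts of a signature it already carries."""
    has_name = bool(USER_LINK.search(comment))
    has_date = bool(TIMESTAMP.search(comment))
    if has_name and has_date:
        return "already_signed"      # four tildes: nothing to do
    if has_name:
        return "add_undated"         # three tildes: name present, date missing
    if has_date:
        return "add_unsigned"        # five tildes: date present, name missing
    return "add_full_signature"      # neither part present
```

The labels follow the suggestion in the comment above (undated for name-only, unsigned for date-only); how the bot words the resulting edit is not specified here.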


 * Also, provide me with hagermanbots categories and the neutral categories, and I will speedy rename them in AWB. Matt/TheFearow (Talk) (Contribs) (Bot) 21:52, 31 July 2007 (UTC)
 * I posted them above (paragraph starting with "Cool. Well"), but I forgot the "Category:" part in the link, so I went ahead and fixed that so that the actual categories are no longer redlinked. If in doubt, you can always check the bot's user page  -- slakr  05:17, 1 August 2007 (UTC)


 * Last question, does this patrol all talk namespaces? E.g. talk, user talk, cat talk, template talk, wp talk, mw talk, portal talk, etc? Matt/TheFearow (Talk) (Contribs) (Bot) 21:54, 31 July 2007 (UTC)
 * As it stands, yes. However, should any namespace need to be enabled/disabled, it's a quick fix.  -- slakr  05:17, 1 August 2007 (UTC)
 * Ok, I'm renaming the categories now. Should be done within the hour. Matt/TheFearow (Talk) (Contribs) (Bot) 07:37, 1 August 2007 (UTC)


 * I've moved most of them, should be good enough for the trial. Request the others on CFD/W. Matt/TheFearow (Talk) (Contribs) (Bot) 08:26, 1 August 2007 (UTC)

Added the tilde warnings and such, and just did a live trial run. A couple things seem to have happened:
 * Erring on the side of caution, the bot threw warnings on edit pages that looked funky to it (though it said the wrong reason). I might end up removing those, as had it made those actual edits, it would have been okay.
 * Had a couple problems with non-standard signatures.
 * One was plausible, but outlandish as it was a "User talk" only signature, and even worse, it was prefixed with a colon, which I hadn't predicted. The bot did detect the timestamp, however, so I'll add in an exception for people prefixing stuff with a colon.
 * The other is as best as I could have possibly predicted. The signature directs to two separate users, neither of which are the actual user account that made the edit.  I have no idea how I can fix that one without radically changing the bot's behavior, so I think I'm just going to leave that user to simply opt out if he insists on using 3 separate pages. The only other option is to simply move on when it finds "user talk" within a wikilink.  *shrug* Ideas?


 * So, for now I'm letting the bot rest and I'll fix those two small bugs tomorrow. Lemme know of any questions/comments/concerns.  Cheers :)  -- slakr  08:31, 2 August 2007 (UTC)
 * Great, everything looks good. Those issues are a bit unusual, maybe best thing is to look for anything starting with --Matt/TheFearow (Talk) (Contribs) (Bot) 10:05, 2 August 2007 (UTC)
 * I am in full support of this bot request. Just FYI, I like how the bot is running and believe it will be a good replacement for HagermanBot. — E  talkbots 11:15, 10 August 2007 (UTC)

Do we really want a HagermanBot replacement? I remember asking around about whether HagermanBot should be replaced about a month ago and opinion was pretty divided about whether this sort of Bot is actually helpful. Might it be worth a wider community discussion about whether we want a bot signing unsigned posts? WjBscribe 19:22, 10 August 2007 (UTC)


 * Well, on pages like the help desk, ref desks, etc it is extremely annoying trying to help people not knowing who they are - often there are many replies since then and the diffs crash browsers due to huge page size. Matt/TheFearow (Talk) (Contribs) (Bot) 23:28, 11 August 2007 (UTC)

Quick update: I and the bot have been inactive since last week as I've been out of town visiting dad, so sorry for the inactivity, but I wanted to take a coding vacation :P. Anyway, I fixed the open issues/bugs:
 * Trial run archive. Since the current log has changed since the original trial run, the original trial run is now in the log page's history:  available here.  The issues there are detailed in the next two bullet points.
 * Weird sigs. Weird signatures are now better detected by searching for them more generically (it'll detect " [[User " as a valid user portion of a signature).  I think this will work better, since worst case the bot simply assumes that the post was signed.  Plus, it's a good bet that if there's a " [[User " in the last non-blank line of the edit that it's a signature of some sort anyway.  The only downside is that now the bot won't detect whether someone is fake-signing for someone else (i.e., one user makes it look like another user is signing the post that the first user actually wrote).
 * Edit abort. I changed the way the bot checks for edit conflicts and whether it's recreating a deleted talk page (it was too touchy during its trial run and aborted when it shouldn't have aborted).  Now it will properly detect whether or not it should edit a page or simply run away.

If there is anything I forgot, please let me know. -- slakr 22:36, 13 August 2007 (UTC)
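The more generic "weird sigs" check described above, which treats a user link in the last non-blank line as a signature, can be sketched in Python as below. The names are hypothetical, and two details are assumptions of this sketch: the match is case-insensitive, and an optional leading colon inside the link is tolerated (to cover the colon-prefixed signature seen in the trial).

```python
import re

# Hypothetical pattern: "[[User" optionally preceded by a colon inside the link.
USER_PORTION = re.compile(r"\[\[\s*:?\s*User", re.IGNORECASE)


def last_nonblank_line(text):
    """Return the last line of `text` that contains something other than whitespace."""
    for line in reversed(text.splitlines()):
        if line.strip():
            return line
    return ""


def looks_signed(added_text):
    """Assume the post is signed if its last non-blank line links into User space."""
    return bool(USER_PORTION.search(last_nonblank_line(added_text)))
```

As noted in the comment above, the worst case for a check like this is a false "signed" result, so the bot errs toward leaving a comment alone.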

Section break
Everything looks good, since you've made some bigish changes, Everything else looks good. Matt/TheFearow (Talk) (Contribs) (Bot) 22:42, 13 August 2007 (UTC)
 * I do not approve of this. I hated HagermanBot. I'd find some troll made a comment, go to revert, and end up reverting HagermanBot. I think this warrants further community discussion before being adopted as a bot. At the very least, this bot needs an opt out mechanism. Does that opt out category work? --Deskana (banana) 23:55, 13 August 2007 (UTC)
 * All opt out categories do work, as specified above (unless they changed something). It does make it easier to identify which troll made the comment, especially on high usage pages where it's often lost in the edit history. Matt/TheFearow (Talk) (Contribs) (Bot) 23:57, 13 August 2007 (UTC)
 * I still feel this warrants further community discussion, rather than BAG discussion. Only my opinion though... --Deskana (banana) 00:08, 14 August 2007 (UTC)
 * Although this is one of the points of a trial, I sort of agree with what you say, so hold the trial until further discussion. Probably best to discuss at the Village Pump. Matt/TheFearow (Talk) (Contribs) (Bot) 00:37, 14 August 2007 (UTC)
 * Mmmk. I'll hold off. And yes, all the categories (as well as the bot inclusion/exclusion tags) all work.  So question:  how do you guys want to handle this?  Would you like me to post to the pump or would you like to do so instead?  Also, if the bot's quickness is a problem, a delay could be implemented on certain badwords/tilde-warned people so that the bot will wait a minute or two before adding signatures in order to allow vandalism reversion; then, it would abort if the page has changed in that period of time (just like it already does).  The downside is that on frequently-changing pages, like AIV, it might nullify the bot's beneficial aspects.  However, technically-speaking, I could manually add a "quick list" of sorts that makes the bot always quickly sign unsigned contribs on certain pages.  Alternatively, we could create a tag much like  to do something similar with regard to modifying the default timing behavior of the bot.  -- slakr  01:07, 14 August 2007 (UTC)
 * Make it wait a minute or so, but skip if the unsigned comment is still there, or if there are no edits in the meantime with a summary containing "rv" or "revert". Matt/TheFearow (Talk) (Contribs) (Bot) 01:24, 14 August 2007 (UTC)
 * Hmm, I'm kind of confused as to how that's worded, since the "skip" is kind of confusing to me in this context (as I equate "skip" with the bot NOT signing a comment). I'm assuming that you mean that the bot should wait for a minute or so, THEN sign the comment-- so long as there haven't been edits/signings in the meantime.  Please correct me/rephrase/give an example if I'm totally reading that wrong :P.  -- slakr  01:44, 14 August 2007 (UTC)
 * Almost right, but what I meant is wait a minute or so (except on certain pages), and if the unsigned comment is still there without anything new on the same line(s) as it, then sign it. Basically, if it isn't removed within a minute, sign it. Otherwise, ignore it. Matt/TheFearow (Talk) (Contribs) (Bot) 04:56, 14 August 2007 (UTC)
 * Done. Details on changelog - 1.0.0pre5. Basically I took "a minute or so" to be somewhere between 1 and 2 minutes.  So I made it a minute and a half, which actually seems like a good time for now.  It allows people to go "OMG! I forgot to sign" and still be able to make changes, and it also allows people/bots who have pretty much live rc feeds enough time to revert obvious vandalism.  However, the bot will skip adding signatures if there is a new revision within those 90 seconds, and if the new revision is sufficiently close to the old one, it will skip signing altogether, because it assumes that someone is making corrections to either their own text or someone else's text.  I'd rather play it safe on that one, because on cursory inspection, substantially more logic would be needed to diff across multiple revisions without worrying about putting a signature in the wrong place/for the wrong person.  Therefore, the bot will simply abort adding signatures on an outdated revision (hence the reason for the high-priority list for pages that frequently change).  By the way, the delay could easily be increased, though, if need be.  -- slakr  06:15, 15 August 2007 (UTC)
 * Oh yeah, so, ermm, what should I do now? -- slakr  04:47, 16 August 2007 (UTC)
 * BAGAssistanceNeeded -- slakr 02:39, 17 August 2007 (UTC)
 * It all looks good, and the delay is fine. Provide the list of high-priority pages, then . Matt/TheFearow (Talk) (Contribs) (Bot) 03:15, 17 August 2007 (UTC)
 * Cool. I added a dedicated page for that (plus instructions for people who want to add/remove stuff to simply contact me) at User:SineBot/HighPriority.  I'll let you know when the trial run is finished.  -- slakr  01:42, 18 August 2007 (UTC)
 * Well, quick hold up. I started out the trial fine, but apparently Wikipedia's squid servers were having issues, as referenced by all of the warnings in the live run.  I stopped the trial because, in order to catch up with the signatures, the bot would have had to wait for the error to clear and then process its queue, which would cause it to exceed 4 epm.  The errors take the form of:

Request: POST http://en.wikipedia.org/w/api.php, from 66.230.200.147 via sq22.wikimedia.org (squid/2.6.STABLE12) to 10.0.5.3 (10.0.5.3)

Error: ERR_ZERO_SIZE_OBJECT, errno [No Error] at Sat, 18 Aug 2007 02:48:47 GMT
 * Thus, I'm confident this isn't a bot issue but instead a squid issue, since I haven't had problems with this until today, and the problem only happens sporadically from what I can tell. The people in #wikimedia-tech say it's a temporary problem until this one crashed slave fixes itself, so I'll try re-running tomorrow, which should hopefully give the problem time to heal itself.  -- slakr  03:02, 18 August 2007 (UTC)
 * Sure, post your report when you think all is OK and we can get this rolling. — E  talkbots 09:23, 18 August 2007 (UTC)
 * Edits look fine and alright to me. You're at 5EPM. —  E  talkbots 09:53, 19 August 2007 (UTC)
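The 90-second delay and high-priority list discussed earlier in this section amount to a small decision rule, sketched here in Python for illustration. The function name, argument names, and the three returned labels are hypothetical; the sketch assumes the bot can compare the revision it queued against the page's current latest revision, and it does not model the "sufficiently close" comparison between revisions.

```python
def decide(page, edit_rev_id, latest_rev_id, seconds_since_edit,
           high_priority_pages, delay=90):
    """Decide what to do with a queued unsigned edit.

    Returns "abort" if the page has moved on, "sign" if it is time to sign,
    and "wait" if the delay has not yet elapsed.
    """
    # A newer revision means someone corrected, reverted, or replied:
    # the queued edit is outdated, so the bot gives up on it.
    if latest_rev_id != edit_rev_id:
        return "abort"
    # Pages on the high-priority list are signed without waiting.
    if page in high_priority_pages or seconds_since_edit >= delay:
        return "sign"
    return "wait"
```

For example, an untouched edit on an ordinary talk page would yield "wait" at 30 seconds and "sign" at 90, while the same edit on a page listed at User:SineBot/HighPriority would yield "sign" immediately.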


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.