User:Blevintron/BotWriterGotchas

This is where I collect the list of subtle issues that arise while writing a bot. This is meant to supplement Bot_policy. Check there first for the basics. AFAIK, these are not documented elsewhere, or at least not collected into a single page.

WP:Bots are annoying

Publishing your source code
Even if all the cool kids are doing it, you probably don't want to upload your bot's code to Wikipedia.

Explicitly say you're a bot
Imagine you have a lazy reader who doesn't make it as far as the opt-out line. Does he still know you're a bot?

You should at least avoid the first-person pronoun 'I'.

You need an opt-out mechanism
Typically, this takes the form of a {{Bots}} exclusion on the user or user talk page.
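A minimal sketch of how a bot might honor such an exclusion before posting. It assumes the common {{bots}} / {{nobots}} conventions (allow= and deny= lists, plus the special values all and none) and ignores less common parameters such as optout. The function name bot_may_edit and the bot name ExampleBot are made up for illustration.

```python
import re

# Matches {{bots}}, {{bots|...}}, {{nobots}} and {{nobots|...}} (case-insensitive).
_BOTS_RE = re.compile(r"\{\{\s*(nobots|bots)\s*(?:\|([^}]*))?\}\}", re.IGNORECASE)


def bot_may_edit(wikitext, bot_name):
    """Return False if the page text opts out of edits by bot_name.

    Simplified: only the first exclusion template is inspected, and only
    its allow= / deny= parameters are understood.
    """
    match = _BOTS_RE.search(wikitext)
    if match is None:
        return True  # no exclusion template at all
    if match.group(1).lower() == "nobots":
        return False  # simplified: parameters of {{nobots}} are ignored
    bot = bot_name.strip().lower()
    for param in (match.group(2) or "").split("|"):
        key, _, value = param.partition("=")
        key = key.strip().lower()
        names = {v.strip().lower() for v in value.split(",")}
        if key == "allow":
            return "all" in names or bot in names
        if key == "deny":
            if "none" in names:
                return True
            return not ("all" in names or bot in names)
    return True  # bare {{bots}} with no recognised parameter


if __name__ == "__main__":
    print(bot_may_edit("{{bots|deny=ExampleBot}}", "ExampleBot"))
```

Run the check against the current text of the user talk page immediately before each message, not only once at start-up, so that a freshly added opt-out is respected.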

You should advertise that mechanism
 * In every bot message
 * On the bot user page

Prove it won't be annoying
Before you even propose it, you should have a good idea of communication rates.
 * Something like 'no more than one message per week'.

You probably want to start a discussion on Village Pump to confirm that your proposed contact frequency won't annoy people. Keep it brief, and try not to let the conversation stray too far from the issue at hand. Use that discussion as evidence during your BRFA. My past mistakes might make your life easier:
 * VP discussion 1
 * VP discussion 2.

Collect Stats to Prove notifications are effective
Do editors act on those notifications? Do they act more than if they did not receive an active notification? Is a passive notification (such as adding the article to a maintenance category) equally effective? Collect numbers to prove your point.

DPL bot's BRFA demonstrates good evidence that notifications are effective.

Use substitution-transclusion to generate messages
This way the messages can be adjusted without touching the source code. Also, other editors will be able to adjust them (hopefully for the better).

You probably want to get those semi-protected.
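As a rough illustration of the substitution approach, the bot only needs to emit a {{subst:...}} call and let the wiki expand the message text when the edit is saved. The helper build_notice and the page name User:ExampleBot/Notice below are hypothetical, and the sketch assumes parameter values contain no '|' or '=' characters.

```python
def build_notice(template, **params):
    """Build a substituted template call for a user-talk message.

    The wording of the message lives on the wiki page named by `template`,
    so it can be edited there without changing the bot.
    """
    args = "".join("|%s=%s" % (key, value) for key, value in sorted(params.items()))
    return "{{subst:%s%s}}" % (template, args)


if __name__ == "__main__":
    # Hypothetical message page and parameters.
    print(build_notice("User:ExampleBot/Notice", article="Foo", count=3))
```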

Which bots already do this?
This is a known list of bots that contact users. Please add if you know more.
 * User:BlevintronBot - not yet approved.
 * User:Sinebot - reminds you to sign your posts.
 * User:DPL bot - notifies editors of dablinks.
 * User:OrphanBot - notifies image uploaders about impending deletion.
 * User:ImageTaggingBot - notifies image uploaders when they forgot to add license tags.
 * User:SuggestBot - opt-in only; suggests articles to edit based on prior contributions.
 * User:CSDWarnBot - notifies authors of a speedy-delete nomination. Kind of a bad example.
 * There is a whole category of newsletter delivery bots

Creating Result or Summary Pages
Sometimes it is useful for a bot to create a page that summarizes results. If you do this:
 * 1) Ensure that it is within your (or your bot's) User: space.
 * 2) Explicitly mark it with the {{User page}} template. Consider adding __NOINDEX__.

Copying Existing Article Content into User: Space
You may want your bot to copy article content into user space, possibly as part of a result summary, or maybe to create test cases. However, the original article probably contains tags that place it in categories, etc. You want to remove them, following a procedure that resembles the userification process.

If you do this, make sure you:
 * 1) Explicitly mark it with the {{User page}} template. Consider adding __NOINDEX__.
 * 2) Redact category membership tags from the content: replace [[Category:Foo]] → [[:Category:Foo]]
 * 3) Redact inter-language links (there is a large set of language codes)
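The two redaction steps above could be sketched with regular expressions roughly as follows. This assumes the intended category fix is to turn a membership tag such as [[Category:Foo]] into the plain link [[:Category:Foo]], and it uses a deliberately tiny, hypothetical sample of language codes; a real bot would need the full list.

```python
import re

# Category membership tags: [[Category:Foo]] or [[Category:Foo|sort key]].
# Links that already start with a colon ([[:Category:Foo]]) are not matched.
CATEGORY_RE = re.compile(r"\[\[\s*(Category\s*:[^\]]*)\]\]", re.IGNORECASE)

# Hypothetical sample only; the real set of language codes is much larger.
LANG_CODES = ("de", "es", "fr", "ja", "ru")
INTERLANG_RE = re.compile(
    r"\[\[\s*(?:" + "|".join(LANG_CODES) + r")\s*:[^\]|]*\]\]\n?",
    re.IGNORECASE,
)


def redact_for_userspace(wikitext):
    """Neutralise category tags and drop inter-language links."""
    # Prefixing a colon turns category membership into an ordinary link.
    text = CATEGORY_RE.sub(r"[[:\1]]", wikitext)
    # Inter-language links are removed, along with a trailing newline if any.
    return INTERLANG_RE.sub("", text)


if __name__ == "__main__":
    print(redact_for_userspace("Text\n[[Category:Foo]]\n[[de:Beispiel]]\n"))
```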

Avoid edit wars
Impose a timeout on an article after your bot edits.
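One way to sketch such a timeout is to remember when the bot last touched each article and refuse to edit it again until a cooldown has passed. The class name EditCooldown and the seven-day default are assumptions for illustration, and a real bot would persist the timestamps (or read them from the page history) instead of keeping them in memory.

```python
from datetime import datetime, timedelta, timezone


class EditCooldown:
    """Tracks the bot's last edit per article and enforces a waiting period."""

    def __init__(self, cooldown=timedelta(days=7)):  # 7 days is an arbitrary choice
        self.cooldown = cooldown
        self._last_edit = {}

    def record_edit(self, title, when=None):
        """Remember that the bot edited `title` at `when` (default: now, UTC)."""
        self._last_edit[title] = when or datetime.now(timezone.utc)

    def may_edit(self, title, now=None):
        """Return True if the bot has never edited `title` or the cooldown elapsed."""
        now = now or datetime.now(timezone.utc)
        last = self._last_edit.get(title)
        return last is None or now - last >= self.cooldown


if __name__ == "__main__":
    tracker = EditCooldown()
    tracker.record_edit("Example article")
    print(tracker.may_edit("Example article"))
```

If a human reverts the bot during the cooldown, the bot simply stays away instead of reapplying its change.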

Edit Summaries are fixed length
The limit is something like 250 characters. Be safe: keep your edit summaries under 230 characters or risk truncation.
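A small helper can enforce that margin before every save. The sketch below measures the summary in UTF-8 bytes, not characters, on the assumption that the server-side limit may be byte-based; that choice is at least as strict as counting characters. The name safe_summary and the ASCII "..." marker are illustrative choices.

```python
def safe_summary(summary, limit=230, ellipsis="..."):
    """Shorten `summary` so its UTF-8 encoding stays under `limit` bytes."""
    raw = summary.encode("utf-8")
    if len(raw) < limit:
        return summary
    # Leave room for the ellipsis and stay strictly below the limit.
    budget = limit - 1 - len(ellipsis.encode("utf-8"))
    # errors="ignore" drops a multi-byte character cut in half by the slice.
    return raw[:budget].decode("utf-8", errors="ignore").rstrip() + ellipsis


if __name__ == "__main__":
    print(len(safe_summary("x" * 500)))
```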

Date Formats
People are sensitive about the difference between DMY and MDY. Nice articles declare the preferred format using one of Use mdy dates, Use dmy dates, Mdy, or Dmy. When articles lack this, try to infer a date format: look for dates in known contexts in the article, and determine whether they put day or month first.
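The inference described above might look roughly like this. It first looks for a declared preference, then falls back to counting spelled-out dates anywhere in the text, which is cruder than restricting the search to known contexts. The function name infer_date_format and the regular expressions are assumptions, and only full English month names are recognised.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# "2 April 2007" style (day first).
DMY_RE = re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b")
# "April 2, 2007" style (month first).
MDY_RE = re.compile(r"\b(?:" + MONTHS + r") \d{1,2}, \d{4}\b")
# {{Use dmy dates}}, {{Use mdy dates|...}}, {{Dmy}}, {{Mdy}}.
DECLARED_RE = re.compile(
    r"\{\{\s*(?:use\s+)?(dmy|mdy)(?:\s+dates)?\s*[|}]", re.IGNORECASE
)


def infer_date_format(wikitext):
    """Return 'dmy', 'mdy', or None if the article gives no clear signal."""
    declared = DECLARED_RE.search(wikitext)
    if declared:
        return declared.group(1).lower()
    dmy = len(DMY_RE.findall(wikitext))
    mdy = len(MDY_RE.findall(wikitext))
    if dmy == mdy:
        return None  # no dates found, or an even split: do not guess
    return "dmy" if dmy > mdy else "mdy"


if __name__ == "__main__":
    print(infer_date_format("Opened on 2 April 2007."))
```

Returning None on a tie lets the caller skip the article or fall back to a neutral choice, which is safer than guessing.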

The {{date}} template is your friend.

Most templates employ ad-hoc formatting arguments, and no templates take advantage of the globally declared preference. For example,
 * Wayback requires date in a yyyymmdd format, and asks for a df parameter which controls whether the day will be displayed before the month. This is independent of any preferred format declared elsewhere.
 * WebCite requires date in any format, but accepts a dateformat parameter to control how it renders.

Your best bet is to carefully read template documentation.

Template Spacing
People are sensitive to the spacing around the parameters of a template. These two examples are 'very different' and interchanging them is a no-no.

2 April 2007

2 April 2007

In no case can you get away with this: 2 April 2007

Acknowledgements
Most of this I learned under the supervision of User:Hellknowz during my first BRFA.