Wikipedia:Bots/Requests for approval/HostBot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

HostBot
Operator:

Time filed: 01:02, Saturday July 7, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python, uses WikiTools

Source code available: https://github.com/jtmorgan/hostbot/tree/master/new_editor_invites

Function overview: A proposed extension of HostBot's duties to include inviting selected new editors to participate in WP:Teahouse by posting an invite template on their talk pages.

Links to relevant discussions (where appropriate): Wikipedia_talk:Teahouse/Host_lounge/Archive_5

Edit period(s): Daily

Estimated number of pages affected: 70 - 100 pages per day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Teahouse relies on direct outreach to new editors for a great deal of its traffic, but manually inviting people via talk page templates is time-consuming. Many invites need to be sent each day, they go to a subset of new editors drawn from a pre-filtered list, and the task is at present fairly tedious and involves little personal interaction. We therefore feel it is a good candidate for experimenting with automation.

HostBot currently publishes a daily invitee report which consists of a list of ~50-100 new editors who meet a set of baseline criteria for invitation. These criteria are intended to screen out both account creators who don’t intend to edit seriously, and blatant vandals.

Currently, Teahouse hosts manually invite this set of editors using a standardized talk page invite template. Because of hosts’ other time commitments, on many days no invitations are sent out at all, so many of the promising new editors that the Teahouse could serve don’t hear about it. We would like to try automating the invite process in order to see whether it increases good-faith traffic to Teahouse and allows volunteers to spend time on more personal tasks, such as answering questions and welcoming new editors who do visit Teahouse.

We’re aware that automatic welcoming is one of the perennially rejected bot proposals, and that some might think automated invitations are similar. The reasons generally given for rejecting welcome bot proposals are:
 * 1) If a bot is used, it is cold and impersonal, and the bot is incapable of mentoring and assisting newcomers.
 * 2) The bot would make thousands of pointless edits welcoming vandals and accounts that never make an edit.

These are valid concerns for welcoming on Wikipedia, as welcome templates are delivered to a majority of new editors each day, but they don’t apply to the situation of Teahouse invites.


 * “If a bot is used, it is cold and impersonal...” - The bot-delivered template will be similar to the hand-pasted or Twinkle-assisted template that is currently used for invitations. Although these are delivered by humans, they aren’t very personalized, but they do direct editors to a place where they can receive highly personalized help. Bot delivery would serve the same function.
 * “The bot is incapable of mentoring and assisting newcomers.” - Although HostBot itself isn't capable of mentoring or assisting newcomers, it will link people to Teahouse, where rapid assistance is available from real people 24 hours a day via the Q&A board and IRC. We intend to adjust the language of the invite template so that new users are aware that it was delivered by a bot, not a human.
 * “The bot would make thousands of ...edits” We have no intention of inviting more than 100 new editors per day through this automated process, a relatively small number of edits.
 * “...welcoming vandals...” All potential invitees will be checked against the block log and any editors who are or have been blocked will be excluded. Furthermore, we have found that only a small percentage (around 10%) of the editors who meet our invitation criteria are subsequently blocked. In fact, we saw no significant difference between the percentage of subsequently blocked users in a group of manually invited new editors and a control group of new editors who fit our basic invitation criteria, suggesting that--in this situation at least--humans are not necessarily better at predicting which new editors are likely to vandalize than a bot would be.
 * “welcoming... accounts that never make an edit.” We will only invite new editors who have already made several edits during their first 24 hours; we will not invite all new account creators. The current 10 edit/24 hour threshold was established based on research that showed account creators who make 10+ edits in their first 24 hours are significantly more likely to continue editing than those who show less initial activity.
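The screening criteria described above (at least 10 edits in the first 24 hours, no block log entries, and a ceiling of 100 invites per day) can be sketched roughly as follows. This is only an illustration, not HostBot's actual code: the `NewEditor` record and the function names are hypothetical, and the real bot would presumably pull this data from its own database queries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# Threshold taken from the request text: 10+ edits in the first 24 hours.
MIN_EDITS = 10
WINDOW = timedelta(hours=24)


@dataclass
class NewEditor:
    """Hypothetical record; not the structure HostBot actually uses."""
    name: str
    registered: datetime
    edit_times: List[datetime]
    ever_blocked: bool  # True if the block log has any entry for the account


def is_invitable(editor: NewEditor) -> bool:
    """Return True if the editor meets the criteria described above."""
    if editor.ever_blocked:
        return False
    cutoff = editor.registered + WINDOW
    early_edits = [t for t in editor.edit_times if editor.registered <= t <= cutoff]
    return len(early_edits) >= MIN_EDITS


def select_invitees(editors: List[NewEditor], daily_cap: int = 100) -> List[str]:
    """Apply the filter and enforce the stated ceiling of 100 invites per day."""
    return [e.name for e in editors if is_invitable(e)][:daily_cap]
```

Keeping the thresholds as named constants would make it easy to adjust them if the trial suggests different cut-offs.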

We invite your feedback on how to make the automated invite process compliant with existing bot policies and best practices; we want to do this right. If approved, we intend to implement automatic Teahouse invitations via HostBot on a trial basis, for a period of 2 weeks. Bot behavior will be monitored daily by Teahouse hosts during this period, who will perform spot checks to make sure the bot is performing as designed. After the trial, we will assess whether this invitation process has led to an increase in vandalism, and/or whether the bot has had any other unforeseen impact.

Discussion
BAG assistance needed - No action taken on the request by the BAG for almost a week now. --Nathan2055talk - contribs 17:30, 10 July 2012 (UTC)
 * Well, I personally think this is fine. It seems to have been given more thought than your average "welcome bot" and it's for a good purpose. I'm inclined to go along with the two-week trial; does anyone else have thoughts on this before we give it the go-ahead? —  The   Earwig   (talk) 20:24, 10 July 2012 (UTC)
 * Just for the record, a mock BRFA on this was held at the Teahouse host lounge prior to this BRFA. Just thought I would mention it for the record. --Nathan2055talk - contribs 17:05, 11 July 2012 (UTC)
 * Just for your information: it is already linked above at the standard questions... mabdul 18:26, 11 July 2012 (UTC)
 * - Must have missed that. Anyway, there's the link. --Nathan2055talk - contribs 18:30, 11 July 2012 (UTC)

—  The   Earwig   (talk) 15:09, 13 July 2012 (UTC)

Has the source code been published? It looks like it has some quirks in it. --MZMcBride (talk) 15:55, 15 July 2012 (UTC)

When I first joined Wikipedia, I was welcomed by a person. I still remember that. I'm not sold on the idea that having a welcome bot, even if limited by specific parameters (edit count, block status, etc.), is a good idea. Hasn't a good amount of the relevant research (by both your team and other groups) indicated that new users—in particular—value human interaction, not botspam? --MZMcBride (talk) 15:55, 15 July 2012 (UTC)

It would be helpful if you could clearly define what problems you're attempting to address with this bot. Fundamentally, using a welcome bot is a bad idea. I say so, the community has said so on countless occasions, and nothing in this request makes it clear that anything has changed that requires a re-evaluation of this position. You're certainly not the first bot operators to suggest adding filters/constraints to the input list. This is a perennially denied request for a reason.

If you can begin to define the problems you're trying to address, it might be helpful in developing actual solutions to those problems. What you've currently created is a wall of text to obfuscate the fact that you're simply re-proposing what has been previously (and rightly) rejected countless times. --MZMcBride (talk) 16:05, 15 July 2012 (UTC)


 * I think some of this has already been answered. First off, no one disagrees that getting a personalized message from a specific user (e.g. "thanks for doing x ") is the best kind of welcome. Currently, this is quite rare, with the majority being automatic through Twinkle, etc. Even though these are left manually or semi-manually by actual users, the message itself is mass-produced and not any more friendly than the message the bot will leave.
 * Re "what problems are we trying to address?", my understanding is this: "One of the big items [...] is to increase the number of invitations sent to new editors every day. To make sure that Teahouse hews closely to its mission of active outreach to very new editors, we need to invite more of them. But we know that manually inviting people is time-consuming and tedious. During the pilot period, Sarah and RosieStep took on the lion's share of invites, and the rest of us invited a little, a lot, or not at all depending on our availability and inclination. I don't think that model is sustainable." The bot does not seem significantly more impersonal than the "personal" welcomes currently being left and there is no more "human" interaction in that than this. Actual interaction comes from discussion after the welcome, which in this case is facilitated by the teahouse, and in the standard case, is facilitated by the welcomer's talk page. Are you challenging the mass-welcoming of users with boilerplate messages or the use of a bot to do so? I can understand the former, but that is a process that has been going on for years and extends far beyond this single request. The latter (the purpose of this request) is just to make that easier on the teahouse itself so they can spend more time on mentoring than inviting. —  The   Earwig   (talk) 16:51, 15 July 2012 (UTC)
 * Earwig hit my point exactly. This could be considered more of an "invite-bot" than a welcome-bot. Ryan Vesey  Review me!  16:53, 15 July 2012 (UTC)
 * Can I bring to the table that the WMF is known to make bots that invite people to their projects based on their editing habits (I don't mean that the TH is just another WMF project, I'm talking in general about projects such as WP:NPT)? This really could simply be considered a large-scale invite bot, just as Earwig said. --Nathan2055talk - contribs 19:26, 15 July 2012 (UTC)
 * - Sorry about the double post, but the bot hasn't been turned on yet. --Nathan2055talk - contribs 19:28, 15 July 2012 (UTC)
 * I think the J-MO is at Wikimania, he'll be able to start the trial soon. Ryan Vesey Review me!  19:52, 15 July 2012 (UTC)


 * Earwig: I'm challenging the use of a bot here. As I said, I don't see anything said above that negates the reasonable conclusion that many previous discussions about this topic have come to: a welcome bot is a poor idea.
 * The Teahouse seems to have plenty of active volunteers. Perhaps better tools are needed for these volunteers to be able to do outreach? I completely agree with you that a personalized welcome message would be best. Why is that so difficult? What can be done to make it simpler? Focusing on that problem might actually work toward resolving the underlying issue, rather than throwing a spambot at the problem. (The idea presented on this page that this is not a welcome bot and is instead an invite bot is simply absurd. They're all particular types of spambots. I think I would know, I run one of the most obnoxious ones.)
 * If people are going to devote resources to working on this problem, I'd like to see it done by actually working toward a sustainable solution. I don't believe there's any distinction here between what has been previously rejected and what is being proposed. Automated welcoming should be avoided. While users can certainly engage in welcoming by hand (just as they can fix wikimarkup by hand or do spell-checking by hand or do whatever else by hand), that is not a reason to condone the use of bots here (just as we perennially ban spell-checking bots and bots to fix wikimarkup and whatever else). --MZMcBride (talk) 23:21, 15 July 2012 (UTC)


 * The real problem being solved is that it takes freaking forever to invite the same people "by hand" that the bot would invite. I still say that if the people who disagree with using a bot (to show people the friendly and peopled place we are waiting to welcome them to) want to spend a few hours a week inviting people then yes, awesome! we don't need it. But we do. Because it makes very little sense not to. It makes a lot more sense to spend all of that time and energy with people who need help or greeting people who want to interact. Basically I am saying the same thing as J-Mo, but repetition for emphasis or something I guess. heather walls (talk) 22:32, 16 July 2012 (UTC)


 * I have to second that: as I already mentioned in the 'pre-BRFA' at the Teahouse page, I'm strictly against welcoming bots. I see the need for the TH to welcome users, and that new users (especially) should get attention and be told that there are help boards like the TH or the HD out there; this is the actual reason I had nothing against implementing the (additional checkbox for the) automatic TH invitation in the AFC helper tool. I believe there are more tools out there which could also get such an invitation addition. mabdul 23:46, 15 July 2012 (UTC)

A few brief points (wouldn't want to be a textwaller!): first, the trial will need to be postponed a week anyway as I crawl my way out of Wikimania backlog/coma and get the new code written, tested and made available in an online repository, which I will then provide a link to (no arguments with you there, MZ). Second, I think that many people involved in this discussion and elsewhere will agree with me that the Teahouse itself is an attempt to address the "underlying issue" of new editors not getting sufficiently personalized welcomes. The invite template is simply the most expedient mechanism for directing new editors to a place where they can feel welcome and be welcomed by a person. If we can save hours of volunteer time by automating that process, we create more opportunities for friendly interaction on the Teahouse, which is a lot more satisfying for both guests and hosts. - J-Mo Talk to Me   Email Me  20:01, 16 July 2012 (UTC)


 * MZ, I'd like to understand more about why you think it's simply absurd to distinguish between invites and welcomes here. Just because a message is delivered by similar means doesn't mean the content, intent, or impact is necessarily the same.  The Teahouse invitation is just that - an invitation to visit the Teahouse.  It doesn't preclude someone from also receiving a welcome message (templated, handwritten, or otherwise).  I've seen no evidence demonstrating that TH invites are serving as a replacement to other talk page welcomes, though I would be interested to learn if anyone else has.  TH has demonstrated positive impact on new editors who hear about it, and the way that they currently hear about it is not sustainable.  There are probably a lot of ways the TH invite method and message could be improved.  One way that TH would like to experiment with is via bot delivery.  Because the project team intends to measure the outcomes of a 2 week trial, we'll all learn whether or not bot delivery is more or less effective than the current method (ie, whether or not more or less people in the sample feel motivated to ask for editing help at the Teahouse and whether or not more or less people continue to edit the encyclopedia after receiving the bot-delivered invite), and that should help resolve the issue of whether or not bot delivery is a good or bad thing for this particular message.  If you've got other amazing ideas for sustainable solutions to inviting people to the Teahouse too, though (and I bet you do!), I'd love to hear about them, because it would be great to see the project team experiment with several things in addition to this in order to see what works best.  Sbouterse (WMF) (talk) 05:44, 17 July 2012 (UTC)


 * Some of the comments above try to make a distinction between "invite bots" and "welcome bots" as a means of attempting to bypass the longstanding prohibition on welcome bots. That is, people feel that if this particular bot (HostBot) can be re-labeled, its behavior becomes acceptable. My point was that any such distinction is absurd; both invite and welcome bots are forms of spam bots.
 * I'll try to diagram what I see here:
 * Problem: users should be welcomed to project and invited to contribute further
 * Past proposed solutions:
 * Use a bot!
 * bots are impersonal; prefer (personalized!) human messages to users when possible
 * Current proposed solution:
 * Use a bot!
 * nothing has changed
 * As I said above, we can't (and we won't) stop users from dropping awful welcome templates on user talk pages (e.g., this monster: Template:Anonwelcomeg). However, as a community, the English Wikipedia has said that we do not want a welcome bot. There's an existing MediaWiki extension (NewUserMessage) deployed to over 20 Wikimedia wikis that automatically welcomes new users. The English Wikipedia is not one of those wikis. As far as I know, there's been no shift in consensus on this issue.
 * The comments above by Jtmorgan are along the lines of "oh, of course, we'll just filter the list better to a certain edit count and block status!", as though none of the previous bot operators had considered such a thought. As Bots/Frequently denied bots makes clear, "Several variations have been proposed, such as only welcoming users who have made an edit, or a certain number of edits, but these requests are still denied."
 * So I'm left wondering: what has changed that makes the Teahouse feel that this bot request is appropriate? If there's such active interest in welcoming new users and participating in the Teahouse, why can't human editors be used here? If people want to devote resources to improving this situation (by writing scripts, e.g.), why can't those resources be put toward a non-bot solution? --MZMcBride (talk) 19:39, 17 July 2012 (UTC)
 * Nothing has changed to make the Teahouse feel that a welcome bot is appropriate. You can say it is a welcome-bot all you want; however, this bot is different.  The purpose of this bot is to direct new editors to the Teahouse where they can receive assistance and guidance.  Why do you feel that there is any difference between me placing a Teahouse invitation and a bot placing the template?  The only difference I see is that the bot frees me to do other things, like interact with the editors or write some articles.  Honestly, I don't send out the invitations because I don't have the time. Ryan Vesey  Review me!  19:45, 17 July 2012 (UTC)
 * Add the line: 'Welcome to Wikipedia!' and you have a very similar welcome template to welcome-short... mabdul 20:13, 17 July 2012 (UTC)

I'm concerned that the number of posts I've made here (and the posts' lengths) make it seem as though I care about this issue more than I actually do. I really don't care very much. I posted here because Jtmorgan asked me to take a look and I did. What I found was a proposal (a message delivery bot for new users) that the community has specifically rejected previously (to the point that people wrote documentation about it) and that I personally don't consider to be an appropriate use for a bot. There are appropriate and inappropriate times to use automation; for me, this is an inappropriate time (much like automatic spell-checking or automatic image de-linking). I think that leaving personalized messages is better and that working toward that goal would be a better use of time and resources. I also don't think there's much community consensus for adding a welcome bot (outside of Teahouse-related folks), but it's not my call. And maybe the community has finally changed its mind and doesn't mind. I'm not a member of the bot approvals group. One of those people (and a bureaucrat, I guess) will decide whether to approve or deny this bot. I'm hoping this is my last post here, but I'm making no promises. :-) --MZMcBride (talk) 22:41, 17 July 2012 (UTC)

I've checked the code for automatic invites into a subversion repository with Google Code (link above). Other HostBot scripts will live there as well, including the code that generates the list of new editors who are to be invited (the same script that runs the daily Invitee reports). I intend to run my first small-scale test tonight, on 10 new editors' pages, and monitor the outcome closely, reverting any errors that may occur. If nothing breaks, I'll set up a cron job to run the automatic invite script daily for the duration of the 2-week trial, through 8/5/2012. - J-Mo Talk to Me   Email Me  23:03, 22 July 2012 (UTC)
 * One of the issues with human notification over bot notification is that humans can see the (top) in their contributions list letting them know they had the last edit on a user talk page. That way, if the user doesn't follow the link to the teahouse, and instead posts a question on their own talk page, a human can see it.  So my question is this: in addition to the bot's task, could it also create and maintain a list of users it has left a notification for who have replied in the same section as the notification on their user page?  For example: A) Bot posts teahouse invitation with header "Teahouse".  User replies in that thread, bot adds their name to a list that actual users can monitor.  B) Bot posts teahouse invitation with header "Teahouse", another user comes by and leaves another template/section/message and the bot will not add the user to the list.  Doable?  If it's not worth it, not a big deal.  Just an idea.--v/r - TP 14:44, 23 July 2012 (UTC)
 * Mabdul has suggested this too. This is a good idea, and could be implemented without a huge amount of extra work, but it's a big enough task that I want to hold off until after the trial is over to evaluate it. Once the trial is over, I can retroactively analyze how often this kind of thing has occurred. If it happens frequently, I will create a regularly-updated report (or look into setting up an external watchlist with one of the available tools--suggestions welcome!) that will allow hosts to see when a new editor has edited the Teahouse invites' section on their own talk page. - J-Mo  Talk to Me   Email Me  00:13, 24 July 2012 (UTC)


 * Why doesn't it substitute the invite template or the ?  Rcsprinter   (warn)  @ 16:20, 25 July 2012 (UTC)
 * Good call. I just changed the code so that it now substitutes the invite template. What do you mean about here? I don't quite understand what you were suggesting with that.  - J-Mo  Talk to Me   Email Me  19:37, 27 July 2012 (UTC)
 * Hey J-Mo, I thought we argued that it wasn't subst. because of metrics? Maybe that changed or I misunderstood. LOVEBOT. love. it. heather walls (talk) 04:37, 30 July 2012 (UTC)
 * In the section title. Make it so it seems more personal.  Rcsprinter   (tell me stuff)  @ 19:43, 27 July 2012 (UTC)
 * I agree. Looking at User talk:Juyorican the template seems nice.  Then you look at the code and it says, You are.... That doesn't seem like the best solution, especially when the editors might not understand templates yet.  I'm also curious as to why there aren't different categories for the two forms of the templates.  Shouldn't one be Wikipedians who received Teahouse template A and Wikipedians who received Teahouse template B?  Is there another system for doing the measuring for the A/B testing? Ryan Vesey  Review me!  19:49, 27 July 2012 (UTC)
 * That makes sense too :) Since it was an easy fix, I've just changed the section header to post . Regarding tracking the different templates: I'm keeping track of who got which version in a database table, so there's no need for the categories to be different.  - J-Mo  Talk to Me   Email Me  22:01, 27 July 2012 (UTC)
 * Some minor notes; there's a rather large gap in between section title and invitation. Only cosmetic, but might as well be addressed. And it's been 14 days - so please stop the bot for the trial to be assessed.  Rcsprinter  (warn)  @ 13:49, 29 July 2012 (UTC)
 * Note that due to Wikimania, the trial didn't start as soon as it was approved. The trial started on the 22nd so 14 days will be the 5th. Ryan Vesey  Review me!  13:52, 29 July 2012 (UTC)
 * Ah, good catch. Struck above.  Rcsprinter  (message)  @ 13:57, 29 July 2012 (UTC)
 * @RC: Thanks! I looked into this, and the answer seems to be buried somewhere in the WikiTools code. If I find it, I'll fix it, but it may not happen right away. - J-Mo  Talk to Me   Email Me  04:51, 1 August 2012 (UTC)

You can view relevant contribs: here. Will link to a couple examples of minor (fixed) errors tomorrow. - J-Mo Talk to Me   Email Me  03:44, 6 August 2012 (UTC)
 * Alright, results are in. Between July 23rd and August 5th (inclusive), HostBot sent out 630 invitations to new editors, for an average of 48 invitations per day. 34 of these editors (5%) were subsequently blocked from editing, roughly the same 'error rate' as manual inviting (5%-6%). 18 of these new editors have visited Teahouse so far (asked a question, created an intro, or both), for an initial return rate of approximately 3%, which is within the range of responses from manual invites and likely to rise a bit over time as people trickle in. I could only find one user (searching by revision comment string) who appears to have responded to the template as if it were a person, a concern that TP and Mabdul voiced above.
 * Here are the major fixes I made to the bot code during the trial. I thank everyone who provided feedback!
 * on 7/29, set invite template to substitute rather than transclude, per this feedback
 * substituted in section title, rather than transclude. From Rcsprinter's feedback above.
 * check for existing invite (or any link to Teahouse) on user's talk page; skip user if invite already exists. From this feedback
 * Let me know if there's any more info you need from me in order to proceed with the evaluation. - J-Mo  Talk to Me   Email Me  22:40, 6 August 2012 (UTC)
 * User_talk edits made by HostBot during the trial can be found here. Does the approvals group need any additional information from me before they proceed with the evaluation? - J-Mo  Talk to Me   Email Me  19:55, 10 August 2012 (UTC)
 * A user has requested the attention of a member of the Bot Approvals Group. Once assistance has been rendered, please deactivate this tag. Not trying to nag, just looking for confirmation that you're not waiting on anything else from me :) - J-Mo  Talk to Me   Email Me  20:07, 10 August 2012 (UTC)
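For anyone who wants to re-check the percentages quoted in the trial report, the arithmetic can be reproduced with a few lines of Python. The three counts are the ones given in the report; the variable and function names are only illustrative.

```python
# Figures quoted in the trial report above.
invitations = 630
blocked_later = 34
visited_teahouse = 18


def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    return 100.0 * part / whole


block_rate = percent(blocked_later, invitations)
return_rate = percent(visited_teahouse, invitations)

print(f"blocked after invite: {block_rate:.1f}%")   # 5.4%
print(f"visited Teahouse:     {return_rate:.1f}%")  # 2.9%
```

Both values round to the figures given in the report (about 5% and about 3%).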

Discussion break
Just a heads-up related to the possibility of a watchlist: let the bot watchlist all pages which got an invitation (a standard preference at Special:Preferences --> Add pages and files I edit to my watchlist) and publish the RSS token as described at WATCHLIST. This is the easiest solution at the moment (without any tools) for following the talk pages of newly contributing users. BTW: what does the bot do if somebody has redirected his own talk page to another page (by accident or on purpose, e.g. by moving his talk page to mainspace; original used as sandbox)? mabdul 07:44, 6 August 2012 (UTC)
 * Thanks for saying that. I may have mentioned it before but if I didn't I meant to.  Is it possible to create a public RSS feed or would each person interested in watching it need to create their own? Ryan Vesey 23:11, 6 August 2012 (UTC)
 * If you release the RSS token of the bot, then the RSS feed is public for everyone. mabdul 04:46, 7 August 2012 (UTC)
 * That, I understand. What I mean is that I can create a personal RSS feed using something like google reader.  As far as I know, I'm the only person who can read that RSS feed.  I'm wondering if there are RSS feeds out there where anyone who follows the link can view the feed.  That way it would only need to be set up once.  Sorry if my terminology is incorrect, I hope you understand what I mean. Ryan Vesey 04:49, 7 August 2012 (UTC)
 * The easiest thing you can do is get the token, access the feed URL, and then publish it using Google Feedburner. That would make a public watchlist that anyone can access. --Nathan2055talk - contribs 18:36, 7 August 2012 (UTC)


 * Sorry, no, you don't understand: if you release the URL of the atom link, then everybody is able to follow it:
 * On the left of the Watchlist page, logged in with the bot account, right-click on "Atom" (under Toolbox) and choose "copy link address" (or similar), so you get something like: https://en.wikipedia.org/w/api.php?action=feedwatchlist&allrev=allrev&wlowner=HostBot&wltoken=234567898765de45678edd67e&feedformat=atom (key/token unpublished until now)
 * If you publish that link, then everybody would be able to follow the watchlist of the bot's account by using a feed reader. (I don't know if that changes something in the atom feed, but give it a try: hide "your own edits".)
 * That's the reason why nobody should release his own personal URL. mabdul 18:45, 7 August 2012 (UTC)
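The feed link described above is just the API endpoint plus a handful of query parameters, so it can also be assembled in code. A minimal sketch, using only the parameters that appear in the example URL; the function name and the `EXAMPLE_TOKEN` value are placeholders for illustration, not anything the bot actually uses.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def watchlist_feed_url(owner: str, token: str) -> str:
    """Build the Atom feed URL for `owner`'s watchlist.

    `token` is the account's watchlist token; anyone who has it can read
    the watchlist, so it should only be published for a bot account.
    """
    params = {
        "action": "feedwatchlist",
        "allrev": "allrev",
        "wlowner": owner,
        "wltoken": token,
        "feedformat": "atom",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# "EXAMPLE_TOKEN" is a placeholder, not a real token.
url = watchlist_feed_url("HostBot", "EXAMPLE_TOKEN")
print(url)
```

The resulting link could then be handed to any feed reader, which is the set-up-once arrangement asked about earlier in this thread.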

BAG Comment: As the overturning of a long-standing consensus, this should really be advertised on WP:CENT or somesuch. - Jarry1250 [Deliberation needed] 12:39, 13 August 2012 (UTC)
 * I feel like we're going around in circles here. The Teahouse template is not a welcome template, it's an invitation to participate in the Teahouse. No consensus, long-standing or otherwise, has been overturned. - J-Mo  Talk to Me   Email Me  21:13, 13 August 2012 (UTC)
 * I have to agree with this again. It isn't a welcome bot.  In fact, it doesn't even welcome the editor.  It thanks them and informs them of the Teahouse, two very clear things.  No consensus is being overturned. Ryan Vesey 21:30, 13 August 2012 (UTC)
 * I also agree - I will not repeat what has already been said, but as for welcoming, we're trying to push a link to the Teahouse into the existing welcoming templates. I don't see why this bot is coming under so much scrutiny, it's clear what it's designed to do and for what reason - there's no problem when a user performs these edits. Osarius - Want a chat? 21:43, 13 August 2012 (UTC)
 * Well okay then, I'll strike the first half of my comment if you take issue with it. But it should still receive more input; mass unsolicited mailings have proven intensely controversial in the past and this is likely to be viewed as a welcome template, even if it isn't, adding to the potential for controversy. - Jarry1250 [Deliberation needed] 21:45, 13 August 2012 (UTC)
 * Question for J-Mo. Aside from this trial, doesn't the Teahouse have an expected shutoff date for research analysis to be done?  If that is the case, could the bot continue to run through that date.  That would also leave more time to see if there are any external complaints.  I have seen none except those that have appeared in this BRfA so far. Ryan Vesey 21:49, 13 August 2012 (UTC)
 * I still support this. I've read a bit more about this sub-project and it is nothing like a welcome bot. I don't believe this has anything to do with the welcome bot discussion. Somewhere here I have links to the official research report on Meta...here it is. That shows the kind of invites that we will be submitting in the tests. Any thoughts? --Nathan2055talk - contribs 22:35, 13 August 2012 (UTC)
 * (edit conflict) Since the raison d'etre of Teahouse is to increase editor retention, we've committed to doing analysis for at least nine months after the end of the pilot period to see if we can detect a shift in the numbers needle. Realistically, analysis will likely continue as long as I have breath in my lungs and a Toolserver account. :) - J-Mo  Talk to Me   Email Me  22:38, 13 August 2012 (UTC)

While I agree that this is technically not a welcoming bot, it does perform a closely-related function, so I'm not sure how fair it is to consider it as a wholly-separate issue. In light of the fact that the Teahouse emphasizes more personal interaction with new editors, it deserves careful consideration.

On the other hand, it's important to consider factors besides the fact that this is an automated process which can lead to it being perceived as impersonal—in particular, the template message. Currently, we are using what is essentially one template (regular one, plus the AFC variant) to invite users. It is the same template, usually with the same message, every time. As far as having a bot do this goes, it's really no worse than what we already have, except instead of a lot of people plastering the same message everywhere, it'll be a few people and a program plastering the same message everywhere. As I mentioned over at Meta (in the discussion of various Teahouse pages by all the hosts), we could really benefit from having a large variety of templates, message wordings, etc. to choose from. Hell, perhaps I'll start making some to use myself, and we'll see if they catch on. If the bot could cycle through various messages / template styles or randomly select one for each user, I would certainly support that. dalahäst (let's talk!) 23:32, 13 August 2012 (UTC)
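The suggestion of randomly selecting one of several invite templates for each user could look something like the sketch below. It is only an illustration under stated assumptions: the template names are hypothetical (the discussion does not list actual variants), and the returned mapping stands in for whatever record the bot keeps of who received which version.

```python
import random
from typing import Dict, List, Optional

# Hypothetical template names; the discussion does not list actual variants.
INVITE_TEMPLATES: List[str] = [
    "Teahouse invite variant A",
    "Teahouse invite variant B",
    "Teahouse invite variant C",
]


def assign_templates(usernames: List[str], seed: Optional[int] = None) -> Dict[str, str]:
    """Pick one template at random for each user and record the choice."""
    rng = random.Random(seed)
    return {name: rng.choice(INVITE_TEMPLATES) for name in usernames}


def invite_wikitext(template: str) -> str:
    """Wikitext that substitutes (rather than transcludes) the chosen template."""
    return "{{subst:" + template + "}}"
```

Passing a fixed `seed` makes a run reproducible, which may help when checking afterwards which users received which wording.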

Discussion break and review

 * Hi everyone. So to review, here's where I think things stand: we established that an invite bot is not a welcome bot; for thoroughness' sake, we also addressed each of the specific, listed points of concern for avoiding the use of welcome bots (my first post, above); we demonstrated in a two week trial that automated Teahouse invitations work, and cause no harm.


 * We've also demonstrated that our process is cautious, well-supported, closely supervised (by folks like me, Rcsprinter, Ryan Vesey, Nathan2055, Writ Keeper, Mabdul, others) and responsive to feedback. It will continue to be that way.


 * Finally, many other editors have articulated the exigence for automated Teahouse invitations better than I ever could, in this forum and in others. What's the next step? - J-Mo  Talk to Me   Email Me  16:59, 14 August 2012 (UTC)
 * We seem to be in a deadlock here. While I, J-Mo, and the others have shown that this bot doesn't seem to be related to the welcome bot issue, several people still persist in saying "OBJECTION!". I believe the only way to solve this is to directly address each person's complaints. I feel like we've already done that, but apparently there still needs to be more consensus here. We have two options. We can either continue this discussion, perhaps with another trial, or we can close this as deadlocked and try again in a month or so. I personally opt for the former, but let's let the BAG decide. --Nathan2055talk - contribs 20:26, 14 August 2012 (UTC)
 * RFC? Has that ever been done before?  Possibly BotExtendedTrial? Ryan Vesey 20:28, 14 August 2012 (UTC)
 * We don't seem deadlocked to me; in fact, things seem to be weighing in the direction of 'go'. I don't think we should or can afford to wait; the invites made a big difference. heather walls (talk) 20:30, 14 August 2012 (UTC)
 * A deadlock in a BRFA has never happened before. Then again, such a controversial bot never made it to BRFA before. If the BAG moves for an extended trial, I will advertise this at CENT. --Nathan2055talk - contribs 20:39, 14 August 2012 (UTC)
 * I don't see why this hasn't been enacted yet. Count me among those that think that the objections involve making mountains out of molehills.  As Ryan Vesey says above, having a bot do this enables me to not have to do it, and instead do something more useful: Isn't that what bots are for?  -- Jayron  32  16:17, 15 August 2012 (UTC)
 * Approved for an ongoing trial, such that the bot can prove its efficacy and thus gain a more established consensus, whilst not missing opportunities to entice new editors. - Jarry1250 [Deliberation needed] 16:31, 15 August 2012 (UTC)
 * We'll get HostBot started inviting again. I was thinking I'd check in again and report findings, bugs & updates after 1 month of operation. Sound good? — Preceding unsigned comment added by Jtmorgan (talk • contribs) 18:59, 2012 August 15‎

There is now one day left until the month of extended trial is up. The bot has been running and appears to be fine, but tomorrow the operator will call in and tell us any findings/bugs. After that, we seem to have general consensus for "yes, start writing invitations."  Rcsprinter  (whisper)  @ 16:09, 14 September 2012 (UTC)
 * Question Where does it say "one month" or is that a standard somewhere? heather walls (talk) 22:17, 14 September 2012 (UTC)
 * Jtmorgan said he would report back in one month above. LegoKontribsTalkM 22:18, 14 September 2012 (UTC)
 * LOL. Thanks! heather walls (talk) 22:40, 14 September 2012 (UTC)
 * Thanks for monitoring, Rc. Today I'm incorporating the suggestions that have been made on my talk page, and gathering data together. I'll publish findings from our A/B testing so far, as well as how many invitees visited in general, over the weekend. Jmorgan (WMF) (talk) 21:36, 14 September 2012 (UTC)
 * Forgive me if this is considered off-topic. If this functionality is so desirable, then why is it not made easily available for new editors to find themselves upon registration instead of going through the expense of database queries/pushing it out to every user's talk page? I just received a message from the bot performing this action, but that method makes no sense to me both in terms of the processing required and caveats of periodic database queries -- this information might have been useful last night, but instead I got a message this morning. Shouldn't this sort of thing be shown to new users upon registration? Cdwn (talk) 12:05, 15 September 2012 (UTC)
 * Hi Cdwn, I agree that invitations to participate in help spaces like the Teahouse would be a great addition to the registration process. And that may well happen at some point, although to my knowledge it is not currently in the development pipeline. As to the cost of bot-driven invitations, it's actually quite low: the invitations are only sent to between 50 and 100 brand new editors per day, and other than API requests all the processing happens on a separate mirror database server set up for research experiments like this one. So the resource allocation is minimal and the performance cost basically null. As an aside, I sincerely hope you find the invitation to Teahouse useful at some point in the future! Jmorgan (WMF) (talk) 01:28, 17 September 2012 (UTC)
 * My point was more one of design than performance, it just seems weird to do it this way to me. I will take a look at the Teahouse now, thanks. Cdwn (talk) 02:04, 19 September 2012 (UTC)

Status update
I've put together a report of findings from the last month of automated invites on the Meta research page. Looks like findings are consistent with those from the first report: automated invites get about the same response rate & block rate as manual invites. Interestingly, there's no significant difference between the performance of generic vs. personalized invites. And the August Teahouse metrics report shows that automated invites have substantially increased participation. There's plenty of room for discussion/interpretation on all of these fronts, obviously. Tho IMO we should probably hold any research-related discussion on the Teahouse Host Lounge so that more hosts can participate.

In response to editor feedback (see here, here and here) I've added some exclusion criteria into the script. The newest version is available at the google code link above. Specifically, HostBot now 'skips' a potential invitee if the following appear on their talkpage:
 * 1) a link to any WP:Teahouse/* page (not just the main page, as was previously the case)
 * 2) a  template or a
 * 3) a level-4 vandalism warning (see the threads I link to above for my justification for limiting this exclusion to level 4 warnings)
 * 4) one of several socksuspect notices

These changes have just gone into effect in the last few days, but are currently functioning as advertised. For example, this user was passed over for invitation today even though they appeared on the September 17th Teahouse invitee report because they had been served a level 4 vandalism warning. I'll continue to monitor the code for breaks and inappropriate edits, and I will continue to discuss updates, tweaks and new ideas with other community members! Jmorgan (WMF) (talk) 01:56, 17 September 2012 (UTC)
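The talk-page exclusion criteria listed above can be sketched roughly as follows. This is not HostBot's actual code: the pattern names and regular expressions are assumptions made for illustration (for example, it assumes a substituted level-4 warning leaves the string "uw-vandalism4" in the page text), and criterion 2 is left out because the template names are not preserved in this archive.

```python
import re

# Hypothetical patterns; the real script may match different strings.
SKIP_PATTERNS = {
    # Criterion 1: a link to any WP:Teahouse/* page
    "teahouse link": re.compile(r"\[\[\s*(?:WP|Wikipedia):Teahouse", re.IGNORECASE),
    # Criterion 3: a level-4 vandalism warning (assumed marker text)
    "level-4 vandalism warning": re.compile(r"uw-vandalism4", re.IGNORECASE),
    # Criterion 4: a sock-suspect notice (assumed keyword match)
    "sock notice": re.compile(r"sock\s?puppet", re.IGNORECASE),
}


def skip_reason(talk_text):
    """Return the first matching reason to skip an invitee, or None."""
    for reason, pattern in SKIP_PATTERNS.items():
        if pattern.search(talk_text):
            return reason
    return None
```

Returning the reason, rather than a bare True/False, makes it possible to record why a given user was passed over.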


 * Update: new bug found with an indef-blocked user being invited today. I'll stop relying on the (apparently slow-to-update) enwiki.logging table for blocks, and start looking for them in the page text along with the other exclusion criteria. Jmorgan (WMF) (talk) 02:07, 17 September 2012 (UTC)
 * Rather than looking for them in the page text, wouldn't it be more reliable to query the API for a user's status, since not all admins will leave a blocked template? It might be a fringe case, but it should be easier than scanning through text for a template. Thanks, LegoKontribsTalkM 13:33, 17 September 2012 (UTC)
 * Depends: for the SPI stuff it might be worth checking the wikitext; for blocks you are indeed right. (e.g. promotional usernames (as reported to UAA) do not always get a warning on their talk page) mabdul 13:42, 17 September 2012 (UTC)
 * As of now, I'm checking both the page text and the logging table. The script is up and running again. Here's an L-4 vandal who was skipped, and here's someone who was skipped because they received a welcome template with a Teahouse link in it. Jmorgan (WMF) (talk) 17:40, 19 September 2012 (UTC)
 * The block bug has been fixed. This user appeared on the September 21st invitee report, but did not receive a Teahouse invite from HostBot because they had been blocked. Jmorgan (WMF) (talk) 01:20, 21 September 2012 (UTC)
 * Are you ready for approval? I don't see any objections.  MBisanz  talk 16:45, 27 September 2012 (UTC)
 * Yep, ready when you are! Cheers, Jmorgan (WMF) (talk) 23:58, 30 September 2012 (UTC)
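The suggestion above to ask the API for a user's block status, rather than scanning talk-page text, could look something like this minimal sketch. It builds a `list=users` query with `usprop=blockinfo` and reads a response without making any network call; the helper names and the sample response are assumptions for illustration, not part of HostBot.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def block_query_url(user_names):
    """Build a MediaWiki API URL asking for block info on several users."""
    params = {
        "action": "query",
        "list": "users",
        "ususers": "|".join(user_names),
        "usprop": "blockinfo",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def blocked_users(response):
    """Return the names of users whose entry carries block details."""
    users = response.get("query", {}).get("users", [])
    return {u["name"] for u in users if "blockid" in u}


# Hand-written sample shaped like an API reply (not real data).
sample = {
    "query": {
        "users": [
            {"userid": 1, "name": "Alice"},
            {"userid": 2, "name": "Bob", "blockid": 99,
             "blockedby": "SomeAdmin", "blockexpiry": "infinity"},
        ]
    }
}
print(blocked_users(sample))
```

A live check of this kind does not depend on an admin having left a block notice on the talk page.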

I apologize for the late "objection" however I have a few questions regarding the bot task (I've been following it for a while, but never had any real time to look into it).
 * 1. As has been previously mentioned and explained by MZMcBride, this bot does "overturn" a WP:FDB. Ideally that shouldn't happen without community consensus (consensus put it on that list in the first place). The only discussion I see took place at Wikipedia talk:Teahouse/Host lounge. I don't think that is a very good place to build consensus. Was the discussion advertised on cent? The village pumps? I didn't see any mention in this BRFA of this being done, however I haven't looked either.


 * MZMcBride claimed that this overturns FDB by asserting that HostBot is a welcome bot. But it's not. It's an invite bot, and as such one of several community-sanctioned spambots. To further illustrate the difference between HostBot and the perennially-denied welcome bot proposal, I addressed each of the reasons for rejecting welcome bots individually, and described why they did not apply to our invite bot, in my first post to this BRFA. Jmorgan (WMF) (talk) 22:37, 2 October 2012 (UTC)
 * I don't think it's worth repeating what was said above; however, I agree with the diagram that MZM laid out explaining the problem, the past proposals, and how nothing was realistically different between the two. That is just my opinion, though, and in the end it will come down to a BAG'er making the call. I would have appreciated seeing a bit more community involvement outside of the Teahouse members, but it's a bit late for that now. LegoKontribsTalkM 08:40, 3 October 2012 (UTC)
 * 2. The results on this page indicate that around 4% of invitees responded. Has similar research been done regarding the effects of "standard welcomes" (like through the welcoming committee and welcome) as opposed to Teahouse invites? (This is more Teahouse-centric, but still part of the bot's task)


 * No, we just compared it to manually-delivered invites, which elicited the same response rate. I don't see what comparing it to welcomes would get us. Could you elaborate? I should note that I've observed in spot checks that many new editors who receive Teahouse invites also (previously or subsequently) receive welcome templates. Jmorgan (WMF) (talk) 22:37, 2 October 2012 (UTC)
 * Now that I've thought about it, I suppose my question was more related to how effective is the Teahouse at "retaining" editors compared to "standard welcomes". I don't think it's an issue for this BRFA though. LegoKontribsTalkM 08:40, 3 October 2012 (UTC)


 * 3. The bot's operator is supposed to be . Is that in an official WMF capacity (User:Jmorgan (WMF)) or a community member? Both accounts have been used to edit this BRFA.


 * Yeah, I apologize for the ambiguity there. I was actually not given a (WMF) user account until after I listed this BRFA. Much of the development of HostBot has been done with Foundation support. And the experiments on personalization, reporting of metrics, etc. that HostBot made possible were performed in support of a WMF Community Fellowship project I participated in. So this BRFA was undertaken in my capacity as a Foundation contractor, rather than a volunteer. But I will continue operating the bot as a volunteer even after my official responsibilities for the Teahouse have concluded. In the meantime, I'm willing to re-attribute HostBot to my WMF account if that seems more appropriate. Jmorgan (WMF) (talk) 22:37, 2 October 2012 (UTC)
 * I don't think that matters much, it was merely something I was confused about. LegoKontribsTalkM 08:40, 3 October 2012 (UTC)


 * 4. The research supposedly shows that generic invites have a higher yield than personalized ones. Will all future invites be generic? Or will they continue to use the A/B method of variance? I quote: We see no reason to stop A/B testing as long as automated invitations continue. In my opinion, bots that have been "approved" should not still be testing (this is definitely a unique scenario though). Also if the data shows that one method is better than the other, wouldn't it (theoretically) be advantageous to say 55% of invitees will get generic ones, 45% get personalized ones? (I'm not a statistician so I'm not sure how flawed my logic is here)


 * As you kind of pointed out, the bot made the experiment possible; we weren't testing the bot itself. The 0.6% difference between the response rates isn't statistically significant, so essentially any difference either way is probably random variation. Additional experimentation (with a larger sample, or a greater degree of personalization in the 'treatment' template) could determine differently, but it's not my highest priority as a researcher. I personally think that--assuming we don't feel the need to test out new and fascinating permutations on personalization--future invites should contain a host name. It just seems nicer to provide invitees with a personal point of contact. Don't you think? Jmorgan (WMF) (talk) 22:37, 2 October 2012 (UTC)
 * Assuming that the 0.6% difference is not statistically significant, I would agree with you on that. If you do plan on going ahead with the A/B testing, I would appreciate if you could publish your numbers once again like you did for this trial. LegoKontribsTalkM 08:40, 3 October 2012 (UTC)
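For readers unfamiliar with how a claim like "not statistically significant" is usually checked, one common approach is a two-proportion z-test. The sketch below is an assumption about method, not a description of the analysis actually run for the Teahouse report, and the counts in the example call are invented purely to show usage.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two response rates."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups respond equally
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts for illustration only (not the trial's data):
# 40 of 1000 generic invitees vs. 46 of 1000 personalized invitees responding.
z, p = two_proportion_z_test(40, 1000, 46, 1000)
print("z = %.2f, p = %.2f" % (z, p))
```

Whether a given gap in response rates is significant depends heavily on the real sample sizes, which is why a larger sample could lead to a different conclusion.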

Again, I'm sorry for bringing these questions up so late in the approvals process, I really just haven't had time to look into the task as much as I had wanted to. Thanks, LegoKontribsTalkM 07:10, 2 October 2012 (UTC)
 * Thank you for taking the time to respond to my questions Jmorgan. I still don't feel that #1 has been addressed properly, but at this point it's just my opinion. I'm completely fine with #2-4 though. Thanks, LegoKontribsTalkM 08:40, 3 October 2012 (UTC)

Code review: the code isn't fully exclusion compliant. For example, if I used, HostBot would still edit the page. It's probably just easier to use the regex listed at Template:Bots. LegoKontribsTalkM 08:53, 3 October 2012 (UTC)
 * ✅ I've added the regex in. See the function 'allow_bots' on line 109 of the code in the new Git repository. Jmorgan (WMF) (talk) 20:59, 7 October 2012 (UTC)
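A regex-based exclusion check of the kind discussed here might look like the following minimal sketch. It is not the `allow_bots` function in HostBot's repository; the signature and pattern are assumptions written for illustration, and the sketch does not handle allow-lists that name other bots.

```python
import re


def allow_bots(text, bot_name):
    """Return False if the wikitext asks this bot (or all bots) to stay away.

    Hypothetical sketch covering {{nobots}}, {{bots|allow=none}},
    {{bots|deny=all}}, {{bots|optout=all}} and {{bots|deny=...}} lists
    that mention bot_name. Allow-lists naming other bots are not handled.
    """
    name = re.escape(bot_name)
    pattern = (
        r"\{\{\s*(nobots"
        r"|bots\s*\|\s*(allow\s*=\s*none"
        r"|deny\s*=\s*all"
        r"|optout\s*=\s*all"
        r"|deny\s*=[^}]*?" + name + r"[^}]*?))\s*\}\}"
    )
    return re.search(pattern, text, flags=re.IGNORECASE) is None


print(allow_bots("{{bots|deny=SineBot,HostBot}}", "HostBot"))
print(allow_bots("Welcome to Wikipedia!", "HostBot"))
```

A bot would call such a check on the current talk-page text immediately before saving an edit, and skip the page when it returns False.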

A few more things I've found: You really should be using, however the above will sanitize your data. (You can never be too careful!)
 * Your  (also lines 165-167) function isn't safe against injections. The proper way to implement it would be like this:
 * ❌ I spent a bunch of time on this, trying to make MySQLdb do structured, sanitized queries. I'm not sure it can be done. So my current plan is to install  once I've migrated the code to its new server, and sanitize then. Jmorgan (WMF) (talk) 20:59, 7 October 2012 (UTC)
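As a general illustration of the injection-safe pattern under discussion (not the snippet originally posted in this thread), the idea is to pass values to the database driver separately from the SQL string instead of building the string by hand. The sketch below uses Python's built-in `sqlite3` module so that it runs anywhere; the table and column names are hypothetical. DB-API drivers differ in placeholder syntax: `sqlite3` uses `?`, while MySQLdb uses `%s` with the parameters passed as a second argument to `cursor.execute`.

```python
import sqlite3


def mark_invited(conn, user_name):
    """Flag one user as invited, letting the driver escape the value."""
    conn.execute(
        "UPDATE invitees SET invited = 1 WHERE user_name = ?",
        (user_name,),  # parameters travel separately from the SQL text
    )
    conn.commit()


# Hypothetical schema and rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invitees (user_name TEXT PRIMARY KEY, invited INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO invitees (user_name) VALUES (?)",
    [("Alice",), ("O'Brien",), ("x'; DROP TABLE invitees; --",)],
)

# A name containing a quote is handled as data, not as SQL.
mark_invited(conn, "O'Brien")
```

Because the user name is never spliced into the query text, quotes or SQL fragments in a name cannot change what the statement does.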


 * In, is there a reason you are using urllib2 to get the ?action=raw as opposed to the builtin  ?
 * Tried to do this through Wikitools, couldn't make it happen in the time I had. Decided to make it work another way for now... see below. Jmorgan (WMF) (talk) 20:59, 7 October 2012 (UTC)


 * I also don't see any indication that the user-agent urllib2 is sending has been modified to follow User-Agent policy. Also rather than manually changing a few url escapes, you can just use  to do them all. LegoKontribsTalkM 09:16, 3 October 2012 (UTC)
 * ✅ I now pass a custom user agent header in with each request. See updates to the talkpageCheck function. Jmorgan (WMF) (talk) 20:59, 7 October 2012 (UTC)
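The two points in this exchange, sending a descriptive User-Agent and escaping page titles with a library function rather than by hand, can be sketched as below. The sketch uses Python 3's `urllib` (the Python 2 counterparts are `urllib2.Request` and `urllib.quote`); the user-agent string, contact details and helper name are placeholders, not HostBot's real values. No request is actually sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder identification string; a real bot would name itself and its operator.
USER_AGENT = "ExampleInviteBot/0.1 (https://example.org/bot-info; operator@example.org)"


def build_raw_request(title):
    """Build (but do not send) a request for a page's raw wikitext."""
    # quote() percent-encodes every unsafe character in one call
    path = quote(title.replace(" ", "_"), safe=":/")
    url = "https://en.wikipedia.org/w/index.php?title=" + path + "&action=raw"
    return Request(url, headers={"User-Agent": USER_AGENT})


req = build_raw_request("User talk:A&B")
print(req.full_url)
print(req.get_header("User-agent"))
```

Building the request object in one place means every fetch carries the same identifying header.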


 * If you use  (available on willow) over   you don't have to use the   you have in line 55.
 * Thanks for pointing this out. Seems like a better alternative to MySQLdb. Will implement after migration. Jmorgan (WMF) (talk) 20:59, 7 October 2012 (UTC)


 * I'm a bit concerned about the general  statement (L81). Anything from a connection/server error when using urllib2 to a unicode error in the text could trip it, and the error isn't logged anywhere so a human could potentially invite the user, it just gets lumped into the skipped group. LegoKontribsTalkM 02:59, 4 October 2012 (UTC)
 * ✅ well, kinda done. What I've done is just update the output script  so that it adds 'skipped' to the invited? column of the database report. This makes it more transparent that HostBot has skipped a user. I'll build in better non-latin char handling post-migration. Sounds like   will help me there, too. Jmorgan (WMF) (talk) 20:59, 7 October 2012 (UTC)
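The concern about a catch-all exception handler can be illustrated with a sketch that catches only the expected failure types and records why each user was skipped, so that a human could follow up. Function and variable names are hypothetical, and `fetch_talk_page` stands in for whatever routine retrieves the page text.

```python
import logging

logger = logging.getLogger("invite_sketch")


def check_invitees(user_names, fetch_talk_page):
    """Return a per-user status, recording the reason for every skip."""
    results = {}
    for name in user_names:
        try:
            fetch_talk_page(name)
        except (OSError, UnicodeError) as exc:
            # Network failures (OSError) and encoding problems (UnicodeError)
            # are logged with their type instead of being silently swallowed.
            logger.warning("skipping %s: %s: %s", name, type(exc).__name__, exc)
            results[name] = "skipped: " + type(exc).__name__
            continue
        results[name] = "eligible"
    return results
```

Any other exception type is left to propagate, which surfaces genuine bugs instead of hiding them among the skipped users.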


 * Thanks for this detailed code review, Lego. I'll implement the first few changes right away (regex for compliance, sanitize queries), and look into what user-agent header urllib is using, and try to make that compliant ASAP. I'll also see about switching from mysqldb to oursql. I'm not running on toolserver right now (and in fact, it looks like I'm going to be migrating the db and code to a whole new server soon... urgh), but handling encoding exceptions has been a real pain, so a more robust sql module would be awesome. I used my own solution for grabbing page text because I wasn't all that familiar with the WikiTools codebase (still not, but I'm learning...). Do you know if getWikiText conforms with the user agent policy? Jmorgan (WMF) (talk) 21:47, 4 October 2012 (UTC)
 * Yes, getWikiText should follow the user-agent policy. I believe that any request made through the wikitools library will. LegoKontribsTalkM 21:51, 4 October 2012 (UTC)
 * I sent you a pull request which fixes the MySQLdb issue and uses getWikiText as opposed to urllib2. LegoKontribsTalkM 19:18, 10 October 2012 (UTC)

Status update - Discussion break
Any updates? mabdul 19:00, 2 November 2012 (UTC)
 * Not as such. I've seen LegoKTM's pull request but haven't integrated it yet. I intend to do so soon tho (not ignoring; just extremely backlogged). My goal is to make this update and several other ones next week. Meantime, HostBot is merrily inviting new editors, and the Teahouse is busier than ever. No blocking issues or new dependencies.  - J-Mo  Talk to Me   Email Me  19:53, 2 November 2012 (UTC)
 * Update: LegoKTM's changes have been implemented, and are reflected in the repository linked above. - J-Mo  Talk to Me   Email Me  00:20, 19 November 2012 (UTC)
 * Ready for approval?  MBisanz  talk 04:26, 19 November 2012 (UTC)
 * I am if you are! - J-Mo  Talk to Me   Email Me  19:42, 20 November 2012 (UTC)


 * Approved.  MBisanz  talk 14:51, 26 November 2012 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.