Wikipedia:Bots/Requests for approval/HostBot 8


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

HostBot 8
Operator: (bot)

Operator (training modules):

Time filed: 18:25, Tuesday, May 2, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic (after supervised trial)

Programming language(s): Python

Source code available: https://github.com/jtmorgan/hostbot

Function overview: Posts a welcome message on new users' talk pages that includes links to introductory training modules on Programs & Events Dashboard, like this. Here's the template: Welcome training modules.

Links to relevant discussions (where appropriate): Village_pump_(proposals) (permanent link). See also Bots/Requests for approval/RagesossBot 3.

Edit period(s): Continuous

Estimated number of pages affected: 5000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details:

The main purpose of this task is to run a controlled experiment inviting new users to use the Programs & Events Dashboard training modules to learn the basics of Wikipedia. It finds recently registered accounts that have made between one and five edits, and sorts half of them into an experimental group and half into a control group. For the experimental group, it posts a welcome message (similar to what HostBot has done before with Teahouse invitations) that invites users to try the training modules, which are forked from the Wiki Ed classroom program training modules that we've been using and refining over the last few years and get very positive feedback from student editors.

Jonathan Morgan and I would like to get a sample of about 5000 invited users so that we can see if it makes a difference in terms of new users being more likely to stick around and keep editing. The last major experiment along these lines that I know of was The Wikipedia Adventure; in contrast to TWA, these trainings are more practical and wide-ranging, and have also been refined over time to try to head off the most common errors and confusing aspects of Wikipedia that new users run into.--ragesoss (talk) 18:25, 2 May 2017 (UTC)

The code has been tested on test.wikipedia.org and you can see a sample of the output here. J-Mo 18:32, 2 May 2017 (UTC)
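
The selection and group assignment described in the function details can be sketched roughly as follows. This is a minimal illustration, not HostBot's actual code: the function names, the `accounts` record layout, and the use of a seeded shuffle are all assumptions made for the example.

```python
import random


def select_candidates(accounts, min_edits=1, max_edits=5):
    """Return names of accounts whose edit count falls in the target range.

    `accounts` is a hypothetical list of dicts with "name" and "edit_count".
    """
    return [a["name"] for a in accounts if min_edits <= a["edit_count"] <= max_edits]


def assign_groups(usernames, seed=None):
    """Shuffle the candidates and split them into two halves.

    Returns (experimental, control). With an odd number of candidates,
    the control group receives the extra user.
    """
    rng = random.Random(seed)
    shuffled = list(usernames)
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


if __name__ == "__main__":
    sample = [
        {"name": "ExampleUserA", "edit_count": 1},
        {"name": "ExampleUserB", "edit_count": 3},
        {"name": "ExampleUserC", "edit_count": 12},  # outside the range
        {"name": "ExampleUserD", "edit_count": 5},
        {"name": "ExampleUserE", "edit_count": 0},   # outside the range
    ]
    experimental, control = assign_groups(select_candidates(sample), seed=42)
    print("invite:", experimental)
    print("control:", control)
```

Only the experimental group would receive the welcome message; the control group is recorded but left alone so the two can be compared later.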

Discussion

 * The discussion, archived here, has not been formally closed, but it seems that there are reasonable grounds for the trial. I certainly support such a trial, as editor retention is a key issue that needs addressing. With new users, have you considered the best edit range? You said between 1 and 5, but I feel this is too low, and risks inviting vandalism-only accounts into both groups. Perhaps a range of 7+ would filter these out? TheMagikCow (T) (C) 07:05, 5 May 2017 (UTC)
 * Thanks. Regarding the edit range: we settled on 2 edits for several reasons. First, most people leave Wikipedia after their first couple of edits. There are undoubtedly many reasons for this, but one that we're fairly sure of is that they find the editing process daunting (both the UI/tech and the policies). The training modules are designed to address these issues directly, so we want to put them in front of people who are experiencing them as quickly as possible. Second, we want to gather a large sample so that we can run stats to determine impact, and if we limit the invitees to people who have 7-10 edits, our maximum daily sample drops by about 90%. Third, HostBot currently sends Teahouse invites to most eligible newbies who reach the 5-edit threshold on any given day. We want to test the impact of the Training Modules independent of the impact of the Teahouse, which we already know has a positive effect on new editor retention, so we don't want to invite people who already have a Teahouse invite. Finally, regarding your concerns about inviting vandals, I plan to use the same approach to weeding out vandals in this study that I use for Teahouse invites: if someone has a level-4 user warning on their talkpage, if they have been blocked or banned, or if they have been accused of sockpuppetry, they won't receive an invite. We will also exclude people who meet these criteria from the control group. This filtering strategy has worked well for the Teahouse; there have been very few issues with disruptive editors there in the ~5 years that I've sent out HostBot invites. Does this rationale for using a 2-edit threshold make sense to you, and does my description of the vandal-filtering process address your concerns? Cheers, J-Mo 22:41, 5 May 2017 (UTC)
 * Thanks for such a detailed reply! That makes perfect sense now that you point out the issue of new editors making few edits and then quitting, and it sounds like 2 is the perfect number. The vandal exclusion is also a very neat idea. I am fully supportive of this task; anything to help with the issue is much welcomed. TheMagikCow (T) (C) 10:56, 6 May 2017 (UTC)
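
The exclusion criteria discussed above (a level-4 warning, a block or ban, a sockpuppetry accusation, or an existing Teahouse invite) could be expressed as a simple filter applied to both groups. This is a hedged sketch only: the `NewUser` class and its field names are hypothetical, and how HostBot actually detects each condition is not described in this discussion.

```python
from dataclasses import dataclass


@dataclass
class NewUser:
    """Hypothetical record of the signals mentioned in the discussion."""
    name: str
    has_level4_warning: bool = False
    is_blocked: bool = False
    sockpuppet_suspected: bool = False
    has_teahouse_invite: bool = False


def is_eligible(user):
    """True if none of the exclusion criteria apply to this user."""
    return not (
        user.has_level4_warning
        or user.is_blocked
        or user.sockpuppet_suspected
        or user.has_teahouse_invite
    )


def filter_eligible(users):
    """Apply the same filter to invitees and controls alike."""
    return [u for u in users if is_eligible(u)]


if __name__ == "__main__":
    users = [
        NewUser("ExampleUserA"),
        NewUser("ExampleUserB", has_level4_warning=True),
        NewUser("ExampleUserC", has_teahouse_invite=True),
    ]
    print([u.name for u in filter_eligible(users)])
```

Running the filter before the random split keeps the experimental and control groups comparable, since both are drawn from the same filtered population.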


 * Approximately how many edits per day do you think this would result in? SQL Query me!  19:17, 18 May 2017 (UTC)
 * Around 200 talkpage invites per day, roughly doubling the current volume of edits by HostBot. J-Mo 07:52, 22 May 2017 (UTC)
 * whichever comes first. SQL Query me!  02:45, 23 May 2017 (UTC)
 * Thank you. Just a quick heads-up that I'm going to be largely AFK until June 18, and I obviously don't want to leave the bot unattended during the trial, so I anticipate that I will start this trial on or around Monday, June 19. Cheers, Jmorgan (WMF) (talk) 16:29, 24 May 2017 (UTC)
 * I started the trial today. J-Mo 20:25, 30 June 2017 (UTC)
 * I stopped the bot today. 1385 invites were sent. If everything looks good to you, I'd like to run this trial for another 3-4 weeks to gather a large enough sample for retention analysis. Let me know what you think, J-Mo 21:18, 6 July 2017 (UTC)
 * Would it be possible to slow the bot down a little bit? It seems like running at 80+ edits/min is a little high. Other than that I really don't have any concerns - I'll give it a couple days for others to look over your trial edits as well. SQL Query me!  03:06, 7 July 2017 (UTC)
 * Certainly. I'll add a 5-second sleep between invites. And I'll wait for your signal of 'all clear' before starting up again. Thanks! J-Mo 18:13, 7 July 2017 (UTC)
 * This is just to say that I've implemented the 5-second sleep, which should reduce invite volume to no more than 15-20/min. Cheers, J-Mo 23:41, 13 July 2017 (UTC)
 * Ready for the next wave?--Sage (Wiki Ed) (talk) 19:15, 26 July 2017 (UTC)
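
The throttling agreed on in this thread, a fixed pause between invites, might look something like the sketch below. It is an illustration under stated assumptions rather than the change actually made to HostBot: `send_invites` and the `post_invite` callback are hypothetical names, and the `sleep` parameter exists only so the pause can be swapped out in tests.

```python
import time


def send_invites(usernames, post_invite, delay_seconds=5.0, sleep=time.sleep):
    """Post one invite per user, pausing between consecutive posts.

    `post_invite` is a hypothetical callable that performs the talk page edit.
    Returns the number of invites posted.
    """
    sent = 0
    for index, name in enumerate(usernames):
        if index > 0:
            # Pause before every invite except the first.
            sleep(delay_seconds)
        post_invite(name)
        sent += 1
    return sent


if __name__ == "__main__":
    # Short delay here so the demonstration finishes quickly.
    count = send_invites(
        ["ExampleUserA", "ExampleUserB"],
        post_invite=lambda name: print("inviting", name),
        delay_seconds=0.1,
    )
    print(count, "invites sent")
```

A fixed 5-second pause caps the rate at 12 invites per minute, before counting the time each edit itself takes.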


 * It doesn't look like another trial is necessary. Implementing a sleep function is straightforward and the edits themselves look okay.— CYBERPOWER  ( Chat ) 08:44, 28 August 2017 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.