Wikipedia:Bots/Requests for approval/HostBot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

HostBot
Operator:

Time filed: 21:01, Thursday April 5, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: source code will be made available

Function overview: HostBot is intended to make the process of inviting new editors to WP:Teahouse easier by providing hosts with a regularly-updated list of promising new editors to invite. Regularly updating invite status further streamlines the process by allowing hosts to track which editors have already been invited. Note: this bot is intended to take over automated tasks that are already being run under my user account, a situation which arose because I needed to get reports working in time for the Teahouse pilot project launch. Creating a bot for these tasks is a necessary step towards a sustainable solution for the Teahouse invite process, especially as Teahouse shifts from a fellowship project to one that's purely volunteer-based.

Links to relevant discussions (where appropriate): Wikipedia_talk:Teahouse/Host_lounge

Edit period(s): new report generated once a day; "invite status" on current report is updated every 30 minutes when changes happen

Estimated number of pages affected: 1

Exclusion compliant (Y/N): N/A

Already has a bot flag (Y/N): No

Function details: Once a day, HostBot would generate two database reports on the Teahouse invitee reports page. These reports initially pulled from the enwiki db on Toolserver, but scripts were moved to WMF db1047 when the Nightshade server went down, and have been there ever since. These reports display information about a sample of potential Teahouse invitees who match the following criteria:
 * 1) New editors: editors who joined within the last 24 hours, have since made more than 10 edits, and were not blocked at the time the report was generated, and
 * 2) Newish editors: editors who joined within the last 4 days, have since made more than 20 edits over the course of at least 3 sessions, and were not blocked at the time the report was generated.
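The two criteria above can be sketched as a simple classifier. This is only an illustration, not HostBot's actual code: the `Editor` record and the `classify` function are hypothetical names, and the rule that an editor matching both criteria is listed as "new" is an assumption, since the request does not say how overlaps are handled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Editor:
    """Hypothetical record holding the fields the criteria refer to."""
    name: str
    registered: datetime
    edit_count: int
    session_count: int
    blocked: bool


def classify(editor, now):
    """Return 'new', 'newish', or None for an editor at report time `now`."""
    if editor.blocked:
        # Blocked editors are excluded from both lists.
        return None
    age = now - editor.registered
    # Criterion 1: joined within the last 24 hours, more than 10 edits.
    if age <= timedelta(hours=24) and editor.edit_count > 10:
        return "new"
    # Criterion 2: joined within the last 4 days, more than 20 edits,
    # spread over at least 3 edit sessions.
    if (age <= timedelta(days=4)
            and editor.edit_count > 20
            and editor.session_count >= 3):
        return "newish"
    return None
```

For example, an unblocked editor who registered five hours before the report and has 11 edits would be classified as "new", while one who registered three days earlier with 25 edits in only 2 sessions would not be listed.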

The report includes the following metadata about each editor: username (linked to the editor's talk page); edit count (New editors) or edit sessions (Newish editors); whether the editor has "Email this user" enabled (and if so, a link to email them); a link to the editor's contribs; and "invite status" (initially blank). An "edit session" is a series of edits in which each edit was made less than one hour after the previous one. This script uses the wikitools framework.
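The edit-session definition can be illustrated with a short sketch. It assumes the edit timestamps are already available as `datetime` objects; `count_edit_sessions` and `SESSION_GAP` are hypothetical names, and the report scripts themselves may compute sessions differently (for instance, directly in a database query).

```python
from datetime import datetime, timedelta

# An edit made less than one hour after the previous edit stays in the
# same session; a gap of one hour or more starts a new session.
SESSION_GAP = timedelta(hours=1)


def count_edit_sessions(timestamps):
    """Count edit sessions in an iterable of edit timestamps."""
    ordered = sorted(timestamps)
    if not ordered:
        return 0
    sessions = 1
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous >= SESSION_GAP:
            sessions += 1
    return sessions
```

With edits at 09:00, 09:30, 09:45, 12:00, 12:20, and 15:00 on the same day, this counts three sessions: the first three edits, the next two, and the last one.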

A separate update script runs every 30 minutes thereafter. It checks for transclusions of the Teahouse Invite template via an API query and generates an updated report with the "Invited" column filled in for users who have received a templated invite since the last update. This script uses pywikipediabot.
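A rough sketch of the update step follows. It builds parameters for the MediaWiki API's `list=embeddedin` query (which lists pages transcluding a given template), extracts usernames from a response, and fills in the "Invited" column. The function names, the row layout, and the template title passed in are assumptions for illustration; the sketch makes no network request, does not use pywikipediabot, and omits query continuation. The `changed` flag reflects the behaviour requested in the discussion, where the bot edits only when there is a new status to post.

```python
def embeddedin_params(template_title, limit=500):
    """Build API parameters listing User talk pages that transclude a template."""
    return {
        "action": "query",
        "list": "embeddedin",
        "eititle": template_title,
        "einamespace": 3,  # namespace 3 is "User talk"
        "eilimit": limit,
        "format": "json",
    }


def invited_users(api_response):
    """Extract usernames from the 'embeddedin' list of a JSON API response."""
    pages = api_response.get("query", {}).get("embeddedin", [])
    # "User talk:Alice" -> "Alice" (talk subpages are not handled specially)
    return {page["title"].split(":", 1)[1] for page in pages}


def update_invite_status(rows, invited):
    """Return (updated_rows, changed); `changed` is False when nothing is new."""
    changed = False
    updated = []
    for row in rows:
        if row["user"] in invited and row["invited"] != "Invited":
            row = dict(row, invited="Invited")
            changed = True
        updated.append(row)
    return updated, changed
```

A caller would save the report page only when `changed` is true, which avoids edits that add nothing to the page history.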

Discussion
Seems straight-forward enough, only 1 page affected. — HELL KNOWZ  ▎TALK 11:06, 6 April 2012 (UTC)
 * Comment A large portion of the edits seem to be just updating the time it updated the page without updating any invitees. Is there a way to make it not edit if there is no change, to avoid making surplus edits for no reason?  Rcsprinter  (message)  19:45, 12 April 2012 (UTC)
 * Indeed. Is there any reason why it is so important that the most recent timestamp is on the page? It only hinders page watchlisting and clutters diffs/histories. — HELL KNOWZ  ▎TALK 08:40, 13 April 2012 (UTC)
 * Thanks for the feedback. I'm currently revising the update script so that it only edits the reports page when there are new invitee status updates to post. I confess that I'm not clear on why having the most recent timestamp on the page is problematic (other database reports include this update timestamp), but I can certainly remove it from Teahouse/Hosts/Database_reports if it causes problems; it's not a critical feature. I'll assume that the bot will continue in trial status until I've demonstrated that my changes have been successful? - J-Mo  Talk to Me   Email Me  22:57, 14 April 2012 (UTC)
 * Not the timestamp itself, but the update frequency. The page you linked is updated only once a month. You can keep running the trial until you feel confident it's working. — HELL KNOWZ  ▎TALK 23:01, 14 April 2012 (UTC)
 * K. Fixed the scripts so that HostBot only edits the report when there's new content to add, and removed the timestamp. How does it look now? - J-Mo  Talk to Me   Email Me  18:58, 16 April 2012 (UTC)

Looks good, don't see any issues. Only one page affected, all issues clarified, task has consensus. Happy sailing. — HELL KNOWZ  ▎TALK 19:01, 16 April 2012 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.