User:Ocaasi/TWAbag

HostBot 4
Operator:

Time filed: 14:58, 7 November 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python, uses WikiTools

Source code available: Source code for HostBot is public; source code for this particular task is being developed

Function overview: Invites new good faith editors to play The Wikipedia Adventure

Links to relevant discussions (where appropriate):

Edit period(s): Daily

Estimated number of pages affected: 100 per day during the course of the beta test (2-4 weeks)

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details:

The sampling and data analysis plan for the TWA test will be similar to that used to evaluate the impact of participation in the Wikipedia Teahouse, which is described in the Teahouse metrics report and in this research paper.

We will invite a sample of 100 new editors to play TWA every day. The sample will be drawn from the set of users classified as “good faith” by the Snuggle tool developed by EpochFail. A sample of Snuggle data is available here. The criteria for invitation will be:
 * The user created their account within the past 24 hours
 * The user has made at least 1 main namespace edit
 * The user has a Snuggle desirability score greater than 0.8. This threshold excludes blocked or banned accounts, as well as users who are likely to be editing in bad faith.
 * The user has not yet received a Teahouse invitation.
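The four criteria above can be read as a simple filter. The following is a minimal sketch only: the `NewEditor` record and its field names are hypothetical and do not reflect Snuggle's or HostBot's actual data schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class NewEditor:
    # Hypothetical record; field names are illustrative, not a real schema.
    name: str
    registered: datetime
    mainspace_edits: int
    desirability: float
    teahouse_invited: bool


def is_eligible(user: NewEditor, now: datetime) -> bool:
    """Apply the four invitation criteria listed above."""
    return (
        now - user.registered <= timedelta(hours=24)  # account under 24 hours old
        and user.mainspace_edits >= 1                  # at least 1 main namespace edit
        and user.desirability > 0.8                    # Snuggle score above threshold
        and not user.teahouse_invited                  # no Teahouse invitation yet
    )


now = datetime(2013, 11, 7, 15, 0, tzinfo=timezone.utc)
candidates = [
    NewEditor("ExampleA", now - timedelta(hours=3), 2, 0.93, False),
    NewEditor("ExampleB", now - timedelta(hours=30), 5, 0.95, False),  # account too old
    NewEditor("ExampleC", now - timedelta(hours=1), 1, 0.60, False),   # score too low
    NewEditor("ExampleD", now - timedelta(hours=2), 1, 0.85, True),    # already invited to Teahouse
]
eligible = [u.name for u in candidates if is_eligible(u, now)]
print(eligible)  # ['ExampleA']
```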

Invitations to play the game will be posted to users' talk pages by HostBot. Users who receive an invitation and subsequently complete at least 1 level of the Wikipedia Adventure will serve as the Experimental group (Group A).

For every 100 editors invited to play TWA, another 100 new editors who meet the criteria for invitation will not receive one. Of these editors, those who subsequently make at least 1 edit to Wikipedia will serve as a basic experimental control group (Group B). We require at least 1 subsequent edit (after the hour when the user would have been invited to TWA, had they been included in Group A) to ensure that the editors in this group would have had the opportunity to see the invitation, i.e., that they had not already given up or lost interest in editing by the time of invitation.
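One way to produce the invited and held-out sets is a random split of each day's eligible pool. The text does not say how users are assigned, so random assignment is an assumption here, and `split_daily_sample` is a hypothetical helper rather than part of HostBot.

```python
import random


def split_daily_sample(eligible_users, n_per_arm=100, seed=None):
    """Shuffle the eligible pool and split it into invited and held-out arms.

    Assumes random assignment; the proposal does not specify the method.
    """
    rng = random.Random(seed)
    pool = list(eligible_users)
    rng.shuffle(pool)
    invited = pool[:n_per_arm]                  # will receive a TWA invitation
    held_out = pool[n_per_arm:2 * n_per_arm]    # eligible, but not invited
    return invited, held_out


users = [f"User{i}" for i in range(250)]
invited, held_out = split_daily_sample(users, n_per_arm=100, seed=42)
print(len(invited), len(held_out))  # 100 100
```

If fewer than 200 eligible users are available on a given day, the held-out arm in this sketch is simply shorter; a real implementation would need to decide how to handle that case.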

A second control group (Group C) will consist of editors who received an invitation and did not play TWA at all, but who did make at least 1 edit to Wikipedia after receiving the invitation. This control group will be used to determine whether the invitation itself has any effect on subsequent editing activity or long-term retention, separate from the potential impact of playing TWA.
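The three group definitions can be summarized as a small classification function. This is an illustrative sketch: the function and parameter names are hypothetical, and `later_edits` stands for edits made after the actual or would-be invitation time.

```python
from typing import Optional


def assign_group(invited: bool, started_game: bool,
                 levels_completed: int, later_edits: int) -> Optional[str]:
    """Return "A", "B", "C", or None for users who fit no group."""
    if invited and levels_completed >= 1:
        return "A"  # invited and completed at least one level
    if not invited and later_edits >= 1:
        return "B"  # held out, but still editing after the would-be invite hour
    if invited and not started_game and later_edits >= 1:
        return "C"  # invited, never played, but kept editing
    return None     # e.g. started the game without finishing a level


print(assign_group(True, True, 2, 5))     # A
print(assign_group(False, False, 0, 3))   # B
print(assign_group(True, False, 0, 1))    # C
print(assign_group(True, True, 0, 4))     # None
```

Note that, read literally, the definitions leave invited users who start the game but finish no level outside all three groups, which is why the sketch returns `None` for them.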

The subsequent editing activities of the editors in Group A will be compared with those of the editors in Groups B and C. Metrics used to evaluate impact are likely to include number of edits, number of articles edited, change in Snuggle desirability score, and level of activity over time (retention), and may include other metrics.

Editors who start the game will be monitored for signs of increased vandalism to the encyclopedia, and cleanup actions will be taken by those monitoring as needed during the test.

Background
There is interest in how principles of game mechanics and playful design can encourage users to take meaningful actions online. It is not yet known, however, whether that body of research offers lessons for improving Wikipedia. We have therefore designed an experiment to test whether an onboarding game is a useful method for training new Wikipedians, using a fun, interactive game/tour called The Wikipedia Adventure.

Research questions
Do new editors who complete the Wikipedia Adventure:
 * go on to be more active and successful Wikipedians?
 * make more positive contributions to the encyclopedia?
 * have a better understanding of Wikipedia and experience fewer frustrations?
 * remain with the community for longer than editors who are not exposed to it?

Analysis
We will be logging the game using Guided Tours so we can evaluate the degree to which the game impacts engagement.

We’re comparing:
 * Not invited
 * Invited but didn't complete Mission 1
 * Completed Mission 1
 * Completed Mission 4
 * Completed Mission 7
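The comparison cohorts above could be labeled as in the sketch below. It assumes that "Completed Mission N" means the user's highest completed mission is at least N, so that each user lands in the highest cohort they reach; the text does not state this, and the function name is hypothetical.

```python
def cohort(invited: bool, highest_mission: int) -> str:
    """Map a user to one of the five comparison cohorts.

    Assumption: cohorts are thresholds on the highest mission completed.
    """
    if not invited:
        return "Not invited"
    for threshold in (7, 4, 1):
        if highest_mission >= threshold:
            return f"Completed Mission {threshold}"
    return "Invited but didn't complete Mission 1"


print(cohort(False, 0))  # Not invited
print(cohort(True, 0))   # Invited but didn't complete Mission 1
print(cohort(True, 5))   # Completed Mission 4
```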

Quantitative
We'll run database queries to gather quantitative data including:


 * number of edits to articles
 * number of talk page edits
 * frequency of edits over time
 * amount of content added that survives over time
 * warnings and blocks
 * STiki scores - metadata analysis
 * Namespace breakdown


 * Mission-specific skills we can evaluate:
   * userpage edits
   * user talk edits
   * article space edits
   * teahouse edits
   * inline citations added
   * wikilinks, images, headers added
Qualitative

 * A survey linked from the last page of the game will be deployed to help assess the editors' satisfaction, experience, and understanding of Wikipedia
 * We may also use Snuggle to manually review game participants and categorize their editing activity subjectively

Follow-up
The results of the experiment will be shared with the community by the end of 2013 to inform a decision about whether the game should be more widely deployed, discontinued, or tested further.