Wikipedia talk:Education program/Research

Edits
I'd suggest asking editors to provide an exact count of edits (they can easily be given a link to an edit counter). It would be much more useful than ranges. If the ranges are kept, give more choices. As an instructor with over 100,000 edits, I find the current range for instructors (maxed at 200) grossly inadequate, and the ambassador one at 1,500 is little better. --Piotr Konieczny aka Prokonsul Piotrus| talk to me 00:04, 5 May 2012 (UTC)
 * I completely agree with this. I believe that someone suggested to me that an appropriate minimum number of edits for an online ambassador at the time of their first approval to be an OA is 3,000 edits and at least a year of experience as an editor. I don't know that such a high standard is necessary, but the fact that anyone even suggests this level of experience as a minimum for an OA implies to me that this survey should be able to distinguish much higher numbers of edits for CAs and Professors also. Pine(talk) 08:32, 5 May 2012 (UTC)
 * Thank you for your suggestions! This is why we ask for your input, as you're the experts on this type of thing. We'd like to keep as many questions as possible as multiple choice (better for comparing commonalities across classes). Can you guys suggest better range options? For example, in your experience, in what range are editors likely to have the same experience (I'd guess that someone with 508 edits isn't much more experienced than someone with 470). JMathewson (WMF) (talk) 16:39, 7 May 2012 (UTC)
 * Jami, you might have a look at WP:SERVICE. If a user has been registered for less than three months or has fewer than 1,000 edits, they're considered a novice. The way this system has been set up, a user with 2,000 edits who has been around for two years is ranked at the same level as a user who has been around for only 6 months and has 4,000 edits. This system is a tried and true method for evaluating a user's editing experience (since both time and edit count are required), and it may be a good measure here as well. Rob SchnautZ (WMF) (talk • contribs) 20:42, 7 May 2012 (UTC)
 * I agree only partially. I think the service awards are a well-documented convention. But regarding "tried and true", I don't think any research has been done suggesting that "journeyman editors" are meaningfully better editors than "novice editors." Indeed, there are highly experienced editors who get into edit wars, sock puppetry, and copyvios. And it's entirely possible for an editor to reach 10,000 edits just by doing anti-vandal work, which is unlikely to give them the kinds of experiences that would help them be a good WEP ambassador or professor. So we should be very careful when using edit count and years of experience to measure an editor's expertise. Pine(talk) 21:17, 7 May 2012 (UTC)
 * By "tried and true" I meant that the folks at WT:Service awards have attempted to improve the system time after time, and no one's come up with anything better. It's certainly not a tell-all solution, but it is the best solution for this problem that I've seen presented on Wikipedia. Do you have another idea? Perhaps adding a third metric to this ranking system would improve it? Rob SchnautZ (WMF) (talk • contribs) 16:22, 8 May 2012 (UTC)
 * How about adding "How many WEP courses had the ambassador assisted prior to this class?" to give a measure of the ambassador's experience with WEP? Pine(talk) 06:55, 10 May 2012 (UTC)
 * Oooo, I like it. Rob SchnautZ (WMF) (talk • contribs) 16:11, 10 May 2012 (UTC)
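
For reference, the WP:SERVICE scheme Rob describes can be summarized as: a user's level is the highest rank for which both the time requirement and the edit-count requirement are met, so a high edit count cannot outrun a short tenure. A minimal sketch of that logic, with illustrative thresholds rather than the actual service-award table:

```python
# Sketch of a WP:SERVICE-style experience level. The user earns the highest
# rank whose BOTH requirements (months registered, edit count) are satisfied.
# The level names and threshold values below are illustrative only; they are
# not the real service-award table.

LEVELS = [
    ("novice", 0, 0),
    ("apprentice", 3, 1000),   # per the thread: <3 months or <1,000 edits => novice
    ("journeyman", 12, 4000),
    ("veteran", 24, 8000),
]

def experience_level(months_registered: int, edit_count: int) -> str:
    """Return the highest level whose time AND edit requirements are both met."""
    earned = "novice"
    for name, min_months, min_edits in LEVELS:
        if months_registered >= min_months and edit_count >= min_edits:
            earned = name
    return earned
```

With these made-up thresholds, the two users in Rob's example (two years with 2,000 edits, and six months with 4,000 edits) land at the same level, since each is limited by one of the two requirements.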

Suggestion about motivation reason
I suggest changing the reason "What is this instructor's primary motivation for assigning students to edit Wikipedia?" from "in order to be affiliated with the Wikimedia Foundation" to "in order to be affiliated with Wikipedia or the Wikimedia movement." Pine(talk) 08:35, 5 May 2012 (UTC) ✅

"Instructor Wikipedia experience (by edit count), before the start date of each new term"
If I'm understanding this right, the answer to this question will be different for each term the professor teaches, so the same professor may be mentioned in multiple surveys with a different number of edits in each survey. Is that right? I think this question should be clarified. Pine(talk) 08:37, 5 May 2012 (UTC) ✅
 * Yes, you're understanding it right. We have a lot of instructors who have taught for three or four semesters now, so their experience on-wiki has greatly changed since their first edits. Are you sure this question is too unclear? The questions are to guide the interviewer and will not likely be word-for-word. Let me know what you think. JMathewson (WMF) (talk) 16:43, 7 May 2012 (UTC)
 * If the interviewer understands this, that's fine, although I think it's important for some of these questions to be asked word-for-word to limit the possibility of unintended variance creeping into the survey results. Pine(talk) 03:54, 8 May 2012 (UTC)

Options for "Each Campus Ambassador"
Add "university staff" as an option. I believe that at least a few university library staff are campus ambassadors. Also, change "working professional" to two options: "working but not as an employee of the university" and "unemployed and seeking work." Pine(talk) 08:39, 5 May 2012 (UTC)

✅ Very good point-- many CAs are also employed by professors as secretaries. I used "employee" instead of "staff" since at many universities, a distinction is made between "staff", "faculty", and "student worker". Since "employee" has a more universally consistent definition, it will generate less confusion. Rob SchnautZ (WMF) (talk • contribs) 16:30, 8 May 2012 (UTC)

"Subject area"
Add "law", "nursing, public health, or medical sciences", "computer science", "journalism, marketing, or communications", "engineering", "economics", and "business (excluding marketing and economics)". I recommend separating "economics" from social sciences and business because some people might think of economics as a business subject and others might think of it as a social sciences subject, so giving it its own category helps to avoid ambiguity. Pine(talk) 08:52, 5 May 2012 (UTC) ✅

"What kinds of content did students contribute as part of the Wikipedia assignment?"
Add as options "references", "copy editing", and "fact checking". I also suggest broadening the question to "What kinds of contributions or actions did students make as part of the Wikipedia assignment?" Pine(talk) 09:09, 5 May 2012 (UTC)

New section suggestions
I suggest adding questions about talk pages and user pages. Did students participate on article talk pages, and how were their experiences? Did students get comments from Wikipedians on their user talk pages, and how were their experiences? Pine(talk) 09:00, 5 May 2012 (UTC)
 * Very important! Their participation on both types of talk pages should be required and its quality included as a part of their individual marks. --Hordaland (talk) 15:06, 6 May 2012 (UTC)
 * Pine, thanks for these suggestions-- please feel free to edit the page itself with any questions you see fit. Rob SchnautZ (WMF) (talk • contribs) 19:18, 8 May 2012 (UTC)

"Is the university a liberal arts college?"
I suggest changing this to ask "What kind of university was involved?" Options could be "community or technical college," "liberal arts college," "comprehensive university," "research university", and probably others. Pine(talk) 09:05, 5 May 2012 (UTC)
 * Since my university has schools of Liberal Arts, Education/Science/Engineering, Business, and Nursing, I've taken the liberty of adding "science" and a few other common school focuses I'm aware of in my state. Rob SchnautZ (WMF) (talk • contribs) 19:34, 8 May 2012 (UTC)
 * I suggest breaking this question into two: "What kind of university was involved?" and "What kind of college/school was involved, if applicable?" Pine(talk) 07:14, 11 May 2012 (UTC)
 * Like this? Rob SchnautZ (WMF) (talk • contribs) 19:09, 11 May 2012 (UTC)
 * No. For example, the university might be a research university, and the school might be the school of computer science. Pine(talk) 07:14, 12 May 2012 (UTC)

"Is the university public or private?"
Change the options to "public", "private non-profit," and "private for-profit". Pine(talk) 09:07, 5 May 2012 (UTC)

Quality and burden/impact
Although the bullets at the top of the list specify that quality is a key metric, there doesn't seem to be an on-wiki quality assessment component. I think the PPI metric should be re-used here, and assessors recruited to do the assessment work -- we could start by pinging the PPI assessors.

Second, there's no measure of community impact, or of the burden placed on the community by these courses. This is important to measure if we want to determine whether these classes cost more in labour than they deliver in quality. I suggest a short questionnaire asking editors to rate how much work they had to do to clean up the students' edits. The questionnaire needs to distinguish between work done to, e.g., format or MOS-ify a perfectly good piece of content, and work done to remove material cited to inappropriate sources. The former is not a burden; it's additional improvement on top of the improvements made, and shouldn't be regarded as negative impact. The latter is an unnecessary burden and needs to be counted as negative impact.

I'm going to post a couple of notes to draw attention to this discussion and see if we can get more participation. If anyone can think of additional forums that would be good places to mention this, please post a note there too. Mike Christie (talk - contribs - library) 12:34, 6 May 2012 (UTC)
 * Whoops, just saw this comment, which corresponds a bit with the message below. Please post anything you find there! JMathewson (WMF) (talk) 18:06, 7 May 2012 (UTC)

Measures of success
To properly assess the classes that have participated in the PPI or US/Canada Education Programs so far, we realize we need some measures of success to cross-reference with the findings from these questions. I think you guys probably have some good ideas about which measures of success are relevant and how to gather that information. I'm listing a few ideas below but would like some input on what else you think we should look into. Please remember that we may not have the resources to put into every measure of success suggested, or a particular measure may not align with the goals of the Education Program. Still, we will try to provide the best results based on everyone's input. These are just some ideas, but please comment below on any measures you think are extremely important to track. Again, I'd appreciate your suggestions for how to assess any of these measures, too. JMathewson (WMF) (talk) 18:00, 7 May 2012 (UTC)
 * Quality change/article improvement
 * Impact on readers
 * Number of articles created
 * Number of student articles to achieve Good Article status
 * Number of student articles to reach DYK
 * Sustainability
 * How many professors returned (or plan to return) to the program?
 * How many ambassadors returned (or plan to return) to the program?
 * I agree we need a quality metric. I'm not sure what "impact on readers" is; are you talking about the articles' importance?  If so I don't think this should be a factor.  I also don't think you can pay much attention to the number of articles created -- it's quite hard to find a good new article topic and I suspect that if there's any correlation between this number and quality of contributions it's as likely to be negative as positive.  The next two, GA and DYK, are fine but could be subsumed into a more general quality metric.  Sustainability is interesting, but seems an indirect measure of success.
 * As I said above, I think we also need a measure of the burden the class places on the community.
 * I do think the instructor's contributions are a key metric, and an easy one to get. I'd suggest recording the total edit count as well as the number of edits to article space and article talk. Per Piotr above, I wouldn't bother trying to categorize it by e.g. >100 edits; just record the number, and we can work with the data later if necessary. I also agree with the opinions expressed above that to be regarded as more than a novice you have to have a minimum of a thousand or so edits over a period of months. It's very hard to convey to someone without that number of edits exactly what is learned by the time one makes one's five-thousandth edit. Mike Christie (talk - contribs - library) 01:39, 8 May 2012 (UTC)

Data collection
I'm not sure how this works from the WMF end, but from my end the ethics committees (which you don't have to worry about) are generally concerned about asking for information not related to the particular research question. Thus I guess I should flag instructor age as a concern. - Bilby (talk) 03:18, 8 May 2012 (UTC)
 * I don't think that's a significant issue here. It could be argued that age plays a role in how professors interact with the community (either because of the connection between age and comfort with technology, which is debatable, or because of a disparity between professor age and "average Wikipedian" age). Nikkimaria (talk) 03:39, 8 May 2012 (UTC)
 * I think you would have to make that case, though. And I also don't think that it is a major worry, but (at least here) we need to be able to justify why we ask for any personal data in a survey tool. I'm not overly worried if the wish is to include it, but I do feel the need to flag it as a potential concern, in the same way I would flag a request for gender or other personal data that seems unrelated to the success of the course. (And I mean no ill in this - it's just one of those things you are trained to question). - Bilby (talk) 03:47, 8 May 2012 (UTC)
 * Actually, querying gender would be an even better idea, given the WMF concern with the "gender gap". But I get where you're coming from. Nikkimaria (talk) 04:04, 8 May 2012 (UTC)
 * I definitely see where you're coming from. I think the idea is to gather as many potential factors as possible so we can do a proper comparison across the board. It still doesn't mean age is causal, even if all of our "successful" courses happen to have a 31-year-old instructor. Just like it doesn't mean only graduate classes can be successful, even if all of our "successful" courses turn out to be graduate classes. Thank you for flagging the concern, though. It's definitely not something the interviewer will insist on getting from the instructor. JMathewson (WMF) (talk) 16:38, 8 May 2012 (UTC)

Professor's article space edits
I've added an article space edit count to the professor metrics. For consistency I left the edit counts as they were for the overall metrics, but I think the buckets are too small -- the cutoffs should be more like 100, 500, and 2,000. However, if we record actual numbers along with the bucket, we'll have both versions of the data to look at if necessary. Mike Christie (talk - contribs - library) 00:36, 10 May 2012 (UTC)
 * Agreed-- 50 edits doesn't say much; it usually indicates a somewhat limited knowledge of the syntax, but the numbers you suggested seem more relevant to understanding whether the instructor has solid editing experience. Rob SchnautZ (WMF) (talk • contribs) 16:15, 10 May 2012 (UTC)
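
Recording the raw count and deriving the bucket from it, as suggested above, might look like the following sketch. The cutoffs are the 100/500/2,000 proposed in this thread; the labels are invented for illustration, and the ranges can be re-cut later since the raw numbers are kept:

```python
# Sketch: store the raw edit count and derive range labels from it, so the
# buckets can be redefined later without losing data. Cutoffs are the
# thread's suggested 100 / 500 / 2,000; labels are made up for illustration.
import bisect

CUTOFFS = [100, 500, 2000]
LABELS = ["<100", "100-499", "500-1,999", "2,000+"]

def bucket(edit_count: int) -> str:
    """Map a raw edit count onto its range label."""
    return LABELS[bisect.bisect_right(CUTOFFS, edit_count)]
```

This keeps "both versions of the data": surveys can report the bucket, while analysis can fall back to the exact count.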

Impact on the community
I see three methods of measuring community impact: 1) "Does the student reply to comments placed on their talk page?", followed, for those who respond, by "Do they accept the feedback, argue against Wikipedia policies, or not address the feedback at all?"

2) The number of nominations to DYK, GA, and FA and the percentage that were passed compared to the average for the community as a whole. Also the number of peer reviews requested and the degree to which they were acted on by students versus the norm for the community as a whole. 3) Survey Wikipedians. -- Doc James (talk · contribs · email) 01:11, 10 May 2012 (UTC)
 * As an aside, the first would be a potential factor for student success, but wouldn't directly relate to community impact. However, that and some other data might be better derived from edit histories - is it practical to derive some data from histories rather than expanding the survey tool further? I assume that these won't be anonymous submissions, or that the submissions would be grouped by course. - Bilby (talk) 01:42, 10 May 2012 (UTC)
 * I don't think edit histories would work, because you can't tell how much time was sunk into those edits, and that's the real metric. If I make one edit to clean up a student's work, and to make it I spent thirty minutes digging in online sources to verify my suspicion that the student used an unacceptably poor source, that's indistinguishable from a five-second edit to remove obvious nonsense. I don't think the surveys are a particularly good way to measure impact, but so far we haven't come up with anything better. And I don't think it's acceptable not to measure impact, because impact is the key component of the cost in any cost-benefit analysis of the EP. Mike Christie (talk - contribs - library) 02:01, 10 May 2012 (UTC)
 * Sorry, I confused two issues here. I agree edit histories don't help measure impact. My thought was in regard to other questions, such as the extent to which professors (always confusing to write that from an Australian perspective) edited prior to the start of the course, or whether students responded to talk page messages. I'm wondering to what extent survey responses will or can be supplemented with data derived from public logs, in order to minimize the survey length while providing more data.
 * In regard to the thread's topic, as discussed elsewhere, we can't evaluate community impact without a baseline measure, but my assumption was that this research is to identify indicators of course success, for which impact measures are useful for internal comparisons. So the GA/DYK pass rate would have a bearing, although limited in scope. - Bilby (talk) 03:17, 10 May 2012 (UTC)
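
The GA/DYK pass-rate measure discussed above reduces to comparing two proportions: the student pass rate against the community-wide baseline. A minimal sketch, with placeholder numbers that are not real program data:

```python
# Sketch: compare student nomination pass rates against a community baseline.
# All figures below are placeholders for illustration, not real program data.

def pass_rate(passed: int, nominated: int) -> float:
    """Fraction of nominations that passed; 0.0 if nothing was nominated."""
    return passed / nominated if nominated else 0.0

# Hypothetical inputs: student classes vs. the community as a whole.
student_rate = pass_rate(passed=6, nominated=20)       # e.g. 6 of 20 student GA noms passed
community_rate = pass_rate(passed=450, nominated=900)  # e.g. community-wide baseline

# A simple relative measure: >1.0 means students outperform the baseline.
relative = student_rate / community_rate if community_rate else float("nan")
```

A real analysis would also want a significance test, since student nomination counts per class are small, but the ratio alone already gives the internal comparison Bilby describes.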

Derived data
Hi! I asked this above, but probably didn't do a good job of it, so I figure it is worth asking in its own section. :) One of the old rules of survey tool design is that you need to keep them manageable in size, especially when they're optional. Long surveys don't always get completed. That's not a big deal here, as I think people will be inclined to run through them anyway, but there are a few things in the list which could be easily derived from available data rather than asked during an interview. It seems you could reduce the list by about 10-15 questions by relying on data derived from other sources. Is this something that is planned and/or possible? - Bilby (talk) 04:46, 11 May 2012 (UTC)
 * I was starting to think about a similar problem. If we have dozens of questions, how do we know which variables are the most important to use as independent variables when we analyze the data? Is it feasible to analyze every way of looking at the data? This might be a good question for a statistician, if WMF has any of those. Pine(talk) 07:17, 11 May 2012 (UTC)
 * These are just the datapoints we'll be trying to compile for each class. We already have the information for a lot of them but wanted to present them all to you, in case you guys had feedback on those points, too. The others will have to be asked in an interview, but the actual interviews won't be as long as this list. Thanks, JMathewson (WMF) (talk) 19:51, 11 May 2012 (UTC)

Syllabi
Where is the appropriate place where people can upload their syllabi of courses that involve Wikipedia (either as one of the sources, or the topic of the course)? I know people who have designed such courses, and I'm writing this while attending Wikimania 2012, and the question has come up. -- kosboot (talk) 01:48, 12 July 2012 (UTC)
 * Ok, I just found this: http://outreach.wikimedia.org/wiki/Education/The_Syllabus#Syllabus_collection_from_past_terms.  But it was hard to find!  Even Google didn't bring it up. If one of the aims is to solicit ideas and syllabi, it's well hidden. -- kosboot (talk) 02:27, 12 July 2012 (UTC)
 * Hi Kosboot! I hope others who have used Wikipedia assignments before will also add themselves to the new Case Studies brochure web version at http://education.wikimedia.org/casestudies. We'll have physical copies of the printed brochure at the Wikipedia in Education Meet-Up on Friday evening; since you're at Wikimania, I hope you'll be able to join us! -- LiAnna Davis (WMF) (talk) 11:26, 12 July 2012 (UTC)
 * Thank you, LiAnna - good to know there's an education.wikimedia.org (even if it's an alias). I'm going to suggest that to someone I know who's taught 2 Wikipedia courses.  I have to check the schedule, but I'm speaking Friday morning, so I'm not sure I'll be able to attend, but I hope you or someone takes good notes on what transpires. --kosboot (talk) 11:58, 12 July 2012 (UTC)