Wikipedia talk:India Education Program/Archive 5

WMF EP quantitative analysis conference 22 November
There's an online conference presenting quantitative analysis of the IEP (mostly) occurring at 16:00 UTC, 22 November (see here for your timezone). More information is available at Global Education Program Metrics and Activities Meeting. The general community is welcome to attend; we need a few people there to ask the hard questions and disrupt the inevitable WMF circlejerk. (I won't be attending myself, time = midnight for me).

I also note with disdain this official WMF blog post, which is extolling the virtues of expanding the EP without regard to community health.

P.S. Why aren't the Foundation staff posting this? (Cross posted to WT:USEP.) MER-C 05:50, 18 November 2011 (UTC)


 * I notice they only have capacity for 25 participants so get there early. Joja  lozzo  15:54, 18 November 2011 (UTC)


 * I've replied to MER-C's post at WT:USEP, but please note that this meeting is not "mostly" about the India Education Program -- it is a high-level overview of activity happening in education programs around the world on several different language versions of Wikipedia. India is one of five countries we will discuss during the meeting. -- LiAnna Davis (WMF) (talk) 16:53, 18 November 2011 (UTC)

Reformat finished
The reformatting of WP:IEPS is finished (a few trivial things are left, which can be done later). For all of you involved in the cleanup, the rules for editing the tables are listed below (the second column is the comments column, and the third is the status column). Some more rules have been kept here (this page has been transcluded to the editnotice of WP:IEPS and is visible to whoever tries to edit the page). If you have any queries, please post them below. Thanks, Manish Earth Talk •  Stalk 09:42, 18 November 2011 (UTC)
 * Please check all edits made by a student, not only the IEP-related ones. Also, check if a student has already been commented on/cleaned up on a different course (many students have enrolled for multiple IEP courses) before proceeding.
 * Explain what you have changed in the edit summary. If you have changed some data (not if you have added a comment), please sign the "Last local table update: signature/date" hatnote with ~ (you may have to remove an older signature).
 * All comments now go in the "cleanup comments" section. They should be placed in bulleted lists (using *), with the most recent comment at the top. Precede the comment with a timestamp, using . Please sign the comments.
 * When you are done checking, update the "Cleanup status" column with the status (using a timestamped bulleted list like the "Cleanup comments" column). Use the following statuses: "Checked:OK", "Checked:Copyvios/Blanked", "Checked:Copyvios/Not blanked", "Not sure", and "Unchecked". If you want to add more detailed info, use the "cleanup comments" column. Wrap the cell with yes, no, partial templates according to the latest check.  (Yes for "Checked,OK", No for "copyvio", Partial for "not sure"). Note that these templates only work if their opening braces touch the "|" of the table cell. E.g:
 * blah
 * Have checked thoroughly, does not seem to be a copyvio.
 * 2011-10-19T01:14Z: I'm not too sure about the first section. Example (talk)
 * Checked:OK
 * 2011-10-19T01:14Z: Not sure
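The status rules above can be sketched in code. This is only an illustration: the function and constant names are hypothetical, and the exact cell layout (template directly after the "|", then a timestamped bullet) is an assumption based on the description rather than a copy of the actual table markup. The status strings, the yes/no/partial mapping, and the timestamp format come from the rules themselves.

```javascript
// Hypothetical mapping from cleanup status to the cell template named in the rules.
// "Unchecked" has no template, so it is deliberately absent.
const STATUS_TEMPLATE = {
  "Checked:OK": "yes",
  "Checked:Copyvios/Blanked": "no",
  "Checked:Copyvios/Not blanked": "no",
  "Not sure": "partial",
};

// Format a Date as a UTC, minute-precision stamp such as 2011-10-19T01:14Z.
function cleanupTimestamp(date) {
  return date.toISOString().slice(0, 16) + "Z";
}

// Build the text of a status cell. The template's opening braces touch the "|",
// as the rules require; the bullet layout after it is an assumption.
function statusCell(status, date) {
  const template = STATUS_TEMPLATE[status];
  const prefix = template ? "{{" + template + "}}" : "";
  return "|" + prefix + "\n* " + cleanupTimestamp(date) + ": " + status;
}
```

Calling `statusCell("Not sure", new Date())` would give a cell starting with `|{{partial}}` followed by a timestamped "Not sure" bullet.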
 * I have a couple of questions.
 * I've been posting in the OA column, as I have no copyright cleanup experience; do you want me to not do that any longer? Do you need me to move my past comments to the cleanup column?
 * It looks like you're considering this as an article by article cleanup, so that the comments apply to the article listed in that row. I don't think that's a good idea -- the students often didn't work on the article listed, but worked on another article, or multiple articles, instead.  In addition, I've found it's much more efficient to check all a student's contributions when I get to that student, and make an annotation that refers to all their work. If you look at my comments you'll see that they are identical for each instance of the student.  Search for "vastu1706" and you'll see what I mean.  The result is that the comments you give as an example would not be meaningful, because you'd have to qualify which article you're talking about.
 * Finally, I'm curious as to why so much effort is being put into reformatting this page. So much of what the students have done is being completely removed that in many cases there's nothing for a returning cleanup pass to do.  See this section, for example, and read the first half dozen comments; I don't think there's much left to examine.  Add to that the fact that most students appear in the list two or three times at least, and that the cleanup work done so far has probably dealt with perhaps a third of the page already, and I'm not sure that this is such a big project after all -- at the current rate, about another two weeks will probably see it completed.  I have no CCI background so pardon me if I'm missing something here.
 * Thanks -- Mike Christie (talk - contribs - library) 12:24, 18 November 2011 (UTC)
 * Answers:
 * No, the past comments can stay where they are (You can move them if you want, just set the timestamp to 'unknown'). You do not need copyright cleanup experience to use the cleanup columns, just common sense. The CCI people will do a check later for anything we've missed.
 * Actually, it's a student by student cleanup. Sorry, I forgot to mention that. You must check the contribs of all the students in a row, clean up whatever is necessary, and report the findings. I don't see how that would make the comments not meaningful. You can qualify the article, and add another comment for the next article.
 * I've added a note above on how to handle multiple students.
 * Because the whole thing was rather haphazard before. One of the problems with different formats is that it's hard to see fully what's going on. The other problem was that exporting the student/article data to a machine-readable list was a big headache (and this is needed if you want to keep an eye on the students or do some repetitive task). Actually, it wasn't much 'effort' to reformat the tables (I had a script), just a bit boring. Regarding your estimate, I (and I think most of the others) would disagree that it will take a few weeks to complete this (read some of the discussions above, for example the "Common format-consensus" one).  Manish Earth Talk •  Stalk 13:18, 18 November 2011 (UTC)
 * Mike, this was discussed in other threads further up. Even if most of the students' contributions will have to be deleted in the end (a pity, but not our fault), you need a solid database to work on. So far, we haven't had anything like this. The various lists in other places have been incomplete, outdated, faulty, contradictory and inconsistent. Tons of articles were missing. We still find "new" IEP students by accident who have not been listed so far. And some students have created more than one account or have been editing under IP addresses. Basing your work on this mess, you will be able to investigate some edits, but you will also miss many, even if you carefully check a student's other edits. The full article list to be checked could (and will) be automatically derived from a full students list, but we maintained the articles in the master table as well, since in several cases we found "new" IEP students by checking edits of originally listed or related articles which have been in the scope of the programme. So, in order to get a complete list of students, we will have to reverse-lookup and check any editor who edited an article in the past months to see if they might have some connection with IEP. Some of this can be automated, but only after the tables on the master list have been brought into a uniform table format, so that they can be parsed by scripts easily. Another reason for the table reformatting is that, in order to force everyone to work on a single database (the only way to avoid synchronisation problems), we deleted the various distributed lists in other places (after carefully merging the info). Most of the info is not relevant for any cleanup efforts, but if it hadn't been relevant for the local community (instructors etc.) they would not have added it to the tables. After all, they still have to evaluate the students' work to determine if they have passed their courses or not. If we just deleted their stuff without consideration, our efforts could hardly be seen as an attempt at cooperation, and a complete lack of communication and cooperation (from their side) is what has caused this chaos in the first place. --Matthiaspaul (talk) 13:47, 18 November 2011 (UTC)
 * Thanks for the responses; I didn't mean to complain, just check on a couple of things. I'll start adding my comments to the cleanup column and stop adding them to the OA column.  One remaining issue: what I meant by "not meaningful" is that "I'm not too sure about the first section" (the given example comment) doesn't mean anything if the row refers to multiple articles.  Anyway, if the comments I've already been making are acceptable I will continue to use that format (in the new column) since I have worked out a fairly efficient process for doing them that way. Mike Christie (talk - contribs -  library) 14:09, 18 November 2011 (UTC)
 * Make sure that you still timestamp and bullet the comments, whatever format you use. Regarding the example comment, it wasn't exactly an example comment, it was just to display how one should use yes and the timestamps. Manish Earth Talk •  Stalk 15:28, 18 November 2011 (UTC)

One more question: if I find copyright issues currently I am just deleting the material, and leaving an appropriate note, both here and in the edit summary. Technically this material should be revdeleted, I gather. Should I be flagging these for revdel? If so, what's the best way to do that? Mike Christie (talk - contribs - library) 15:30, 19 November 2011 (UTC)
 * And one more; if an article listed was not in fact worked on by a student, should I delete that article from that row of the table? Mike Christie (talk - contribs - library) 15:47, 19 November 2011 (UTC)
 * When I search for copyvios I only check online sources. If I find nothing I will add "could not detect copyvio" to the comments on that article, and will add "Checked: OK" as the status, even though I've not checked offline sources.  Let me know if that's not OK; and also please take a look at the first few edits I make here and let me know if there are other changes you need me to make. Mike Christie (talk - contribs -  library) 16:06, 19 November 2011 (UTC)
 * Yes, it should be revdeled, that will be done during the CCI. You can flag them if you wish (that will reduce the CCI workload), but it is not necessary to do so. Don't remove articles from the list, the list isn't only for the cleanup (The profs need it, too).
 * Yes, it is understood that we can't easily check offline sources, so an online check is enough. Fortunately, if something is copied from a textbook, it is usually obvious (at least for most of the IEP edits I've seen). These can be safely blanked. If you're not sure, use the "not sure" option. Also, if you want to check out textbooks, you can do a Google Book search of the text in question.
 * By the way, don't forget to add the yesno templates to the cleanup status. This will primarily help the CCI (and us) understand what is needed at a glance (yes means it's clean, no means that it needs revdelling, partial means the cleanup-er wasn't sure). I have added them to one of the tables. Manish Earth Talk •  Stalk 02:11, 20 November 2011 (UTC)
 * Thanks for the example edit; that was helpful. I'll put the yes/no/partials in.  Please keep an eye on my updates and let me know if there's anything else I need to change about the way I'm doing them. Mike Christie (talk - contribs -  library) 02:40, 20 November 2011 (UTC)

Watchlist dumps updated.
Since the reformat is over, I was able to update our watchlist dumps. The "Watchlist dump" pages contain text to copy-paste into your raw watchlist. The "Watchlist" pages are watchlist-like pages for the list of articles/etc in question (If you don't want to bloat your watchlist).  Here they are:  Enjoy! Manish Earth Talk • Stalk 14:07, 18 November 2011 (UTC)
 * Articles: Watchlist dump; Watchlist
 * Sandboxes: Watchlist dump; Watchlist
 * Students: Watchlist dump; Watchlist; Combined contribs (you have to scroll down)
 * Great, thanks! --Matthiaspaul (talk) 14:43, 18 November 2011 (UTC)

If anyone wants it, here's the script I use for generating the lists. It needs to be executed on WP:IEPS, using a Javascript Console (Chrome has one, I think IE and Greasemonkey have them, too). Execute the chunks separated by // separately, in order (The first chunk prepares the ground, the second chunk displays the usernames, third articles, fourth sandboxes). The last three chunks will temporarily hang your browser (or part of it.. on Chrome all Wikipedia pages are readable, but you can't type anything on them). The code is a bit badly written (I wrote it in a rush), but it works:

$.fn.removeCol = function(col){
    // Make sure col has value
    if(!col){ col = 1; }
    // Remove the col-th cell (td or th) from every row
    $('tr td:nth-child('+col+'), tr th:nth-child('+col+')', this).remove();
    return this;
};

// Extract the page title from a /wiki/ or index.php?title= link
function getLink(str){
    var ret;
    if(str.indexOf("index.php")!=-1){
        ret=str.split("title=")[1].split("&")[0];
    }else{
        ret=str.split("/wiki/")[1];
    }
    return unescape(ret).replace(/_/ig," ").split("#")[0];
}

function compareLinks(l1,l2){
    var s1=getLink(l1.href);
    var s2=getLink(l2.href);
    if(s1>s2){ return 1; }else if(s1<s2){ return -1; }
    return 0;
}

// Snapshot the live collection: moving a table out of the document drops it from the collection
var tables=Array.prototype.slice.call(document.getElementsByClassName("IEPtable"));
var doc=document.createElement("body");
for(var i=0;i<tables.length;i++){
    doc.appendChild(tables[i]);
}
document.body=doc;
for(i=5;i<13;i++){
    $(".IEPtable").removeCol(5);
}
$(".IEPtable").removeCol(1);
var users=[], userC=0, articles=[], articleC=0, sandboxes=[], sandboxC=0;
var a=document.getElementsByTagName("a");
for(i=0;i<a.length;i++){
    var h=getLink(a[i].href);
    if(h.indexOf("User:")!=-1){
        if(h.indexOf("andbox")!=-1||h.indexOf("/")!=-1){
            sandboxes[sandboxC]=a[i];
            sandboxC++;
        }else{
            users[userC]=a[i];
            userC++;
        }
    }
    if(h.indexOf("User:")==-1&&h.indexOf("User talk:")==-1&&h.indexOf("Special:Contributions")==-1&&a[i].innerHTML.indexOf("YOUR ARTICLE")==-1){
        articles[articleC]=a[i];
        articleC++;
    }
}

//

document.body.innerHTML="" users=users.sort(compareLinks) for(j=0;j"; }

//

document.body.innerHTML="" articles=articles.sort(compareLinks) for(j=0;j" }

//

document.body.innerHTML="" sandboxes=sandboxes.sort(compareLinks) for(j=0;j" }

Manish Earth Talk • Stalk 15:47, 18 November 2011 (UTC)
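For anyone who would rather not run the script in a browser console, the title-extraction and sorting steps can be sketched as a standalone snippet. This is not the script above, only a rough equivalent under assumptions: the function names `titleFromHref` and `watchlistDump` are hypothetical, `https://en.wikipedia.org` is assumed as the base for relative links, and the output is assumed to be one title per line, as a raw watchlist expects.

```javascript
// Extract a page title from a /wiki/Title or index.php?title=Title link.
// Returns null for links that match neither form.
function titleFromHref(href) {
  const url = new URL(href, "https://en.wikipedia.org"); // base is an assumption
  let raw = null;
  if (url.pathname.endsWith("/index.php")) {
    raw = url.searchParams.get("title"); // already percent-decoded
  } else if (url.pathname.startsWith("/wiki/")) {
    raw = decodeURIComponent(url.pathname.slice("/wiki/".length));
  }
  // Underscores and spaces are equivalent in titles; the #fragment is not part of pathname.
  return raw ? raw.replace(/_/g, " ") : null;
}

// Turn a list of links into sorted, de-duplicated titles, one per line.
function watchlistDump(hrefs) {
  const titles = new Set();
  for (const href of hrefs) {
    const title = titleFromHref(href);
    if (title) titles.add(title);
  }
  return Array.from(titles).sort().join("\n");
}
```

Unlike the console script, this version skips links it cannot interpret instead of filing them under articles, which is a design choice of the sketch rather than a description of how the dumps were actually produced.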


 * Cool! It appears the students have stopped making major edits but I'll wait a week or so before running the CCI program just in case. MER-C 03:37, 19 November 2011 (UTC)
 * Remember that not all student usernames have been added to the IEPS table. We may need to manually backcheck from the articles (if the WMF doesn't supply them; they filled in most, but a few are still left). Manish Earth Talk •  Stalk 10:26, 19 November 2011 (UTC)
 * The CCI program can be run incrementally -- it doesn't require the list to be complete (but we do). The sooner we start systematically cleaning up this mess (after my exams, 3 days now), the better. MER-C 11:57, 19 November 2011 (UTC)
 * Once you/Moonriddengirl/etc start the CCI, could you ensure that you log all cleanups to the master list? I don't want the OAs to have to duplicate the effort. You could either write "Cleaned by CCI" in the "Cleanup comments" using a timestamped bulleted list (see above), or I could add a separate column for the CCI.
 * Note that if a student is marked as 'checked' by an OA, it still will need a CCI, but your job will be easier. The OAs will have done internet copyvio checks and commonsense "looks like it was copied" checks, and blanked any copyvios. We will still need someone to revdel the stuff, and a final confirming check by the CCI (which knows much more about copyright than the OAs do). Manish Earth Talk •  Stalk 07:40, 20 November 2011 (UTC)
 * I think shutting down the OA cleanup on the master list and redirecting the OAs to the CCI would be a better solution. MER-C 09:47, 20 November 2011 (UTC)

The issue with that is that not all OAs have copyright experience (not all are generally wiki-aware, either). This point has been brought up a few times before (on this same page), so it would be better for the OAs to stay out of the CCI. They can do the bulk of the cleanup while the CCI can doublecheck and tie up the loose ends. That way, the CCI isn't overburdened, and the cleanup is still done systematically. Manish Earth Talk • Stalk 12:10, 20 November 2011 (UTC)
 * There's nothing wrong with repurposing a CCI as a general cleanup listing; it is, after all, much more thorough than what the WMF made up on the spot. Instead of removing the diff listing (something I was going to recommend against anyway with a messagebox and instruction modification), we should use multiple comments, with OA comments being explicitly marked as such (example follows):
 * Poverty in India:
 * OA comment: seems OK User:Example OA 09:25, 21 November 2011 (UTC)
 * reverted MER-C 09:25, 21 November 2011 (UTC)
 * I don't want the people at CCI doing any more work or form-filling than they have to, especially if it is due to WMF incompetence. We are stretched far enough as it is. MER-C 09:25, 21 November 2011 (UTC)
 * I think I'm going to stop the cleanup efforts I've been doing; I think the experts who do CCI work are going to do a better job than I've been able to do and I don't want to duplicate effort. There aren't many editors doing OA cleanup here so it might be OK just to shut it down here and move to CCI. Mike Christie (talk - contribs -  library) 15:18, 21 November 2011 (UTC)
 * I agree, though I have nothing against the OAs helping out at the CCI (the community may oppose it, though). Anyways, you seem to be just about the only active cleanup OA. If the CCI people are fine with it, you could help them out in the way MER-C suggested. Manish Earth Talk •  Stalk 12:20, 22 November 2011 (UTC)
 * We're fine with CCI taking over now if that's what you all would like. We certainly feel like it's our responsibility to facilitate the cleanup efforts as much as we can, but our U.S. OAs are in the crunch time for supporting their students right now, so honestly I doubt much more cleanup will happen from their end for another few weeks. We're certainly happy to ask for more help again after the term wraps up in the United States, because we're committed to ensuring cleanup happens, but if CCI is ready to take on the project now, we welcome the help. We'll stop asking our OAs for help at this time to let the CCI process happen, but do let us know if you'd ever like us to resume it. -- LiAnna Davis (WMF) (talk) 00:11, 23 November 2011 (UTC)
 * I have no qualm with OAs continuing cleanup at the CCI as long as the format above is followed and their comments are labelled as OA comments.


 * The best way you can facilitate the cleanup right now is to supply a full list of student usernames. My request for such a list made in the IRC office hour last month was neglected. We cobbled up one ourselves based on the information we have, but I strongly suspect it is incomplete given that it was rather poorly maintained throughout the semester. MER-C 08:01, 23 November 2011 (UTC)

Yep, the list I've generated is based on WP:IEPS, which is still incomplete. On a whim, I just checked out the database that Frank Schulenburg uses to run the student-o-meter on the toolserver (anyone with a ts account can access it). I've kept the student data I gleaned here. Use this diff to see the differences between those pages. Mostly, we have student usernames that the WMF does not, though they do have some that we don't. If you find any (most of the red lines are due to bad diff alignment), try to add it to this table (if you can find where the username fits in the other tables, that's fine, too), and then to this list if you feel like it. I checked the students till E, there are still a lot more to copy. I'm going to do this with a script after a while (I didn't expect there to be so many extras so I started by hand). Remember, we still don't have all usernames, this is just a bunch of 100-odd extras.  Manish Earth Talk •  Stalk 13:43, 23 November 2011 (UTC)
 * I have added more usernames. The list (almost) looks complete besides 10 usernames from the class "Machine Drawings & Computer Graphics". I have mailed the professor for this class asking for usernames. I'll add the remaining 10 usernames as soon as I get the list. Nitika.t (talk) 06:12, 24 November 2011 (UTC)
 * Added more. Only 1 or 2 missing now. Nitika.t (talk) 10:30, 25 November 2011 (UTC)
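The comparison of the two username lists described above does not need a line-based diff, which is where the alignment noise comes from. A rough sketch of a set comparison follows; the function names are hypothetical, and the normalisation assumes the usual MediaWiki behaviour that underscores equal spaces and the first letter of a username is capitalised.

```javascript
// Normalise a username so that "example_user" and "Example user" compare equal.
function normaliseUsername(name) {
  const cleaned = name.replace(/_/g, " ").trim().replace(/\s+/g, " ");
  return cleaned.charAt(0).toUpperCase() + cleaned.slice(1);
}

// Return the usernames present in `theirs` but absent from `ours`,
// normalised, de-duplicated, and sorted.
function missingFrom(ours, theirs) {
  const known = new Set(ours.map(normaliseUsername));
  const missing = new Set();
  for (const name of theirs) {
    const normalised = normaliseUsername(name);
    if (normalised && !known.has(normalised)) missing.add(normalised);
  }
  return Array.from(missing).sort();
}
```

Running `missingFrom(ourList, wmfList)` and then `missingFrom(wmfList, ourList)` would give the extras on each side, independent of the order in which either list is stored.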

Look out for possible copyright violations in this article
This article has been found to be edited by students of the India Education Program project as part of their course-work. Unfortunately, many of the edits in this program so far have been identified as plain copy-jobs from books and online resources and therefore had to be reverted. See the India Education Program talk page for details. In order to maintain the WP standards and policies, let's all have a careful eye on this and other related articles to ensure that no material violating copyrights remains in here. MiszaBot II (talk) 18:06, 9 March 2012 (UTC)  Now, we may need to change the last line to "..policies, please check the article for any copyvios...", and add something about the grammatical errors, but otherwise it looks OK (Though I'm not sure if we need to go through this at all). Thoughts? Also, please let me know if there are any other automated tasks that the cleanup needs (I may need another BRFA for them, though). Manish Earth Talk • Stalk 12:51, 27 November 2011 (UTC)

Learnings, another mailing list discussion and planning for IEP v2.0
From :

Yes you have. You did not post it on this page.

This was also posted on the wikimediaindia-l, starts here, thread name is "IEP Pilot - Preliminary Analysis". These discussions reference an early draft document containing the plan for the second iteration of the IEP. MER-C 05:58, 1 December 2011 (UTC)


 * Rather amusingly someone on the mailing list suggested that a few questions along the lines of "how much effort was spent cleaning up the mess" should be added, only to be told they were being too negative. Hut 8.5 09:36, 1 December 2011 (UTC)
 * Ram Shankar Yadav is a campus ambassador for the program. MER-C 12:26, 1 December 2011 (UTC)
 * So, how much donor money has been spent physically flying consultants to and from SF and Pune throughout this program? Danger High voltage! 20:59, 1 December 2011 (UTC)
 * First or cattle class? (actually  you  don't  wanna know, really) Kudpung กุดผึ้ง (talk) 09:32, 3 December 2011 (UTC)

The discussion continues here, same thread title. The list of people being interviewed is here, see also User talk:Toryread. MER-C 03:17, 3 December 2011 (UTC)

Thanks for posting a link to the interview list, I was just about to put it on my talk page. One modification: I haven't yet posted questions to Wikipedians on the list who prefer email or talk pages as the interview method, so those people will have through the end of this week to provide input on the questions. Toryread (talk) 23:29, 4 December 2011 (UTC)

Here is a list of work I am doing on the Pune Pilot Review:

20 hours in US reviewing talk pages and email list communication regarding IEP

Interviews (in person, unless otherwise noted):
- Barry Newstead, WMF SF
- Frank Schulenburg, WMF SF
- Annie Lin, WMF SF
- LiAnna Davis, WMF SF
- Hisham Mundol, WMF India
- Nitika Tandon, WMF India
- Shiju Alex, WMF India
- Ram Shankar Yadav, CA Pune
- Ishita Ghosh, professor, SSE, Pune
- informal conversation with 2 SSE students and 2 CAs over lunch in Pune
- Rashmi Barua, SSE student
- Devanchi Tripathi, SSE student and CA
- 3 members of Pune WP community (one doesn't want his name here, so I'm not naming any of them), group interview over dinner
- Abhilasha Sharma, SSE student
- Anushikha Benazur, SSE student
- Dr. Jyoti Chandiramani, SSE Director
- 3 more SSE students, informal conversation over lunch
- Debanjan Bandyopadhyay, CA SSE
- Radha Misra, professor, SNDT Women's College, Pune
- Shweta Shinde, student, COEP
- Gautam Akiwate, student, COEP
- informal conversation with 3 CAs and 1 student, COEP
- Dr. Anil V. Sahasrabudhe, Director, COEP
- Dr. Pradeep Waychal, professor, COEP
- Kudpung, WP Admin (skype video)
- Srikanth Lakshmanan, WP editor, OA for IEP (skype audio)
- Hisham Mundol, WMF India (telephone)
- Wasim, CA COEP
- Pratik Lohati, CA COEP
- Arjun M. K., CA COEP
- Vaibhav Chandak, CA COEP
- Prof. Abhijit Sir, professor, COEP
- Bala Jeyaraman (telephone)
- Moonriddengirl (telephone)
- Risker (skype)
- Ruud (email)
- Andy Dingley (email)
- Voceditenore (email)
- Fluffernutter (email)
- Danger (skype)
- MER-C (talk page)
- Matthiaspaul (email)
- Cindamuse (skype)
- Ayush Khanna, Data Analyst, Global Development Program, WMF SF (telephone)

Data gathering closes Monday, 5 December, except that I will take the statistical numbers from WMF whenever I get them, and participants via email/talk page will have through the end of the week to answer because I'm just getting questions to them later today. My contract includes 20 interviews, and I am already doing many more, because I've determined that the job requires it. This list was developed in consultation with WMF staff, India WP community members, and global WP community members. Toryread (talk) 00:20, 5 December 2011 (UTC)


 * Sincere apologies. I missed posting it on the en.wikipedia page. I'll make sure that I post updates on both the pages going forward. Nitika.t (talk) 04:05, 5 December 2011 (UTC)

FYI, I'm going completely dark from now through 12 December, 23:00 UTC. No phone, no internet, no computer. I'm looking forward to checking talk pages and beginning my synthesis when I return. Toryread (talk) 18:47, 5 December 2011 (UTC)

Pilot Analysis Plan
(Cross-posting to several pages) I've just created the page India_Education_Program/Analysis to document our planned analysis of the Pune pilot. We've been collecting ideas in many different places, but we wanted to have one central page where we'll be analyzing the learnings from the Pune pilot over the next few months. We will be using the results of this analysis to plan our next pilot in India, which will be kicking off in mid-2012. We will not be running the India Education Program in the first term of 2012. We are committed to using the next few months to get all the learnings we can out of the analysis, so we can launch a new pilot in six months or so that addresses all of the concerns raised from the Pune pilot.

We do have one major outstanding question in terms of how to analyze the pilot: how do we measure the impact of the pilot on the community? I really encourage anyone who has good ideas about how to do data collection around this to contribute to the discussion on the talk page. -- LiAnna Davis (WMF) (talk) 22:45, 1 December 2011 (UTC)


 * I also tried to leave talk page messages for everyone who had posted more than 3 times on this talk page; my apologies if I missed you! -- LiAnna Davis (WMF) (talk) 23:16, 1 December 2011 (UTC)