Wikipedia:Bots/Requests for approval/EranBot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

EranBot 2
Operator:
 * User: supporting the development

Time filed: 20:28, Thursday December 18, 2014 (UTC)

Bot does not make edits to mainspace

Programming language(s): Python (pywikibot framework)

Source code available: https://github.com/valhallasw/plagiabot

Function overview: EranBot has been checking all articles tagged with either WP:MED or WP:PHARM for nearly six months now. It has picked up more than 200 confirmed copyright issues. More than half of the articles the bot flags turn out to have issues. This has allowed us to pick up classes of students with copyright problems that would otherwise have been missed.

We are considering expanding the scope of this bot to other topic areas and, hopefully, eventually to all of English Wikipedia.

Before we can do this we need:
 * 1) community support / support from the BAG
 * 2) Turnitin to agree to allow us greater use of their API
 * 3) the ability to follow up on the concerns raised (either with staff or volunteers)

Links to relevant discussions (where appropriate):
 * Discussion regarding staff to support the effort

Edit period(s): A few times per day

Estimated number of pages affected: The bot will only edit a handful of pages. No edits will be made to mainspace.

Exclusion compliant (Yes/No): ?

Already has a bot flag: Yes

Function details: The plan is to make the list of diffs sortable by WikiProject and by whether or not a student from the education program was involved. This will allow people to concentrate on the subject area they are interested in. The hope is to format the output more like https://en.wikipedia.org/wiki/Special:NewPagesFeed, with a drop-down box in the "sort by" area.
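
The sorting described in the function details could be sketched roughly as follows. This is only an illustration and is not taken from the plagiabot source: the `FlaggedDiff` record, its field names, and the `sort_reports` and `group_by_wikiproject` helpers are all hypothetical, and the real bot may store and present its reports quite differently.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class FlaggedDiff:
    """One flagged diff. All field names here are assumptions for illustration."""
    page: str           # article title
    diff_id: int        # revision/diff identifier
    wikiproject: str    # WikiProject tag found on the article's talk page
    student_edit: bool  # True if an education-program student made the edit


def sort_reports(reports, by="wikiproject"):
    """Return the reports sorted by the chosen (hypothetical) sort field."""
    if by == "wikiproject":
        key = lambda r: (r.wikiproject, r.page)
    elif by == "student":
        # False sorts before True, so negate to list student edits first.
        key = lambda r: (not r.student_edit, r.page)
    else:
        raise ValueError(f"unknown sort field: {by!r}")
    return sorted(reports, key=key)


def group_by_wikiproject(reports):
    """Group reports into a dict keyed by WikiProject, in sorted order."""
    ordered = sort_reports(reports, by="wikiproject")
    return {
        project: list(items)
        for project, items in groupby(ordered, key=lambda r: r.wikiproject)
    }
```

A reviewer interested only in one subject area could then read a single entry of `group_by_wikiproject(reports)`, while `sort_reports(reports, by="student")` would bring education-program edits to the top of the list.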

Discussion
(procedural)  please endorse this request for a change to your bot's scope. — xaosflux  Talk 23:34, 18 December 2014 (UTC)
 * I support this request.
 * It seems that there are other users who run bots for copyright violation detection ( - EarwigBot, / - CorenSearchBot/MadmanBot), so I would like to have comments/ideas based on previous experience.
 * The bot aims to take a different approach from those bots in two main respects:
 * It scans diffs, rather than newly created articles.
 * It uses commercial software for copyright violation detection (iThenticate; see more in Turnitin) rather than commercial software for search (Yahoo BOSS; Yahoo search).
 * Eran (talk) 08:09, 19 December 2014 (UTC)
 * Thank you, — xaosflux  Talk 13:14, 19 December 2014 (UTC)


 * Support. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 10:01, 20 December 2014 (UTC)

-- Magioladitis (talk) 16:16, 30 December 2014 (UTC)
 * Thanks. Will likely take us a month or so to get the trial running. Doc James  (talk · contribs · email) 01:20, 31 December 2014 (UTC)
 * Please post here when ready to begin, the trial days will start then. — xaosflux  Talk 01:43, 31 December 2014 (UTC)
 * We have collected data for 1.5 hours. We are now working on formatting this data. Hope is to run it further once more development is done. The data is here  Doc James  (talk · contribs · email) 02:46, 26 January 2015 (UTC)
 * Where's this at? Josh Parris 10:17, 4 March 2015 (UTC)
 * Work is slowly ongoing to improve the formatting and follow up mechanisms of the output data. Doc James  (talk · contribs · email) 19:54, 4 March 2015 (UTC)

Bot working well. Ready for full launch IMO. You can see it here  Doc James  (talk · contribs · email) 13:34, 4 April 2015 (UTC)
 * There are still improvements needed but working well enough to be useful. Doc James  (talk · contribs · email) 00:40, 6 April 2015 (UTC)

Magioladitis (talk) 15:43, 24 April 2015 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.