Wikipedia talk:Labels/Edit quality

ORES review tool deployed! Now, let's do a new campaign to make the models more accurate.

Hey folks. We just deployed the mw:ORES review tool on English Wikipedia. Check it out in your Beta features preferences. We've gotten a lot of interest in how to help improve ORES accuracy. The best way to do that is to help us gather more data, so I've generated a new random sample of revisions to review. See "Edit quality (20k 2016 sample)" at WP:Labels. I've already auto-labeled ~13.7k edits by admins, checkusers, bureaucrats and other high-privilege users, so we only have to review the remaining 6.3k edits. Assuming each workset takes 5 minutes (which is how long it took last time), that means we have a total of about 10 hours of work. If we can divide that between 23 of us and recruit ~7 more, we can each spend less than half an hour working. I'm planning to put in an hour or so before the end of the week. --EpochFail (talk • contribs) 21:09, 23 August 2016 (UTC)
 * Thanks to all of you for contributing labels to this campaign! We're currently at 1303 out of 6333 labeled edits (20.6%). That's a lot of work. I'm personally really excited about incorporating these labels into ORES training because it should allow us to increase ORES accuracy substantially and also make sure that ORES keeps up with trends in vandalism. There are 28 of us working on this campaign. If we split the work evenly, we'll each have to label fewer than 200 more revisions (that's 4 worksets). I'll be doing my 200 on my lunch break today. :) --EpochFail (talk • contribs) 16:43, 20 April 2017 (UTC)
 * For convenience, we're being directed through to this page to find the worksets. Now this page also contains a link to "Discussion quality". Is our input requested on that project as well, and are there any guidance notes? Noyster (talk),  18:34, 20 April 2017 (UTC)
 * Good Q Noyster. The "discussion quality" campaign was started by a researcher who does not seem to be active anymore. I'll disable that campaign and ping the researcher to request some documentation. --EpochFail (talk • contribs) 18:45, 20 April 2017 (UTC)
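
The workload estimates in this thread can be checked with a few lines of arithmetic. The sketch below is illustrative only: the `workload` helper and its parameter names are hypothetical, and the workset size of 50 edits is an assumption inferred from the "200 revisions, that's 4 worksets" remark above, not a documented setting.

```python
import math


def workload(unlabeled, labelers, workset_size=50, minutes_per_workset=5):
    """Estimate labeling effort (hypothetical helper, not part of WP:Labels).

    workset_size=50 is an assumption inferred from the discussion;
    minutes_per_workset=5 is the figure quoted in the original post.
    """
    # A partly filled final workset still has to be opened and reviewed.
    worksets = math.ceil(unlabeled / workset_size)
    total_minutes = worksets * minutes_per_workset
    return {
        "worksets": worksets,
        "total_hours": total_minutes / 60,
        "edits_per_labeler": math.ceil(unlabeled / labelers),
        "minutes_per_labeler": total_minutes / labelers,
    }


# August 2016: ~6.3k edits left, 23 labelers plus ~7 recruits.
august_2016 = workload(unlabeled=6300, labelers=23 + 7)

# April 2017: 1303 of 6333 edits labeled, 28 labelers.
april_2017 = workload(unlabeled=6333 - 1303, labelers=28)

print(august_2016)
print(april_2017)
```

Under these assumptions the first estimate comes to 126 worksets, or 10.5 hours in total and 21 minutes per person, and the second to 180 edits per person, which agrees with the "less than half an hour" and "fewer than 200" figures quoted in the thread.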

Still including edits in non-article spaces
Just one set of 50 included edits in Talk, User, User talk, Draft and Wikipedia spaces. Is this intended? Noyster (talk),  10:05, 25 August 2016 (UTC)


 * That's right. This is by design. I was just studying vandalism activity in User space. See T141829. It's important that we train ORES to catch this type of vandalism. --EpochFail (talk • contribs) 17:54, 26 August 2016 (UTC)

Wikipedia style AI evaluation
It's critical that the artificial intelligence (AI) models that power Wikipedia's tools are aligned with the community. I'm working with Tzusheng to build a system for evaluating the quality of the AI models used across Wikimedia projects, such as ORES and Liftwing. The system is specifically designed to support wiki-style discussion processes. I need your feedback! If you are interested in testing the system and sharing your feedback, please see m:Research:Community-centered Evaluation of AI Models on Wikipedia/Study Recruitment.

This project is documented at m:Research:Community-centered Evaluation of AI Models on Wikipedia.

Thanks! --EpochFail (talk • contribs) 20:07, 27 June 2023 (UTC)