Wikipedia:New pages patrol/Analysis and proposal

New Pages Patrol Analysis and Proposal
Danny Horn, Wikimedia Foundation Community Tech team*
May 31, 2017

Introduction
Thousands of volunteers work every day to make sure that English Wikipedia is a source of reliable, high-quality information. The team of 400+ New Page Reviewers performs an important function — triaging newly-created articles to weed out obviously bad pages, and fixing pages that need to be fixed. An average of 1,180 new mainspace articles are created every day, which is a lot to keep up with. Reviewers use the New Pages Feed to organize their work, and the current feed shows that there are almost 22,000 articles that have not yet been marked as reviewed. This backlog stretches back to late December 2016, which means that there are pages that have been waiting for five months without being marked as reviewed or removed from the queue.

The size of this backlog has raised concerns among reviewers that bad pages could be slipping through the system – attack pages, vandalism, nonsense and self-promotion – that could weaken the reliability and reputation of the encyclopedia. As a step towards solving this problem, some reviewers have suggested preventing non-autoconfirmed contributors from creating articles, to reduce the number of new articles to review and give the reviewers some breathing room.

The Wikimedia Foundation's Community Tech, Editing and Research teams have been analyzing the situation, and this report details our findings and recommendations. For the immediate question, we believe that restricting new contributors wouldn't have the desired effect; pages created by non-autoconfirmed users make up a small percentage of the five-month backlog. More importantly, the current system for reviewing new pages is not sustainable, even with a reduction in the number of new pages or an increase in the number of reviewers.
The majority of the pages in the backlog are not bad articles that can be easily cleaned up or deleted; they're marginal, time-consuming pages that aren't bad enough for speedy-delete, but need more research and improvement to bring up to current New Page Review standards than an individual reviewer wants to give them.

To create a more sustainable system, the process needs to reinstate a 30- or 60-day expiration period, which it had from 2007 to 2012, so that pages that reviewers don't have the time or interest to improve will naturally age out of the backlog. At that point, the page becomes another "imperfect" page of the encyclopedia, which editors as a whole can improve over time. Reviving the expiration period may require making some improvements to the New Page Review process, which the Foundation can help with. This will be an ongoing conversation between the Foundation and New Page Reviewers.

Below, we'll first look at the immediate question about pages created by non-autoconfirmed users. Then we'll take a broader look at how the goals of the New Page Review process have evolved over the years, and dive into the backlog to see what's in there. At the end, we'll offer suggestions on ways that the Foundation can help reviewers manage the workflow.

Non-autoconfirmed contributors
According to our calculations, there's an average of 1,180 new mainspace articles created on English Wikipedia every day, with an average of 78 created by non-autoconfirmed users. Restricting page creation by non-autoconfirmed users would reduce the total number of new pages by about 7%. That total average includes pages that are created by autopatrolled users, which don't end up in the New Pages Feed, so pages by non-autoconfirmed users make up a higher percentage of pages that initially appear in the feed. However, as the following graph shows, those pages don’t make up a sizeable percentage of the backlog.
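The "about 7%" figure follows directly from the two daily averages quoted above; a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the averages quoted above:
# ~1,180 new mainspace articles per day, ~78 of them created
# by non-autoconfirmed users.
total_per_day = 1180
non_autoconfirmed_per_day = 78

share = non_autoconfirmed_per_day / total_per_day
print(f"{share:.1%}")  # roughly 6.6%, i.e. "about 7%"
```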



The above graph is a snapshot of the New Pages Feed backlog, as of June 7, 2017. The x-axis is a timeline, showing how many pages in the June 7th backlog were created on each day, from December 2016 to early June 2017. The green lines represent the number of articles created by autoconfirmed users that are still in the backlog, and the blue lines represent the articles created by non-autoconfirmed users. Pages created by non-autoconfirmed users make up 15% of the still unreviewed pages in the backlog.

While removing pages created by non-autoconfirmed users would reduce the burden on that first wave of reviewers, it would result in the loss of many potential good articles. It would also send a clear message to new Wikipedia editors that their contributions aren't wanted, potentially stunting the growth of the editing community. Most importantly, it wouldn't actually solve the problem of the growing New Page Review backlog. As we'll see, the Foundation and the New Page Reviewers can take positive steps that will reduce the backlog in a sustainable way. To get there, we need to look at the current situation, and how it came about.

2007-2008: New Page Patrol
In November 2007, when the first proposal to restrict the "mark as patrolled" feature to autoconfirmed editors was introduced into the MediaWiki software, the instructions for how to patrol pages said (emphasis added):

What to mark as patrolled:
 * Any page that is tagged for speedy deletion, so people do not waste time reviewing the same page multiple times.
 * Any page that is appropriate for Wikipedia.
 * Any page that is not speedy deletable but still has issues should be marked as patrolled and then fixed/tagged.

What not to mark as patrolled:
 * Pages you are not sure about and want a second (or third) opinion.
 * Pages you have created, as this would be a conflict of interest.

At that time, the aim was to mark the majority of pages as patrolled. If the page wasn't speedy deletable, then the patroller would tag it, and remove it from the queue. In June 2008, the goals were amended to add another item, instructing patrollers to keep "any page that is not speedy deletable but still has serious issues and is not (yet) tagged appropriately." The question of what constitutes a "serious issue" has been debated, on and off, for about a decade.

2012: NewPagesFeed
In 2012, the Foundation developed the New Pages Feed and Page Curation toolbar to help patrollers (now known as "reviewers") evaluate the new article stream. The New Pages Feed is a special page that lists each unreviewed article, with metadata like the page creator's edit count and tenure. The Page Curation toolbar is a tool that makes it easier for reviewers to evaluate and tag pages. An important change was made when the toolbar was released: the end of the 30-day expiration period. Until 2012, articles that lasted more than 30 days in the queue were automatically marked as reviewed. With the new tools making it easier to review pages, the expiry was removed. But in the documentation, there was a caveat:
 * A driving factor for the continued reviewing of new pages is the existence of the 30 day expiry. This expiration date serves as a motivator: reviewers are encouraged to clear the queue. It is unknown what the removal of this limit will do with regards to reviewer motivation.

The page goes on to say that the 30 day expiration should be re-enabled if the change "radically impacts the performance of the triaging system", but there's no indication of how they were planning to judge that impact. Currently, there is no expiration; a page stays in the queue until it's marked by a reviewer.

2015: The scope expands
In September 2015, the purpose of New Page Reviewing evolved from a filtering role to a gatekeeper role. Up to that point, the definition of patrolling was (emphasis added):
 * "The primary tasks of new page patrollers are to identify new pages that meet the criteria for deletion, and to detect copyright violations."

In September 2015, that was changed to:
 * "The primary purposes of new page patrolling are to identify articles which do not meet the criteria for inclusion and/or to tag them for any glaring issues that need attention."

In other words, the guiding principle of the New Page Review process used to be that new articles are mostly good, and the job of reviewers is to weed out the bad ones. After the September 2015 edit, the guiding principle is that new articles are mostly bad, and the job of reviewers is to hand-pick the good ones. At the same time, the page which lists the "criteria for inclusion" changed tone as well. There's a checklist with a broad series of questions that should be considered when reviewing an article, including:
 * Is the article referenced?
 * Is the article categorized?
 * Do other pages link to this article?
 * Are there versions of this article in other languages?
 * Is the article properly formatted?
 * Does the article have grammar or spelling mistakes?
 * Does the article have any other glaring issues?

The summary at the top of the checklist page used to say:
 * "New page patrollers are not required to perform these checks, but they are highly encouraged."

With this edit in September 2015, the summary was changed to:
 * "New page reviewers are not required to perform these checks, but if they can't, it's best to leave the reviewing of a particular page to a reviewer who can."

This expanded scope, along with the encouragement to leave difficult pages for another reviewer, has contributed a great deal to the growth of the backlog.

2016: Making "patroller" a user right
In late 2016, after concerns that some active reviewers were reviewing pages too quickly, there were two RfCs that established "patroller" as a user right that contributors must apply for, and which others can remove if they’re not reviewing correctly. The first RfC, which concluded in early October 2016, established consensus for creating the user right. The second RfC, which concluded in late October, established the qualifications for granting the patroller right:
 * "Made at least 500 undeleted edits to mainspace, and have a registered account for at least 90 days, plus appropriate experience of a kind that clearly demonstrates knowledge of article quality control."

The user right was created in November, and contributors who want to join the New Page Reviewers apply for the right at a Request for Permission page.

Based on the above changes, the goal of New Page Review has gradually evolved from a "page triage" model, where articles that don't meet a minimum standard are removed, to a model where "reviewed" constitutes an affirmative seal of quality, which can only be granted by a selected group of editors. If this is the case, it would represent a major shift in the way that Wikipedia works. The encyclopedia's success is based on the efforts of a very large and distributed group of editors, who improve articles over time. For New Page Review, the combination of restrictions on who can review pages, along with the expanding scope of the mission, has shifted the responsibility of curating non-damaging new articles from the whole population of Wikipedians to a group of volunteers with a specific user right.

Reviewer participation
There have been significant changes in the number of active reviewers and reviews, with noticeable changes in June 2016 and November 2016. The following chart shows the number of review actions taken each month, from January 2015 to May 2017.



The following table shows the number of unique active reviewers per month.

One factor contributing to the changes in 2016 was a particularly active reviewer, SwisterTwister, being told not to review pages so quickly. That change can be seen in the following charts, which show a big dropoff from May to June 2016:



In November 2016, the switch to a patroller user right brought the number of active reviewers per month from the high 900s to the mid 300s. While the remaining reviewers are still very active, the 66% decrease in active reviewers has added more work for them.

Anatomy of a Backlog
To understand what's in the backlog, it's important to remember that New Page Reviewers are generalists. They're experienced contributors, but they're not expected to be experts in every field. When they evaluate new articles, they don't necessarily have perfect command of the notability requirements for every long-tail topic. This turns out to be more important than you'd think. Here are two examples that illustrate the kinds of articles that end up in the backlog:

Earl Edgar — created March 12th by Yugbodh, and improved by six other editors between March 12th and April 1st, including two administrators and one new page reviewer. Still in the backlog as of May 24th.

Earl Edgar is a singer from Bahrain who writes music for Bollywood movie soundtracks. IMDb says he's credited on 5 movies; Google's Knowledge Graph lists 25 movies. Edgar is probably notable, but the article is badly written. It has no references or internal links, and the language is clearly promotional. From the lede:
 * Composing comes naturally to Earl, he creates a different yet commercial sound production beside his simple "hindesh" lines and 'verses' to entice the listener and even gets them to sing along. He is known for phrases like "Magical India 'outaf a hat, Desi style yaara, is where we're at", "Feel the dholbajao baby" for High School Musical 2, and of course the international "Your my love, my love, Janeja you're my love" from the movie Partner... [etc.]

Following the Article namespace checklist – the minimum effort that a reviewer is supposed to do – this article would probably take days to fix. You'd have to track down references, most of them not in English, and completely rewrite the page from scratch. If the reviewer doesn't have the time or patience to do that work, then they shouldn't mark the page as reviewed. But it would be a shame to delete the page, because somebody could probably make a good page out of it.

Luxembourgian National Division (women's handball) — created May 12th by Sweech38, who is the only editor. Still in the backlog as of May 24th.

This page looks like a Wikipedia article. It's got six sections, with an infobox, tables and rankings, and one external reference. It's categorized, it's properly formatted, and there are no glaring mistakes. The only reason a reviewer wouldn't mark this as reviewed is if they're not sure whether this is notable. Do we have pages for European women's handball divisions? If we do, is this what they're supposed to look like? Again, the reviewer looks at the page, and makes a mental calculation about how long it would take to properly review the article according to the checklist. Then they think about how much they care about European women's handball teams, and they leave it for somebody else.

So there are three reasons why articles stay in the backlog, read but not marked as reviewed:
 * They're probably notable but badly written (like Earl Edgar)
 * They're well-written but have questionable notability (like the Luxembourgian women's handball division)
 * They're a weird mix of both (because life is complicated)

To take an unscientific but interesting sample, here are the first 25 pages still in the backlog that were created on April 24th, a month before this section was written.
 * Probably notable but badly written: Daniel Caesar, Culinary Genius, Le Pont D'or Bridge, Shan (Tamil actor), Ekaterina Elizarova, Titilola Obilade, Chantry Johnson, Federal Council for the Advancement of Aborigines and Torres Strait Islanders
 * Well-written but questionable notability: Women of the Young Lords, Switch (Iggy Azalea song), Phyllis Bomberry, Monarchism in Brazil, Aichatou Ousmane Issaka, Way Too Long, 2016–17 Nemzeti Bajnokság I/B (women's handball)
 * Weird mix of both: Ben Ricciardi, List of Playboy Cybergirls of the Month, Anwar Jibawi, Information Processing in Medical Imaging, Yūsuke Takita, Toddie Lee Wynne, Ada sahillerinde bekliyorum, The Great Cat and Dog Massacre, Glenn Seven Allen
 * Just a bad, unfinished page: List of Veep characters

That categorization is actually fairly arbitrary; with the exception of the unfinished and unnecessary List of Veep characters, pretty much all of them are a weird mix of both. They're about a Nigerian deputy director of social work, a Russian interior designer, an Indian actor, a bridge in Louisiana, and yes, another European women's handball team. Two of them are about American businessmen of dubious notability, four of them are about musicians, three of them are about songs. None of them are attack pages or vandalism. The one thing that they all have in common is that bringing them up to the current standards of New Page Review would require more work than any reviewer who's looked at the page wants to invest. These pages are time-consuming judgment calls. That's what being in the backlog means.

Time-Consuming Judgment Calls
Admittedly, those are examples, rather than hard data. It's difficult to put exact numbers on this, because the most important event is invisible: New Page Reviewers looking at an unreviewed page, deciding that it's a time-consuming judgment call that they don't have time to deal with, closing that tab, and moving on to another page. And that's what the current system is encouraging people to do: give up, and move on. As of September 2015, the instructions say: "New page reviewers are not required to perform these checks, but if they can't, it's best to leave the reviewing of a particular page to a reviewer who can." To continue with this sample of 25, here are the pageview counts for each page from April 24th to May 7th, the first two weeks after creation:
 * Culinary Genius (TV series): 2,021 pageviews (144/day)
 * Anwar Jibawi: 1,998 pageviews (143/day)
 * List of Playboy Cybergirls of the Month: 569 pageviews (41/day)
 * Daniel Caesar: 422 pageviews (30/day)
 * Glenn Seven Allen: 338 pageviews (24/day)
 * Chantry Johnson: 191 pageviews (14/day)
 * Way Too Long: 136 pageviews (10/day)
 * Titilola Obilade: 130 pageviews (9/day)
 * Women of the Young Lords: 126 pageviews (9/day)
 * List of Veep characters: 87 pageviews (6/day)
 * Shan (Tamil actor): 79 pageviews (6/day)
 * Monarchism in Brazil: 78 pageviews (6/day)
 * The Great Cat and Dog Massacre: 77 pageviews (6/day)
 * Information Processing in Medical Imaging: 66 pageviews (5/day)
 * Ekaterina Elizarova: 58 pageviews (4/day)
 * Ada sahillerinde bekliyorum: 55 pageviews (4/day)
 * Aichatou Ousmane Issaka: 51 pageviews (4/day)
 * Toddie Lee Wynne: 46 pageviews (3/day)
 * 2016–17 Nemzeti Bajnokság I/B (women's handball): 42 pageviews (3/day)
 * Phyllis Bomberry: 38 pageviews (3/day)
 * Yūsuke Takita: 32 pageviews (2/day)
 * Ben Ricciardi: 27 pageviews (2/day)
 * Switch (Iggy Azalea song): 26 pageviews (2/day)
 * Federal Council for the Advancement of Aborigines and Torres Strait Islanders: 19 pageviews (1/day)
 * Le Pont d’Or Bridge: 16 pageviews (1/day)
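The per-day figures in the list above are simply the two-week totals divided by the 14 days in the April 24th – May 7th window, rounded to the nearest whole number:

```python
# The sampling window above (April 24th through May 7th, inclusive)
# is 14 days; each "(N/day)" figure is the total divided by 14.
WINDOW_DAYS = 14

def per_day(total_views: int) -> int:
    """Average daily pageviews over the two-week window, rounded."""
    return round(total_views / WINDOW_DAYS)

# Spot-check against the top of the list:
print(per_day(2021))  # Culinary Genius (TV series) -> 144
print(per_day(1998))  # Anwar Jibawi -> 143
print(per_day(569))   # List of Playboy Cybergirls of the Month -> 41
```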

Even looking at the least-viewed pages on this list — Le Pont D'or Bridge, Federal Council for the Advancement of Aborigines and Torres Strait Islanders, Switch (Iggy Azalea song) — they seem like perfectly ordinary Wikipedia pages. They are not toxic needles in the haystack that are bringing the encyclopedia into disrepute.

In fact, there are a lot of articles in the backlog that new page reviewers have edited, and still not marked as reviewed. Of the articles that were created in April, 28.3% were (a) edited by at least one reviewer who then did not mark it as reviewed, or (b) edited by multiple reviewers before it was marked as reviewed.

Why are page reviewers looking at a Time-Consuming Judgment Call page, and not marking it as reviewed? Well, partly because the instructions specifically say that's what they should do. The top of the New Pages Feed says, in bold letters, "Rather than speed, quality and depth of patrolling and the use of correct CSD criteria are essential to good reviewing." A reviewer who doesn't spend enough quality time on a given review risks being blocked from reviewing, but there's no downside to closing the tab and moving on – nobody will even know that you looked at the page. The only thing that happens is that the backlog gets a little bigger, which is everybody's problem, and not yours.

This system is not sustainable
In the present system, with its current mix of carrots and sticks, the only measure that really matters is the time and patience of each individual reviewer, which cannot be meaningfully expanded. Each page stuck at the end of the backlog has run through a gauntlet of reviewers, who have each individually decided that that page is a Time-Consuming Judgment Call. If they have the time and the interest to completely rewrite the Earl Edgar page, then they'll do it. If they don't, it gets passed to the next person in the queue. The Earl Edgar page got 135 pageviews in its first month on Wikipedia, and it's still not marked as reviewed. If that page is too much trouble for reviewer #1 to handle, then the chances are good that it will also be too much trouble for reviewer #2, and reviewer #3, and so on. It just goes into the backlog, and stays there. At a certain point, it's just another imperfect page of the encyclopedia, which will be improved by people who care about that subject.

The important thing to note is that adding more reviewers to this system will not make it work better. We could triple the number of people looking at the Earl Edgar page, and each individual would still make the same calculation, and probably come to the same decision: this is a Time-Consuming Judgment Call, and it's not worth my time. I should close this page, and move on to the next page. Each time a reviewer looks at a page and leaves it in the queue, it creates duplicate work for the next reviewer who comes along, which makes the system less efficient. Adding more people to review the same page is a waste of time that could be spent improving the encyclopedia in other ways.

Obviously, tripling the number of reviewers would have some effect, but with diminishing returns. This is a natural selection process, with the backlog made up of the most Time-Consuming of the Judgment Calls. With extra reviewers, you could strip off the next layer of pages — let's say, 10% of the current backlog. But the 90% that's left would be even more impenetrable; you'd have to keep throwing bodies at the backlog, and you'd still end up with a couple of Russian interior designers and a smattering of European women's handball teams. There's a limit to a generalist's patience. To properly triage the Time-Consuming Judgment Calls, you need a subject matter expert, and the long tail of subjects is very, very long.

On the other side of the divide, reducing the number of pages entering this system will not make it work better either. You can use any means to stem the tide — restrict non-autoconfirmed people from creating articles, ban anybody who writes three bad articles in a row, alternate between articles that start with A-M on one week and N-Z on the next. But the Time-Consuming Judgment Calls will still be just as time-consuming, and they will become part of the backlog. That backlog will grow more slowly, but the backlog as currently understood is not a number that will ever meaningfully decrease. Even if pages by non-autoconfirmed users are twice as likely to be bad as pages created by autoconfirmed users, that wouldn't impact the backlog. Bad pages are easy to review, and satisfying to delete. The backlog is not made up of bad pages; it's made up of marginal pages that are time-consuming to review.

Proposals and Recommendations
Before we move to the recommendations, we should note that the New Page Reviewers are doing very important work for the quality and reliability of the encyclopedia, and that they have been doing that work successfully for ten years. This process, and the people who organize it and contribute to it, are keeping countless numbers of bad pages from polluting Wikipedia.

However, the page review process has gradually changed over time into a system that is unsustainable. The problem is not about an increase in pages or a decline in volunteers; it's about scope creep. The expectations for what constitutes a successful review are much higher now, and reviewers are spending time improving marginal pages that they could be using to track down and delete bad pages. Essentially, the current backlog exists because of the following changes:
 * Removing the 30-day expiration
 * Expanding scope from weeding out damaging articles to bringing every article up to a standard
 * Restricting page review to a user right, given to a small set of users
 * Actively instructing reviewers to be slower and more deliberative

Of those changes, the one crucial step was removing the 30-day expiration in 2012. Keeping every article in the queue until it's reviewed creates a backlog; the other changes merely accelerated a process that would have reached critical capacity sooner or later. The only sustainable way to manage the backlog is to reinstitute the expiration date, which the system had from 2007 to 2012. An article that survives the gauntlet of reviewers for a reasonable amount of time – say, 30 days or 60 days – is unlikely to be picked up and fixed by a generalist new page reviewer. Pages that survive past that deadline should be improved by subject matter experts, which is the way that Wikipedia works. With a 30-day expiration, the backlog on May 30th, 2017 would have 5,650 pages instead of 21,800. With a 60-day expiration, it would have 10,200 pages.

The Community Tech team has made some fixes to the system recently:
 * New Database report on Editors eligible for Autopatrol privilege
 * You can now get a list of the top reviewers for the past day, week, or month from the API (so you don't have to scour the logs for this).
 * Fixed T165891 - Special:NewPagesFeed shows users as blocked that aren't currently blocked.
 * Fixed T165738 - Number of pages in filtered list is not updated.
 * Fixed T44254 - List filters keep getting reset.

Our team is able to make other small fixes in the future, but the Foundation shouldn't invest a lot of resources in optimizing a system that is fundamentally unsustainable.
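The report's expiration estimates (5,650 pages under a 30-day expiry, 10,200 under 60 days, versus 21,800 with no expiry) come from filtering the backlog by page age. A minimal sketch of that rule, with invented page dates for illustration:

```python
from datetime import date, timedelta

# Sketch of how an expiration rule thins the review queue: a page
# "ages out" (is automatically treated as reviewed) once it has waited
# longer than the expiry period. The queue data here is invented toy
# data, not the actual backlog.
def backlog_with_expiry(creation_dates, today, expiry_days):
    """Pages still awaiting review that have not yet aged out."""
    cutoff = today - timedelta(days=expiry_days)
    return [d for d in creation_dates if d >= cutoff]

today = date(2017, 5, 30)
# Toy queue: one unreviewed page every 5 days, going back ~5 months.
queue = [today - timedelta(days=n) for n in range(0, 150, 5)]

print(len(queue))                                 # no expiry
print(len(backlog_with_expiry(queue, today, 30)))  # 30-day expiry
print(len(backlog_with_expiry(queue, today, 60)))  # 60-day expiry
```

With any expiry in place, the oldest pages leave the queue on their own, so the backlog's size is bounded by the expiry window rather than growing without limit.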

There are some larger investments that a product team could take on:
 * 1) Automatically flag serious issues, like promotional language: CopyPatrol is an existing community/Wikimedia Foundation collaboration that examines every edit for copyright problems. We could work on similar tools to help with specific problems that are possible to detect and automatically flag for review.
 * 2) Integrate ORES scores: Surface ORES' article quality and/or draft quality scores in the New Pages Feed to help identify pages that are likely to be good (for speedy approval) and likely to be bad (for speedy delete). The scores would be surfaced for human review, not used to automate the decisions.
 * 3) Research and build an improved system for connecting articles in need with subject experts: On the theory that subject expertise is often the crucial and missing element required for judging and improving a wide range of articles, create an expanded and improved system for flagging articles as needing subject review, and an expanded system for connecting those articles with knowledgeable reviewers. Such a system could tie in to WikiProjects whenever possible, but could not be limited to these, since they don't cover all topics and many are inactive.
 * 4) Research, design and build a system for guiding new page creation: This would include community input and significant user testing as part of the design process, to ensure that editors are helped by the system, not just discouraged and chased away.
 * 5) Mobile app: Test out a mobile app that allows reviewers to read and tag articles from their phones. Long-form editing is tricky on phones, but choosing from a menu of tags could work very well.
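For proposal #2 above, one way the feed could consume ORES scores is sketched below. The response shape follows ORES's public v3 scoring API (e.g. `https://ores.wikimedia.org/v3/scores/enwiki/?models=draftquality&revids=...`); the revision IDs and predictions in the sample are invented for illustration, and the helper function is hypothetical, not an existing tool.

```python
import json

# Hypothetical sketch: surface ORES draft-quality predictions next to
# entries in the New Pages Feed. The JSON below mimics the shape of an
# ORES v3 scoring response; the revision IDs and predictions are
# invented sample data.
sample_response = json.loads("""
{
  "enwiki": {
    "scores": {
      "12345": {"draftquality": {"score": {"prediction": "OK"}}},
      "67890": {"draftquality": {"score": {"prediction": "spam"}}}
    }
  }
}
""")

def draft_predictions(response, wiki="enwiki"):
    """Map revision id -> draftquality prediction, for display in the feed."""
    scores = response[wiki]["scores"]
    return {rev: s["draftquality"]["score"]["prediction"]
            for rev, s in scores.items()}

print(draft_predictions(sample_response))
```

As the proposal notes, these predictions would only be surfaced to help reviewers prioritize; the review decision itself stays with a human.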

Conclusion
New Page Review is a vital workflow that keeps Wikipedia alive and healthy, and the people who organize and participate in it are doing important work, very successfully. The Foundation wants to support that work, in a positive and sustainable way.

Reinstituting a 30- or 60-day expiration date is not an easy choice for the organizers and reviewers. It will probably mean that the reviewer team will want to make some adjustments in other parts of the system, so they feel comfortable with letting old pages leave the system. The Wikimedia Foundation product teams are interested in talking about supporting those adjustments in the workflow, to help bring the system back to a sustainable place.

We know that this report is going to spark a lot of conversation, both agreement and disagreement – see the talk page for some of both. We're looking forward to participating in those conversations, and talking with people about the future of New Page Review.

(*) This report was researched and written with the assistance of people from many Wikimedia Foundation teams:
 * Community Tech: Danny Horn, Ryan Kaldari, MusikAnimal, Sam Wilson
 * Editing: Joe Matazzoni
 * Reading: Tilman Bayer
 * Research: Jonathan Morgan, Aaron Halfaker
 * Analytics: Dan Andreescu
 * Community Engagement: Sherry Snyder, Nick Wilson, Benoît Evellin, Erica Litrenta
 * Communications: Heather Walls, Melody Kramer
 * Product: Toby Negrin