Wikipedia:WikiProject Council/Proposals/Sweep


 * The following discussion is an archived proposal of the WikiProject below. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the project's talk page (if created) or the WikiProject Council).  No further edits should be made to this page.

The proposed WikiProject was created at WP:WikiProject Sweep.  PJvanMill ) talk ( 14:21, 26 March 2021 (UTC)

Description
This project will facilitate a comprehensive sweep of all of Wikipedia's early articles to ensure that they meet present-day standards.

Impetus: This has been discussed informally on occasion, but I think it's time we start considering it more seriously. The basic problem is that, in Wikipedia's early days, there were few and ill-defined notability and sourcing standards, leading to the creation of many pages that, by today's standards, should not exist. However, because many of them are on very obscure subjects with few pageviews, they have been able to survive and avoid deletion. Our current ad hoc approach to discovering these pages will never get us to the point of being able to say with confidence that nearly every page on Wikipedia is for a notable subject, since we won't know how many undiscovered non-notable pages remain.

Details: This WikiProject will address that issue by creating and facilitating a patrol process (perhaps similar to those used by AfC or NPP) that will comprehensively review every article created before a certain date that meets certain criteria. Articles that are "swept" will receive a talk page banner, and those needing tags/PRODs/AfDs will be dealt with as needed. Initial tasks for the project will include building the set of articles to be checked (e.g. determining an end date, and perhaps exempting pages rated highly or with many references), constructing the tools/templates necessary, affirming community consensus for the sweep, and establishing criteria for who is allowed to check off pages. Attention will then shift to recruiting and motivating sweepers until all pages have been reviewed. Given that around 3 million pages were created before 2010, this is obviously a huge task, but there's ultimately no way around it, and once completed, it will mark a major milestone in the maturation of the project. {{u|Sdkb}} talk 18:37, 26 October 2020 (UTC)

List of important pages and categories for this proposed group
 * articles created before 2010 (number of pages in the hypothetical category: ~3 million)
 * all articles lacking sources (number of pages in the category: 174,000)

Support
Also, specify whether or not you would join the project.
 * 1) {{u|Sdkb}} talk 18:28, 26 October 2020 (UTC)
 * 2) Sure, I would join. -- Danre98 ( talk ^ contribs ) 21:38, 27 October 2020 (UTC)
 * 3) I am willing to join this WikiProject. Techie3 (talk) 03:49, 28 October 2020 (UTC)
 * 4) Yep. GenQuest  "Talk to Me" 00:07, 29 October 2020 (UTC)
 * 5) I believe this is a good idea and would be willing to join, focusing on quality of reviewing rather than quantity. — Bilorv ( talk ) 01:58, 31 October 2020 (UTC)
 * 6) I think this is needed, and I'd help as able, although I don't know how much attention I'll be able to give to this. Hog Farm Bacon 02:32, 1 November 2020 (UTC)
 * 7) I agree with this idea and am willing to help out, although I might not be super active. Remagoxer (talk) 18:47, 1 November 2020 (UTC)
 * 8) I agree with this proposal and am willing to join. Jh15s (talk) 23:04, 4 November 2020 (UTC)
 * 9) Great idea, would be happy to help — Yours,  Berrely  • Talk∕Contribs 09:58, 25 January 2021 (UTC)
 * 10) I'd like to help out with this. JesterSocks (talk) 10:18, 13 March 2021 (UTC)

Discussion

 * Is there an easy way to pull up a sample of articles with early creation dates? In my (non-exhaustive, not-quantified-in-any-way) experience, many of the articles with pre-2010 creation dates are just important topics that obviously merited an encyclopedia article. If I understand your goal correctly, perhaps you'll find more articles in need of attention by patrolling Database reports/Forgotten articles (articles with the longest time since last edit)? Presumably articles on non-notable topics are more likely to sit unedited after creation? Just a thought. Also, if you go with your original plan, I just stumbled upon Wikipedia's oldest articles which I'd never seen before, and which may have some useful info for your efforts. Ajpolino (talk) 04:46, 30 October 2020 (UTC)
 * I'd hope it'd be possible to filter out the obviously important pages by using exclusion criteria such as monthly pageviews or total number of edits. Regarding "forgotten articles", that very well might turn up some non-notable pages, but it might miss others where e.g. someone from the typo team swung by last year without even noticing the page's topic. More generally, there are certainly others trying to find non-notable pages, but what I hope this project will be able to achieve is doing so with a degree of comprehensiveness, such that once we're done, we'll be able to say that every page here has been judged notable by modern standards. Cheers, {{u|Sdkb}} talk 05:33, 30 October 2020 (UTC)
 * It's troubling that the proposal puts so much more emphasis on deletion, than on improving articles. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 13:06, 30 October 2020 (UTC)
 * I did mention cleanup tags above, and editors who want to go beyond that and address issues directly will certainly be welcome to. However, if we're going to get through patrolling some significant fraction of three million pages, it'd be unrealistic to ask for more than just tagging and notability assessment. {{u|Sdkb}} talk 17:19, 30 October 2020 (UTC)
 * This has excessive emphasis on deletion. This would result in PRODing many articles that no one would be watching, or flooding AFD with far more articles than it could cope with. I expect there would be enough PROD taggers and fast deleters around to ensure that many PRODed pages would not get any real review. Since there is no rush, there should instead be a focus on removing puffery, adding sources and proving notability. However, I do think it is a good idea to review the old neglected stuff. Graeme Bartlett (talk) 23:20, 30 October 2020 (UTC)
 * I've come across enough bad PROD deletions to be sympathetic to the concern. But I'm not sure what sort of restriction we would want to impose that wouldn't also be appropriate for PRODs more generally. Most unchallenged PRODs are already for pages that aren't monitored, so I don't see how this would be that different, except that we'd be finding more pages. If we were to disallow PRODs here, the same argument would probably carry to disallowing them everywhere else. That might be a good idea (the system clearly has problems, with nominators ignoring WP:BEFORE and deleters not adequately checking), but that goes beyond the scope here. {{u|Sdkb}} talk 00:31, 31 October 2020 (UTC)
 * I think this is long overdue. For this to successfully work, I think we need to be patient and make sure we have full support of the community before action is begun, rather than once normal processes begin getting flooded. I think we also need a very clear and specific brief to be given to participants: what is the goal here? Is it (a) to delete (CSD/PROD) pages clearly contrary to WP:NOT (e.g. BLP which is an advert of someone obviously non-notable); (b) to re-review all articles for notability/suitability according to current NPP standards; (c) to re-review all articles according to the stricter (in practice) AFC standards; (d) some measure in between? What do we do with "permastubs" or pages created on seemingly notable topics (e.g. towns, species) but with no references? I'm thinking particularly of some of our longest-term editors who have hundreds of page creations with two-sentence content referenced to either zero or one secondary sources and maybe a primary source. We need ironclad consensus and clear protocol before flooding someone's talk page with 348 deletion notices. If we could establish a scope and community consensus then perhaps we could also have a dedicated process for deletion e.g. "Request for deletion" where established editors (maybe those with NPP rights or higher) vote (not !vote) in favour of delete/merge/redirect/keep and after an indefinite amount of time, when there's a vote tally of +3 on one decision, that decision is enacted; at any point, someone could put the request on hold if they wish to do deep research and improve the article significantly. — Bilorv ( talk ) 01:58, 31 October 2020 (UTC)
 * As a way of reducing false positives, would it make sense to exclude those articles flagged as containing text taken from the 1911 and Britannicas? IIRC, a large number of early articles were pre-populated using text from them since they were public domain. If it was in the EB, it stands to reason it's notable. Matt Deres (talk) 15:40, 31 October 2020 (UTC)
 * I like this in theory, and my thoughts mostly align with Bilorv's first paragraph. There are quite a few absolutely terrible articles I've found that date from the late 2000s. My guess is we can remove all articles tagged with a VA, with more than X revisions, with more than X pageviews, that came from PD encyclopedias, or rated as a certain class (B? GA? FA?) and have relatively high confidence that they are good. This is assuming the project is just to get rid of the most blatantly non-notable articles; we would likely have no choice but to give the benefit of the doubt to borderline cases to get through 3 million articles? I'm also concerned that this may result in flooding the deletion processes unduly. But again, it's something that does need to be done. Eddie891 Talk Work 16:52, 3 November 2020 (UTC)
 * My honest prediction is that there's going to be a high percentage of album/song-related articles that need to be dealt with, most of which can be redirected without PROD/AFD, for the simple reason that inclusion standards for albums/songs have gotten a lot higher lately. There are going to be some tougher cases, though. Doing CAT:NN cleanup a few months back, I found an article that had been a straight copyvio since it was created in 2008. Generally, we'll only need to sort through stuff that's kinda slipped through the cracks. Anything rated GA/FA is more than likely notable, as is anything tagged as a vital article. Eliminating articles with x number of revisions would be a good idea, as we should probably focus on the ones that slipped through the cracks. This has the potential to have massive scope creep issues unless we specifically narrow what we are looking at. IMO, the best way to do this is to find screens that will focus in on the ones that slipped through the cracks. Purging enwiki of every single non-notable article would be nice, but there's no way that can be accomplished. So let's just focus on a subset of articles, and see which ones are fine (which is surely many), and which ones shouldn't stand alone. I'm predicting a lot of WP:NALBUM and WP:NSONG failures, as well as a good deal of WP:DICDEF issues; those are generally better handled with merging/redirection. Hog Farm Bacon 17:07, 3 November 2020 (UTC)
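
The exclusion criteria floated in the comments above (vital-article tags, quality class, revision counts, pageviews, public-domain encyclopedia origin) could be combined into a single screen. The following is only a minimal sketch under stated assumptions: the `Article` record, its field names, and every threshold value are hypothetical, chosen for illustration rather than taken from any existing tool or agreed consensus, and a real version would need to pull this data from the database or API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    """Hypothetical record; field names are illustrative assumptions."""
    title: str
    created: date
    revisions: int
    monthly_pageviews: int
    quality_class: str = ""            # e.g. "FA", "GA", "B", "Stub"
    is_vital: bool = False             # tagged as a vital article
    from_pd_encyclopedia: bool = False  # seeded from a public-domain encyclopedia


# All values below are placeholders, not agreed thresholds.
CUTOFF = date(2010, 1, 1)
MAX_REVISIONS = 50
MAX_MONTHLY_PAGEVIEWS = 100
EXEMPT_CLASSES = {"FA", "GA"}


def needs_sweep(article: Article) -> bool:
    """Return True if the article passes none of the exclusion screens."""
    if article.created >= CUTOFF:
        return False  # outside the date range under discussion
    if article.is_vital or article.from_pd_encyclopedia:
        return False  # presumed notable
    if article.quality_class in EXEMPT_CLASSES:
        return False  # already reviewed to a high standard
    if article.revisions > MAX_REVISIONS:
        return False  # many editors have already looked at it
    if article.monthly_pageviews > MAX_MONTHLY_PAGEVIEWS:
        return False  # well-trafficked, unlikely to have slipped through
    return True


def select_for_sweep(articles: list[Article]) -> list[str]:
    """Return the titles of articles that still need a manual check."""
    return [a.title for a in articles if needs_sweep(a)]
```

Each screen only removes pages from the queue; nothing here tags or nominates anything, so a page that passes every screen still gets a human review before any tag, PROD, or AfD.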


 * Comment: I found this interesting former task force that perhaps you may find useful in its approach, strategy, methodology, etc.: WikiProject Good articles/Project quality task force/Sweeps. Cordially,  History DMZ  ( talk )+( ping )  15:48, 13 November 2020 (UTC)
 * Thanks, that's useful! {{u|Sdkb}} talk 16:04, 13 November 2020 (UTC)


 * @Sdkb, @GenQuest, @Bilorv, @Hog Farm, @... There are clearly quite a few people who'd like to see this WikiProject exist. I get that there are still some particulars to figure out and possible concerns to address, but consider creating the project now, while there is some momentum left. Otherwise, you run the risk of this proposal going stale and leading to nothing. Kind regards from  PJvanMill ) talk ( 01:47, 24 March 2021 (UTC)
 * Thanks for the follow-up. I'll start building out the project at WP:WikiProject Sweep. I will definitely need some help, though, especially from more technical editors who can do things like help build the categories of articles to review. {{u|Sdkb}} talk 01:59, 24 March 2021 (UTC)