Wikipedia talk:Arbitration Committee/Requests for comment/Article creation at scale/Archive 3

Division of labor
@Valereee, I know you are already processing a lot, but I have an idea to share:

One of the complaints about (very) large-scale article creation is that the original editor does not expand the article later. I think it would be a good idea to address this head-on: Should editors create large numbers of articles that they do not plan to expand themselves?

This seems to be okay with everyone:


 * I create 50 related stubs this month.
 * I spend the rest of the year systematically expanding them into nice little articles. (Contributions from others are very welcome, but probably unnecessary.)

This seems to not be okay with some editors:


 * I create 50 related stubs this month.
 * I never edit them again. (Contributions from others are not only very welcome, but also necessary if those articles will ever be expanded.)

I think that it would be good for editors to address this as a community value. Back in the olden days, we had more of a division of labor notion. Different folks did different things, and not just gnoming or AWB runs. Now it seems like there is more of an unwritten expectation that if you start an article, unless it's on a hot topic, you should expand it yourself. Some editors seem to feel that you are being neglectful and burdening the review processes if the end result is "only" a sourced stub.

The practical result for this subject is: If editors are expected to expand the articles they start past the stub stage, we should write down that rule somewhere. And if they aren't, then people should stop complaining that editors who create articles that are "only" stubs are doing something wrong. WhatamIdoing (talk) 16:46, 13 September 2022 (UTC)


 * Hey, @WhatamIdoing! I think one of the problems (and I'm surprised it didn't come up) is that we reward people for creating articles and only grudgingly for expanding them. We're inadvertently encouraging people to stop work once they've got the minimum and move on to the next creation, because you can't rack up those top numbers if you don't move fast. If I hadn't been moderating this, I'd have proposed that we stop counting as a "creation" any article under 1500 characters.
 * But more to your point is that many if not most of the articles created by people systematically going through databases to add an article for every entry, which then cause the problems at AfD, aren't expandable. The guy who played in one 1870s baseball game is just not someone we know more about than his name and how many at-bats he had in that game. It's not simply that the article doesn't get expanded. It's that in many if not most cases it can't be expanded. Valereee (talk) 17:29, 13 September 2022 (UTC)
 * ETA: And you'll be able to propose new questions in the RfC. The workshopping was to develop a first set. But if you want to propose an additional idea, you'll be able to. Valereee (talk) 17:35, 13 September 2022 (UTC)
 * If the article can't be expanded past a stub, then IMO Wikipedia should not have that article, per WP:WHYN. Technically, no such rule has ever been adopted, but even if it had been, we'd still have the problem of people complaining about quality-at-creation for notable subjects.  The first version of Breast cancer contained four sentences and zero sources.  It is unquestionably notable, and at the time, it was considered a helpful contribution.  I wonder whether our current editors would agree that even a four-sentence, two-source stub on an equally unquestionably notable subject would be a helpful contribution, or if they would see it as a burden inflicted upon them by someone who should have written at least a dozen sentences before bothering them with the contribution. WhatamIdoing (talk) 19:17, 13 September 2022 (UTC)
 * @WhatamIdoing, I feel like we're talking at cross-purposes. From my understanding, what people are objecting to is permamicrostubs created, often from sources that don't constitute sigcov, about subjects that don't meet GNG, which then overwhelm AfD. How does breast cancer fit into that anywhere? The fact Breast cancer was created as a stub is completely immaterial, surely? It's not stubs that are the problem, and I would be surprised to find out that any experienced editor would tell you otherwise. Valereee (talk) 21:53, 13 September 2022 (UTC)
 * When people think that very brief articles harm Wikipedia's reputation, they are unlikely to think that a stub about an "important" subject is good for Wikipedia. They might even think that it's worse to have a "bad" article about an important-seeming subject than to have no article at all, because more people might read it and be disappointed.
 * For example, at the moment, we don't have an article on Spanish Renaissance sculpture. This is an obviously notable subject, about which a substantial number of scholarly sources have been published.  Now imagine that someone writes a bad stub about this obviously notable subject:  "Spanish Renaissance sculpture is the kind of sculpture they did in Spain during the Renaissance."  Perhaps the editor would even add a ==See also== section that links to Spanish Renaissance and Renaissance sculpture.  Maybe the editor would toss in a URL to a website.
 * Twenty years ago, this would have been an unsurprising place for an article start. Ten years ago it might have had a chance (though see Newbie treatment at Criteria for speedy deletion).  What do you think would happen to that article today?
 * If the problem were only the doomed permastub nature of some articles, then that bad stub would be welcomed with open arms. But somehow, I doubt that the response would be very welcoming.   WhatamIdoing (talk) 03:46, 14 September 2022 (UTC)
 * "This seems to be okay with everyone": I'm a bit uncomfortable with the idea because it seems unnecessary, and an inferior way of going about things. Provided these are truly stubs in which editors won't take much interest (so essentially, not current events), such that collaboration beyond gnoming would be unlikely, why not create those 50 stubs in user space (or in draft space, on your local computer, etc.)? And push them into main space as you see fit? Ovinus (talk) 17:47, 13 September 2022 (UTC) Apologies if this reply was against the threaded discussion rule; feel free to hat/remove it Ovinus (talk) 17:48, 13 September 2022 (UTC)
 * Why not give people access to good information now, even if I plan to give them access to additional information later?
 * Also, related articles sometimes need to link to each other, and if you don't create them first, then you either have to write them with bad links (e.g., ) that you have to fix in the article later, or you have to write the article with red links, which prevents you from noticing typos in the links or other mistakes while you're writing.  (Also, when you do move the page to the mainspace, then someone who's unfamiliar with the actual rules for red links will "helpfully" remove them.) WhatamIdoing (talk) 19:02, 13 September 2022 (UTC)
 * That makes sense, thx. Ovinus (talk) 21:58, 13 September 2022 (UTC)
 * Here are three ways I've thought of to ask this:
 * Should editors be permitted to create stub articles?
 * If an editor creates an article, do we expect that editor to develop it beyond the stub size?
 * What expectations about further development, if any, should we impose on editors who create stubs?
 * I'd value any feedback about potential questions. The closers might prefer the first, as it lends itself well to a straight-up vote. WhatamIdoing (talk) 19:25, 13 September 2022 (UTC)
 * Perhaps it would be useful to add something in there about timescales? I'm not somebody who has an issue with the presence of stub articles, but I think more people would be accepting of a rule saying that if someone creates a stub then they should come back and expand it, than a rule prohibiting stub articles altogether. If so, we should probably get a feel for what timescales people think are reasonable - something like say 3 months is going to impact deletion processes very differently to something like 3 days. Thryduulf (talk) 21:14, 13 September 2022 (UTC)
 * @WhatamIdoing, when you say "The closers might prefer the first, as it lends itself well to a straight-up vote", are you proposing asking in the RfC the question:
 * Proposed: Forbid creation of stubs.
 * Is that what you're proposing? I'm afraid to me, as simply an editor, that would seem rhetorical and pointy. As moderator I will reiterate that proposals for solutions will be welcome for the first 7 days of the RfC. Valereee (talk) 21:48, 13 September 2022 (UTC)
 * This would probably require a redefinition of what constitutes a stub article. Currently WP:STUBDEF reads "There is no set size at which an article stops being a stub." - Enos733 (talk) 22:40, 13 September 2022 (UTC)
 * It might require a clearer definition, or at least a definition specific to this process, but that is more achievable now than it was back in the day.
 * Valereee, such a proposal would be a departure from our long-standing practice, but that might be what the core community wants now. If you think back through these conversations (on this page and elsewhere), there have been multiple editors saying that one of the problems with mass-creation of articles is that the resulting articles are not up to their preferred standards.  We don't hear "individual species obviously aren't notable"; instead we hear "it's only three sentences long, it only has two sources, and the editor who started the page hasn't come back to expand it into something much bigger".
 * I think we could expect to find editors in three camps:
 * Of course stubs are okay. That's how the wiki grows.
 * Stubs are terrible, and nobody should create them. If they're not expanded promptly, they should be hidden or deleted.
 * It's okay for editors to make a few stubs, but nobody should make very many of them (i.e., ban mass-creation of stubs).
 * In the last two categories, I would not be surprised if a few editors said that we should specifically ban creation of stubs about BLPs and any subject covered by Notability (organizations and companies).
 * The outcome that I would hope to see is that we either agree that stubs are okay (in which case, the "your embarrassing garbage isn't good enough for me" crew can like it or lump it) or that they're not okay (in which case, the "don't be afraid to start small" folks can update their standards to the modern era), and that we can stop having debates about whether articles need to be past the stub stage to be permitted in the mainspace. WhatamIdoing (talk) 00:00, 14 September 2022 (UTC)
 * Personally, I created a number of poorly sourced stubs when I first started editing. A few have been converted to redirects, I have moderately improved some, and others have received a lot of attention from other editors. But there are still some that need improvement. These days, my minimum standard for a new article seems to be at least 100 words of prose and three decent sources (Savannah River point). Anything less than that and I work the material into an existing article. (For instance, I recently created an article for Tarver, Florida, but then thought better of it, deleted the article, and placed the material in Historic communities of Alachua County.) So, that is where I personally draw the line. I would like to see something like that adopted as a standard expectation for new articles. Unfortunately, I do not have high hopes of it being adopted, but I will argue for any standard that moves in that direction. - Donald Albury 02:01, 14 September 2022 (UTC)
 * I am generally in camp 3 - a stub article is fine (as long as there is a verifiable source that suggests notability), but there should be limits on how many stubs an editor can create in a given period. - Enos733 (talk) 15:08, 14 September 2022 (UTC)

I am the user who highlighted the failure of an article creator to expand it. I proposed a solution here: Newly added stubs not containing at least one (or two) non-database ref(s), that have not been expanded within five days, should be automatically userfied, thus putting the burden of expansion back onto the creator. I know nothing about the technical side of things, so I have no idea how feasible this is, but if it could be done, it would avoid having to bring 50 articles to AfD, with all the headaches that involves, and also avoid the situation where an article survives AfD because it can be expanded, but it never actually is. Scolaire (talk) 10:35, 14 September 2022 (UTC)


 * @Scolaire, if we ask a question like this, do you think you could explain why it's important to you that all articles be expanded past the stub stage? (There's no point in explaining now; I'm just asking whether this is the kind of thing that could be explained, or if it's basically ineffable.) WhatamIdoing (talk) 15:40, 14 September 2022 (UTC)


 * Yes, I think I could explain why it is important. Scolaire (talk) 16:32, 14 September 2022 (UTC)


 * The essential point, which I think Valereee has mentioned a few times, is not that anyone objects to the creation and existence of stub articles. What they object to is the creation of many thousands of stub articles about non-notable subjects - subjects about which no significant coverage exists anywhere - based on database sources.
 * The classic example of this was the creation of a very large number of articles, mostly by a single editor, about supposed "villages" and "ghost towns" based on the Iranian Census, GEOnet Names Server, and GNIS, none of which are actually structured to reliably identify populated or formerly-populated villages, and which we have now spent several years clearing up with no end in sight.
 * The other major example was the creation of many thousands of articles about Olympians, cricketers, association football players and other athletes based on sports-reference.com (a statistical database). All that is known about these people is a name (quite often not their whole name) and some statistics. These people simply aren't notable, and do not have any coverage anywhere beyond a few numbers in a database. No substantial article can be written.
 * These database imports are a scourge on this encyclopaedia, sucking up massive amounts of time from other editing tasks to deal with. For example, the deletion of 216 articles about random shops, factories, farms and so forth that the Iranian census had just happened to count people near to required a full AFD that lasted a week and involved 17 editors, yet it took their creator no more than an hour or two to create them.
 * THESE are the target of what we are talking about here, not simply one stub or even 50 created over a week that already have a source in them that shows the existence of significant coverage from which an article can be written. FOARP (talk) 16:17, 14 September 2022 (UTC)
 * Yes! What we should be looking for is the simplest requirement that will prevent mass-creation of stubs based solely on a database, without putting a burden on the creation of other articles. I believe this sums up what this RfC should produce. Donald Albury 16:41, 14 September 2022 (UTC)
 * I want to echo what User:FOARP and User:Donald Albury said. We need to focus on the most outlandish mass-creation that leads to a Fait accompli (as ArbCom has unanimously agreed). This cannot turn into a referendum on article creation or deletion in general. This is a problem primarily created by volume. (And it might help to clarify that Wikipedia is WP:NOT a database, and Wikipedia articles are WP:NOT database entries or compilations from other databases.) Shooterwalker (talk) 17:12, 14 September 2022 (UTC)
 * I feel like some editors are having trouble distinguishing between "non-notable" and "stub".
 * There are editors who object to stubs, especially if they are just one or two sentences long. This is a real thing, even if it's not your own personal objection.  I have seen editors complain about three-sentence, two-source stubs because they are very short, even when you explain that there are more sources and that more sentences could be written.
 * I also feel like some editors are having trouble distinguishing between "non-notable" and "sourced to a database".
 * If you haven't looked at it earlier, then look at Wikipedia talk:Arbitration Committee/Requests for comment/Article creation at scale/Archive 2 to see how much information a single secondary database contained for one species. Turned into prose, that would be too long and too comprehensive to be counted as a stub – which IMO means that we have achieved SIGCOV.  It is also a notable subject; the database entry itself lists 14 publications, and it's an incomplete list (notably missing the 1917 full species description published in a very respectable peer-reviewed academic journal, which is the source of its Valid name (zoology)).
 * Some editors seem to look at typical stubs and think "Boring subject, cookie cutter article, just a short stub, the first version only cites a database – that proves that it's a non-notable subject, so let's delete it". But they are wrong.  It is (the actual article) a stub, and the article was (at best) created from a database (though the original one-sentence stub in 2009 didn't actually cite anything at all).  However, it is (a) a notable subject based on the number and depth of publications on it, and (b) capable of becoming something beyond a stub, as I think I've proved in the linked example above.
 * It seems to me that objections to "stub articles about non-notable subjects - subject about which no significant coverage exists anywhere - based on database sources" are being turned into objections to "stub articles...based on database sources". Some editors do seem to object to stub articles based on database sources even when the subject is notable and significant coverage does exist in academic journals. WhatamIdoing (talk) 03:43, 15 September 2022 (UTC)
 * Even if you're describing the feelings of some editors, that concern is outside the scope of this RFC. The focus here is on mass creation. So the issue is when someone creates such a massive volume of stubs that it becomes practically impossible to discern between stubs that can be expanded with reliable third party coverage, and those that cannot. Shooterwalker (talk) 03:56, 15 September 2022 (UTC)
 * The connection is this: If editors shouldn't be creating stubs at all, then it is also/automatically true that editors shouldn't be creating a lot of stubs.
 * A related problem: Would we have been happier with someone mass-creating non-stubs about non-notable subjects?  Imagine that the Iranian census ghost towns were all much larger than stubs, but otherwise had the same problems.  What would we be talking about, instead of "mass creation of stubs"?  "Mass creation of possibly wrong articles"?   WhatamIdoing (talk) 04:02, 15 September 2022 (UTC)
 * I'm not sure what point you're trying to make other than to pick a fight about something that you described as dividing Wikipedia into three different camps. That's counterproductive, and there is no value in expanding this to be a referendum on stubs in general. Our job is to build a WP:CONSENSUS. What we can say for certain is that most people don't think the cycle of mass creation and mass deletion is a healthy process, and we ought to be able to make some incremental changes to reduce the WP:BATTLEGROUND. Shooterwalker (talk) 04:13, 15 September 2022 (UTC)
 * "to see how much information a single secondary database contained for one species. Turned into prose, that would be too long and too comprehensive to be counted as a stub – which IMO means that we have achieved SIGCOV" - I think the crux of what we're saying here is that where there is at least one SIGCOV source, there is not likely to be a problem as far as this RFC is concerned. The problem is when we have articles like Harry Oppenheim. In this case there was only one real source (multiple sources were cited in the article but these were essentially copies of each other), which was a statistical database that said he had played a single association football match for Austria in 1909, which is clearly not SIGCOV. Pushing further, it seems that Harry Oppenheim may not even have been the name he was known by, since his real name was probably Heinrich Oppenheim - and this is literally all that is known or ever going to be known from reliable sources about him. The creator of Harry Oppenheim created many thousands of other articles with exactly the same problems using exactly the same sources. FOARP (talk) 07:53, 15 September 2022 (UTC)


 * It's fine for y'all to use this talk page to workshop a proposed question you'd like to add in the first seven days of the RfC, but I'm unsubscribing. Valereee (talk) 12:03, 15 September 2022 (UTC)
 * Actually, I'm going to have to ask you to take the discussion elsewhere, as there's been misunderstanding at WT:ACN about it. Valereee (talk) 15:53, 15 September 2022 (UTC)

Workshop phase is over
I am currently drafting for the RfC from what came out of the workshopping, which is fully contained in Archive 2. , when you archive the closed discussion above, please put it in Archive 3 so folks will know it wasn't part of the workshopping, thanks. Let's leave this message here until we're ready to start the RfC. Valereee (talk) 10:35, 18 September 2022 (UTC)


 * @Valereee: ✅. – MJL ‐Talk‐☖ 18:05, 19 September 2022 (UTC)

Fix?
I don't think the whole RfC is supposed to go into the status header box thingie...@MJL, can you fix that? Then go ahead and archive this page so we can start fresh (starting fresh with the question below, which is about this RfC) with a new archive for the RfC, thanks! Valereee (talk) 16:43, 3 October 2022 (UTC)


 * MJL doesn't seem to be available so anyone else who knows how to fix this, please do! Valereee (talk) 20:01, 4 October 2022 (UTC)
 * Thank you, Enterprisey! Valereee (talk) 21:15, 4 October 2022 (UTC)