Wikipedia talk:WikiProject Core Content

Selection criteria
How do we determine if an article is "core" content? Some suggestions:
 * 1) any topic that has three monographs or entries in scholarly encyclopedias dedicated to the subject
 * 2) any topic that has an entry in three major general encyclopedias (e.g., Encyclopædia Britannica, World Book Encyclopedia, Encarta)
Thoughts on these, or any other ideas? I think whatever we decide, it should be objective, to reduce arguments about inclusion. Levivich 20:05, 15 August 2022 (UTC)


 * Pinging BD2412, CactiStaccingCrane, and Apaugasma in case they want to comment. Levivich 20:14, 15 August 2022 (UTC)
 * Are we not including articles on the subject published in reputable peer-reviewed journals for this purpose? In other words, this is solely for topics covered in three publications, these being either books or entries in major general encyclopedias? BD2412  T 20:19, 15 August 2022 (UTC)
 * I would allow monographs and encyclopedic entries to add up (e.g., one monograph and two entries would make it).
 * I would also specifically allow more specialized scholarly encyclopedias (e.g., Dictionary of National Biography, Encyclopaedia of Islam), though these must be of excellent academic reputation (perhaps 'scholarly encyclopedias of high repute'?).
 * Yes, we should also count papers in academic journals or chapters in edited volumes in some way: perhaps we can let three such papers or chapters count for one monograph/encyclopedic entry? That would include everything which has nine papers/chapters, even if there is no monograph/encyclopedic entry. Or is it better to require twelve, making four count for one? ☿ Apaugasma  ( talk  ☉) 20:31, 15 August 2022 (UTC)
 * If we're going to go that route, let's just have a points system. Ten points = inclusion. A monograph or encyclopedia entry counts for four points. BD2412  T 20:39, 15 August 2022 (UTC)
 * What about just WP:GNG, except the sources must be academic? Which would include academic journals, academic books, encyclopedias, etc. ("Tier 1" sources), as long they're WP:SIGCOV? Or, perhaps more than SIGCOV, they must be the topic of the source? Levivich 20:41, 15 August 2022 (UTC)
 * I was also thinking of a point system because it seems wrong to give as much weight to a journal article as to a monograph. But perhaps it's better to emulate WP:GNG, leaving it vague how many sources exactly are needed in the first place. We could just state that sources must be WP:TIER1, though monographs, literature/systematic reviews or entries in scholarly encyclopedias should be given more weight than single peer-reviewed articles or chapters in edited volumes (the latter are still missing from WP:TIER1).
 * WP:SIGCOV says "it does not need to be the main topic of the source material". I would change that for our purposes into it must be the main topic, or one of the main topics, of the source material. On the one hand we need much more than a few paragraphs or pages, but there often is more than one main topic, and demanding it to be the main topic would both be too strict and create problems with establishing whether something truly is the main topic of a source or not. ☿ Apaugasma  ( talk  ☉) 21:17, 15 August 2022 (UTC)
 * (I would include "chapters in edited volumes" as part of TIER1's "books published by university presses", but maybe TIER1 should be clarified.)
 * The question I have is whether we really need to weigh academic sources against other academic sources for the purpose of determining whether a topic is in scope of this project (as opposed to, say, for determining whether something is WP:DUE in an article). I feel like a topic either "makes the cut" or doesn't "make the cut".
 * Say, for example, we decide that a topic should be supported by at least three academic sources to be in-scope (with the topic being a main topic of the source material, but not necessarily the main topic or sole main topic). Does it matter to us if the three sources are three books, three journal articles, three encyclopedia entries, or one of each? Do we want to say that topics that are the subject of three journal articles are not in scope? I'm not sure what my answer is to that question.
 * For practical purposes, we could start strict and expand the scope later. After all, if we're talking about a scope that includes hundreds of thousands of articles (most likely), we'll need to start with a smaller batch anyway (when it comes to adding talk page banners and such). Levivich 18:08, 16 August 2022 (UTC)
 * Yes, start strict. That's why three journal articles, which in my experience covers a lot more than three monographs/encyclopedic entries, should probably not make it, at least not to start with. A simple statement that articles in academic journals contribute a little less to academic notability should suffice though. Note that if we differentiate there, chapters in edited volumes should also be differentiated from whole books (btw, such chapters are, as a very general rule, of lower quality than papers in journals, as well as than monographs). That's my view, but I would like to hear from others. ☿ Apaugasma  ( talk  ☉) 19:28, 16 August 2022 (UTC)
 * How about, and then we define the terms:
 * "Significant coverage" means...
 * WP:SECONDARY or WP:TERTIARY coverage but not WP:PRIMARY
 * The topic must be a main focus (but not necessarily the sole focus) of the source
 * The amount of coverage must be:
 * 3(?) monographs, or
 * 3(?) encyclopedia entries, or
 * X journal articles, or
 * Y book chapters, or
 * Some combination of the above (point system?)
 * "Independent" means independent of the topic and of each other
 * "Academic" means published by a reputable academic publisher
 * We can then adjust the scope by adjusting what we mean by "significant". Thoughts? Levivich 16:56, 17 August 2022 (UTC)
 * Agreed. Try it out, see if it is logical. For example, day would meet the criteria, because it has received coverage in:
 * Britannica,
 * Encyclopedia of Time: Science, Philosophy, Theology, & Culture (search "day"), and
 * Encyclopedia of Microcomputers: Volume 19 - Truth Maintenance Systems to Visual Display Quality, in an entry titled "clock" that talks a lot about how days are treated in computers
 * -- CactiStaccingCrane (talk) 17:05, 17 August 2022 (UTC)
 * OK, I guess that's workable for now. Which brings me to my next question, how to populate the list of articles in scope, which I'll start a separate thread for below. Levivich 15:01, 18 August 2022 (UTC)
 * I like it! A friendly competition between WP:WikiProject Vital Articles and WP:WikiProject Core Content would be extremely interesting to see. As for the inclusion criteria, I suggest not to have a hard-and-fast rule and instead making a few examples to roughly establish the criteria. I would also suggest to start making a drive to attract editors. CactiStaccingCrane (talk) 00:16, 16 August 2022 (UTC)
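
A minimal sketch of the points system floated in this thread, for anyone who wants to try it against sample topics. The 4 points per monograph or encyclopedia entry and the 10-point threshold come from the proposal above; the 1-point values for journal articles and book chapters are assumptions made only for illustration, since the thread never settled them. The names POINTS, THRESHOLD, score, and in_scope are hypothetical.

```python
# Point values: 4 per monograph or encyclopedia entry and a threshold of 10
# are taken from the discussion. The values for journal articles and book
# chapters are ASSUMED here; the thread did not agree on them.
POINTS = {
    "monograph": 4,
    "encyclopedia_entry": 4,
    "journal_article": 1,  # assumed weight
    "book_chapter": 1,     # assumed weight
}
THRESHOLD = 10


def score(source_counts):
    """Return total points for a mapping of source type -> number of sources."""
    return sum(POINTS[kind] * count for kind, count in source_counts.items())


def in_scope(source_counts):
    """Return True if the topic reaches the inclusion threshold."""
    return score(source_counts) >= THRESHOLD


if __name__ == "__main__":
    # One monograph plus two encyclopedia entries.
    print(in_scope({"monograph": 1, "encyclopedia_entry": 2}))
    # Nine journal articles and nothing else.
    print(in_scope({"journal_article": 9}))
```

Changing the assumed weights (or the threshold) is the knob for starting strict and widening the scope later.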

Tools
 * Recent changes: once selection criteria are established, adding a hidden category called Category:Core content will make it easy to have a dedicated Special:RelatedChanges feed for patrolling
 * Talk page WikiProject banner tagging will give us the typical suite of WikiProject tools, like watchlists, article alerts
Any other helpful tools? Levivich (the unhelpful tool) 20:12, 15 August 2022 (UTC)


 * Maintenance backlogs, like WP:WikiProject Core Content/Unsourced and WP:WikiProject Core Content/Undersourced. These lists might be useful for that unsourced-articles task force we were talking about recently. Levivich 19:48, 28 August 2022 (UTC)

Council participate
Just saw this come up... want any help with building the project page? I have helped format many project pages, most recently WikiProject COVID-19. Moxy - 23:22, 15 August 2022 (UTC)
 * Sure and thanks! Although I'm not sure yet whether there's enough interest to make this a "real" WikiProject. Levivich 03:00, 16 August 2022 (UTC)
 * Ping me when things are better....will also keep an eye out. Moxy - 03:57, 16 August 2022 (UTC)
 * @Moxy: I hope you're still interested/available? :-) We have a preliminary list of ~17k mainspace pages (WikiProject Core Content/Articles) that I think would grow, maybe significantly (>100k), if this WikiProject moved forward, but a list of 17k seems like enough to start with for now.
 * I read WikiProject Council/Guide, WikiProject Council/Guide/Technical notes, and Version 1.0 Editorial Team/Using the bot. Is there anything else I should read?
 * What's the usual procedure for "I want to add banners to 17k talk pages"? Is there a bot that does this already? Do I need to create my own and apply for BRFA for this task?
 * Thanks for your help! (Or if you're not available, thanks anyway!) Levivich 18:35, 26 August 2022 (UTC)
 * Will be around after the weekend...... :-) Moxy - 02:26, 27 August 2022 (UTC)
 * Hi I suppose this could just be done via AutoWikiBrowser: Someone would obtain the list, append Talk: to each entry, and use the software to insert the banner. According to WP:ASSISTED though, a consensus is first needed and a BRFA should be opened if there are any doubts. But if the WikiProject gets created, I would see that as a clear consensus for making those edits. 0x Deadbeef  16:26, 2 September 2022 (UTC)
 * Thanks, Deadbeef! I've looked around and there are a number of bots that already have approved tasks for WikiProject banner tagging (User:Legobot, User:EarwigBot, User:AnomieBOT, User:Hazard-Bot, User:KiranBOT). I'm thinking we'll probably want to fork one of those rather than asking to use them directly, due to the large number of pages and the likelihood of wanting to run multiple runs over time (we don't want to keep bugging a bot operator, nor take over their bot). And we probably want to establish the WikiProject before seeking BRFA: not sure tagging 50k+ pages is justified if there are only a few people interested? Levivich 17:50, 2 September 2022 (UTC)
 * Apologies for those pings, I should have used noping. Levivich 17:51, 2 September 2022 (UTC)
 * Apparently there is no official measurement used to distinguish a group of editors from a WikiProject. I've created the WikiProject Core Content/Participants page and I suppose it would be a basis for mass tagging. 0x Deadbeef 08:09, 3 September 2022 (UTC)
 * I just saw the ping to my bot's ac. A couple of years back, I tried to revive wikiproject organised crime, and wikiproject espionage. They are somewhat better than before. If the targeted articles for the wikiproject are going to be a lot, then it is better to have some sort of consensus (even if informal) somewhere. I think the village pump would be a good place for that. There we can also get further suggestions. From a technical point of view regarding project banners - it is a little trickier than expected. AWB/AWB bots rely primarily on categories for that task. So the non-recursive (bottom of the category/the category with no sub-cat) should contain only the articles that are expected to be tagged. Please feel free to ping me if you have any questions about the bot, or wikiprojects in general. I will participate if I find it interesting, and if time permits. At the least, I will always be available for discussions :-) —usernamekiran (talk) 17:16, 5 September 2022 (UTC)
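
A minimal sketch of the "append Talk: to each entry" step mentioned above, assuming the input is a plain list of mainspace article titles (no namespace prefixes). The function name talk_titles is hypothetical, and this only prepares the list; it does not edit anything or replace the need for consensus or a BRFA.

```python
def talk_titles(article_titles):
    """Map mainspace article titles to their talk page titles.

    Assumes every input is a mainspace title with no namespace prefix.
    Surrounding whitespace is stripped and blank entries are skipped.
    """
    return ["Talk:" + title.strip() for title in article_titles if title.strip()]


if __name__ == "__main__":
    for talk_page in talk_titles(["Day", " Clock ", ""]):
        print(talk_page)
```

The resulting list could then be loaded into whichever tool ends up doing the tagging.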

Creating the list of articles
How do we generate a list of articles within the scope of this WikiProject? My thoughts:
 * All 1,000 articles in WP:VA3, right?
 * I'm not as sure about WP:VA4 and WP:VA5, because they cover non-academic topics (e.g. pop culture). For example, I'm not sure that Sean Connery, a VA4 topic, is the subject of significant scholarly coverage (though undoubtedly he is the subject of significant non-scholarly coverage). Perhaps there are certain VA4 and VA5 categories that we can "automatically import", while not automatically importing other VA categories (like "People")? If we do this for VA5, it'll bring the article count into the tens of thousands. That'll be a good place to start (in terms of setting up an RC feed, etc.), but we'll also be duplicating VA at this "level".
 * Is every topic that is an entry in Encyclopedia Britannica in-scope? I believe there are about a couple hundred thousand entries in Britannica. The 1911 and 1922 versions, though old, are available on WikiSource. With a bit of technical magic, we could generate a list of every entry in those two versions. No doubt that will not include modern (post-1922) topics, but it's a start. This would bring us to a six-figure article count, and include articles that are outside the scope of VA.
 * Alternatively, we could cross-reference Britannica with another encyclopedia or two, and try to come up with a list of entries that are in multiple encyclopedias. I'm not quite sure how to get a list of topics for another encyclopedia (including modern Britannica). Maybe a web crawler? I don't know.
 * I don't really trust our category system, but maybe taking all the articles that are linked one or two category levels below Category:Main topic classifications? My thought is that would grab all the top-level or broad-topic articles... but I really don't trust our category system. For example, two levels down is Category:Communication studies, which you'd think would hold top-level communications articles, but it also has Making Chastity Sexy and Pastel QAnon, which strike me as out-of-scope. So I'm not sure this one is a good approach.
 * Another thought I had is pulling from a list of academic journals. If there's an entire academic journal on a topic, that topic is very likely to be in-scope, right? (And I mean real journals, not the predatory ones.) This would probably be the same as a list of academic disciplines, and we already have a category for that (Category:Academic disciplines), so maybe this approach won't get us far.
Other thoughts? Levivich 15:15, 18 August 2022 (UTC)


 * Just a quick note that I personally won't be working on this for the foreseeable future. Sincerely appreciate the efforts, and hope that others will step in. ☿ Apaugasma  ( talk  ☉) 15:52, 18 August 2022 (UTC)
 * Agreed. Let's do it then. CactiStaccingCrane (talk) 15:41, 25 August 2022 (UTC)
 * Cacti, I heart your enthusiasm :-) I have been looking at places where we can pull together a list of articles. Because let's face it, if we are to do something that is not just duplicating VA, we're talking about way more than 50k articles, probably something like 250k-500k articles. Generating a list of 250k-500k topics is not possible manually; it'll have to be done by combining some other lists somehow. It's very hard, or at least I think it's very hard, to figure out a way to generate a list of 500k topics that have significant academic coverage.
 * Anyway, I've been looking at the 1911 Encyclopedia Britannica, which has a handy list of topics at WikiSource: 1911 Encyclopædia Britannica/Classified List of Articles. I had thought, well, OK, it'll be an outdated list and won't have modern topics, like WWI and WWII, but it's a start, right?
 * Well, I'm not so sure. After looking at the list of 1911 EB topics, I see they include wonderful entries such as "Quadroon" and "Mandingo". That makes me rather uncomfortable about just importing that list and saying "it's academic" because it's in an encyclopedia. I naively forgot how ridiculously racist the Western world was 100 years ago (like even more than today, amazingly).
 * Thoughts? This project is a great idea, but how do we actually come up with a list of academic topics? Levivich 16:41, 25 August 2022 (UTC)
 * Well, I think that we should import from a lot more encyclopedias then. WP:Missing encyclopedic articles may be of interest to you. CactiStaccingCrane (talk) 23:40, 25 August 2022 (UTC)
 * Yes! Thank you for reminding me about that. That's where I got the original idea of using EB1911. They also have lists of topics, plus excluded topics: WikiProject Missing encyclopedic articles/1911 verification. I think, unless someone beats me to it first, I'll try to distill their list of non-excluded topics into one list that just has blue links and we'll see how many that is. (I may not get to this for a few days or a week.) Levivich 00:31, 26 August 2022 (UTC)
 * Well, that was much easier than I thought it would be. I took the list of EB1911 articles, resolved the redirects, and separated out the disambiguation pages. WikiProject Core Content/EB1911 articles is a list of 16,719 EB1911 articles, and WikiProject Core Content/Dabs is a list of 295 dabs. I guess we'll have to go through the dabs manually to figure out which article(s) are in-scope, but 295 is manageable over time. As for the 16,719 legit articles, those are now the beginning of the list of articles in scope, at WikiProject Core Content/Articles, which of course we should expand. We now also have a recent changes feed for the list of articles, at Special:RelatedChanges/Wikipedia:WikiProject Core Content/Articles. I linked the list of articles and recent changes feed on the WikiProject main page. So that's 17k. There's also WikiProject Missing encyclopedic articles/Hot, a list of 72k encyclopedia topics, from multiple encyclopedias. Plus Vital Articles, that would bring the list over 100k. For VA, I was thinking all of WP:VA3 and all of WP:VA5 except the "People" and "Everyday life" categories... I think the other categories are all academic in nature, but those two need some more careful scrutiny, maybe by sub-category. Any thoughts? Levivich 00:17, 27 August 2022 (UTC)
 * Well, I think it's time to do it. It would be much easier to spot what's missing once the list materializes. CactiStaccingCrane (talk) 01:05, 27 August 2022 (UTC)
 * I just want to say that having the RC feed is really great. I already made my first revert, by sheer coincidence on an article related to the very topic I'm writing about these days.
 * Yes, the list from the Missing encyclopedic articles project looks okay (though a spot-check seemed to reveal a lot of articles that may not meet the WP:ANG), and yes, better leave out some subcategories from VA (VA5 Cities, and to a lesser extent VA5 Culture, also look problematic).
 * In general though I would model the inclusion process on the general editing process: anyone can boldly add an article, anyone can boldly remove it, and if two or more editors are at odds about it they are expected to discuss. When a sufficient number of editors are aware of the process, articles will soon be added and removed all the time. We need a start to get it going, but it doesn't need to be 100% accurate. ☿ Apaugasma  ( talk  ☉) 01:28, 27 August 2022 (UTC)
 * Exactly. CactiStaccingCrane (talk) 01:45, 27 August 2022 (UTC)
 * Sounds good to me. I am going to see about adding List of encyclopedia topics (77k) and WikiProject Missing encyclopedic articles/Hot (73k), and VA3 (1k). After excluding redlinks, redirects, dabs, and deduping, I'm not sure how many that'll leave us, but we'll see. And then after that I'll see about selected VA5 categories. Levivich 02:01, 27 August 2022 (UTC)
 * Unfortunately for both these lists, editors removed links as they turned blue, so what's left is a tiny proportion of their starting counts. I am not going to import WP:List of encyclopedia topics at all, because it seems to have too many non-academic topics (e.g. Acid Head, Auchlochan, Bass Brook). The Hot list looks solid, but I think the import count will sadly be <6k from the current revisions of the pages. I might later see about pulling more topics from the page histories, but there are dozens of sub-pages and each one has been filled and culled multiple times, so not a quick/easy thing. Alas, I no longer think we'll get to 100k quite so quickly and easily. Levivich 00:09, 28 August 2022 (UTC)
 * Done - current version of the Hotlist pages added, about another 5k articles, all VA3, and all VA5 except "People" and "Everyday life" categories (these two categories need a closer look), about 31k; we're now at 53,496 articles. I think that's all the mass importing I plan to do for now; if anyone else needs help with large list import, feel free to let me know. Levivich 19:21, 28 August 2022 (UTC)
 * Levivich, could you add that to the bar of Recent Changes? That would be really helpful for us. CactiStaccingCrane (talk) 14:47, 2 September 2022 (UTC)
 * @CactiStaccingCrane: Absolutely! The recent changes feed (the link on the wikiproject page) automatically draws from the /Articles subpage, so as soon as articles are added to that subpage, the RC feed is automatically updated. So right now the RC feed should be monitoring all 53k articles in /Articles. Levivich 16:08, 2 September 2022 (UTC)
 * Nice! Time to use the "likely" filter more often :) CactiStaccingCrane (talk) 16:11, 2 September 2022 (UTC)
 * Just noting here that as I'm going through the list of unsourced in-scope articles, I'm noticing a lot of false positives. :-( Levivich (talk) 02:01, 19 September 2022 (UTC)
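
A minimal sketch of the list-building steps described in this thread (drop redlinks, resolve redirects, separate out disambiguation pages, dedupe), assuming the page data has already been fetched into plain Python collections. The function name build_article_list, its parameters, and the sample titles are hypothetical; this is not the script that was actually used, and it only resolves single-hop redirects.

```python
def build_article_list(source_lists, redirects, dabs, existing):
    """Merge several title lists into (articles, dab_pages).

    source_lists: iterable of lists of page titles
    redirects:    dict mapping a redirect title to its target title
    dabs:         set of disambiguation page titles
    existing:     set of titles that exist (blue links), redirects included
    """
    articles, dab_pages = set(), set()
    for titles in source_lists:
        for title in titles:
            if title not in existing:
                continue  # redlink: nothing to import
            target = redirects.get(title, title)  # single-hop resolution only
            if target not in existing:
                continue  # broken redirect
            if target in dabs:
                dab_pages.add(target)  # set aside for manual review
            else:
                articles.add(target)   # the set removes duplicates
    return sorted(articles), sorted(dab_pages)


if __name__ == "__main__":
    lists = [["Day", "Clock", "Days"], ["Clock", "Mercury", "Nonexistent"]]
    articles, dab_pages = build_article_list(
        lists,
        redirects={"Days": "Day"},
        dabs={"Mercury"},
        existing={"Day", "Days", "Clock", "Mercury"},
    )
    print(articles)
    print(dab_pages)
```

Keeping the disambiguation pages in a separate list mirrors the /Dabs subpage approach, where those entries are checked by hand.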