Wikipedia talk:Wikipedia Signpost/2021-08-29/Recent research


 * Just wanted to say - thanks, TB, for keeping this column running for years and years. --Piotr Konieczny aka Prokonsul Piotrus | reply here 04:12, 30 August 2021 (UTC)
 * Have read only the excerpt quoted here, but Jemielniak et al. (2021)'s findings that WikiProject Tropical Cyclones delivers the highest quality content of all WikiProjects on English Wikipedia surprise me. I expected MilHist. Learn something every day. Rotideypoc41352 (talk · contribs) 18:26, 30 August 2021 (UTC)
 * Paper is paywalled so can't say for sure, but it says it's using the square root of the total articles? A quick check shows MILHIST has 1,334 FAs out of 200,359 articles total. Tropical cyclones has 165 FAs, but out of a mere ~3,200 articles. That seems like it should still favor MILHIST, but I guess Tropical Cyclones' huge number of GAs helped it win out, which seems a bit off to me. I'm curious whether the writers included A-class MILHIST articles that were also GAs. Also, there are only so many GANs that can really be processed, so that would hit larger projects more than smaller projects. (Regardless, both of those projects are doing good work.) SnowFire (talk) 23:30, 30 August 2021 (UTC)
 * We ran into a similar problem in our 2015 paper about the misalignment between quality and viewership of articles (covered in the April 29, 2015 Signpost's research section: Popularity does not breed quality (and vice versa)). In that case, we were trying to understand topics of FAs with relatively low viewership, and used relative risk to measure it. The approach has a couple of parameters that can be tuned: the minimum number of articles in a project, which removes small projects that often focus on a very specific topic; and the minimum number of articles in the dataset, which controls how general the listed topics are (smaller values lead to more specific topics, IIRC).
 * Both MILHIST and Tropical Cyclones were in our paper (Table 9, page 7). The difference in relative risk was large: MILHIST ranked #7 with an RR of 5.3, whereas Tropical Cyclones ranked #2 with an RR of 99.3. I'm unsure how many articles each project had at that point in time; we don't mention it and instead focus on how MILHIST also has some articles with high viewership but relatively low quality. One was NATO, now a GA, which made me happy to see! The other was Vietnam War, which is still a C-class article (and it makes sense that it still is; we had good discussions on the talk page of the Signpost back in 2015 about some of the challenges of writing those types of articles). Cheers, Nettrom (talk) 08:01, 1 September 2021 (UTC)
 * Disturbingly, WikiProject Tropical cyclones is the subject of a very large copyright investigation opened in May 2021. "For the most part, this project remains rife with direct copy and pastes, unattributed PD copying, unattributed copying within Wikipedia, possible but unconfirmed translation vio, possible but unconfirmed cross-wiki translation vio, and possible but unconfirmed paywall vio, mostly to newspapers.com we believe." NebY (talk) 20:51, 1 September 2021 (UTC)
 * Not to sidetrack, but isn't "unattributed PD copying" legit? It's good practice to attribute, of course, but if it's public domain, it's fair game to copy wholesale.  SnowFire (talk) 22:45, 1 September 2021 (UTC)
 * Not my speciality, but Copyright violations says that material from public domain sources or other compatibly licensed sources may also be used in accordance with the copyright policy, provided correct attribution is given. NebY (talk) 00:04, 2 September 2021 (UTC)
 * Attribution is more than just good practice; see Plagiarism. ☆ Bri (talk) 15:25, 2 September 2021 (UTC)
 * Unattributed PD copying is still something that we copyright editors open CCIs for and actively look to correct. The missing attribution is the most important thing to correct; the act of copying, so long as the material is usable on Wikipedia, is not bad in itself. I expect a lot of disruption with all of the socking and vandalism and whatnot, let alone this CCI. Sennecaster (Chat) 14:11, 4 September 2021 (UTC)
 * Yes, I'm always concerned when researchers use Wikipedia's FA label to judge the quality of the content here. The whole Clean Wehrmacht controversy with the military history articles a while back showed how problematic accepting FA status as an indicator of actual quality can be. AugusteBlanqui (talk) 19:14, 2 September 2021 (UTC)
 * An article is an artistic creation, much as a book or movie is, but GAFA does not operate entirely like book reviews and movie awards. It works more by checklist, and the article is judged by how well it meets each of the requirements. So, some subjects like warships and hurricanes are born a dozen or three per year. They have very precise notability requirements and usually very precise official sources, and well-established traditions of article structure and checklists of what goes in the article and what does not. Checking off the boxes for the checklists of GAFA is easier for those whose own checklists are clear than for articles about a war or a musical genre or a technical development, where the boundaries and criteria for aspects to be included are seldom so clear. This of course does not make GAFA unimportant, but like any tool, its powers and limitations have to be understood. Jim.henderson (talk) 15:02, 4 September 2021 (UTC)
 * Courtesy link: Rotideypoc41352 (talk · contribs) 18:14, 7 September 2021 (UTC)
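
As a rough check of SnowFire's arithmetic above, the sketch below divides each project's FA count by the square root of its total article count. This is only an assumption about the normalization: the paper's actual formula is not quoted in this discussion, and GAs and A-class articles are left out entirely. The function name `fa_per_sqrt_articles` is hypothetical; the counts are the ones quoted in the thread, with "~3200" taken as 3200.

```python
import math


def fa_per_sqrt_articles(fa_count: int, total_articles: int) -> float:
    """Assumed score: FA count divided by the square root of total articles.

    This is an illustrative guess at the normalization, not the paper's method.
    """
    return fa_count / math.sqrt(total_articles)


# Counts as quoted in the discussion (the Tropical Cyclones total is approximate).
milhist = fa_per_sqrt_articles(1334, 200359)
cyclones = fa_per_sqrt_articles(165, 3200)

print(f"MILHIST:           {milhist:.2f}")
print(f"Tropical Cyclones: {cyclones:.2f}")
```

Under this assumed score, MILHIST comes out slightly ahead (about 2.98 versus 2.92) on FAs alone, which is consistent with the guess that something other than FA counts, such as GAs, tipped the ranking.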