Wikipedia:Bots/Requests for approval/LivingBot 17


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

LivingBot 17
Operator:

Time filed: 10:38, 5 June 2011 (UTC)

Automatic or Manually assisted: Automatic, manually assisted (mixture of the two)

Programming language(s): AWB, in all likelihood; using the CSV importer.

Source code available: Standard AWB

Function overview: Add Backwardscopy to article talk pages that need it

Links to relevant discussions (where appropriate):

Edit period(s): One time

Estimated number of pages affected: less than 2140

Exclusion compliant (Y/N): Y (basically N/A)

Already has a bot flag (Y/N): Y

Function details: Companies such as AlphaScript and BetaScript (both part of the same publishing house) make money out of compiling Wikipedia articles into ad-hoc books, for example:
 * "Miller, F. P., Vandome, A. F., & McBrewster, J. (2010). Frisian-Frankish wars: Frisian kingdom, Redbad, King of the Frisians, Charlemagne, Low Countries, Saxons, Varni tribe, Scheldt, Merovingian dynasty

This book would therefore duplicate the content of Frisian-Frankish wars, Frisian kingdom, Redbad, King of the Frisians, etc. The idea is to at least recognise these copies via the Backwardscopy template. LeadSongDog has compiled a list of these books; I have used a script to extract the relevant data (output) and will manually review all entries before starting the script (for example, fixing accent encoding issues). The bot will not create talk pages for articles that do not exist, and I expect to perform the run via a number of passes: one where the talk page does not exist at all; one with just headers; and increasing complexity until I basically have to adjust each manually. Note that articles with multiple mentions have been merged; whilst >3 mentions gets you a "skip" (I think this only happens once).
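The pass-by-pass approach described above could be sketched roughly as follows. This is only an illustration, not the bot's actual AWB configuration: the function names (`strip_templates`, `classify_talk_page`, `group_into_passes`) and the bucket labels are hypothetical, and it assumes "just headers" means a talk page containing nothing but template calls such as project banners. Checking that the article itself exists is left outside the sketch.

```python
from typing import Dict, List, Optional


def strip_templates(text: str) -> str:
    """Remove balanced {{...}} template calls, including nested ones."""
    out = []
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            # Keep only characters that sit outside every template.
            if depth == 0:
                out.append(text[i])
            i += 1
    return "".join(out)


def classify_talk_page(wikitext: Optional[str]) -> str:
    """Assign a talk page to a (hypothetical) pass; None means no page exists."""
    if wikitext is None:
        return "missing"
    if not wikitext.strip():
        return "empty"
    if not strip_templates(wikitext).strip():
        return "headers_only"  # assumption: only banner-style templates present
    return "needs_review"      # anything else is handled manually


def group_into_passes(pages: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Group talk page titles by pass so the simplest cases can run first."""
    passes: Dict[str, List[str]] = {
        "missing": [], "empty": [], "headers_only": [], "needs_review": [],
    }
    for title, wikitext in pages.items():
        passes[classify_talk_page(wikitext)].append(title)
    return passes


# Made-up example input, purely for illustration.
example_pages = {
    "Talk:A": None,
    "Talk:B": "",
    "Talk:C": "{{WikiProject X|class=B}}\n{{Talk header}}",
    "Talk:D": "{{WikiProject X}}\n== Question ==\nHello",
}
example_passes = group_into_passes(example_pages)
```

Ordering the run this way means the buckets that can be edited mechanically are cleared first, leaving only the "needs_review" pages for manual adjustment.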

Discussion

 * Support the objective (obviously), but have no comment on the suitability of the proposed mechanism. There are other publishers for which the same type of intervention is needed, this was just the most blatant. LeadSongDog come howl!  17:21, 8 June 2011 (UTC)
 * Trusted botop, BAG member, seems like an uncontroversial task and is relatively easy to implement. I see no problems here. Regards, MacMedtalk stalk 23:54, 8 June 2011 (UTC)

— HELL KNOWZ  ▎TALK 13:51, 11 June 2011 (UTC)
 * It was a slow process, but when I do all the articles at once I can pick up speed by grouping them into similar categories (non-existent talk pages, empty, etc.) - Jarry1250 [Weasel? Discuss.] 16:51, 11 June 2011 (UTC)

Edits look fine. Useful, uncontroversial task. Trusted op. I believe you will take care of any issues that might arise. — HELL KNOWZ  ▎TALK 13:59, 13 June 2011 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.

Review
I'm a bit late to the party, but I wish to further discuss this task [somehow I either overlooked the BRFA while it was going on, or I was busy IRL and I just missed it].

In theory, having a bot add backwardscopy to the various pages is a good idea. However, considering just how many damned books VDM Publishing (and its imprints such as Alphascript and Betascript) make, as well as those from Books, LLC, this will just amount to an insane quantity of talk page clutter. Especially considering how small the "actual risk" is that someone thinks Alphascript/Betascript/Books LLC were the original producers of the material, and that there are some attribution/copyvio problems with our articles.

Jarry told me this only covers books with OCLCs, but which books have OCLCs is just a matter of how successfully VDM Publishing and Books LLC have streamlined their bots to update the OCLC database. Is there really consensus for this task? Because I would certainly be opposed to systematically adding backwardscopy for these two publishers and their imprints. Headbomb {talk / contribs / physics / books} 04:26, 21 June 2011 (UTC)


 * I should point out that the current task is not to add the template to the talk pages of all 184,000 VDM books ever thought up. It is about adding the template to articles from a small subset (400) which actually exist, and which are actually owned by libraries and other individuals around the world. Did you read the discussion that prompted this bot task? - Jarry1250 [Weasel? Discuss.] 07:15, 21 June 2011 (UTC)
 * Do you mean the discussion at Village_pump_(policy)/Archive_87? Jowa fan (talk) 07:29, 21 June 2011 (UTC)
 * Yes, that's the one. - Jarry1250 [Weasel? Discuss.] 20:37, 21 June 2011 (UTC)


 * People at WP:MATH are complaining about this too. Headbomb {talk / contribs / physics / books} 07:36, 21 June 2011 (UTC)


 * I don't think that's an entirely fair characterisation of the conversation at WT:MATH. They were surprised, not complaining. - Jarry1250 [Weasel? Discuss.] 20:37, 21 June 2011 (UTC)


 * "LivingBot is adding a number of dubious tags to some article talk pages, such as this [6]. Does anyone know about this? I'm inclined to revert... Jakob.scholbach (talk) 19:33, 20 June 2011 (UTC)", and "I do find the tag annoying though. RobHar (talk) 20:01, 20 June 2011 (UTC)" are complaints. Headbomb {talk / contribs / physics / books} 20:49, 21 June 2011 (UTC)


 * Okay, well if you read down, Jakob warmed to the tags. Fine, RobHar did say the tag was annoying. But that's hardly the summation point of that thread. Best for an outside commenter to actually read it, was my point, rather than see "People are complaining" and think there's some sort of mass outpouring of hatred going on. - Jarry1250 [Weasel? Discuss.] 21:02, 21 June 2011 (UTC)

If the amount of articles to be tagged (in principle, not the amount proposed here) is too large, then perhaps a better solution is to have a bot constantly monitoring added references and post message(s) when an offending reference is actually used? — HELL KNOWZ  ▎TALK 08:50, 21 June 2011 (UTC)


 * I think what would be best to do at this point is revoke the approval (if only temporarily), and hold an RFC on what the community thinks should be done about these books. An edit filter might be sufficient, or maybe a bot reacting to one of those books being added to an article, rather than pre-emptively tagging thousands of pages with a "BTW, don't use that book as a reference, they copied from us, we didn't copy from them". If the community prefers preemptively tagging articles, then the bot could resume its task. If not, then some new bot would have to be devised. Opinions? Headbomb {talk / contribs / physics / books} 20:56, 21 June 2011 (UTC)


 * I've no great objection to an RFC. I could certainly contribute some views (though I'm away next week). "thousands" = 2140, by the way. And no, there's no need to revoke the approval. I'm not actually a two-year old. - Jarry1250 [Weasel? Discuss.] 21:12, 21 June 2011 (UTC)


 * Revoking would mostly be a bureaucratic thing so the BRFA is categorized where it should be (for archival purposes, etc...). Headbomb {talk / contribs / physics / books} 21:16, 21 June 2011 (UTC)


 * Still can't see any perceivable benefit myself (and I helped write the BRFA categorisation system), but if you want... - Jarry1250 [Weasel? Discuss.] 21:19, 21 June 2011 (UTC)

To be clear, the intent is to apply this tag to articles that are found in WorldCat, which indicates that some library has acquired and catalogued it as a "book". This is a much smaller number, as noted above. LeadSongDog come howl!  04:45, 23 June 2011 (UTC)


 * True, but it doesn't change the fact that each of these catalogued books means tagging several dozens of pages with an essentially useless message since it's incredibly unlikely that someone thinks the VDM Publishing / Books LLC are the original source of our articles, and have copyright claims over it. Suppose a library suddenly acquires 20 different books with ~20 articles in them. We now have to tag ~400 pages on account of one library's bad purchase? Now scale this worldwide with one library making such a purchase every week or so. Is it worth the clutter? Very doubtful. Headbomb {talk / contribs / physics / books} 05:26, 23 June 2011 (UTC)


 * Here are some numbers of these books in WorldCat: Books LLC (12,321), VDM Publishing (251), Alphascript (901), Betascript (312). At ~20 articles per book, that's ~275,000 articles to tag. Headbomb {talk / contribs / physics / books} 05:32, 23 June 2011 (UTC)
 * If you look a little closer at the 12K entries for Books, LLC, about 8K of them are audiobooks, which are not likely to be cited. I'm not sure where the ~20 articles/book figure arose from, it seems high. The VDM books appear mostly to have one article. But what "clutter" are we worried about? A line on a talkpage? LeadSongDog come howl!  07:25, 23 June 2011 (UTC)
 * I think the alpha/betascript books have several articles per book. If too many of these tags appear, it starts to feel meaningless. Jowa fan (talk) 12:44, 23 June 2011 (UTC)
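For readers checking the figures in this exchange, the arithmetic can be reproduced with a short sketch. The WorldCat counts and the ~20 articles per book are the numbers quoted above (the per-book figure is itself disputed in the replies), the ~8,000 audiobook exclusion is the rough figure given for Books LLC, and the names `worldcat_counts` and `estimate_pages` are hypothetical.

```python
# WorldCat entry counts as quoted in the discussion.
worldcat_counts = {
    "Books LLC": 12321,
    "VDM Publishing": 251,
    "Alphascript": 901,
    "Betascript": 312,
}

# Assumed average from the discussion; disputed as possibly too high.
ARTICLES_PER_BOOK = 20


def estimate_pages(counts, articles_per_book, excluded_books=0):
    """Estimate talk pages to tag: (total books - excluded) * articles per book."""
    return (sum(counts.values()) - excluded_books) * articles_per_book


total_books = sum(worldcat_counts.values())                   # 13785
upper_estimate = estimate_pages(worldcat_counts, ARTICLES_PER_BOOK)  # 275700
# Rough adjustment: drop the ~8,000 Books LLC entries said to be audiobooks.
adjusted_estimate = estimate_pages(
    worldcat_counts, ARTICLES_PER_BOOK, excluded_books=8000
)                                                             # 115700
```

The result is dominated by the articles-per-book assumption: if most books cover a single article, as suggested for the VDM titles, the estimate falls to roughly the number of books.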


 * I think that talk page clutter is a reasonable concern, but not really related to this bot request. If the message is too prominent, the template can be changed. CRGreathouse (t | c) 15:06, 23 June 2011 (UTC)
 * Indeed, this template could adequately serve its purpose if it was a hidden maintenance template, putting the article into a hidden maintenance category. This might be an all around improvement. LeadSongDog  come howl!  17:20, 23 June 2011 (UTC)
 * It would serve some of its many purposes that include the literal prevention of deleting our own legal content and broader issues such as informing editors that work here is routinely reprinted and alerting outsiders. Personally, I think a visually smaller template could well be ideal if we wanted to expand this out from 2100 to 5 or 10,000. - Jarry1250 [Weasel? Discuss.] 17:43, 23 June 2011 (UTC)


 * I'm not opposed to that. My immediate point is just that the community can decide how to handle that apart from this bot's approval.  The questions here should be (1) is it worth marking these articles at all, and (2) does this bot do it well.  No one seems to have questioned #2, so it comes down to #1.  I think it's worthwhile to keep track of these articles, so I believe I favor approving the bot.
 * Actually, I like your particular suggestion (categorizing by publisher) because it could lead to a different solution if it later became worthwhile: display the notice if the publisher is other than (list of the usual Wikipedia copiers here). So if some existing, otherwise-reputable publisher copied just a few books from Wikipedia, there could be a displayed notice in just those cases where there might be legitimate confusion.  Not that we'd do this right away, only that there would be an option if the need arose.
 * CRGreathouse (t | c) 17:47, 23 June 2011 (UTC)

But that's the thing, this bot's approval (or any bot's, for that matter) should be contingent on the task it's supposed to do. So what needs to be done is have some kind of RFC on the issue (WP:VP seems like a good place to have it), discussing what ought to be done, if anything even needs to be done. If the community does not see a need to tag these articles, then the bot should not be approved. If the community thinks something needs to be done, but that tagging talk pages with backwardscopy is not the solution, then the bot should not be approved either. If the community thinks tagging talk pages with backwardscopy is the solution, then the bot should be approved. A bot can only be approved to perform tasks which have consensus. Headbomb {talk / contribs / physics / books} 08:59, 25 June 2011 (UTC)
 * There was discussion at Village_pump_(policy)/Archive_87 et sequentia just a month ago, but if you think it needs to be reopened, that should be linked from the new discussion. LeadSongDog come howl!  19:03, 25 June 2011 (UTC)


 * There was a general "these are shoddy publishers, even though it's legally permitted" feeling, along with people being glad that Amazon was now taking steps against Alphascript and Betascript etc... at the VP. There was very little discussion about what should actually be done or what the best recourse would be, the BRFA wasn't advertised, and people are complaining about the bot. Also, what new discussion are you referring to? Headbomb {talk / contribs / physics / books} 20:39, 25 June 2011 (UTC)


 * I'm rapidly tiring of this discussion. Enough chat, someone open an RfC or something if we're really concerned about 2100 articles and talkpage clutter. - Jarry1250 [Weasel? Discuss.] 21:15, 25 June 2011 (UTC)


 * @Headbomb, you suggested an RFC in your 08:59, 25 June 2011 (UTC). I pointed out it should link to the prior discussion. By your 20:39, 25 June 2011 (UTC) it seems you stopped reading rather too soon in the VPP archive linked. LeadSongDog come howl!  02:29, 26 June 2011 (UTC)