Wikipedia:Bots/Requests for approval/RscprinterBot 6


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

RscprinterBot 6
Operator:

Time filed: 20:08, Thursday September 12, 2013 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): AWB XML

Source code available: Upon request

Function overview: Converts IMDB links to templates (example edit, similar to what it would do)

Links to relevant discussions (where appropriate): Bot requests/Archive 45

Edit period(s): I have a record of being too hopeful with this field but I'll optimistically say every other day

Estimated number of pages affected: Into the thousands

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: The bot finds all pages with a link to IMDB. It then converts this link to a template; for example, http://www.imdb.com/character/ch0002425/ would be changed to a call to the IMDb character template. It works with all the IMDB templates (name, character, company, title, episode), including when the link is enclosed in single square brackets. This is especially intended for use in external links sections and references. My thanks to Betacommand for writing up the XML script I will use.
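The conversion described above can be sketched roughly as follows. This is only an illustration in Python, not the AWB XML script the operator used: the regular expression, the `TEMPLATES` mapping and the `convert` function are all hypothetical names, and the template parameter order (ID first, optional label second) is an assumption. It handles only links in single square brackets, and it leaves out the episode template because episode URLs cannot be told apart from title URLs by pattern alone.

```python
import re

# Hypothetical mapping: URL path segment -> (expected ID prefix, template name).
TEMPLATES = {
    "name": ("nm", "IMDb name"),
    "title": ("tt", "IMDb title"),
    "character": ("ch", "IMDb character"),
    "company": ("co", "IMDb company"),
}

# Matches a single-bracket external link with an optional label, e.g.
# [http://www.imdb.com/title/tt0123456/ Some label]
LINK_RE = re.compile(
    r"\[https?://(?:www\.)?imdb\.com/(name|title|character|company)/"
    r"([a-z]{2})(\d+)/?(?:\s+([^\]]+))?\]"
)


def convert(wikitext):
    """Replace bracketed IMDb links with template calls (illustrative only)."""

    def repl(match):
        kind, prefix, number, label = match.groups()
        expected_prefix, template = TEMPLATES[kind]
        if prefix != expected_prefix:
            # Path and ID prefix disagree: leave the link alone.
            return match.group(0)
        if label is None:
            # No label in the source link, so none is invented here.
            return "{{%s|%s}}" % (template, number)
        label = label.strip()
        if re.search(r"[|={}]", label):
            # These characters would break an unnamed template parameter.
            return match.group(0)
        return "{{%s|%s|%s}}" % (template, number, label)

    return LINK_RE.sub(repl, wikitext)
```

Note that a link without a label is converted without one: the pattern alone gives no way of knowing what the linked page is called.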

Discussion
This changes the visual output; in fact, the new text is significantly longer and contains another link. This by itself can break a lot of tight fields, tables, etc., not to mention the original intent of the link, unless you only convert external links. I wouldn't say this is a straightforward task, so I would say it needs wider discussion as currently proposed. — HELL KNOWZ  ▎TALK 21:27, 12 September 2013 (UTC)


 * I've been running the script as myself to test it out, and after a slightly bumpy start I've got it all working properly. I am confident it could be left to run by itself without my supervision. It is also set to do general fixes when it makes an edit, but this can be turned off if you like. As for H3llkn0wz's comment above, I welcome wider discussion and will accommodate any changes suggested by the community.  Rcsprinter   (talk to me)  @ 20:21, 13 September 2013 (UTC)


 * The example edit is not a very good edit since it does not set the title to the name of the film. -- WOSlinker (talk) 08:34, 20 September 2013 (UTC)
 * Yes, that's not a very good example. I've changed the link above to a better one.  Rcsprinter   (tell me stuff)  @ 14:46, 20 September 2013 (UTC)
 * The new example is still not very good as it's losing all the film titles and now all the links have "Cinematic Titanic at the Internet Movie Database" -- WOSlinker (talk) 15:25, 20 September 2013 (UTC)
 * OK, I've got a perfect one now.  Rcsprinter  (gas)  @ 15:43, 20 September 2013 (UTC)
 * Is the bot going to be removing the titles as in the previous example or are you going to be fixing that? -- WOSlinker (talk) 18:35, 20 September 2013 (UTC)
 * Er, I don't know exactly how that would happen given the limits of the current template, but I agree it could be a problem. Perhaps to avoid occurrences such as these it should skip links which are in text and only convert those in references and external links lists; I think that would fix the problem, as most of the ones inside the prose do not need adjusting.  Rcsprinter  (articulate)  @ 12:25, 21 September 2013 (UTC)
 * I've added back in the titles, see [//en.wikipedia.org/w/index.php?title=Cinematic_Titanic&diff=573919283&oldid=568199762] -- WOSlinker (talk) 14:49, 21 September 2013 (UTC)
 * I just remembered it does include the titles if it was there already. That Cinematic Titanic link was made before I modified the script to do it.  Rcsprinter  (chatter)  @ 15:41, 21 September 2013 (UTC)
 * I undid this edit. It gives all the references to imdb the title of the article, no matter what the subject is. I think it is better with an unnamed link, instead of an incorrectly named one. Christian75 (talk) 21:15, 22 September 2013 (UTC)

It's pretty clear you need to retrieve the title from IMDB itself if there isn't any given, otherwise the template has no way of knowing what the title is. Edits like above or [//en.wikipedia.org/w/index.php?title=San_Francisco_International_Airport_%28TV_series%29&diff=prev&oldid=572869809 like] [//en.wikipedia.org/w/index.php?title=Star_Wolf_%28TV_series%29&diff=prev&oldid=572869904 this] aren't acceptable. I mentioned this already, but you also cannot just replace them inline with different output, possibly breaking prose [//en.wikipedia.org/w/index.php?title=The_Giant_Spider_Invasion&diff=prev&oldid=572869682 like] [//en.wikipedia.org/w/index.php?title=Cinematic_Titanic&diff=prev&oldid=572869588 here]. By the way, you have to repair or revert any remaining errors, or other editors end up doing it for you as above [//en.wikipedia.org/w/index.php?title=Cinematic_Titanic&diff=next&oldid=573842941]. (I'm not sure why you haven't already, as a bot operator you should be well aware of this, especially after running a script without trial.) Also, have you advertised this task somewhere, as you haven't posted any links? — HELL KNOWZ  ▎TALK 21:28, 22 September 2013 (UTC)
 * Right, for your first two links I can safely say that these will be fixed when the script runs as part of the process. This removes all possibility of ending up with something that reads "xx at the Internet Movie Database at the Internet Movie Database".
 * For the third and fourth links you gave that happened before I amended the script as you can see above (my 15:41 21/09/13 comment).
 * Sorry for forgetting to go round fixing errors, it is a while since I started a new task and needed to check for things going wrong. I'm going to look through right now.
 * I did a little advertising at the Village Pump here and at Template talk:IMDb name.  Rcsprinter  (post)  @ 19:37, 23 September 2013 (UTC)
 * Finished correcting any that went wrong. There were a few more than I thought, but I can be certain that the script works well enough now so as not to make any more false positives. Should I make another "dry run" using my own account, monitoring it closely?  Rcsprinter  (converse)  @ 20:52, 23 September 2013 (UTC)
 * I'd say so, yes. Make sure it really works. Unless you already did, which it sounds like from the "I can be certain that the script works well enough now so as not to make any more false positives" comment? Anomie⚔ 00:08, 17 October 2013 (UTC)
 * {{BAGAssistanceNeeded}} Another 50 done. Please check for mistakes and give some kind of indication as to whether the task can run.  Rcsprinter  (gossip)  @ 19:25, 20 October 2013 (UTC)
 * Looks like you still have bugs. [//en.wikipedia.org/w/index.php?title=Ray_MacDonnell&diff=prev&oldid=577998421 This edit], for example, created a link in ref 1 saying "" when it should have been something along the lines of "". Anomie⚔ 20:02, 20 October 2013 (UTC)
 * Or with link to "A Little Bit of Heaven" titled as "Romany Malco". You clearly are not browsing the actual IMDB site and since the bot cannot parse context, I don't see how this can ever be error free. —  HELL KNOWZ  ▎TALK 20:57, 20 October 2013 (UTC)
 * There are a couple that slipped through the cracks - a link to a title on a "name" (bio) article shouldn't have got edited. This wasn't happening for all the rest of the edits in that 50, so the filter must be working. I'm looking into it, but are they the only mistakes you can find?  Rcsprinter   (talk to me)  @ 21:01, 20 October 2013 (UTC)
 * What does your filter do exactly? How does it know when the page title does or doesn't match the IMDB link? — HELL KNOWZ  ▎TALK 21:02, 20 October 2013 (UTC)

The filter doesn't go off page titles; it checks whether the article has key things such as persondata and life dates. If it can find those, it assumes the article is about a person, and so should have a "name" imdb template, and does not make any changes because, as you have linked to, the imdb "title" page would show up in a link with the person's name. Until I find a way to read the site and add the captions in automatically, skipping those pages using this filter is the best I can do.  Rcsprinter  (chat)  @ 16:47, 21 October 2013 (UTC)
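A minimal sketch of the kind of filter described here, in Python rather than AWB XML. The function names and the exact markers are assumptions: the comment above mentions only "persondata and life dates", so the birth/death and "Living people" category patterns below are guesses at what "life dates" might mean in wikitext.

```python
import re

# Assumed markers that an article is a biography (not the operator's actual list).
PERSON_MARKERS = [
    re.compile(r"\{\{\s*Persondata\b", re.IGNORECASE),
    re.compile(
        r"\[\[\s*Category:\s*\d{3,4}s? (?:births|deaths)\s*(?:\|[^\]]*)?\]\]",
        re.IGNORECASE,
    ),
    re.compile(
        r"\[\[\s*Category:\s*Living people\s*(?:\|[^\]]*)?\]\]",
        re.IGNORECASE,
    ),
]


def looks_like_person(wikitext):
    """Return True if any assumed biography marker is present."""
    return any(pattern.search(wikitext) for pattern in PERSON_MARKERS)


def should_skip(wikitext, link_kind):
    """Skip "title" links on biographies, where the caption would be wrong."""
    return link_kind == "title" and looks_like_person(wikitext)
```

A filter like this only recognises biographies that carry one of the markers, so a person article lacking all of them would still be edited; it reduces the error rate rather than eliminating it.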

Are you still planning on pursuing this? Josh Parris 10:20, 5 November 2013 (UTC)
 * Yup. I can't do anything without approval though.  Rcsprinter  (cackle)  @ 15:40, 5 November 2013 (UTC)
 * Let's see if you've squashed all the bugs. Josh Parris 06:31, 6 November 2013 (UTC)


 * It did 25 "name"s and 25 "title"s. I am confident there are no errors.  Rcsprinter   (talk to me)  @ 14:40, 9 November 2013 (UTC)
 * [//en.wikipedia.org/w/index.php?title=Eric_Bana&diff=prev&oldid=580898996 This edit] is an error of the same type as before. Anomie⚔ 18:27, 9 November 2013 (UTC)
 * I'm not sure where this BRFA is going, because Rcsprinter123 said he can't implement actual browsing of the urls, and without browsing the urls, the bot can't know for sure where the links lead. Despite trials and filters in place, this keeps coming up and certainly will come up again. The task is marked "Supervised" (i.e. every edit is human-reviewed at some point), but it doesn't seem the botop has actually supervised the edits. If you don't do this even during trial, how can we expect you will after approval? You haven't even fixed all the previous errors. — HELL KNOWZ  ▎TALK 18:36, 9 November 2013 (UTC)
 * I tell you, I looked at every edit as it was being made. The filter is working, because there is one error and not 20. It just seems to not be running at 100% accuracy.  Rcsprinter  (constabulary)  @ 21:16, 9 November 2013 (UTC)
 * False positives are only acceptable at a very low rate for automated tasks when they cannot be reasonably foreseen. Your rate is not low, your task is not fully automated, and your error has been foreseen before and you yourself said you cannot fix it. Unless you can, you are asking BAG to approve a bot that we know runs with errors. — HELL KNOWZ  ▎TALK 21:26, 9 November 2013 (UTC)
 * I will get an AWB plugin made so that it can read the pages it's linking to and reduce any margin for error. I will reopen this BRFA when that works. Until then, withdrawn by operator.  Rcsprinter   (shout)  @ 21:49, 9 November 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.