Wikipedia:Bots/Requests for approval/Lightbot 14


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Lightbot 14
Operator:

Automatic or Manually assisted: Automatic supervised

Programming language(s): AWB, monobook, vector, manual

Source code available: Source code for monobook or vector are available. Source code for AWB will vary but versions are often also kept as user pages.

Function overview: Janitorial edits to units that contain at least one unit of mass e.g. 30 stones, 50 lb/min.

Links to relevant discussions (where appropriate): This request duplicates the 'units of measure' section of Bots/Requests for approval/Lightbot 3. That BRFA was very similar to the two previous approvals: Bots/Requests for approval/Lightbot and Bots/Requests for approval/Lightbot 2.

A relevant guideline is at: The guideline is stable and has existed in various forms for a long time. Other editors and I have done many edits along these lines over a long period. Examples of such conversions exist in contributions list but it would be easier just to demonstrate with new edits.
 * mosnum - Unit symbols "Where English-speaking countries use different units for the same measurement, follow the "primary" unit with a conversion in parentheses."

Edit period(s): Multiple runs. Often by batch based on preprocessed list of selected target articles.

Estimated number of pages affected: Individual runs of tens, or hundreds, or thousands.

Exclusion compliant (Y/N): Yes, will comply with 'nobots'

Already has a bot flag (Y/N): No

Function details: For units that contain at least one unit of mass:
 * 1) Edits may add conversions to units e.g. "The engine weighs 160 pounds" -> "The engine weighs 160 pounds (73 kg)"
 * 2) Edits may edit the format or spelling e.g. "200 Tons (180 MT)" -> "200 short tons (180 t)"
 * 3) Edits may add or remove links e.g. "160 lb" ->  "160 lb". This will be in accordance with Link.
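
As an illustration of item 1 above, here is a minimal sketch of how a bare pound value could be wrapped in the convert template. It is not the bot's actual AWB or script code; the pattern, the function name `add_mass_conversion`, and the choice to skip comma-grouped numbers and values already followed by a parenthesis are all assumptions made for this example. It also makes no attempt to tell pounds of mass from pounds sterling or pounds-force, which is the kind of case the discussion below leaves to human review.

```python
import re

# Hypothetical pattern (an assumption for this sketch, not the bot's real rule):
# a plain number followed by "pounds", "lb" or "lbs", where
#   - the number is not part of a longer figure such as "1,600" or "1.5"
#   - the value is not already followed by a parenthesised conversion.
POUNDS = re.compile(r'(?<![\d,.])(\d+(?:\.\d+)?)\s+(?:pounds|lbs?)\b(?!\s*\()')


def add_mass_conversion(text: str) -> str:
    """Wrap bare pound values in {{convert}}, leaving converted values alone."""
    return POUNDS.sub(r'{{convert|\1|lb|kg}}', text)


print(add_mass_conversion("The engine weighs 160 pounds"))
```

Being conservative about what the pattern matches means some convertible values are missed, but it lowers the risk of false positives in an unattended run.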

Discussion
Please can we move to a 50 edit trial? Lightmouse (talk) 10:46, 16 April 2011 (UTC)

Some questions -
 * 1) How will you determine the precision of the conversion to use?
 * 2) Where the unit is ambiguous (e.g. "pounds" or "tons") how will you determine which is meant?
 * 3) Why would you change the format of units, and what would you do if there was an objection to a change you make?
 * 4) What will you do if you find an article that uses a mix of metric and imperial units (I've seen this, including ones that use the convert template with 'imperial (metric)' and 'metric (imperial)')? Thryduulf (talk) 12:14, 23 April 2011 (UTC)


 * Precision: The 'convert template' will usually be used with default precision. If you want to know more details about how it works, feel free to ask at Template talk:Convert. In many cases, this is a match, or +/- 1 significant figure. With the template conversion in place, it's easy for an editor to adjust precision.
 * Ambiguous units. In many cases of 'ambiguous units', there is no ambiguity in the context. For example, it's almost always easy to see when the author writes 'gallon' but means the US gallon because it's in a US article about a US topic using US sources. Ambiguity will be avoided where the ambiguity is real.
 * Reasons for format change. It's not possible to use the convert template *without* adopting a standard format in accordance with guidelines. Non-template and template conversions will be (as far as I know) consistent. If somebody disagrees with the format used by the convert template, then the convert template will have to come out. But the issues are usually trivial or esoteric, for example the addition of 'US' to gallon and/or the use of upper and lower case. From time to time, new variations on these issues do crop up and I've started many discussions myself in the relevant guideline pages following feedback on a conversion.
 * Mix of metric and non-metric. The bot isn't designed to resolve mixed units and has no provision for it in the code. Articles often contain primary metric alongside primary non-metric - sometimes it's for a good reason (such as mixing miles and metres on transport, as you suggest. I'm aware of this), sometimes it's not. Over the years using automation and seeing lots of articles pass in front of me, I have noticed suboptimal unit sequences and responded with human edits in either the Lightbot or Lightmouse accounts. So the option is useful, but is a low priority for Lightbot.
 * BAG assistance needed. Please can we move to a 50 edit trial? Lightmouse (talk) 09:54, 25 April 2011 (UTC)
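
To make the precision answer above more concrete, here is a rough sketch of rounding a converted value to the same number of significant figures as the input. This is a simplification chosen for illustration and is not the convert template's own default-precision logic, which is documented on the template's pages. The helper names `count_sig_figs`, `round_sig` and `pounds_to_kg` are hypothetical, and treating trailing zeros in whole numbers as not significant is an assumption.

```python
import math

LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound in kilograms


def count_sig_figs(number_text: str) -> int:
    """Count significant figures in a plain decimal string such as '160' or '2.50'."""
    digits = number_text.lstrip('+-')
    if '.' in digits:
        # Trailing zeros after a decimal point count; leading zeros do not.
        return len(digits.replace('.', '').lstrip('0')) or 1
    # Assumption: without a decimal point, trailing zeros are not significant.
    return len(digits.strip('0')) or 1


def round_sig(value: float, figures: int) -> float:
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, figures - 1 - exponent)


def pounds_to_kg(number_text: str) -> float:
    """Convert pounds to kilograms, matching the input's significant figures."""
    return round_sig(float(number_text) * LB_TO_KG, count_sig_figs(number_text))


print(pounds_to_kg("160"), pounds_to_kg("240"))
```

Under these assumptions "160" and "240" are both read as two significant figures, so the results are rounded to 73 and 110 kg respectively.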


 * Recused  MBisanz  talk 01:50, 4 May 2011 (UTC)


 * Waiting for clarifications on Lightbot 7 and 12. Headbomb {talk / contribs / physics / books} 15:51, 4 May 2011 (UTC)


 * with the same terms as LightBot 12 (aka, edits which introduce conversions MUST be reviewed against WP:MOSCONVERSIONS to ensure they are not unwanted). Spelling/formatting/overlinking is uncontroversial copy-editing. Headbomb {talk / contribs / physics / books} 09:11, 10 May 2011 (UTC)

See Trial edits. Edit summary is 'L14. Edits to terms that contain at least one unit of mass' Lightmouse (talk) 18:29, 13 May 2011 (UTC)


 * André the Giant
 * Changing 119 12-ounce beers in 6 hours to 119 12-US-fluid-ounce (350 ml; 12 imp fl oz) beers in 6 hours is not my idea of a sane conversion
 * I agree. I kept the code simple by using the default settings for the template. By default, it converts US fluid ounce to both ml and UK fluid ounce. That seems unnecessary to me for the following reasons:
 * The ml value is sufficient, even in the UK.
 * UK and US fluid ounce values are very close. They're identical in many instances, such as this.
 * If you agree with me, I'll update the code so it drops the UK fluid ounce i.e. the outcome should be 119 12-US-fluid-ounce (350 ml) beers in 6 hours. Lightmouse (talk) 20:35, 20 May 2011 (UTC)


 * I know at this point this is only testing mass-related stuff, but ideally you'd change a height of 6'3" and weight of 240 pounds by age 12 to a height of 6'3" (1.91 m) and weight of 240 pounds (110 kg) by age 12.
 * Yes. I'll have a generic piece of code that does many units in one pass and is safe for using on many articles with low rates of human intervention. The more challenging cases (e.g. ounce, gallon, barrel) are best done using specific code with a human focussed and sensitised to the specific anomalies associated with those units. Lightmouse (talk) 20:35, 20 May 2011 (UTC)

More feedback later. Headbomb {talk / contribs / physics / books} 19:06, 20 May 2011 (UTC)


 * Thanks for responding. Lightmouse (talk) 20:35, 20 May 2011 (UTC)

I've been doing lots of pages with lengths but omitting weights. Please can we make progress on this to reduce the requirement for multiple passes? Lightmouse (talk) 15:42, 9 June 2011 (UTC)


 * Perhaps we should drop the imperial measures in the defaults. The above made me think. Does the bot recognise " 6'3" "? JIMp talk·cont 23:10, 13 June 2011 (UTC)


 * The bot doesn't currently recognise 6'3" but it's in the plan. Instances like 6 feet 3 inches are much easier.
 * It seems to me that we're in the desirable position of discussing tasks for the bot, rather than merely approval. If so, can we move to approval? Lightmouse (talk) 10:58, 14 June 2011 (UTC)
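
For readers wondering what recognising both 6'3" and 6 feet 3 inches might involve, here is a small sketch. It is not taken from the bot; the pattern and the function name `height_to_metres` are hypothetical. It assumes straight ASCII quote marks (typographic prime characters are not handled) and uses decimal arithmetic so that 1.905 rounds up to 1.91 as in the example discussed above.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical pattern: feet marked by ' or a word, inches marked by " or a word.
FEET_INCHES = re.compile(
    r"""(?P<feet>\d+)\s*(?:'|(?:feet|foot|ft)\b)\s*(?P<inches>\d+)\s*(?:"|(?:inches|inch|in)\b)"""
)

INCH_IN_METRES = Decimal("0.0254")  # exact definition of the inch


def height_to_metres(text: str):
    """Return the first feet-and-inches height in text as metres, or None."""
    match = FEET_INCHES.search(text)
    if match is None:
        return None
    total_inches = int(match["feet"]) * 12 + int(match["inches"])
    metres = Decimal(total_inches) * INCH_IN_METRES
    return metres.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(height_to_metres('a height of 6\'3" by age 12'))
print(height_to_metres("a height of 6 feet 3 inches"))
```

The quote-mark form is the harder one in practice because ' and " also appear as ordinary quotation marks and in wiki markup for italics and bold, which this sketch does not guard against.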

I'd approve, but I want to see how the new code works before doing so. Headbomb {talk / contribs / physics / books} 16:26, 15 June 2011 (UTC)

Done. See contributions. Summary is ''L14. Contains at least one unit of mass''. Avoiding 'foot pounds', 'pounds force', 'pounds per square inch' is almost the same effort as converting them properly. I'll create an application for 'units of torque, force, and pressure' unless you want to make them in scope of this one. What do you think? Regards Lightmouse (talk) 18:17, 15 June 2011 (UTC)


 * Here "3lbs." is converted to "3 lb." when it should be "3 lb"
 * seems to forget the non-breaking spaces between numbers and units such as "3 km"
 * It works well on biographies etc..., but I was more worried about the science-type of articles. Any way you could get more sciency-type of articles for trial? Headbomb {talk / contribs / physics / books} 18:38, 15 June 2011 (UTC)


 * Concerning the force/torque/pressure units, I'd file a separate BRFA so the logic can be trialed and reviewed on its own page. If everything goes well it's just a little bit more bureaucratic, but if things go wrong it's easier to deal with it on its own page. Headbomb {talk / contribs / physics / books} 18:41, 15 June 2011 (UTC)


 * Taking the two feedback examples in turn:
 * Here "3lbs." is converted to "3 lb." when it should be "3 lb"
 * I agree. It's on my wishlist to eliminate the period but I don't think I'll ever work out how. A period may be an indication of the abbreviation or it may be punctuation. I can't distinguish the two without making false positives.
 * Non-breaking spaces. It hasn't forgotten it. That example uses numbers-as-words and I don't use the template, I have to specify the text. I've just never programmed for nbsp. Along with a minority of users, I don't like them as a way to control wrapping. The template has them by default and when I had AWB general fixes switched on, it added them.
 * I'll file another BRFA for other units. Lightmouse (talk) 19:20, 15 June 2011 (UTC)
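
The difficulty with the trailing period can be illustrated with a deliberately conservative heuristic. This is only a sketch of one possible approach and is not something the bot is stated to do: it drops the period after "lbs" or "lb" only when the next word starts with a lower-case letter, and otherwise keeps it because it may be sentence punctuation. The pattern and the name `normalise_lb` are hypothetical, and the heuristic would still misjudge cases such as a following proper noun.

```python
import re

# Hypothetical pattern: number, optional space, "lb" or "lbs", and optionally
# a period that is consumed only if a lower-case word follows it.
ABBREV = re.compile(r'\b(\d+)\s*lbs?\b(\.(?=\s+[a-z]))?')


def normalise_lb(text: str) -> str:
    """Rewrite '3lbs' style values as '3 lb', dropping a mid-sentence period."""
    return ABBREV.sub(r'\1 lb', text)


print(normalise_lb("Add 3lbs. of flour"))
print(normalise_lb("It weighs 3lbs. The next one is heavier."))
```

In the second call the period is kept, because the capital letter that follows suggests the end of a sentence.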


 * I was responding to your interest in the combination of human height and weight by targeting the trial at human articles. I'd be happy to run another trial on 'science-type' articles. I've created Lightbot 15 to address a whole range of other units. Lightmouse (talk) 19:20, 16 June 2011 (UTC)


 * Alright, then let's go for another 50. Headbomb {talk / contribs / physics / books} 19:27, 16 June 2011 (UTC)

Done. I've tried to make it add conversions where usage was 'sciency' but it wasn't always easy to confine it to just that. If you want me to have another go, I can. See contributions as of now. Edit summary is ''L14b. Contains at least one unit of mass''. Regards Lightmouse (talk) 10:41, 22 June 2011 (UTC)


 * Works well enough with pounds, but these were the simpler case. I was more curious about fluid ounces, tons, etc... i.e. things that could be ambiguous. Headbomb {talk / contribs / physics / books} 19:47, 28 June 2011 (UTC)


 * I share your caution about ounces and tons. That's why, over the years, I've not done much about those units. In the cases where I have touched them, it's only with major human intervention and/or dedicated process/code. I'll run 50 edits with some examples. Lightmouse (talk) 11:43, 29 June 2011 (UTC)

Done. See contributions as of now. Edit summary is ''L14. ounce or L14. ton''. As I expected, these took a lot of time and effort to set up and needed a lot of human involvement during operation to capture the various versions of the units and avoid false positives. I'm hoping the trials have demonstrated the actual and potential capabilities. Lightmouse (talk) 14:29, 30 June 2011 (UTC)


 * BAG assistance needed Lightmouse (talk) 18:01, 5 August 2011 (UTC)
 * As always please be sure to take due care with the bot and be responsive to comments/feedback. -- Chris 12:24, 6 August 2011 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.