Wikipedia:Bots/Requests for approval/Lightbot 15


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved

Lightbot 15
Operator:

Automatic or Manually assisted: Automatic supervised

Programming language(s): AWB, monobook, vector, manual

Source code available: Source code for monobook or vector is available. Source code for AWB will vary but versions are often also kept as user pages.

Function overview: Janitorial edits to selected units.

Links to relevant discussions (where appropriate): This request contains similar functionality to:
 * Bots/Requests for approval/Lightbot 13 (Janitorial edits to units that contain at least one unit of length, area, or volume).
 * Bots/Requests for approval/Lightbot 6 (Delink common units of measurement)

A relevant guideline is at: The guideline is stable and has existed in various forms for a long time. Other editors and I have done many edits along these lines over a long period. Examples of such conversions exist in contributions list but it would be easier just to demonstrate with new edits.
 * mosnum - Unit symbols "Where English-speaking countries use different units for the same measurement, follow the "primary" unit with a conversion in parentheses."

Edit period(s): Multiple runs. Often by batch based on preprocessed list of selected target articles.

Estimated number of pages affected: Individual runs of tens, or hundreds, or thousands.

Exclusion compliant (Y/N): Yes, will comply with 'nobots'

Already has a bot flag (Y/N): No

Function details: Units will contain at least one unit of:
 * amount e.g. mole
 * angle e.g. degree
 * capacitance e.g. farad
 * charge e.g. coulomb
 * current e.g. amp
 * energy e.g. joule
 * force e.g. newton
 * frequency e.g. hertz
 * length, area, volume e.g. metre
 * light e.g. candela, lumen, lux
 * mass, weight e.g. kilogram
 * pressure e.g. pascal
 * potential e.g. volt
 * power e.g. watt
 * resistance e.g. ohm
 * temperature e.g. degree Celsius
 * time e.g. second
 * torque e.g. N·m


 * 1) Edits may add conversions to units e.g. "The engine output was 160 hp" -> "The engine output was 160 hp"
 * 2) Edits may change text or template and change format or spelling e.g. "200,000 Kw" -> "200 MW"
 * 3) Edits may add or remove links e.g. "The supply was 230 volts" ->  "The supply is 230 volts". This will be in accordance with Link.

In many cases, the convert template will be used. In some cases plain text will be used.
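As a rough illustration of edit types 1 and 2 above, the following Python sketch appends a kilowatt conversion to a plain horsepower value and normalises a miscapitalised kilowatt figure. The function names, the regular expressions, and the choice of mechanical horsepower (about 745.7 W) are assumptions made for this example; nothing here is taken from Lightbot's AWB settings or scripts.

```python
import re

# Assumption: mechanical (imperial) horsepower, roughly 745.7 W.
HP_TO_KW = 0.7457

NUMBER = r"(\d[\d,]*(?:\.\d+)?)"


def add_hp_conversion(text):
    """Append a kW conversion to 'N hp' values that do not already have one."""
    def repl(m):
        value = float(m.group(1).replace(",", ""))
        return f"{m.group(0)} ({value * HP_TO_KW:.0f} kW)"
    # The negative lookahead skips values already followed by a parenthesis.
    return re.sub(r"\b" + NUMBER + r" hp\b(?! \()", repl, text)


def normalise_kw(text):
    """Rewrite 'Kw' as 'kW', or as 'MW' when the value is a whole number of thousands."""
    def repl(m):
        value = float(m.group(1).replace(",", ""))
        if value >= 1000 and value % 1000 == 0:
            return f"{value / 1000:.0f} MW"
        return f"{m.group(1)} kW"
    return re.sub(r"\b" + NUMBER + r" Kw\b", repl, text)
```

With these assumptions, `add_hp_conversion("The engine output was 160 hp")` gives "The engine output was 160 hp (119 kW)" and `normalise_kw("200,000 Kw")` gives "200 MW". On Wikipedia itself the convert template would normally do the arithmetic and rounding.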

Discussion

 * This is a suitable expansion on Lightbot 5 (units), not Lightbot 3 (units and dates) or Lightbot 2 (units and dates).
 * What exactly is there to convert with resistances/capacitance/frequency/etc... and many other such quantities? AFAIK, moles are unique, conversions between radians and degrees are undesirable, electrical units (amp, farad, ohm, volt, ...) are also unique (inasmuch as you don't want to convert between coulombs and statcoulombs, or similar), etc... Is this a request to perform janitorial edits on these units, or to do conversions? Headbomb {talk / contribs / physics / books} 19:24, 16 June 2011 (UTC)


 * I'd be happy to combine or merge any existing or proposed applications. They're all beginning to look very similar. That's no surprise because the issues of conversion, format, spelling etc are similar.
 * As the 'function details' section says, I'll be looking for scope to add conversions, change text/template and format/spelling plus link/delink. I gave each quantity an example SI unit because there's only one SI unit. But most of my time will be on non-SI units such as 'btu', 'ft.lb', 'psi' etc.
 * I haven't based it on demand, I just copied a list of unit types. I'm worried that if I waited until I encountered a requirement, it would involve extra effort from BAG and me, plus weeks/months of time.
 * Lightmouse (talk) 19:44, 16 June 2011 (UTC)


 * If you wanted, we could replace all prior applications (approved or not) with this one. We'd only need to include length, area, volume, mass. Lightmouse (talk) 19:59, 16 June 2011 (UTC)


 * Oppose " we could replace all prior applications (approved or not) with this one." Yes, we could; that's precisely the problem. This is another "Lightbot can do whatever Lightmouse wants" application, on the basis of a MOSNUM provision maintained by revert-war.
 * Headbomb, please stop "approving" these; it's a conflict of interest. Septentrionalis PMAnderson 21:45, 20 June 2011 (UTC)

A COI? That's a new one. Anyway, per the ARBCOM restrictions, Lightmouse is permitted to run a bot for one single task [or one group of closely-related tasks, with BAG having the discretion on what exactly "closely-related tasks" mean]. Lightbot was approved to perform janitorial edits to units, and this falls exactly under that scope, and is as far removed from "doing whatever Lightmouse wants" as is imaginable. Each expansion on that task gets its BRFA, is trialled to review the bot's behaviour and ensure the MOS compliance of the edits as well as their appropriateness, problematic stuff is corrected, then re-trialled, and so the cycle goes until the bot is performing satisfactorily.

If there's an actual problem with Lightbot's edits, raise the issue. If not, then this discussion is rather moot and a waste of everyone's time. Headbomb {talk / contribs / physics / books} 03:49, 21 June 2011 (UTC)


 * Anyway, digressions aside, let's move to trial. I'm giving you 25 edits per "type" of unit so the code can be tested (aka, 25 for amount, 25 for angles, 25 for capacitance, 25 for charge, etc...). Headbomb {talk / contribs / physics / books} 03:55, 21 June 2011 (UTC)

 * amount e.g. mole
 * capacitance e.g. farad
 * charge e.g. coulomb
 * current e.g. amp
 * resistance e.g. ohm
I'll park the struck-out ones. Lightmouse (talk) 17:27, 3 July 2011 (UTC)
 * angle e.g. degree Done
 * energy e.g. joule Done
 * force e.g. newton Done
 * frequency e.g. Hz Done
 * light e.g. candela, lumen, lux Done
 * pressure e.g. pascal Done
 * potential e.g. volt Done
 * power e.g. watt Done
 * time e.g. second Done
 * torque e.g. N·m Done


 * For completeness, I've added length, area, volume, mass, weight. Lightmouse (talk) 09:20, 7 July 2011 (UTC)
 * Also units of temperature. Lightmouse (talk) 18:02, 16 July 2011 (UTC)


 * Where are the results of the trial? Jc3s5h (talk) 11:55, 25 July 2011 (UTC)


 * See the trial results. The edit summary is prefixed by "L15. ". Regards Lightmouse (talk) 15:15, 25 July 2011 (UTC)

arbitrary break
Reports overview. In the reports below, I do not attempt to list every instance of an error; several of these kinds of errors occur in multiple edits. Jc3s5h (talk) 13:21, 25 July 2011 (UTC)

Error report. In this edit the bot incorrectly changes Volt to volt. The capitalized version is correct because (a) it is the title of a Wikipedia article and even if there is an error in a title, it should probably be preserved, and (b) SI unit names that are ordinarily lowercase become capitalized when used or placed in a way that any other common noun would be capitalized, including when it is the first word of a title. I suspect this could be a fundamental weakness in the bot, in failing to distinguish titles and first words of sentences from other uses and placements. Jc3s5h (talk) 12:06, 25 July 2011 (UTC)


 * I agree with you. The bot was looking for upper/lower case errors for volt. It detected '12345 Volt' and assumed it was a voltage value rather than a link. It shouldn't have been changed. Your assessment is correct. It needs to be addressed prior to a proper bot run. Lightmouse (talk) 15:15, 25 July 2011 (UTC)

Failure to completely fix. In this edit the bot changed kv-a to kV-a rather than the kV·A or kV A allowed by NIST Special Publication 811, or the kVA allowed by ISO 31-0. Jc3s5h (talk) 12:18, 25 July 2011 (UTC)


 * The bot was looking for upper/lower case errors involving volt. The target 'kv' was correctly identified and replaced with 'kV'. I'm glad of the failure-to-fix feedback that the more complicated volt-ampere errors are also in need of fixing. I'll add it to the wishlist and may do targeted runs built on experience and confidence with volt. I hope that's ok. Lightmouse (talk) 15:15, 25 July 2011 (UTC)

 Probable error report Failure to completely fix. In this edit the bot changes "30 degrees/second" to "30 degree/second". It is normal to use plurals of unit names when the units are spelled out, so this looks wrong to me. NIST Special Publication 811 page 32 indicates this should be degree per second. Jc3s5h (talk) 12:24, 25 July 2011 (UTC)


 * I actually did that as a manual edit. MOS says:
 * When unit names are combined by division, separate them with per (e.g., meter per second, not meter/second).
 * So it should have been "30 degrees per second" or a fully symbolic form. So "30 degrees/second" and "30 degree/second" are both wrong. I don't care much right now about degrees of angle, but permission is necessary to correct some editor errors where degrees of temperature and degrees of angle are confused. Lightmouse (talk)

Error report. In this edit the bot changes the word degrees in a URL. Jc3s5h (talk) 12:33, 25 July 2011 (UTC)


 * Oops. I was struggling to find examples of 'degree' errors and I switched off the section of code that protects URLs. Thanks for letting me know.
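The discussion does not show how the bot's URL protection works. One common approach, sketched below in Python under that caveat, is to split the text on URLs and apply the edit only to the pieces in between. The pattern `URL_RE` and the function `edit_outside_urls` are hypothetical names for this example, not part of Lightbot.

```python
import re

# Assumption: a URL runs from http:// or https:// up to whitespace or a
# wikitext delimiter. The capturing group makes re.split keep the URLs.
URL_RE = re.compile(r"(https?://[^\s\]|<>]+)")


def edit_outside_urls(text, edit):
    """Apply edit() to every stretch of text that is not part of a URL."""
    parts = URL_RE.split(text)
    # With one capturing group, odd-indexed parts are the matched URLs.
    return "".join(p if i % 2 else edit(p) for i, p in enumerate(parts))
```

For example, passing `lambda s: s.replace("degrees", "degree")` as `edit` changes "30 degrees" in running text but leaves `http://example.org/degrees/page` intact.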

Error report. In this edit the bot changes "candlepower" to "candela" but does not make the corresponding change to the abbreviation "cp" later. Since the audience for this article probably isn't very knowledgeable about light units, this change makes the article more confusing for readers. Jc3s5h (talk) 13:01, 25 July 2011 (UTC)


 * Good spot. I wasn't aware the abbreviation 'cp' was used on Wikipedia. Thanks. Lightmouse (talk) 15:15, 25 July 2011 (UTC)

Inability to deal with erroneous input. In this edit the input is "a 210 calorie cut" (referring to pizza). It should have been written "a 210-calorie cut". The bot changed it to "a 210 calories (880 kJ) cut." Jc3s5h (talk) 13:13, 25 July 2011 (UTC)


 * I think you're saying failure to fix. Yes, I don't fix all types and permutations of errors that editors make. I'd like to and as I gain experience and confidence, I'll build out from basic fixes. Lightmouse (talk) 15:15, 25 July 2011 (UTC)

Error report. In this edit a conversion is provided for BTU but the abbreviation BTU is really HVAC industry jargon for BTU/h, so the conversion should have been to kW. Jc3s5h (talk) 13:21, 25 July 2011 (UTC)


 * That one led to quite an extensive discussion. In some of the articles linked from that page, the solution is to eliminate the jargon and say BTU/h when that's what's meant. I think that's better solution for a non-specialist publication like WP. I did ask for community input as to whether the conversion template could be updated to cope with the jargon and convert BTU into kW. I think the conclusion was to use BTU/h instead. I don't know where people are with that one but it did generate interesting discussion well beyond my scope. Lightmouse (talk) 15:15, 25 July 2011 (UTC)
 * Jargon might or might not be tolerated, but it is unacceptable to blindly suppose that because the US HVAC community uses "BTU" as jargon for BTU/h, the HVAC community in other countries uses kJ as jargon for kJ/h. Thus if the bot can't distinguish the jargon BTU from the actual BTU, it should not attempt to convert BTU at all. Jc3s5h (talk) 17:49, 25 July 2011 (UTC)

Questionable edit. In this edit the bot changes "The amount of sustained power required of the pilot is around one horsepower" to "The amount of sustained power required of the pilot is around 1 horsepower (0.75 kW)." Deciding whether a single-digit number should be spelled out or a numeral is complex, as described at WP:ORDINAL. I doubt a bot can be programmed to deal with such complexity. Jc3s5h (talk) 13:33, 25 July 2011 (UTC)


 * It's easier to do that way. If the consensus is that "1 horsepower (0.75 kW)" isn't permitted as a solution, then I may have to stop doing such conversions. So be it but it'd be a shame. Tackling numbers-as-words involves huge amounts of code if I'm to avoid false positives. Lightmouse (talk) 15:15, 25 July 2011 (UTC)

Error in production version. In this edit the abbreviation "fps" is replaced with "frame/s". If one wants to consider "f" to be a valid abbreviation, it should be "f/s", otherwise, it should be "frame per second". This should be fixed in the new version. Jc3s5h (talk) 13:45, 25 July 2011 (UTC)


 * Ah. On Wikipedia we have multiple versions of 'fps': first person shooter; feet per second; frames per second. I've even seen articles use more than one meaning within the same article. It doesn't matter much for specialist magazines, but conflicting abbreviations are less desirable for Wikipedia. Some units have acceptance (to some percentage) within all communities that the full and abbreviated forms can be identical (e.g. 'bit'). That's particularly helpful when used within the context of potentially conflicting abbreviations. The form 'frame/s' is used just like 'bit/s' on occasion, and it seems to require no learning from non-specialists or specialists, but if the form 'f/s' were deemed better, I can use that. Lightmouse (talk) 15:15, 25 July 2011 (UTC)


 * All this feedback is welcome. If need be, feel free to move forward with some unit types and strike others out. I'm keen to move on to operation or a more extensive trial if needed. Naturally, I'm seeking this as 'enabling permission' for as wide a scope as possible. Thanks. Lightmouse (talk) 15:15, 25 July 2011 (UTC)

I would suggest, to the extent possible, leaning towards a more-extensive regular expression to confirm that the error is exactly what you expect in the way of erroneous input, and taking no action if the regular expression is not satisfied. For example, if looking for unconverted calorie(s), check there is white space before, and white space or sentence terminal punctuation after calorie(s). Then if you encounter "calorie/" or "-calorie" you will know it is something the bot doesn't know how to handle, and do nothing.

In some categories of trial edits, there were more edits that failed to completely fix the problem than successful edits. A bot that converts one error to a different error once in a great while is OK, but when the number of error-to-error edits approaches something like 10%, I think we should decide the bot shouldn't attempt that task. Jc3s5h (talk) 17:49, 25 July 2011 (UTC)
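The stricter matching suggested above could look something like the following Python sketch. The pattern, the function name, and the conversion factor (treating a dietary "calorie" as a kilocalorie, 4.184 kJ) are assumptions for illustration only and do not describe the bot's actual rules.

```python
import re

# A number, white space, then calorie(s). The number must start the text or
# follow white space; the unit must be followed by white space, sentence
# punctuation, or the end of the text. Anything else is left alone.
CALORIE_RE = re.compile(
    r"(?<!\S)(\d[\d,]*(?:\.\d+)?)\s+calories?(?=\s|[.,;:!?](?:\s|$)|$)"
)

KJ_PER_KCAL = 4.184  # assumption: dietary calorie = kilocalorie


def convert_calories(text):
    """Add a kJ conversion only where the strict pattern is satisfied."""
    def repl(m):
        value = float(m.group(1).replace(",", ""))
        return f"{m.group(0)} ({value * KJ_PER_KCAL:.0f} kJ)"
    return CALORIE_RE.sub(repl, text)
```

Inputs such as "210-calorie" or "500 calories/day" do not satisfy the pattern, so the text is returned unchanged. An unhyphenated adjectival use such as "a 210 calorie cut" would still match and would need a further check.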

General problem
We have something like 35 articles listed on Watt (surname). I daresay the surnames of other scientists who have units named after them are less common. Nevertheless, I think we are vulnerable to false positives of the form "in 1759 Watt formed a partnership with John Craig" being converted to "in 1759 W formed a partnership with John Craig".


 * Initially I didn't think to check Watts (surname). That lists about 70 more Watts. Jc3s5h (talk) 17:27, 10 August 2011 (UTC)
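One conservative guard against this kind of false positive is sketched below in Python. It abbreviates only the lowercase unit name directly after a numeric value and never touches the capitalised word. `WATT_RE` and `abbreviate_watts` are hypothetical names, and the rule is an assumption for this example rather than Lightbot's behaviour. Requiring a preceding number is not enough on its own, because "1759 Watt" also has a number before the name.

```python
import re

# Match a number, white space, then lowercase 'watt' or 'watts'.
# Capitalised 'Watt'/'Watts' never matches, so surnames are left alone.
WATT_RE = re.compile(r"\b(\d[\d,]*(?:\.\d+)?)\s+watts?\b(?![-/])")


def abbreviate_watts(text):
    """Replace 'N watt(s)' with 'N W' without touching capitalised 'Watt'."""
    return WATT_RE.sub(r"\1 W", text)
```

The trade-off is that a miscapitalised unit such as "60 Watt" is left unfixed, since without more context it cannot be told apart from the surname.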

Process issues
There are process issues that could be improved here. We've debated this issue for more than a year now. Can we try another tack? Lightmouse (talk) 13:29, 26 July 2011 (UTC)
 * Feature requests are being incorrectly described as errors.
 * Most, if not all, editors develop code as non-bots prior to an application. The Lightmouse account isn't permitted to be used to develop code. Therefore BAG and Lightbot are spending extra effort assessing undeveloped code.


 * I disagree that feature requests are being incorrectly described as errors. I can think of at least two situations where the bot could make an error when it makes an edit to erroneous or sub-optimal text:
 * The text immediately surrounding the edit and clearly related to the edited text is still wrong. For example, "meters/sec" --> "meter/s".
 * Sub-optimal text is changed, but the rest of the article becomes harder to read, as when an old version of a spelled out unit is changed to the modern version, but the abbreviation is not changed.


 * Case 2 is an outright error, so old versions of units should not be changed to new versions unless (a) the abbreviations or symbols are the same and (b) the definition of the old and new unit is close enough that there is virtually no chance a measurement would be stated to enough precision that it would matter.


 * Case 1 is tolerable in small amounts, but if it happens frequently the bot should be fixed or should not attempt the particular error it is having trouble with. Bluntly, thinking of case 1 as a feature request suggests that all that matters is the bot operates as the bot author envisioned it would work, not thinking about whether it is satisfying the reader's need and the need for article stability. Jc3s5h (talk) 14:37, 26 July 2011 (UTC)

Sidenote I've been busy for the past month, but I should get time to review the edits sometime during this week. Headbomb {talk / contribs / physics / books} 15:25, 26 July 2011 (UTC)


 * Headbomb, please consider how we should address overlapping scope. For example:
 * Bot runs to handle defects with 'foot' (length) encounter defects with 'foot pounds' (torque or energy).
 * Bot runs to handle defects with 'pounds' (weight) encounter defects with 'foot pounds' (torque or energy) and 'pounds' (force).
 * Bot runs to handle defects with '°C' and '°F' (temperature) encounter defects with 'degrees' (angle) and 'calorie' (energy).
 * Bot runs to handle defects with potential encounter defects with power.
 * Bot runs to handle defects with force encounter defects with pressure.
 * You can see that a simple scope targeting 'foot', 'lb' and degrees temperature will have considerable overlap. It's a *lot* easier to address overlapping units than to avoid them and leave defects in place. Multi-unit scope reduces input effort and increases benefit to articles. If you don't yet feel confident with one or more of the above units, please approve those that you can. You may wish to consider the benefits of moving to the more conventional approach of encouraging BRFA requests *after* development in a non-bot account (even if only as 'x automated edits per day'), rather than running BRFA trials live on undeveloped processes. Lightmouse (talk) 10:45, 27 July 2011 (UTC)


 * I don't know when the best point in the process would be to test for corrections of many different units, but certainly the bot should be trialed in its "release candidate" configuration before going into production. Jc3s5h (talk) 13:04, 27 July 2011 (UTC)


 * Lightmouse (talk) 18:02, 5 August 2011 (UTC)

BAG Review/Approval

 * Alright, reviewing this. Sorry it took longer than expected. Headbomb {talk / contribs / physics / books} 17:10, 10 August 2011 (UTC)


 * Horsepower → Watt
 * Calorie → kJ
 * Frequency
 * BTU / BTU/h ←→ kJ / kJ/h, seems too controversial/unimplementable.


 * ??? Candlepower → Candela, is there really consensus for this? Seems that the historical value of the candlepower differs from the candela. Or is the logic tweaked to ensure that improper conversions do not happen?
 * I just swept up all the units I could think of. Can you give me an instance of a conversion that you think is improper? Otherwise, it's easier just to deny it for now. Lightmouse (talk) 18:34, 10 August 2011 (UTC)
 * I don't know if there are improper conversions. Which is why I'm asking if there really is consensus for this. WP:MEASURE could probably tell us.
 * OK. Well, let's deny it for now so we can move on. Lightmouse (talk) 18:47, 10 August 2011 (UTC)
 * Force. Can you handle cases like this (lb.s.t = pounds of static thrust)
 * If the thrust is static, it applies to all units, it'll still be static. It's like 'above mean sea level'. It looks fine to me. What would you like to see in that instance? Lightmouse (talk) 18:34, 10 August 2011 (UTC)
 * Well the result is "11,400 lbf s.t" which seems rather awkward. Headbomb {talk / contribs / physics / books} 18:40, 10 August 2011 (UTC)
 * Ah. I could expand it to 'static thrust'. Alternatively, we could ask the community what they want done. Lightmouse (talk) 18:47, 10 August 2011 (UTC)
 * More to come

Headbomb {talk / contribs / physics / books} 18:25, 10 August 2011 (UTC)

Headbomb wrote:
 * Horsepower → Watt
 * Calorie → kJ,
I am seeking unit-independent approval of measurement quantities. Are those approvals restricted to those particular units? Lightmouse (talk) 17:41, 29 August 2011 (UTC)

BAG assistance needed
 * Some parts of this request have been approved. Yet, it's not visible in the approved list. Can it be added please? Lightmouse (talk) 16:29, 10 October 2011 (UTC)

Error report
Lightbot edited a unit-related article. Jc3s5h (talk) 23:41, 12 August 2011 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.