Wikipedia:Bots/Requests for approval/BattyBot 17


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

BattyBot 17
Operator:

Time filed: 00:32, Saturday February 2, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Replace full stop with comma in the Infobox company num_employees field

Links to relevant discussions (where appropriate):
 * Bot_requests
 * Template_talk:Infobox_company
 * WP:MOSNUM

Edit period(s): One time run

Estimated number of pages affected: Thousands

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Many of the num_employees fields in Infobox company use a full stop (.) to separate thousands (e.g. 12.200, 5.200) instead of using a comma. This is confusing because a full stop usually denotes the decimal point, and it also violates WP:MOSNUM. This bot task would use AWB's find and replace rules to change the full stop to a comma. Any AWB general fixes will also be done at the same time. Example of test edit done manually: this edit.

Discussion
I feel like this task has many ways it can mess up because of people inputting oddly formatted information.  MBisanz  talk 01:09, 2 February 2013 (UTC)
 * The regex rule I'm planning on using is (\|\s*num_employees\s*\=\s*)(\d+)\.(\d{3}) → $1$2,$3. Could you please give me an example of oddly formatted information that this rule might encounter?  Thanks!  GoingBatty (talk) 01:55, 2 February 2013 (UTC)
 * (non-BAG member)
 * * num_employees = 1.668.072
 * * num_employees = 1.668 million
 * Very, very unlikely though. Foxconn lists 1.23 million, but doesn't give a third digit.  There aren't more than a couple dozen firms over 1m, and they're all pretty well-watched. (I'm supportive of the task, I'm just a big believer in the perversity of our data set.) --j⚛e deckertalk 03:17, 2 February 2013 (UTC)
 * OK, I'll use two rules:
 * (\|\s*num_employees\s*\=\s*)(\d+)\.(\d{3})\.(\d{3}) → $1$2,$3,$4
 * (\|\s*num_employees\s*\=\s*)(\d+)\.(\d{3})(?!\s*million) → $1$2,$3
 * Other suggestions are welcome - thanks! GoingBatty (talk) 03:42, 2 February 2013 (UTC)
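
For readers who want to try the two rules outside AWB, here is a minimal sketch in Python. It is an illustration only, not the bot's actual configuration: AWB uses .NET regular expressions with `$1`-style replacements, whereas this sketch uses Python's `re` module with `\1`-style replacements, and the function name `fix_num_employees` is hypothetical. The patterns themselves are copied from the two rules above, and the sample values are the ones raised in this discussion.

```python
import re

# Shared prefix from both rules: the "| num_employees =" parameter, any spacing.
PREFIX = r"(\|\s*num_employees\s*\=\s*)"

# Rule 1: two full stops used as thousands separators, e.g. 1.668.072
RULE_TWO_SEPARATORS = re.compile(PREFIX + r"(\d+)\.(\d{3})\.(\d{3})")

# Rule 2: one full stop followed by exactly three digits, unless the value
# is followed by "million" (e.g. "1.668 million" is a real decimal).
RULE_ONE_SEPARATOR = re.compile(PREFIX + r"(\d+)\.(\d{3})(?!\s*million)")


def fix_num_employees(wikitext: str) -> str:
    """Apply rule 1, then rule 2 (hypothetical helper, not part of AWB)."""
    # Rule 1 runs first so that values such as 1.668.072 are converted
    # in one step rather than being half-converted by rule 2.
    wikitext = RULE_TWO_SEPARATORS.sub(r"\1\2,\3,\4", wikitext)
    return RULE_ONE_SEPARATOR.sub(r"\1\2,\3", wikitext)


if __name__ == "__main__":
    samples = [
        "| num_employees = 12.200",
        "| num_employees = 1.668.072",
        "| num_employees = 1.668 million",
        "| num_employees = 1.23 million",
    ]
    for line in samples:
        print(line, "->", fix_num_employees(line))
```

In this sketch the order matters: if rule 2 ran first, `1.668.072` would become `1,668.072` and rule 1 would no longer match it.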


 * I'm willing to trial this, but I suspect there are many more edge cases than either of us has thought of.  MBisanz  talk 22:01, 3 February 2013 (UTC)
 * Trial complete with no false positives - see diffs here. Decided halfway through that these edits should be marked as minor. GoingBatty (talk) 22:46, 3 February 2013 (UTC)
 * Approved.  MBisanz  talk 03:11, 9 February 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.