Wikipedia:Bots/Requests for approval/MOSNUM Bot 2


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

MOSNUM Bot 2
Operator:

Time filed: 16:51, Friday January 18, 2013 (UTC)

Automatic, Supervised, or Manual: Auto

Programming language(s): AWB

Source code available: User:Ohconfucius/Bot modules/dmy

Function overview: The aim is to make dates displayed within reference sections compatible with Manual of Style (dates and numbers) (MOSNUM):
 * 1) General article-level enabling fixes involve delinking all dates and date fragments, including day-month strings, days, months, decades (per Bots/Requests for approval/Hmainsbot1); removing instances of '&nbsp;' from within date strings; removing or otherwise substituting date formats not compliant with MOSNUM; relocating misplaced dates
 * 2) specific fixes involve aligning dates (to the prevailing style) within 'date', 'accessdate' and 'archivedate' parameters and others in templates that may have clearly identified fields for dates (to be taken as read throughout this request). Examples below.
 * 3) adjusting parameters in certain templates to set their date display options.
 * Note: ISO dates (yyyy-mm-dd) are to be unlinked but will not be converted into any other format

Links to relevant discussions (where appropriate): WP:MOSNUM

Edit period(s): continuous

Estimated number of pages affected: The bot will only operate on articles that have been tagged with use dmy dates and use mdy dates – total potential population stands at ca. 260,000 as at 18 January 2013 (see Category:Use dmy dates and Category:Use mdy dates), but many will already have uniform dates

Exclusion compliant (Yes/No): Yes.

Already has a bot flag (Yes/No):

Function details:
 * It ensures uniform presentation of dates within any given article in compliance with Manual of Style (dates and numbers). As such, general fixes include delinking all dates and date fragments, including day-month strings, days, months, decades (per Bots/Requests for approval/Hmainsbot1)
 * universally remove instances of '&nbsp;' from within date strings (such as '18&nbsp;January&nbsp;2013', 'January&nbsp;18,&nbsp;2013')
 * universally remove superscripting tags (<sup>, </sup>) for ordinal numbers
 * due to the (I believe) very low risk of false positives, linked dates, including year-in-x easter eggs, in full configuration (e.g.,   or  ) that occur anywhere in the article may be unlinked and 'flipped' into the prevailing format of that article in the first instance
 * relocate dmy or mdy dates misplaced in 'author' by changing the latter into 'date'; remove the second date instance.
 * Essentially, the bot will align dates within 'date', 'accessdate' and 'archivedate' parameters (within citation templates) to the prevailing style; the same is to apply to date strings preceded by "\s(?:Accessed|Publish|Retrieved|Archived)( on|: |)\s"
 * Also, dates within templates that may have clearly identified fields for dates (xxxxdate),
 * or others that are populated solely by dates – for example first flight and introduced in Template:Infobox aircraft; Ship laid down, Ship launched, Ship commissioned, Ship decommissioned, Ship struck in Template:Infobox ship career, and released (in Template:Infobox album)
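The delinking and '&nbsp;' clean-up described above could be sketched roughly as follows. This is an illustrative Python sketch and not the bot's actual AWB module: the names `MONTHS`, `delink_dates` and `unbreak_dates` are hypothetical, only unpiped links are handled, and year-in-x piped links are deliberately left out.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Unpiped links around a day-month, month-day, month, ISO date, year or decade.
LINKED_DATE = re.compile(
    r"\[\[(\d{1,2} (?:%(m)s)|(?:%(m)s) \d{1,2}|(?:%(m)s)|"
    r"\d{4}-\d{2}-\d{2}|\d{1,4}|\d{3}0s)\]\]" % {"m": MONTHS}
)

# Separator inside a date: any mix of whitespace and '&nbsp;' entities.
SEP = r"(?:\s|&nbsp;)+"
DMY = re.compile(r"\b(\d{1,2})%s(%s)%s(\d{4})\b" % (SEP, MONTHS, SEP))
MDY = re.compile(r"\b(%s)%s(\d{1,2}),%s(\d{4})\b" % (MONTHS, SEP, SEP))


def delink_dates(text):
    """Strip [[...]] from dates and date fragments, keeping the text."""
    return LINKED_DATE.sub(r"\1", text)


def unbreak_dates(text):
    """Replace '&nbsp;' inside full dmy/mdy dates with ordinary spaces."""
    text = DMY.sub(r"\1 \2 \3", text)
    return MDY.sub(r"\1 \2, \3", text)
```

For example, `delink_dates("[[18 January]] [[2013]]")` gives `18 January 2013`, while a link such as `[[Paris]]` or a non-date use of '&nbsp;' (as in `10&nbsp;km`) is left alone.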


 * Changes include:
 * remove days of the week, and times of day
 * remove ordinal suffixes and constructions such as '5th of September', 'December 25th'
 * remove leading zeroes (e.g. November 08, 2001 and 08 November 2001)
 * add commas where necessary (e.g. February 28 2001)
 * remove redundant commas (e.g. July, 1997; 28 February , 2001)
 * "date", if containing only a 4-digit number falling into the range 1000–2099, will be changed to "year"
 * remove or otherwise substitute date formats not compliant with MOSNUM
 * expand all month names within said template parameters.
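A few of the changes listed above can be illustrated with regular expressions. The following Python sketch is an assumption-laden illustration, not the bot's source: `tidy_date` and `date_to_year` are hypothetical names, the function is meant to be applied to a single date value rather than to a whole article, and removal of times of day is not covered.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Map three-letter abbreviations (plus "Sept") to full month names.
ABBREV = {m[:3]: m for m in MONTHS.split("|")}
ABBREV["Sept"] = "September"


def tidy_date(value):
    """Apply a subset of the listed clean-ups to one date string."""
    # Expand abbreviated month names, e.g. "Jan." -> "January".
    value = re.sub(
        r"\b(Sept|Jan|Feb|Mar|Apr|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\b\.?",
        lambda m: ABBREV[m.group(1)], value)
    # Remove days of the week.
    value = re.sub(
        r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day,?\s+", "", value)
    # Remove ordinal suffixes and a following "of".
    value = re.sub(r"\b(\d{1,2})(?:st|nd|rd|th)\b(?:\s+of\b)?", r"\1", value)
    # Remove leading zeroes from the day (dmy, then mdy).
    value = re.sub(r"\b0(\d)(?= (?:%s)\b)" % MONTHS, r"\1", value)
    value = re.sub(r"\b((?:%s) )0(\d)\b" % MONTHS, r"\1\2", value)
    # Remove redundant commas: "July, 1997", "28 February , 2001".
    value = re.sub(r"\b(%s) ?, ?(\d{4})\b" % MONTHS, r"\1 \2", value)
    # Add the missing comma in mdy dates: "February 28 2001".
    value = re.sub(r"\b((?:%s) \d{1,2}) (\d{4})\b" % MONTHS, r"\1, \2", value)
    return value


def date_to_year(template_text):
    """Rename |date= to |year= when it holds only a year in 1000-2099."""
    return re.sub(
        r"(\|\s*)date(\s*=\s*(?:1\d{3}|20\d{2})\s*(?=[|}]))",
        r"\1year\2", template_text)
```

The order of the substitutions matters in this sketch: month names are expanded first so that the later patterns only need to recognise full names.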


 * It acts on incorrect formats (e.g. mdy dates in articles tagged with use dmy dates – and vice versa), other often used but not MOSNUM-compliant date formats within the references section and in date parameters within certain templates;
 * acts on mdy dates in 'dmy' articles, making them all dmy, and same for dmy dates in 'mdy' articles;
 * where necessary, converts all dates and date fragments (including simple date ranges) into the prevailing format – either dd mmm yyyy, mmm dd, yyyy;
 * converts other often used (but not MOSNUM-compliant) date formats, namely those with hyphens, dashes and slashes, single- or double-digit days or months and two- or four-digit years (e.g. dd-mm-yyyy, d-m-yyyy, dd/mm/yyyy or mm/dd/yyyy) or abbreviated month names with intervening hyphens or dashes (e.g. dd-Mar-yyyy, d-mar-yyyy, dd–Apr–yyyy, d–apr–yyyy, yyyy-Mar-dd, yyyy-mar-dd, yyyy–Apr–dd, yyyy–apr–dd), to the prevailing format as identified by the use dmy dates or use mdy dates tag;
 * will convert formats such as 2013–01–25 (with dashes) or 2013-1-25 (with missing leading zeros) to fully compliant ISO format.
 * insert " " within the following templates:
 * Birth date, bda, Birth date and age
 * Death date
 * Start date and age (launch date, release date)
 * start date, end date
 * film date
 * Wayback
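The conversion rules in the bullets above (flipping dmy and mdy dates, expanding hyphenated abbreviated-month forms, and repairing ISO-like dates) might look something like the Python sketch below. It is only an illustration under stated assumptions: `to_prevailing`, `repair_iso` and the `style` argument are hypothetical names, and all-numeric forms such as dd/mm/yyyy are deliberately not handled because the request does not say how day-month ambiguity is resolved.

```python
import re

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]
MONTHS = "|".join(MONTH_NAMES)
BY_ABBREV = {m[:3].lower(): m for m in MONTH_NAMES}
DASH = "[-\u2013]"  # hyphen or en dash


def fmt(day, month, year, style):
    """Format a date as dmy ('18 January 2013') or mdy ('January 18, 2013')."""
    day = str(int(day))  # drops any leading zero
    return f"{day} {month} {year}" if style == "dmy" else f"{month} {day}, {year}"


def _from_abbrev(match, day, mon, year, style):
    # Leave the text unchanged if the three letters are not a month.
    month = BY_ABBREV.get(mon.lower())
    return fmt(day, month, year, style) if month else match.group(0)


def to_prevailing(value, style):
    """Convert spelt-out and abbreviated-month dates to the tagged style."""
    if style == "dmy":
        value = re.sub(rf"\b({MONTHS}) (\d{{1,2}}), (\d{{4}})\b",
                       lambda m: fmt(m.group(2), m.group(1), m.group(3), style),
                       value)
    else:
        value = re.sub(rf"\b(\d{{1,2}}) ({MONTHS}) (\d{{4}})\b",
                       lambda m: fmt(m.group(1), m.group(2), m.group(3), style),
                       value)
    # dd-Mon-yyyy, then yyyy-Mon-dd (hyphens or en dashes).
    value = re.sub(
        rf"\b(\d{{1,2}}){DASH}([A-Za-z]{{3}}){DASH}(\d{{4}})\b",
        lambda m: _from_abbrev(m, m.group(1), m.group(2), m.group(3), style),
        value)
    value = re.sub(
        rf"\b(\d{{4}}){DASH}([A-Za-z]{{3}}){DASH}(\d{{1,2}})\b",
        lambda m: _from_abbrev(m, m.group(3), m.group(2), m.group(1), style),
        value)
    return value


def repair_iso(value):
    """Turn ISO-like dates (en dashes, missing zeros) into yyyy-mm-dd."""
    def fix(m):
        month, day = int(m.group(2)), int(m.group(3))
        if 1 <= month <= 12 and 1 <= day <= 31:
            return f"{m.group(1)}-{month:02d}-{day:02d}"
        return m.group(0)
    return re.sub(rf"\b(\d{{4}}){DASH}(\d{{1,2}}){DASH}(\d{{1,2}})\b", fix, value)
```

Note that `to_prevailing` leaves a well-formed yyyy-mm-dd date untouched, in line with the note that ISO dates are not converted into any other format.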

Discussion
Given the inherent controversy of dates, could you leave notes at WT:MOSDATE and WP:VPR referring people to this request? Thanks.  MBisanz  talk 02:01, 20 January 2013 (UTC)
 * ✅ --  Ohconfucius  ping / poke 02:20, 21 January 2013 (UTC)
 * Sounds like a great bot-driven task. Can it also move the m-d-y and d-m-y, or whatever, templates to the end of the article? I think they are in the way at the top of articles as the majority are now placed. GenQuest  "Talk to Me" 18:48, 20 January 2013 (UTC)
 * I think I'd prefer to leave the 'clutter' discussion for elsewhere. Anyway, from a technical standpoint, I don't know how I would move the template to the bottom of the article. --  Ohconfucius  ping / poke 02:20, 21 January 2013 (UTC)


 * Is your bot exclusion compliant? Because of the nature of the task, I believe it critical that pages can have the choice to reject your bot.— cyberpower ChatOnline 03:06, 21 January 2013 (UTC)
 * Yes, it will be. (I have amended the above accordingly) --  Ohconfucius  ping / poke 04:14, 21 January 2013 (UTC)
 * Thank you.— cyberpower ChatOnline 13:18, 21 January 2013 (UTC)
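For readers unfamiliar with exclusion compliance, the check discussed above is conventionally based on the nobots and bots templates placed on a page. The Python sketch below illustrates that convention in simplified form; it is not how AWB implements it, `bot_may_edit` is a hypothetical name, and only the first allow/deny parameter on the page is considered.

```python
import re


def bot_may_edit(wikitext, bot_name):
    """Return False if the page opts out of edits by this bot."""
    # {{nobots}} bans all bots.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.I):
        return False
    # {{bots|allow=...}} or {{bots|deny=...}} with a comma-separated list.
    m = re.search(r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)",
                  wikitext, re.I)
    if not m:
        return True  # no restriction found
    kind = m.group(1).lower()
    names = {n.strip().lower() for n in m.group(2).split(",")}
    listed = "all" in names or bot_name.lower() in names
    if kind == "deny":
        return "none" in names or not listed
    # kind == "allow": only listed bots may edit.
    return "none" not in names and listed
```

A bot following this convention would call the check before saving each page and skip the page when it returns `False`.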


 * Can you give some examples of "certain templates that have [spelt-out] dates as parameters"? Ideally, such list would be publicly available, given the number of potentially affected pages. — HELL KNOWZ  ▎TALK 16:37, 22 January 2013 (UTC)
 * What comes to mind immediately (because I have been editing ships articles), in addition to the ones above in Template:Infobox aircraft, would include Ship laid down, Ship launched, Ship commissioned, Ship decommissioned, Ship struck in Template:Infobox ship career, and released (in Template:Infobox album). I have in mind also to include any date strings within reference sections that are preceded by the words 'Retrieved (on)' or 'Accessed (on)', but these are not yet in the body of the request. I am sure there will be others I do not know about but would want to add – would all these need to be so stated in this request? --  Ohconfucius  ping / poke 02:41, 23 January 2013 (UTC)
 * No need to list all, as long as others follow a similar pattern to the examples. — HELL KNOWZ  ▎TALK 09:54, 23 January 2013 (UTC)


 * As the BRFA currently reads, the bot will convert all citation dates to use xxx dates -- both publication dates and access/archive dates. It was my understanding MOS#CONST specifically says those two can be in different formats. As far as I can see, use xxx dates is used for prose/publication dates (i.e. for national ties, etc.). Am I understanding this right? — HELL KNOWZ  ▎TALK 16:37, 22 January 2013 (UTC)
 * I intend for the bot to abide by MOS:DATEUNIFY as much as possible without turning any ISO dates into the prevailing style, whether dmy or mdy. The use xxx dates tag does not apply exclusively to prose or publication dates, but is meant to indicate the prevailing style. I think it would be pure nonsense and a waste of time and resources to run this bot but not to change instances of non-compliant (whether in terms of MOSNUM or in terms of the prevailing date style) dates. If, say, the given reference section of a dmy article has a mixture of several date styles (such as dmy, mdy, ISO, dd-mmm-yyyy, dd/mmm/yyyy), the bot will reduce it to just two – dmy and ISO – which is in accordance with my reading of MOSNUM. Given the nature of wikipedia and the limits of automation, I cannot technically manage the unification of all dates of a single type (publication, access, archive) within a given article without causing false positives or false negatives. I feel that the request as currently framed would strictly limit false positives; furthermore, it will respect the existence of ISO dates.  --  Ohconfucius  ping / poke 02:41, 23 January 2013 (UTC)
 * So, in short, use dmy dates converts all mdy -> dmy and use mdy dates converts all dmy -> mdy. ymd (ISO) are untouched. Sounds fine by me then. (I'll refrain from adding my thoughts on utility of those templates...) — HELL KNOWZ  ▎TALK 09:54, 23 January 2013 (UTC)
 * Great idea for a bot; in longer articles with dates in multiple formats, the utility of having the dates is lost in the chaos of varying formats. I hope you start on sciences articles. Operator appears to be able to converse with other users and communicate.--64.134.221.141 (talk) 00:29, 23 January 2013 (UTC)
 * Will it leave the date in the Infobox football biography which has the timestamp generated by use of  in the update fields? It may also be worth looking at the date info incorrectly placed in the author field of cite templates. Keith D (talk) 12:03, 23 January 2013 (UTC)
 * seems to render as " ". The bot as currently configured will ignore such a string. This is probably not a problem when it is on a use dmy dates article, but perhaps could be flipped for mdy articles? As to the misplaced data, it already gets treated when using my MOSNUM script, of which this bot is a stripped down version. Although in my experience, the author field invariably duplicates date information already populated to date (although the format may vary quite considerably so the script doesn't catch all instances), I'm concerned that data will be lost if this date data is not already populated should I set the bot to remove it systematically. Thoughts? --  Ohconfucius  ping / poke 14:53, 23 January 2013 (UTC)
 * Often it is present in the date field but not always, so if it cannot be moved then maybe just generate a list for manual attention. It may be worth tracking down the tool that adds it and getting that to stop adding new instances. Keith D (talk) 17:48, 23 January 2013 (UTC)
 * I believe WP:Reflinks may be 'responsible' for much of this misplaced data, though I don't know to what extent it's due to faulty metadata insertion up the line (GIGO). The real problem is when people run Reflinks and then leave as completed without casting an eye over the output. --  Ohconfucius  ping / poke 03:07, 24 January 2013 (UTC)
 * ✅ now incorporated code to convert "author " into "date "
 * Thinking about it, maybe a tracking category is better than a list for maintenance purposes. Keith D (talk) 18:46, 23 January 2013 (UTC)


 * Before, the references had 4 publication dates in yyyy-mm-dd format, 2 accessdates in yyyy-mm-dd format, 2 accessdates in mdy format, and 2 accessdates in dmy format. The script removed the 4 yyyy-mm-dd publication date formats and converted them (and the 2 mdy accessdates) to dmy format. It would appear that despite the yyyy-mm-dd formats being in the majority, the script removed them. Is this how the proposed bot would operate? Likewise  changed a large number of archivedate fields so they no longer matched the accessdate fields. Is this also how the proposed bot would operate? Gimmetoo (talk) 10:27, 25 January 2013 (UTC)
 * That is exactly why the script is a script, and the bot is a bot. I would refer to the text and explanations above with respect to the bot, in which I believe the treatment of yyyy-mm-dd/ISO 8601 dates has been clearly elaborated. As Gimmetoo seems also to be adept at scripting, I would invite same to evaluate and make suggestions regarding the bot source code. If that is not possible for whatever reason, there is a mirrored version in script form at User:Ohconfucius/test/test.js available for test running. --  Ohconfucius  ping / poke 11:45, 25 January 2013 (UTC)
 * I'm asking because, in the description above, you say "bot will align dates within '|date=', '|accessdate=' and '|archivedate=' to the prevailing style", but you appear to do something different with the script. So there are two issues. One is the implementation of the bot - is it different from the script? The second is its use - will it be used differently from the script? And your statement about 'date strings preceded by "\s(?:Accessed|Publish|Retrieved|Archived)( on|: |)\s"' is incomplete in your description (it's a sentence fragment following a semi-colon). So it's unclear what exactly you intend to do. Gimmetoo (talk) 14:48, 25 January 2013 (UTC)
 * This is a BRFA, and we are talking here only about the workings of the proposed bot. The script, although now quite sophisticated, is not entirely error-free, and its different component modules require users' discretion and judgement to activate, and will as such never pass muster as a fully automated device. So, without being at all facetious, the answer to your question about how it will be used differently from the script, I think, is more likely to be found in WP:Bots if not already cited above. As a side note, I would say that the process of writing and testing this bot has given me ideas for improving the mosnum script. I re-iterate that, in terms of functionality, the bot is a stripped down version of the script. Thank you – I have now amended the apparently incomplete sentence in the section above. Also, once again, as it seems to be your [ahem] "major concern", I copy from above, and put it IN BOLD: ISO dates (yyyy-mm-dd) are to be unlinked but will not be converted into any other format. I trust that clarifies any ambiguity that was in your mind. Regards, --  Ohconfucius  ping / poke 15:31, 25 January 2013 (UTC)
 * Perhaps I'm mistaken, but your code appears to take dates in the format 2013–01–25 (with dashes) and change them to dmy or mdy. Why not simply replace the dashes to hyphens, if dashes are not allowed in such formats? Would that not be more in accord with retaining the existing style? Likewise, it appears to change dates in the format 2013-1-25 (omitting a zero) to dmy or mdy. Why not just add the 0 if its needed? Gimmetoo (talk) 17:51, 25 January 2013 (UTC)
 * No, you're not mistaken. I'm glad you spotted that, but ISO 8601 states that the only valid format has 2-digit months and days, and the only valid separator is the hyphen. Thus neither 2013–01–25 (with dashes) nor 2013-1-25 (with missing leading zeros) are "valid" styles; and I think that it is only a trifle more reasonable than to argue that other all-numerical formats like 25-01-2013 or 1/25/2013 should also be replaced by 2013-01-25 (with hyphens) because this would be more in accord with retaining the existing style. After all, a person can't be "just a little bit pregnant" ;-) Another consideration as to what WP:RETAIN would suggest doing is the prevalence of ISO dates vs dmy or mdy in the reference section. If a human were to do it, it's a judgement call. For a bot, I think the 'correct' thing to do is to go by the prevailing format in the case of misformatted dates. --  Ohconfucius  ping / poke 18:29, 25 January 2013 (UTC)
 * And what if the prevailing format is yyyy-mm-dd? Gimmetoo (talk) 19:38, 25 January 2013 (UTC)
 * And what if the prevailing format is dmy? Given that the objective is to align articles already tagged to dmy/mdy, my proposal (and code) as currently framed seems to be the logical way to go. Either way, that decision on the "prevailing style" of the reference section must be taken individually at article level with human intervention when it is processed, and is deliberately beyond the scope of this bot proposal. I'm merely fixing up errors and format drift not involving ISO dates. --  Ohconfucius  ping / poke 01:04, 26 January 2013 (UTC)
 * This is a BRFA, which means investigating whether your proposed code is compliant with guidelines, and whether it is the "logical way to go". Your proposal seems to be to automatically convert dates formatted as 2013-1-25 to dmy or mdy, without human interaction. There are at least two other reasonable algorithmic ways to handle this. Since you already say you are not affecting yyyy-mm-dd formatted dates, you would be leaving some of those in any event. So rather than risk adding inconsistency by inappropriately changing 2013-1-25 to 25 January 2013 in an article predominantly using yyyy-mm-dd, it would seem better to change it to 2013-01-25. That's one way, and would it not be more in the spirit of retaining the existing format? Alternately, code could count the body dates, publication dates, and accessdates, and the various formats for each, and use an agreed-upon and conservative heuristic or algorithm to determine when some format is predominant, and which it is. Gimmetoo (talk) 14:14, 26 January 2013 (UTC)
 * My proposed bot is to remove formats inconsistent with MOSNUM, and the only presupposition is that the body's dates are to be followed where there are non-compliant formats because it's the only reasonable interpretation of WP:CONSISTENCY. My proposal is not to avoid non-compliant formats that can be easily converted to ISO because Gimmetoo believes in the sanctity of anything resembling an ISO date. But the bottom line is that I'm only going to write what I know how to write.  --  Ohconfucius  ping / poke 15:42, 26 January 2013 (UTC)
 * Interesting speculation on motives - is this indicative of the way you would respond as a bot operator? An automatic bot shouldn't make controversial edits, and needs to have a very low error rate. You've already been to Arbcom once over date formatting. Gimmetoo (talk) 15:56, 26 January 2013 (UTC)
 * No, I'll be honest here and say it's symptomatic of the animosity between you and me, and the stress I feel when you are around. The reason this bot is simple as it is (compared with the script) is an aim to avoid errors; the model you propose is likely to be too complex to produce no errors. As to controversy, I recall the saying that one nightingale doesn't make it spring. --  Ohconfucius  ping / poke 16:42, 26 January 2013 (UTC)
 * I have now amended the proposal to the effect that the bot will render formats such as 2013–01–25 (with dashes) or 2013-1-25 (with missing leading zeros) ISO-compliant. --  Ohconfucius  ping / poke 05:45, 27 January 2013 (UTC)
 * I think the community support question should be had at the MOS dates community, wherever that is. Making reference sections internally consistent increases the readability of the article. It appeared this was all the bot was doing for certain types of reference sections with specific mangled date styles. Maybe what the bot does can be settled elsewhere. --166.137.210.16 (talk) 22:53, 28 January 2013 (UTC)
 * Please define "mangle", and if possible, please supply a diff that illustrates same. --  Ohconfucius  ping / poke 04:12, 20 February 2013 (UTC)
 * Non-uniform. Since it is your bot proposal to make "uniform presentation of dates within any given article in compliance with Wikipedia:Manual of Style (dates and numbers)," if "non-uniform" is not good enough, then please define "uniform." --68.99.89.234 (talk) 06:23, 21 February 2013 (UTC)
 * Interesting that you choose to throw the question back at me in that manner. You still haven't defined "mangle" or supplied a diff to illustrate what you mean. In practice, this will mean only "greater uniformity" based on the removal of formats not consistent with the prevalent date style. No automated process is going to render the entire article completely compliant with MOSNUM, so I'm programming the bot to do the best it can under the constraints laid down. Thus, for the purposes of this request, "uniform" means replacing all ISO-like dates that are template parameter data with ones formatted in accordance with ISO dates, and all other non-compliant date formats (ie dates that are not dmy, mdy nor ISO) also parameter data within templates be replaced with dates that are in line with the tagging (whether use dmy dates or use mdy dates) in the article concerned. Such date formatting treatment as proposed, and as amended, has already long enjoyed strong consensus at MOSNUM, and I fail to see how you can assert this needs to seek further consensus. I don't see Gimmetoo challenging this any more since I amended the proposal. BAG should simply decide to accept or refuse a trial run for this request. --  Ohconfucius  ping / poke 10:10, 21 February 2013 (UTC)
 * You asked "please define mangle," and I replied, "non-uniform," now you say I still haven't defined it. You're now arguing because you don't like my response? Or are you trying to engage me in a back and forth like with the other user above, just to consume time? I think your communication is going to be a problem with you running this bot task, and your reply here and odd obsession with the word choice of "mangle" just prove my point. Editors may use synonyms you don't like, side-tracking to your picking at another editor over their word choice will make your running this bot a problem. Seeing how you can interact with the larger community seems more important now than ever. -68.99.89.234 (talk) 14:50, 21 February 2013 (UTC)
 * Does the community support this task?

Yes

 * 1) I really don't think the task should be done.  But if it means you won't damage articles by using your script, I'm for it; it seems now to properly fall within the consensus at MOSDATE.  — Arthur Rubin  (talk) 10:06, 6 February 2013 (UTC)
 * 2) As it stands, I don't see how this can do any harm, and it will clean up a large number of articles making them more compliant with the Manual of Style. Skinsmoke (talk) 07:58, 7 February 2013 (UTC)
 * 3) It seems to me that this task has been carefully considered to improve what it can, allow for the use of correctly formatted ISO dates per options in MOSNUM, and leave alone what a bot can't reliably improve. Rjwilmsi  18:25, 13 February 2013 (UTC)
 * 4) It does appear to be a useful and well-written bot designed to implement an appropriate task. However, the back and forth between the bot operator and Gimmetoo suggests, to me, that this bot task should be decided by a wider community, then brought back to RFBA. --68.99.89.234 (talk) 05:45, 14 February 2013 (UTC)

No



 * I think we can give this a 50 edit trial.— cyberpower ChatOffline 16:54, 18 February 2013 (UTC) There are some concerns being raised and I am retracting my recommendations.— cyberpower ChatOnline 23:05, 21 February 2013 (UTC)


 * I disagree. This bot task seems to be appropriate, well thought out, and useful. But there appear to be community issues with another member of the Wikipedia community that suggest more input and a firm community consensus should come from outside RFBA, the larger Wikipedia community, before approval, and the bot owner has ignored my comments about this. I do not see any need for a trial with a bot owner who is not willing to communicate with the community, a requirement for running a bot on en.wikipedia. -166.137.210.25 (talk) 02:29, 20 February 2013 (UTC)


 * Request withdrawn. I know a dead horse when I see one. I'm self closing and archiving it. --  Ohconfucius  ping / poke 02:04, 22 February 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.