Wikipedia:Requests for comment/Start date in NRHP articles

Should articles with U.S. National Register of Historic Places (NRHP) infoboxes be bot-tagged with the "Start date" template in the "built=" field of those infoboxes? Previous RFCs suggested that application of microformatting should be considered case by case; this is one big case involving 30,000 current articles, growing to 85,000. Questions and concerns have been raised in several previous discussions:
 * Requests for comment/Microformats (2010 RFC)
 * Wikipedia talk:Bot requests/Archive 2 (May, 2012, general bot request)
 * Bot requests/Archive 51 (January, 2013, NRHP bot request)
 * Wikipedia talk:WikiProject National Register of Historic Places/Archive 54 (January, 2013, wp:NRHP discussion)
 * Bot requests/Archive 54 (March, 2013, NRHP bot request)
 * Bots/Requests for approval/RileyBot 9 (March, 2013, RileyBot 9 bot test)

This RFC has been announced: at WT:NRHP, at WT:MILHIST, at WikiProject Lighthouses, at WT:SHIPS, at Reliable Sources noticeboard, and at Village Pump (proposals)

View by Doncram
No. Not at all ready for prime time, because:
 * 1. Bot proposed could easily and better be implemented by a simple edit to the NRHP infobox template code; no need to hard-code the "start date" template into 30,000+ articles (eventually 85,000), especially when hard-coding will introduce factual errors (see following reasons)
 * 2. What is a "start date", anyhow? The NRHP designation and infobox is often secondary in articles, and it seems wrong to assume the begin or end of construction for an NRHP-listed memorial or building or other item that is secondary to the article topic, is meant to be the "start date" of a given article topic.  The bot run proposed would identify the "start date" of the 1935 Labor Day hurricane to be 1937, the date of construction of a memorial.  It would assert the battleship USS Utah (BB-31) was started in 1914, five years after its hull was laid down.  It would assert the start dates of many churches are the dates of starting or of completion of construction of one of their past buildings, rather than the earlier founding dates of the churches, and rather than the begin or end of their current buildings.  (If construction is during a range of years, which is wanted, the beginning or end year?) It would assert that a lighthouse lit in 1904 was in fact "started" in 1924.  Is the goal to report the start or the end of construction of the first of many buildings at a place?  Or to report the start of the last one, or all of these?  What if there are multiple NRHP infoboxes and corresponding built= date ranges in one article?
 * 3. In one discussion, one editor suggested that "start date" will be understood to apply to the infobox, rather than to the article (although there is no indication within template:start date and template:end date documentation, and there is no microformatting field to identify the infobox name). If "start date" and "end date" were indeed to be infobox-specific, then arguably for NRHP listings it would be most relevant to use date NRHP-listed and delisted.  Or to use an earlier date of architectural design of a building, in cases where that is available and earlier than the beginning of the construction period.  ("Designdate=" is not currently part of the NRHP infobox, but could be added.)  If it is not infobox specific, earlier and later dates would often be more relevant.
 * 4. No guidance to editors is available at any central location, as far as i know, for when the "start date" is supposed to apply.  template:Start date currently is defined pretty much exactly the same as template:end date, generically as "the purpose is to return a date", any date.  As if randomly applying micro-formatting to any and every date appearing in wikipedia is going to be helpful to anyone.  What if an article includes several NRHP infoboxes, and other infoboxes, too?  Is it desirable to have many "start dates" in one article, in one infobox?  Where is any central guidance to outside programmers who are supposedly going to use the data, about what Wikipedia editors are trying to define as a start date, and where is any guidance to wikipedia editors about how they are to serve these outsiders?
 * 5. NRHP editors are working hard to develop articles on NRHP-listed places and to verify construction built ranges from multiple, sometimes conflicting sources, under frequent criticism from others, and have enough to do to get the basics right, without taking on another nebulous task! Fix the "start date" and "end date" concepts somewhere else, and forget about a bot run in NRHP articles for a year or two at least! -- do  ncr  am  01:19, 7 April 2013 (UTC)

Users who endorse this view

 * 1) --Atlasowa (talk) 16:27, 10 April 2013 (UTC)

Comments

 * We've told you what the start date template does, and you don't care. As you already know, because we've had to tell you when you don't want to listen, the problem with having Start date around incorrect years is that the information in the article was already wrong.  Stop pretending that the bot will cause additional problems and stop stonewalling what the project has already decided to do, and start helping by fixing incorrect information in infoboxes.  Nyttend (talk) 03:08, 8 April 2013 (UTC)
 * Tone down the vitriol, please. Don't state what I do or don't care about.  I'll offer:  I do care about accurate information being presented in Wikipedia, and I am legitimately concerned that the "start date" bot run will introduce incorrect information, undermining the potential usefulness of microformatting available from Wikipedia.  I have stated 5 objections above, which have not been answered; please address those.
 * Nyttend understands correctly that some previous discussion focussed on the fact that the NRHP infobox "built=" date field is sometimes erroneous. Nyttend fails to understand or to acknowledge that previous discussion and my comments above point out that using the "built=" field, even when it is accurate, probably will introduce incorrect information, i.e. will add microformatting that asserts to the world that such a date is the start date for many Wikipedia topics, e.g. churches that were founded (started) earlier.  There is no definition for "start date" proposed anywhere officially (i.e. there is no mention at template:start date), but by any definition imaginable to me, the NRHP built date will be incorrect in many cases.  Please try to address my concern positively, e.g. by proposing a definition of what "start date" is supposed to mean, and then let's discuss the problems (e.g. what if an article has multiple NRHP built dates).  Why on earth is there no proposed definition of "start date" anywhere????  IMO, that is the place to start. -- do  ncr  am  16:32, 10 April 2013 (UTC)
 * "I am legitimately concerned that the "start date" bot run will introduce incorrect information, " - You have been told, more than once, that this task will not introduce any information. You have yet to evidence that the requested change "asserts to the world that such a date is the start date for many Wikipedia topics, e.g. churches that were founded (started) earlier"; and your specific claims that it will have been refuted in my earlier response. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 19:26, 10 April 2013 (UTC)
 * What a joke. You're introducing the information (to be emitted to outside programmers) that the start date of a given topic is, well, one specific date rather than some other specific date.  You could involve NRHP editors constructively in ensuring that something sensible is provided.  If there's no information provided, then do let's cancel the proposed bot run and cancel this discussion. -- do  ncr  am  23:11, 10 April 2013 (UTC)
 * It's still not a joke. Dates within an article that are marked up with commonly recognised classes are contextualised by their container, and the re-user is perfectly capable of recognising whether the "start date" (for example) applies to a building or a birth or a hurricane or a ship or a shipwreck. There's no point in worrying about them misinterpreting anything. Our goal within an infobox is merely to supply verified data; if they want a fully-nuanced interpretation of the data, they would be better off reading the article. --RexxS (talk) 01:18, 11 April 2013 (UTC)


 * 1) What is the "simple edit to the NRHP infobox template code" that would achieve the easier better implementation that you claim?
 * 2) A start date microformat is a date which is enclosed in a wrapper (often span) which has the assigned classes "bday dtstart published updated". These are themselves enclosed in a container element (often table or div) such as in 1935 Labor Day hurricane, where the dtstart date is within a table which has the "infobox" class applied and the first div in that table contains the text "Florida Keys Memorial". Any re-user scraping microformats from the raw html would spot that the dtstart applied to a date which was inside an infobox table which clearly applies to a Memorial in the Florida Keys. If multiple dates of class "dtstart" exist in an article, then the parser will find them inside other identifiable structures that will yield the correct context. This suggested bot run will cause no problem for any realistic re-user for the reason you are worrying about.
 * 3) Because smart re-users like Google not only look for microformats but also use natural language analysis, they can make particular use of the label-data pairs that exist in infoboxes, see Intelligence in Wikipedia. However, those pairs are most valuable for training the recognition algorithms when many articles use the same labels. It may seem like a good idea to deploy a "designdate" parameter, but "built" is far more common and is therefore more useful.
 * 4) I dealt with the query about multiple dtstart microformats above: the context (container) has always to be taken into account when making use of them. I agree that better documentation would be beneficial. Perhaps someone can be persuaded to write it, instead of removing it?
 * 5) If NRHP editors are working hard already, then I would expect that they would be grateful for a bot to do an obvious job and save them from doing it manually.
 * It's worth adding if it wasn't already clear that the text assigned to a dtstart class can be any sort of date: a single year; a full dmy date; a date range; even "date A OR date B". The interpretation and context of the values is a task for the re-user to determine. --RexxS (talk) 23:10, 10 April 2013 (UTC)
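The container argument in points 2 and 4 above can be illustrated with a minimal Python sketch. It is not any re-user's actual parser: the sample markup is a hypothetical, heavily simplified stand-in for real infobox HTML, and the class and function names (`DtstartScraper`, `scrape_dtstart`) are invented for illustration. It assumes each "dtstart" value sits in a span inside a table of class "infobox" whose first div holds the object's name, as described above.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup; real Wikipedia infobox HTML is more involved.
SAMPLE_HTML = """
<table class="infobox">
  <tr><th><div>Florida Keys Memorial</div></th></tr>
  <tr><th>Built</th>
      <td><span class="bday dtstart published updated">1937</span></td></tr>
</table>
"""


class DtstartScraper(HTMLParser):
    """Pair each dtstart value with the title of its enclosing infobox."""

    def __init__(self):
        super().__init__()
        self.results = []            # list of (container title, date text)
        self._in_infobox = False
        self._title = None           # text of the first div in the infobox
        self._in_title_div = False
        self._in_dtstart = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and "infobox" in classes:
            self._in_infobox = True
            self._title = None
        elif self._in_infobox and tag == "div" and self._title is None:
            self._in_title_div = True
        elif self._in_infobox and "dtstart" in classes:
            self._in_dtstart = True

    def handle_endtag(self, tag):
        # Simplification: nested tables and non-span wrappers are not handled.
        if tag == "table":
            self._in_infobox = False
        elif tag == "div":
            self._in_title_div = False
        elif tag == "span":
            self._in_dtstart = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title_div and self._title is None:
            self._title = text
        elif self._in_dtstart:
            self.results.append((self._title, text))


def scrape_dtstart(html):
    """Return (container title, dtstart text) pairs found in the HTML."""
    parser = DtstartScraper()
    parser.feed(html)
    return parser.results
```

On the sample markup, `scrape_dtstart(SAMPLE_HTML)` yields the pair `("Florida Keys Memorial", "1937")`, so the date is reported together with the object named in its container rather than as a bare date for the whole page. Whether real re-users interpret containers this way is the point under dispute in this discussion; the sketch only shows that the context is recoverable from the markup.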

View by Jc3s5h
Oppose. I agree with the views by doncram. In addition, some historic places will have been "built" (whatever that means) on or before 2 September 1752, so the dates currently in the template most likely are stated in the Julian calendar. But microformats must use the Gregorian calendar, so the Start date template will emit a false date. Jc3s5h (talk) 13:22, 7 April 2013 (UTC)
 * If there are any year values before 1583 then they should be excluded from this process; as is already made clear in Start date's and the infobox's documentation. Years between 1583 and 1752 should be given using the Gregorian calendar. Any that are not (examples?) should have a note to that effect (so will be ignored by the bot) or be fixed; this is not related to the bot's edits. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 14:49, 8 April 2013 (UTC)
 * I'm doing a query on the National Register database. There are some Pre-Columbian era or precontact archaeological sites in the National Register, some of which date back to before 1583.  Nambé Pueblo, New Mexico, for example, has a start date of 1540.  The original plan of St. Augustine, Florida is dated at 1565.  A number of sites have years between 1583 and 1752.  That said, the National Register database only lists a year, not a specific date within the year, so the Julian to Gregorian date problem wouldn't happen unless a structure was built in the last 11 days of the year (or the first 11 days of the next year -- I'm confused.)  Besides, I would expect that construction records and exact dates of establishment would be difficult or impossible to find for all but the most well-documented historical events, such as the landing of the Mayflower.  --Elkman (Elkspeak) 22:21, 10 April 2013 (UTC)
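The calendar question raised above can be checked with a short Python sketch. It converts a Julian-calendar date to the proleptic Gregorian calendar through the Julian Day Number, using the standard integer-arithmetic formulas. The function names are illustrative only, and the sketch assumes the Julian year is counted from 1 January; it does not handle Old Style years that began on 25 March.

```python
def julian_to_jdn(year, month, day):
    """Julian Day Number for a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn):
    """(year, month, day) in the Gregorian calendar for a Julian Day Number."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def julian_to_gregorian(year, month, day):
    """Convert a Julian-calendar date to the Gregorian calendar."""
    return jdn_to_gregorian(julian_to_jdn(year, month, day))


def gregorian_year_differs(year, month, day):
    """True if the Julian date falls in a different Gregorian year."""
    return julian_to_gregorian(year, month, day)[0] != year
```

For example, `julian_to_gregorian(1752, 9, 2)` gives `(1752, 9, 13)`, and `julian_to_gregorian(1751, 12, 25)` gives `(1752, 1, 5)`. In the eighteenth century the offset is 11 days, so a year-only value changes only for Julian dates from 21 December onward, which roll into the following Gregorian year; dates early in a Julian year stay in the same year.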

View by Andy Mabbett (User:Pigsonthewing)
Doncram says "Bot proposed could easily and better be implemented by a simple edit to the NRHP infobox template code". He does not say what that edit should be.

As has already been pointed out to Doncram, the discussion at Wikipedia talk:Bot requests/Archive 2 has already approved this task; the points made in the rationale there apply here. The "case-by-case consideration" of an earlier RfC applied to the implementation of new microformats, not to tasks such as this, tweaking the content of existing microformats. His concerns about the use of dates in NRHP infoboxes are about content; the proposed edits do not change that content. The claims in his second bullet point are false. They can be disproved thus:


 * The NRHP infobox in 1935 Labor Day hurricane is about an object called "Florida Keys Memorial"; it is that object's start date which the bot will cause to be emitted as "1937" in the template's microformat. The article currently says it "was unveiled in 1937".
 * The NRHP infobox in USS Utah (BB-31) is about an object called "USS Utah (BB-31)"; the infobox currently asserts that the "shipwreck" was built in 1901. The bot will not change that.
 * The NRHP infobox in Point Retreat Light is about an object called "Point Retreat Light Station"; it is that object's start date which the bot will cause to be emitted as "1924" in the template's microformat. The article currently says "1924, when a new combination lighthouse and fog signal was built".

Doncram's points 3 and 4 are about his lack of understanding of what is proposed; not a problem with the proposed edits. He has already been referred to the relevant template documentation and our articles on microformats.

The claim that "hard-coding will introduce factual errors" is unfounded and without merit. All these points have already been made clear to Doncram; in some cases more than once. This is forum shopping. His choice of announcement venues is selectively one-sided; and includes biased and misleading canvassing. This is a disruptive intervention by an editor who cannot understand or wilfully refuses to understand the answers he has already been given to his false concerns. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 14:41, 8 April 2013 (UTC)

Comments
Please stop with the vitriol. "All of these points" have not been answered. The 2010 and 2012 discussions did not anticipate or discuss any of the points raised here. The 2013 discussions were not real discussions; you have sought to close them with reference to previous consensus, without addressing the points raised. As one example, my point #1 above was raised in the January 2013 discussion. Hellknowz commented there that the "Last centralized RfC [with link to the 2010 RFC] concluded that microformat deployment should be on a case by case basis. The above discussion addresses such a case, so I don't see a problem with Doncram's comments. His point is valid that the same result could be achieved via infobox or MediaWiki enhancement. Except nobody is taking any steps to do so, instead converting individual article infobox fields to "start date"-style templates. — HELLKNOWZ  ▎TALK 16:54, 4 January 2013 (UTC)". There was no reply to Hellknowz in that discussion. If you want the NRHP infobox built= field to be identified as a "start date", with or without ever defining what a "start date" is, that could be done by programming in template:infobox NRHP. It should be simple to implement some code "if the built field contains just a 4 digit year-like number greater than 1758, then put 'start date' around it". You and I don't know how to program that in mediawiki, but my guess is that it is easy for a programmer. The burden is on you to get that done, or prove that it can't be done. This would be far better than a bot run to change 30,000 articles, plus update bot runs later. One big advantage is that if/when there is any central definition of what "start date" is supposed to mean, and it is then apparent that the NRHP built= date is often not appropriate, then the infobox code can be changed to remove it, centrally. -- do ncr  am  16:32, 10 April 2013 (UTC)
 * This is not about "microformat deployment", so Hellknowz' comments are irrelevant. Your proposed change to the code of the infobox would not cater for YYYY-MM or YYYY-MM-DD dates, or for parameters with a YYYY value and a text annotation. There is no need to define a start date. No case has been demonstrated where the NRHP built= date is not "appropriate". But thank you for acknowledging the previous consensus.  Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 19:23, 10 April 2013 (UTC)
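The rule debated in this exchange can be written out as a rough Python sketch. It is not template code and not the proposed bot. The function name `wrap_built` and the `min_year` parameter are hypothetical, and the default cutoff of 1758 is simply the figure quoted in the comment above. A bare four-digit year or a full YYYY-MM-DD date is wrapped in the Start date template; everything else, including values with a text annotation, is returned unchanged. A YYYY-NN value is deliberately left alone here, on the assumption that it cannot be told apart from an abbreviated year range such as "1910-12".

```python
import re

YEAR_ONLY = re.compile(r"^(\d{4})$")
FULL_DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def wrap_built(value, min_year=1758):
    """Wrap a "built=" value in {{Start date}} when it is unambiguous.

    Hypothetical helper: min_year is the cutoff quoted in the discussion,
    not a documented rule of the template or the bot.
    """
    text = value.strip()

    # Case 1: the field is nothing but a four-digit year.
    match = YEAR_ONLY.match(text)
    if match and int(match.group(1)) > min_year:
        return "{{Start date|%s}}" % match.group(1)

    # Case 2: a full YYYY-MM-DD date with a plausible month and day.
    match = FULL_DATE.match(text)
    if match and int(match.group(1)) > min_year:
        year, month, day = match.groups()
        if 1 <= int(month) <= 12 and 1 <= int(day) <= 31:
            return "{{Start date|%s|%s|%s}}" % (year, month, day)

    # Ranges, annotated values, early years and ambiguous YYYY-NN values
    # are returned untouched for a human editor to review.
    return value
```

With these assumptions, `wrap_built("1924")` returns `{{Start date|1924}}` and `wrap_built("1924-05-17")` returns `{{Start date|1924|05|17}}`, while `"1910-25"`, `"1924 (rebuilt)"` and `"1565"` come back unchanged. Whether such a rule belongs in the infobox template or in a bot run is the question being argued; the sketch only makes the cases named by both sides concrete.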

View by Nyttend
We already approved this microformat, so attempting to unapprove it is disruptive. All infoboxes either contain correct dates in this field that should be encased in the template, or they contain incorrect dates that should be changed. Nyttend (talk) 02:55, 8 April 2013 (UTC)

Users who endorse this view

 * 1)  Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 19:42, 10 April 2013 (UTC)

Comments
Where has this previously been approved? The May, 2012 general discussion was general. It was asserted there by Andy Mabbett that "There are already over 85,700 transclusions of (up from 54,500 this time last year); there is no opposition when such changes are made manually; the vast majority - indeed probably all - of the templates in question stipulate the use of "Start date" in their documentation, again with no controversy....". But the NRHP infobox does not include any mention of "Start date" in its infobox, and there has been no real discussion and definitely no consensus that the "built=" field of the NRHP infobox should be used for "start date". What if an NRHP-listed building was in fact built during a range of years, e.g. "built=1910-25" or "built=1910-1925", while the current infobox shows just "built=1910". The bot would put "start date" around 1910. Then when an NRHP editor corrects the bad info by showing a range, should the "start date" be removed entirely? Or should it apply to the beginning of the construction period? Or to the end of the construction period, because the use of the building does not start until then? Or to both? What about buildings for which we know a design was completed at an earlier date (and what if we put "design-date=" into the infobox)? What about when there are multiple NRHP infoboxes in an article? What about churches founded earlier than the built date of an NRHP building? Etc. The May, 2012 discussion assumed there would be no complications or objections. Well, there are complications and objections. What about points #1, #2, #3, #4, #5 above? -- do ncr  am  16:42, 10 April 2013 (UTC)
 * "What if an NRHP-listed building was in fact built during a range of years, e.g. "built=1910-25" or "built=1910-1925", while the current infobox shows just "built=1910". " - that is a pre-existing error, which will not be changed by the requested bot edits. You have already been asked more than once to stop conflating such content errors with the valid work requested of the bot. "the NRHP infobox does not include any mention of "Start date" in its infobox" - that is demonstrably false, see the documentation for Infobox NRHP. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 19:34, 10 April 2013 (UTC)
 * You continue not to understand, or to pretend not to understand, that a built= field can be entirely accurate for its purpose, but not be accurate for "start date" application. You call my attention to your January 3, 2013 edits to the documentation for the NRHP infobox which reflected no consensus of NRHP editors and which went unnoticed by me and probably everyone else (reaching this version);  i have just reverted those (reaching this version).  (The edits didn't make sense anyhow, including some mention of a founding date and so on, for example.  Discuss at the Talk page of that documentation if you wish, or, better, let this RFC conclude first.) -- do  ncr  am  20:02, 10 April 2013 (UTC)
 * Your assertion that "a built= field can be entirely accurate for its purpose, but not be accurate for "start date" application" is (with the rare exceptions acknowledged above and already catered for) entirely without foundation or merit. If you disagree, I invite you to prove (and not merely assert) that that is the case. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 20:20, 10 April 2013 (UTC)
 * And you won't provide a definition anywhere of what is wanted, for "start-date". There is no discussion, no consensus expressed anywhere.  It's a joke: template:end date is the same as template:start date!   But okay, let me define it:  suppose what we want is the date of founding of a church (as the start of the article topic).  Or say we want the NRHP listing date (as the start of the period of NRHP listing).  Then, trivially, the built= date is different and wrong.  You want, or should want, to enlist NRHP editors into putting some reasonable date into a "start date" microformat.  But, instead of facilitating a process of making sense of this, a process of consensus-building about what is wanted, you just dismiss all questions.  You make it clear that there will be no consensus created, so, then, the RFC should be closed as NO, the bot run should not be done. -- do  ncr  am  23:08, 10 April 2013 (UTC)
 * It's not a joke: template:start date supplies the classes "bday dtstart published updated" (line 10); template:end date supplies the class "dtend" (line 10) - each recognised usually, but not exclusively within hCalendar formats. As far as microformats are concerned, the name of the infobox parameter is irrelevant. Nevertheless, you already know the advantage of keeping the number of different infobox parameter names to as small a number as practical. --RexxS (talk) 01:06, 11 April 2013 (UTC)
 * One person sounding off repeatedly and at length, with canvassing and forum shopping, does not mean that consensus cannot be determined. Your claim that "no discussion, no consensus expressed anywhere" is a bare-faced lie; you yourself cite Wikipedia talk:Bot requests/Archive 2 at the top of this page. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 09:31, 11 April 2013 (UTC)

View by Elkman
Why would you start an RFC and then agitate for it to be closed four days later? This is "Requests for comment", not simply "requests for agreement". And calling other people's opinions "a joke" appears to me to be a personal attack. --Elkman (Elkspeak) 00:44, 11 April 2013 (UTC)

Users who endorse this view

 * 1)  Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 09:20, 11 April 2013 (UTC)