Template talk:Citation/core/Archive 8

Quote marks
editprotected Currently the template uses "typewriter" instead of “typographical” quote marks around quotes. This is unnecessary, ugly and trivial to fix. --Divide (talk) 18:14, 25 November 2009 (UTC)
 * (What's needed is to change  to  .) --Divide (talk) 18:17, 25 November 2009 (UTC)
 * The manual of style, however, recommends using typewriter quotation marks: WP:MOS. I don't really have an opinion either way, but if you want to change that recommendation you should start a discussion at WT:MOS first. Have a look into those archives though, I'm sure this has been discussed a couple of times already. Amalthea  18:22, 25 November 2009 (UTC)
 * Ah, I see. Thanks for the nudge, I'll look into it. Just for reference, other instances of quote marks in the template:
 * in addition to
 * mentioned earlier (properly formatted this time ;) ). --Divide (talk) 18:33, 25 November 2009 (UTC)


 * As Amalthea says, this is deliberate. It is in fact necessary to use straight quotes for the sake of find-in-page in current browsers (which don't equate ascii quotes to curly ones) and for the sake of ensuring consistency within articles. Chris Cunningham (not at work) - talk 09:26, 27 November 2009 (UTC)
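The find-in-page point can be illustrated with a small sketch. It uses plain substring matching as a stand-in for a browser's search; the sample strings and the `fold_quotes` helper are made up for illustration and are not part of any template.

```python
# Plain substring search, used here as a stand-in for find-in-page.
STRAIGHT = 'He said "hello" to us.'
CURLY = "He said \u201chello\u201d to us."  # same text with curly quotes


def find_in_page(page_text: str, query: str) -> bool:
    """Return True if the query occurs verbatim in the page text."""
    return query in page_text


def fold_quotes(text: str) -> str:
    """Map curly double quotes to the ASCII quote (hypothetical normalization)."""
    return text.replace("\u201c", '"').replace("\u201d", '"')
```

A reader typing `"hello"` with straight quotes finds it in `STRAIGHT` but not in `CURLY`, unless the search folds the quotes first.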

Order of fields in a bibliography
I have been sent here from the Template:Citation talk page, and will repeat the argument.

My problem is that when I create a bibliographic (not a footnote reference) listing using the template or one of its derivatives, the order of the output fields differs from that derived from other sources. To be more specific: I have my personal bibliography, more or less directly copied and pasted from the Library of Congress online catalog. The fields are presented in order of decreasing importance: Author, Title, Publisher, Date. (Every other style manual I have consulted uses the same ordering; this may or may not be significant, as I have consulted only three.) If a book is not in my personal bibliography, I would like to use the template, but the output is in the order Author, Date, Title, Publisher. If consistency is regarded as a virtue, then I would have to edit all the entries copied and pasted from the LoC, and I am unwilling to undertake this process. With this as background, my question is this: Does a switch exist in the template commands that puts the output into LoC order, or is there another template that I should be using? PKKloeppel (talk) 02:08, 8 December 2009 (UTC)


 * No. You'd either need to have the templates changed (a tall order) or use another template. I've recently been working on a new template cit book, which might do the trick for you: it uses Vancouver system style, which puts the date more or less where you want it. This template is still experimental so I don't recommend using it in articles just yet, but you might take a look at it and see whether it does the sort of thing that you want. Here's an example of usage and its output:
 * Feedback is welcome on Template talk:Cit book. Eubulides (talk) 03:40, 8 December 2009 (UTC)
 * It should be noted that this thread actually began at Template talk:Cite book, and not in Template talk:Citation (a page unchanged since 26 Nov) as implied earlier. --Redrose64 (talk) 11:36, 8 December 2009 (UTC)
 * Thank you, Eubulides. If it will do what I think it will do when complete, it will be just what I wanted.
 * As for the comment by Redrose64: My bad for misidentifying the template where it all began. Sorry about that. PKKloeppel (talk) 15:40, 8 December 2009 (UTC)

Quotes in wrong position bug
(Raised by User:Redrose64 on behalf of User:Lidos, from thread originally at Template talk:Cite web and then re-raised at Template talk:Citation). Basically, the situation is this. When calls  (in section "Title of included work"), quotation marks are added around the outside of  thus: Lidos's question is: would it be possible to move them inside the second parameter of ? The desired effect would be to move the arrowed box symbol outside the quotation marks. --Redrose64 (talk) 11:07, 10 December 2009 (UTC) (pp User:Lidos)

""
 * What you are asking about is basically changing this behaviour (simplified examples):

""


 * I don't think it will be a problem to change this and it should simplify the code somewhat as well. I'll have a go at a local copy and then put it in the sandbox. --Tothwolf (talk) 11:50, 10 December 2009 (UTC)


 * Here is what the test cases look like with the changes I made in the sandbox: Template:Citation/core/testcases --Tothwolf (talk) 13:05, 10 December 2009 (UTC)


 * Thanks for everyone's input to this - and good that a solution has been proposed, let's hope it goes ahead.--Lidos (talk) 09:52, 11 December 2009 (UTC)


 * While testing the quoting changes, I also discovered a logic bug when IncludedWorkTitle and Periodical are used together. The test case where this showed up is the 4th one in the group I linked to. I've fixed this bug in the sandbox as well. --Tothwolf (talk) 16:03, 11 December 2009 (UTC)
 * I've spotted one thing which others may have views on - at present, the quotes are black and the linked text is blue. Moving the quotes will make them part of the linked title, and so will also appear in blue. Is the colour of the quotes important? --Redrose64 (talk) 11:24, 12 December 2009 (UTC)
 * There really isn't any way to change that since adding the quotes outside the link text will place the closing quote after the link icon. It doesn't really matter to me but I did want to see what others thought before it was implemented. --Tothwolf (talk) 13:54, 12 December 2009 (UTC)
 * Having the quotes as part of the link, and thus in the link color, is a small issue and it's worth it to get the external link icon out of the quotes. &mdash; John Cardinal (talk) 15:06, 12 December 2009 (UTC)
 * The only other way I can think of offhand to do this would be to place the link in  and then try to tack on the icon after the close quote. The code would be somewhat complex and ugly though. --Tothwolf (talk) 17:44, 12 December 2009 (UTC)
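As a rough sketch of the two placements discussed in this thread: in wikitext an external link is written `[url text]`, and the external-link icon is drawn at the end of the link text. The two function names below are invented for illustration; this is not the code in Citation/core.

```python
def link_quotes_outside(url: str, title: str) -> str:
    """Old behaviour: the quotes wrap the whole external link,
    so the closing quote lands after the link icon."""
    return f'"[{url} {title}]"'


def link_quotes_inside(url: str, title: str) -> str:
    """Proposed behaviour: the quotes are part of the link text,
    so the icon falls outside them (and the quotes take the link colour)."""
    return f'[{url} "{title}"]'
```

The second form is what makes the quotes blue, which is the trade-off raised above.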

Why is volume bolded?
The MoS recommends that we don't go overboard on markup. The bolding of volume seems to contradict this. On journal-heavy reference sections, this leads to excessive use of bold markup. Chris Cunningham (not at work) - talk 18:23, 17 October 2009 (UTC)


 * I believe that the citation templates try (in places) to stick to the ISO standardised format. I've often seen this point come up, and it is always resolved in favour of keeping the bold for clarity in communication.  Martin  (Smith609 – Talk)  21:03, 17 October 2009 (UTC)


 * The ISO format (ISO 690-2) does not require bold face volume numbers: instead, you can write something like "vol. 42, no. 6, p. 549–560". I would like an option to the citation templates to omit the bold face in volume numbers, as the bold face is an unnecessary distraction. Many high-quality publications omit the bold face, e.g., the Journal of the American Medical Association, PLoS Biology. Eubulides (talk) 22:14, 17 October 2009 (UTC)


 * I don't see that an option is required. If the standard does not require bolding, then we should not bold. Chris Cunningham (not at work) - talk 23:16, 17 October 2009 (UTC)


 * In the past, I have noticed that people complain about this issue when they encounter articles in which the citation templates are used incorrectly, with all kinds of strange information in the "volume" field (and in general abusing "cite journal" for non-journal articles). The "volume" field should contain only the volume number, like "volume=12", and nothing else. It will certainly look very strange if you put anything else in the "volume" field. — Miym (talk) 21:48, 17 October 2009 (UTC)
 * When the volume has its own title, surely that should go in too - as in Vol. 1: The Early Years, see Oldham, Ashton and Guide Bridge Railway for actual examples. --Redrose64 (talk) 10:37, 18 October 2009 (UTC)
 * Looks horrible. I think part of the problem is that traditionally references to journal articles have been formatted differently from references to books, but these citation templates try to use a similar format for both of those. Boldface in journal volumes looks normal; boldface in book volumes is a bit strange. Fortunately, there is a simple solution: just include the volume information in the "title" field, as suggested in the cite book documentation:
 * With cite book, I think the only sensible use of the "volume" field is the following: if you have a book series, then you can use the field "volume" together with the "series" field like this:
 * Even that looks a bit strange, but I can tolerate it as long as the part in boldface is just a volume number and nothing else. — Miym (talk) 11:20, 18 October 2009 (UTC)
 * By the way, here are some previous discussions on the same topic; might be good to check these first:
 * Template_talk:Citation
 * Template_talk:Citation/Archive_2
 * Template_talk:Cite_journal/Archive_4
 * Template_talk:Cite_journal/Archive_2
 * — Miym (talk) 23:52, 17 October 2009 (UTC)

<-- See also Template talk:Cite book, Template talk:Cite book/Archive 5 and Template talk:Cite book/Archive 5

For books I do not think that the volume information should be in bold, and it should be in the format, "vol. #". It should not be included as a kludge in some other field, as most people are, not unreasonably, going to use the volume field if it is there, and it looks odd in italics as it is not (usually) part of the name of the book. PBS (talk) 14:27, 16 December 2009 (UTC)

Escalated render error: items lacking author.
Hi, this is an escalation of a bug reported at Cite Journal.

Compare

Generated from:

Why does the title appear after the page reference without an author, when it appears behind the first major identifier of provenance (the title for authorless works) when there's an author present?

Instead, rendering as suggested below would appear to make more sense:
 * Arnold, Denis (April 1982). "Monteverdi:L'incoronazione di Poppea ed. Curtis" Gramophone (London: Haymarket): p. 88
 * "Monteverdi: L'incoronazione di Poppea" (May 1990). Gramophone (London: Haymarket): p. 123.

Additionally dateformat= isn't documented in any of the templates atm. Fifelfoo (talk) 01:25, 25 November 2009 (UTC)


 * I brought up this issue as well on the cite web talk page. I decided to manually rewrite and remove the cite templates for something in the neighborhood of ten citations because of this. It doesn't seem consistent and I'd like to know what style manual is being followed by using this format.  To be clear, when no author is named the date should follow right after the title, not after the work.


 * Current: →
 * Better: → "Title". (Date). Work.


 * 1 edit. — Lambanog (talk) 11:45, 20 December 2009 (UTC)
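The ordering rule requested in this thread can be sketched as follows. This is a simplified illustration of the suggested rendering, not the template's logic; the function name and the exact punctuation are assumptions.

```python
def render_citation(title, work, date, author=None):
    """Render a simplified periodical citation (illustrative only)."""
    if author:
        # Author present: the date follows the author.
        return f'{author} ({date}). "{title}". {work}.'
    # No author: the title leads, and the date follows it directly.
    return f'"{title}" ({date}). {work}.'
```

In both branches the date sits right after the first identifying element, which is the consistency being asked for.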

Superfluity of stops
Maybe it's me and my templates, but: Seems over-punctuated to me. The "." before "in", the extra "." after "et al.", the extra "." after "(6th ed.)"... maybe the rest would look OK with those three gone. Rich Farmbrough, 22:18, 14 December 2009 (UTC).
 * The template doesn't test (I don't know whether it's reasonably doable) whether a field ends with a dot. That's what is causing “et al..” and similar problems (e.g. ). I changed Template:Fitzpatrick 6, so that it shows correctly. Svick (talk) 23:25, 14 December 2009 (UTC)
 * User:Art LaPella/Citation template double period bug Art LaPella (talk) 18:44, 19 December 2009 (UTC)
 * In some APA citation guides it is supposed to appear as a capitalized In (not italicized) therefore the period would appear to be correct based on that style. The period after (6th ed.) might be correct but maybe the one after medicine is not. So from my perspective the reference should look something like this:


 * → McLean, David I.; Harley A. Haynes. (2003). "Chapter 184: Cutaneous Manifestations of Internal Malignant Disease: Cutaneous Paraneoplastic Syndromes". In Freedberg et al. Fitzpatrick's Dermatology in General Medicine (6th ed.). McGraw-Hill. ISBN 0-07-138067-1.
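The check Svick describes as missing (testing whether a field already ends with a dot) might look like this outside the template language. It is a sketch only; `join_fields` is a hypothetical helper, and whether the same test is practical in template syntax is a separate question.

```python
def join_fields(fields, sep="."):
    """Join citation fields, adding the separator only where a field lacks it."""
    parts = []
    for field in fields:
        field = field.strip()
        if not field:
            continue  # skip empty fields entirely
        if not field.endswith(sep):
            field += sep  # terminate the field once, never twice
        parts.append(field)
    return " ".join(parts)
```

With this check "In Freedberg et al." is left alone, while "McGraw-Hill" gains its single closing stop.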


 * I would add I'm not familiar with the use of a semicolon separating authors and don't know what style uses that. I use the author parameter to avoid this format using the semicolon.  As for the dot, with familiarity the extra period can be accounted for by the person using the template and omitted when that parameter is being filled out.  Effort should be put into correcting formatting the user cannot work around without totally abandoning the template.  Part of the problem is that it is not clear what particular style these cite templates are based on.  Lambanog (talk) 11:23, 20 December 2009 (UTC)
 * I know that I've rattled on before about what our college preferred... but we were told to use a semicolon to separate the elements of a horizontal list. This is backed up by the WP article Semicolon. I can understand that, because we already use a comma to separate lastname from firstname, so we must use a different character to separate the authors. Consider this: "Aldrin, Edwin, Armstrong, Neil, Collins, Michael". Now imagine nine pairs instead of three - looking at the middle of the list, without pairing off from the beginning you might spot "Neil, Collins" and take them as a pair. The semicolon overcomes that: "Aldrin, Edwin; Armstrong, Neil; Collins, Michael" --Redrose64 (talk) 11:35, 20 December 2009 (UTC)
 * That's why it's important to know what style we are supposed to be following. If this was APA only last names are complete and initials are used for first names.  For MLA only the first author's name is inverted last name first and it is understood the rest are in first name last name order. The style being adopted now seems to be some unclear mishmash.  Although for all I know maybe it is Chicago, Turabian, Harvard, ISO, or something else. I've defaulted to APA with MLA author listing. Lambanog (talk) 11:55, 20 December 2009 (UTC)
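The semicolon argument above can be shown with a short sketch: with two distinct separators the list splits into pairs unambiguously. `parse_authors` is an invented name, and the "Last, First" layout is the one used in the example above.

```python
def parse_authors(author_list: str):
    """Split 'Last, First; Last, First' into (last, first) pairs."""
    pairs = []
    for entry in author_list.split(";"):
        # The comma separates surname from forename within one author.
        last, _, first = entry.partition(",")
        pairs.append((last.strip(), first.strip()))
    return pairs
```

With commas alone, the same split has to count from the start of the list to know which names pair up.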

Multiple bugfixes and updates
editprotected

Please copy the updated code from Citation/core/sandbox to Citation/core. This fixes numerous issues, some of which are documented here, here, and here. This update also includes some general refactoring and removal of a large amount of trailing whitespace. Test cases can be found at Template:Citation/core/testcases, Template:Citation/core/testcases, and Template:Citation/testcases. --Tothwolf (talk) 07:15, 5 January 2010 (UTC)
 * If possible, could another editor who is familiar with the workings of this template double-check the proposed changes? &mdash; Martin (MSGJ · talk) 09:33, 5 January 2010 (UTC)
 * I don't see any issues, so I have deployed the changes. —Th e DJ (talk • contribs) 15:31, 7 January 2010 (UTC)

Relevant discussion
There is a discussion occurring at Centralized discussion/Wikipedia Citation Style. Your participation would be appreciated. — V = I * R (talk to Ohms law) 23:28, 18 January 2010 (UTC)

We should never render invalid HTML
(This thread follows up on the "We should never render invalid HTML" comment in the  thread above.)

Unfortunately, with the current citation approach it's all too easy for an article to render invalid HTML. For example, the of the featured article Autism currently generates invalid HTML because it cites these two articles: These articles have the same authors and year and therefore have the same ID " ". And now that I've copied these two citations here, this talk page now has invalid HTML as well. This is a serious problem that should be fixed, for the reasons described in the previous thread.
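To make the collision concrete, here is a minimal sketch of how an ID built only from authors and year repeats. The exact ID format is an assumption (plain concatenation), and the names in the usage note are placeholders, not the Autism citations.

```python
from collections import Counter


def make_anchor(surnames, year):
    """Build an anchor ID from author surnames and year (format assumed)."""
    return "".join(surnames) + str(year)


def duplicate_anchors(citations):
    """Return anchor IDs that occur more than once, in sorted order.

    `citations` is a list of (surnames, year) tuples.
    """
    counts = Counter(make_anchor(s, y) for s, y in citations)
    return sorted(a for a, n in counts.items() if n > 1)
```

Two works by Smith and Jones from 2007 both map to the same anchor, so a page citing both would carry that ID twice.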

One way to fix this problem is to go through all articles such as Autism, and systematically add the (undocumented) none parameter to all invocations of the -family templates, so that these templates do not generate IDs that might clash. However, that would be a lot of work. There are many tens of thousands (perhaps hundreds of thousands) of pages that use the -family templates.

An easier solution is to go through articles that use both Harvard and -family templates, and add a new parameter harv to invocations of  -family templates as needed. This new parameter would mean "please generate an ID suitable for Harvard templates" (the current behavior). The default would be changed, so that by default the -family templates would not generate an ID. On English Wikipedia, there are about 5,000 articles that use both Harvard and -family templates; this is a much smaller number, and it would be relatively easy to fix them with the Citation bot. An advantage of this approach is that it would not generate useless IDs in the vast majority of -family invocations that do not need IDs, and it would thus lessen the amount of unnecessary bloat in the HTML version of Wikipedia articles.

The first step to this solution is to add support for harv to the -family templates. I have done that in the sandbox, with a to Citation/core. I propose that this be installed, and that we then fix the affected articles to add harv as needed. Between the time the change is installed, and the time that an article is fixed, the article's text will still be fine; it's only the intrapage Wikilinks that will be ineffective. This will be a glitch, but it's not a major one.

I'll mention this issue in Template talk:Harvard citation and Template talk:Harvard citation no brackets as well, to give them a heads-up. Eubulides (talk) 19:34, 21 September 2009 (UTC)
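The proposed opt-in default could be sketched like this. It is an illustration of the behaviour described above, not the sandbox code; the `ref` values "harv" and "none" come from this thread, while the function name and the ID format are assumptions.

```python
def anchor_for(surnames, year, ref=None):
    """Return the anchor ID to emit, or None when no ID should be generated."""
    if ref is None or ref == "none":
        return None  # proposed default: no ID at all
    if ref == "harv":
        # Generated ID suitable for Harvard templates (format assumed).
        return "".join(surnames) + str(year)
    return ref  # any other value is treated as a hand-specified ID
```

Under this scheme only citations that ask for an ID get one, so untargeted citations can no longer collide.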


 * Why should we temporarily break the links? Shouldn't it be better to first add the parameter to the articles and only then change the ? Svick (talk) 21:02, 21 September 2009 (UTC)
 * If the parameter is added first as described above, that would break articles that have not been updated. It's possible to come up with a more-complicated scheme, involving a new parameter, that would allow a phase-in period like that. If there's sentiment for this solution I can draft an implementation along these lines. Eubulides (talk) 23:28, 21 September 2009 (UTC)
 * Best practice is to add letters after the years when the same author has had two cited publications in the same year, e.g. and  . Both the Harvard citation template and the cite-family templates support having years with letters added. Would a bot-assisted process be feasible that adds these letters wherever they are needed to differentiate between same-author-same-year sources? On pages that use Harvard citations, it could dump a post on the talk page, asking a human editor to look at the citations and figure out which of the several publications by the author was meant in each case. (That is no great loss, because the Harvard citations would have been ambiguous and possibly wrong to begin with). -- JN  466  21:11, 21 September 2009 (UTC)


 * The Harv templates are used in 5 or 10 thousand articles, so are you proposing to break all of those to fix the few that have multiple links? There has to be a way to address the problem specifically. We only need to touch those few articles that actually have the multiple links. Can we build a bot that will find the second matching link, add   to the template. Is this doable?
 * Yes, as I mentioned, Harvard references are used in about 5,000 articles. The articles that are broken (in the sense that they generate invalid HTML) are not just the Harvard-reference articles: they are all the articles that use cite journal, cite web, etc., including articles that do not use Harvard references. There are many tens of thousands of such articles, far more than there are Harvard-using articles; it's not known how many of these tens of thousands of articles are broken, but I expect that quite a few are. Many articles that contain no Harvard references contain several invalid IDs because of this problem, e.g., the of Daylight saving time. It would be helpful to build a bot no matter what solution is adopted. Eubulides (talk) 23:28, 21 September 2009 (UTC)
 * It is not a good idea to add "none" merely to citations that happen to have a problem now. That is because adding a citation to an article could easily break the article. We should not encourage such a brittle system. It would be somewhat better to add "none" to all citations that are not targets of Harvard references. But this solution is not a good one either, as it places a burden on the large number of uses of cite journal etc. in order to support the relatively small population of Harvard-reference-using articles. Instead, in the long run the burden should be placed on the Harvard-reference-using articles: they should specify which citations need these special IDs. In the short run we may have a transition strategy in which non-Harvard articles continue to be broken while we add parameters to Harvard articles to get things working with them; that's OK so long as the problem is fixed relatively soon. Eubulides (talk) 23:35, 21 September 2009 (UTC)


 * The ideal solution would be for Citation/core to automatically detect whether the link has already been used, and then automatically render the year as  if necessary, in both the link and the visible text. This is standard bibliographic practice.  CharlesGillingham (talk) 22:39, 21 September 2009 (UTC)
 * I know of no way to write a template to do that, and am skeptical that it's possible. The problem is that templates must be self-contained; they cannot vary in behavior depending on whether some other template was invoked earlier in the page. Eubulides (talk) 23:28, 21 September 2009 (UTC)
 * I agree that generating ids for citations that do not need them only clutters our HTML code. I like Eubulides' proposal (sandbox) to include a new parameter to only generate ids when necessary, otherwise the id won't be generated. Then, we should contact a bot to add the new parameter to references that require them and bingo! Articles using harvard references will only be broken for a while. Locos epraix ~ Beastepraix 23:40, 21 September 2009 (UTC)
 * Hi, I mentioned this problem in the Village Pump after encountering it in the Wikipedia Press coverage page. I found it useful to be able to link to a mention of a specific article, but I had to fix the reference "by hand" since there were several articles by Cohen on that page. Please keep in mind that this template is used, not just for academic citations, but for newspaper articles as well, which are hundreds of times more likely to produce collisions than academic papers. For what it's worth, I think ids are great, not "clutter", and we should have as many as possible by default, as long as something like XPointer is not more widely usable. But they shouldn't produce invalid HTML, or they should do so much more rarely. So, for example, using author, year, month, and day is probably an acceptable solution for news articles. --Kai Carver (talk) 09:15, 22 September 2009 (UTC)
 * I've had a look at the sources (using Firefox, view page source is Ctrl-U) of some of the pages mentioned, and can't actually see what is invalid about the HTML. What am I looking for?
 * When I want to cite two different works written by the same author(s) in the same year, I add the month as well to create a unique ID, you can see an example in Abingdon Road Halt railway station, but briefly, it goes like this:
 * Admittedly those are cite book but I use the same technique for cite web. --Redrose64 (talk) 12:43, 22 September 2009 (UTC)


 * Note that you could have done this:
 * Now your references look like standard academic references, and no "coding" was necessary. CharlesGillingham (talk) 17:56, 22 September 2009 (UTC)
 * I must admit that I'd never really noticed sfn until your 16:55 post below. I'll try it, when I come to do Hinksey Halt railway station. --Redrose64 (talk) 18:35, 22 September 2009 (UTC)


 * Run this page through the W3C Markup Validation Service and you will get . This is because the template generated duplicate ids for two different cites. Thus, links to those cites are now ambiguous and the output is invalid. ---—  Gadget850 (Ed)  talk 13:38, 22 September 2009 (UTC)

(outdent) This so-called "invalid" html hype is absurd, and ass-backwards. {citation/core} is not generating invalid html. So whatever the "problem" is, it needs dealing with elsewhere. If the wiki engine allowed &lt;a name="xxx"&gt;, we would be using that instead of &lt;span id=&gt;. But it doesn't, so blame the wiki software for forcing everyone to "violate" the w3c standards and for failing to be backwards compatible. Or sue Tim. Or take it to the pump. Or whatever. But for heaven's sake quit making a pedantic fuss about a non-issue that neither has specifically to do with citation/core, nor is "invalid html", nor a compatibility problem, nor needs to be worried about. This failure to get a green light from the w3c validator is a non-issue. -- Fullstop (talk) 19:15, 22 September 2009 (UTC)
 * A) Citation/core is not generating invalid html. It is bibliographic lists in their entirety that have id collisions. That is the crux of the grossly exaggerated "invalid html" hype, but is not something that citation/core has to be modified to "fix". The issue with duplicate ids is not even specific to bibliographic lists; a talk page with two sections with the same name will also have duplicate ids, and duplicate ids can be generated by anchors as well.
 * B) Duplicate ids are not "invalid html", and never were. Examples of invalid html are &lt;span&gt;&lt;div /&gt;&lt;/span&gt; or &lt;body&gt;&lt;title&gt;...&lt;/title&gt;&lt;/span&gt; or &lt;span id="<">"&gt;, and are not an issue here. The supposition that the w3c validator is identifying duplicate ids as "invalid html" constitutes a misreading/misrepresentation of what the w3c validator does. What the w3c validator does is identify issues that don't comply with w3c standards, and which are then potential compatibility problems. That is all. A failure to comply with w3c standards does not automatically imply "invalid html". All browsers ignore duplicate ids, and that functionality is a de facto standard even if the w3c says not a word about it. All browsers also ignore css that they don't recognize, and so the multicolumn bibliographies still work fine even though they (and indeed every wiki page) have "invalid" css. In both cases non-conformance with w3c standards is not a problem, let alone a "serious" problem. So quit making it one.
 * C) the {cite xyz} family has never generated ids, and since these are apparently the ones with the problem, the id generation should be moved from {citation/core} to {citation}.
 * D) ids are only meaningful when both author/editor+  (not to be confused with   ) have been explicitly specified, and ids should not be generated if either of these two is missing.
 * E) duplicate ids are not specific to bibliographic lists, and are in any case an article issue, not an individual citation issue. So, if duplicate ids are really a problem, then the wiki backend (to include tidy) should be sanitizing them, in the same manner that other problematic html is sanitized.


 * Thanks for the suggestion of generating IDs for Citation, and disabling them for cite journal/cite book/etc. This is an alternative solution that, although obviously not perfect, is an improvement and would fix all instances of the problem that I've seen. I have followed up in below. Eubulides (talk) 21:26, 22 September 2009 (UTC)


 * I don't have an opinion about whether or not core should be changed to try and avoid duplicate IDs, however, I disagree with your argument that HTML with duplicate IDs is not invalid. The HTML 4.01 specification clearly states that the ID attribute "assigns a name to an element. This name must be unique in a document." The term "invalid HTML" has a specific meaning and while it is sometimes used incorrectly, it was used properly in this case. &mdash; John Cardinal (talk) 20:17, 22 September 2009 (UTC)


 * Agreed. Besides, whether the problem is called "invalid HTML" or "failure to conform with W3C standards" is a terminology issue that is not that important here. There is widespread consensus that Wikipedia articles in HTML form should conform to the W3C standards for HTML. Also, failure to get a green light from the W3C validator is an issue. The validator's messages about duplicate IDs make it harder to find real problems, regardless of whether one agrees with the hypothesis that the duplicate-ID diagnostics are spurious. Eubulides (talk) 21:26, 22 September 2009 (UTC)
 * Fair enough. Sounds sensible. The point is that standards should be treated as a means to an end, and not an end in and of themselves. -- Fullstop (talk) 22:37, 22 September 2009 (UTC)

Removing the id as a default
To separate out some of the discussion... ---— Gadget850 (Ed)  talk 13:42, 22 September 2009 (UTC)


 * Are ids needed for non-Harvard cites?


 * No, an ID that is defined but never used is not useful. Eubulides (talk) 14:51, 22 September 2009 (UTC)


 * Should ids be removed as the default?


 * Yes, that's the proposal. They should be generated only when needed. There seems to be some sentiment for this. Eubulides (talk) 14:51, 22 September 2009 (UTC)
 * It would be better if this can be accomplished without temporarily breaking thousands of articles. The link should still work for the 5,000+ articles that use Harv* or sfn. I realize that this may be impossible given the current technology. I'm only recommending that we exercise caution and due diligence before undertaking any fix.


 * The article on Albert Speer, for example, uses a strict shortened footnotes format, and makes extensive use of Harvnb. Note that the articles that use Harv tend to be long, well developed articles. The shortened footnotes style in particular tends to be used when there are a large number of citations. (It provides a method for citing multiple pages of the same source in a way that looks fairly normal to most academics.) The ideal solution would allow an article like Albert Speer to continue to function correctly without any edits whatsoever.  CharlesGillingham (talk) 16:55, 22 September 2009 (UTC)


 * I agree that caution is appropriate. Fullstop has suggested a method that would not require any changes to Albert Speer, and I've followed up in  below. Eubulides (talk) 21:26, 22 September 2009 (UTC)


 * Should duplicate ids be corrected?


 * The standard academic method for dealing with identical surname-year combinations is to add a letter after the year, like so: "Johnson (2007a)". This method is the only "normal" solution and works for the print version as well as the on-line version of the encyclopedia. Therefore, a duplicate ID is only generated by Citation/core when an editor has used the year parameter "incorrectly". The editor should have marked the second year with an "a".


 * There are two ways to fix these. First, this standard method should be described in the documentation for Harv. Second, could these duplicates be fixed by Citation bot?


 * Note that this proposal need not conflict with the previous -- it only needs to fix duplicate refs in articles that use Harv* or sfn. CharlesGillingham (talk) 17:41, 22 September 2009 (UTC)


 * I don't see how the Citation bot could fix a duplicate, since there's genuine ambiguity as to which citation is meant. The method of appending "a" should be described in the harv documentation, as it is standard in academic sources that use Harvard format. Just to be clear, though, the appending-"a" method is not necessary and typically is not used in academic sources that use numbered footnotes. Eubulides (talk) 21:26, 22 September 2009 (UTC)


 * The bot would only update the year parameter of Citation, in the order that duplicates are found. This is completely algorithmic. (In articles that use Harv, the citations are listed at the end of the article in alphabetical order, i.e., a bibliography.) Editors would need to fix the Harv calls in the text. Note that these were ambiguous to start with and needed to be fixed anyway. CharlesGillingham (talk) 22:08, 22 September 2009 (UTC)
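
The bot pass described above can be sketched roughly as follows. This is only an illustration, not Citation bot's actual code: the function name `disambiguate_years` and the plain `(surname, year)` tuples are assumptions made for the example, and it labels every member of a duplicate group ("a", "b", ...) in bibliography order, which is one reading of the academic convention. The Harv calls in the article text would still have to be matched up by hand.

```python
from collections import defaultdict
from string import ascii_lowercase


def disambiguate_years(citations):
    """Append a, b, c... to the year of citations sharing a surname-year pair.

    `citations` is a list of (surname, year) string tuples in bibliography
    order (hypothetical input format). Entries whose surname-year pair is
    unique are returned unchanged. Raises IndexError for more than 26
    duplicates of the same pair.
    """
    # First pass: count how often each surname-year pair occurs.
    counts = defaultdict(int)
    for key in citations:
        counts[key] += 1

    # Second pass: hand out suffixes in the order duplicates are found.
    seen = defaultdict(int)
    result = []
    for surname, year in citations:
        key = (surname, year)
        if counts[key] > 1:
            suffix = ascii_lowercase[seen[key]]
            seen[key] += 1
            result.append((surname, f"{year}{suffix}"))
        else:
            result.append((surname, year))
    return result
```

For example, two "Johnson 2007" entries with a "Smith 2001" between them would come back as "Johnson 2007a", "Smith 2001", "Johnson 2007b".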


 * Should the algorithm be removed to allow hand-crafting or beefed up?


 * I don't think anybody is proposing that the algorithm for generating IDs be removed, only that it generate IDs only when needed. The templates already allow hand-crafting, no? as one can specify the ID by hand as a ref parameter to cite journal etc. I'm not sure how exactly the algorithm would be beefed up; can someone give an example call to harv and cite book to show what exactly is meant? Eubulides (talk) 14:51, 22 September 2009 (UTC)


 * Forcing editors to use the ref parameter seems like a big step backwards to me. It requires each editor to develop their own naming convention and keep the names in sync. The current system is elegant and simple to use.  CharlesGillingham (talk) 17:08, 22 September 2009 (UTC)


 * Should the footnotes documentation recommend HTML validation to catch id duplication?


 * Yes, that's a good suggestion. Eubulides (talk) 14:51, 22 September 2009 (UTC)

Harvard IDs for Citation, not for Cite xxx
Based on a suggestion by Fullstop above, I'm proposing that the change be applied only to cite journal, cite web, etc., and that it not be applied to Citation. Hence, no changes will be required for the common case of an article such as Albert Speer that uses harv or harvnb in combination with Citation.

The revised proposal is to add a new special value  for the ref parameter of Citation, cite web, cite journal, etc. This value will cause these templates to generate an ID suitable for harv and harvnb. The default is harv for Citation, and none (i.e., do not generate an ID) for the other templates. Hence Citation's default behavior will remain unchanged, and the default for cite web etc. will be changed to not generate an ID unless a ref parameter is specified.

To implement this, apply to Citation/core, and  to Citation. No change is needed to the other templates. Eubulides (talk) 21:26, 22 September 2009 (UTC)
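
As a rough illustration of the proposed behavior (not the template code itself), the decision can be sketched in Python. The function name `resolve_anchor` and its arguments are hypothetical. The ID format shown, "CITEREF" followed by the concatenated last names and the year, and the limit of four names are assumptions made for the example.

```python
def resolve_anchor(template, last_names, year, ref=None):
    """Return the anchor ID a citation would carry, or None for no ID.

    `template` is the template name ("Citation", "cite book", ...).
    `ref` is the editor-supplied ref parameter; None means it was omitted.
    """
    if ref is None:
        # Proposed defaults: Citation keeps generating a Harvard-style ID,
        # the cite xxx templates generate nothing.
        ref = "harv" if template.strip().lower() == "citation" else "none"

    if ref == "none":
        return None
    if ref == "harv":
        # Assumed ID format: CITEREF + up to four last names + year.
        return "CITEREF" + "".join(last_names[:4]) + str(year)
    # Any other value is treated as a hand-crafted ID and used verbatim.
    return ref
```

Under this sketch, `resolve_anchor("Citation", ["Speer"], 1970)` yields an ID that harvnb can link to, while `resolve_anchor("cite book", ["Speer"], 1970)` yields none unless `ref="harv"` is passed.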


 * Well, this will certainly work. But I have to say, it depresses me that we're reforking something that was unforked, if you'll excuse the expression. The documentation has only recently caught up with the fact that Harv works with cite book. The documentation will have to be re-updated. If this is the only solution we have, then so be it, but it does create some doc work. CharlesGillingham (talk) 22:00, 22 September 2009 (UTC)


 * Yes, to some extent this proposal exploits the fact that Harvard references didn't work with cite book etc. until recently. I will volunteer to do the doc work; it shouldn't be that much. Eubulides (talk) 22:19, 22 September 2009 (UTC)


 * I just now checked the Featured article candidates for HTML validation errors, and found a ton of them due to this problem. Also, as suggested above, I updated the documentation for citation, cite web, etc., to document the proposed behavior. Since there is consensus I have added editprotected below. Eubulides (talk) 05:56, 26 September 2009 (UTC)

-

As described at the start of this subthread, please apply to Citation/core, and  to Citation. Eubulides (talk) 05:56, 26 September 2009 (UTC)
 * Done, albeit with a little detour... Skomorokh  07:31, 26 September 2009 (UTC)
 * Thanks. I verified the fix by looking at the ten articles most recently nominated to be featured articles, at WP:FAC. Of the ten articles, 8 formerly had HTML that was not valid because it had duplicate IDs. This patch fixed 7 of the 8, so that they now conform to the standard. The patch did not introduce any problems that I could see. (I fixed the remaining article by hand; its problem was independent of this change.) So this is a sign that the patch is working in real-world articles. The 7 fixed articles were Well Dunn, Neverwinter Nights 2: Mysteries of Westgate, Trump International Hotel and Tower (Chicago), Anarchy Online, Tiananmen Square self-immolation incident, Ethan Hawke, and 1982 British Army Gazelle friendly fire incident. The article I fixed by hand was John Tavares (ice hockey). The 2 articles that were already OK were Sydney Riot of 1879 and Bramall Hall. Eubulides (talk) 08:11, 26 September 2009 (UTC)


 * John Tavares (ice hockey) had duplicate ids due to table markup, not this template. Otherwise, this seems to have resolved the issue. What documentation needs to be updated? ---— Gadget850 (Ed)  talk 12:29, 26 September 2009 (UTC)
 * I updated all the documentation I know about. This includes documentation pages for citation and the Cite templates (cite web, cite news, etc.). Eubulides (talk) 17:00, 26 September 2009 (UTC)
 * Sorry, but this edit has broken cite book, where sfn is used, and the browser is Firefox 3.0. I was pointed in the direction of that template by CharlesGillingham on 22 September 2009; I tried it out on new article Hinksey Halt railway station which worked perfectly - at the time. On 25 September 2009, between 16:38 and 17:46 (UTC), I tried the same method on an existing article - List of rail accidents in the United Kingdom. To that article, I added three cite book templates, without using ref; I also added a number of in-line references using sfn. When the final one had been added, I tested them by making sure that each linked from the text to the entry in the reflist, and further, that clicking such an entry would move to the relevant cite book row. It did, at the time (I also noticed during such testing that when using Firefox 3, the row moved to is also highlighted in pale blue; Internet Explorer 7 moved but didn't highlight).
 * [pause for breath]
 * Now, IE7 will move, but Firefox won't. Viewing the page sources shows that the lines generated for the cite book differ in the first . Specifically, there is an   field present in IE but absent in Firefox, that is the intended target of an   (xxxxxx represents the authors, and 9999 the year).
 * Does this mean that since we should be browser-independent, I've now got to go and search for where I've used sfn and change these all to the  style, at the same time adding a refid to all the cite book? --Redrose64 (talk) 21:29, 26 September 2009 (UTC)


 * No, all you need to do is add harv to the invocations of cite book, as I've done to the two articles you mentioned . Admittedly this is a downside of the solution that was adopted, but it seemed like the least worst thing to do, since it's only recently that sfn began to "work" with the Cite family templates. Eubulides (talk) 21:51, 26 September 2009 (UTC)

Problem introduced by above changes?
Before,  the following properly linked the   to the  and now it doesn't link. (Example from Fermi-Dirac Statistics.)

For a system of identical fermions, the average number of fermions in a single-particle state $$i$$, is given by the Fermi-Dirac (F-D) distribution,

==References==



==Footnotes==

--Bob K31416 (talk) 21:53, 26 September 2009 (UTC)


 * I found the fix discussed above: ref=harv . --Bob K31416 (talk) 01:00, 27 September 2009 (UTC)

Fixing what's broke
To find the articles that are broken, we need to find all those articles that use any of harv* or sfn and also use any of the cite * (or rather, the specific list of cite templates that use Citation/core). Does anyone know how to do an intersection search for these? It would be nice to have a list of the broken articles. CharlesGillingham (talk) 04:46, 28 September 2009 (UTC)


 * Few articles use sfn and I have fixed all of them that had a problem. Harv* is a bigger problem of course. For starters, what is the complete list of the harv* templates, and the commonly used redirects to these templates? Eubulides (talk) 05:57, 28 September 2009 (UTC)
 * I think that CharlesGillingham might be asleep, judging by his user page and also normal edit times... so I'll make some suggestions to begin with, if I may. This is possibly not an exhaustive list, but Template:Sfn/doc states
 * Editors editing this template are requested to consider making parallel changes to Harv, Harvnb and Harvs.
 * So, I'd begin with those. Note that these are redirects to Harvard citation, Harvard citation no brackets and Harvard citations respectively, so these also need searching for; and judging by the documentation for those, there is also Harvtxt which redirects to Harvard citation text. Total 8. --Redrose64 (talk) 11:28, 28 September 2009 (UTC)
 * Looking at Special:PrefixIndex/Template:Harv, there are also harvardnb (used in one article), harvcolnb and harvcol. Also, a list of redirects is not necessary, as both Special:WhatLinksHere and the API understand them. Svick (talk) 12:12, 28 September 2009 (UTC)
 * Well, you learn something every day. I didn't know about Special:PrefixIndex and I expect there are a whole bunch of useful tools similarly unknown to me. I overlooked harvardnb etc. because they're not mentioned in the documentation that I went through. I listed the redirects being unsure of the mechanism; I had presumed that somebody would be searching for eg. Harvard citation and would need to search for Harv as a separate operation; after all, Eubulides did ask for "the commonly used redirects to these templates". --Redrose64 (talk) 13:57, 28 September 2009 (UTC)
 * Using the API, I compiled list of pages in the main namespace that transclude both one of the  templates  and one of the   templates . It is accessible at http://svick.aspweb.cz/Harv.aspx and I can also easily generate this list in another format or extend that web interface e.g. by ability to mark pages as checked, if anyone wanted. Svick (talk) 17:20, 28 September 2009 (UTC)
 * I sampled the first ten articles in that compilation. Of those articles, two had breakages due to this change: 16"/45 caliber Mark 6 gun and `Abdu'l-Bahá . The latter article had several related errors, which broke Harvard references for reasons independent of the recent change; I fixed all these errors while I was at it. Four articles also had errors that broke Harvard references, regardless of the recent change (see fixes, , , ). Five articles had working Harvard references which were not affected by the change.
 * From this small sample we can guess (1) Harvard references are commonly broken, regardless of this recent change (breakages observed in 50% of sampled articles), and (2) the change made things somewhat worse (breakages due to the change occurred in 20% of sampled articles).
 * If we're going to write a bot, I suggest writing one that catches all occurrences of IDs used but not defined (the breakages mentioned above), and all duplicate definitions of IDs (the breakage that prompted this change). That should be fairly easy to write. I don't offhand see how to write a bot that catches only the breakages caused by the recent change, and anyway such a bot would be less useful than the more-general bot.
 * Eubulides (talk) 22:17, 28 September 2009 (UTC)
 * Hey, I'm awake now. Thanks for compiling the list, Svick. (Yes, to the best of my knowledge, that is the correct list of Harv templates. Note that Harvard reference is deprecated, and that Harvardnb is actually a sandbox edit, and should be deprecated.)
 * Thanks for all your fixes Eubulides. I fixed another ten, starting at Wotanism. No problems (3), unrelated problems (4), this problem (3). Combining our data, I estimate that this change broke 761 articles. However, three or four thousand articles are probably broken for other reasons. There is a lot of broken plumbing in Wikipedia, let's face it.
 * I think that Eubulides's suggestion for a bot is excellent. It would be great if we had a constantly-updated list of articles with broken "intra-article" links, such as: (1) Link with no anchor (2) anchor with no link (3) duplicate anchor. (Is there a Wikiproject that deals with this sort of thing?) And, by the way, I'm actually going to bed now, believe it or not. CharlesGillingham (talk) 06:26, 29 September 2009 (UTC)
 * I think this could fall into WikiProject Check Wikipedia's scope. They have a bot that periodically checks the database dump for various errors. I will probably write a script that will filter the list I generated for articles that have mentioned errors in HTML. But it is unacceptable to do this for all articles. That would have to be done using database dump and that would mean finding errors in wikitext, not HTML, which is harder. We could ask sk, who runs the WikiProject Check Wikipedia bot, or I could try to do this myself, but that could take a while. Svick (talk) 10:13, 29 September 2009 (UTC)
 * I finally generated the second list: articles from the first list that have the errors you mentioned. The list is available here: http://svick.aspweb.cz/Harv2.aspx. Because I don't think that IDs without links (e.g. using without ) are a serious problem, the list doesn't contain articles that only have this problem, but that can be changed by clicking a link. Every listed article has a details page that lists all its problems and links to that article. Please report if you have any kind of problem with the list. Svick (talk) 14:32, 3 October 2009 (UTC)
 * That is impressive. I looked at several of the articles, and they were all broken in similar ways. There are 2100+ articles, so this will take years to fix. Is there a way to generate this list myself, in six months, when I've fixed a number of them? I wonder how many could be fixed by a bot that rolled through adding ref=harv to cite *?  CharlesGillingham (talk) 17:04, 13 October 2009 (UTC)
 * I removed fixed articles from the list today and plan to do this periodically once a month. I understand that relying just on me isn't the best idea, so I can publish the program that compiles the list and its source code. But I think it wouldn't be of much use to anyone except me in its current state, mainly because there is no frontend except the ASP.NET one at and it's relying on MS SQL. So, I'm thinking about generating the list in some format more useful for someone else. I guess outputting the first N invalid articles in wikitext format (so that you can e.g. save it to your userspace and use it from there) would be sensible. What do you think? Svick (talk) 01:26, 14 October 2009 (UTC)
 * I changed the site, so that it updates automatically every week, so this should be solved. Svick (talk) 13:10, 25 October 2009 (UTC)
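
For anyone who wants to check a single article themselves, the three kinds of breakage listed above (link with no anchor, anchor with no link, duplicate anchor) can be detected from the rendered HTML with a short script. The following is a minimal sketch, not the program behind the lists linked above. The names `AnchorCollector` and `check_anchors` are made up for the example, and it assumes that fragment links and id attributes use exactly the same encoding, which may not hold for every page.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every id attribute and every same-page link target."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()   # id value -> number of definitions
        self.targets = set()   # fragments referenced by <a href="#...">

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "id":
                self.ids[value] += 1
            elif tag == "a" and name == "href" and value.startswith("#") and len(value) > 1:
                self.targets.add(value[1:])


def check_anchors(html):
    """Return the undefined, duplicate and unused IDs found in `html`."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    defined = set(collector.ids)
    return {
        "undefined": sorted(collector.targets - defined),   # link with no anchor
        "duplicate": sorted(i for i, n in collector.ids.items() if n > 1),
        "unused": sorted(defined - collector.targets),      # anchor with no link
    }
```

Running `check_anchors` on the saved HTML of an article gives three sorted lists. As noted above, the "unused" list is probably the least serious of the three.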

I just found this discussion. Let me get this straight. You broke the formatting on 2100 articles a month ago, there is still no plan to fix them all, and you think this was a good idea? All for the purity of your precious bodily fluids html correctness which no browser even cares about? —David Eppstein (talk) 06:49, 25 October 2009 (UTC)
 * If you'll see the analysis of the small sample of ten articles, you'll see that most likely far fewer than 2100 articles were actually broken by the change. My own impression from that small sample is that the number of errors in using Harvard references is so large that the number of articles newly broken by this change were in the noise. Nobody was happy about this breakage, but the alternative was breaking the much, much larger number of articles that do not use Harvard references. Eubulides (talk) 07:21, 25 October 2009 (UTC)
 * My best estimate is that about 1000 (plus or minus 400) templates were broken by this change. I think there are about another 1000 that were broken already. I can explain why I think so, based on various numbers given above, but I think a better idea would be to take a random sample of Svick's list of 2100 broken articles. It's difficult to get a precise count, because this change may have further broken some articles that were already broken.
 * My principal disagreement with this change is that it complicates the process of creating shortened footnotes and forces new editors to search guidelines and talk pages for obscure and opaque bits of "computerese" such as  before they can figure out how we cite multiple pages of the same book. This sort of thing should be handled by the code, not the editor.   CharlesGillingham (talk) 23:36, 25 October 2009 (UTC)

Could one of you familiar with the technical details figure out how to make the 2100 compliant with the change, and put in a bot request to fix it please? Skomorokh, barbarian  21:32, 2 November 2009 (UTC)


 * Here is an example correction to Adhesive. The problem (fixed for this one case) is that cite book and harvnb need to agree on the list of last names to embed in the generated CITEREF identifier.LeadSongDog come howl  21:21, 27 January 2010 (UTC)
 * Another is fixed at Amon Henry Wilds by adding ref=harv to each entry in the bibliography. Mostly, this could be easily automated without fear of further breakage, at least for works with well-structured author lists. One problem case was a corporate author that was listed with other that I changed to last. Such cases should be addressed manually. LeadSongDog come howl  22:27, 27 January 2010 (UTC)
 * Funny cases, where harv is unsuitable because there is neither author nor editor, can sometimes be fixed by the use of or its alias . Here is an example borrowed from James Cudworth of an anonymous, although dated, online text:
 * This could be referenced using either
 * or
 * I left the URL clickable, so that you can check that it really is anonymous. -- Red rose64 (talk) 22:57, 27 January 2010 (UTC)
 * Yes, it's odd that for some reason we rarely fill in Anonymous even if it would make sense. As an aside though, I notice in that Cudworth instance the cited website in turn cites John Marshall. Biographical dictionary of railway engineers. 2nd ed. London: Railway & Canal Historical Society, 2003. ISBN 9780901461223 [Originally published Newton Abbot: David & Charles, 1978. ISBN 9780715374894]. Given the number of errors on the website, the book might be a more reliable source to use. LeadSongDog come howl  01:18, 28 January 2010 (UTC)
 * I have a copy of Marshall; and not all the material on that web page comes from that source. I agree that some is in error. -- Red rose64 (talk) 12:12, 28 January 2010 (UTC)