User:Jason Quinn/Cite template parameters do have a mostly determined logical ordering

Wikipedia is based on textual data. Every article is ultimately created by editing text, so-called "wikitext". Some of that wikitext consists of "templates" that process text. One of the most common kinds of template is our family of cite and citation templates, used to aid in formatting the references in our articles. These cite templates consist of the template name followed by a sometimes large number of template parameters for things like the title, the author names, and so on. Editors usually pay no attention to the ordering of these parameters, but they should. This essay explains why and suggests a nearly "perfect" way of ordering the parameters that "just makes sense". The main idea is that parameters that go together should be grouped together and that information should roughly follow the order of the presentation.

Key big ideas

 * The cite template parameters should roughly be in the order in which they tend to display.
 * Parameters that are closely related should be closely grouped.
 * Very roughly speaking, the ordering of the parameters should flow from general to specific (e.g., from information about the volume itself, like author, title, and year, to information about the location within the volume, like page) and from important to less important (the author is clearly of very high importance, while an LCCN is more candy than key).
 * Parameters with information intended to be read by humans should be near the front, and computer-readable parameters (like URLs) should be near the back. I like to use the parameter url as a signal to editors that everything from there until the end of the reference can be semi-ignored while copy-editing the source.
 * Spacing matters in cite templates! There is an almost objectively best choice for spacing, which is to write a parameter (say, one whose value is Bloggs) with a space before the pipe and not in any other permutation of the spacing. This is the best because of how text-wrapping works, for instance with Wikipedia's traditional source editor. All the other variations make the source code harder to parse and are prone to leading editors into errors because the text wrapping is confusing. This is one point where editors might object that it's a pointless change, so I try not to make spacing changes by themselves, but it's also important for an article's source to be consistent in style. Articles almost never are. So, preemptively, to editors who wish to object to tweaks in spacing: if an editor is fully copy-editing an article's references, let them work in the style they like. Almost nobody does this dull, laborious task of making articles consistent, so unless you wish to do it too, don't bother somebody who's actually doing the valuable work just because you don't like the way they go about it.

Specific smaller ideas
Many of my edits include changes like these:
 * As per the cite template documentation, use the parameters last and first rather than author when appropriate.
 * If there is only one author, last and first should be used and not last1 and first1. Conversely, if there's more than one author, then last1 and first1 should be used and not last and first.
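The numbering rule above can be sketched in Python. This is only an illustration, not an existing Wikipedia tool: the function name `normalize_author_numbering` and the idea of holding a citation's parameters in a plain dict are assumptions made for the example, and it does not handle a citation that contains both last and last1.

```python
import re

# Matches last, first, author-link and their numbered forms (last1, first2, ...).
AUTHOR_PARAM = re.compile(r"(last|first|author-link)(\d*)")


def normalize_author_numbering(params):
    """Return a copy of params with the first author's numbering made consistent.

    Hypothetical helper: params maps cite parameter names to their values.
    """
    # Collect the author numbers in use; an unnumbered name counts as author 1.
    numbers = set()
    for name in params:
        match = AUTHOR_PARAM.fullmatch(name)
        if match:
            numbers.add(int(match.group(2) or 1))

    single_author = numbers == {1}
    result = {}
    for name, value in params.items():
        match = AUTHOR_PARAM.fullmatch(name)
        if match and int(match.group(2) or 1) == 1:
            # One author: last/first. Several authors: last1/first1.
            name = match.group(1) if single_author else match.group(1) + "1"
        result[name] = value
    return result
```

With a single author, last1 and first1 become last and first. As soon as a second author such as last2 is present, an unnumbered last becomes last1.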

Motivating example
When the author name is given with last and first, why would you not want to keep those two things close together in the list of parameters? Unfortunately, it is not uncommon to see them separated "in the wild". The problem is that an editor seeing last by itself might think that first is missing, and when the parameter list is long, it can be difficult to find the matching first and easy to overlook it, which causes them to add another one. In the end it just causes wasted time and effort. They go together like peas and carrots and should be adjacent.

It's less clear in which order the author parameters should appear. Should it be last then first, or first then last? Here there's more leeway, but using last first is better. Why? Two reasons. It matches the most common presentation order of the cite template, and this helps locate the source text for a given reference in the article. It also helps to alphabetize references by author last name, as is often done in the References and Bibliography sections.

There's one more parameter to mention, author-link, which is commonly used and associated with an author. It should come after the name parameters such that the ordering is,


 * last first author-link

There's a common alias for author-link, which is authorlink. I actually prefer the dashless version because it's less visual clutter, but I am flexible either way with this. More important is that the article is consistent in its usage throughout.

With this idea explained in slow detail, let's move on to the first "group" of parameters that has a rationale behind its ordering: the "Authors group". Each of the following groups could have a similar explanation of why it is a group and why the suggested order is what it is, but I will mostly be brief.

Authors group
The parameters related to an author should go together like so


 * last first author-link

or, if there's more than one author,


 * last1 first1 author-link1 ... lastN firstN author-linkN

where "N" stands for the number of the last author.

There are several assumptions here, justified in the motivating example above:
 * author-link's go after the names.
 * last comes before first.

Lastly, sometimes you'll see display-authors, which should obviously come after the actual author names.
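As a rough illustration, the Authors group ordering can be generated for any number of authors. The function name `author_group_order` and its arguments are hypothetical, made up for this sketch. It combines the ordering above with the numbering rule from the "Specific smaller ideas" section.

```python
def author_group_order(author_count, display_authors=False):
    """List the Authors group parameter names in the suggested order."""
    if author_count == 1:
        # A single author is left unnumbered.
        names = ["last", "first", "author-link"]
    else:
        names = []
        for n in range(1, author_count + 1):
            # Each author's name parts stay adjacent, link last.
            names += [f"last{n}", f"first{n}", f"author-link{n}"]
    if display_authors:
        # display-authors follows all of the actual author names.
        names.append("display-authors")
    return names


print(author_group_order(2, display_authors=True))
```

A real citation would of course only include the parameters that actually have values, for example dropping author-link where the author has no article.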

Editors group
Similarly, after the authors should go the editors.


 * editor-last editor-first editor-link

for just one editor, or, if there's more than one,


 * editor1-last editor1-first editor1-link ... editorN-last editorN-first editorN-link

The same advice about numbering for authors applies to editors: if there's just one, don't number the parameters, and if there's more than one editor, number them all, starting with the first.

Url group
Common cite template parameters are url and access-date, the latter of which should only be used if url is present. These two parameters go together like peas and carrots. It makes no sense to have them far apart from each other; they should be next to each other. In fact, if they are far apart, it introduces several problems. It makes noticing that both are there much harder, so sometimes editors will change the value of url without updating access-date because they didn't even notice access-date was used. When they are randomly scattered, it also slows down the human parsing the cite template, which makes editing less productive. It also makes sense that access-date goes after url because it depends on the url. Conclusion: these two parameters should appear in this order:


 * url access-date

Similarly, there's often this pair


 * archive-url archive-date

which clearly belong together in that order.

You'll also see url-status. This most tightly refers to url, so it should be closer to it, and the logical ordering should be:


 * url access-date url-status archive-url archive-date
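The points made about the Url group lend themselves to a small checker, sketched here in Python. Everything in it is an assumption for illustration: the name `check_url_group`, the wording of the messages, and the use of the hyphenated access-date spelling. It flags an access-date without a url, a group that is out of order, and a group that has been split up by unrelated parameters.

```python
URL_GROUP = ["url", "access-date", "url-status", "archive-url", "archive-date"]


def check_url_group(param_names):
    """Return a list of problems found with the Url group parameters.

    param_names is the list of parameter names in the order they
    appear in the citation's source text.
    """
    problems = []
    present = [name for name in param_names if name in URL_GROUP]

    if "access-date" in present and "url" not in present:
        problems.append("access-date used without url")

    # The group's members should follow the suggested relative order.
    expected = [name for name in URL_GROUP if name in present]
    if present != expected:
        problems.append("URL group out of order")

    # They should also be adjacent, with no unrelated parameter between them.
    positions = [i for i, name in enumerate(param_names) if name in URL_GROUP]
    if positions and positions[-1] - positions[0] + 1 != len(positions):
        problems.append("URL group is split up")

    return problems
```

For instance, a citation whose parameters run url, title, access-date would be reported as split up, which is exactly the situation where an editor might update url and never notice the access-date.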

Publisher group
There's another obvious pair


 * publisher location

Title group

 * title trans-title edition chapter

It's obvious that trans-title, if it exists, should come directly after title. Also, edition should be very closely connected to title.

The use of chapter is rarer but not uncommon. It is kind of an anomaly because it describes a part of a source instead of the whole source. But due to the way it is rendered, this needs to be part of the title group.

Date group

 * date orig-date

OR
 * year orig-year

A preliminary sketch for a group ordering
[Authors group] [Editors group] [Date group] [Title group] [Publisher group] [Url group]

This group order basically follows the citation presentation for both CS1 and CS2 cite templates. It really helps source code editors to have the cite parameters match up roughly with the cite template's presentation.

One thing to point out here. It is much nicer to have the URL group near the end of the citation. URLs tend to be long strings of semi-random characters. Putting most of the material of human interest before the URL stuff helps ensure it's read and scrutinized.

But we are missing some key parameters here, especially things like metadata related to the source (ISBNs, DOIs, etc.). Generally we will want to group these together, but this requires more discussion.
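To make the preliminary group ordering concrete, here is a minimal Python sketch that reorders a citation's parameters by group. It is an illustration only: the names `GROUPS`, `sort_key`, and `reorder` are invented for it, display-authors is given its own slot so it lands after all author names, and parking unrecognised parameters (such as the metadata just mentioned) right before the Url group is an assumption of the sketch rather than a settled rule.

```python
import re

# (group label, ordered parameter patterns). A "(\d*)" part captures the
# author or editor number so that each person's parameters stay together.
GROUPS = [
    ("authors", [r"last(\d*)", r"first(\d*)", r"author-link(\d*)"]),
    ("authors-tail", [r"display-authors"]),
    ("editors", [r"editor(\d*)-last", r"editor(\d*)-first", r"editor(\d*)-link"]),
    ("date", [r"date", r"orig-date", r"year", r"orig-year"]),
    ("title", [r"title", r"trans-title", r"edition", r"chapter"]),
    ("publisher", [r"publisher", r"location"]),
    ("other", []),  # assumption: unrecognised parameters go here
    ("url", [r"url", r"access-date", r"url-status", r"archive-url", r"archive-date"]),
]
OTHER_RANK = [label for label, _ in GROUPS].index("other")


def sort_key(name):
    """Return (group rank, person number, slot within the group)."""
    for rank, (_, patterns) in enumerate(GROUPS):
        for slot, pattern in enumerate(patterns):
            match = re.fullmatch(pattern, name)
            if match:
                digits = match.group(1) if match.groups() else ""
                return (rank, int(digits or 0), slot)
    return (OTHER_RANK, 0, 0)


def reorder(params):
    """Return a new dict with the parameters in the suggested group order."""
    return {name: params[name] for name in sorted(params, key=sort_key)}
```

Because Python's sort is stable, parameters the sketch does not recognise keep their original relative order, which leaves their placement to the editor's judgment.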

For cite journal
For this we will need to identify some subgroups of metadata. Let's start with a group commonly seen with cite journal.


 * volume issue

These two clearly go together and should appear as shown. However, they are often used with page (or pages). Now page is a very special parameter. It is the first parameter that does not give information that applies to the entire source. It's an internal detail about a source. Generally speaking, we will want this order: volume issue page. But ideally we would specify internal details only after we have completely specified the source details. This is why the rp template is sometimes used after the ref tags. Unfortunately, the presentation that rp produces is kind of clunky, and it's often avoided for that reason.

But generally speaking we would like things like this:


 * [Details that apply to the whole source] followed by [Details that specify location within the source]

but unfortunately there are technical hurdles to doing this ideally within MediaWiki using ref tags and templates.

Long story short, we want page as far toward the end of the list of parameters as possible. But this runs into a problem: that "unreadable" URL group is there. The most practical solution is to put page before the URL group, which has the benefit of keeping it near issue.

These aren't the only metadata parameters. There's a whole bunch of them (isbn, sbn, jstor, etc.). Generally speaking, I put these other metadata parameters after publisher, since the publisher is often responsible for creating the IDs that are their values, but before this volume issue page subgroup. While I tend to put the ubiquitous isbn directly after publisher, for the most part the order doesn't matter at all for the less common ones.
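The ordering described for the tail of a cite journal call can be sketched as a ranking function. This is a hedged illustration: `journal_tail_rank` is a made-up name, it only makes sense for the parameters from publisher onward, and it treats anything it does not recognise as one of the "other" identifiers whose mutual order is left alone.

```python
LOCATORS = ("volume", "issue", "page", "pages")


def journal_tail_rank(name):
    """Rank a parameter within the tail end of a cite journal call."""
    if name == "publisher":
        return 0
    if name == "isbn":
        return 1  # the ubiquitous identifier goes directly after publisher
    if name in LOCATORS:
        return 3 + LOCATORS.index(name)  # volume, issue, then page or pages
    if name == "url":
        return 7  # the "unreadable" URL group comes last
    return 2  # other identifiers such as sbn or jstor


tail = ["url", "pages", "jstor", "issue", "publisher", "volume", "isbn"]
print(sorted(tail, key=journal_tail_rank))
```

The sort is stable, so several less common identifiers stay in whatever order the editor wrote them.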

For cite book
For books, things are a bit different. Here volume seems better suited to be part of the title group, so I'd usually use title volume.

Spacing

 * exhibit A
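The "space before the pipe" style from the key ideas can be sketched as a tiny renderer. The function name `render_cite` is hypothetical, the parameter values are placeholders, and writing the equals sign without surrounding spaces is an assumption of this sketch, not something the text above spells out.

```python
def render_cite(template, params):
    """Render a cite template call with one space before each pipe."""
    # Each parameter carries its own leading space, so when the editor
    # wraps a long line it can break right before a pipe.
    parts = [f" |{name}={value}" for name, value in params.items()]
    return "{{" + template + "".join(parts) + "}}"


print(render_cite("cite book", {"last": "Bloggs", "title": "Example"}))
# prints {{cite book |last=Bloggs |title=Example}}
```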

Odds-n-ends
Some parameters don't have a particularly obvious place to go. Just do your best in those cases. Also, the set of parameters itself is still evolving and changing, so copy-editors who like to clean up article source need to stay abreast of that and adapt.