User:SMcCandlish/Replacement of Template:Rp

, a klugey form of partial parenthetical citation that was created in 2007 to handle cases in which an article needed to cite the same source many times but at different page numbers,, and has been for several years.

What this template does is use superscripted notes, inline in the content, to indicate page numbers or other in-source locations in sources after the footnote indicators. E.g.: "A claim with sources.[1]:12–14 [2]:321, footnote 27 [3]:ix, 9[4]:26 This is more article text." While instances like [2]:321, footnote 27 are the extreme end, simple cases like [4]:26 are also problematic. Use of this crusty format at all poses several usability concerns, from reader confusion about what these text strings even mean, to unnecessary splitting up of citation details, to cluttering the prose with metadata; and may also pose an accessibility issue for readers with poor eyesight.

, that keep all of the citation information inside the citation footnotes at the bottom of the page where they belong, leaving only the simple superscript indicators – A claim with sources.[1][2][3] – that link to the citations. The community consensus established in 2020 is to deprecate and replace inline parenthetical citations – those that inject citation details into the prose of the article – and this necessarily includes parenthetical partial citations like those created by.

Replacing them is not instantaneous and requires some judgment and some work. Below are tips for converting citations to the most common method of shortened footnotes, the / template set, which replace or work directly with p citation markup. There are some alternative citation methods, and this documentation may later be updated to include conversion instructions for them, if there is sufficient demand, but most of these alternatives are themselves obsolescent and disused.

Converting templated (CS1/2) citations that use
The instructions in this section are for citations that have been using  along with CS1 citation templates like, , and. This will also generally apply to the disused CS2 template,, which now uses the same set of parameters as CS1 (and has matching output).

The instructions below specifically recommend and illustrate the use of and (rather than different-formatting variants of them), because they are intentionally consistent with the output of the CS1/2 citation templates. WP:Citing sources (WP:CITESTYLE) instructs us to use a single, consistent citation style within an article.

Summary of the process

 * 1) Identify sources using  but which are being cited only at a single page number (or range of page numbers). These are the lowest-hanging fruit, as the  page number(s) can simply be moved into page or pages in the citation template, and  removed.
 * 2) Identify the sources that are being cited multiple times at different page numbers, and copy the full citation (, etc.), without page numbers, to the   section below the   section so they are in a central location; create such a subsection if needed.
    * The exact names of these sections vary by article; you might find  and , or   and  , or whatever, and maybe both will be level-2 headings, or the second heading might simply be absent. The point is that these re-used citations are at the bottom of the article, below s or , usually in a bullet list, and sometimes between a  and a  template. Rarely, they may be embedded with ref tags   or s (known as list-defined references). Any such format is fine; just match your new entries to whatever is used in the article already (if anything). Note: articles on writers often have a content section named "Bibliography" that has nothing to do with a section for reference citations; don't get it confused with one.
 * 3) For the first source cited at multiple page numbers/ranges, replace things like   and   with  (without a  tag around it) in both cases.
    * Repeat this sub-job for each of the different pages (or ranges) cited in that source.
    * Due to the "smarts" of, it just does not matter if the page is being cited more than once and which is the first instance; the template will create merged short citations, not duplicates.
 * 4) Repeat steps 2–3 for the next source being used multiple times at different page numbers/ranges. Repeat until there are no  instances left.
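
The replacement described in the summary above (a reused named reference plus a page marker becoming a single short citation) can be sketched in Python. This is only an illustration under stated assumptions, not the procedure's official tooling: the function name `rp_to_sfnp`, the sample ref name "TIEOB", and the author/year "Shamos"/"1993" are placeholders, and the sketch assumes the page marker is written in the positional form `{{rp|12}}` directly after a self-closed `<ref name=... />` tag. Named-parameter forms and full-citation instances are deliberately left untouched for manual handling.

```python
import re

def rp_to_sfnp(wikitext, ref_name, author, year):
    """Swap <ref name="..." />{{rp|PAGES}} pairs for {{sfnp|author|year|p=PAGES}}.

    Hypothetical helper: it only handles the self-closed reuse of one named
    reference followed immediately by a positional {{rp|...}}.
    """
    pattern = re.compile(
        # Reuse of the named reference, quoted or unquoted, any spacing.
        r'<ref\s+name\s*=\s*"?' + re.escape(ref_name) + r'"?\s*/>'
        # The page marker; "=" is excluded so named parameters are skipped.
        r'\s*\{\{\s*[Rr]p\s*\|\s*([^{}|=]+?)\s*\}\}'
    )

    def to_short_citation(match):
        # match.group(1) holds the captured page number or range.
        return "{{sfnp|%s|%s|p=%s}}" % (author, year, match.group(1))

    return pattern.sub(to_short_citation, wikitext)


if __name__ == "__main__":
    sample = 'Claim.<ref name="TIEOB" />{{rp|12}} More.<ref name=TIEOB/>{{rp|40–41}}'
    print(rp_to_sfnp(sample, "TIEOB", "Shamos", "1993"))
```

The sketch always emits `p=`, even for ranges, to stay short; the full citation still has to be placed in the bibliography section by hand so that the short citations have something to link to.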

Concurrent cleanup
In the course of doing this, you may run into various citation inconsistencies, and this is a good time to fix them. Probably the easiest and most common is a source that is only being cited one time yet has a ref name like  ; this can be simplified to just ref.

Also look out for old CS2 templates mixed in with CS1's more specific types (, etc.); completely untemplated citations; half-templated citations; bare URLs wrapped around content instead of being inside citations at all; duplicate citations (that are not going to be replaced with the  operation described above); citations with confusing names in the ref tag; citations misusing various parameters (especially multiple authors put into the same parameter, unless it is vauthors with correct Vancouver formatting, and even then that should be replaced unless the entire article is done in that citation style); citations missing key information; redundant entries in "Further reading" or "External links" that have already been fully cited as citations above; the wrong citation template for a source type (e.g.  used for a journal article); multiple pieces of information in one parameter like 74, 3rd edition instead of 743rd, non-citation bibliographic information like hardback 237pp, vertical citations inline in the text (that format is good for list-defined references at page bottom, but visually disruptive in mid-article code); and other messiness.

Another cleanup bit to do is changing instances of  and   to   and , respectively. See below for a regex search–replace operation to do this easily. [Update: That code is in process of an overhaul, should be finished in Jan. 2024.] This cleanup is beneficial for at least three reasons:
 * 1) It normalizes the formatting to consistent patterns, which aids with the search–replace operations covered below.
 * 2) The quoting "future-proofs" the citation names, since the quotes are required around any value containing a space or punctuation; if a citation presently is named something like  there's a reasonable chance someone will later change it to   or whatever, but possibly forget the now-necessary quotation marks. (And if it's already,  , etc., then it already actually requires the quotation marks even if an earlier editor did not understand that.) The quotes are also required if it includes any non-ASCII Unicode characters at all (which includes a lot of accented Latin-alphabet characters, and anything in another character set like Cyrillic, Greek, or Japanese). There are just too many ways to break this to not quote it, so quote it.
 * 3) The space before the  is understood by more parsers than the unspaced version, so is better for reuse of WP content.

See also the mw:Help:Cite documentation for ref: Note that identifiers used in the name attribute require alphabetic characters; solely relying on numerals will generate an error message. Quotation marks [specifically, straight double ones] are always preferred for names, and are It is recommended that names be kept simple and restricted to the ASCII character set. [Emphasis added.]

The process in more detail, for specific kinds of cases
This section is in process of major revision to stop recommending  when  will suffice.


 * Copy the citation's details as a list item down into the Sources/Bibliography/Cited works/whatever subsection (hereafter "the bibliography") under the auto-generated citations, but without any page numbers:
 * (If the work is only being cited one time in the article, and is likely to remain that way, it is preferred in the citation style of most articles to leave that specific source citation entirely inline instead of moved down to the bibliography; this can vary by article.)
 * For a source used once,  :
 * Replace  with
 * Or, if this citation's details have been moved to page-bottom because the source is likely to be reused later at another page number: Replace  with, or if you prefer the longer syntax,
 * If it has  reduce this to
 * For a source used once, , with or without a page number (shown here with one):
 * Leave  as-is.
 * Or, if this citation's details have been moved to page-bottom because the source is likely to be reused later at another page number: Replace  with, or if you prefer the longer syntax,
 * If it has  reduce this to
 * For a source reused,, named like :
 * Leave this as-is, if it is unlikely that additional citations to it will be created with specific page numbers, and leave it inline instead of moving citation details to the bibliography section.
 * If later citations to specific page numbers are likely, replace complete-citation instance thus:  becomes
 * In either case, referential shortened instances of  remain the same.
 * For a source reused, at specific page number or range, named and  :  :
 * If additional citations to it at other page numbers are unlikely, leave it inline instead of in the bibliography, and replace complete-citation instance thus:  becomes
 * If later citations to other page numbers are likely, replace complete-citation instance thus:  becomes
 * Replace referential shortened instances of  with   (note:  all cases of  !)
 * You can optionally just leave  as-is in these steps if this source is not likely to be cited for anything else later at another page number.
 * For a source reused, at specific page number or range, named but  :  :
 * If additional citations to it at other page numbers are unlikely, leave it inline instead of in bibliography, and replace complete-citation instance thus:  becomes   (i.e., just add page number to ref name).
 * If later citations to other page numbers are likely, replace complete-citation instance thus:  becomes
 * Replace referential shortened instances of  with
 * You can optionally just leave  as-is in these steps if this source is not likely to be cited for anything else later at another page number.
 * For a source reused, at page numbers,  :
 * Replace complete-citation instance of particular page reference thus:  becomes
 * Replace referential shortened instances of  ( all cases of  !) with
 * Find other instances of  with or without  and decide how each has to be handled.
 * For a source reused, at page numbers, , something like:
 * Replace complete-citation instance of particular page reference thus:  becomes
 * Referential shortened instances of  can be left as-is (though if it is not formatted consistently with other citation ref-names, now is a good time to normalize them along with the "master" instance that has the  in it).
 * Find other instances of  and decide how each has to be handled.
 * Especially with, be on the lookout for redundant instances of specific-page citations that can be merged.
 * The above examples are examples, not rules. If you prefer ref names in a form like  or   or even   to mimic the output, that's fine. If the "Smith (2023), p. 37" output of  and  do not match the rest of the page's citation style, you can get "Smith 2023, p. 27" output (which is permissible but less clear) with  and, and there are several other formatting alternatives; just be consistent within the article. However, because WP:CITESTYLE does ask that we impose a  style across the citations within a single article, these alternatives  in most cases, since the output of CS1/2 citation templates uses "(2023)" and "p. 27" formatting. The  and  templates are specifically recommended throughout this tutorial because they are  to be consistent with CS1/2.

Common issues
at (also covers, etc.). Crash course in common issues:
 * To cite multiple authors: or . This may require cleanup of bad citations, such as those with no citation templating, or misuse of author or authors to dump multiple author names into one parameter. Switch to last1, first1, last2, first2, or if (and only if) the article is consistently using Vancouver-style citations, switch to vauthors (which has to be formatted a specific way, e.g. Smith JB, Chen BC Jr, Ocampo P).
 * To cite multiple pages: or  (While technically it's preferred to use pp, in the  or, for multiple pages, it actually works fine to just use p – it's more important to get rid of  than to be ultra-precise with sfnp/harvp niceties. The template documentation claims that such a mismatch can cause breakage, but in pretty extensive testing so far, this has proven false. And there will be a regex search–replace detailed below to fix this anyway. [forthcoming].)
 * If there's no author name to use, you need to use some other meaningful string instead, e.g. publisher's name or acronym, or a key word from the title. This is done with ref inside the full-citation template, e.g.:  and then cite this with  or .  author to repeat the publisher name; use ref for its actual purpose.
 * The same technique can be used when a date is unknown but the author is named: ref
 * It's also useful for shortening the name of a long organizational author; if you have something like  you can add ref and cite it with that short name.
 * If the same author has more than one publication in the same year, the conventional thing is to refer to them as, e.g.,  and  . The simplest and clearest way to do this is again with ref, like so: ref (there is a different "legacy" way to do this, that operator-overloads the year parameter while simultaneously using date in the same citation, but this is confusing, obsolescent, and  by later editors). The ref parameter exists for good reasons, so please use it when it is called for.
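
As a rough illustration of the multiple-author and p-versus-pp points above, here is a small Python helper that assembles a short-citation template call. The function name `short_citation` and its parameters are hypothetical; the heuristic for "more than one page" (presence of a dash, hyphen, comma, or semicolon) is an assumption that will misjudge sources whose single page numbers contain hyphens.

```python
import re

def short_citation(surnames, year, pages=None, template="sfnp"):
    """Build a short-citation call such as {{sfnp|Smith|Chen|2023|pp=12–14}}.

    Hypothetical helper: surnames is a list of author surnames; pages is a
    string such as "37" or "12–14", or None for a whole-work citation.
    """
    parts = [template] + list(surnames) + [str(year)]
    if pages:
        # Assumption: a dash, hyphen, comma, or semicolon signals several pages.
        multiple = re.search(r"[–\-,;]", pages) is not None
        parts.append(("pp=" if multiple else "p=") + pages)
    return "{{" + "|".join(parts) + "}}"


if __name__ == "__main__":
    print(short_citation(["Smith"], 2023, "37"))
    print(short_citation(["Smith", "Chen"], 2023, "12–14"))
    print(short_citation(["Smith"], 2023, "37", template="harvp"))
```

Cases with no usable author or date still need the ref parameter in the full citation, as described in the list above; this sketch does not attempt to generate that.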

Dealing with list-defined references embedded inside a or
[forthcoming]

Converting annotated or un-templated citations that use
[forthcoming]

Using scripts and regular expressions to speed the conversion
There is a powerful tool at your disposal: If you have Wikipedia's built-in editor enabled, click the item in the top menu, and on the far right will appear an hour-glass icon for advanced in-page search and replace. This supports regular expressions (regex). They are complicated to learn, but you don't have to learn them in detail, just adapt the ones provided here. Select the "Treat search string as a regular expression" checkbox and turn off the other two options when using these. If you have replaced the built-in editor with wikEd (an advanced editor you can install via "Gadgets" in the Wikipedia "Preferences" menu), it also has a regex search feature. So does any good external text editor.

Before doing any regex search–replaces, make the entire process much smoother by normalizing the citation spacing so that your search–replace operations work reliably and don't miss instances. There's a one-click, regex-based tool to do this all for you: TidyCitations!
 * Put the line:

in either your common.js or the skin.js of your current skin, save the page, and bypass your browser cache.
 * This gives you a script named " " in the "Tools" menu on the left while editing a page (might be somewhere else, depending on your skin). Edit the article you're cleaning up, and click that script. This will fix inconsistent spacing in citations.
 * See the short documentation at the top of User:SMcCandlish/TidyCitations for what to do if the article uses vertically formatted list-defined references (LDR) at the bottom of the article.
 * If you are using wikEd, you'll need to temporarily turn off wikEd (it's incompatible with many scripts like this) by pressing the wikEd logo button, make the changes with TidyCitations, then re-enable wikEd.

The regex action to apply (after the above script) is this fancy one to normalize all the  and   instances in the page (with or without quotes, with or without space before , with or without unnecessary spacing like  , with or without other attributes in the ref tag, etc.) all to a consistent and robust format of   and  , so that later searches are guaranteed to find all cases and not miss ones because they don't have quotation marks or exactly the same spacing.
 * Ignore this for now. The regex work below has been surpassed by an in-development version that handles more cases, by using a series of regexes like this one to handle ref tags with multiple attributes like  in any order.
 * Use this regex in the "Search for" field: follow
 * This one is of course too complex to explain in detail here, but can be pored over at: https://regex101.com/r/xubdCt/14
 * Use this string in the "Replace with" field: 
 * Immediately after doing that, do a non-regex search–replace changing "/> to " />
 * This will all work on virtually any ref name, even something as ridiculous as . The few known limitations (all pertaining to invalid nested quotation marks in ref names), are detailed in this footnote:
 * This regex will also clean up extraneous whitespace inside ref, e.g.  will be normalized to , including when any of those unnecessary spaces are line breaks. It even works for extraneous leading/trailing whitespace inside the quotation marks, as in.
 * This does not presently handle  and similar constructions (where more than one attribute is present and   is not the first one). One step at a time here .... This should not be particularly problematic, because such citations are rare, they are usually sparse in a page when found (easily manually addressed), almost never have, and will not be broken by our regex operations, because a ref tag like   or   cannot be later referred to as  , only as   or  , and none of those will match our search specifics. The eventual multi-regex script will handle them all carefully anyway.
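
For readers who want to see the normalization idea outside the editor, below is a minimal Python sketch. It is not the regex referred to above (that one is linked at regex101); it is a simplified stand-in written for illustration, and the names `REF_NAME` and `normalize_ref_names` are hypothetical. It assumes the goal is the `<ref name="..." />` and `<ref name="...">` forms, and like the procedure above it leaves tags with more than one attribute untouched.

```python
import re

# Matches a ref tag whose only attribute is name, in quoted or unquoted form.
REF_NAME = re.compile(
    r'<ref\s+name\s*=\s*'
    r'(?:"\s*([^">]*?)\s*"'        # double-quoted name, inner padding dropped
    r"|'\s*([^'\">]*?)\s*'"        # single-quoted name without double quotes
    r"|([^\s\"'/>]+))"             # unquoted name
    r'\s*(/?)>'                    # optional self-closing slash
)

def normalize_ref_names(wikitext):
    """Rewrite single-attribute ref tags with a quoted name and spaced slash."""
    def rebuild(match):
        # Exactly one of the three name groups took part in the match.
        name = next(g for g in match.group(1, 2, 3) if g is not None)
        closer = " />" if match.group(4) else ">"
        return '<ref name="%s"%s' % (name, closer)
    return REF_NAME.sub(rebuild, wikitext)


if __name__ == "__main__":
    print(normalize_ref_names('<ref name=foo/> and <ref name = " foo bar " >'))
```

Because `\s` also matches line breaks, stray newlines inside the tag are collapsed too. Names containing nested quotation marks are simply not matched, which keeps the sketch from damaging them.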

To find nearly all instances of a particular source's secondary citations with  numbers, and replace them with something clearer and non-rp:
 * Ignore this for now: It needs to be updated to stop doing  citations when  ones will suffice.
 * Do a regex search in the "Search for" field on, replacing "TIEOB" with whatever (often very unclear) name you are searching for, such as "about the buyout" or ":2" or "May'19".
 * If you are wondering about the technical details: the first  "escapes" the   symbol, which otherwise has a special regex meaning; same with the second , which escapes the   symbol; the   create a substitution group that we can call (as  ) in the replacement text; the   create a character-clustering group that   can operate on;   means any numeral or basic alphabetic letter, and this is followed by characters that also might appear in page number citations, including a comma, a dot (also escaped with   because it has a special regex meaning of "any single character"), an en dash, mistaken use of a hyphen for a dash (also escaped with   because of its special regex meaning as a range indicator when inside square brackets), a space, a semicolon in case someone silly did that instead of a comma, the section symbol, the paragraph symbol, and round-brackets (escaped because in regex they are grouping markup); and the   means "any of the characters, however many times they may appear, that are specified in the square-bracket group".
 * This is not the most foolproof possible regex, as it will not include any "page number" matches that contain things like accented letters, CJK/Cyrillic/Greek/Indic characters, or extraneous punctuation, but such messes are not very common in and are easily manually fixed. It  find things like ,  , etc.
 * A potential issue is the  value containing characters that need to be backslash  escaped – any of these characters:   (other chars that  need to be escaped in certain regex constructions do not need to be here). It is probably best to replace the   value (using a non-regex search–replace) with something that doesn't have one of these characters, so you (and others later) don't have to remember to escape it with a backslash. E.g., replace   with
 * In the "Replace with" field, put something like, where "Shamos" is the author surname and "1993" the publication year.
 * The  means "swap in the text that was captured in the round-bracketed substitution group by the regex search". You can do multiple such groups, and they are processed and numbered left-to-right. E.g., if you wanted to keep the original ref name and just move the  page numbers into the ref name after the original name text, you could search for  – notice the new (TIEOB) round brackets – and replace with.
 * Clicking "Replace all" with the above search and replace strings will convert something like  to undefined, and   to undefined
 * Search for the original ref name again, e.g. <ref name="TIEOB" to find any that need manual treatment, e.g. something like, or other citations to the same source with different names like
 * Lastly, because that search–replace was a "blunt instrument" that just created secondary citations, a proper citation will be needed for each distinct page number. Search now for <ref name="Shamos 1993, and just make a list of all the page numbers being cited. For each of them, convert the first instance on the page from something like   to   if there is more than one instance of this page citation, or to just the short format   with no ref tags, if it is used only once. After one primary, complete citation for each separate page-reference to it is done, all the other secondary ones like   will just work automatically.
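
The capture-group search–replace walked through above can be tried out in Python before running it in the editor. This is a sketch under assumptions, not the exact expressions from this page: the function names are hypothetical, the ref tags are assumed to be already normalized to the `<ref name="..." />` form, the page marker is assumed to be the positional `{{rp|...}}` form, and the character class is reconstructed from the prose description above. Note that Python uses `match.group(1)` where the editor's "Replace with" field uses a numbered substitution, and that `re.escape` takes care of the backslash-escaping concern for awkward ref names.

```python
import re

# Characters allowed in a captured "page number", per the description above:
# ASCII word characters, comma, dot, en dash, hyphen, space, semicolon,
# section sign, paragraph sign, and round brackets.
PAGE_CHARS = r"[\w,\.–\- ;§¶\(\)]+"

def rename_rp_reuses(wikitext, old_name, new_prefix):
    """Turn <ref name="old" />{{rp|N}} into <ref name="new_prefix, p. N" />."""
    pattern = re.compile(
        r'<ref name="' + re.escape(old_name) + r'" />'
        r"\{\{[Rr]p\|(" + PAGE_CHARS + r")\}\}",
        re.ASCII,  # keep \w limited to ASCII, as in the editor's regex flavor
    )
    return pattern.sub(
        lambda m: '<ref name="%s, p. %s" />' % (new_prefix, m.group(1)),
        wikitext,
    )

def pages_cited(wikitext, new_prefix):
    """List each distinct page string now cited under new_prefix, in order."""
    found = re.findall(
        r'<ref name="' + re.escape(new_prefix) + r', p\. ([^"]+)" />', wikitext
    )
    return list(dict.fromkeys(found))  # de-duplicate, keep first-seen order


if __name__ == "__main__":
    text = ('A.<ref name="TIEOB" />{{rp|12}} B.<ref name="TIEOB" />{{rp|40–41}} '
            'C.<ref name="TIEOB" />{{rp|12}}')
    converted = rename_rp_reuses(text, "TIEOB", "Shamos 1993")
    print(converted)
    print(pages_cited(converted, "Shamos 1993"))
```

The list returned by `pages_cited` corresponds to the "make a list of all the page numbers being cited" step; each entry still needs one complete primary citation created by hand.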

Let me know what other kinds of regex examples would be helpful.

Important notes

 * Search-and-replace operations performed with the built-in editor's search–replace function cannot be undone with Ctrl-Z (Cmd-Z on a Mac). Thus it is very important to copy the full text of the article code after performing a successful regex operation, and paste it into a text editor, so that if you do a second operation and it doesn't work right, you can just paste the last-good results back over the results that failed, instead of having to start over.
 * Remember to turn on/off the "Treat search string as a regular expression" checkbox depending on what kind of search–replace you are doing. If you have this set wrong, the results (if the search matched something) will get boogered pretty badly.
 * If you are using some off-site editor to do these regexes, you may need to wrap the entire regex with forward slashes: ; see its documentation for how it wants regexes formatted. You might need   for global (don't stop at first match), or   for global and multi-line (don't stop at a line break). It varies by application. You might also end up needing to backslash  escape more characters (typically   and , depending on the editor). There are many "flavors" of regex, and any given editor might be using a rather particular one.

Major cleanup example
For a case study of a Template:Rp cleanup (among some other tweaks) of one of Wikipedia's longest and most complex glossary articles, see this combined diff. Ironically, this removed Template:Rp from the very article that this now-obsolete template was originally created for. The entire process, a combination of manual setting-up of needed initial citations for particular works and pages, and multiple regex search–replace operations, took about two hours (much of it spent working out the regex syntax), plus a cleanup tweak or two a bit later.