User:Monkbot/task 18: cosmetic cs1 template cleanup

Monkbot task 18 is a WP:COSMETICBOT task to clean up cs1|2 templates. It is authorized by Bots/Requests for approval/Monkbot 18.

cs1|2 templates are transcluded in more than 4.7 million pages. Task 18 is constrained to the canonical list of cs1|2 templates and their more common redirects. Wrapper templates are not considered.

Task 18 does these things:
 * 1) deletes empty known and unknown named parameters
 * 2) deletes empty positional parameters
 * 3) deletes or repairs parameters inside html comments
 * 4) deletes non-contributing parameter/value pairs
 * 5) hyphenates cs1|2 parameters when they are written using the to-be-deprecated all-run-together form
 * 6) converts known language names assigned to language to their MediaWiki codes

This task strives to leave template style as it found it, so it does not change whitespace in and around template parameters. There is one exception to that whitespace rule: blank lines within vertically formatted cs1|2 templates are removed. Though executed by WP:AWB, this task does not perform AWB general fixes.
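The blank-line exception can be illustrated with a minimal Python sketch. This is not the bot's actual code (the task runs under AWB); the function name `remove_blank_lines` is hypothetical, and the sketch assumes it is handed the text of a single template.

```python
import re

# A newline followed by one or more lines that hold only spaces or tabs.
BLANK_LINES = re.compile(r"\n(?:[ \t]*\n)+")

def remove_blank_lines(template: str) -> str:
    """Collapse blank lines inside one template; other whitespace is untouched."""
    return BLANK_LINES.sub("\n", template)

example = "{{cite book\n| title = X\n\n| year = 2000\n   \n}}"
cleaned = remove_blank_lines(example)
```

Only the empty lines are dropped; indentation and spacing around each parameter stay as the editor wrote them.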

Task 18 operates only in mainspace. Any article that has shall be skipped. Additionally, task 18 maintains a list of articles that it will not edit. Contact the bot's operator to get an article added to the list.

Update 2020-12-15: A modified version of the bot (with the namespace restriction and the empty positional parameter portions disabled) has been created for use on templates that wrap the cs1|2 canonical templates. This version will be run manually, only in template namespace, and is identified as task 18a.

Update 2021-01-10: Because the bot depends on results from mw:Help:CirrusSearch to focus its work, an alternate form of the bot task (18b) has been created that focuses on templates that are not cs1|2 templates but are (in many but not all cases) wrapper templates that use a cs1|2 template. Task 18b runs separately from the main task 18.

delete empty parameters
It is to be expected that some significant portion of cs1|2 transclusions will hold some number of superfluous parameters. One purpose of this task is to remove those superfluous parameters. Empty parameters in cs1|2 templates occur for various reasons: the parameter is no longer required or not required in 'this' citation; a skeleton template was copied from a template documentation page and only partially filled in; deprecated error messages were 'fixed' by removing the parameter's value; the template was reworked; the template was placed by automated or semiautomated tools that inserted empty parameters; there are, no doubt, other explanations. For cs1|2, empty parameters serve no purpose.

Editors commonly complain that inline references make it more difficult to read wikitext. Empty parameters in cs1|2 templates occupy space for no meaningful purpose and, as a consequence, contribute to the wikitext readability problem.
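As an illustration only, deleting empty named parameters can be sketched in Python with a regular expression. The pattern and the name `delete_empty_named` are assumptions made for this example, not the bot's implementation, and the sketch ignores complications such as nested templates and piped wikilinks.

```python
import re

# "|name=" followed by nothing but whitespace, up to the next pipe
# or the template's closing braces.
EMPTY_NAMED = re.compile(r"\|\s*[^|={}]+?\s*=\s*(?=\||\}\})")

def delete_empty_named(template: str) -> str:
    """Remove every named parameter that has no value."""
    return EMPTY_NAMED.sub("", template)

inline = delete_empty_named("{{cite web |title=Example |url= |access-date=}}")
vertical = delete_empty_named("{{cite book\n| title = X\n| url = \n}}")
```

Parameters with values are never matched because the lookahead requires the next non-whitespace text after the equals sign to be a pipe or the closing braces.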

The table lists all of the cs1|2 templates. The second column states the approximate number of mainspace articles using the template where at least one parameter was empty. The counts were last updated on the date listed at the bottom of the table.

delete empty positional parameters
cs1|2 does not support positional (unnamed) parameters. A positional parameter is one that does not contain an assignment operator. Positional parameters that contain nothing or only whitespace have two forms:

Task 18 deletes the first pipe and any whitespace that exists between it and the succeeding pipe or brace.

Task 18 ignores positional parameters that have content other than whitespace because cs1|2 emits a reader-facing text ignored error message for those sorts of positional parameters.
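A rough Python sketch of this rule follows. The regular expression and the name `delete_empty_positional` are assumptions for illustration; the bot's own pattern is not shown on this page.

```python
import re

# A pipe followed only by whitespace, up to the next pipe or the closing braces.
EMPTY_POSITIONAL = re.compile(r"\|\s*(?=\||\}\})")

def delete_empty_positional(template: str) -> str:
    """Delete the leading pipe and whitespace of each empty positional parameter."""
    return EMPTY_POSITIONAL.sub("", template)

fixed = delete_empty_positional("{{cite web |title=X | |url=Y|}}")
kept = delete_empty_positional("{{cite web |stray text |title=X}}")
```

A positional parameter that holds text is left alone, because the pipe is then followed by something other than whitespace and a delimiter.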

delete empty named parameters in html comments
To cs1|2, parameters that are hidden with html comment markup (<!--content-->) can appear to be empty positional parameters because MediaWiki removes html comments and their content before handing the template to Module:Citation/CS1. cs1|2 templates entirely within html comment markup are ignored.

Empty parameters that have this form are deleted along with the html comment markup:

Similarly, empty parameters inside html comment markup are deleted; parameters with assigned values are retained:

repair parameters in html comments
To cs1|2, parameters that are hidden with html comment markup (<!--content-->) can appear to be empty positional parameters because MediaWiki removes html comments and their content before handing the template to Module:Citation/CS1. cs1|2 templates entirely within html comment markup are ignored.

Parameters that have this form are repaired by moving the preceding pipe into the html comment markup:

A similar configuration is commonly found. This form may or may not look like an empty positional parameter to cs1|2 depending on the surrounding wikimarkup. This form is repaired for the purposes of consistency:

Comments that do not have an assignment operator but which occupy all of the space between a pair of pipes or a pipe and the template's closing brace are repaired by moving the leading pipe inside the html comment:
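The repair can be sketched in Python as follows. The sketch assumes the form is a pipe followed by a single html comment that fills all of the space up to the next pipe or closing brace; the pattern and the name `repair_commented` are hypothetical and are not taken from the bot.

```python
import re

# pipe, optional whitespace, one html comment, optional whitespace,
# then the next pipe or the template's closing braces.
COMMENT_ONLY = re.compile(
    r"\|(\s*)<!--((?:(?!-->).)*)-->(\s*)(?=\||\}\})", re.DOTALL
)

def repair_commented(template: str) -> str:
    """Move the leading pipe inside the html comment; whitespace is preserved."""
    return COMMENT_ONLY.sub(r"\1<!--|\2-->\3", template)

repaired = repair_commented(
    "{{cite web |title=X |<!--url=http://example.com--> |date=2020}}"
)
```

With the pipe inside the comment, MediaWiki strips the pipe along with the comment, so cs1|2 no longer sees an empty positional parameter.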

delete non-contributing parameters
There are several parameters and specific parameter/value pairs that, under certain conditions, do not contribute to the rendering of the template. Task 18 deletes these:
 * deadurl – is routinely added (with the assigned value ) to cs1|2 templates created by the unmaintained tool ReFill.  Because deadurl is no longer supported by cs1|2, when it is present in cs1|2 templates, cs1|2 emits the error message:  and adds the article to  which gnomes keep mostly empty.  When task 18 encounters y (or the similar yes and true and their hyphenated forms), the parameter shall be deleted.  Replacement is not necessary because y (when it was supported) was redundant to the cs1|2 default state.  See url-status documentation.
 * mode – this parameter accepts one of two keywords:  or  .  The purpose is to direct the template to render in the specified style.  For example,  is a native cs2 template.   will render as if it were a cs1 template when given cs1; similarly, cs1 templates render in cs2 style when given cs2.  Setting mode to the template's native style does not change the rendering.  Task 18 deletes mode parameters when they match the template's native style. See mode documentation.
 * name-list-style – controls how name lists are rendered. When assigned one of the values ,  ,  ,  , or  , cs1|2 inserts an ampersand or the word 'and' between the last two names of a name list.  When none of the name lists have two or more names, the parameter does not change the rendering.  When assigned the value  , name lists are rendered in Vancouver style.  vanc requires last and first (or aliases); without first aliases, the parameter does not change the rendering. See name-list-style documentation.
 * <name-list>-link, <name-list>-mask – these wikilink or mask a  in an associated  .  When there is no associated <name> (for example, 2 but no author14), these parameters do not change the rendering.
 * no-pp – suppresses p. and pp. page annotation when either of page or pages has a value; when there is no page or pages parameter, this parameter does not change the rendering. When found in  templates where journal has an assigned value or when found in, no-pp has no value because pagination in journal citations does not use the p. and pp. annotation.
 * orig-year – a free-form parameter that accompanies year, date or when neither of those are present, publication-date. When none of those are present, orig-year is not rendered.
 * postscript – specifies the rendered template's terminal punctuation. Default terminal punctuation for cs1 is a dot; for cs2, terminal punctuation is omitted.  . in cs1 templates and none in cs2 templates do not change the rendering.  The form <!--none--> is ignored by cs1|2 templates because the html comments make postscript an empty parameter.
 * harv – At the 2020-04-18 update to Module:Citation/CS1, harv was internally defined as the default state so the explicit harv does not change the template's function. Use of this parameter/value pair is tracked in .  See ref documentation.
 * url-status – like its predecessors dead-url and deadurl, url-status is a display control parameter that determines how a cs1|2 template will be rendered when that template has archive-url and archive-date. When archive-url and archive-date are empty or omitted, url-status is non-functional.  See url-status documentation.
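Several of the conditions in the list above can be expressed as a small Python predicate. This is a sketch under stated assumptions: the function names are hypothetical, template parameters are assumed to be already parsed into a dict, {{citation}} is assumed to be the only native cs2 template, and url-status is treated as non-contributing only when both archive-url and archive-date are empty or absent.

```python
def native_style(template_name: str) -> str:
    # Assumption: {{citation}} renders in cs2 style; every other template is cs1.
    return "cs2" if template_name.strip().lower() == "citation" else "cs1"

def is_non_contributing(template_name: str, name: str, params: dict) -> bool:
    """Return True when parameter `name` cannot change the rendering."""
    value = params.get(name, "").strip()

    def has(*names: str) -> bool:
        # True when at least one of the named parameters holds a value.
        return any(params.get(n, "").strip() for n in names)

    if name == "mode":
        return value == native_style(template_name)
    if name == "postscript":
        default = "." if native_style(template_name) == "cs1" else "none"
        return value == default
    if name == "ref":
        return value == "harv"
    if name == "url-status":
        return not has("archive-url", "archive-date")
    if name == "orig-year":
        return not has("year", "date", "publication-date")
    if name == "no-pp":
        return not has("page", "pages")
    return False
```

For example, `is_non_contributing("citation", "mode", {"mode": "cs2"})` is true because cs2 is the native style of that template, while the same call for `"cite web"` is false.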

hyphenate cs1|2 parameter names
Because cs1|2 is an amalgam of several individually developed templates, it acquired a variety of parameter-name styles: lowercase, Capitalized, camelCase, underscore_separated, space separated, hyphen-separated, allruntogether. The Capitalized, camelCase, underscore_separated, and space separated parameter name styles have all been deprecated and support for these styles withdrawn in favor of the lowercase and hyphen-separated forms as a result of this RfC. For parameter names that are multiword, cs1|2 is gradually shifting to prefer the hyphenated form. The table lists the all-run-together form with the approximate number of articles that transclude cs1|2 templates using these parameters. Task 18 replaces the all-run-together forms of these parameters with the hyphenated forms.
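A minimal Python sketch of the renaming step is shown here. The lookup table holds only the two names mentioned on this page and is illustrative; the handling of a trailing enumerator (for example authorlink2) and the name `hyphenate` are assumptions for this example.

```python
import re

# Illustrative subset of the all-run-together names and their hyphenated forms.
HYPHENATED = {
    "accessdate": "access-date",
    "authorlink": "author-link",
}

# pipe + whitespace, lowercase name, optional enumerator, whitespace + "=".
PARAM_NAME = re.compile(r"(\|\s*)([a-z]+)(\d*)(\s*=)")

def hyphenate(template: str) -> str:
    """Replace run-together parameter names; whitespace and values are kept."""
    def fix(match: re.Match) -> str:
        name = HYPHENATED.get(match.group(2), match.group(2))
        return match.group(1) + name + match.group(3) + match.group(4)
    return PARAM_NAME.sub(fix, template)

renamed = hyphenate(
    "{{cite web |title=X |accessdate=2020-01-01 |authorlink2 = Jane Doe}}"
)
```

Names that are not in the table, including names that are already hyphenated, pass through unchanged.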

The accessdate and authorlink parameter names have been included in AWB's genfixes since. There has been some pushback against using genfixes to normalize cs1|2 parameter name style. See.

There are commonly used tools that continue to insert the all-run-together names. An interface protected edit request has been submitted to normalize all-run-together parameter names emitted by WP:RefToolbar. See.

convert language names to codes
As a courtesy to editors at other-language wikis, it is desirable to use the MediaWiki language codes instead of English-language language names in language so that cs1|2 can render language names using the other wiki's language without the need for an editor to make a translation. Task 18 will replace language names with the associated MediaWiki code. See the language documentation and the list of (mostly) English-language language names and associated codes.
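The conversion can be sketched in Python as a table lookup. The four entries below are a small illustrative subset, not the bot's list, and the name `convert_language` is hypothetical; values that are not in the table, including lists of several languages, are left as they are.

```python
import re

# Illustrative subset: English-language names mapped to MediaWiki codes.
LANGUAGE_CODES = {"french": "fr", "german": "de", "spanish": "es", "japanese": "ja"}

# "|language=" with its surrounding whitespace, the value, trailing whitespace,
# then the next pipe or the template's closing braces.
LANGUAGE_PARAM = re.compile(r"(\|\s*language\s*=\s*)([^|{}]*?)(\s*)(?=\||\}\})")

def convert_language(template: str) -> str:
    """Replace a known language name with its code; keep everything else."""
    def fix(match: re.Match) -> str:
        code = LANGUAGE_CODES.get(match.group(2).lower())
        return match.group(1) + (code or match.group(2)) + match.group(3)
    return LANGUAGE_PARAM.sub(fix, template)

converted = convert_language("{{cite news |title=X |language=French |date=2020}}")
```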

edit summary messaging
For each edit made, task 18 creates an edit summary detailing what was done. For example, this edit summary from a test (no save) of Ida B. Wells:
 * Task 18 (cosmetic) (dev test): eval 146 templates: del empty params (1251×); hyphenate params (4×); del pos params (1×); del |ref=harv (4×);

Each summary always begins with the 'Task 18 (cosmetic) (dev test): eval n templates:' leader where n is the total number of cs1|2 templates and recognized redirects that task 18 evaluated. The remaining portion of the edit summary is assembled from one or more of these:
 * del cmtd params (n×); – deleted commented parameters
 * rep cmtd params (n×); – repaired commented parameters
 * del empty params (n×); – deleted empty parameters
 * hyphenate params (n×); – hyphenated parameter names
 * del empty pos params (n×); – deleted empty positional parameters
 * del |ref=harv (n×); – deleted |ref=harv
 * del |mode= (n×); – deleted |mode=
 * del |postscript= (n×); – deleted |postscript=
 * del |url-status= (n×); – deleted |url-status=
 * del |deadurl= (n×); – deleted |deadurl=
 * del |no-pp= (n×); – deleted |no-pp=
 * del |orig-year= (n×); – deleted |orig-year=
 * del |name-list-style= (n×); – deleted |name-list-style=
 * del |<name-list>-mask= (n×); – deleted |<name-list>-mask=
 * del |<name-list>-link= (n×); – deleted |<name-list>-link=
 * cvt lang vals (n×); – converted language values
 * skip (n×); – during development of task 18, it found malformed cs1|2 templates that contain the opening  or closing   that belongs to wikitables.  An example of this is found in  which is missing the closing   for .  Left unchecked, task 18 will see the table content as part of that broken  template and delete much of the double-pipe markup used to separate cells in the wikitable.  This particular garbage is detected and skipped, but editors are endlessly clever when it comes to breaking cs1|2 templates.

AWB truncates long edit summaries (phab:T199347) so the messaging is necessarily terse in an attempt to list everything accomplished by task 18 in the summary. When that is not possible, task 18 truncates its summary and adds an ellipsis.
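How such a summary might be assembled is sketched below in Python. The leader and the item format follow the examples above; the length limit is a caller-supplied, hypothetical value, and truncating at an item boundary is an assumption, since this page does not say where the bot cuts its summary.

```python
ELLIPSIS = " …"

def build_summary(template_count: int, actions: list, max_length: int) -> str:
    """Join 'label (n×);' items after the leader, truncating with an ellipsis."""
    summary = f"Task 18 (cosmetic) (dev test): eval {template_count} templates:"
    for label, count in actions:
        part = f" {label} ({count}×);"
        # Keep room for the ellipsis so the result never exceeds max_length.
        if len(summary) + len(part) > max_length - len(ELLIPSIS):
            return summary + ELLIPSIS
        summary += part
    return summary

actions = [
    ("del empty params", 1251),
    ("hyphenate params", 4),
    ("del pos params", 1),
    ("del |ref=harv", 4),
]
full = build_summary(146, actions, max_length=500)
short = build_summary(146, actions, max_length=80)
```

With a generous limit the result reproduces the Ida B. Wells example; with a tight limit the trailing items are dropped and the summary ends in an ellipsis.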

known issues
When an article transcludes a named reference that exactly matches a named reference in the article, and task 18 edits the article's named reference, MediaWiki will emit this error message:
 * The named reference "$1" was defined multiple times with different content (help page).

Task 18 cannot know that a cs1|2 template that it edits is mirrored in a transcluded template. If you know of articles where this is a problem, give a list of those articles to the bot's operator so that the bot can be instructed to skip those articles.