User:Trappist the monk/IL2LDR – In-line to list-defined referencing converter

IL2LDR is an AWB script that attempts to convert an article that uses in-line references into one that uses list-defined referencing (LDR). In-line references are scraped out of the article text and placed inside the article's reference-list template. The purpose is to make it easier for editors to work with article text without the clutter of the necessary referencing getting in the way.

While it is possible to use this tool to bulk-convert many articles, it is not intended for that purpose. Rather, it is intended to do the bulk of the work necessary when the decision has been taken, following proper consideration, to change an article from in-line to list-defined referencing.

What it does
The first thing the tool does is look for a reference-list template or a references tag. The tool does not support templates that use the group parameter. At the end of the process, the tool replaces the existing template or tag with a reference-list template whose refs parameter lists all of the in-line references scraped from the article text. If an acceptable template or tag can't be found, the tool abandons the edit with a status message.

Next follow several steps that make later processing easier and improve style consistency throughout the article:
 * 1) The tool standardizes the format of the article's <ref> tags by removing extraneous spaces, ordering attributes, and quoting attribute values. Here, the tool does support the group attribute, but only so far as to make tags that contain it consistent with those that do not.
 * 2) HTML and HTML-like tags are hidden by replacing their opening < and closing > characters with specific placeholder text strings.
 * 3) Vertical format references are flattened to horizontal format.
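
Steps 1 and 3 can be sketched roughly as follows. This is a minimal Python illustration, not the tool's own C# code; the function names, the regular expressions, and the alphabetical attribute order are all assumptions made for the example.

```python
import re

# Matches name=value pairs whose value is double-quoted, single-quoted, or bare.
ATTR_RE = re.compile(r"""(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s/>]+))""")
# Matches an opening or self-closing ref tag, however it is spaced.
REF_TAG_RE = re.compile(r"<\s*ref\b(.*?)(/?)\s*>", re.IGNORECASE | re.DOTALL)


def normalize_ref_tag(tag: str) -> str:
    """Return the ref tag with single spaces, ordered attributes, quoted values."""
    match = REF_TAG_RE.fullmatch(tag)
    if match is None:
        return tag  # not an opening or self-closing ref tag; leave untouched
    attrs = {}
    for key, dquoted, squoted, bare in ATTR_RE.findall(match.group(1)):
        attrs[key.lower()] = dquoted or squoted or bare
    # Assumption: attributes are ordered alphabetically (group before name).
    rendered = "".join(f' {key}="{value}"' for key, value in sorted(attrs.items()))
    return f"<ref{rendered} />" if match.group(2) else f"<ref{rendered}>"


def flatten_citation(template: str) -> str:
    """Collapse a vertically formatted template onto a single line."""
    # Pull each parameter that starts a new line up onto the previous line.
    flat = re.sub(r"\s*\n\s*\|", " |", template)
    # Pull closing braces that sit on their own line up as well.
    return re.sub(r"\s*\n\s*\}\}", "}}", flat)
```

For instance, `normalize_ref_tag('<ref  name = Smith2001 >')` gives `<ref name="Smith2001">`. A bare attribute value that contains a slash would be cut short by this sketch, so it is only suitable as an illustration.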

List-defined referencing makes use of reference definitions and reference instances. A reference definition has the form <ref name="name">reference content</ref>, where name is unique to that reference. A reference instance has the form <ref name="name" />.

For unnamed reference definitions, the tool looks at the reference content and, if the content is a cs1|2 template, attempts to extract something that can be used as a unique name. Unique names can be taken from several of the common identifier parameters supported by cs1|2 templates. In its current configuration, the tool extracts the parameter value from one of pmc, pmid, doi, or isbn, in that order. The tool then adds a name attribute to the reference tag using the identifier value; for example, if the extracted value is 12345, that value is used in the tag's name.
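
The identifier lookup might look something like this in Python. It is a sketch only: the function name and the regular expression are assumptions, and because the exact format of the resulting name is not given here, the function simply returns the raw identifier value.

```python
import re
from typing import Optional

# Identifier parameters in the priority order described above.
ID_PRIORITY = ("pmc", "pmid", "doi", "isbn")


def identifier_for_name(content: str) -> Optional[str]:
    """Return the first non-empty identifier value found, in priority order."""
    for param in ID_PRIORITY:
        # Match "|param = value", stopping at the next pipe or closing brace.
        match = re.search(
            r"\|\s*" + param + r"\s*=\s*([^|}\s][^|}]*)",
            content,
            flags=re.IGNORECASE,
        )
        if match:
            return match.group(1).strip()
    return None  # no usable identifier; the caller falls back to a created name
```

Given `{{cite journal |title=X |doi=10.1000/xyz |pmid=12345}}`, the sketch returns `12345`, because pmid ranks ahead of doi regardless of where the parameters appear in the template.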

If the tool is unable to extract an identifier from the reference content, it creates a name that includes a three-digit number, beginning at 001 and increasing by one for every unnamed reference that receives a created name. After the tool has run successfully, editors should consider renaming definitions and instances that use these automated names because they are contextually meaningless.
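
A counter-based fallback of this kind could be sketched as below. The prefix `auto` is a purely hypothetical placeholder, since the tool's actual name format is not reproduced here; only the three-digit, start-at-001 numbering comes from the description above.

```python
from itertools import count
from typing import Callable


def make_name_generator(prefix: str = "auto") -> Callable[[], str]:
    """Return a function that yields prefix001, prefix002, ... on each call.

    The default prefix is a hypothetical stand-in, not the tool's real one.
    """
    counter = count(1)  # numbering begins at 001

    def next_name() -> str:
        # Zero-pad to three digits; a fourth digit appears past 999.
        return f"{prefix}{next(counter):03d}"

    return next_name
```

Each generator keeps its own counter, so one generator per article gives numbering that restarts at 001 for every article.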

Once named, reference definitions are moved into a list and sorted by name, leaving behind a reference instance to mark each definition's original location. Because reference definitions must be unique, the tool can now check for duplicate definitions. If duplicate definitions are found, the tool abandons the edit with a status message. If definitions couldn't be moved into the list, the tool also abandons the edit with a status message.
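
The move-and-check step can be illustrated with the following Python sketch. It assumes that tags have already been standardized to the `<ref name="...">` form, treats any repeated definition name as a duplicate, and sorts with Python's default string ordering; the function and exception names are invented for the example.

```python
import re
from typing import List, Tuple

# A named definition: opening tag, content, closing tag. Instances
# (<ref name="..." />) do not match because of the "> requirement.
DEFINITION_RE = re.compile(r'<ref name="([^"]+)">(.*?)</ref>', re.DOTALL)


class DuplicateRefName(Exception):
    """Raised when two definitions share the same name."""


def move_definitions(text: str) -> Tuple[str, List[str]]:
    """Replace each definition with an instance; return (text, sorted definitions)."""
    definitions = {}

    def to_instance(match: re.Match) -> str:
        name, content = match.group(1), match.group(2)
        if name in definitions:
            raise DuplicateRefName(name)  # the edit would be abandoned here
        definitions[name] = content
        return f'<ref name="{name}" />'  # marks the original location

    body = DEFINITION_RE.sub(to_instance, text)
    ordered = [f'<ref name="{n}">{definitions[n]}</ref>' for n in sorted(definitions)]
    return body, ordered
```

Raising an exception stands in for abandoning the edit with a status message; a caller would catch it and report the offending name.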

With a complete list of reference definitions, the tool replaces the existing reference-list template or tag with a reference-list template whose refs parameter holds the list of reference definitions, each on its own line, in ascending order.
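
Assembling the final list might look like this. The sketch assumes the output takes the usual `{{reflist|refs=...}}` shape used for list-defined references on the English Wikipedia, and that ascending order means Python's default string ordering; the function name is made up for illustration.

```python
from typing import Iterable


def build_reference_list(definitions: Iterable[str]) -> str:
    """Return a reflist template with one definition per line, sorted ascending."""
    lines = sorted(definitions)
    # Assumption: the refs parameter opens the list and }} closes it on its own line.
    return "{{reflist|refs=\n" + "\n".join(lines) + "\n}}"
```

Because every definition begins with the same `<ref name="` text, sorting the whole strings orders them by name.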

The last step is to restore the hidden HTML and HTML-like tags.

Status messages
The tool emits several status messages that are available in AWB's Logs tab. There is one success message:
 * Converted in-line references to list-defined references – the tool thinks that it was able to move all references from article text into the template.  Editors should, of course, review the results before clicking the Save button.

When things don't go quite right, the tool reports that and abandons the edit. In some cases the article is skipped, in others the incomplete edit is shown so that editors can find and fix the problem in the source. Incomplete edits should not be saved.
 * article has no  – the tool requires a reference-list template, with or without column parameters, or a references tag; the tool does not support templates that use the group parameter, nor does it support articles that already have a list-defined referencing structure
 * duplicate ref name:  – all reference names must be unique; this message identifies one that is not

The code also contains messages that, because of later improvements, should no longer display. If any of these messages are encountered, please report them.
 * unable to name all unnamed references – the tool was unable to extract any of the cs1|2 identifiers pmc, pmid, doi, or isbn from the reference content for use as a unique name; this message also occurred when the reference content contained HTML markup. Because names are now created automatically and because HTML is hidden, we should never see this message
 * unable to move reference:  – the tool was unable to move one or more reference definitions; because vertical format citations are flattened to horizontal format, we should never see this message
 * no references to move – all named references are in vertical format; because vertical format citations are flattened to horizontal format, we should never see this message

How to make it work

 * 1) Start AWB
 * 2) Log in, then load your default settings file or manually set AWB's settings. Because the changes that the tool makes are substantial, editors using this tool for the first time should limit AWB functionality to just this tool
 * 3) From the AWB Tools menu select Make module
 * 4) In Module make sure that the Enabled checkbox is checked
 * 5) In Module make sure that C# 2.0 is selected in the dropdown box
 * 6) Come back to this page; copy everything from the shaded box in §Script to your clipboard
 * 7) In Module, replace the content of the large text box with the content of your clipboard
 * 8) In Module, click the Make module button. If the module makes, you should see Module compiled and loaded in green and the current time under the dropdown box. If the module didn't make, check that the whole of §Script replaced the whole of the default content of the large text box.
 * 9) Close Module
 * 10) Add the page or pages to be edited to the page list
 * 11) Click Start
 * 12) If the tool reports problems, fix them outside of AWB and try again