Wikipedia talk:WikiProject National Register of Historic Places/Progress/Instructions

This page documents the NRHPPROGRESS statistics and map system in enough detail for others to participate in its development and maintenance. The system uses JavaScript coded by User:Dudemanfellabra and currently maintained by User:Magicpiano. Please discuss issues or improvements at either WT:NRHP or the maintainer's talk page.

Update statistics

 * (adapted from User:Dudemanfellabra/UpdateNRHPProgress)

Anyone can do this; it does not require bot approval. An elaborate program is involved, but in the end all it does is edit one page in WikiProject NRHP space, the WP:NRHPPROGRESS page. To enable its use, add the following to the bottom of your common.js or vector.js, depending on which skin you use to view Wikipedia:

This script generates a button at the top of the Progress page which reads "Update Statistics". This may or may not be visible when using the Google Chrome browser. If it does not appear, refresh the page (Shift-F5) until it does. To see it more reliably, use Microsoft Edge or Mozilla Firefox. The script is best executed in a relatively recently started browser, because it requires a fairly large amount of memory.

Clicking that button starts the script, which checks all ~4000 lists under the scope of WP:NRHP and gathers, by state and by county, the number of listings, pictures uploaded, articles created, and "quality" statistics (e.g. number of stubs, start+, untagged, etc.). The script automatically saves the updated statistics to the WP:NRHPPROGRESS page, unless there are issues (see below).

How it works
When the "Update Statistics" button is clicked, the script does the following actions:
 * 1) Extract the wikitext of the Progress page using the Wikimedia API. If an error is encountered at this stage, the script aborts itself.
 * 2) After extracting the wikitext of the Progress page, the script fires off in rapid succession asynchronous API queries for the wikitext of each county list.
 * 3) When each wikitext query completes, the script extracts instances of NRHP row to obtain a total number of listings for that county and gathers information about whether or not each listing is illustrated (i.e. whether or not the image parameter is non-blank). If any error is encountered at this stage, a fatal error is triggered (explained below), and the user is asked to skip the county that produced errors or try to query the county again.
 * 4) Next, the script fires off asynchronous queries to find out if each article is a bluelink (note: links to disambiguation pages are counted as redlinks); the articles are grouped into batches of 50 here to reduce the number of API calls. At the same time, the script checks whether each page is an NRIS-only article and records that information.
 * 5) After all queries for each batch of articles are complete, the script queries the talk page to find out quality statistics (Stub-class, Start+, etc.); the articles are here again grouped into batches of 50 to reduce strain on the API. (Note: If a listing links to another list of NRHP properties (e.g. a MPS), the listing is counted as unarticled. If a listing links to some other type of list (e.g. a list of contributing properties to a historic district), the link is counted as Stub-class.)
 * 6) After the above steps are completed for all of the ~4000 lists on the page, the script totals up the statistics for each county with sublists, each state, and the entire nation, taking into account the duplicate information found at WP:NRHPPROGRESS/Duplicates, adds this information to the previously-fetched wikitext, and edits the page with the newly generated wikitext. After the edit is completed, a diff link is generated, and the script exits.
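
Steps 3 and 4 above can be illustrated with a minimal sketch. This is not the actual script: the function names (countListings, chunk, buildQueryUrl) are hypothetical, the parsing assumes each listing is a single NRHP row template call whose image parameter contains no nested templates, and the query shown is only the generic MediaWiki API form for a batch of titles.

```javascript
// Hypothetical helper: count listings and illustrated listings in the
// wikitext of one county list.
// Assumption: every listing is one {{NRHP row ...}} call, and its image
// parameter holds a plain file name (no nested templates or pipes).
function countListings(wikitext) {
  // Everything after each "{{NRHP row" opener is treated as one row.
  const rows = wikitext.split(/\{\{\s*NRHP row\b/i).slice(1);
  let illustrated = 0;
  for (const row of rows) {
    // A listing counts as illustrated when image= is present and non-blank.
    const m = row.match(/\|\s*image\s*=([^|}\n]*)/);
    if (m && m[1].trim() !== "") illustrated++;
  }
  return { total: rows.length, illustrated };
}

// Hypothetical helper: split article titles into batches of at most 50,
// so that each API call covers many titles at once.
function chunk(items, size = 50) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical helper: build one MediaWiki API query URL for a batch.
// Titles are joined with "|", the separator the API uses for multi-value
// parameters.
function buildQueryUrl(titles) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    titles: titles.join("|"),
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}
```

For example, a county list with 120 listings would produce three batches (50, 50, and 20 titles), and therefore three bluelink queries instead of 120.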

Explanation of error messages
To prevent the script from writing gibberish to the Progress page, if at any time during execution the script encounters an error while parsing the wikitext of a given list, it will ask the user how to proceed. The script will identify the problematic list, and the user will have the option to make the script retry that county (in case of connectivity/unknown issues) or to skip the problematic county in favor of later manually updating it. The following error messages may be encountered:


 * Error: No county section found for LISTNAME! – This error is only triggered for links from the Progress page which are redirects. It means that the script was unable to find a section located at the target of the redirect link specified. The most likely cause of this error is that the redirect does not point to a specific section on the page. For example, the link for National Register of Historic Places listings in Autauga County, Alabama (a redirect) points to the section National Register of Historic Places listings in Alabama. If someone has edited the state list to read something different (e.g. just "Autauga"), the script will produce an error. To fix this, simply change the redirect to point to the correct section or change the section title to match the redirect (e.g. for the above example, one would have to change the redirect to point to National Register of Historic Places listings in Alabama or change the section name to read "Autauga County" to match the redirect). Usually the former option–changing the redirect–is preferable to changing section titles, which may lead to edit warring.
 * Error: No table found for LISTNAME! – This error means that there was no table found at the location of the link specified. This is the least common error, and should only trigger if someone has vandalized a county list by blanking it or removing a large chunk of random code, or someone has vandalized the Progress page itself so that the link there does not point to the correct page. To fix it, revert the vandalism and retry the county.
 * Error: Incorrectly formatted table for LISTNAME! – This error means that a table was found at the target link but it does not seem to include a list of sites on the National Register, meaning it does not use the NRHP row template. Like the previous error, this one should only be triggered if someone has vandalized the county list or the Progress page itself. To fix it, revert the vandalism and retry the county.
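
The order in which these three checks apply can be sketched as follows. This is an illustration only, not the script's actual code: the function name checkList is hypothetical, and it assumes that a section is a wikitext heading line, that a table is recognized by a marker string (defaulting to the wiki table opener "{|"), and that a correctly formatted table contains at least one NRHP row call.

```javascript
// Escape a string so it can be embedded literally in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical check: returns an error message, or null if the list looks
// parseable. sectionName is given only when the link is a redirect that
// targets a section of a larger list.
// Assumption: tableMarker ("{|" by default) identifies a table.
function checkList(listName, wikitext, sectionName, tableMarker = "{|") {
  let text = wikitext;
  if (sectionName !== undefined) {
    // Look for a heading line such as "== Autauga County ==".
    const heading = new RegExp(
      "^==+[ \\t]*" + escapeRegExp(sectionName) + "[ \\t]*==+[ \\t]*$", "m");
    const m = heading.exec(wikitext);
    if (!m) return "Error: No county section found for " + listName + "!";
    // Keep only the text between this heading and the next one.
    text = wikitext.slice(m.index + m[0].length);
    const next = text.search(/^==+[^=\n].*==+[ \t]*$/m);
    if (next !== -1) text = text.slice(0, next);
  }
  if (!text.includes(tableMarker)) {
    return "Error: No table found for " + listName + "!";
  }
  if (!/\{\{\s*NRHP row\b/i.test(text)) {
    return "Error: Incorrectly formatted table for " + listName + "!";
  }
  return null;
}
```

With this ordering, a renamed section (e.g. "Autauga" instead of "Autauga County") is reported as a missing county section before any table check is attempted.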

Other errors
If any error other than the above is encountered, the script simply retries the query after a brief pause. The most common such error is the API returning a warning that the "rate limit" (the number of API queries per unit time) has been exceeded, meaning the script should slow itself down. If too many of these errors are encountered, the script throttles itself by increasing the gap between each subsequent API query. At the end of execution, these less serious errors are written to the JavaScript console for examination by the user. The script should be rerun after reported errors have been corrected.
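
The retry-and-throttle behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual logic: the function name queryWithRetry, the doubling of the delay after each failure, the attempt limit, and the option names are all hypothetical.

```javascript
// Default pause implementation: resolve after ms milliseconds.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: run doQuery (a function returning a promise),
// retrying after a pause when it fails.
// Assumptions: the pause doubles after each failure, and the helper gives
// up after maxAttempts tries. Error messages are collected in log so they
// can be written to the console at the end of the run.
async function queryWithRetry(doQuery, options = {}) {
  const {
    maxAttempts = 5,
    baseDelayMs = 1000,
    sleep = defaultSleep,
    log = [],
  } = options;
  let delay = baseDelayMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await doQuery();
    } catch (err) {
      log.push(String(err && err.message ? err.message : err));
      if (attempt >= maxAttempts) throw err;
      await sleep(delay); // brief pause before retrying
      delay *= 2;         // widen the gap after repeated failures
    }
  }
}
```

A caller would pass the same log array to every query and print it once all lists have been processed, which mirrors the end-of-run console report described above.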


 * If listing pages have been moved or redirected, the script may fail on those pages. One of the above messages may pop up, and you should record the affected page names. Correction typically involves updating the table in WP:NRHPPROGRESS to use the proper non-redirected link and name for the page.
 * The script is somewhat picky about formatting in the listing pages, including the presence of whitespace in some areas. If a page fails to be processed by the script, you should examine that page's history for recent changes that may have an impact. Examine changed areas of the page for extraneous whitespace, removing it if necessary.
 * If listing pages have been split or merged, and the progress page has not been updated to reflect that activity, counts in affected states will be off. This is presently not easy to detect, and is a potential source of error in the data, since counts are either omitted or duplicated. If the differences are small, the state totals may not be obviously wrong. Correction consists of editing the affected state tables on the progress page to reflect the current structure of the listing pages.

Update duplicates info
This task is performed by User:NationalRegisterBot.

Update "substub" listing
This task is performed by User:NationalRegisterBot.

Update maps
The maps at the top of the WP:NRHPPROGRESS page are updated by a different method, requiring a second script, an account at Wikimedia Commons, and the use of a plain-text editor on your computer. Because of the nature of the edits to be made, you should NOT use Microsoft Word (or any other formatting word processor such as LibreOffice), or an editor specifically designed for editing SVG images. Use Notepad on Windows or something like gedit on Linux.

To enable use of the script add the following to your common.js or vector.js:

This creates a button on the WP:NRHPPROGRESS page labeled "Generate SVG Data"; if you have that page already open, you will have to do a browser refresh to see it. (It is also subject to the same appearance issue affecting the data update button on Google Chrome.)


 * 1) Click on "Generate SVG Output". This will run a fast script that populates WikiProject National Register of Historic Places/Progress/SVG with data that can be used to update the maps. It creates partial sections of SVG formatting information that need to be edited into the map files.
 * 2) If you have not previously done so, save the following map files to your hard drive:
 ** commons:File:NRHP Illustrated Counties.svg
 ** commons:File:NRHP Articled Counties.svg
 ** commons:File:NRHP Start+ Counties.svg
 ** commons:File:NRHP Counties Net Quality.svg
 * 3) One by one, open each of these files in your text editor.
 * 4) You first need to identify a section of text to remove. It begins with (and includes) the comment header for ALABAMA (near the top of the image file), and ends just before the   tag that is just before the line containing , which may be located by searching for "county-group".  This entire block of formatting is deleted.
 * 5) In place of what you removed, paste in the block of formatting from WikiProject National Register of Historic Places/Progress/SVG that matches the file you are editing.
 * 6) Save the file, and verify that it correctly displays in an image viewing program capable of rendering SVGs. You should also verify at this point that the data appears correct, by examining areas you expect to have changed recently.
 * 7) One by one, upload new versions of each map to Commons.
 * 8) After uploading each file, do a server purge of that file. This ensures that the entire Wikimedia server farm will soon get correct copies of the file.
 * 9) Update the progress page to reflect the date on which you did the update. You should also then server purge the progress page.
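
The remove-and-paste edit in the steps above is a plain text splice, which can be expressed as a short sketch. This is only an illustration of the operation, not a replacement for the manual procedure: the function name replaceBlock is hypothetical, and the start and end marker strings must be supplied by the caller after inspecting the actual map file.

```javascript
// Hypothetical helper: replace the block that begins with (and includes)
// startMarker and ends just before endMarker.
// Assumption: both markers are exact strings copied from the map file;
// nothing here guesses what they are.
function replaceBlock(svgText, startMarker, endMarker, newBlock) {
  const start = svgText.indexOf(startMarker);
  if (start === -1) throw new Error("start marker not found");
  const end = svgText.indexOf(endMarker, start + startMarker.length);
  if (end === -1) throw new Error("end marker not found");
  // Keep everything before the start marker and everything from the end
  // marker onward; only the text in between is swapped out.
  return svgText.slice(0, start) + newBlock + svgText.slice(end);
}
```

Because the function throws when either marker is missing, a file whose layout has changed is left untouched rather than being silently corrupted.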

Note: during the upload process, Commons sometimes does not accept the update immediately but instead asks whether you really want to upload. This happens if it detects no change in the file, which can occur for the "Started" file, which changes less quickly. You should override this and select the option to upload anyway, to keep the dates in sync and avoid later doubts or questions about the information in the files. (The history of map versions is used in a mini-video of file versions.)

Links for purging these files:
 * purge the Illustrated map at Commons
 * purge the Articled map at Commons
 * purge the Started map at Commons
 * purge the "Net Quality" map at Commons
 * purge the NRHPPROGRESS page

It may also be useful or necessary to purge the Wikipedia versions of these files:
 * purge the Illustrated map at Wikipedia
 * purge the Articled map at Wikipedia
 * purge the Started map at Wikipedia
 * purge the "Net Quality" map at Wikipedia
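
For reference, purge links of this kind generally use the standard MediaWiki action=purge URL form. The following is a minimal sketch of how such links could be built; the function name purgeUrl and the host table are hypothetical, and the file names are the four map files listed earlier on this page. Depending on the wiki's settings, following such a link may show a confirmation prompt before the purge is carried out.

```javascript
// Host names for the two sites on which the map files can be purged.
const HOSTS = {
  commons: "commons.wikimedia.org",
  wikipedia: "en.wikipedia.org",
};

// The four map files named in the update steps.
const MAP_FILES = [
  "File:NRHP Illustrated Counties.svg",
  "File:NRHP Articled Counties.svg",
  "File:NRHP Start+ Counties.svg",
  "File:NRHP Counties Net Quality.svg",
];

// Hypothetical helper: build an action=purge link for a page title.
// URLSearchParams takes care of encoding spaces and the "+" in "Start+".
function purgeUrl(site, title) {
  const params = new URLSearchParams({ title: title, action: "purge" });
  return "https://" + HOSTS[site] + "/w/index.php?" + params.toString();
}

// One purge link per map file on each site.
const purgeLinks = [];
for (const site of Object.keys(HOSTS)) {
  for (const file of MAP_FILES) {
    purgeLinks.push(purgeUrl(site, file));
  }
}
```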