User:Magicpiano/NRBot

Page for development of NationalRegisterBot.

Links

Final destinations:

Intermediate outputs:

Bot sources:

Other NRHP script sources:

Programming documentation:

Configuration

NationalRegisterBot's vector.js needs to import the NRISOnly.js file (currently at User:Magicpiano/NRBot/NRISOnly.js).
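As a rough illustration, the import could look like the sketch below. It assumes the standard MediaWiki `mw.loader.load` call with a raw-script URL; the helper name `rawScriptUrl` is hypothetical, and the bot's actual vector.js may use a different mechanism.

```javascript
// Page title taken from the configuration note above.
const NRIS_ONLY_TITLE = 'User:Magicpiano/NRBot/NRISOnly.js';

// Hypothetical helper: build the raw-JavaScript URL for an on-wiki script page.
function rawScriptUrl(title) {
  return 'https://en.wikipedia.org/w/index.php?title=' +
    encodeURIComponent(title) +
    '&action=raw&ctype=text/javascript';
}

// Only load when running inside MediaWiki, where the `mw` global exists.
if (typeof mw !== 'undefined' && mw.loader) {
  mw.loader.load(rawScriptUrl(NRIS_ONLY_TITLE));
}
```

If the script is later moved, only the title constant would need to change.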

Operating instructions

  1. Log into NationalRegisterBot.
  2. Go to Wikipedia:WikiProject National Register of Historic Places/Progress.
  3. Click the button labelled "Update list for NationalRegisterBot". This takes up to 3 hours and (re)populates User:NationalRegisterBot/AllNRHPPages, User:NationalRegisterBot/Substubs and User:NationalRegisterBot/NRISOnly.
  4. Go to User:NationalRegisterBot/NRISOnly.
  5. Check for missing refnums, links to disambiguation pages, and other errors reported on the page. Fix as needed, manually removing entries to omit from further processing.
  6. Click the button labelled "Tag these articles". This will tag and untag the articles listed.
  7. Go to User:NationalRegisterBot/AllNRHPPages/Duplications.
  8. Compare the current and previous revisions. Examine new entries for possibly incorrect uses of refnums, which may have been introduced when new entries were added (these show up as a "duplicate" within the same list). If necessary, manually remove entries to omit them from further processing.
  9. Click the button labelled "Gather duplicate stats". This will update Wikipedia:WikiProject National Register of Historic Places/Progress/Duplicates.
  10. For the updated duplicates to be used on the progress page, the progress page itself needs to be updated. Do this from your own user account (import User:Magicpiano/NRBot/UpdateNRHPProgress.js).
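The manual check in step 8 looks for a refnum that appears more than once within the same list. A minimal sketch of that check follows; the record shape `{ list, refnum, article }`, the function name, and the sample values are assumptions made for illustration and are not taken from the bot's source.

```javascript
// Group hypothetical { list, refnum, article } records and report any
// refnum that occurs more than once within the same list.
function findSameListDuplicates(entries) {
  const groups = new Map(); // list name -> Map(refnum -> articles)
  for (const { list, refnum, article } of entries) {
    if (!groups.has(list)) groups.set(list, new Map());
    const byRefnum = groups.get(list);
    if (!byRefnum.has(refnum)) byRefnum.set(refnum, []);
    byRefnum.get(refnum).push(article);
  }
  const duplicates = [];
  for (const [list, byRefnum] of groups) {
    for (const [refnum, articles] of byRefnum) {
      if (articles.length > 1) duplicates.push({ list, refnum, articles });
    }
  }
  return duplicates;
}

// Made-up sample data: two rows in "List A" share a refnum.
const sampleEntries = [
  { list: 'List A', refnum: '00000001', article: 'Example House' },
  { list: 'List A', refnum: '00000001', article: 'Example Barn' },
  { list: 'List B', refnum: '00000001', article: 'Example House' },
];
const sampleDuplicates = findSameListDuplicates(sampleEntries);
```

A refnum shared across different lists is not flagged here, since only same-list repeats are described above as a sign of a likely error.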

Troubleshooting and maintenance

The names and organization of listing files change from time to time. Some of this structure is encapsulated in the code the bot runs. If listing files are redirected, or new ones are created by splitting existing listing files, those changes need to be reflected in the StateStructure array in the CheckDuplicates function. The names in this structure should match the actual listing files, i.e. they should not reference redirects.
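One way to spot stale names is to ask the MediaWiki API which titles are redirects (a query such as `action=query&titles=A|B&redirects=1&format=json` reports them under `query.redirects`). The sketch below only processes such a response. It assumes the listing names have already been pulled out of StateStructure into a flat array, since the real shape of that array is not shown here; the function name and sample titles are hypothetical.

```javascript
// Given listing-file names and a parsed action=query response obtained with
// the `redirects` parameter, return the names that are redirects and where
// they point. The flat array of names is an assumption about the input.
function findRedirectedNames(listingNames, apiResponse) {
  const redirects = (apiResponse.query && apiResponse.query.redirects) || [];
  const targetBySource = new Map(redirects.map((r) => [r.from, r.to]));
  return listingNames
    .filter((name) => targetBySource.has(name))
    .map((name) => ({ name, target: targetBySource.get(name) }));
}

// Made-up response in the shape the API uses for resolved redirects.
const sampleResponse = {
  query: {
    redirects: [{ from: 'Old listing name', to: 'New listing name' }],
  },
};
const staleNames = findRedirectedNames(
  ['Old listing name', 'Unchanged listing name'],
  sampleResponse
);
```

Each name reported this way would be a candidate for replacement by its target in StateStructure.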

Improvements

  • Change duplicate-checking and article tagging code to use asynchronous retrieval methods.
  • Create status files that can be transcluded into the bot's userpage, so that it can show most recent operator and most recent (or ongoing) run.
  • Improve handling of malformed {{NRHP row}} contents, including wikilinks in the article field.
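For the last item, one possible approach is to normalize the article field before using it as a page title. This is a minimal sketch rather than the bot's current behaviour; the function name and sample values are hypothetical, and it handles only a field that consists entirely of a single wikilink.

```javascript
// If the field is a single wikilink such as [[Target|label]] or [[Target]],
// return the link target; otherwise return the field with whitespace trimmed.
function plainArticleTitle(field) {
  const match = /^\s*\[\[([^\]|]+)(?:\|[^\]]*)?\]\]\s*$/.exec(field);
  const title = match ? match[1] : field;
  return title.trim();
}

const piped = plainArticleTitle('[[Example Mill (Sampletown)|Example Mill]]');
const simple = plainArticleTitle('[[Example Mill]]');
const untouched = plainArticleTitle('  Example Mill ');
```

Fields that mix plain text with a wikilink are passed through unchanged, so they would still need to be reported for manual repair.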