User:Yapperbot/Scantag

Scantag is a Yapperbot task that runs at low priority, scanning through every page on Wikipedia and adding the appropriate maintenance tags wherever certain patterns match.

This is especially useful for tagging articles that contain broken templates, as these do not show up as transclusions of the template. It can also be used for a number of other things: any issue that needs a maintenance template, and that can be detected through a pattern search of the article body, is a potential candidate.

What's currently running?
To see the raw details of the currently running rules, take a look at the JSON file that configures them.

How can I request a Scantag pattern?
Add a request for a new pattern. In your request, you should explain:
 * What pattern you want added (if you know regex, please provide a regex pattern; if you don't, please explain as carefully as you can, so that someone can craft one for you)
 * Why you want this pattern to be scanned
 * What you want the found articles to be tagged with
 * That you understand that this will not happen immediately

Scantag rules may not go live for a long period of time, as the bot will only reread the rules when it has finished scanning the entire corpus of Wikipedia pages. You should not expect your rules to start scanning for at least a week, probably longer, after you make your request.

Either the bot operator or any administrator who is comfortable doing so may add rules to the bot.

Instructions for admins
Scantag rules can be modified by any administrator, as they are stored in Yapperbot's user JSON pages. However, as these rules will be applied to many, many pages, it is very important that they are accurate. To that end, '''any administrator modifying Scantag rules should first ensure that they are completely comfortable with doing so. If you have any doubts, do not modify the live rules.'''

Scantag rules
Scantag rules can be tested by modifying the sandbox JSON page. A Scantag rule is made up of the following components:

prefix, suffix and testpage are optional; all other tags are required.

The value of prefix is assumed to have the same precedence for MOS:ORDER as a maintenance template. The value of suffix is simply appended to the end of the article.
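
As a rough illustration of how these components might fit together, the following Python sketch applies a single rule to a page's wikitext. Only prefix, suffix, testpage and noTagIf are names taken from this page; the regex key, the example pattern, and the Example maintenance tag template are hypothetical. The sketch places prefix at the very top of the text rather than at its MOS:ORDER position, and it uses Python's re module, whose syntax may differ from the regex flavour the bot actually uses.

```python
import re

# Hypothetical rule. "regex" is an assumed key name; "noTagIf", "prefix"
# and "suffix" are the names used on this page.
RULE = {
    "regex": r"<ref>\s*</ref>",  # assumed example: an empty reference
    "noTagIf": r"\{\{\s*Example maintenance tag\s*\}\}",
    "prefix": "{{Example maintenance tag}}\n",
    "suffix": "",
}


def apply_rule(rule, text):
    """Return the tagged wikitext, or the text unchanged if the rule does not apply."""
    if not re.search(rule["regex"], text):
        return text  # the pattern does not match, so there is nothing to tag
    if re.search(rule["noTagIf"], text):
        return text  # the page already carries the tag
    # Simplification: prefix goes at the very top; suffix goes at the very end.
    return rule.get("prefix", "") + text + rule.get("suffix", "")


tagged = apply_rule(RULE, "Some text.<ref></ref>")
```

Applying the rule a second time to tagged leaves it unchanged, because noTagIf now matches the tag that prefix added.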

Rule sandbox and test pages
Once you have modified the sandbox JSON page, Yapperbot should update the sandbox report page within five minutes. The report contains information explaining each of the rules that Scantag has been given in the sandbox. If you set a testpage parameter in the Scantag rule, Yapperbot will also have run the rule over that page twice. If you see two runs, rather than just one, in the page history linked (click "Up-to-date"), this means that your noTagIf regex is not matching the result of prefix or suffix. This is bad: it means that the prefix and/or suffix will be added to matching pages every time the bot runs, not just the first time the bot spots the issue. Correct your noTagIf regex if you see this happening.
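
The two-run check described above can be approximated offline before editing the sandbox. The sketch below is a minimal simulation, not the bot's own code: it runs a rule over some text twice and counts how many runs changed it. The regex key and both example rules are hypothetical, and prefix is simply placed at the top of the text.

```python
import re


def run_rule(rule, text):
    """Apply one rule once; return the text unchanged if it does not apply."""
    if not re.search(rule["regex"], text) or re.search(rule["noTagIf"], text):
        return text
    return rule.get("prefix", "") + text + rule.get("suffix", "")


def count_edits(rule, text, runs=2):
    """Count how many of `runs` consecutive runs actually changed the text."""
    edits = 0
    for _ in range(runs):
        new_text = run_rule(rule, text)
        if new_text != text:
            edits += 1
        text = new_text
    return edits


PAGE = "Some text.<ref></ref>"

# noTagIf matches what prefix adds, so the second run changes nothing.
good_rule = {
    "regex": r"<ref>\s*</ref>",
    "noTagIf": r"\{\{Example maintenance tag\}\}",
    "prefix": "{{Example maintenance tag}}\n",
}

# noTagIf names a different template, so the tag is added on every run.
bad_rule = dict(good_rule, noTagIf=r"\{\{Example tag\}\}")

good_edits = count_edits(good_rule, PAGE)
bad_edits = count_edits(bad_rule, PAGE)
```

A result of 1 corresponds to a single run in the test page history; a result of 2 corresponds to the double-tagging problem that a corrected noTagIf regex should eliminate.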

If you modify the sandbox JSON page, the sandbox report will be automatically regenerated within the next five minutes. If you modify the test pages, or any other part of the system, you can manually force a sandbox refresh by removing the template from the top of the sandbox report page.

Pushing rules live
Never push rules live if you have not first tested them in the sandbox, even if a trusted user wrote them.

It is strongly advised to consult at the very least the bot operator or one other sysop before making a rule live.

Once you have tested the rules you set up in the sandbox, and you are satisfied that they are working correctly, you can add the sandbox rules to the production JSON file. Note that, because the bot runs over the entire contents of the article namespace, it may take a long time before it finishes its current run, and restarts with the new rules; consequently, the lead time for the rules to take effect may be long.
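
The lead time follows from the rules being read only between full scans. The following Python sketch models that behaviour under stated assumptions: the function names, the rule names and the two-page corpus are all hypothetical, and the real bot's scheduling is not shown on this page.

```python
def run_passes(load_rules, list_pages, passes):
    """Simulate full scans; return (page, active rules) pairs in processing order."""
    log = []
    for _ in range(passes):
        rules = tuple(load_rules())  # snapshot taken once, at the start of each pass
        for page in list_pages():
            log.append((page, rules))
    return log


live_rules = ["rule-a"]  # stands in for the production JSON file


def pages_with_midscan_edit():
    """Yield a tiny corpus; a new rule is pushed live partway through a scan."""
    yield "Page 1"
    if "rule-b" not in live_rules:
        live_rules.append("rule-b")  # an administrator adds a rule mid-scan
    yield "Page 2"


log = run_passes(lambda: live_rules, pages_with_midscan_edit, passes=2)
```

In this model, rule-b is added while the first pass is under way, but it is only applied from the start of the second pass.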