User:GoldenRing/BRFA Draft

Operator:

Time filed: 12:49, Tuesday, June 25, 2019 (UTC)

Function overview: Automatically place word count templates on statements at WP:A/R/C and related pages.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: Not yet

Links to relevant discussions (where appropriate): This has been discussed on the clerks-l mailing list with no objections.

Edit period(s): Continuous.

Estimated number of pages affected: One or two pages per arbitration case.

Namespace(s): Wikipedia

Exclusion compliant (Yes/No): No

Function details: This bot will edit only pages in the (new) category Category:ArbCom pages with automatic evidence length headers. It will run every time any edit is made to a page in that category. If this is not acceptable, falling back to running once every 24 hours would be a reasonable substitute. Initially this category will contain only Arbitration/Requests/Case, but we envisage it soon being used on case evidence pages.

Each time a page in that category is edited, the bot will retrieve both the wikitext of that page and the rendered HTML of the same version.
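
As a rough sketch of this retrieval step (the source is not yet published, so this is not the bot's actual code), the standard MediaWiki `action=parse` API can return the wikitext and the rendered HTML of one revision in a single request. The function names, the user-agent string and the choice of `formatversion=2` are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_parse_params(revid):
    """Query parameters asking for wikitext and HTML of a single revision."""
    return {
        "action": "parse",
        "oldid": str(revid),       # pin both outputs to the same revision
        "prop": "text|wikitext",   # rendered HTML and raw wikitext
        "format": "json",
        "formatversion": "2",      # plain strings instead of {"*": ...} wrappers
    }


def extract_revision(payload):
    """Return (wikitext, html) from a decoded action=parse response."""
    parsed = payload["parse"]
    return parsed["wikitext"], parsed["text"]


def fetch_revision(revid, user_agent="ExampleBot/0.1 (hypothetical)"):
    """Fetch one revision over HTTP; the user agent here is a placeholder."""
    url = API_URL + "?" + urlencode(build_parse_params(revid))
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return extract_revision(json.load(response))
```

Requesting both properties with the same `oldid` avoids a race in which the page is edited between two separate fetches.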

Working on the HTML text:
 * For each h3 tag in the page:
   * Extract all DOM elements and text between the h3 tag and the next h3 tag, or the end of the page if there is no next h3 tag.
   * If the h3 tag has the  class, ignore this tag.
   * Count the visible words in the extracted elements and text. For the purposes of counting words, the following are ignored:
     * Any tag that jQuery considers.
     * Any tag with the  class.
     * Any  tag with the ,   or   classes.
     * Any  tag with the   class.
     * Any string matching the regular expression  (this removes timestamps).
   * Count the diffs in the extracted elements and text. Initially this would be any wikilink matching   or any URL with host   and a   or   parameter, though I anticipate that this will require some tweaking.
   * If the extracted text already contains an ArbCom evidence length header template, update the  parameter with the new word count and the   parameter with the new diff count.
   * If the extracted text does not already contain an ArbCom evidence length header and the word count exceeds 450, add the template with the calculated word count and diff count.
 * Save the page.
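
The counting steps above could be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not the bot's implementation: the ignored class (`mw-editsection`), the timestamp pattern and the diff-link rules are all placeholders chosen for the example, and `SectionCounter`, `is_diff_link` and `count_sections` are hypothetical names.

```python
import re
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse

# The three rules below are illustrative assumptions, not the proposal's own.
IGNORED_CLASSES = {"mw-editsection"}
TIMESTAMP_RE = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
HEADER_THRESHOLD = 450  # word count above which a header is added (from the proposal)


def is_diff_link(href):
    """Assumed rule: Special:Diff links, or Wikipedia URLs with diff/oldid."""
    parsed = urlparse(href)
    if parsed.path.startswith("/wiki/Special:Diff/"):
        return True
    query = parse_qs(parsed.query)
    on_wikipedia = parsed.netloc.endswith("wikipedia.org")
    return on_wikipedia and ("diff" in query or "oldid" in query)


class SectionCounter(HTMLParser):
    """Collects text and diff links for each h3-delimited section."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.sections = []   # one dict per h3 heading, in page order
        self.in_h3 = False
        self.skip_depth = 0  # > 0 while inside an ignored element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return           # void elements never nest, so ignore them
        if self.skip_depth:
            self.skip_depth += 1
            return
        attrs = dict(attrs)
        if tag == "h3":
            self.in_h3 = True
            self.sections.append({"heading": "", "text": [], "diffs": 0})
        elif set((attrs.get("class") or "").split()) & IGNORED_CLASSES:
            self.skip_depth = 1
        elif tag == "a" and self.sections and is_diff_link(attrs.get("href") or ""):
            self.sections[-1]["diffs"] += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1
        elif tag == "h3":
            self.in_h3 = False

    def handle_data(self, data):
        if self.skip_depth or not self.sections:
            return           # ignored element, or text before the first h3
        if self.in_h3:
            self.sections[-1]["heading"] += data
        else:
            self.sections[-1]["text"].append(data)


def count_sections(html):
    """Return a list of (heading, word_count, diff_count), one per h3 section."""
    parser = SectionCounter()
    parser.feed(html)
    parser.close()
    results = []
    for section in parser.sections:
        text = TIMESTAMP_RE.sub(" ", "".join(section["text"]))
        results.append((section["heading"].strip(), len(text.split()), section["diffs"]))
    return results
```

A section would then need a new header when its word count exceeds `HEADER_THRESHOLD`. Visibility as jQuery computes it depends on CSS, so a static sketch like this can only approximate it by class name.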

Discussion
I hope this will be relatively uncontroversial, but I'm certainly open to suggestions of how the above logic might be flawed or produce undesired results. The intention here is to lighten the load on the arbitration clerks. We do not anticipate automated notifications of users who exceed the word count; IMO this is better done by a person. GoldenRing (talk) 12:49, 25 June 2019 (UTC)