User:Harej/sandbox

s a n d b o x since 2005

= Notes =
 * Navigation header
 * Template:Navigation header
 * Template:Navigation header/doc
 * Template:Navigation header/styles.css
 * Module:Navigation header
 * Module:Navigation header/doc
 * Metrics dashboard
 * Template:Metrics dashboard
 * Template:Metrics dashboard/doc
 * Template:Metrics dashboard/styles.css
 * Module:Metrics dashboard
 * Module:Metrics dashboard/doc
 * Alert list
 * Template:Alert list
 * Template:Alert list/doc
 * Template:Alert list/styles.css
 * Module:Alert list
 * Module:Alert list/doc
 * Blocks
 * Template:Blocks
 * Template:Blocks/doc
 * Template:Blocks/styles.css
 * Module:Blocks
 * Module:Blocks/doc
 * Standard icons
 * Module:Standard icons
 * Module:Standard icons/doc
 * Workspace intro
 * Template:Workspace intro
 * Template:Workspace intro/styles.css
 * Template:Workspace intro/doc
 * Module:Workspace intro
 * Module:Workspace intro/doc
 * Preview link
 * Template:Preview link
 * Template:Preview link/styles.css
 * Template:Preview link/doc
 * Module:Preview link
 * Module:Preview link/doc
 * Participant box
 * Template:Participant box
 * Template:Participant box/styles.css
 * Template:Participant box/doc
 * Module:Participant box
 * Module:Participant box/doc

Missing articles

 * Find a Grave famous people/M/Mas
 * Two separate requests under the same title. The title is a blue link, but the linked article is about a living person who is neither of the requested subjects.

Citation watchlist script
https://en.wikipedia.org/w/index.php?title=Capital_punishment_in_the_United_States&diff=prev&oldid=1203024750

https://en.wikipedia.org/w/api.php?action=compare&fromrev=1203018841&torev=1203024750&format=json

This diff adds a new sentence to the article and also adds a new link to a source.

In this one diff these two sources are cited:
 * https://www.theguardian.com/world/2014/jan/17/dennis-mcguire-ohio-execution-untested-method-lawsuit
 * https://www.cbsnews.com/news/ohio-delays-executions-until-2017-over-lack-of-lethal-drugs/

Given a watchlist:


 * 1) Isolate the revision ID and previous ID from each line in the watchlist
 * 2) Check every five seconds whether there is a revision ID / previous ID pair that hasn't been checked yet.
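
The two watchlist steps above could be sketched roughly as follows. This is only an illustration in Python, not the user script itself: the names `revision_pair` and `PendingPairs` are hypothetical, and it assumes each watchlist line exposes a diff link whose query string carries `diff` (the new revision) and `oldid` (the previous one).

```python
from urllib.parse import urlparse, parse_qs


def revision_pair(href):
    """Return (previous_id, revision_id) from a diff link, or None if absent."""
    query = parse_qs(urlparse(href).query)
    diff = query.get("diff", [""])[0]
    oldid = query.get("oldid", [""])[0]
    if diff.isdigit() and oldid.isdigit():
        # Assumed watchlist form: oldid is the previous revision, diff the new one.
        return int(oldid), int(diff)
    if diff == "prev" and oldid.isdigit():
        # "diff=prev" form: oldid is the new revision; the previous ID is not in the URL.
        return None, int(oldid)
    return None


class PendingPairs:
    """Remembers which pairs have already been handed out for checking."""

    def __init__(self):
        self._seen = set()

    def unchecked(self, hrefs):
        """Return pairs not seen before, in order, and mark them as seen."""
        fresh = []
        for href in hrefs:
            pair = revision_pair(href)
            if pair is not None and pair not in self._seen:
                self._seen.add(pair)
                fresh.append(pair)
        return fresh
```

A timer firing every five seconds would call `unchecked` with the links currently on the page and pass along whatever comes back; a second call with the same links returns an empty list.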

Given a pair (or batch of them):
 * 1) Use the "action=compare" endpoint.
 * 2) Extract URLs from the diff with a regular expression (joke about now having an additional problem to solve)
 * 3) Isolate domain names from URLs
 * 4) Check those sources against internal representation of RSP (hardcoded in script for now)
 * 5) If there's a hit, add an indicator next to the diff (red triangle "!" for warn-list, yellow circle "?" for caution-list).
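
A minimal sketch of the per-pair pipeline above, again in Python for illustration only. The domain lists are placeholders (`warn.example`, `caution.example`), not real RSP entries, and all function names are hypothetical. It assumes the `action=compare` JSON response carries the diff HTML under `compare["*"]` (or `compare["body"]` with `formatversion=2`) and that added lines sit in table cells whose class includes `diff-addedline`. A changed paragraph is returned whole, so URLs that were already in that paragraph will also be picked up.

```python
import html
import re
from urllib.parse import urlencode, urlparse

API = "https://en.wikipedia.org/w/api.php"

# Placeholder lists; the notes say the real ones are hardcoded from RSP.
WARN_DOMAINS = {"warn.example"}
CAUTION_DOMAINS = {"caution.example"}

ADDED_CELL = re.compile(
    r'<td[^>]*class="[^"]*diff-addedline[^"]*"[^>]*>(.*?)</td>', re.DOTALL)
TAG = re.compile(r"<[^>]+>")
URL = re.compile(r'https?://[^\s|\[\]<>"{}]+')


def compare_url(from_rev, to_rev):
    """Step 1: build the action=compare request URL."""
    params = {"action": "compare", "fromrev": from_rev,
              "torev": to_rev, "format": "json"}
    return API + "?" + urlencode(params)


def added_text(payload):
    """Pull the wikitext of added lines out of a parsed compare response."""
    compare = payload.get("compare", {})
    body = compare.get("body") or compare.get("*") or ""
    cells = ADDED_CELL.findall(body)
    # Strip diff markup first, then decode entities back into wikitext.
    return "\n".join(html.unescape(TAG.sub("", cell)) for cell in cells)


def domains(text):
    """Steps 2 and 3: find URLs by regex and reduce them to host names."""
    found = set()
    for url in URL.findall(text):
        host = urlparse(url).hostname or ""
        found.add(host[4:] if host.startswith("www.") else host)
    return found


def indicator(domain_set):
    """Steps 4 and 5: return "warn", "caution", or None for a set of domains."""
    def listed(domain, listing):
        return any(domain == d or domain.endswith("." + d) for d in listing)

    if any(listed(d, WARN_DOMAINS) for d in domain_set):
        return "warn"
    if any(listed(d, CAUTION_DOMAINS) for d in domain_set):
        return "caution"
    return None
```

Matching on the domain suffix means a subdomain of a listed domain is flagged too; whether that is desirable depends on how the list is curated.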

The problems I have with this approach:


 * Each user is doing the lookups and computations themselves, rather than going through a centralized service that does it for them

In the future, when a centralized service does this work (because we will be doing something more complicated than screening against RSP), the user script:
 * 1) Seeks consent to access the external service where data is coming from
 * 2) Scans each revision ID / prev ID on a watchlist
 * 3) Submits them to the service in batch
 * 4) Retrieves data
 * 5) Adds to HTML based on retrieved data
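
The batch submission and retrieval steps might look something like the sketch below. Everything about the service here is an assumption made for illustration: no such service exists yet, so the payload shape (`revisions`, `prev`, `rev`), the response shape (`results`, `indicator`), the batch size, and the function names are all hypothetical.

```python
import json

BATCH_SIZE = 50  # assumed limit; a real service may choose another


def batches(pairs, size=BATCH_SIZE):
    """Split a list of (previous_id, revision_id) pairs into chunks."""
    for start in range(0, len(pairs), size):
        yield pairs[start:start + size]


def request_bodies(pairs, consented, size=BATCH_SIZE):
    """Build one JSON body per batch; send nothing without user consent."""
    if not consented:
        return []
    return [
        json.dumps({"revisions": [{"prev": prev, "rev": rev}
                                  for prev, rev in batch]})
        for batch in batches(pairs, size)
    ]


def indicators_by_revision(response_text):
    """Map revision ID to indicator from a (hypothetical) service response."""
    data = json.loads(response_text)
    return {int(item["rev"]): item["indicator"]
            for item in data.get("results", [])}
```

The returned mapping is what the script would consult when deciding which watchlist lines get an indicator added to their HTML.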

What about this "service"? If I set up WRDB as an ongoing, self-updating service, then all this service would need to do is look up the revision ID in WRDB. At the moment, however, WRDB only supports a one-time build, and domain information is not directly stored in the database. That said, this will help with support for non-URL references in the future.

Citation Watchlist testing
https://dailymail.co.uk

https://avensonline.org