User:Shubinator/DYKcheck

DYKcheck is a JavaScript tool for checking Did You Know eligibility.

What DYKcheck is
The DYKcheck tool was made to help editors review nominations for Did You Know (DYK) that appear on Template talk:Did you know (T:TDYK). Nominators have complained that they get notified of problems too late and don't get a chance to fix the nomination. With DYKcheck, reviewers can very quickly spot some common errors and check nominations early in the process. The tool scans nominations against a slew of rules and shows the results. It's up to the reviewer to act on the results. DYKcheck scans for:
 * Article prose length
 * The date the article was created, and the user who created it
 * Whether the article was created as a redirect
 * Moves from userspace within the last 100 edits
 * Stub tags both on the article page and its talk page
 * Previous appearances at Did You Know and In The News
 * Inline citations
 * Dispute and deletion tags
 * Start of expansion date, within the last 500 edits

In addition, DYKcheck has a rapid-fire nomination checker on T:TDYK. It has the capability to check nominations with multiple articles and multiple hooks. On top of checking everything above for each nominated article, DYKcheck also finds hooks, both the original and alternates, and calculates the hook length.

What DYKcheck is not
DYKcheck is not:
 * a replacement for humans. DYK nominations will always be checked by real people; DYKcheck is meant to make this more fun and less tedious.
 * a bot. DYKcheck cannot make edits, automated or otherwise.
 * "the law." DYKcheck tries to reflect consensus and does not prescribe it. I'll be happy to tweak the tool when consensus changes.
 * for checking when Template:Did You Know had one-fifth its prose, although you can use it for that if you like (it hasn't been expanded 5x since creation).

Using DYKcheck
If you have a Wikipedia account, you can install DYKcheck by adding an importScript line for User:Shubinator/DYKcheck.js to your common.js file. After installation you may need to bypass your browser cache on pages you've visited before for the tool to appear in the Toolbox. If installation worked, "DYK check" should appear in the Toolbox, below "Cite this page". Simply click on the "DYK check" link in the Toolbox to start a scan. If you're not using Firefox or you've got a fancy personalized layout, you may want to turn off fixing the sidebar using the fixedSidebar variable (see below for details), as it can cause T:TDYK to look strange.
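The exact install line isn't preserved on this page. Assuming it is the usual importScript call pointed at the code page named in the Code section, it would look something like this:

```javascript
// Assumed install line for common.js; the page title is taken from the Code section.
importScript('User:Shubinator/DYKcheck.js');
```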

Prose length
DYKcheck calculates the prose length of the article. This part is a slightly modified version of the prosesize tool. It does not count tables, block quotes, headers, images, captions, templates, infoboxes, edit buttons, categories, references, lists, superscripts, or reference link numbers like [1]. The text is highlighted so the user can see what's being counted as prose. (In older versions of Internet Explorer, headers may be highlighted even though they aren't counted as prose.)

Article creation
DYKcheck finds who created the article and the date it was created. The only limit is whether Wikipedia's servers have the information. For example, information on the creation of the Main Page is not available. In that case, DYKcheck will return the first edit recorded in Wikipedia's servers. The tool will also see if the article was created as a redirect. If so, the next three revisions of the article will be checked to see when it was made into a real article. Once the userspace move and expansion checks have also run, DYKcheck finds the latest of the three dates and compares it to today's date. The tool flags the article if the latest date is not within the past 10 days. The number of days is rounded down in favor of nominators.
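The date comparison described above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and treating exactly 10 whole days as still "within the past 10 days" is an assumption, not something confirmed by the tool's code.

```javascript
// Hypothetical helper names; DYKcheck's real code may differ.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the most recent of the candidate dates (creation, userspace move,
// start of expansion), skipping any that are missing.
// Assumes at least one date is known.
function latestDate(dates) {
  const known = dates.filter((d) => d instanceof Date);
  return new Date(Math.max(...known.map((d) => d.getTime())));
}

// Whole days between two dates, rounded down in the nominator's favor.
function daysSince(date, now) {
  return Math.floor((now.getTime() - date.getTime()) / MS_PER_DAY);
}

// Assumption: "within the past 10 days" means the rounded-down count is <= 10.
function isRecentEnough(dates, now, limitDays = 10) {
  return daysSince(latestDate(dates), now) <= limitDays;
}
```

Because of the rounding, an article whose latest date is 10 days and 23 hours old still counts as 10 days old in this sketch.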

Moves from userspace
DYKcheck scans the last 100 edits for a move from userspace. The script will find the last move from userspace if the article has been moved multiple times.

Stub tags
The tool looks for stub tags on both the article page and its talk page.

DYK and ITN
DYKcheck looks at the talk page of the article for the DYK and ITN templates.

Inline citations
DYKcheck searches for inline citations (with <ref> tags) and notes if there aren't any.

Dispute and deletion tags
DYKcheck scans the article for various tags. If the article is at Articles for Deletion, the script gives a link to the deletion discussion. Other tags are found by their images (13 images in all). Speedy deletion tags are also detected.

Expansion
DYKcheck finds the start of expansion date, assuming the article is now 5x expanded. It checks the last 500 edits with a binary search algorithm. Because of this, DYKcheck assumes that the article has more or less always been increasing in size. Minor fluctuations won't matter, but if the article crosses the 1x line multiple times, the tool can get confused. If it does get confused, it will err against the nominator; it will never show a more recent date for 1x than the actual one. Therefore, reviewers should always check expansion manually if DYKcheck says the article hasn't been expanded 5x. This part of the scan takes the longest because up to nine requests to the server are made sequentially.
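To illustrate the idea, here is a minimal sketch of a binary search for the "1x" point. It assumes the prose sizes of the revisions are already in an array ordered oldest to newest, whereas the real tool asks the server for one revision per probe. The function name and the return shape are hypothetical.

```javascript
// sizes: prose sizes of up to 500 revisions, oldest first; the last entry is
// the current article. Assumes sizes are roughly non-decreasing.
// Returns the index of the last revision at or below one-fifth of the current
// size (-1 if none in the window), plus the number of probes used.
function findExpansionStart(sizes) {
  const target = sizes[sizes.length - 1] / 5; // the "1x" size
  let lo = 0;
  let hi = sizes.length - 1;
  let index = -1;
  let probes = 0;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    probes += 1; // in the real tool each probe would be a server request
    if (sizes[mid] <= target) {
      index = mid; // still at or below 1x; the start may be later
      lo = mid + 1;
    } else {
      hi = mid - 1; // already above 1x; look at older revisions
    }
  }
  return { index, probes };
}
```

With a window of 500 revisions the loop in this sketch runs at most nine times, which is consistent with the "up to nine requests" mentioned above. Like any binary search, it only gives a reliable answer when the sizes really are close to monotonic.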

Rapid-fire mode on T:TDYK
DYKcheck can scan individual nominations from the nominations page (T:TDYK). For each nomination it can process multiple articles and multiple hooks. Hook length is calculated without the leading "... ", but including "that" and "?". If "(pictured)" is included in the hook, it isn't counted, but any phrases along with it are (for example, (cat pictured) would add four characters to the hook length). The phrase "(pictured)", including the parentheses, must have the same formatting throughout (in other words, completely italicized) for DYKcheck to pick up on it. The hooks are detected by "... that " at the start and "?" at the end. The tool will not get thrown off by stray question marks in the nomination comments, but it will get thrown off by stray "... that "s, and stray question marks inside the hooks. Since DYKcheck finds the end of the hook by the question mark, it does not process multiple sentence hooks.
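The hook rules above can be approximated with a short sketch. The function names are hypothetical, and two simplifications are assumptions: only a plain "(pictured)" is stripped (variants such as "(cat pictured)" are left alone), and one space before "(pictured)" is dropped along with it.

```javascript
// Finds hooks that start with "... that " and end at the next "?".
// The leading "... " is left out of the returned text.
function findHooks(text) {
  const hooks = [];
  const pattern = /\.\.\. (that [^?]*\?)/g;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    hooks.push(match[1]);
  }
  return hooks;
}

// Counts characters including "that" and "?", ignoring a plain "(pictured)".
// Assumption: the single space before "(pictured)" is not counted either.
function hookLength(hook) {
  return hook.replace(/ ?\(pictured\)/g, "").length;
}
```

As in the description, a stray question mark in a comment is harmless here because it isn't preceded by "... that ", while a question mark inside a hook would cut the hook short.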

Since all of the nominations have the article titles as subsubheadings, DYKcheck searches for subsubheadings to get the individual nominations. If there are any subsubheadings that don't correspond to nominations, the tool will get confused. Getting the tool to do multiple article noms was actually trickier than I thought it would be. I could get it to find the article titles just fine. But when I iterated through them, the tool went haywire. It turns out that each article was being processed concurrently, resulting in nothing getting done. This didn't (and doesn't) matter for one article, because even though multiple pieces (for example, checking the talk page and checking for userspace moves) are being processed at the same time, they don't conflict with each other. With multiple articles though, each is trying to use the same methods. My solution was to have the articles be processed in order. A function checks when one article is finished and lets the next one go.
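The "one article at a time" approach can be sketched like this. It is not the tool's actual code: processArticle is a hypothetical stand-in for the per-article scan, assumed to call its done callback once all of its own requests have finished.

```javascript
// Processes titles strictly in order: the next article starts only after the
// previous one has reported that it is finished.
function processInOrder(titles, processArticle, onAllDone) {
  let index = 0;
  function next() {
    if (index >= titles.length) {
      if (onAllDone) onAllDone();
      return;
    }
    const title = titles[index];
    index += 1;
    // processArticle may run several requests in parallel internally;
    // it just must not call next() until all of them are done.
    processArticle(title, next);
  }
  next();
}
```

Within a single article the separate checks can still overlap; only the hand-off between articles is serialized.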

By default, DYKcheck only checks for expansion if the string "5x expan" shows up in the nomination. This saves resources, because the expansion check is by far the most costly. This is a configurable option (see below), so you can have it always check for expansion if you wish.

DYKcheck can also jump to nominations in sections of T:TDYK. If your URL shows a hash, DYKcheck will go to the section specified. For example, if the URL says http://en.wikipedia.org/wiki/T:TDYK#Older_nominations, the tool will start processing nominations from Older nominations. This makes it easier to jump around on a page with more than 200 noms. It also means that if you make an edit to a nomination, DYKcheck will pick up at that nom.
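As a rough illustration of reading the section from the URL, the sketch below pulls out the part after the hash and turns underscores back into spaces. The function name is hypothetical, and it ignores any special encoding of punctuation in section anchors.

```javascript
// Returns the section name from a URL hash, or null if there is no hash.
function sectionFromUrl(url) {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1 || hashIndex === url.length - 1) {
    return null;
  }
  return url.slice(hashIndex + 1).replace(/_/g, " ");
}
```

For the URL in the example above, this gives "Older nominations".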

Since T:TDYK is such a long page, it would be tedious to keep scrolling up and down to hit the "DYKcheck" button in the toolbox. DYKcheck's solution is to fix the sidebar in place if you're on T:TDYK. This is another configurable option (see below), so if you hate it, or love it and want to use it all the time, it's easy to change the settings. It looks best if you have a big screen. Thanks to Omegatron and the meta user styles page for the sidebar code.

Options
There are five options to configure DYKcheck.
 * Date format – ok, this is a bit cheesy...I used this to test if I could add options, and it stuck. The American format gives dates like February 16, 2009; the British format would show 16 February 2009.
   * Variable dateFormat, parameters "british" and "american"; default "american"
 * Hook length warnings – since some reviewers are more flexible than others in hook length, this can be used to change when warnings appear to your liking. Enter the number above which you want to see the warning color. For example, if I used hookLengthYellow = 190, hooks longer than 190 characters would be flagged yellow. If hookLengthRed is less than hookLengthYellow, no hooks will be flagged yellow; it'll be red or nothing.
   * Variables hookLengthYellow, hookLengthRed; default 200 and 220 respectively
 * Fixing the sidebar – it's quite annoying to keep scrolling up and down to hit the DYKcheck button, so this variable controls the sidebar. It can be set to always be fixed, fixed only on T:TDYK, or never fixed.
   * Variable fixedSidebar, parameters "always", "onttdyk", "never"; default "onttdyk"
 * Checking for expansion – the expansion check takes up the most resources, even with a binary search algorithm. Use this variable to control when you want the tool to check for 5x expansions on T:TDYK. With the default, DYKcheck only runs the expansion check if "5x expan" is present in the nomination.
   * Variable check5xNoms, parameters "always", "ifnom5x", "never"; default "ifnom5x"
 * Unlock – the DYKcheck tool only appears on article pages and user pages. If you want to use DYKcheck to scan other namespaces, like finding out when Template:Did You Know was at one-fifth its size, use this variable.
   * Variable unlock, parameters true and false
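The way the two hook length thresholds interact can be sketched as below. The function name and the "none" return value are hypothetical; the defaults are the ones listed above.

```javascript
// Returns the warning color for a hook of the given length.
// Red is checked first, so if hookLengthRed is lower than hookLengthYellow
// the yellow branch can never be reached.
function hookWarning(length, hookLengthYellow = 200, hookLengthRed = 220) {
  if (length > hookLengthRed) {
    return "red";
  }
  if (length > hookLengthYellow) {
    return "yellow";
  }
  return "none";
}
```

With the defaults, a 210-character hook would be yellow and a 221-character hook red; with hookLengthYellow = 200 and hookLengthRed = 190, a 195-character hook goes straight to red.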

To use an option, enter variable = parameter; below the importScript line in your skin.js. The quotes, or lack of them, are important! For example, if I wanted the hook length yellow warning to appear at 215 characters, the red warning at 230 characters, and the sidebar to never be fixed, I would enter the lines below into my skin.js:
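Going by the values in that example, and assuming the options are set as plain assignments in the variable = parameter; form, the lines would be roughly:

```javascript
// Placed below the importScript line; values taken from the example above.
// Numbers go without quotes, named settings with quotes.
hookLengthYellow = 215;
hookLengthRed = 230;
fixedSidebar = "never";
```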

Code
You can see the code for DYKcheck at User:Shubinator/DYKcheck.js.

Ideas for development
These are ideas I might incorporate into the tool at some point. Feel free to add requests.
 * Highlight hooks detected by the tool
 * Show number of inline citations and number of unique citations
 * If an article passes the check, display a button for nominating it, which opens up an edit window for the appropriate section of T:TDYK and displays hook composition instructions along with a copy-and-paste-able nomination template filled out with as much information as the script can figure out (creator, nominator, new or expanded)
 * When showing 5x expansion statistics, add original prose size (or add an option to do so). For example, the second to last bullet point would then read "Assuming article is at 5x now, expansion began XX edits ago on at XXXX B (XX words)". APerson (talk!) 03:20, 17 November 2013 (UTC)
 * Feature turned on for the new Draft namespace.
 * How about we make it so that if a user wants to they can make the DYK check tell them when they first open a page if has DYK possibility (is it possible so you couldn't tell it was doing it and it didn't say anything if it isn't eligible)? --  NG39  (Used to be NickGibson3900) Talk 08:10, 21 August 2014 (UTC)
 * DYK has recently decided an article can be run again if it hasn't appeared in the past five years. Can DYKcheck be tweaked to include this? Valereee (talk) 13:09, 22 January 2024 (UTC)

Compatibility
DYKcheck works best with Mozilla Firefox. The script also works with Internet Explorer, Safari, Google Chrome, Opera, and Konqueror.

DYKcheck works on Wikimedia's secure server.