User:DragonflySixtyseven/Maintenance stuff

Filename issues

 * Tool listing short filenames for renaming


 * Filenames shadowed on Commons


 * Category:Wikipedia orphaned files - mostly need either moving to Commons (sometimes with rename), or deletion


 * 1000 shortest filenames on enwiki

Circular refs

 * articles that cite Wikipedia

Refnames

 * Search hits for bad refnames (paste the entire nowiki'd string into the search box: insource:"//" - or alter the number: 0, 1, 2, 3, 01, 02, 03, etc.)

Warning: if you try to search for only " ref name=: ", it'll show you every instance of "ref name", not just the flawed ones.

Dangling referrer tags

 * Search hits for dangling UTM Source links (paste the entire nowiki'd string into the search box: insource:"&utm_source" )


 * Search hits for dangling UTM Medium links (paste the entire nowiki'd string into the search box: insource:"&utm_medium" )


 * Search hits for dangling UTM Campaign links (paste the entire nowiki'd string into the search box: insource:"&utm_campaign" )


 * Search hits for dangling UTM Term links (paste the entire nowiki'd string into the search box: insource:"&utm_term" )


 * Search hits for dangling UTM Content links (paste the entire nowiki'd string into the search box: insource:"&utm_content" )


 * Search hits for dangling CMPID links


 * Search hits for dangling DLVRIT links


 * Search hits for dangling NOTEBOOK_ID links


 * Search hits for dangling "Feedtype" links


 * Search hits for dangling "Feedname" links


 * Search hits for dangling "?hp&action=click"
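
Once a dangling tag has been found, the cleanup can be sketched roughly as below. This is only an illustration, not a tool used on this page: the function name strip_tracking is hypothetical, the parameter names are lowercased forms of the tags in the list above, and matching them case-insensitively is an assumption. The "?hp&action=click" case is deliberately left out and is best handled by hand.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameter names taken from the searches listed above; matching them
# case-insensitively is an assumption, not something the list specifies.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "cmpid", "dlvrit", "notebook_id", "feedtype", "feedname",
}


def strip_tracking(url: str) -> str:
    """Return url with the known tracking parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # urlencode re-encodes the surviving parameters, so unusual escapes in the
    # original URL may come out in a slightly different (equivalent) form.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com/story?id=7&utm_source=feed&utm_medium=rss"))
```

Always check that the cleaned URL still loads the same page before saving, since some sites use similarly named parameters for real content.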

Metadata issues and link rot (archive.org and .is)

 * Search hits for "Download limit exceeded"


 * Search hits for "Forbes Welcome" (Forbes' landing page)


 * Search hits for 'Huge Domains', a domain squatter; these links need to be replaced with links to the originals at archive.org (if archive.org doesn't have a copy, try archive.is, although it might not have one either)
 * Search hits for ".com is for sale(pipe)publisher" - again, indicator of domain squatters; largely but not entirely coincident with HugeDomains
 * Search hits for ".org is for sale(pipe)publisher"


 * Search hits for "BuyDomains"


 * Search hits for "Account Suspended"


 * Search hits for "Page cannot be found"


 * Search hits for "Page not found"


 * Search hits for "Page cannot be displayed"


 * Search hits for '404 error'


 * Search hits for '404 not found'


 * Search hits for 'Error 404'


 * Search hits for 'title=404' (overlaps somewhat with other entries in this list)


 * Search hits for "this website is for sale"


 * Search hits for "page error"


 * Search hits for "problem loading page"


 * Search hits for "account has been suspended" (warning, has some false positives)


 * Search hits for "untitled document"


 * Search hits for instances of Google being cited directly (false positives: Google Patents Archive and Google News)


 * Same as above, but with "http" instead of "https"


 * Search hits for "may have been moved or renamed"


 * Search hits for "Are you a robot" (Bloomberg's CAPTCHA page)

Timebinding

 * Search hits for 'currently'


 * Search hits for 'recently'

Opinions (beware of false positives!)

 * Search hits for 'clearly'


 * Search hits for 'obviously'

"Lead"/"Led" (beware of false positives!)

 * Search hits for 'have lead'


 * Search hits for 'has lead'


 * Search hits for 'had lead'


 * Search hits for 'were lead'


 * Search hits for 'been lead'


 * Search hits for 'are lead'


 * Search hits for 'being lead'


 * Search hits for 'having lead'


 * Search hits for 'is lead'


 * Search hits for 'which lead'


 * Search hits for 'that lead'
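
For checking a chunk of text offline, the searches above can be combined into one pattern. This is a rough sketch, not how the on-wiki searches work; the names PRECEDING, LEAD_PATTERN and find_suspect_lead are made up for illustration. Like the searches themselves, it produces false positives, so every hit still needs a human look.

```python
import re

# Words that precede "lead" in the searches listed above.
PRECEDING = ["have", "has", "had", "were", "been", "are", "being",
             "having", "is", "which", "that"]

# \b keeps "that leads" or "misleading" from matching.
LEAD_PATTERN = re.compile(
    r"\b(" + "|".join(PRECEDING) + r") lead\b", re.IGNORECASE
)


def find_suspect_lead(text: str) -> list[str]:
    """Return each 'X lead' phrase in text that matches one of the listed searches."""
    return [match.group(0) for match in LEAD_PATTERN.finditer(text)]


print(find_suspect_lead("The talks have lead to a deal that leads nowhere."))
print(find_suspect_lead("She is lead guitarist in the band."))  # a false positive
```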

Eliminate false positives by replacing suitable instances of "lead" with " lead ". Don't change direct quotes; instead, add " [sic]", and then add with the comment. In URLs, change nothing and don't add "sic", though a comment in the metadata works fine. Remember to terminate your comment properly with "-->" or you'll make things worse. Comments don't break the URLs, but they might interact poorly with archive bots.


 * Please keep in mind that many editors are also replacing "lead" with "lede" without actually understanding the reasons why.


 * According to [ http://grammarist.com/usage/lead-lede/ ]: "Long ago the noun lede was an alternative spelling of lead, but now lede is mainly journalism jargon for the introductory portion of a news story—or what might be called the lead portion of the news story. Strictly speaking, the lede is the first sentence or short portion of an article that gives the gist of the story and contains the most important points readers need to know."


 * Also see [ http://howardowens.com/lede-vs-lead/ ], [ http://lisawaananen.com/noted/2014/09/12/lead-vs-lede-and-tradition-vs-substance/ ], [ https://www.grammarphobia.com/blog/2012/10/lede-time.html ], and [ https://thebettereditor.wordpress.com/2012/12/08/should-we-bury-this-lede/ ] --Guy Macon (talk) 02:24, 1 June 2019 (UTC)


 * Good spot. I just searched and fixed. I found pleasingly few obvious errors, though I left several cases where either word would fit but I'd have used "lead". Certes (talk) 10:43, 1 June 2019 (UTC)


 * Misuse of "lede" is rampant on pages discussing Wikipedia edits, but of course per WP:TPOC we don't fix those. Good to hear it isn't common in article space. --Guy Macon (talk) 13:13, 1 June 2019 (UTC)

Search-engine sloppiness

 * Search hits for "www.google.com/search"
 * [https://en.wikipedia.org/w/index.php?search=insource%3A%22%3Freferrer%3Dhttp%22&title=Special%3ASearch&profile=advanced&fulltext=1&advancedSearch-current=%7B%22namespaces%22%3A%5B0%5D%7D&ns0=1 Search hits for "?referrer=http"]
 * Search hits for 'bing.com'
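
The "?referrer=http" search link in this section shows the URL shape that these article-space insource: searches use. As a hedged sketch, a link of the same shape can be generated for any phrase; the function name insource_search_url is hypothetical, and the parameter set is simply copied from that one URL rather than from any documentation.

```python
import json
from urllib.parse import urlencode


def insource_search_url(phrase: str, namespaces=(0,)) -> str:
    """Build an enwiki Special:Search URL for insource:"phrase".

    The parameters mirror the "?referrer=http" link on this page.
    A phrase containing a double quote is not handled.
    """
    params = {
        "search": f'insource:"{phrase}"',
        "title": "Special:Search",
        "profile": "advanced",
        "fulltext": "1",
        # Compact JSON, e.g. {"namespaces":[0]} for article space only.
        "advancedSearch-current": json.dumps(
            {"namespaces": list(namespaces)}, separators=(",", ":")
        ),
    }
    for ns in namespaces:
        params[f"ns{ns}"] = "1"
    return "https://en.wikipedia.org/w/index.php?" + urlencode(params)


print(insource_search_url("?referrer=http"))
print(insource_search_url("&utm_source"))
```

The first call should reproduce the "?referrer=http" link character for character; the second gives the equivalent link for one of the UTM searches.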