User:Monk of the highest order/ASE/code

This is the code I used to find Wikipedia articles that have seen only one human editor (usually the page creator). The last time I ran this was two years ago; it produced a list about 2000 entries long, which has since been whittled down to about 100 or so - in other words, all but one hundred have seen review. I'll probably run this script again soon, accounting for those articles already reviewed from the first run. When I do that, I'll clean these up, re-organize, give more meaningful filenames, etc.
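The final selection step can be sketched roughly like this - a minimal illustration only, assuming a page-title-to-editor-set mapping and a bot list (the names here are hypothetical, not the actual data structures these scripts use):

```python
# Hypothetical sketch: given a mapping of page title -> set of editor
# usernames, keep only pages with exactly one human (non-bot) editor.
BOTS = {"ExampleBot", "OtherBot"}  # placeholder; a real run uses the full bot list

def single_human_editor_pages(page_editors, bots=BOTS):
    """Return titles whose edit history contains exactly one non-bot editor."""
    results = []
    for title, editors in page_editors.items():
        humans = {e for e in editors if e not in bots}
        if len(humans) == 1:
            results.append(title)
    return results
```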

pageparser_db.py
Much of this is obsolete and no longer used... SQLite is rather no good for some high-load things, I feel. :* Just kidding, I'm just no good at SQLite optimization.

wiki_pageset.py
For understanding and filtering sets of page history: bots, redirects, etc. Usually parser.py is used to load and call the classes and functions in here.
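The kind of filtering this file performs might look something like the sketch below; the class and method names are assumptions for illustration, not the real interface:

```python
# Illustrative sketch of pageset filtering; not the actual wiki_pageset.py API.
class PageSet:
    def __init__(self, pages):
        # pages: dict of title -> list of editor usernames (one per revision)
        self.pages = pages

    def drop_redirects(self, redirect_titles):
        """Remove any page whose title appears in the redirect list."""
        self.pages = {t: eds for t, eds in self.pages.items()
                      if t not in redirect_titles}
        return self

    def drop_bot_edits(self, bots):
        """Strip revisions made by known bot accounts."""
        self.pages = {t: [e for e in eds if e not in bots]
                      for t, eds in self.pages.items()}
        return self
```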

xml_to_pageset.py
The core functions for making use of all that XML. Usually parser.py is used to load and call the classes and functions in here.
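Streaming a stub-meta-history dump page by page could be sketched like this (a guess at the approach, using the standard library's incremental parser so the multi-GB file never sits in memory):

```python
import xml.etree.ElementTree as ET

def pages_from_stub_history(xml_stream):
    """Stream a stub-meta-history style dump, yielding (title, [editors])."""
    title, editors = None, []
    for event, elem in ET.iterparse(xml_stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # tolerate the export namespace
        if tag == "title":
            title = elem.text
        elif tag == "username":
            editors.append(elem.text)
        elif tag == "page":
            yield title, editors
            title, editors = None, []
            elem.clear()  # free memory for the finished page
```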

one_authorize.py
All-in-one for creating a tally of how many edits each author has made (on the assumption of a complete and non-redundant set of CSV pagesets) and for removing pages from a pageset based on user editcounts. Usually parser.py is used to load and call the classes and functions in here.
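The tally half might be sketched as follows; the CSV column layout (editor username in the second column) is a guess for illustration, not the script's actual format:

```python
import csv
from collections import Counter

def tally_editors(csv_paths):
    """Count edits per username across a set of pageset CSVs.
    Assumes (hypothetically) one revision per row, username in column 2."""
    counts = Counter()
    for path in csv_paths:
        with open(path, newline="", encoding="utf-8") as fh:
            for row in csv.reader(fh):
                if len(row) > 1:
                    counts[row[1]] += 1
    return counts
```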

utility.py
I know, I know, more descriptive names; I'll give it one. This is just a set of toolbox functions I typically carry with me everywhere.

serch.py
This is the way to update editor data from the website in real time. It's incredibly slow and server-heavy; that's why you only use it on the list of pages which had a single editor as of your most recent version of the stub-meta-history file. That way it's about 1/26th the number of files to check, and it doesn't take several months and dozens of GB of transfer.
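Checking a page's editors live against the MediaWiki API might look like the sketch below. The endpoint and `prop=revisions` parameters are the standard API; the exact query the script makes is my assumption:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://en.wikipedia.org/w/api.php"

def revision_users_url(title, limit=50):
    """Build a MediaWiki API query for the editors of a page's revisions."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "user",
        "rvlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)

def fetch_revision_users(title):
    # Network call -- throttle aggressively in real use, per the note above.
    with urlopen(revision_users_url(title)) as resp:
        data = json.load(resp)
    page = next(iter(data["query"]["pages"].values()))
    return [rev["user"] for rev in page.get("revisions", [])]
```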

shell commands
A couple of shell commands I made use of... I need to integrate these into the code, even though that will take more lines in Python. They may seem random and unintuitive, but they're mostly for quickly converting from a pageset to a title list or for dealing with editcount stuff.
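A Python equivalent of the pageset-to-title-list conversion might look like this (the column layout, titles in the first column, is an assumption):

```python
import csv

def pageset_to_titles(in_path, out_path):
    """Extract the titles (assumed first column) from a pageset CSV,
    one per line, deduplicated in order of first appearance."""
    with open(in_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        seen = set()
        for row in csv.reader(src):
            if row and row[0] not in seen:
                seen.add(row[0])
                dst.write(row[0] + "\n")
```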

get_redirects.py
Deals with the enwiki-pages.sql file to get a list of redirects for wiki_pageset.py. Usually called on its own, with a little bit of customization.
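One common way to pull redirects out of a page-table SQL dump is a regex over the INSERT tuples - a sketch only, and the field order (page_id, page_namespace, page_title, page_restrictions, page_counter, page_is_redirect, ...) is an assumption based on that era's schema; check it against your actual dump:

```python
import re

# Matches the start of one VALUES tuple in an assumed page-table layout:
# (page_id, page_namespace, 'page_title', 'page_restrictions',
#  page_counter, page_is_redirect, ...)
TUPLE_RE = re.compile(
    r"\((\d+),(\d+),'((?:[^'\\]|\\.)*)','(?:[^'\\]|\\.)*',(\d+),([01])")

def redirects_from_sql(text):
    """Yield titles of main-namespace pages flagged as redirects."""
    for m in TUPLE_RE.finditer(text):
        page_id, ns, title, counter, is_redirect = m.groups()
        if ns == "0" and is_redirect == "1":
            yield title.replace("_", " ")
```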

list of bots
The bot list used can be found here, though you'll probably want the more recent version from the category page.