User talk:Zorglbot/Shortpages

Unlisting deleted pages
Perhaps remove the ones that have been deleted from the list entirely? —Centrx→talk • 21:43, 5 November 2006 (UTC)
 * Great idea... - crz crztalk 21:59, 5 November 2006 (UTC)
 * Done! (Although when I tried to run it manually, some pages refused to load, so the change is not visible yet.) Schutz 22:54, 5 November 2006 (UTC)

Links to page history and deletion log
Also, could a link to the history be added (and a link to the deletion log for the deleted pages)? —Centrx→talk • 22:07, 5 November 2006 (UTC)


 * (I understand that by "deleted" you mean "pages that have been deleted, tagged deleted and protected".) I have put this on the TODO list; it should be pretty straightforward. One other thing I had in mind was to split the table in two: the first part would hold all entries that look suspect (e.g. those whose size has not changed since the server generated the page and which cannot be explained by the presence of a particular template), and the second part all other entries, mainly for reference. In the longer term, when/if the toolserver gets an up-to-date copy of the en.w.o database again (that does not seem to be the case yet), I am thinking of running the script there to get results that are always up to date. Schutz 22:54, 5 November 2006 (UTC)

Links to history and deletion logs have been added. (I just found a small bug when the page contains quotes, but otherwise it's quite useful.) Schutz 08:06, 10 November 2006 (UTC)
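
A minimal sketch of the two-part split proposed above, not the bot's actual code. The `Entry` fields, the `EXPLAINING_TEMPLATES` set and the function names are all hypothetical, chosen only to illustrate the rule "size unchanged since the cache and no explaining template":

```python
from dataclasses import dataclass

# Hypothetical: templates that explain why a page is legitimately short.
EXPLAINING_TEMPLATES = frozenset({"disambig", "softredirect"})


@dataclass
class Entry:
    title: str
    cached_size: int    # size when Special:Shortpages was generated
    current_size: int   # size at the time the bot runs
    templates: frozenset = frozenset()  # lower-cased template names on the page


def is_suspect(entry):
    """Suspect: size unchanged since the cache and no template explains it."""
    unchanged = entry.current_size == entry.cached_size
    explained = bool(entry.templates & EXPLAINING_TEMPLATES)
    return unchanged and not explained


def split_entries(entries):
    """Return (suspect, others), each keeping the input order."""
    suspect = [e for e in entries if is_suspect(e)]
    others = [e for e in entries if not is_suspect(e)]
    return suspect, others
```

The first list would feed the top table and the second the reference table.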

Sorting by current size
A new feature: the list is now sorted by the current size of articles, and not the size they had when Special:Shortpages was cached. This way, articles most likely in need of attention are always at the top. Otherwise, all requests made on my talk page have been implemented, except for the automatic execution of the bot as soon as the page is regenerated; this will come later. Schutz 21:00, 5 December 2006 (UTC)
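
For illustration only, the new ordering could be sketched like this. The tuple layout `(title, cached_size, current_size)` is an assumption, and the tie-break on title is added just to make the order deterministic:

```python
def sort_by_current_size(entries):
    """Sort (title, cached_size, current_size) tuples, smallest current size first.

    The cached size is ignored for ordering; ties fall back to the title.
    """
    return sorted(entries, key=lambda e: (e[2], e[0]))
```

With this ordering, a page that was short when the cache was built but has since been expanded drops towards the bottom of the list.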

Updating
How does the bot decide to update, or is it run manually? It seems to be updating at more or less arbitrary times, more than once a day (which is good), whereas before it ran once per day at about the same time. —Centrx→talk • 10:09, 4 January 2007 (UTC)
 * Currently, the bot is scheduled to run automatically once a day (in the morning UTC). If I am on Wikipedia and notice that Special:Shortpages has been updated recently (as was the case today), and many of the entries have been processed, I will rerun it manually (in particular if I just processed many entries myself and want to clear up the list). The trouble is that you can't know if many entries have been processed until you actually look at them individually, so the bot could not decide by itself whether it should run again (making the decision would involve the same work as actually updating the page).
 * One workaround, though, would be to just download Special:Shortpages and count whether many red links have been added since the last run; if so, it means someone has been processing the list and it may be worth updating it. I also still have to add the intelligent behaviour where the bot would run more than once a day but update the list only if Special:Shortpages itself has been updated. I put that on the back burner since there was almost no update in December, but I should go back to it soon. Schutz 12:20, 4 January 2007 (UTC)
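
The red-link workaround described above could look roughly like this. It is a sketch under assumptions: the function name, the idea of passing in two collections of red-linked titles, and the threshold of 20 are all made up for illustration:

```python
def should_rerun(red_at_last_run, red_now, threshold=20):
    """Decide whether enough entries have turned red to justify a new run.

    red_at_last_run / red_now: titles shown as red links (deleted pages) on
    Special:Shortpages at the previous run and now.
    threshold: hypothetical cut-off for "many" newly deleted pages.
    """
    newly_red = set(red_now) - set(red_at_last_run)
    return len(newly_red) >= threshold
```

This only needs one download of the special page per check, instead of reading every listed page.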

It seems your bot more often than not updates its page an hour or so before Special:Shortpages itself is updated in the morning. Maybe ask the devs what their update schedule for this special page is? jni 10:28, 25 January 2007 (UTC)
 * I tried to track the time of updates of Special:Shortpages before Christmas but did not get anything consistent; my guess was that the time needed for the task could vary greatly (probably depending on the server load). I have just run it manually now; once I manage to implement the intelligent behaviour described above, that should solve the problem (hopefully very soon). Cheers, Schutz 10:46, 25 January 2007 (UTC)

Soft Redirects
Could entries which are soft redirects please be coloured in too? Examples which appear in the list just now are: List of tongue-twisters & Glossary of truck jargon. Cheers, Davidprior 19:48, 9 February 2007 (UTC)
 * They should be recognised; if it is not the case, ping me next time you see it happening. Thanks, Schutz 20:59, 30 July 2007 (UTC)

Unrecognised dab templates
Pages marked with some disambiguation templates don't seem to be recognised as dabs by the bot. Some I've spotted are: Cheers, Davidprior 19:48, 9 February 2007 (UTC)
 * {3cc} (e.g. SH2)
 * {4cc} (e.g. EALA, WSET & EERC)
 * {hndis} (e.g. Glenn Moore (disambiguation))


 * It looks like this might be because of capitalization. —Centrx→talk • 22:38, 9 February 2007 (UTC)
 * Similarly, isn't recognised either. cf. Oak Street --Closedmouth 15:49, 6 June 2007 (UTC)


 * Sorry for not having replied earlier; these are all recognised now. Schutz 20:58, 30 July 2007 (UTC)
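
If capitalization was indeed the cause, as suggested above, a case-insensitive check along these lines would avoid it. This is only a sketch: the thread does not say how the bot matches templates, the `{{name}}` wikitext syntax and the regular expression are assumptions, and lower-casing the whole name is deliberately looser than MediaWiki, which ignores the case of the first letter only:

```python
import re

# Template names taken from the list above; extend as needed.
DAB_TEMPLATES = {"3cc", "4cc", "hndis"}

# Matches {{name}} or {{name|params}} without nested templates (simplification).
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def is_dab(wikitext):
    """Return True if the page uses a known dab template, ignoring case."""
    names = {m.group(1).strip().lower() for m in TEMPLATE_RE.finditer(wikitext)}
    return bool(names & DAB_TEMPLATES)
```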

More frequent updates
Now that the special page is being updated more frequently, can we get more frequent updates of this bot page? Otherwise it is every day just a list of pages that have already been addressed, and one cannot wait for the next day for them to be filtered out because there is another new listing. —Centrx→talk • 19:37, 8 March 2007 (UTC)
 * Checking, Schutz, who runs this bot, has not had a non-bot edit since the 3rd. So he may be on vacation, or a Wikibreak, or such. We shall see. - TexasAndroid 19:43, 8 March 2007 (UTC)

It took a bit of time, but I have just modified the bot. It is not perfect yet, but already improved: from now on, it will be able to detect whether the cache has been updated and, if so, will update the page within one hour or so of the update. This is going to be tested in the next few days; please tell me if there is any problem.
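
How the bot detects a cache update is not stated here. One simple way to do it, shown purely as an assumption, is to fingerprint the list of titles from Special:Shortpages and compare it with the fingerprint stored at the previous run (both function names are hypothetical):

```python
import hashlib


def fingerprint(titles):
    """Hash the ordered list of titles shown on Special:Shortpages."""
    joined = "\n".join(titles).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


def cache_was_updated(previous_fingerprint, titles):
    """True if the current listing differs from the one seen at the last run."""
    return fingerprint(titles) != previous_fingerprint
```

Checking this once an hour would be enough to react within about an hour of a cache update.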

I can easily run the bot more often, but don't really want to use too many resources (one run means more than 1000 pages read, and several hundred KB of text written). I could define some heuristic for the bot to decide when to run (e.g. run often just after a cache update, a bit less afterwards); I will see what is possible. Another hope is that the service provided by this bot may soon move to the Toolserver, in which case we could get live updates. Stay tuned! Schutz 20:55, 30 July 2007 (UTC)
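
The "run often just after a cache update, a bit less afterwards" idea could be sketched as a simple step function. The function name and all the numbers (6 and 24 hours, delays of 1, 4 and 24 hours) are invented for illustration and would need tuning against the real load:

```python
def next_run_delay_hours(hours_since_cache_update):
    """Return how long to wait before the next run (hypothetical schedule)."""
    if hours_since_cache_update < 6:
        return 1    # fresh listing: people are working through it, run hourly
    if hours_since_cache_update < 24:
        return 4    # activity has slowed down, run a few times a day
    return 24       # stale listing: fall back to one run per day
```

Since each run reads more than 1000 pages, such a schedule keeps most of the cost in the hours where the list changes fastest.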