Wikipedia:Bots/Requests for approval/StatBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Request Expired.

StatBot
Operator: Lavernius (talk)

Automatic or Manually Assisted: Automatic

Programming Language(s): Python

Function Summary: Collects statistics. Currently collects spelling-mistake statistics and outputs them to a table on its user talk page (once approved as a bot)

Edit period(s) (e.g. Continuous, daily, one time run): Daily

Already has a bot flag (Y/N): No

Function Details: Scans pages for common spelling mistakes, as listed on [of common misspellings/For machines]. After every 50 to 100 pages, it outputs the counts to a table on its user page. In the future it may be possible to list each page containing a mistake, so that a human can review it and correct it if it is a legitimate error.
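A minimal sketch of the counting step described above, not the operator's actual code. It assumes the misspelling list is available as plain text with one `misspelling->correction` entry per line, and that page text has already been fetched. The function names `parse_misspellings`, `count_misspellings` and `to_wikitable` are hypothetical.

```python
import re
from collections import Counter


def parse_misspellings(listing):
    """Parse lines of the assumed form 'misspelling->correction' into a dict."""
    table = {}
    for line in listing.splitlines():
        wrong, sep, right = line.strip().partition("->")
        if sep and wrong and right:
            table[wrong.lower()] = right
    return table


def count_misspellings(text, table):
    """Count how often each listed misspelling occurs in the text (case-insensitive)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return Counter(w for w in words if w in table)


def to_wikitable(counts, table):
    """Render the counts as wikitable markup, most frequent misspelling first."""
    rows = ['{| class="wikitable sortable"',
            "! Misspelling !! Suggested spelling !! Occurrences"]
    for word, n in counts.most_common():
        rows.append("|-")
        rows.append(f"| {word} || {table[word]} || {n}")
    rows.append("|}")
    return "\n".join(rows)
```

Counts from several pages can be summed with `Counter.update` before rendering, which matches the idea of writing the table only after a batch of 50 to 100 pages. A plain word match like this will also flag proper nouns, quotations and foreign words, which is why the output is only meant for human review.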

Discussion

 * Wouldn't this be better handled by working from a database dump? --Carnildo (talk) 19:57, 2 September 2008 (UTC)
 * The dumps are currently on hold. LegoKontribsTalkM 05:14, 7 September 2008 (UTC)
 * That doesn't mean they're unusable. If it's only generating statistics, is there any reason to think there is going to be a significant difference between a couple of months ago and now? Mr.Z-man 05:42, 7 September 2008 (UTC)
 * pages-articles.xml.bz2 was updated about two weeks ago and seems to be getting updated more frequently now. In my personal opinion, it's easier on all parties to just use the dumps. SQL Query me!  06:09, 7 September 2008 (UTC)
 * Not all parties. It will be a lot harder for me. I just want to run the bot in the background. I often torrent files overnight, so I can leave it running then. I could also make it mark pages where there is a high percentage of spelling mistakes. Of course, it will not correct them, but it can generate a list of pages for humans to look at. --Lavernius (talk) 18:09, 29 September 2008 (UTC)
 * They should start those dumps again soon, but I would do something like this every other week. Then again, having Toolserver DB access might be better.   CWii ( Talk  22:15, 6 October 2008 (UTC)
 * I really don't think the benefit of running it an extra time a month makes up for the massive server cost of getting the text for every single article twice a month. Mr.Z-man 22:57, 6 October 2008 (UTC)
 * Good point. Monthly.   CWii ( Talk  19:55, 8 October 2008 (UTC)
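
For reference, the dump-based approach suggested above could look roughly like the following. This is only a sketch, assuming a locally downloaded `pages-articles.xml.bz2` in the usual MediaWiki XML export layout (`page` elements containing `title` and `revision`/`text`). `iter_pages` and `local_name` are hypothetical helper names.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(dump_file):
    """Yield (title, text) pairs from an open XML dump stream, one page at a time."""
    title, text = None, ""
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's subtree to limit memory use


if __name__ == "__main__":
    # The file name is an assumption; point it at a locally downloaded dump.
    with bz2.open("pages-articles.xml.bz2", "rb") as dump:
        for page_title, page_text in iter_pages(dump):
            pass  # run the spelling check on page_text here
```

Reading the dump this way decompresses and parses it as a stream, so no requests are made to the live site and the full file never has to fit in memory.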


 * --uǝʌǝsʎʇɹoɟʇs (st47) 03:10, 13 October 2008 (UTC)


 * Operator assistance needed: What's the status on this? Mr.Z-man 03:07, 2 November 2008 (UTC)

Mr.Z-man 21:13, 10 November 2008 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.