User talk:WereSpielChequers/typo study

Thoughts
Hi WereSpielChequers, I agree that the study has some flaws in how it defines a typo. Here are my thoughts: Thanks! GoingBatty (talk) 18:49, 30 December 2011 (UTC)
 * Would you get more accurate results by excluding words that start with a capital letter?
 * Would you get more accurate results by comparing articles to WP:AWB/T or Lists of common misspellings or Commonly misspelled words?
 * Is it possible to get WikiProject TypoScan (or something similar) restarted?
 * Hi GoingBatty. I expect that excluding words that start with a capital letter would make this more accurate, but even if you narrowed that to "other than the first word in a sentence" you would then have substantial amounts of both overkill and underkill. At the moment the report is going to be very inaccurate because of overkill, or false positives. You could potentially salvage that by calculating the overkill and weighting by that factor: it would be relatively easy to check a thousand examples and then multiply the result by the proportion it got right. My prediction for that proportion would be less than 10% if the study was run on the latest version of the articles. But it gets a tad hairier if you have both overkill and underkill, as you are compensating in two directions, and the false negatives are by definition the ones you can't count, i.e. the "unknown unknowns".
 * Comparing the articles to lists of typos and potential typos would be a lot more accurate, but it would put us in the land of the known knowns and give us some false confidence. There are going to be typos out there that we have yet to put into AWB, and these "known unknowns" are the ones that will have stuck around because we aren't so good at spotting them. I think they would be the useful part of this, but maybe the technology behind TypoScan is what you really need.
 * As for restarting TypoScan, it relied on two people who seem to be on Wikibreak: one hasn't edited for several months and the other has only a few dozen edits in that time. I'd suggest an email asking them whether they might have time to run this in the near future and, if not, whether they could supply the code. If that gets you restarted, fine; if it gets you the code, then a bot request will probably get a volunteer to take over. If it does neither, then after a decent interval a bot request might get someone to rewrite this.  Ϣere Spiel  Chequers  22:25, 30 December 2011 (UTC)
 * I posted a message on the three developers' talk pages. If this doesn't work, I'll submit a bot request.  Thanks!  GoingBatty (talk) 23:18, 30 December 2011 (UTC)
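
The overkill correction suggested above (hand-check a sample of flagged words, then multiply the raw count by the proportion that were real typos) could be sketched roughly as follows. The function name and the figures in the usage example are hypothetical and are not taken from the study. Note that this only compensates for false positives; it says nothing about the typos the scan missed.

```python
def estimate_true_typos(flagged_total, sample_size, sample_confirmed):
    """Scale a raw count of flagged words by the precision of a hand-checked sample.

    flagged_total    -- number of words the scan reported as typos
    sample_size      -- number of flagged words checked by hand
    sample_confirmed -- how many of those turned out to be real typos
    Returns (estimated_real_typos, precision).
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= sample_confirmed <= sample_size:
        raise ValueError("sample_confirmed must be between 0 and sample_size")

    # Proportion of the checked sample that the scan got right.
    precision = sample_confirmed / sample_size
    # Multiply before dividing so round numbers stay exact in floating point.
    estimate = flagged_total * sample_confirmed / sample_size
    return estimate, precision


# Hypothetical figures, for illustration only.
estimate, precision = estimate_true_typos(
    flagged_total=50_000, sample_size=1_000, sample_confirmed=80
)
print(f"precision {precision:.0%}, estimated real typos {estimate:.0f}")
```

With a sample of a thousand the estimate is still only as good as the sample is representative, so the checked examples would need to be drawn at random from the whole report rather than from the top of it.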