Wikipedia talk:AutoWikiBrowser/Typos/Archive 2

Development list
Would it be useful to have a page where you can test new regexes that will be loaded either with, or instead of, the main typo list, so you can debug live/reduce chances of causing problems to live lists?

—  Ree dy  15:51, 2 August 2008 (UTC)
 * I think testing should be done in Find&Replace. However, it would be FKING AWESOME if there was an "export to RETF" feature of Find&Replace once I'm done testing. --mboverload @  17:51, 2 August 2008 (UTC)
 * Just thinking, it wouldn't be difficult to have it copy it to the clipboard as a typo-style rule as an option.... —  Ree dy  20:44, 22 August 2008 (UTC)
 * That would be most neato.--mboverload @ 19:51, 25 August 2008 (UTC)

zero-width assertions and performance
I think that starting a search string with a zero-width look-ahead and then the desired search string, usually used to exclude certain proper names, is harder on performance than either avoiding the zero-width assertions or using a zero-width look-behind assertion after the desired search string. Putting it at the beginning doubles the effort on things like Tremelo: at each check point (in this case, between every letter), see if Tremelo is the next string; if not, see if tremelo/Tremelo/tremelos/Tremelos is the next string; if so, replace the middle with remolo. I replaced it with a (buggy, but now fixed) version with no zero-width assertions, but the look-behind version would have been: see if tremelo/Tremelo/tremelos/Tremelos is the next string; if so, and it doesn't end with an s, make sure it wasn't Tremelo; if it wasn't, replace the middle with remolo. So the extra check is only made once AWB has gotten a possible match, not on every spot.

There are a couple of other places where a similar change could be made, but I remember some possible problem with the look-behinds and some of the other tools that use this list, so I'd like to open it up for discussion first. -- JHunterJ (talk) 11:47, 25 August 2008 (UTC)
 * Hmm, the zero-width look-aheads at the start of a rule are very useful, and performance of the typo list as a whole seems good to me, so I would be cautious about changing them. Could you provide an example of how the Tremelo rule would work with a look behind, as the current rule after your change and my fix looks confusing, though it works correctly? Thanks Rjwilmsi  12:07, 25 August 2008 (UTC)
 * Something like this:
 * So, match either T or t, then remelo, then either an s at the end of the word, or if we're already at the end of the word with "remelo", look back and make sure we didn't just see Tremelo. The only time we stop to look around is after we've already matched either Tremelo or tremelo. -- JHunterJ (talk) 12:18, 25 August 2008 (UTC)
 * That seems to work just as well (though XML markup is wrong...). If it's true then the change is simply to move all  to , and the question is whether this causes problems for other tools using the typo list?  Rjwilmsi  12:44, 25 August 2008 (UTC)
 * Yours looks behind for "Tremelo" even in the case where we might have found an s at the end of the word. It should be possible to look behind only when the looked-for word could possibly appear, but either should perform better than starting with the look-ahead. -- JHunterJ (talk) 12:59, 25 August 2008 (UTC)
 * One other option to use the current version with possibly less confusion:
 * ?| is a "branch-reset" grouping, so each alternative therein should start numbering at 1. (Perl 5.10.0 and later). I can test it this evening (EST) if no one does so before then. -- JHunterJ (talk) 16:52, 25 August 2008 (UTC)
 * It looks good for this example but I thought we wanted a general solution/standard. I think my suggestion is the most simple/general so far, but we need some performance data to see if it improves on the current entries using (?!blah). Rjwilmsi  17:14, 25 August 2008 (UTC)
 * Why does the solution need to be generic? I'd prefer the slight complication of inserting the look-behind only where it can match. -- JHunterJ (talk) 17:18, 25 August 2008 (UTC)
 * (?| ... ) is an unrecognized grouping construct, according to the regexp tester in AWB. So the last bit is moot. -- JHunterJ (talk) 00:09, 26 August 2008 (UTC)
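The two rule shapes compared in this section can be sketched as follows. This is only an illustration: the actual rules were not preserved in the thread, so both patterns are reconstructions from the prose descriptions, and Python's `re` module stands in for the .NET engine that AWB uses. Performance may differ between engines; the sketch only shows that the two shapes make the same fixes.

```python
import re

# Reconstructed (hypothetical) versions of the two rule shapes.
# Shape 1: negative look-ahead is evaluated first, at every word boundary.
LOOKAHEAD_FIRST = re.compile(r"\b(?!Tremelo\b)([Tt])remelo(s?)\b")
# Shape 2: negative look-behind is evaluated last, only after "[Tt]remelo" matched
# and only on the branch without a trailing "s".
LOOKBEHIND_LAST = re.compile(r"\b([Tt])remelo(s\b|\b(?<!Tremelo))")

def fix(pattern, text):
    """Replace the misspelled middle with 'remolo', keeping case and plural."""
    return pattern.sub(r"\1remolo\2", text)

sample = "The tremelo arm; Tremelos and tremelos; the town of Tremelo."
print(fix(LOOKAHEAD_FIRST, sample))
print(fix(LOOKBEHIND_LAST, sample))
```

Both calls leave the proper name "Tremelo" alone and correct the other three words.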

Double word
"My my" seems common enough to leave? Rich Farmbrough, 22:22 24 August 2008 (GMT).
 * Or perhaps first replaced by "My, my"? -- JHunterJ (talk) 11:54, 25 August 2008 (UTC)
 * IDK, perhaps, the Abba song is where I've bumped up against it. Rich Farmbrough, 14:23 26 August 2008 (GMT).
 * [[Image:Yes_check.svg|20x20px]] Remove "my" and "on" from the doubled word check. -- JHunterJ (talk) 23:38, 26 August 2008 (UTC)
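A doubled-word check with exceptions might look roughly like the sketch below. How AWB actually implements the check is not shown in this thread, so the pattern, the `ALLOWED_DOUBLES` name and the helper function are all assumptions made for illustration, written in Python rather than .NET.

```python
import re

# Words that are allowed to appear twice in a row (per the decision above).
ALLOWED_DOUBLES = ("my", "on")

# Match "word word" unless the word is on the allowed list.
DOUBLED_WORD = re.compile(
    r"\b(?!(?:%s)\b)(\w+) \1\b" % "|".join(ALLOWED_DOUBLES),
    re.IGNORECASE,
)

def fix_doubled_words(text):
    """Collapse an accidental doubled word to a single occurrence."""
    return DOUBLED_WORD.sub(r"\1", text)

print(fix_doubled_words("It was the the best."))
print(fix_doubled_words("My my, he went on on."))
```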

Exclusion of words from title
This is another obvious way to avoid false positives. See for example nor'easter. Rich Farmbrough, 18:33 26 August 2008 (GMT).
 * Not the (perfectly reasonable) fix suggested, but I did except nor'easter from the easter match. -- JHunterJ (talk) 23:55, 26 August 2008 (UTC)
 * I highly support this feature being added to AWB. --mboverload @ 04:45, 27 August 2008 (UTC)

Proposal for simplification of some rules
The typo rule standard seems to be to explicitly match all endings of a word when the typo is in the start/middle of a word. It seems to me we could simplify such rules. Example:

Here it's clear that the error is a missing 'r' in the middle of the word and there's no ambiguity about which word this applies to, so the following would achieve the same result (edit summary would stay the same): I think if we adopted such a convention for such situations (some if not a majority of the typo rules) by using  or   we would benefit from: shorter rules, easier maintenance and easier addition of new rules. I would like feedback from others as to whether this seems like a good idea, particularly if there would likely be any performance change to the rules? Thanks Rjwilmsi  12:15, 25 August 2008 (UTC)
 * I agree. Or even


 * -- JHunterJ (talk) 12:23, 25 August 2008 (UTC)
 * Sounds like an idea. Rjwilmsi, AWB has typo profiling... It might be worth me creating a temporary page with a few of these changed rules, and time them against the old version. —  Ree dy  12:43, 25 August 2008 (UTC)
 * Yes please Reedy, and the lookbehind / lookahead change in the above section too, if possible. Thanks Rjwilmsi  12:45, 25 August 2008 (UTC)

Profiling
Against Alexandria (Tends to be MaxSem's standard test case, long article)

[44, \b(I|i)ntefer(\w+) > $1nterfer$2] [44, \b(I|i)ntefer([a-z]+)\b > $1nterfer$2] [43, \b(I|i)ntefer(e[ds]?|ence|ing)\b > $1nterfer$2]

[41, \b(I|i)ntefer(\w+) > $1nterfer$2] [41, \b(I|i)ntefer([a-z]+)\b > $1nterfer$2] [41, \b(I|i)ntefer(e[ds]?|ence|ing)\b > $1nterfer$2]

[45, \b(I|i)ntefer([a-z]+)\b > $1nterfer$2] [44, \b(I|i)ntefer(\w+) > $1nterfer$2] [44, \b(I|i)ntefer(e[ds]?|ence|ing)\b > $1nterfer$2]

[42, \b(I|i)ntefer(\w+) > $1nterfer$2] [42, \b(I|i)ntefer([a-z]+)\b > $1nterfer$2] [42, \b(I|i)ntefer(e[ds]?|ence|ing)\b > $1nterfer$2]

When the regexes have been run for the first time, they are quicker than the original run, but have the same execution time.

It would seem, according to that, that the execution time (it's in milliseconds) is slightly better on the more verbose one.

—  Ree dy  17:18, 25 August 2008 (UTC)
 * Hmm, I'd have said that within the measurement error there's no difference between the three. Perhaps we should try a longer one, where the advantage of simplification would be greater. Maybe:
 * Thanks. Rjwilmsi  17:30, 25 August 2008 (UTC)
 * I was thinking that myself to be honest. Is it a case of replacing the capture groups with \w+ and [a-zA-Z]+? (just thinking that it would be case sensitive as it is) —  Ree dy  18:03, 25 August 2008 (UTC)
 * Yes, I would envisage using a \w+ or \w* as appropriate to make suitable rules shorter and more readable, make it easier to add new rules and potentially to catch endings that have been missed to date, while supporting all existing fixes. By using \w+ rather than just cutting off the regex, we will display the complete word changed in the edit summary.
 * If there are no objections I'll start making a few changes tomorrow. Rjwilmsi  18:32, 25 August 2008 (UTC)
 * As a side thought, it will help reduce the size of the page to be loaded as well, which can't be a bad thing. —  Ree dy  20:27, 27 August 2008 (UTC)
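For anyone wanting to repeat this kind of comparison outside AWB, a rough sketch follows. It uses the three "intefer" patterns from the profiling output above, translated to Python replacement syntax. The sample text, the `time_rules` helper and the repeat count are assumptions, and Python timings will not match AWB's .NET profiler.

```python
import re
import timeit

# The three rule variants from the profiling run above.
RULES = {
    "any word chars": re.compile(r"\b(I|i)ntefer(\w+)"),
    "letters to word end": re.compile(r"\b(I|i)ntefer([a-z]+)\b"),
    "explicit endings": re.compile(r"\b(I|i)ntefer(e[ds]?|ence|ing)\b"),
}
REPLACEMENT = r"\1nterfer\2"

def apply_rule(pattern, text):
    """Apply one rule variant to the text."""
    return pattern.sub(REPLACEMENT, text)

def time_rules(text, repeats=200):
    """Return the seconds taken to run each variant `repeats` times over `text`."""
    return {
        name: timeit.timeit(lambda p=p: p.sub(REPLACEMENT, text), number=repeats)
        for name, p in RULES.items()
    }

# Made-up sample text; a long real article would be a better test case.
sample = "He intefered; the inteference kept Intefering. " * 100

for name, seconds in time_rules(sample).items():
    print(f"{name}: {seconds:.4f}s")
```

On this sample all three variants produce the same corrected text, so any timing difference is down to the pattern shape alone.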

broke again
It appears it is broken again. error picture--Rockfang (talk) 22:36, 27 August 2008 (UTC)
 * It's fixed again ;) Rjwilmsi  23:40, 27 August 2008 (UTC)

Avoiding false positives on scientific (Latin) names
One of the most common false positives I come across seems to be matching on lowercase words in scientific (Latin) names. An example would be Blah carolina (what Blah is doesn't matter here). These are matched by rules like  as the regex   includes a. So the rule wants to be  but not. I'm struggling to find a neat way to do that beyond an explicit set of  (since there are many entries that could do with this change). Anybody have any ideas? Rjwilmsi 11:29, 29 August 2008 (UTC)
 * Add a zero-width negative look-ahead to make sure the next character isn't an apostrophe: . But this will prevent a match on "I forgot to capitalize south carolina's initials." So make it look for two apostrophes:  . -- JHunterJ (talk) 12:37, 29 August 2008 (UTC)
 * Interesting, I'll test that later. Rjwilmsi  12:44, 29 August 2008 (UTC)
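The difference between the one-apostrophe and two-apostrophe look-aheads can be shown with a small sketch. The real typo-list rule is not shown in the thread, so the `carolina` patterns below are hypothetical stand-ins, and Python is used in place of AWB's .NET engine.

```python
import re

# Hypothetical simplified rules: capitalise a lower-case "carolina".
SINGLE = re.compile(r"\bcarolina\b(?!')")    # blocked by any following apostrophe
DOUBLE = re.compile(r"\bcarolina\b(?!'')")   # blocked only by wiki italics markup

italic = "''Blah carolina'' is a species."
possessive = "I forgot to capitalize south carolina's initials."

for name, rule in (("single", SINGLE), ("double", DOUBLE)):
    print(name, "|", rule.sub("Carolina", italic), "|", rule.sub("Carolina", possessive))
```

Both versions leave the italicised scientific name alone, but only the two-apostrophe version still fixes the possessive.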

I have been putting these in Tuxedo carolina templates. But it's not satisfactory. I'd rather have a scientific name mark-up. Rich Farmbrough, 19:36 1 September 2008 (GMT).
 * When I did that I got told off by WP:PLANT people! JHunterJ's  works well though.  Rjwilmsi  23:04, 1 September 2008 (UTC)
 * A separate "scientific name" markup would be nice. We still give up catching "I forgot to capitalize bastard out of carolina." with the current solution. -- JHunterJ (talk) 02:27, 2 September 2008 (UTC)

Typo bug
distictly goes to districtly, when context makes it clear it should be distinctly. Should this go here, or in the main AWB bugs section? gnfnrf (talk) 18:57, 30 August 2008 (UTC)
 * [[Image:Yes_check.svg|20x20px]] Here's the right place. I've added a new rule to cover 'distictly'. Thanks Rjwilmsi  00:50, 31 August 2008 (UTC)

qualified → qualifed
AWB tries to do the following: qualified → qualifed I can't figure if qualifed is even a word, it doesn't look right. Thanks. §hep  •   ¡Talk to me!  17:57, 2 September 2008 (UTC)
 * [[Image:Yes_check.svg|20x20px]] My changes earlier broke this fix. It's correct now after this fix. Thanks Rjwilmsi  18:23, 2 September 2008 (UTC)

"approxiatemately" → "approximatemately"
The title of this section is a regex bug I just found. I might come back and fix it myself later, but I'm simply noting it here for now. { { Nihiltres | talk | log } } 16:53, 3 September 2008 (UTC)
 * [[Image:Yes_check.svg|20x20px]]Thanks, I've fixed it. Rjwilmsi  17:00, 3 September 2008 (UTC)
 * Great, that was fast. :) { { Nihiltres | talk | log } } 17:15, 3 September 2008 (UTC)

Das ist borked
I'm getting a duplicate rule error in AWB while trying to load errors. I can screenshot it if needed.--Rockfang (talk) 00:19, 5 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Fixed yesterday. Rjwilmsi  00:22, 7 September 2008 (UTC)

Spanish word for "effect"
As far as I know, the Spanish word for "effect" is "efecto". Currently, the typo part of AWB is seeing "Efecto" and suggesting it be changed to "Effecto". I'm not sure if/how other languages are tied into the typo fixing, but we may want to remove this fix.--Rockfang (talk) 20:28, 5 September 2008 (UTC)

The same is happening for the Spanish word for "different". It sees "Diferente", and suggests "Differente".--Rockfang (talk) 20:33, 5 September 2008 (UTC)
 * The idea with foreign text is to use the effecto language tags, then the English typo fixes aren't applied to it. Rjwilmsi  20:39, 5 September 2008 (UTC)
 * Did you mean efecto ? That is the proper Spanish spelling of the word.--Rockfang (talk) 20:43, 5 September 2008 (UTC)
 * Doh! Yes, though I was just providing an example of the template syntax. Rjwilmsi  21:25, 5 September 2008 (UTC)
 * Thanks for the reply and the info. I didn't even know of that template.--Rockfang (talk) 21:30, 5 September 2008 (UTC)

"annoucned" → "announcned"
AWB is currently suggesting the above change. This should probably be fixed/changed.--Rockfang (talk) 21:47, 6 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Well spotted. This edit will catch it. Example of successful edit. Rjwilmsi  00:21, 7 September 2008 (UTC)

error in spellchecker
erroneously changes "spacious" to "spacitous" and "capacious" to "capacitous" Ling.Nut (WP:3IAR) 05:55, 4 September 2008 (UTC)
 * I found the error. Removed the following for scrutiny:
 * 
 * Ling.Nut (WP:3IAR) 06:01, 4 September 2008 (UTC)
 * [[Image:Yes_check.svg|20x20px]] Fixed with two rules, one for aciy -> acity, one for acitous -> acious. -- JHunterJ (talk) 11:14, 5 September 2008 (UTC)
 * Do you really want that "p" in the Replace field?--BillFlis (talk) 12:56, 5 September 2008 (UTC)
 * Nope; I fixed just now. Thanks! -- JHunterJ (talk) 13:53, 6 September 2008 (UTC)

Also, it erroneously changes "acompany" to "anccompany" rather than "accompany" (diff). — Jeff G. (talk&#124;contribs) 11:37, 7 September 2008 (UTC)
 * [[Image:Yes_check.svg|20x20px]] Fixed. Thanks Rjwilmsi  11:43, 7 September 2008 (UTC)

writter → writer
If there is a rule for the above, it's not working. If there is not, please make one. Thanks! — Jeff G. (talk&#124;contribs) 11:53, 7 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Existing rule expanded. Thanks Rjwilmsi  13:00, 7 September 2008 (UTC)

emminent
emminent currently corrects to eminent; sometimes it should become imminent instead; hypothetically, it could also be a mistaken immanent. Even though emminent is never correct, we might need to delete it. Or consider ways to eliminate the false fixes. "an emminent" might work to still catch some eminents with little chance of intending imminent, for example. -- JHunterJ (talk) 16:34, 25 August 2008 (UTC)
 * Pity we can't offer the editor choices. Rich Farmbrough, 14:28 26 August 2008 (GMT).
 * There is a feature request for something like that... —  Ree dy  17:04, 27 August 2008 (UTC)
 * I requested a feature like this on 19 September last year, but the suggestion was never picked up

Colonies Chris (talk) 10:57, 12 September 2008 (UTC)

Proper name getting mangled
"Beliveau", as in Jean Béliveau/"Jean Beliveau", should not be changed to "Believeau". Thanks, { { Nihiltres | talk | log } } 17:11, 10 September 2008 (UTC)


 * What if we just change each "Beliveau" to "Béliveau" (with accent acute)?--BillFlis (talk) 17:18, 10 September 2008 (UTC)
 * Exception added for Beliveau in the meantime. We could probably add the accent fix too. Rjwilmsi  17:24, 10 September 2008 (UTC)


 * I had heard that for whatever reason, the diacritics for names like this are being excluded from some pages (e.g. Montreal Canadiens); I don't think AWB should be making the correction for the accent (despite that I think we should have the accents; consensus overrules my preference). { { Nihiltres | talk | log } } 12:44, 11 September 2008 (UTC)
 * Yes, there's an agreement over at WP:HOCKEY that players' names don't show accents in the NHL context, because the NHL jerseys don't use them. But they're used in the player's own article. So AWB can't do it as a general fix. Colonies Chris (talk) 11:34, 12 September 2008 (UTC)

dispicable
"dispicable" should probably become "despicable", not "despairicable"

I didn't make the change anyway, as it was in quotes on the target page.

—  Ree dy  12:34, 11 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] fixed. Rjwilmsi  17:18, 11 September 2008 (UTC)

availble → availab$2
Could someone fix whatever's doing this please? Colonies Chris (talk) 10:42, 12 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Brackets were missing. Fixed. Rjwilmsi  11:12, 12 September 2008 (UTC)

Buddah
It has recently come to my attention that AWB recommends a correction of Buddah to Buddha. This is a very problematic correction because of the famous record label, Buddah Records, often shortened to just Buddah. There are probably a few hundred pages which mention Buddah Records, and because of this I'd like to ask that this correction be removed from the list. Chubbles (talk) 16:23, 13 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Exception added so that Buddah Records isn't changed. Rjwilmsi  17:28, 13 September 2008 (UTC)
 * I changed the expression from
 * Make sure the current position doesn't lead into Buddah Records
 * Look for Buddah
 * to
 * Look for Buddah
 * Make sure the current position doesn't lead into Records
 * which should be better-performing, as it has about half as much work to do. -- JHunterJ (talk) 19:05, 13 September 2008 (UTC)
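The two orderings described above might look like this. The exact expressions were not preserved in the thread, so both patterns are reconstructions from the step-by-step descriptions, written for Python rather than AWB's .NET engine.

```python
import re

# Reconstructed (hypothetical) versions of the before and after expressions.
# Before: the exception is checked at every word boundary, then "Buddah" is matched.
GUARD_FIRST = re.compile(r"\b(?!Buddah Records)Buddah\b")
# After: "Buddah" is matched first, and only then is " Records" ruled out.
GUARD_LAST = re.compile(r"\bBuddah\b(?! Records)")

sample = "Signed to Buddah Records, he later visited a Buddah statue."

print(GUARD_FIRST.sub("Buddha", sample))
print(GUARD_LAST.sub("Buddha", sample))
```

Both leave the record label untouched and correct the other occurrence; the second form only runs its look-ahead once "Buddah" has already matched.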

Graph fixes
Graph looks like a good candidate for wholesale replacement, without trying to identify all the prefixes and suffixes. Can we fix any instance of "grpah", regardless of surrounding letters, or is there a false positive that that would hit? -- JHunterJ (talk) 20:17, 16 September 2008 (UTC)
 * We'll soon find out ;) Rjwilmsi  20:27, 16 September 2008 (UTC)

displease
This is currently replacing unpleased with displease$2.--balloonguy (talk) 21:56, 17 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] —  Ree dy  22:06, 17 September 2008 (UTC)

Wonderful resource for spelling errors uncaught by AWB

 * Go look at History of Ethiopia. I have already manually changed several spelling errors that AWB didn't catch (look at the diff of my AWB edit as well; I manually changed a few there as well). If you keep looking, you'll probably spot more. Ling.Nut (talk&mdash;WP:3IAR) 01:43, 20 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]]sceptre is a word. New typos are "enroach", "asecended", "ephipany" added to typo list; Ethiopia article fixed. Thanks Rjwilmsi  08:45, 20 September 2008 (UTC)

variations on "accede" changed to "ascend"; rmvd offending regex
Here ya go. I'd fix it myself, but I'm busy washing dishes with WP:AWB:




 * Ling.Nut (talk&mdash;WP:3IAR) 12:57, 20 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Fixed -- JHunterJ (talk) 13:14, 20 September 2008 (UTC)

Noth shouldn't be changed to North
As per usage in Deuteronomist, it's legit.

Or at least \bNoth\b shouldn't be; others are all right to be changed.

—  Ree dy  11:19, 11 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Fixed. -- JHunterJ (talk) 11:50, 11 September 2008 (UTC)
 * Don't overlook the comment I posted few minutes ago (below), but another noth-north problem at Australian English. Ling.Nut (talk&mdash;WP:3IAR) 11:02, 23 September 2008 (UTC)

Aberravon
Something, probably this regex



is wrongly converting Aberavon to Aberravon, but I'm not sure how to fix it. Colonies Chris (talk) 18:56, 24 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] exception added. Thanks Rjwilmsi  19:55, 24 September 2008 (UTC)

quatermain & quaternion --> quartermain & quarternion

 * Attempt to fix "quater-->quarter" hoses words that legitimately contain "quater-". Ling.Nut (talk&mdash;WP:3IAR) 10:45, 23 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Exceptions added. Thanks Rjwilmsi  11:22, 23 September 2008 (UTC)
 * ...doesn't catch plurals; see Cross product forex. Ling.Nut (talk&mdash;WP:3IAR) 08:31, 27 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] does now. Rjwilmsi  08:56, 27 September 2008 (UTC)

Embarras River

 * obvious problems. Ling.Nut (talk&mdash;WP:3IAR) 15:13, 25 September 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] exception added Rjwilmsi  17:04, 25 September 2008 (UTC)

medially→medically
The above change is in the list of typos. I suggest it be removed as medially is a word.--Rockfang (talk) 21:23, 4 October 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Exception added. Thanks Rjwilmsi  22:36, 4 October 2008 (UTC)

passable
The ending "-(s)ible" incorrectly converts "passable/-ably/-ability" to "passible/-ibly/-ibility". I think it's fixable by replacing  with , but I'm not confident enough to do it. Am I close? —S MALL JIM   22:15, 4 October 2008 (UTC)
 * [[Image:Yes_check.svg|20px]] Yes. Change made. Thanks Rjwilmsi  22:36, 4 October 2008 (UTC)

Xbox
\b(?i)xbox\b

Wouldn't using something like the above make more sense? I.e. do it all case-insensitively, so that it'll match any of the variations (which saves having various hardcoded versions). —  Ree dy  10:23, 5 October 2008 (UTC)
 * As long as it doesn't match the correct variation (and isn't a performance problem). -- JHunterJ (talk) 12:03, 5 October 2008 (UTC)
 * perhaps -- JHunterJ (talk) 12:09, 5 October 2008 (UTC)
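One way to combine a case-insensitive match with an exception for the correct spelling is sketched below. This is not the rule that was suggested in the thread (that text was lost), just an illustration in Python. The case-insensitive part uses the scoped form `(?i:...)` so that the look-ahead stays case-sensitive.

```python
import re

# Hypothetical rule: match any casing of "xbox" except the correct "Xbox".
XBOX_RULE = re.compile(r"\b(?!Xbox\b)(?i:xbox)\b")

fixed, count = XBOX_RULE.subn("Xbox", "XBOX, xbox, XBox and Xbox")
print(fixed, count)
```

The already-correct "Xbox" is skipped, so it does not show up as a change in the edit.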

AWB replaces noting with nothing

 * [[Image:Yes_check.svg|20px]] Exception added. Thanks Rjwilmsi  15:21, 5 October 2008 (UTC)

This page is HUGE!
My Opera hangs for at least 10 seconds when loading the typos page. This is intolerable, let's take some measures to reduce it. I've already tweaked AWB to use Gzip compression when loading typos, but the list is still huge, and this has no effect on people who maintain or view the list from their browsers.
 * 1) I've also dropped the requirement for word="foo" attribute to be present in the rules in the next version of AWB, so removing them all will somewhat reduce the size, but will make it harder to understand what a rule is supposed to do.
 * 2) We could also replace those fancy  which should only capitalize thing like tHe and THere. I'm guessing it is matching case insensitively, which is going to mess up a number of other rules as well. I'm not familiar with the software, but perhaps this is an option that can be toggled? --ThaddeusB (talk) 03:56, 19 January 2009 (UTC)
 * I personally have no idea. I started a new topic on the wikEd talkpage and link to here. Hopefully you guys can figure out what's going on. Thanks for all your work on these tools! --Armchair info guy (talk) 04:01, 19 January 2009 (UTC)
 * ThaddeusB was right and I have fixed this in the latest release of wikEd. Cacycle (talk) 13:43, 26 January 2009 (UTC)

Error
I think something is wrong with this one:. Plrk (talk) 21:01, 27 January 2009 (UTC)


 * No wait, it is supposed to change "ghandi" to "Gandhi"? Is that really a good idea? Plrk (talk) 21:02, 27 January 2009 (UTC)

Restauranteur
AWB is currently suggesting that restauranteur be changed to restaurateur. At least according to Wiktionary, both appear to be valid spellings of the word. It may be prudent to remove the word change from the list.--Rockfang (talk) 17:19, 29 January 2009 (UTC)
 * I have just discovered this page, and removed that rule. I knew about the problem because I had earlier  such a change.  —AlanBarrett (talk) 09:02, 31 January 2009 (UTC)
 * ✅ Rules corrected to allow 'restauranteur' as a correct spelling variant. Rjwilmsi  09:50, 31 January 2009 (UTC)

"cataloged" changed to "catalogued"
changed "cataloged" to "catalogued", which I think should not be done, because both spellings are acceptable. However, I can't find the rule that would have made the change. Can anybody find the rule, and either fix it (if this was a false positive for a rule that has a legitimate purpose)? —AlanBarrett (talk) 09:13, 31 January 2009 (UTC)
 * ✅ 'Cataloged' is the US variant, per Concise OED. I think that 'correction' was introduced by mistake. List corrected. Thanks Rjwilmsi  09:40, 31 January 2009 (UTC)

Nestin
Nestin shouldn't be changed to nesting, see Nestin (protein) --Closedmouth (talk) 07:57, 3 February 2009 (UTC)
 * ✅ Rule removed. Thanks Rjwilmsi  08:59, 3 February 2009 (UTC)

Two incorrect typo "fixes"
In the article Greenwich, Connecticut, "disibilities" was changed to "dissibilities". It should be "disabilities".

In the article Jail Killing Day, "acquitted" was changed to "acquit". It should have been left as it was.

Thanks. --Auntof6 (talk) 06:47, 16 February 2009 (UTC)
 * ✅ New rule added for first problem, second was just non-printing character in middle of word in article, no change to typo rules needed for it. Thanks Rjwilmsi  08:15, 16 February 2009 (UTC)

Date Fix
How do you use AWB to change "2006-05-07" to "May 7, 2006"? I've seen many pages use the former date format, and it's a little unclear (e.g. List of Eureka Seven episodes). Thanks. - plau (talk) 06:41, 8 March 2009 (UTC)
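As a general illustration of the conversion being asked about (not a description of any AWB feature), a regex with a replacement function can do it. The sketch below is Python, and the names `ISO_DATE`, `to_month_day_year` and `convert_dates` are made up for the example. It would also rewrite dates inside URLs or template parameters, which a real run would need to exclude.

```python
import re

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Four-digit year, two-digit month, two-digit day.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

def to_month_day_year(match):
    """Turn one YYYY-MM-DD match into 'Month D, YYYY'; leave invalid dates alone."""
    year, month, day = match.group(1), int(match.group(2)), int(match.group(3))
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return match.group(0)
    return f"{MONTHS[month - 1]} {day}, {year}"

def convert_dates(text):
    return ISO_DATE.sub(to_month_day_year, text)

print(convert_dates("First aired 2006-05-07."))
```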

"is is" to "it is"
I've seen AWB catch this typo frequently, however, the solution has never been to make that change. It has always been to just remove the first is (aka "is is" to "is"). Any way that can be fixed? --Kbdank71 20:23, 25 March 2009 (UTC)
 * ✅ Rich Farmbrough changed it just recently here. Rjwilmsi  08:04, 26 March 2009 (UTC)
 * Perhaps I wasn't clear. I meant to request undoing that change, as every time I've come across "is is", the correct typo fix is to just drop one is.  I have never encountered a situation when "it is" was the correct solution.  --Kbdank71 15:23, 27 March 2009 (UTC)
 * Examples:       --Kbdank71 15:43, 27 March 2009 (UTC)

Two common misspellings I've come across that are not in the list
Firstly there's "enoble" (91 article hits), which should be "ennoble".

Then there's "meterorite", which should be "meteorite"; however, I'm not so sure about this one, as it could just be an American/British thing.

I'm very unfamiliar with how to add these, I haven't learned the proper rules/expressions yet and don't want to screw it up so can someone add these please? -- OlEnglish (Talk) 23:22, 27 March 2009 (UTC)
 * ✅. I added those two here. -- JHunterJ (talk) 23:30, 27 March 2009 (UTC)
 * "meterorite" gets only one hit, a redirect to the article with the correct spelling. I think it ought not to have been added.--BillFlis (talk) 19:46, 2 April 2009 (UTC)
 * OlEnglish fixed several of them on March 25. -- JHunterJ (talk) 19:51, 2 April 2009 (UTC)

Sadly passed
Does AWB typos extend to dealing with unnecessary phrases such as "Sadly passed" (6,791 hits) "Passed away" (65,434) "sadly passed away" (4,909) and "Sadly died" (6,986), the vast majority of which really want to say "died"?

If not, can you advise of anywhere that does deal with this sort of issue; thanks --Tagishsimon (talk) 20:17, 26 March 2009 (UTC)
 * I keep my own pet-peeve wordy phrases in my replacement list in AWB, but mostly they're not in AWB Typos unless they're wrong (as opposed to just verbose or over-written). -- JHunterJ (talk) 20:35, 26 March 2009 (UTC)


 * Feel free to add these. I noticed AutoWikiBrowser/Typos and thought these might be candidates for that, currently empty, space. --Tagishsimon (talk) 20:39, 26 March 2009 (UTC)


 * I removed the passed away additions. As I mentioned, there is nothing incorrect about saying someone passed away; since it isn't incorrect, it can't be corrected. -- JHunterJ (talk) 23:28, 27 March 2009 (UTC)

I also removed corrections for "at a young age", "sady died" (sic), and "tragically died", for the same reasons. "at a young age" -> "young", in particular, will result in awkward sentences (see http://www.google.com/search?q=%22at+a+young+age%22+site%3Aen.wikipedia.org ). And removing adverbs from sentences, while often useful from an editorial standpoint, is not typo fixing. -- JHunterJ (talk) 11:41, 1 April 2009 (UTC)


 * I think your removals are not justified by your explanation: "since it isn't incorrect, it can't be corrected". There is clear guideline support for doing away with the death euphemisms, above, in Words to avoid. Your "is not typo fixing" does not seem to mesh with AutoWikiBrowser/Typos. Like the person who added the phrases, like User:BillFlis, who probably knows his way around this place with his 2,700 odd contributions, I would wish to keep these. Perhaps you would consider reinstating them. --Tagishsimon (talk) 00:11, 2 April 2009 (UTC)

For what it's worth, I agree that at least some of these "corrections" are appropriate. While it may not be technically incorrect to say 'passed away' it is against the style guide, which is a good enough reason to change it as far as I'm concerned. After all, many of our corrections already in use aren't, strictly speaking, "typo fixes."

I would tentatively support the following changes, but likely no others (as I feel other phrases may lead to undesirable changes). However, I could be persuaded against them if they are shown to cause false positives/undesirable changes.
 * "passed away" (all lower case only) -> "died"
 * "gave his(/her) life" -> "died"
 * "died tragically" / "tragically died" -> "died"

--ThaddeusB (talk) 01:26, 2 April 2009 (UTC)
 * If the typo fixing rules can be used to assist in compliance to the agreed style guides then let's do it. Though as ThaddeusB says, if there are too many false positives we might have to remove or restrict the entries just like with any other typo rule. Rjwilmsi  11:06, 2 April 2009 (UTC)
 * I was not aware that the style guide covered them. Ones that are covered by a WP style guide and avoid false-positive problems, yes, I (no longer) have any objection to them. -- JHunterJ (talk) 11:35, 2 April 2009 (UTC)
 * I think that "gave his/her life" has too many possible false positives. A quick search shows it's being used in at least two other contexts: devotion to religion (e.g., "gave his life to Jesus"), and "gave his life new direction". — TKD:: {talk}  12:53, 2 April 2009 (UTC)
 * Thanks for reconsidering this; much appreciated. --Tagishsimon (talk) 13:39, 2 April 2009 (UTC)
 * How do we amend eg Euphemisms, to show it should not be changed by AWB from passed away to died? Kittybrewster  &#9742;  20:17, 5 April 2009 (UTC)

False positive -> Airbourne (band) being corrected to Airborne
i.e. –  xeno  ( talk ) 17:56, 10 April 2009 (UTC)
 * Fixed with this edit. -- JHunterJ (talk) 20:12, 10 April 2009 (UTC)

Homberg changed to Homburg - except there is a place called Homberg
One of the typo fixes changes "Homberg" to "Homburg" (it's buried under endings; search for word="-burg"). Fair enough most of the time, except there is a place called Homberg, see Homberg (Efze).  I assumed it was the correct Anglicisation of a German word, but now I suspect it is not.  Mr Stephen (talk) 23:24, 9 April 2009 (UTC)
 * ... and several other Hombergs. Mr Stephen (talk) 23:26, 9 April 2009 (UTC)
 * Done, it should not catch Homberg any more.-- Dycedarg  &#x0436;  02:27, 10 April 2009 (UTC)
 * Thanks. Mr Stephen (talk) 10:03, 11 April 2009 (UTC)

Anser (genus)
Is there a way to tweak the corrections for "answer", so that it doesn't systematically suggest to correct the above, e.g. on Maharana Pratap Sagar? Generally it's used in one of the species of Anser (genus). -- User:Docu
 * Fixed with this edit. -- JHunterJ (talk) 12:44, 11 April 2009 (UTC)
 * Thanks. -- User:Docu

Nee/Née
Both nee and née are acceptable.--BillFlis (talk) 10:11, 13 April 2009 (UTC)
 * If née is not preferred (I still think it could be preferred), then we should leave the rule so that it fixes incorrect accenting (e.g., neé) or remove the rule entirely? -- JHunterJ (talk) 11:02, 13 April 2009 (UTC)
 * Surely it is preferred. Kittybrewster  &#9742;  12:15, 13 April 2009 (UTC)
 * I would think so. --ThaddeusB (talk) 02:35, 17 April 2009 (UTC)
 * I second that. --bender235 (talk) 15:18, 17 April 2009 (UTC)

Suggestion for large-scale addition to the typos list
There are many redirects from titles without diacritics to the correct article title, with diacritics - e.g. Jerome Bonaparte, Brunswick-Luneburg. I believe it would be possible to use these redirects to set up regexes to automatically add the missing diacritics wherever the non-diacritic version is used (but I don't have the skills to do it). Here's how I think it could be done:

1. For each item in Category:Redirects from title without diacritics, select only those where
 * a. the only difference between the source and target is the addition of diacritics (despite the name of the category, this isn't always the case)
 * b. there is at least one link to the redirect (an optional filter to reduce the size of the list)

and from each selected redirect, create an XML/regex (in the style of the typo list) to map source --> target

2. Add the generated list of corrections to the AWB typo list.

Does this sound feasible/desirable? Colonies Chris (talk) 11:16, 15 April 2009 (UTC)
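
A minimal sketch of step 1 in Python, assuming the redirect pairs have already been fetched by some other means (the category query is not shown). The helper names `strip_diacritics` and `make_rule` are illustrative, and the emitted `<Typo .../>` line only imitates the style of the typo list. Only filter (a) is covered; filter (b) would need link counts.

```python
import re
import unicodedata
from typing import Optional
from xml.sax.saxutils import quoteattr


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_rule(source: str, target: str) -> Optional[str]:
    """Return a typo-list style rule, or None if the pair fails filter (a).

    Filter (a): the only difference between source and target is the
    addition of diacritics.
    """
    if source == target or strip_diacritics(target) != source:
        return None
    # Word boundaries keep the rule from firing inside longer words.
    find = r"\b" + re.escape(source) + r"\b"
    return "<Typo word={} find={} replace={}/>".format(
        quoteattr(target), quoteattr(find), quoteattr(target)
    )


if __name__ == "__main__":
    # Hypothetical input: (redirect title, target title) pairs.
    pairs = [
        ("Jerome Bonaparte", "Jérôme Bonaparte"),
        ("Brunswick-Luneburg", "Brunswick-Lüneburg"),
    ]
    for source, target in pairs:
        rule = make_rule(source, target)
        if rule is not None:
            print(rule)
```

Letters with no Unicode decomposition (ø, ł and similar) fail the filter and are skipped, so a real run would need a separate mapping for those.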

Replace double hyphen with em dash
Is it possible to have the AutoWikiBrowser detect double hyphens between letters (such as "abc--xyz", or spaced like "abc -- xyz") and replace them with correct em dashes? (see also MOS:EMDASH) --bender235 (talk) 22:20, 17 April 2009 (UTC)
 * It is possible, but should be added as a general fix if anything. I have requested it for you here. --ThaddeusB (talk) 00:08, 18 April 2009 (UTC)
 * And I've done it. Rjwilmsi  17:54, 19 April 2009 (UTC)
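
For illustration only, a Python sketch of the kind of pattern the request describes. This is not the general fix that was actually added to AWB; the name `fix_double_hyphen` and the exact guards are assumptions. It requires a word character on each side and refuses runs of three or more hyphens, so HTML comment markers are left alone.

```python
import re

# Exactly two hyphens, optionally surrounded by one space on each side,
# with a word character before and after.
DOUBLE_HYPHEN = re.compile(r"(?<=\w) ?(?<!-)--(?!-) ?(?=\w)")


def fix_double_hyphen(text: str) -> str:
    """Replace 'abc--xyz' and 'abc -- xyz' with an unspaced em dash."""
    return DOUBLE_HYPHEN.sub("\u2014", text)


if __name__ == "__main__":
    print(fix_double_hyphen("abc--xyz and abc -- xyz"))
```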

False positives
I had a couple of false positives for Welsh place-names when using AWB earlier - it wanted to turn Aberaeron to Aberraeron and Aberafon to Aberrafon. In both cases, the existing spelling is correct. —  Tivedshambo   (t/c) 22:18, 18 April 2009 (UTC)
 * Fixed with this edit. -- JHunterJ (talk) 15:13, 19 April 2009 (UTC)

Scenarios
appears to fail to fix the misspelling.  MBisanz  talk 23:24, 20 April 2009 (UTC)
 * ✅ That should sort it out. Rjwilmsi  23:39, 20 April 2009 (UTC)

Telecommunications
Not fixing at.  MBisanz  talk 23:53, 20 April 2009 (UTC)
 * I expanded communicate to match telecommunicate cases here (I assume this is what you wanted done). Although, it won't actually match your example since telecommunications actually has one 'l' not two :) --ThaddeusB (talk) 03:16, 21 April 2009 (UTC)

Discernable
Currently it changes discernable → discernible, but I think both are acceptable. See Merriam Webster.  shirulashem     (talk)   00:23, 21 April 2009 (UTC)
 * You are correct 'discernable' is listed in several dictionaries, and thus should probably not be corrected. Interestingly, 'indiscernable' is listed in none.  Thus, I left the correction for indiscernable cases only. --ThaddeusB (talk) 03:29, 21 April 2009 (UTC)

Wicher/Witcher
This is almost always a false positive. I've encountered many false positives but never a correction.  - down  load  |   sign!  02:14, 21 April 2009 (UTC)
 * fixed here --ThaddeusB (talk) 03:37, 21 April 2009 (UTC)

"2×" instead of "2x"
Oftentimes in athletes' infoboxes there are things like "2x National Champion" or "4x Most Valuable Player". But it should be "2× ..." or "4× ...", respectively, using the multiplication sign. --bender235 (talk) 08:46, 7 April 2009 (UTC)
 * I've been reverted when making that kind of change on sports pages, because of other editors' preference for the ASCII representation x. -- JHunterJ (talk) 11:34, 7 April 2009 (UTC)
 * Where and why? Don't we replace - with – as well, because "p. 12-15" would be wrong (and "p. 12–15" correct)? --bender235 (talk) 12:47, 7 April 2009 (UTC)
 * Here. Don't know why. I didn't have the drive to pursue it. -- JHunterJ (talk) 14:53, 7 April 2009 (UTC)
 * Okay, let me do the dirty work. ;-) --bender235 (talk) 15:01, 7 April 2009 (UTC)
 * Since no one seems to oppose this proposal, I guess it's fair to add this to the typo fixes, isn't it? --bender235 (talk) 23:25, 15 April 2009 (UTC)


 * Has anyone added this fix as of now? --bender235 (talk) 13:28, 24 April 2009 (UTC)

None the less
Hello, you (or, at least, the AWB bot) have been treating "none the less" (three words) as a typo, and changing it to nonetheless (one word).

Most dictionaries say it can be either. The Oxford Dictionary for Writers and Editors (ODWE), which I have always gone to when in doubt, says the three-word version is actually to be preferred (unlike "nevertheless", which is always one word).

It's a very small matter in the great scheme of things, but I think at the very least there is no need to change "none the less" when it appears as three words. Alarics (talk) 20:15, 21 April 2009 (UTC)


 * Thanks, I'll point it out to the devs.  MBisanz  talk 04:48, 22 April 2009 (UTC)

Subsequently
Could someone please add "subsequently", replacing misspellings like "supsequently" or "subsiquently"? --bender235 (talk) 14:02, 24 April 2009 (UTC)
 * The latter is already there ("-sequent" rule). I'll add a rule for the first. Rjwilmsi  17:56, 24 April 2009 (UTC)
 * ✅ Added. Rjwilmsi  18:00, 24 April 2009 (UTC)

Academey?
I was just using TypoRegex, and AWB tried to correct "Acadmey" with "Academey". Shouldn't it be "Academy"? --bender235 (talk) 21:23, 24 April 2009 (UTC)
 * with this edit. -- JHunterJ (talk) 11:44, 27 April 2009 (UTC)

requirments -> requirements
Please add this one to the Regex database. --bender235 (talk) 22:15, 24 April 2009 (UTC)
 * ✅ Existing rule expanded. Rjwilmsi  07:09, 25 April 2009 (UTC)

Replacing "1/2" with "½", etc.
I don't know whether this should be added as a "general fixes" request, but misspelled fractions like "1/2" or "3/4" should be replaced with ½ and ¾, respectively. That would include ½, ⅓, ⅔, ¼, ¾, ⅛, ⅜, ⅝, and ⅞. --bender235 (talk) 16:46, 27 April 2009 (UTC)
 * 1/2 isn't misspelled, but I get your point. There is the possibility for many false positives this way, though, in dates, military unit designations, etc. etc. -- JHunterJ (talk) 16:59, 27 April 2009 (UTC)


 * I think we have a guideline NOT to replace these with the Unicode characters somewhere, instead we should use upper/lowercase. Cacycle (talk) 12:20, 28 April 2009 (UTC)
 * WP:MOSNUM specifies using the frac template. Square and cube exponents are guidelined against using their Unicode characters though. -- JHunterJ (talk) 18:05, 28 April 2009 (UTC)
 * This seems like more of an AWB general fix than a typo rule. Rjwilmsi  18:17, 28 April 2009 (UTC)
 * It is definitely not a typo fix and probably not appropriate as a general fix either since "1/2" can mean a lot more things than just "one half". --ThaddeusB (talk) 00:31, 30 April 2009 (UTC)
 * If somebody can come up with an extremely reliable set of cases where fractions could be replaced then AWB could do it as a new general fix, otherwise, I think this can't go anywhere. Rjwilmsi  11:26, 30 April 2009 (UTC)

Example --> Exemple
Many false positives, as this is a word in French. I suggest it be removed.  - down  load  |   sign!  23:31, 29 April 2009 (UTC)
 * The word in French should be cast within a lang template, which will enclose it within a span identifying the language and protect it from automatic English-language fixes on the English-language projects. I don't think we wish to remove all strings that are words in other languages. -- JHunterJ (talk) 00:42, 30 April 2009 (UTC)
 * I agree. In some cases it could be a misspelling of the English word -- this is, after all, the English Wikipedia. Besides, this kind of thing is the reason that AWB changes are supposed to be checked by a human before being saved. --Auntof6 (talk) 05:00, 30 April 2009 (UTC)

fourtunate
Currently this corrects to ffortunate. Not sure if it's worth fixing. -- User:Docu
 * - The problem was an extra "f" in the replacement part. --ThaddeusB (talk) 19:08, 1 May 2009 (UTC)

Sources of revenue
"Corrected" here to References of revenue, which is nonsense. This is the second time this has happened; is there some way to encourage AWBers to look before they edit? Can the article be templated to be left alone? Septentrionalis PMAnderson 22:39, 6 May 2009 (UTC)
 * I'd hazard a guess that that's a problem in the general fixes, not in the Typo list. -- JHunterJ (talk) 00:44, 7 May 2009 (UTC)
 * Does not appear to happen in the current version of AWB. --ThaddeusB (talk) 01:46, 7 May 2009 (UTC)
 * It was a header; if there is a subprogram correcting sources to references in headers, I can see why it exists; but urge it be reconsidered. Septentrionalis PMAnderson 02:11, 7 May 2009 (UTC)
 * I know - what I mean is I loaded the page in my current AWB and it didn't try to make the correction. Presumably, this means the "fix" was taken out or fixed to only match "==Sources==" and not "==Sources XXX==" at some point.  I would have to guess that the user who made the change is using an older version or something.  --ThaddeusB (talk) 03:22, 7 May 2009 (UTC)

I suggest you contact the user who made the edit to ask them why it happened. It is not caused by any core AWB functionality. Rjwilmsi 06:44, 7 May 2009 (UTC)
 * Ah, I see you already have. The user in question just needs to improve their logic to make sure 'sources' is the entire text of the heading, rather than just the start of it. Rjwilmsi  06:48, 7 May 2009 (UTC)
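
A hedged sketch of the distinction described above, in Python. The user's actual custom logic is not shown in this thread, so the pattern and the name `rename_sources` are assumptions. The point is only that anchoring both ends of the line makes "Sources" the entire heading text.

```python
import re

# Matches a level-2 heading whose whole text is "Sources".
# [ \t]* is used instead of \s* so that no newline is consumed.
WHOLE_HEADING = re.compile(r"^==[ \t]*Sources[ \t]*==[ \t]*$", re.MULTILINE)


def rename_sources(wikitext: str) -> str:
    """Rename only headings that consist of 'Sources' and nothing else."""
    return WHOLE_HEADING.sub("==References==", wikitext)


if __name__ == "__main__":
    print(rename_sources("==Sources==\ntext\n==Sources of revenue==\nmore"))
```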

&nbsp; before units
I can't see a FAQ around here so... Why is AWB replacing spaces with &nbsp; before units? E.g. "12 mm" to "12&nbsp;mm"? ··gracefool&#9786; 15:28, 10 May 2009 (UTC)
 * So that the unit description doesn't fall on the next line; it will always be right next to the unit value. – xeno  talk  15:34, 10 May 2009 (UTC)

See also WP:NBSP --ThaddeusB (talk) 16:59, 10 May 2009 (UTC)

Saxon possessive plurals
We do womens => women's and childrens => children's; should we also correct mens? (And maybe oxens, vixens, and sheeps?) Rich Farmbrough, 10:24 12 May 2009 (UTC).


 * Unfortunately, it looks like these errors are ambiguous in that half are incorrect plural forms and half are incorrect possessive forms. Thus a typo rule is probably not ideal. --ThaddeusB (talk) 14:35, 12 May 2009 (UTC)
 * I added oxens & sheeps to the manual typo fixing list. Vixens is quite often correct as a proper noun, so I didn't add it. --ThaddeusB (talk) 14:40, 12 May 2009 (UTC)

Fine tuning
Petersberg Agreement is correct. Rich Farmbrough, 01:12, 4 June 2009 (UTC).
 * Huh? What is the correction you want adjusted here? --ThaddeusB (talk) 01:23, 4 June 2009 (UTC)
 * with these edits. -- JHunterJ (talk) 11:07, 4 June 2009 (UTC)

Double spacing
How about removing double spacing? I.e. replacing ".  X" with ". X"? This was mentioned earlier as part of a bunch of changes. So far I've got "\. [ ]+([A-Za-z\[])" → ". $1" but I'll probably find room for improvement. ··gracefool&#9786; 14:43, 10 May 2009 (UTC)
 * The problem with this is some people are running AWB "skip if no typo fix" and then this non-visible change would be considered a typo fix, effectively causing them to break the rule against insignificant edits. – xeno  talk  15:38, 10 May 2009 (UTC)
 * Are there many other non-visible changes like this? If so, we could make a new "skip non-visible changes" checkbox... ··gracefool&#9786; 16:28, 10 May 2009 (UTC)
 * This change is against MOS (unless it has changed since I last read) since we don't endorse one system of spacing over another (2 spaces is standard in American English). Also, it is completely pointless since most browsers compress multiple spaces into one. --ThaddeusB (talk) 17:01, 10 May 2009 (UTC)
 * It won't make a visible change, and is potentially controversial (though given it's not a visible change that seems a contradiction...), so doesn't seem worthwhile. Rjwilmsi  17:21, 10 May 2009 (UTC)
 * Indeed many of us prefer the double space after a full-stop even though it doesn't show. Rich Farmbrough, 10:18 12 May 2009 (UTC).
 * MOS says there is no guideline because it doesn't matter. But obviously it shouldn't be done by itself since that would be breaking the rule against insignificant edits. ··gracefool&#9786; 05:39, 13 May 2009 (UTC)
 * The supposed "rule" of two spaces after sentence-ending punctuation is not standard "American English", whatever that means. It is a hold-over from the bygone days of typewriters, with their (generally) non-proportional fonts. In type-set text, one space has always been the standard (see, e.g., U.S. Government Printing Office Style Manual, 1973, p. 11: "To conform with trade practice, a single justification space (close spacing) will be used between sentences.")--BillFlis (talk) 18:10, 17 June 2009 (UTC)
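
For reference, the rule proposed at the top of this thread can be tried outside AWB. Below is a minimal Python translation (`$1` becomes `\1`; the pattern itself is unchanged), offered only to show what it touches, not as an endorsement of running it. The function name `collapse_sentence_spacing` is made up for this sketch.

```python
import re

# Pattern as quoted in the proposal: a full stop, one space, at least one
# further space, then a letter or an opening square bracket.
PROPOSED = re.compile(r"\. [ ]+([A-Za-z\[])")


def collapse_sentence_spacing(text: str) -> str:
    """Collapse two or more spaces after a full stop into one."""
    return PROPOSED.sub(r". \1", text)


if __name__ == "__main__":
    print(collapse_sentence_spacing("First sentence.  Second.   [[Link]] here."))
```

As noted in the replies, the rendered page looks the same before and after, which is the reason the change counts as an insignificant edit.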

Finally
Note: AWB tried to correct "finnaly" with "finnally", although it's "finally". --bender235 (talk) 14:43, 17 June 2009 (UTC)
 * now. Thanks. Rjwilmsi  16:54, 17 June 2009 (UTC)

I before e except after C
I looked through the list and did not see these, and I think they would be good to add if not there already.
 * Recieved to Received
 * Decieved to Deceived
 * Percieved to Perceived --Kumioko (talk) 18:57, 24 June 2009 (UTC)
 * Those three are already covered. Rjwilmsi  16:36, 25 June 2009 (UTC)
 * I thought they probably were but I couldn't find them so I wanted to ask. --Kumioko (talk) 16:54, 25 June 2009 (UTC)

Archaeology
AWB tried to correct "archeaology" with "archeology", but it should be "archaeology". --bender235 (talk) 20:18, 25 June 2009 (UTC)
 * Archaeology and archeology are both acceptable. -- JHunterJ (talk) 21:44, 25 June 2009 (UTC)
 * But for a good reason all archaeological journals are spelled with "ae", and let's not forget the Wikipedia article is named "Archaeology". --bender235 (talk) 23:05, 25 June 2009 (UTC)
 * We didn't forget. Is there a Wikipedia style guideline for opting for ae? Should we use æ instead? Should we remove "archeology" from archaeology? -- JHunterJ (talk) 00:35, 26 June 2009 (UTC)

Ellipse etc.
I thought about this rule: I'm pretty sure it would work, adding the second 'l' to elipse, elipsis, elipses. I've not just added it for a few reasons: --ospalh (talk) 14:40, 25 June 2009 (UTC)
 * The one-'l' version is apparently correct in a number of languages. For example there seems to be a Serbian band "Elipse".
 * I just went ahead and fixed all the unambiguous cases I could find
 * 'elipse' may be a typo for 'eclipse' as well
 * You can use a negative look-behind to allow Elipse:


 * Did you find cases where elipse was/might have been a typo for eclipse? -- JHunterJ (talk) 14:47, 25 June 2009 (UTC)
 * Yes, two. But that typo isn't too hard to spot.--ospalh (talk) 07:06, 26 June 2009 (UTC)
 * If we want to avoid it (and eclipses typos), we'd be left with just fixing "elipsis":


 * I'm not sure how cautious we should be here. -- JHunterJ (talk) 11:33, 26 June 2009 (UTC)
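
The rules posted in this thread are not preserved here, so the following is only a hypothetical reconstruction of the look-behind idea, written in Python rather than in the typo list's own syntax. The pattern and the name `fix_elipse` are assumptions. The look-behind sits after the candidate match, so it runs only once a possible typo has been found, and it exempts the capitalised "Elipse" (the band).

```python
import re

# Candidate: elipse / elipsis / elipses in either case.
# (?<!Elipse) then rejects a match that ends in the exact string "Elipse".
ELIPSE = re.compile(r"\b([Ee])lips(e|is|es)\b(?<!Elipse)")


def fix_elipse(text: str) -> str:
    """Add the second 'l', leaving the proper name 'Elipse' alone."""
    return ELIPSE.sub(r"\1llips\2", text)


if __name__ == "__main__":
    print(fix_elipse("An elipse, an elipsis, and the band Elipse."))
```

This still rewrites "elipse" where "eclipse" was meant, which is the caution raised above; restricting the alternation to `is` alone would leave only the "elipsis" fix.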

Hindenburg
I wanted to change the -burg rule to include Hindenberg->Hindenburg. (O.K., I did and then undid it.)

There are some typos where Hindenberg should be fixed to Hindenburg. But there is also Basil Cameron, known as "Basil George Cameron Hindenberg" or "Basil Hindenberg". I don't know how to avoid those false positives. I think an extra rule for Basil will not help as we can't be sure of the order the rules are applied.--ospalh (talk) 08:45, 29 June 2009 (UTC)
 * I don't think there is a regexp to determine which Hindenbergs should be changed and which shouldn't. Both spellings appear to be valid surnames, and people are often referred to by just their surname in article bodies. -- JHunterJ (talk) 11:19, 29 June 2009 (UTC)
 * I just used the regexp on its own and most "Hindenberg"s needed to be changed. But there were a few that had to stay. (Most of those did, in a way, mean Hindenburg, too, but were quotes or file names.) So in the end it's too complicated for an automatic rule and should probably not be included.--ospalh (talk) 14:27, 29 June 2009 (UTC)

Journal parameters cleanup
You can look through (WikiProject_Academic_Journals/Journals_cited_by_Wikipedia alphabetical) and see patterns. For example, many journal parameters start with a ' for no reason, others are italicized twice (templates place entries in italics automatically, no need to tell it twice), and so on. Headbomb {{{sup|ταλκ}}κοντριβς – WP Physics} 01:17, 30 June 2009 (UTC)

Capitalisation in URLs
Is there a way we can prevent the capitalisation rules happening inside URLs? ··gracefool&#9786; 04:22, 25 June 2009 (UTC)
 * I have been thinking about the same problem!! Let's wait together for an answer!! --Siddhant (talk) 07:13, 4 July 2009 (UTC)

New words

 * 1) The list already has "tamil" → "Tamil". Can someone add "tamil nadu" → "Tamil Nadu".
 * 2) "indore" → "Indore". (However if the name is in a URL leave it uncapitalized.)
 * 3) "jallandhar" → "Jalandhar". (Wrong spelling of the city name.)

Thanks.--Siddhant (talk) 07:11, 4 July 2009 (UTC)


 * ✅ Jalandhara lists Jallandhar as an alternate spelling, so I don't think we can include it here. Others added with this edit. -- JHunterJ (talk) 20:50, 4 July 2009 (UTC)
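
Both this thread and the "Capitalisation in URLs" question above ask for a rule that fires in prose but not inside a URL. One way to sketch that outside AWB is to cut the text at URLs and apply the rule only to the pieces in between. The URL pattern below is a rough assumption, not AWB's own, and `apply_outside_urls` is a hypothetical name.

```python
import re

# Rough URL matcher: stops at whitespace and common wikitext delimiters.
URL = re.compile(r"https?://[^\s\]<>|}]+")
# Example capitalisation rule from this thread.
RULE = re.compile(r"\bindore\b")


def apply_outside_urls(text: str) -> str:
    """Capitalise 'indore' everywhere except inside URLs."""
    out = []
    last = 0
    for match in URL.finditer(text):
        out.append(RULE.sub("Indore", text[last:match.start()]))  # prose
        out.append(match.group(0))                                # URL kept as-is
        last = match.end()
    out.append(RULE.sub("Indore", text[last:]))
    return "".join(out)


if __name__ == "__main__":
    print(apply_outside_urls("See indore at http://example.com/indore/map today."))
```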

humorous
AWB tried to replace "humourous" with "$umorous", but it should be "humorous". --bender235 (talk) 20:33, 4 July 2009 (UTC)
 * with this edit. -- JHunterJ (talk) 20:41, 4 July 2009 (UTC)

trilogy
AWB tried to replace "trilolgy" with "trilology", yet it should be "trilogy". --bender235 (talk) 15:30, 5 July 2009 (UTC)
 * AWB's reg exp typo tab should tell you which regexp was "hit" for this one. In this case, I suspect -olgy --> -ology as a general suffix hit. I don't think it needs to be changed, although possibly an earlier "trilogy" rule that catches "trilolgy" could be added. Since it appears that you fixed the only instance of "trilolgy" on Wikipedia, I don't think a change is needed. -- JHunterJ (talk) 15:51, 5 July 2009 (UTC)

screenwriter
AWB tried to replace "scrennwriter" with "screennwriter", although it should be "screenwriter". --bender235 (talk) 20:28, 10 July 2009 (UTC)
 * ✅ This will catch it. Thanks Rjwilmsi  07:06, 11 July 2009 (UTC)

Xbox
In Video game multiple console reviews, "XBOX" is all caps and doesn't work as "Xbox". BOVINEBOY 2008 16:31, 16 July 2009 (UTC)
 * Typo fixes are not applied within templates. Do you have an example diff of a problem? Rjwilmsi  16:41, 16 July 2009 (UTC)
 * BOVINEBOY 2008 16:51, 16 July 2009 (UTC)
 * Unfortunately, the link you have posted points to this page? Rjwilmsi  20:37, 17 July 2009 (UTC)
 * sorry. BOVINEBOY 2008 20:44, 17 July 2009 (UTC)
 * Okay, thank you for the link. I'm confused as to why this happened, because for me no typo fixes are applied to the article as I would expect, since the "XBOX" under question is within a template, so is ignored by AWB when applying typo corrections. I can only suppose that the user who made the edit has some customised logic running on AWB that does not implement this standard restriction. Rjwilmsi  22:10, 17 July 2009 (UTC)

I think I figured it out. One of the templates earlier in the article wasn't closed, so it may have voided something. I don't know... Either way thank you for taking note. BOVINEBOY 2008 22:14, 17 July 2009 (UTC)

Moiré/moire
I'm not sure that the rule for moiré should be kept. I think I've found a false positive: Moire (fabric). It's not strictly an error to spell the fabric with an accent, but apparently not standard.--ospalh (talk) 07:51, 20 July 2009 (UTC)
 * with this edit. -- JHunterJ (talk) 10:58, 20 July 2009 (UTC)

"emporer"
"emporer" -> "emperor"  -shirulashem (talk) 12:41, 22 July 2009 (UTC)


 * ✅ along with a few other emperor fixes: --ThaddeusB (talk) 04:52, 25 July 2009 (UTC)

Regex out of links
Hello! I'm trying to write a regex that does not match inside links or templates.

the example is: string is : "    a   b   c d  d  c  "

The match should be only the c outside the links (the bolded one).

Thanks for helping--Zorlot (talk) 04:17, 25 July 2009 (UTC)
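
The markup in the example string above has not survived, so here is a sketch on a made-up string instead. It uses the common alternation trick: links and templates are matched first and handed back untouched, and only a match of the last alternative is replaced. This needs a replacement callback, so it is written in Python, and the name `replace_outside` is illustrative. Nested templates are not handled.

```python
import re

# Alternatives are tried left to right at each position: a whole [[link]],
# a whole {{template}}, or (captured in group 1) a standalone "c".
PATTERN = re.compile(r"\[\[.*?\]\]|\{\{.*?\}\}|(\bc\b)", re.DOTALL)


def replace_outside(text: str, replacement: str) -> str:
    """Replace standalone 'c' only where it is outside links and templates."""
    def repl(match):
        if match.group(1) is not None:
            return replacement      # the target, found in plain text
        return match.group(0)       # a link or template, returned unchanged
    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    print(replace_outside("[[a c]] {{b|c}} c d", "X"))
```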

à la
Please add a typo fix for "à la", replacing things like "a la", "a lá" or "ala". --bender235 (talk) 09:49, 31 July 2009 (UTC)
 * Ala appears to have some legitimate lowercase usages. Otherwise ✅ with this edit -- JHunterJ (talk) 11:29, 31 July 2009 (UTC)

(equals) (equals) (space)
I use autoed when I edit, and one of the edits it recommends a lot is deleting the space that is often between the "==" (of the header) and the actual section name. The proper format for a section header is ==sectionname==, NOT: ==(space)sectionname(space)==

so this is essentially two rules (as there needs to be one rule for the two equal signs on either side of the page)

replace "== " with "==" and Replace " ==" with "=="

Tell me what you think, and if I need to elaborate--Tim1357 (talk) 17:23, 6 August 2009 (UTC)
 * There's no consensus for this change, and as it's not visible to an article reader I don't see much value in it. Rjwilmsi  18:33, 6 August 2009 (UTC)

Clinitian -> Clinician
eh. –xenotalk 18:44, 7 August 2009 (UTC)
 * No hits in WP search for "clinitian". Did you already fix a bunch of them? -- JHunterJ (talk) 12:12, 8 August 2009 (UTC)
 * I just saw someone use it once. Is this only for common typos? –xenotalk  19:34, 13 August 2009 (UTC)

notally?
AWB tried to replace "notaly" with "notally" here, although IMO it should've been "notably". --bender235 (talk) 09:39, 8 August 2009 (UTC)
 * This was the application of a suffix rule. (There's a tab in AWB that will display the rules that had matches on the current page; it can be helpful to include that info.) But I don't think it's a prevalent-enough typo to need a separate fix; you fixed the only occurrence. -- JHunterJ (talk) 12:10, 8 August 2009 (UTC)
 * Well, I added it anyway. Rjwilmsi  12:12, 8 August 2009 (UTC)

Acheievment?
AWB also tried to replace "acheievment" with "acheievement" here, although it should've been "achievement". --bender235 (talk) 11:05, 8 August 2009 (UTC)
 * ✅ with this edit. -- JHunterJ (talk) 12:23, 8 August 2009 (UTC)

"Passed away"
As I got reverted.. isn't this page supposed to be used to fix typos, and not to enforce WP:EUPHEMISM? We could just as well add a rule that exchanges "perversion" with "paraphilia". --Conti|✉ 19:28, 13 August 2009 (UTC)
 * Yes, it's a bit of a stretch to include style changes in a typo list. Maybe this should be included with WP:FRONDS instead? –xenotalk 19:33, 13 August 2009 (UTC)
 * Hmm, that would be fine by me. I don't think terms like these should be replaced among all the typos (which means that people won't really think about whether "passed away" might be appropriate after all in some situations). --Conti|✉ 19:40, 13 August 2009 (UTC)
 * Conti, I reverted your removal of "passed away", but I think your argument has merit and should be discussed. I remember the first time I saw the plugin change pass away to die and I was pretty surprised. After all, like you and xeno said, "pass away" isn't a typo. The reason I was so quick to revert your change is that I felt removing an entry like that needed some discussion first, and in the meantime, you can just do what I do: ignore the change that AWB wants to make when it comes across "pass away" in an article.  -shirulashem (talk) 19:46, 13 August 2009 (UTC)
 * Well, if we're all in agreement, what about making that change, then? :) Usually this list is only used for things that are blatantly wrong, and so far I only had to cancel a change because this list wasn't perfect, not because I disagree with it. And I'd like it to stay that way. --Conti|✉ 20:03, 13 August 2009 (UTC)
 * ✅... Are there any more like this? –xenotalk 20:23, 13 August 2009 (UTC)
 * None that I know of, at least. Thanks. :) --Conti|✉ 20:57, 13 August 2009 (UTC)
 * Hmm. I'm not convinced that a whole hour and a half represents sufficient time to debate this issue and arrive at consensus. Here's the debate which led to the introduction of "passed away" et al. I think the argument is as strong now as then for its inclusion on grounds of policy (Words to avoid and suitability AutoWikiBrowser/Typos). --Tagishsimon (talk) 21:06, 13 August 2009 (UTC)
 * First of all, don't confuse guidelines with policy. In most cases, "died" is more appropriate than "passed away", I don't disagree with that. But I still disagree with including this entry here for two reasons: a) As I said above, this is the typo list, it contains terms that need to be fixed and are wrong 100% of the time. Which leads to b) "passed away" is usually not appropriate, but not always. Plot summaries come to mind, and of course quotes (or are we supposed to add a [sic] to someone being quoted as "He passed away", like we do with all typos?). Adding this term to WP:FRONDS instead, which people can use to hunt for badly phrased sentences, sounds much better to me. --Conti|✉ 21:14, 13 August 2009 (UTC)
 * Behold, it came to pass that three hundred and twenty years had passed away, and the more wicked part of the Nephites were destroyed –xenotalk 21:18, 13 August 2009 (UTC)
 * That's why editors need to preview EVERY tool edit before they make them, because there are times that the suggested edit will be wrong. Also, I agree with Tagishsimon. The discussion began, I left my office to commute home, ate a slice of pizza, turned on my computer, and the discussion was over and the change was made. I think it needs to be discussed more.  -shirulashem (talk) 00:54, 14 August 2009 (UTC)
 * My concern is that we're moving from typos to enforcing stylistic changes. Perhaps a different checkbox should be created for this, so editors don't blindly approve the fixes (even though they aren't supposed to). There's a reason the "phrases" section was, until "passed away", empty. It's a bit of a different bird. I'm not particularly fussed though, so if you want to put it back in while more people weigh in, I won't consider it to be edit warring or anything. –xenotalk 00:57, 14 August 2009 (UTC)
 * (EC with Xeno) And, Conti, you have yet to demolish a couple of arguments: 1) AWB/Typos has for a long time had a section for "incorrect phrases", which seems to indicate an intention to deal with incorrect phrases. According to the guidelines, passed away et al are incorrect phrases. 2) Your own typo and [sic] argument reveals, per Shirulashem, that there are instances where 100% turns into slightly less than 100%; you're probably as likely to get a false positive with a conventional typo regex as you are with this phrase regex. I do take Xeno's point that phrases are a different kind of bird, but am concerned that WP:FRONDS is too immature to be considered a solution. Like Xeno, I'm happy that we keep passed away removed while we discuss; the discussion is more important than whether passed away happens to be in or out as we discuss. --Tagishsimon (talk) 01:07, 14 August 2009 (UTC)
 * 1) Yes, and as far as I can see, it has been empty from the day it's been added. Regardless of whether there ever was an intention to use this page to fix incorrect phrases, I simply disagree with the use of this page for that purpose. 2) I disagree here, too. Just do a search for "passed away" and see how many false positives you can find. There are a lot more than you will find when searching for actual typos. --Conti|✉ 08:52, 14 August 2009 (UTC)

&larr; Best of both worlds: Wikipedia talk:AutoWikiBrowser/Feature requests. –xenotalk 01:17, 14 August 2009 (UTC)
 * Yup, that'd be a fine solution. --Tagishsimon (talk) 01:50, 14 August 2009 (UTC)

occasionally
AWB tried to replace "ocaissionaly" with "ocaissionally", but it should've been "occasionally". --bender235 (talk) 22:05, 18 August 2009 (UTC)
 * There's a tab in AWB that will show you which rule matched. In this case, I'm betting it was a suffix rule replacing -aly with -ally, which indeed did the expected thing here. Are you suggesting the addition of a new fix to apply to "ocaission"? -- JHunterJ (talk) 23:18, 18 August 2009 (UTC)
 * Sure, if that's what's necessary. --bender235 (talk) 12:52, 19 August 2009 (UTC)
 * One of the project "to-dos" is to remove rare words. The only instance of "ocaissionaly" has been fixed. I'm postulating that the addition of a fix for it is not necessary. -- JHunterJ (talk) 21:12, 20 August 2009 (UTC)

Search method
How does AWB search for typos? Does it search the wikisource of the page or actual text that we see on article tab? Thanks! —Preceding unsigned comment added by 70.26.3.12 (talk) 00:02, 25 August 2009 (UTC)
 * I need this information because I am trying to develop a list of typos for a different language Wikipedia. —Preceding unsigned comment added by 70.26.3.12 (talk) 09:29, 25 August 2009 (UTC)
 * I believe AWB will look for typos in the wikitext itself, not the displayed text. –xenotalk 21:37, 29 August 2009 (UTC)

Linament instead of the correct liniment
Some people keep changing the correct spelling of liniment to "linament", using AWB in the article Slough. As you can see there is even an article for it with the correct spelling. Obviously this must be spelt incorrectly in the Browser, it wouldn't occur otherwise. Please, someone, change this. Dieter Simon (talk) 23:13, 3 September 2009 (UTC)
 * This is the line, but I have no idea how to add an exception. –xenotalk 23:22, 3 September 2009 (UTC)

<Typo word="-ament" find="\b([Ff]il|[Ll]i[gn]|[Tt]est|[Tt]ourn)ia?ment(s?|ary)\b" replace="$1ament$2"/>
 * with this edit. -- JHunterJ (talk) 03:21, 4 September 2009 (UTC)
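
To see why the quoted rule catches "liniment", it can be run in Python (with `$1`/`$2` written as `\1`/`\2`). The `[Ll]i[gn]` alternative accepts "lin", so "liniment" is rewritten like any other -iment misspelling. The edit that fixed this is not shown here; `FIND_NARROWED` below is only one hypothetical way to add the exception, by dropping `n` from that character class.

```python
import re

# The rule exactly as quoted above.
FIND = r"\b([Ff]il|[Ll]i[gn]|[Tt]est|[Tt]ourn)ia?ment(s?|ary)\b"
# Hypothetical narrowing (an assumption, not the actual fix): "lig" only.
FIND_NARROWED = r"\b([Ff]il|[Ll]ig|[Tt]est|[Tt]ourn)ia?ment(s?|ary)\b"
REPLACE = r"\1ament\2"


def apply_rule(find: str, text: str) -> str:
    """Apply one find/replace pair to the text."""
    return re.sub(find, REPLACE, text)


if __name__ == "__main__":
    for word in ("tourniment", "ligiments", "liniment"):
        print(word, "->", apply_rule(FIND, word), "|", apply_rule(FIND_NARROWED, word))
```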

Enmáscarado
In the common beginnings section, there is code which changes Enm to Emm. Could someone add Enmascarado and Enmáscarado to the list of exceptions? Thanks! Plastikspork ―Œ (talk) 15:10, 16 September 2009 (UTC)
 * Those should probably be better exempted by wrapping them in templates, since they aren't in English usage (they don't appear in the destination article, for example). -- JHunterJ (talk) 18:20, 16 September 2009 (UTC)
 * It's a Spanish word and it's a pain to fix it every time someone uses AWB and comes by the articles, I've had to do it like 5 times in the last 2-3 months alone.  MPJ-DK  (No Drama) Talk 18:34, 19 September 2009 (UTC)
 * ✅ with this edit. To illustrate what I was saying before, I also blocked the possibility of AWB altering it on one page with these edits. AWB won't "fix" foreign-language text that's identified as foreign language text by the use of the lang template. -- JHunterJ (talk) 19:50, 19 September 2009 (UTC)
 * Thank you, I really appreciate it.  MPJ-DK  (No Drama) Talk 20:03, 19 September 2009 (UTC)

webiste - website
I've done this manually, but think it could be botted for the future.  Ϣere Spiel  Chequers  20:13, 18 September 2009 (UTC)
 * ✅ with this edit -- JHunterJ (talk) 20:22, 19 September 2009 (UTC)

Petersberg - Petersburg
In this edit, an AWB user replaced a "Petersberg" referring to Petersberg, Hesse into "Petersburg" claiming it was a typo. I'm not sure this spelling should be included in the list; it might do more harm than good, considering that there are several plausible legitimate uses of Petersberg. — JAO • T • C 09:18, 6 October 2009 (UTC)
 * with this edit -- JHunterJ (talk) 10:36, 6 October 2009 (UTC)

Two additions
✅ with these edits -- JHunterJ (talk) 12:52, 19 October 2009 (UTC)
 * 1) AWB does not detect "fondation" as misspelling of "foundation", but it should.
 * 2) AWB detects "Musial" (as in Stan Musial, for example) as misspelling of "musical", but it should not. --bender235 (talk) 12:11, 19 October 2009 (UTC)

earnign
earnign currently corrects to eearning. Can't see how to fix that myself. --Closedmouth (talk) 14:36, 21 October 2009 (UTC)
 * with this edit. -- JHunterJ (talk) 14:47, 21 October 2009 (UTC)
 * Ta. I should also mention that advertizing is correcting to advertising. I'd remove the rule myself, but I'm not sure if it's just faulty. --Closedmouth (talk) 14:51, 21 October 2009 (UTC)
 * Advertising doesn't mention "advertizing". Is it an acceptable alternate spelling? -- JHunterJ (talk) 21:59, 21 October 2009 (UTC)
 * Isn't that how the Americans spell it? --Closedmouth (talk) 07:03, 22 October 2009 (UTC)
 * No. -- JHunterJ (talk) 11:09, 22 October 2009 (UTC)
 * Well, I'm an idiot. --Closedmouth (talk) 23:19, 22 October 2009 (UTC)
 * Nah, English is just a really goofy language. :) --ThaddeusB (talk) 00:35, 23 October 2009 (UTC)

Fenerbahçe
Please add "Fenerbahçe" as fix of "Fenerbahce". --bender235 (talk) 08:48, 31 October 2009 (UTC)
 * ✅ this edit Paradoctor (talk) 12:48, 31 October 2009 (UTC)

RegExp documentation
It took me a while to find the external link to the syntax summary on the AWB home page. For the benefit of those who know the RegExp principles, but are not acquainted with Microsoft's take on it, I suggest: Paradoctor (talk) 12:17, 31 October 2009 (UTC)
 * linking the documentation: official MSDN documentation (is this standardized somewhere?) Well House summary
 * clarifying which elements of the Microsoft mess should not be used, and which ones should be avoided, be it for performance reasons or compatibility or whatnot
 * making our own summaries for both the quick and dirty and for the advanced messers

Collection maintenance
A few automation suggestions: Paradoctor (talk) 13:14, 31 October 2009 (UTC)
 * 1) Checking the list for expressions that match their output, i. e. matching "foo", then replacing it with "foo".
 * 2) Overlapping search expressions, e. g. if someone added a rule "ibm" -> "ibn", this would clash with "ibm" -> "IBM".
 * 3) Crawling redirects, disambiguation pages and AJAX suggestions (from search results) for useful information.
 * 4) Utilities that convert between regexps and lists of match-replacement pairs, for not-too-complex rules this could save a lot of headaches for beginners, and time for advanced users, and these cases make up the vast majority of rules.
 * 5) Writing the above it occurred to me that a simple wizard would probably be the simplest solution: You enter a match and/or replacement term, and the wizard shows you whether the matchword already has a replacement, or what words match to a given replacement term. Then, you get to choose the appropriate editing options. Forming efficient regexps can be left to the software. How does that sound?
 * First one is already done by AWB. The rest sounds great, but hard to do. Rjwilmsi  14:14, 31 October 2009 (UTC)
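The first two checks in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AWB's own check: it assumes rules are written as `<Typo … />` lines like those quoted on this page, it uses Python's `re` module as a stand-in for the .NET regex engine (the two differ in places), and the sample rules and the helper names `parse_rules`, `self_matching` and `clashing` are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules for illustration only; they are not taken from the live list.
RULE_LINES = [
    r'<Typo word="Foo" find="\bfoo\b" replace="foo" />',
    r'<Typo word="IBM" find="\bibm\b" replace="IBM" />',
    r'<Typo word="Ibn" find="\bibm\b" replace="ibn" />',
]

def parse_rules(lines):
    """Parse Typo elements into (word, compiled pattern, Python-style replacement)."""
    rules = []
    for line in lines:
        node = ET.fromstring(line)
        # .NET writes group references as $1; Python's re.sub expects \g<1>.
        repl = re.sub(r"\$(\d+)", r"\\g<\1>", node.get("replace"))
        rules.append((node.get("word"), re.compile(node.get("find")), repl))
    return rules

def self_matching(rules, samples):
    """Rules whose output for a sample is still matched by the same rule."""
    found = []
    for word, pattern, repl in rules:
        for sample in samples:
            if pattern.search(sample) and pattern.search(pattern.sub(repl, sample)):
                found.append((word, sample))
    return found

def clashing(rules, samples):
    """Samples that more than one rule would rewrite, with the rule names."""
    found = []
    for sample in samples:
        hits = [word for word, pattern, _ in rules if pattern.search(sample)]
        if len(hits) > 1:
            found.append((sample, hits))
    return found

rules = parse_rules(RULE_LINES)
samples = ["foo", "ibm"]
print(self_matching(rules, samples))
print(clashing(rules, samples))
```

Both checks only report problems that show up on the sample words supplied, so they depend on a reasonable word list; they do not prove that two patterns never overlap.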

Welcome
Things like "wellcome" should be corrected with "welcome". --bender235 (talk) 17:00, 1 November 2009 (UTC)


 * Some would not Wellcome that. ;) Paradoctor (talk) 22:16, 1 November 2009 (UTC)


 * Okay. --bender235 (talk) 11:23, 7 November 2009 (UTC)
 * A case-sensitive search on "wellcome" could be added though. -- JHunterJ (talk) 13:51, 11 November 2009 (UTC)

Eulerian
Please don't correct an Eulerian to *a Eulerian. Being Swiss, he isn't pronounced like that. —Blotwell 23:28, 3 November 2009 (UTC)


 * Being the one who erroneously edited several articles, replacing "an Eulerian" with "a Eulerian" using AWB, I support. --bender235 (talk) 11:27, 7 November 2009 (UTC)
 * ✅ Fixed. Rjwilmsi  12:35, 7 November 2009 (UTC)

Addition to geographic Canada
Someone should add "Mississauga", "Calgary", "New Brunswick", "Nova Scotia", "Prince Edward Island", and "Edmonton". tablo (talk) 22:20, 8 November 2009 (UTC)
 * What are their common misspellings? -- JHunterJ (talk) 13:50, 11 November 2009 (UTC)

continous
Hi, can we add continous as a typo for continuous?  Ϣere Spiel  Chequers  12:35, 9 November 2009 (UTC)
 * The "(Dis)Continuous" rule already catches it. Rjwilmsi  10:33, 11 November 2009 (UTC)
 * Oh I thought that any typo with examples more than two months old was probably not on the list. As a rule of thumb how old would you suggest examples need to be for a typo not to be on the list?  Ϣere Spiel  Chequers  13:33, 11 November 2009 (UTC)
 * There's no rule of thumb. Typos older than the last time an editor used AWB with RETF enabled are probably not on the list. The way to check to see if it's on the list is to point AWB at the page (with RETF enabled) and see if it catches it. Since AWB usage is human-initiated, not automatic, a page that is five years old but hasn't bubbled up to some AWB editor's list won't get corrected. -- JHunterJ (talk) 13:49, 11 November 2009 (UTC)

False Positives in Sixteenth-Century Titles
Hi, is there a way to stop people using AWB to change sixteenth-century spellings in titles of sixteenth-century books into modern spelling? I keep reverting the corrections of agenst → against, breif → brief, mariage → marriage in the article on George Joye but people using AWB keep changing it back without thinking or reading the text in context. A note in the Discussion page did not help. GJ1535 (talk) 09:41, 11 November 2009 (UTC)
 * One option would be to use sic with the 'hide' flag on. Rjwilmsi  09:50, 11 November 2009 (UTC)
 * Thanks a lot. That helps a lot! GJ1535 (talk) 16:31, 11 November 2009 (UTC)

Not sure where to report this, but a similar incident is this change of the name of a painting. Perhaps the script could be careful when the suspect word is capitalized, indicating a name? Skomorokh, barbarian  15:50, 14 November 2009 (UTC)


 * It certainly could ignore capitalization, but the trade-off, obviously, would be actual errors going uncaught.
 * Another suggestion would be to contact the offending party and ask them to be a bit more careful. --ThaddeusB (talk) 16:51, 14 November 2009 (UTC)

Misc typofix suggestions
Some possibly autofixable typos I came across, and suggested correction. --HamburgerRadio (talk) 01:57, 16 November 2009 (UTC)
 * positionned positioned
 * crittercism criticism
 * successsful successful
 * definately definitely
 * posotive positive
 * Retrivied Retrieved
 * lilie lily
 * alzheimers Alzheimer's
 * privides provides
 * battlions battalions
 * determin determine

Les Mis typo
I just scanned an article about Les Miserables, where every occurrence of "Rue Plumet" was suggested to be changed to "Rue Plummet". Opinions on the best way to handle this going forward? --SarekOfVulcan (talk) 19:14, 25 November 2009 (UTC)
 * Which rule was catching it? We can address the rule, and/or the "Rue Plumets" can be tagged as French with the lang template. -- JHunterJ (talk) 19:29, 25 November 2009 (UTC)
 * Exception added to rules. Rjwilmsi  10:45, 8 December 2009 (UTC)

based of -> based on
This rule:

" <Typo word="Based (off) of" find="\b(B|b)ased\s+(off\s+)?of\b" replace="$1ased on" /> "

produced a false positive: it tried to fix "... the most dynamic, action-based of these ..." to "... the most dynamic, action-based on these ..." (from "Bacone school").

I'm not sure how often "based of" is part of a " based of " construction and how often it should be changed to "based on".--ospalh (talk) 14:04, 7 December 2009 (UTC)
 * Would a negative look-behind matching the hyphen to avoid fixes of "-based" suit this problem? -- JHunterJ (talk) 14:34, 7 December 2009 (UTC)
 * Looks like a good idea.--ospalh (talk) 15:10, 7 December 2009 (UTC)

with this edit. -- JHunterJ (talk) 22:43, 7 December 2009 (UTC)
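The look-behind idea discussed above can be tried out as follows. This is a minimal sketch in Python rather than the .NET engine AWB uses, and the guarded pattern is an assumption built from the suggestion in this thread; the rule that actually went live is not shown on this page.

```python
import re

# The rule as quoted at the top of this thread.
ORIGINAL = re.compile(r"\b(B|b)ased\s+(off\s+)?of\b")
# Assumed variant: the same pattern, refused when a hyphen comes right before "based".
GUARDED = re.compile(r"(?<!-)\b(B|b)ased\s+(off\s+)?of\b")

def fix(pattern, text):
    # \1 keeps the original capitalisation of the leading B/b.
    return pattern.sub(r"\1ased on", text)

false_positive = "the most dynamic, action-based of these"
print(fix(ORIGINAL, false_positive))   # the reported false positive
print(fix(GUARDED, false_positive))    # left alone by the guarded variant
print(fix(GUARDED, "It is based off of a novel."))
```

The look-behind is only evaluated at positions where the rest of the pattern could start, so it does not rescan the text before every candidate.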

Better typo correction
AWB changes indias to Indias, but it should change indias to India's (with an apostrophe). Please correct it. --Siddhant (talk) 19:05, 8 December 2009 (UTC)
 * indias -> Indias is a good change. Indias -> India's has the possibility of false positives. http://www.google.com/search?q=%22many+indias, for example. -- JHunterJ (talk) 01:35, 9 December 2009 (UTC)
 * I understand. Thanks for explaining. --Siddhant (talk) 16:03, 9 December 2009 (UTC)

correction for "indiscernible"
The replacement yields "iiscernible" as it now stands. Also the word "indiscernible" may exist as a stray on a line above. LilHelpa (talk) 01:58, 17 December 2009 (UTC)
 * ✅ Thanks for reporting – fixed. Rjwilmsi  07:57, 17 December 2009 (UTC)

Wikipedia:Lists of common misspellings
This program can incorporate data from Lists of common misspellings. -- Wavelength (talk) 16:14, 17 December 2009 (UTC)

"concerned" is being changed to "concearned"
And I can't find why. Actually, could be "concerning" to "concearning" or both. Can't recall. LilHelpa (talk) 01:40, 21 December 2009 (UTC)
 * There's a regexp tab in AWB that will tell you which patterns hit. (BTW, adding comments to talk pages isn't a minor edit.) -- JHunterJ (talk) 02:11, 21 December 2009 (UTC)
 * Bah, it's my own setting, not one from the list. Nevermind. Sorry. LilHelpa (talk) 00:20, 22 December 2009 (UTC)
 * No problem. Please, though, don't mark talk page comments as minor. See WP:MINOR. Thanks. -- JHunterJ (talk) 01:46, 22 December 2009 (UTC)
 * Will try to remember that. Difficult when almost everything I do is minor ;) LilHelpa (talk) 00:14, 23 December 2009 (UTC)

himself herself
Please would somebody add himslef herslef themsleves. Kittybrewster  ☎  09:05, 23 December 2009 (UTC)
 * ✅ Corrected existing rule here. Thanks Rjwilmsi  09:51, 23 December 2009 (UTC)

Error: enployed → empployed
AWB erroneously fixes "enployed" to "empployed" using the "Emp-" beginning rule. M AN d ARAX •  XAЯA b ИA M  11:26, 26 December 2009 (UTC)
 * ✅ That should fix it. Rjwilmsi  11:39, 26 December 2009 (UTC)

Women's'
" Womens' " gets changed to " women's' " instead of " women's ". (Spaces added so you could actually see what I was talking about.) --Closedmouth (talk) 14:24, 26 December 2009 (UTC)

Capitalization of titles in other languages
A recent edit at Nicole Oresme cleaned up a lot of things, but incorrectly changed the word latin to Latin in the title of the following book in French. French usage minimizes capitalization, and the lower cased latin was correct. Is there a way to make your capitalization changes language sensitive? Thanks. --SteveMcCluskey (talk) 22:01, 1 January 2010 (UTC)
 * Wolowski, ed., Traictié de la première invention des monnoies de Nicole Oresme, textes français et latin d'après les manuscrits de la Bibliothèque Impériale, et Traité de la monnoie de Copernic, texte latin et traduction française (Paris, 1864)
 * I would have updated the article but you've now removed those references. The answer is to use the lang template to enclose the foreign-language text. e.g.  Smith, F. Quelques mots en français . Rjwilmsi  22:35, 1 January 2010 (UTC)

Purportrated
The word "purpotrated" gets fixed to "purportrated" by

It should, of course, become "perpetrated". I don't know if it's worth making a new rule for this uncommon typo, but I do think the "Purport" rule should be fixed so it doesn't catch it. M AN d ARAX •  XAЯA b ИA M  23:11, 1 January 2010 (UTC)
 * ✅ "Purport" rule updated. Rjwilmsi  00:16, 3 January 2010 (UTC)

"à la" fix disabled
I've disabled the fix for "à la" because there are lots of false positives (particularly on Spanish/Italian text). If we are to keep the rule we need to find a more restrictive version with many fewer false positives. Rjwilmsi 17:28, 3 January 2010 (UTC)

rhtyhmic not detected
For some reason, AWB did not find "rhtyhmic" as a misspelling of "rhythmic" here. Could someone please fix that? --bender235 (talk) 13:57, 5 January 2010 (UTC)
 * ✅ here. Thanks Rjwilmsi  14:47, 5 January 2010 (UTC)

dissigner
"disigner" gets changed to "dissigner" instead of "designer" for some reason. --Closedmouth (talk) 13:56, 9 January 2010 (UTC)
 * That would be because of this prefix regex:


 * <Typo word="Dissi-" find="\b(D|d)isi([a-ko-z]|m[a-nq-z])(\w+)\b" replace="$1issi$2$3" />


 * Doing a little back of the envelope testing with an English wordlist, I anticipate at least 50 words that if spelled incorrectly, could be transformed nonproductively (like your example), including things like:
 * desiccant
 * designed
 * designator
 * designer
 * desirable


 * The good news is they were spelled wrong before, so a new rule could anticipate those extra S's. I could propose one, but it would require more testing than I can do right now before going live.


 * I count only a few that should have one s and will be made incorrect (disidentify, disimitate, disimitation), and about 30 words that the filter will correct (dissimilar, dissipated, dissipation).


 * Those somewhat strange rules in the middle serve to exclude about 100 or so correct words that would be changed. These include words like disinterested, disinfect, disincline.


 * One possible solution is to remove r from the range: \b(D|d)isi([a-ko-qs-z]|m[a-nq-z])(\w+)\b (the added part is in bold). This eliminates about half of the problem words, including your example, and only eliminates two of the legitimate corrections. This is at a cost of about 5% of the legitimate corrections.


 * Again, these are all estimates and don't take into account the frequency with which the words are used, which is a big factor. Shadowjams (talk) 06:27, 13 January 2010 (UTC)


 * There already is a fix for "Design", and it should fix the misspelling above.


 * It is: <Typo word="Design" find="\b(D|d)[ei]s(?:sigi?n|gin|ing)(s?|ed|ers?|ing)\b" replace="$1esign$2" />


 * Interestingly enough though, it won't catch these misspellings: desigins, desiginer, desiging. I think that could be added (add |igin after |ing) without breaking anything, but I can't test it right now. Shadowjams (talk) 07:21, 13 January 2010 (UTC)
 * ✅ Design rule expanded for 'disign'. Rjwilmsi  08:24, 13 January 2010 (UTC)
 * Thanks guys :) --Closedmouth (talk) 08:28, 13 January 2010 (UTC)
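The two rules quoted in this thread can be run side by side to see which one touches which spelling. The sketch below uses Python's `re` as a stand-in for the .NET engine, and the test words are chosen for illustration. Under that assumption, the "Design" rule as quoted does not appear to match "disigner", which would be consistent with the rule having been expanded for 'disign' afterwards.

```python
import re

# Both patterns are copied from the rules quoted above.
DISSI = re.compile(r"\b(D|d)isi([a-ko-z]|m[a-nq-z])(\w+)\b")
DESIGN = re.compile(r"\b(D|d)[ei]s(?:sigi?n|gin|ing)(s?|ed|ers?|ing)\b")

def dissi(word):
    # .NET "$1issi$2$3" written with Python group references.
    return DISSI.sub(r"\1issi\2\3", word)

def design(word):
    # .NET "$1esign$2" written with Python group references.
    return DESIGN.sub(r"\1esign\2", word)

for word in ["disigner", "disimilar", "disinfect"]:
    print(word, "->", dissi(word))
for word in ["desgin", "dessigner", "desinged", "disigner"]:
    print(word, "->", design(word))
```

The character classes in the "Dissi-" rule skip "l", "m" (unless followed by a letter other than "o" or "p") and "n" after "disi", which is why "disinfect" is left alone while "disigner" is not.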

Rule for "platform"
Please add a rule that corrects things like "plattform" or "plataform" to "platform". --bender235 (talk) 22:37, 16 January 2010 (UTC)
 * That's a start. Rule added for the two misspellings you give. Rjwilmsi  23:08, 16 January 2010 (UTC)


 * Please check usage beforehand. The variant "Plattform" has numerous legitimate uses, among them PlattForm Advertising. "plataform" looks good, only exception seems to be PLATAFORM BL

"long tenured" -> "long-tenured"
Please add a rule that replaces "long tenured" with "long-tenured". Thanks. --bender235 (talk) 14:05, 11 January 2010 (UTC)
 * Anyone? --bender235 (talk) 20:07, 19 January 2010 (UTC)

"approximatley" -> "approximately"
Could someone please add that rule? AWB did not detect it here (I changed it by hand). --bender235 (talk) 00:59, 17 January 2010 (UTC)
 * ✅ Here. Rjwilmsi  08:52, 17 January 2010 (UTC)

Spelling corrections in URLs
AWB constantly tries to correct spellings in URLs, like "www.xyz.com/india" -> "www.xyz.com/India". Can this be prevented? --bender235 (talk) 19:50, 16 January 2010 (UTC)
 * Possibly. Do you have examples of an article with such a problem? Rjwilmsi  20:43, 16 January 2010 (UTC)
 * E.g. Concepcion Quetzaltepeque El Salvador. AWB tried to correct "http://www.lonelyplanet.com/worldguide/destinations/central-america/el-salvador/essential?a=culture" to "http://www.lonelyplanet.com/worldguide/destinations/central-America/el-salvador/essential?a=culture" --bender235 (talk) 22:36, 16 January 2010 (UTC)
 * Hmm, if that page were reformatted to use external links or citation templates the typo fixing would know to leave the URLs alone. Rjwilmsi  23:10, 16 January 2010 (UTC)
 * ✅ There was an AWB bug report, which has been fixed for the next release. Rjwilmsi  20:17, 20 January 2010 (UTC)

Dates in succession boxes
Hi, I have noticed that AWB removes the spaces between the years and the – in succession boxes, which is contrary to WikiProject_Succession_Box_Standardization/Guidelines (point vii, a). Could someone please fix this. Thanks Phoe   talk   19:38, 20 January 2010 (UTC)
 * That guideline contravenes WP:YEAR. Rjwilmsi  20:13, 20 January 2010 (UTC)
 * Yes, but WP:Year doesn't mention succession boxes. Additionally sometimes full dates are used in succession boxes (for example in articles about music albums or about boxes), which wouldn't come under WP:Year, but perhaps rather under MOS:DOB. Finally MOS:DASH allows exceptions in lists, so why shouldn't it also apply to succession boxes? (Yes I know that succession boxes are an odd version of a list). If I have not convinced you, then please consider this as settled. Best wishes Phoe   talk   20:48, 20 January 2010 (UTC)
 * Yes, date ranges are different to year ranges. WP:DASH perhaps has a clearer explanation of why. AWB is not removing spaces in date ranges, only year ranges. The WP:DASH exception for lists is the extra use of en dashes, not an exception to allow year ranges to be spaced. Rjwilmsi  21:16, 20 January 2010 (UTC)
 * I agree with Rjwilmsi on this one. --bender235 (talk) 16:20, 23 January 2010 (UTC)

"whoom" -> "whom"
Could someone please add that rule? Thanks. --bender235 (talk) 16:21, 23 January 2010 (UTC)


 * Testing: \b([Ww])hoo+m\b => $1hom right now. I'll see if there are problematic false positives. Shadowjams (talk) 00:01, 25 January 2010 (UTC)


 * That expression works, but it's not a common typo. Scanning the November database dump I only find that misspelling used in 4 articles, 2 of which are intentional, and 2 of which I corrected. The two misspellings were added by one editor. I'm going to hold off adding it. Shadowjams (talk) 01:55, 25 January 2010 (UTC)
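For anyone who wants to try the expression on their own text before deciding, here is a minimal sketch in Python (assumed here as a stand-in for the .NET engine; the sample sentences are made up).

```python
import re

# Pattern and replacement from the test above; "$1hom" becomes r"\1hom" in Python.
WHOM = re.compile(r"\b([Ww])hoo+m\b")

def fix_whom(text):
    return WHOM.sub(r"\1hom", text)

print(fix_whom("To whoom it may concern"))
print(fix_whom("Whooom did you ask?"))
print(fix_whom("whoomp"))  # no word boundary after the m, so nothing changes
```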

"intitled" -> "entitled"
Please add that one. Found it here, but AWB did not detect (changed it manually). --bender235 (talk) 22:03, 27 January 2010 (UTC)


 * Make sure not to correct "intitle", this is used in query strings for Google Books URLs, e. g. Populares. Paradoctor (talk) 22:34, 27 January 2010 (UTC)


 * Do you have any indication that there is another rule that generally handles this, but didn't in this case? I can't find (in a very quick search, admittedly) a rule that would have matched this. I'll work on a new one, but if there's an old one that should have gotten it, knowing that would be very helpful. Shadowjams (talk) 08:26, 29 January 2010 (UTC)


 * Ok, this should work. I don't want to add it in quite yet because I haven't tested it very much, but feel free to add it to your add/replaces, and if you don't see any problems then go ahead and add it to the typo list.


 * I'm not 100% sure that "intitled" is a typo, the dictionary references I looked up were a little unclear. But I don't think it's a problem edit either. In most cases "entitled" is going to be more right than "intitled", although I wonder if there are cases where "intitled" is correct. I'm not sure.


 * The other downside, I can't offhand think of a way to keep the case correct while transforming letters, so you'll need two rules, one for "Intitled" and one for "intitled". Just change the first letters, respectively. This one should also catch a simple transposition or deletion in the middle (the most likely typo).


 * Find: \binti[tl]{1,2}ed\b
 * Replace: entitled


 * Let me know how it works out. I'm using it on my own personal set at the moment. Shadowjams (talk) 08:52, 29 January 2010 (UTC)
 * I'm finding a lot of English language quotes, particularly in legal opinions, from the 1800s and before use "intitled". Perhaps we need to make sure any edit doesn't change a quote. Shadowjams (talk) 08:59, 29 January 2010 (UTC)
 * Don't know why I missed this, but my Merriam Webster lists "intitle" as an archaic version of "entitle". Paradoctor (talk) 09:08, 29 January 2010 (UTC)
 * There are ways to exclude quoted statements like this, but all of them that I'm coming up with right now are pretty processor intensive. There might be a way to creatively limit this, at some expense of type 2 errors, that is less processor intensive. I might revisit it at another time. I would recommend against using the above regex unless you're extremely careful you're not changing a quote. Shadowjams (talk) 09:10, 29 January 2010 (UTC)
 * Paradoctor - That is what I found, more or less as well. I don't think there's a problem converting modern text, but we certainly don't want to alter any quotes that use it. Because AWB uses the .net regex library there are some non-greedy expressions that aren't possible in most other regexes that might fix this nicely... but I'm concerned that most solutions will eat a lot of processing power. If some others have ideas I'd like any advice. Shadowjams (talk) 09:14, 29 January 2010 (UTC)
 * AWB does not apply the typo fixing rules within templates e.g. cquote or within quote marks e.g. " and all the common variations. Rjwilmsi  09:38, 29 January 2010 (UTC)
 * Oh, ok, so in a find-replace yes, but not within AWB/t? Shadowjams (talk) 09:42, 29 January 2010 (UTC)
 * I'm not sure I understand your question. I'll explain my answer again in more detail in the hope it does answer your question: when AWB executes a typo rule from the WP:AWB/T list it first hides the quotes then applies the typo regexes, then unhides the quotes again. If you apply the regex by other means you will not get this quote hiding (unless you write a custom module to access the functions). Rjwilmsi  09:55, 29 January 2010 (UTC)
 * Sorry for the confusion. That wasn't very clear. You understood what I meant though. I believe, in that case, that the above should fix what the OP was talking about. Of course, the question of whether or not the i version is appropriate in the modern context is still open, although I would assume not especially controversial. Shadowjams (talk) 10:14, 29 January 2010 (UTC)
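The hide, apply, unhide sequence described above can be illustrated outside AWB. The following Python sketch is not AWB's implementation: it only hides text in straight double quotes on a single line, the placeholder scheme and the names `fix_outside_quotes` and `fix_case_preserving` are invented for the example, and the case-preserving replacement relies on a Python replacement function, which is not something a plain find/replace pair offers.

```python
import re

# The proposed pattern, with the first letter captured so its case can be kept.
INTITLED = re.compile(r"\b([Ii])nti[tl]{1,2}ed\b")
# Simplified notion of a quote: straight double quotes on one line.
QUOTED = re.compile(r'"[^"\n]*"')

def fix_case_preserving(match):
    # "I" becomes "E", "i" becomes "e".
    return ("E" if match.group(1) == "I" else "e") + "ntitled"

def fix_outside_quotes(text):
    hidden = []

    def hide(match):
        # Swap each quotation for a numbered placeholder the typo regex cannot match.
        hidden.append(match.group(0))
        return f"\x00{len(hidden) - 1}\x00"

    masked = QUOTED.sub(hide, text)
    fixed = INTITLED.sub(fix_case_preserving, masked)
    # Put the original quotations back untouched.
    return re.sub(r"\x00(\d+)\x00", lambda m: hidden[int(m.group(1))], fixed)

print(fix_outside_quotes('A pamphlet intitled "An Act intitled X" was printed.'))
print(fix_outside_quotes("Intiled works"))
```

Applying the same regex through ordinary find and replace skips the hiding step, which is the difference pointed out in the replies above.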

E.g.
The rule for “e.g.” (currently fourth among new additions) adds left bracket, for example “eg.” → “(e.g.”. This should be fixed by removing the bracket. Svick (talk) 04:08, 30 January 2010 (UTC)
 * I originally put it there, and then its structure was changed, and then User:Marek69 disabled it, then made some changes and re-enabled it. The original one had a leading ( because the overwhelming majority of examples I found were at the beginning of parentheticals, which makes sense when you consider how people use the abbreviation. It is probably adding it because it was removed by Marek without changing the corresponding output.


 * I had tested the first version and was reasonably confident it didn't have many (I never found any) false positives. I cannot say the same about this new version. I am going to revert it back to the earlier version with a note. If someone wants to test it and change it that's fine too, but I think we're seeing some problems with it right now. Shadowjams (talk) 22:38, 30 January 2010 (UTC)


 * Another small question. Is E.g. ever proper in the Manual of style? (compared to e.g.). I don't know the answer, but wanted to bring it up. Shadowjams (talk) 22:41, 30 January 2010 (UTC)


 * The last version didn't work again (changed “eg.” to “(e.g.”, but didn't change “(eg.”), so I disabled it. Before it is turned on again, please make sure it works as it should. Svick (talk) 23:36, 30 January 2010 (UTC)


 * Looks fixed now. My mistake for not noticing that Marek's change was correct; the simplification is where it caused the problem.


 * If there are false positives without the (, then we'll need to note those here. Shadowjams (talk) 02:41, 31 January 2010 (UTC)

"Discoverinig" -> "Discovering"
AWB accidentally replaced "Discoverinig" with "Discoverining" here, but it should be "Discovering", of course. --bender235 (talk) 23:31, 6 February 2010 (UTC)
 * That appears to be a result of the "-ining" regex, which is (?!\b(?:(?:Br|Kl|M|H|St)e|Nar|Kurt|Lap)inig\b)\b(\w+)inig(s|ly)?\b . I don't see any systematic way to fix this class of typos without interfering with the others. In other words, "inig" that should be "ing" are virtually indistinguishable from "inig" that should be "ining". If someone has some way to distinguish the two that would be useful, but I can't think of one right now.


 * I also don't know which is more common, but that could be a useful exercise. Shadowjams (talk) 08:24, 7 February 2010 (UTC)

Fluorescent
Using the "-escent" rule, AWB changes "floresent" to "florescent". Although that is a valid word, the more likely intended word is "fluorescent". A wiki search for "fluorescent" produced 1042 articles, and "florescent" found 32 pages. For those 32, I fixed the incorrect usages, discovering that all except 3 were actually intended to be "fluorescent". M AN d ARAX •  XAЯA b ИA M  21:31, 9 February 2010 (UTC)


 * I've expanded the "Fluoresce" rule and removed "|[Ff]lu?or" from the "-escent" rule. I excluded "florescent" and "florescence" from "fixing" as they are correctly spelled words; however, as noted above, they're extremely rare on Wikipedia and the "fluo..." word is almost always the intended one, so if anyone thinks it's better without the exclusion, feel free to remove it. M AN d ARAX  •  XAЯA b ИA M  04:09, 21 February 2010 (UTC)

New or Fix existing typos
I have come across a few typos whose rules either don't seem to be working or need to be added:
 * occassion to occasion. This exists in the typo list but doesn't seem to work all the time.
 * Philidelphia to Philadelphia. This exists in the typo list but doesn't seem to work all the time.
 * Pitsburg, Pittsberg, Pittsburg to Pittsburgh --Kumioko (talk) 18:37, 23 February 2010 (UTC)
 * But there are a lot of places named Pittsburg without the "h".--BillFlis (talk) 20:44, 9 March 2010 (UTC)
 * The rule for "Occasion" seemed correct for the case you cite, but I expanded it a little anyway to catch more misspellings.--BillFlis (talk) 20:51, 9 March 2010 (UTC)
 * Thanks, not great at regex development myself. --Kumioko (talk) 20:58, 9 March 2010 (UTC)

Workign -> Working
For some reason, AWB tried to replace "Workign" with "Wooking" here (I corrected it manually), but it should be "Working". --bender235 (talk) 20:10, 9 March 2010 (UTC)
 * Fixed.--BillFlis (talk) 20:42, 9 March 2010 (UTC)

'yound' / 'young' and 'switchs' / 'switches'
I've noticed both while looking over today's recent changes. Are they sufficiently notable to include in the list? Mephistophelian (talk ● contributions) 22:37, 17 March 2010 (UTC)

Distict
Looks like Distict is changed to Distinct. (" <Typo word="Distinct_" find="\b(D|d)is(?:ctinc|tic|inc|t[ai]n(?=ti))t(i(ve|on|vely)|ly)?\b" replace="$1istinct$2" /> ") But it might as well be a typo for District. (Especially if capitalized).--ospalh (talk) 09:58, 18 March 2010 (UTC)
 * Yes, it can be. I haven't found any good ways to differentiate between the two. Rjwilmsi  10:27, 18 March 2010 (UTC)

Exception for "antarctica" rule
Could someone please add an exception for Sinfonia antartica to that "Antarctica" rule, because I falsely "fixed" that on Vernon Handley, and I don't think many people know that it in fact isn't a typo. --bender235 (talk) 14:31, 19 March 2010 (UTC)
 * ✅ updated rules. Rjwilmsi  08:27, 25 March 2010 (UTC)

Occasionally
Why is "occasionanlly" corrected to "occasionnally", from an incorrect spelling to another incorrect spelling? I know that there is a rule for -anlly -> -nally, but it shouldn't apply to that case. PleaseStand (talk) 02:40, 25 March 2010 (UTC)
 * ✅ Rule updated to avoid that one. Rjwilmsi  08:22, 25 March 2010 (UTC)

'on bored' / 'on board'
While AWB caught that 'their' should've been 'there', it missed 'on bored'. Mephistophelian † 14:52, 26 March 2010 (UTC)
 * This doesn't appear to be a very common misspelling: and there are also appropriate uses of the words "on bored" together, such as: "...blames anti-social behaviour in her area on bored News Night presenters...". – xeno talk  14:55, 26 March 2010 (UTC)

Nearly
AWB tried to replace "neraly" with "nerally" here (I fixed it manually), but it should be "nearly". --bender235 (talk) 16:58, 26 March 2010 (UTC)
 * Looks like it comes from the "ally" suffix. Not sure how to fix. – xeno talk 17:02, 26 March 2010 (UTC)

<Typo word="-ally" find="\b(\w+(?:[cdglntv]i|nt|ic|io?n|er|son))aly\b" replace="$1ally" />


 * "Neraly" is evidently a very rare error—I just fixed the only other one I found in wikipedia.—BillFlis (talk) 17:54, 26 March 2010 (UTC)


 * Yet it could happen again. Don't forget that. --bender235 (talk) 23:17, 26 March 2010 (UTC)

✅ Expanded year rule to catch "neraly". Rjwilmsi 14:04, 27 March 2010 (UTC)
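To see why the "-ally" rule quoted above picks up "neraly", the pattern can be run on a few words. This is a sketch in Python's `re`, assumed as a stand-in for the .NET engine, with sample words chosen for illustration.

```python
import re

# Pattern copied from the "-ally" rule quoted above.
ALLY = re.compile(r"\b(\w+(?:[cdglntv]i|nt|ic|io?n|er|son))aly\b")

def fix_ally(text):
    # .NET "$1ally" written with a Python group reference.
    return ALLY.sub(r"\1ally", text)

for word in ["generaly", "accidentaly", "neraly"]:
    print(word, "->", fix_ally(word))
```

The "er" alternative is what lets "generaly" through, and "neraly" has the same shape (one or more letters, then "er", then "aly"), so the pattern alone cannot separate the two.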

A couple more possible changes
I have stumbled upon a couple more typos that I think might be useful additions to the list
 * adn to and
 * thier to their -- rule exists
 * establishement to establishment ✅
 * etal to et. al.
 * amry to army ✅
 * aviaror to aviator --Kumioko (talk) 18:02, 26 March 2010 (UTC)


 * Added "Establishment". "Amry" seems to be a proper name, as are definitely "Thier", "Thiers", and "Etal", so need caution. I didn't find any occurrences of "aviaror" in wikipedia.--BillFlis (talk) 18:56, 26 March 2010 (UTC)
 * Thanks, how did you search WP for that? --Kumioko (talk) 19:10, 26 March 2010 (UTC)
 * Enter whatever in "search" and click "Search" (not "Go"). But even "Go" will find Thiers and Etal, as they have their own articles.--BillFlis (talk) 07:25, 27 March 2010 (UTC)


 * It's not "et. al." but "et al.", which is short for "et alii" (meaning "and others"). --bender235 (talk) 23:20, 26 March 2010 (UTC)

I suggest: Kittybrewster  ☎  12:43, 27 March 2010 (UTC)
 * "the hoi polloi" be changed to "hoi polloi" (tautology).
 * Also "return back" to "return".
 * Also "their were" to "there were". -- rule exists Rjwilmsi  14:08, 27 March 2010 (UTC)

accidently
AWB fixed "acidentaly" with "acidentally" here, but it should've been "accidently" of course (I later fixed it manually in the article). --bender235 (talk) 22:56, 27 March 2010 (UTC)
 * What "accidently"? -- accidentally. Rjwilmsi  23:47, 27 March 2010 (UTC)
 * ✅ New rule for "Accident" to fix single 'c'. Rjwilmsi  23:54, 27 March 2010 (UTC)

XML
I would like to suggest we put this in a

M AN d ARAX •  XAЯA b ИA M  06:11, 26 May 2010 (UTC)
 * Yeah, I'm aware of that problem. Most of those should be avoided if they're in a full url, but the ones that aren't in link templates won't be. It also shows up on a few other web addresses. One possibility is to add (?!\.ie\b) as an exclusion to the beginning (I've had a lot of trouble with those lately so I'll let someone else test that before adding it in). Shadowjams (talk) 10:33, 26 May 2010 (UTC)
 * I noticed there are quite a few to skip past. Why can't .ie be ignored? Surely it's enough to have a dot in front rather than checking for a complete url? Regards, SunCreator (talk) 14:19, 26 May 2010 (UTC)

✅ with this update. Rjwilmsi 18:42, 26 May 2010 (UTC)

Can we add milatary => military
Occurrences here. Regards, SunCreator (talk) 00:30, 29 May 2010 (UTC)
 * Fixed here. Shadowjams (talk) 05:36, 29 May 2010 (UTC) ✅

french => French
We don't have this Capitalisation in Cultures, languages, and ethnic groups or elsewhere. Regards, SunCreator (talk) 00:34, 29 May 2010 (UTC)
 * Done ✅ Shadowjams (talk) 05:31, 29 May 2010 (UTC)
 * Except for french fries.--BillFlis (talk) 12:55, 29 May 2010 (UTC)
 * french fries says you can use French fries with a reference. Regards, SunCreator (talk) 16:48, 29 May 2010 (UTC)
 * Well, you can use "French" but "french fries" ("sometimes capitalized") and "french-fried" don't need "correcting".--BillFlis (talk)
 * Good point. Should we exclude that one example, or is the rule generally problematic? I think we do the language capitalizations generally, notwithstanding other similar examples. Shadowjams (talk) 08:52, 30 May 2010 (UTC)
 * There's also the verb french ("often capitalized"), which doesn't take any particular words after it. Also, french curve is only "often capitalized F".--BillFlis (talk) 12:28, 30 May 2010 (UTC)

winnining
Hello, winninig gets changed to winnining instead of winning. It's probably not very common but whatever. --Closedmouth (talk) 12:55, 8 June 2010 (UTC)
 * This rule would be the problem <Typo word="-ining" find="(?!\b(?:(?:Br|Kl|M|H|St)e|Nar|Kurt|Lap)inig\b)\b(\w+)inig(s|ly)?\b" replace="$1ining$2" />.


 * I'm honestly not sure exactly what that rule's fixing. Maybe someone can explain it, in which case I'd be more comfortable adding the exclusion for Closedmouth's example. Shadowjams (talk) 06:02, 9 June 2010 (UTC)
 * It fixes typos like "beginig". No harm to add a new rule for "-inninig" to "-inning" above this one. Rjwilmsi  09:39, 9 June 2010 (UTC)
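
To make the ordering point concrete, here is a minimal sketch in Python rather than AWB's own engine. The "-ining" pattern is the one quoted above with $1/$2 rewritten as \1/\2; the "-inninig" rule and the apply_rules helper are hypothetical illustrations, not the wording that was added to the list.

```python
import re

# The "-ining" rule quoted above, with $1/$2 rewritten as \1/\2 for Python.
INIG = (
    r"(?!\b(?:(?:Br|Kl|M|H|St)e|Nar|Kurt|Lap)inig\b)\b(\w+)inig(s|ly)?\b",
    r"\1ining\2",
)
# Hypothetical extra rule (an assumption, not the live wording): "-inninig" -> "-inning".
INNINIG = (r"\b(\w+)inninig(s|ly)?\b", r"\1inning\2")


def apply_rules(text, rules):
    """Apply each (find, replace) pair in order, the way a typo list is walked."""
    for find, replace in rules:
        text = re.sub(find, replace, text)
    return text


print(apply_rules("winninig", [INIG]))           # reproduces the reported bad fix
print(apply_rules("winninig", [INNINIG, INIG]))  # the more specific rule runs first
print(apply_rules("Meinig", [INNINIG, INIG]))    # excluded surname is left alone
```

Because the specific rule runs first, the word no longer contains "inig" by the time the general rule sees it.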

Deferred from AutoWikiBrowser/Tasks
I noticed today that there are many articles with the word "Olympic" or "Olympics" misspelled. Common misspellings are "Oylmpic", "Olmypic", and "Olypmic". Would a bot be able to fix these spellings, or am I in the wrong place? Thanks, GaryColemanFan User Talk:GaryColemanFan 9:05 pm, 27 May 2010, Thursday (19 days ago) (UTC−6) -- Cit helper (talk) 06:04, 16 June 2010 (UTC)
 * I added a rule here. It corrects your suggestions "Olmypic" and "Olypmic", as well as "Olypic" and "Olymic" (and of course all their plurals), but I was not able to find any instances of "Oylmpic", so that's not included.--BillFlis (talk)

False Positives

 * 1) McGrath, Alaska
 * 2) McLeod River
 * 3) Me You Them
 * 4) Meagan Good
 * 5) Meanings of minor planet names: 65001–66000

N'Sync ---> NSYNC

Cit helper (talk) 01:46, 15 June 2010 (UTC)
 * I may not understand you correctly: I couldn't find any occurrences of "sync" on any of those pages.--BillFlis (talk) 09:58, 15 June 2010 (UTC)
 * Yes, that was a suggestion that has been brought to my attention, not a FP...Cit helper (talk) 06:04, 16 June 2010 (UTC)
 * The numbered entries have False Positives with various words (this was just a dump from false_positive.txt).Cit helper (talk) 06:04, 16 June 2010 (UTC)

Axel Finet -> Axel Finite (False Positive, Name) Article: Nick Tarabay —Preceding unsigned comment added by Cit helper (talk • contribs) 07:46, 16 June 2010 (UTC)
 * Agree about Finet it has several uses. Regards, SunCreator (talk) 10:53, 16 June 2010 (UTC)
 * I corrected the various "Finite" and "-finite" rules not to change "Finet".--BillFlis (talk) 12:07, 16 June 2010 (UTC)

"achiveved" -> "achieveved"
AWB replaced "achiveved" with "achieveved" here, which is obviously incorrect. Could someone please fix the rule? --bender235 (talk) 13:08, 17 June 2010 (UTC)

Nurnberg
Someone please add a rule that replaces "Nurnberg" with either "Nürnberg" or "Nuremberg" (I suggest the latter would be more appropriate). --bender235 (talk) 21:21, 18 June 2010 (UTC)
 * There are several articles with "Nürnberg" in the title (e.g., German cruiser Nürnberg), although the city is under "Nuremberg". I found some (probably incorrect) occurrences of "Nurnburg" (with U for E).--BillFlis (talk) 11:56, 19 June 2010 (UTC)

Incorrect pluralizations
Please check this regex to see if it would be a good addition: In particular, I question whether "Medlys" and "Medlies" should be included here, in a separate rule ("Medly" seems quite common), or not at all. PleaseStand (talk) 23:56, 19 June 2010 (UTC)
 * I'm testing it now. It's mostly catching "attornies". Don't see any issues with it yet. Shadowjams (talk) 04:09, 20 June 2010


 * It's pretty frequent too, more frequent than many of our rules. I'll add it in to the new additions. Shadowjams (talk) 04:14, 20 June 2010 (UTC)
 * I removed the "Medlies" rule because I found the quoted term "Monstrous Medlies" at Colley Cibber (it's sourced to a book). You can add that back if you like. Shadowjams (talk) 04:18, 20 June 2010 (UTC)

Importing Typo list for other languages
I would like to use this great plugin for my language, but when I try to enable the RegexTypoFix checkbox it says it will load the typos list from English Wikipedia. I want to set it to download from my own language's Wikipedia instead. How can I do this? -- Mahir78 (talk) 10:29, 22 June 2010 (UTC)
 * Add, replacing the en with whatever language you want, to the local checkpage. —  Ree  dy  10:40, 22 June 2010 (UTC)

Playright -> Playwright
There is a publishing house "Playright publishing". Is there a way to make sure the word is not replaced when it is either 1. capitalized or 2. followed by the word "publishing" ?--Muhandes (talk) 10:37, 23 June 2010 (UTC)
 * Yes, you can protect that deliberate misspelling by applying the Sic template.--BillFlis (talk) 16:00, 23 June 2010 (UTC)
 * Thanks, I obfuscated it with Sic on wherever it was used. --Muhandes (talk) 14:13, 24 June 2010 (UTC)

centerfield -> center field?
Concise Oxford has "centerfield" as a valid word. Should it really be replaced with center field? --Muhandes (talk) 14:13, 24 June 2010 (UTC)

childrens' → children's
I'm not sure how to add this but it is very common. Currently it does childrens' → children's' which is incorrect. If someone could add this it would be most helpful. --Muhandes (talk) 14:11, 24 June 2010 (UTC)
 * I just had it work correctly at least in one case. It might be that the times when it didn't work were due to ’ used instead of '? I will have to supply an example of a page not working correctly I guess. --Muhandes (talk) 14:43, 24 June 2010 (UTC)
 * Sorry for multiple edits, but I was right. The problem is indeed with the use of the second type of apostrophe. Namely, childrens’ → children's’ see Amerika-Gedenkbibliothek for example. --Muhandes (talk) 14:48, 24 June 2010 (UTC)
 * I modified the rule to handle both types of apostrophe.--BillFlis (talk) 17:07, 24 June 2010 (UTC)
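
As a rough illustration of handling both apostrophes, here is a Python sketch. The pattern is an assumed shape, since the live rule's wording is not quoted in this thread; the only point is that a character class can accept the typewriter apostrophe (U+0027) and the typographic one (U+2019) alike.

```python
import re

# Hypothetical rule shape (an assumption; the real rule is not quoted above).
# ['’] accepts both U+0027 and U+2019 after "childrens".
CHILDRENS = re.compile(r"\b([Cc])hildrens['’]")


def fix_childrens(text):
    """Rewrite childrens' / childrens’ as children's, keeping the capital."""
    return CHILDRENS.sub(r"\1hildren's", text)


print(fix_childrens("a childrens' book"))
print(fix_childrens("the Childrens’ library"))
```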

mens → men's
We have childrens → children's and womens → women's, why not mens → men's ? If this is appropriate, can anyone add it please? --Muhandes (talk) 09:17, 25 June 2010 (UTC)
 * Because of Mens, Mens sana in corpore sano, Mens rea, etc. (Latin phrases), as well as Mens Sana Basket. However, I did add a rule to change "mens'" to "men's".--BillFlis (talk) 11:55, 25 June 2010 (UTC)

Widly
Please extend the -ely rule to catch that. I am also considering "falsly" and "sparsly" but am unsure whether it would be worth the processing time. PleaseStand (talk) 01:15, 26 June 2010 (UTC)
 * I'll check it out. I wouldn't worry about the processing time for those too much. Strangely though, that rule only finds those roots that have "in" or "un" at the front. I think that's unintentional... adding a ? to that first group would allow it to find all permutations. I'm testing that rule right now to see if there's some reason for it. Shadowjams (talk) 02:41, 26 June 2010 (UTC)
 * Added here. ✅ Shadowjams (talk) 06:30, 26 June 2010 (UTC)

More then > More than
There must be something strange about this rule - it doesn't show up in the edit summary in the same way as the others. diff -- John of Reading (talk) 15:11, 27 June 2010 (UTC)
 * ✅ Fixed. Rjwilmsi  13:16, 28 June 2010 (UTC)
 * Is there a reason the search string ends with a space (\s) rather than a simple word boundary (\b)? The "then" (for "than") could be followed by a comma (perhaps separating a parenthetical phrase); e.g., "other than, say, sausages".--BillFlis (talk) 15:11, 28 June 2010 (UTC)
 * Because we want whitespace, not a word boundary, to avoid false positives when "then" is an adverb and not a misspelled preposition. For instance, since (back then) I thought that was the explanation, I didn't say any more then. -- JHunterJ (talk) 15:23, 28 June 2010 (UTC)
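
The trade-off between \s and \b can be sketched in Python. Both patterns below are illustrative stand-ins (the live rule is not quoted in this thread), and the sentences are adapted from the two comments above.

```python
import re

# Illustrative stand-ins only; neither is the live RETF rule.
WITH_SPACE = re.compile(r"\b([Mm]ore) then(\s)")   # requires whitespace after "then"
WITH_BOUNDARY = re.compile(r"\b([Mm]ore) then\b")  # accepts any word boundary


def fix_space(text):
    return WITH_SPACE.sub(r"\1 than\2", text)


def fix_boundary(text):
    return WITH_BOUNDARY.sub(r"\1 than", text)


adverb = "I didn't say any more then."       # "then" is correct here
typo = "It costs more then, say, sausages."  # "then" should be "than"

print(fix_space(adverb))     # unchanged: a full stop is not whitespace
print(fix_boundary(adverb))  # false positive
print(fix_space(typo))       # missed: a comma is not whitespace
print(fix_boundary(typo))    # fixed
```

So the whitespace version trades a few missed typos before punctuation for fewer false positives at the end of a clause.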

Metropolitan: Is this a bug?
http://en.wikipedia.org/w/index.php?title=Wikipedia:AutoWikiBrowser/Sandbox&diff=370783051&oldid=370782700

AWB ignores typos in the version containing the link metropoltian, but it works with the version without that link.--Diwas (talk) 13:08, 29 June 2010 (UTC) (I had added the new typo rule yesterday.)--Diwas (talk) 13:11, 29 June 2010 (UTC)

This links Metropolitan bishop metropolitan are incompatible with this RETF-rule too. Is the rule incomplete? --Diwas (talk) 13:49, 29 June 2010 (UTC)

The AWB Regex Tester is replacing metropoltian with metropolitan --Diwas (talk) 14:06, 29 June 2010 (UTC)
 * No bug, deliberate behaviour: under we did: fix https://secure.wikimedia.org/wikipedia/en/wiki/Wikipedia_talk:AutoWikiBrowser/Bugs#names_often_spelled_differently don't apply a typo fix if there is a wikilink target using that spelling.  Rjwilmsi  10:09, 1 July 2010 (UTC)
 * But it was finding false positives, which I have just corrected.--BillFlis (talk) 11:23, 1 July 2010 (UTC)
 * I guess the link is https://secure.wikimedia.org/wikipedia/en/wiki/Wikipedia_talk:AutoWikiBrowser/Bugs/Archive_13#names_often_spelled_differently now.


 * after edit conflict: Thank you for your answer. Now it works. Originally the link was correct, but I guess this correction of my simple rule made it work. I guess if a rule matches a link, the rule will be ignored in this article. But my bad rule was matching the correct spelling. thanx --Diwas (talk) 12:51, 1 July 2010 (UTC)

separete/separeted
I notice there is separeble but not separete/separeted. It is quite a common typo. Would be nice if someone could add it. --Muhandes (talk) 15:25, 29 June 2010 (UTC)
 * The existing "(In)Separable" rule covers those variations too. Rjwilmsi  13:08, 1 July 2010 (UTC)
 * It didn't until I modified it a couple of days ago. I should have commented here that it had been handled.--BillFlis (talk) 13:33, 1 July 2010 (UTC)

Request addition
Could someone please add:

ascession --> accession

it's in many many "list of monarchs" type articles and it's a blatant misspelling, there's too many for me to fix them all manually. -- œ ™ 22:12, 22 June 2010 (UTC)
 * Sometimes, maybe "ascession" should be "ascension", no? They only differ by one letter.--BillFlis (talk) 00:03, 23 June 2010 (UTC)


 * I considered that and actually checked and pretty much all of them deal with accession. The search turns up nothing but lists of consorts etc. Besides, ascession is much more likely to be mistaken for accession because of the similar sound, and "ascension" isn't commonly misspelled. -- œ ™ 05:34, 23 June 2010 (UTC)

Okay so is no one going to add this? -- œ ™ 02:33, 25 June 2010 (UTC)


 * I'm not in the process of testing it right now, but if you'd like to, try this: Find: "\b(A|a)sc+es+[io]{2}n\b" Replace: "$1ccession" . The extra stuff in the middle should catch the "io" "oi" switch, and I'd guess that ascension misspellings will probably include an "n" somewhere, which would exempt it from that regex. Shadowjams (talk) 05:58, 25 June 2010 (UTC)


 * Oh you mean for me to test it? No I don't normally use AWB, I don't have it installed.
 * I'm curious though, why people are avoiding adding this? -- œ ™ 03:17, 27 June 2010 (UTC)
 * Hah. I'm sorry; I'm not avoiding it, I'm not sure if anyone else is, but I wouldn't see a reason why if that were so. I don't have a wiki dump handy right now which is why I can't test it immediately [I did earlier but I forgot about this one]. I'll try and take a look soon. I don't foresee any issues with what I proposed above, but I get a little cautious around these British monarch-related changes because they're used in all sorts of ways that I can't begin to comprehend, so I like to test those. I am pretty cautious though, it's not a catastrophic event if they're added and then later tweaked. Shadowjams (talk) 09:00, 27 June 2010 (UTC)


 * It just seemed like some are hesitant about adding it. So if some readers here need some reassurance, I did my homework on this. The search for "ascession" gives only 47 results, while "accession" gives 11,334 results! The search for "ascession" turns up almost all "List of ____ consorts" type articles. In all of these articles the word is used in the context of the definition of "accession", not "ascension", or anything else. These articles all have similar tables in which this word appears multiple times, so I'm thinking the same person created all these tables and used the same misspelled word in all of them, not knowing that "ascession" isn't even a word! I checked in multiple dictionaries and even asked the gurus over at Wiktionary's Tea room. So I'm quite certain it's safe to add this to the list! :) œ ™ 08:26, 28 June 2010 (UTC)

Meh. I'm taking this page off my watchlist. -- œ ™ 00:01, 1 July 2010 (UTC)
 * Wow. Sorry that things here weren't happening fast enough to please you. We hate to see you go, really, because we are entirely at your service, and your complete satisfaction is our only goal. The thing is, some of us are Old Farts, who check our email only about every couple of hours. Even then, we tend to think a bit before we act. Oh, and you forgot to take a number, so we didn't even see you there at the end of the queue.--BillFlis (talk) 00:17, 1 July 2010 (UTC)
 * I did put in a rule that you could try out. Presumably you use AWB, so you could plug it in and try a few. I haven't gotten around to doing that. It's nothing personal. I think that rule will work without any problems and someone can axe it if it starts acting up. Shadowjams (talk) 09:01, 1 July 2010 (UTC)

lol, thanks Bill. œ ™ 08:30, 2 July 2010 (UTC)
 * Let's bury the hatchet you two. My regex from above will probably blank the main page. Actually... that'd be much more impressive than anything I've actually contributed. Let's hope for disaster. Shadowjams (talk) 10:47, 2 July 2010 (UTC)


 * It looks OK to me. I found one "ascesion" that should be "ascension", which I fixed by hand.--BillFlis (talk) 11:02, 2 July 2010 (UTC)
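
For anyone without AWB installed, the regex proposed in this thread can be tried in Python. This is only a sketch: the pattern is copied from the proposal above, with $1 rewritten as \1, and the fix helper is a name made up for the example.

```python
import re

# Pattern as proposed in this thread; only the replacement syntax is adapted.
ASCESSION = re.compile(r"\b(A|a)sc+es+[io]{2}n\b")


def fix(text):
    """Replace ascession-style misspellings with accession."""
    return ASCESSION.sub(r"\1ccession", text)


for word in ["ascession", "Ascessoin", "ascension", "accession", "ascesion"]:
    print(word, "->", fix(word))
```

"ascension" and "accession" pass through untouched, while "ascesion" (the spelling that was fixed by hand to "ascension") is also rewritten to "accession", which is the one case where a human check still matters.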

Genious
It seems to be a common misspelling of "genius". PleaseStand (talk) 02:48, 3 July 2010 (UTC)
 * ✅ Here. I haven't tested it yet, but offhand doesn't seem like a large risk. Shadowjams (talk) 06:10, 3 July 2010 (UTC)

"Practive"
May someone please add this to the misspelling list, to be replaced with "practice"? I'm a bit intimidated by the code. :) Search results bring up quite a few occurrences that are tedious to be fixed manually. Thanks, Airplaneman   ✈  06:19, 5 July 2010 (UTC)
 * ...or maybe not. "Proactive" could be a possibility as well. I'll go through the search and manually fix them :) Airplaneman   ✈  06:26, 5 July 2010 (UTC)

non-metropolitan
Metropolitan: The shorter rule

"\b(M|m)etr(?:(?:op|po)lit|(?:opo?|po)lti)(\w*)\b"

is correct, too. Isn't it? --Diwas (talk) 20:17, 5 July 2010 (UTC)


 * Thanx BillFlis, now I see you have it done already.--Diwas (talk) 20:26, 5 July 2010 (UTC)

non-metropolitan: What about this rule for finding words like semi-metroplitan, too?

"\b((?:\w+-)?(?:M|m))etr(?:(?:op|po)lit|(?:opo?|po)lti)(\w*)\b" --Diwas (talk) 20:17, 5 July 2010 (UTC)
 * I think "non-metropolitan" should be replaced by "rural." And "semi-metropolitan" by "small-town". What do you think?--BillFlis (talk) 22:01, 5 July 2010 (UTC)
 * I am not sure, I am not a native English speaker, but the word rural entered my mind when I was reading non-metropolitan. But non-metropolitan is a legal term in England and the rule above covers all ...-metropolitan words. Semi-metropolitan is a rare word. I am not sure if there are other words with -metropolitan. --Diwas (talk) 07:56, 6 July 2010 (UTC)


 * Non-metropolitan isn't a word I think I've ever heard, and semi-metropolitan is just as weird. I am a native speaker, and rural is not an antonym of metropolitan. This is an example of the kind of thing this project isn't appropriate for, although it may be an appropriate fix in some cases. Shadowjams (talk) 08:00, 6 July 2010 (UTC)
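
The two find patterns quoted in this thread can be checked with a short Python sketch. The thread gives only the find strings, so the replacement "$1etropolit$2" (written \1etropolit\2 here) is an assumption.

```python
import re

# Find patterns as quoted in this thread.
SHORT = re.compile(r"\b(M|m)etr(?:(?:op|po)lit|(?:opo?|po)lti)(\w*)\b")
PREFIXED = re.compile(
    r"\b((?:\w+-)?(?:M|m))etr(?:(?:op|po)lit|(?:opo?|po)lti)(\w*)\b"
)
# Assumed replacement; the thread does not quote one.
REPLACE = r"\1etropolit\2"

for word in ["metropoltian", "Metroplitan", "metrpolitan", "metropolitan"]:
    print(word, "->", SHORT.sub(REPLACE, word))

for word in ["semi-metroplitan", "non-metropolitan"]:
    print(word, "->", PREFIXED.sub(REPLACE, word))
```

Neither pattern matches the correct spelling, which matters because a rule that matches a correctly spelled wikilink target gets skipped for the whole article.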

Staus
Does this make sense? I want to replace "staus" with "status", but only when not capitalized (to avoid the surname). The misspelling seems to be very common. Thanks, PleaseStand (talk) 02:15, 6 July 2010 (UTC)
 * Yeah. AWB is case sensitive, so that's possible. Shadowjams (talk) 02:25, 6 July 2010 (UTC)
 * I knew that, so I have now added that rule. All or almost all occurrences of lowercase "staus" shown in a Wikipedia search should have been "status". PleaseStand (talk) 02:49, 6 July 2010 (UTC)
 * Looks like a good addition. Shadowjams (talk) 07:02, 8 July 2010 (UTC)

Merovingian
For some reason, AWB tried to replace "merovingian" with "Merovingia$2". Could someone please fix this? --bender235 (talk) 11:24, 8 July 2010 (UTC)
 * Fixed.--BillFlis (talk) 11:48, 8 July 2010 (UTC)

Tamil Nadu
In an abundance of caution I have removed the following line from the New section of RETF.
 * <Typo word="Tamil Nadu" find="\b[Tt]amil\s*[Nn]adu\b(?<!Tamil Nadu)" replace="Tamil Nadu" />

This appears to be affecting many articles and may be a legitimate spelling of the name since it's so prolific across Wikipedia. Can we discuss this? Just want some reassurances that all these edits I'm doing won't have to be reverted. Not against it if it's right.--mboverload @ 00:49, 28 June 2010 (UTC)
 * The Indian Government website has "Tamil Nadu". Also thehindu.com. -- John of Reading (talk) 06:06, 28 June 2010 (UTC)
 * It appears to me that the intent of this rule is only to capitalize it and make it two words if it appears as one. Is it doing something else?--BillFlis (talk) 09:44, 28 June 2010 (UTC)
 * Being prolific across Wikipedia is not an indication of legitimacy. Unless there is a reliable source that indicates it should not be capitalized or should not be two words, you have at least my reassurances that those edits shouldn't be reverted. -- JHunterJ (talk) 11:25, 28 June 2010 (UTC)

The rule has been restored. That's all it was doing. The reference desk said either could be accurate. Might as well stick with one. --mboverload @ 06:21, 5 July 2010 (UTC)
 * I'd like to point out (in case it wasn't clear) that although the official name is indeed Tamil Nadu, correcting it is in many cases wrong. Specifically, as part of an organization's name, as we all agree organization names should not be "corrected" (my favorite example being Childrens Hospital Los Angeles). As some/most people are not aware of this and might be tempted to "correct" such instances, and it is indeed very prolific, it might be best to be prudent and not include this rule. --Muhandes (talk) 08:44, 12 July 2010 (UTC)
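
What the rule does can be seen by running the quoted find string in Python. This is a sketch only; the fix helper is a made-up name. The trailing look-behind is what stops the rule from firing on text that is already "Tamil Nadu".

```python
import re

# Find string as quoted in this thread. The look-behind is checked only after
# a candidate has matched, and rejects text that is already correct.
TAMIL_NADU = re.compile(r"\b[Tt]amil\s*[Nn]adu\b(?<!Tamil Nadu)")


def fix(text):
    return TAMIL_NADU.sub("Tamil Nadu", text)


for sample in ["Tamilnadu", "tamil nadu", "Tamil nadu", "Tamil Nadu"]:
    print(repr(sample), "->", repr(fix(sample)),
          "| rule fires:", TAMIL_NADU.search(sample) is not None)
```

The pattern has no notion of context, so a one-word spelling inside an organisation's name would be rewritten like any other occurrence.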

Manoeuvre - Manouvre
Per wikt:manoeuvre, this is a British English spelling, not a typo. Mjroots (talk) 10:21, 10 July 2010 (UTC)
 * And what is "manouvre"?--BillFlis (talk) 05:49, 11 July 2010 (UTC)

"-keted"
I changed the "-keted" so it wont catch racketts, but still catches bracketted. I hope I did it correctly, first time I try my hand at this. --Muhandes (talk) 11:00, 12 July 2010 (UTC)
 * It seems that "rackett" is a noun, not a verb, so there would be no such word as "racketted".--BillFlis (talk) 11:35, 12 July 2010 (UTC)
 * Looking at the rule, it also captures the ending "s" and "ing", so in fact it catches "-keted", "-kets", "-keting". --Muhandes (talk) 12:07, 12 July 2010 (UTC)

heavily
AWB tried to replace "heaively" with "heaively", but it should've been "heavily". Please fix. --bender235 (talk) 20:22, 3 July 2010 (UTC)


 * Anyone? --bender235 (talk) 00:39, 16 July 2010 (UTC)

'Publisher=' parameter of cite template
I've noticed that the "publisher=" parameter of the cite templates is widely misused to specify the name of the newspaper or magazine; sometimes the person responsible realised that the name should be italicised, so they've manually added italics e.g. " publisher=The Times ". Of course, the real problem is that it's the wrong parameter - what's really needed is "work=The Times". I've set up my own find and replace regex to correct anything of that form, specifying a long list of widely quoted newspapers and magazines. Could/should this be added to the list of automatic corrections somehow? Colonies Chris (talk) 16:24, 13 July 2010 (UTC)
 * cannot be done as a typo fix, could be as a genfix. Rjwilmsi  17:28, 13 July 2010 (UTC)
 * Was this dispute settled? I thought there was still a discussion on it. My (very limited) understanding is that the website is the work, the publisher is the entity behind it, so isn't "publisher=The Times" correct? --Muhandes (talk) 18:22, 13 July 2010 (UTC)
 * work is the name of the publication/periodical/newspaper/website so is "The Times" for www.timesonline.co.uk etc. If publisher is used then it's the parent company of the website (perhaps Times Newspapers Limited or News Corporation in this case); publisher isn't used for well known publications as it's no extra use. Rjwilmsi  18:44, 13 July 2010 (UTC)

two-fold, four-fold, hundred-fold etc.
Any reason why these are corrected? They show as valid in many dictionaries. --Muhandes (talk) 23:26, 14 July 2010 (UTC)
 * Please, can anyone check this? I have seen several edits in the last few days using this rule (here's one) and I am hesitant. See two-fold, and wordnet also has four-fold, five-fold, six-fold seven-fold eight-fold nine-fold --Muhandes (talk) 10:31, 20 July 2010 (UTC)
 * Hyphenated versions don't seem to be in the OED, so rule seems fine: twofold. Rjwilmsi  13:00, 20 July 2010 (UTC)
 * But they do appear on Merriam-Webster two-fold, so it might be more American English.--Muhandes (talk) 14:04, 20 July 2010 (UTC)

Subsidiary
I believe we can do better than what we currently have. I was considering the above proposed regex to match "subsidiery" and its variants, but I don't want it to match "subseries". PleaseStand (talk) 04:13, 21 July 2010 (UTC)
 * It is a little messy though, and it's surprising how little it matches in terms of misspellings.




 * That might work. Shadowjams (talk) 04:54, 21 July 2010 (UTC)
 * Is this better? PleaseStand (talk) 19:09, 21 July 2010 (UTC)

Nera
Currently, "nera" is fixed to "near". Glyka Nera is a place in Greece and AWB suggested changing "Nera" to "Near". This of course is wrong and had it been in a large article full of suggested changes, I may not have noticed it. "Nera" with a capital N should not be corrected. McLerristarr (Mclay1) (talk) 12:18, 1 August 2010 (UTC)
 * The word "near" could begin a sentence, such as, "Near the opera house is the city hall." Some of these things just have to be tolerated -- not saying this is necessarily one, but just sayin'. --Auntof6 (talk) 17:00, 1 August 2010 (UTC)


 * Well, we can't possibly correct all typos. Someone could type "three" instead of "there", so that will never be corrected. It's better to be safe than sorry, we shouldn't rely on machines to do everything for us – good ol' copy-editing is always best. So in the case of three/there and Near/Nera, they should be left alone for people to find when reading. Perhaps "Nera" could only be left alone if it follows "Glyka". I don't know if that's possible. McLerristarr (Mclay1) (talk) 02:36, 2 August 2010 (UTC)


 * It's trivial to exclude the uppercase version, or to exclude "Glyka Nera" or similar constructions. Is the proper use of "Nera" identifiable from the typos by excluding times it's followed by Glyka? Shadowjams (talk) 03:06, 2 August 2010 (UTC)


 * According to Nera, "Nera" is the name of a company, a goddess and several places, so I think it should not be corrected. McLerristarr (Mclay1) (talk) 03:27, 2 August 2010 (UTC)


 * I'm fine with that. I doubt it's a common typo, and it's easily spotted by regular editing. I'll go half-way and change the rule to only correct non-capitalized versions. Someone else can remove it completely if that seems appropriate. Shadowjams (talk) 03:59, 2 August 2010 (UTC)


 * Just removing capitalised Nera is what I wanted. McLerristarr (Mclay1) (talk) 07:28, 2 August 2010 (UTC)

sapces
Can somebody please add "sapce" to change to "space" and "sapces" to change to "spaces". It is an easy typo to make and currently the typo exists in seven articles. In every case, it is a typo and not a foreign word. McLerristarr (Mclay1) (talk) 07:50, 2 August 2010 (UTC)
 * ✅ With this change. Rjwilmsi  17:49, 2 August 2010 (UTC)

Enmei vs. Emmei and Ie vs. i.e. in Japanese pages
I noticed that AWB tries to change Enmei to Emmei in places such as "Enmei ryu" (a martial arts school) and "Enmei ji" (the name of a Buddhist temple). I always leave the page at Enmei because I have seen this spelling in various places online. But I have not been able to find a definitive answer as to which is correct. Also I notice that the family name "Ie" gets picked up and changed to "i.e.". So those using AWB on Japan-related pages need to take extra care before saving. Colincbn (talk) 06:26, 3 August 2010 (UTC)
 * Exception added for "Enmei". Rjwilmsi  08:33, 3 August 2010 (UTC)

Compilaton
Compilaton - Compilation

There are 11 currently and I've started fixing them but it might as well go here.  Ϣere Spiel  Chequers  12:49, 4 August 2010 (UTC)
 * The existing "Compilation" rule already covers that one. Rjwilmsi  16:30, 4 August 2010 (UTC)

Italicise foreign words and phrases
As per WP:MOS, foreign words and phrases should be italicised. Common foreign words and phrases used in English include those in List of Latin phrases. I brought this up on Wikipedia talk:AutoWikiBrowser/Feature requests and someone suggested it would be better if the typo finder did it. McLerristarr (Mclay1) (talk) 02:39, 11 August 2010 (UTC)
 * That could work although it of course has to be case by case. Non-English words are forever an issue when trying to write a new rule. Shadowjams (talk) 08:10, 11 August 2010 (UTC)

setle
Can somebody please change "setle" to "settle", "setler" to "settler", "setlers" to "settlers", "setling" to "settling" and "setled" to "settled"? McLerristarr (Mclay1) (talk) 04:54, 11 August 2010 (UTC)
 * ✅ New rule added. Rjwilmsi  06:53, 11 August 2010 (UTC)

canvern
Can somebody please correct "canvern" to "cavern". I always make this mistake. McLerristarr (Mclay1) (talk) 02:40, 11 August 2010 (UTC)
 * A search turned up no instances of "canvern" on wikipedia. You must be doing a good job of correcting yourself!--BillFlis (talk) 17:59, 13 August 2010 (UTC)
 * Well, I usually edit with Safari, which has an automatic spell check so I usually notice when I make a mistake. I was thinking more for other editors' sake, but since the typo does not exist at the moment, it's probably not worth adding. McLerristarr (Mclay1) (talk) 07:28, 14 August 2010 (UTC)

i.e. and e.g.

 * "i.e" should be corrected to "i.e." ("e.g" already corrects to "e.g.")
 * a colon after "i.e.", "i.e", "ie", "e.g.", "e.g" or "eg" should be removed as it is completely unnecessary and yet common
 * McLerristarr (Mclay1) (talk) 07:35, 11 August 2010 (UTC)
 * Interesting. As to your first point, it took me a little bit to figure out why it's doing that because when I wrote the rule I did it largely to correct that problem. Whatever you're running it on that doesn't correct is a case where "i.e" is not followed by either a single quote, a space, a colon, a comma, a semi-colon, a close parenthesis mark, an ampersand (for non-breakable spaces, etc.), or a dash. Do you have an example of a page with that in the wild? It was somewhat intentional as a safety feature to not over-correct. Perhaps using \b would be sufficient, but the rule as it is now is very stable.

perl -e '$x="i.e";$x=~s/\bi(?:\.?e|e\.)([\s,:;\)&-])(?<!\.ie.)/i.e.$1/;print "$x\n"' does not correct, while perl -e '$x="i.e ";$x=~s/\bi(?:\.?e|e\.)([\s,:;\)&-])(?<!\.ie.)/i.e.$1/;print "$x\n"' does. Shadowjams (talk) 08:06, 11 August 2010 (UTC)
 * As to the second, I'd invite others to comment on that. I'm not enough of a style wonk to know the right answer to that. Shadowjams (talk) 08:05, 11 August 2010 (UTC)
 * Here's a proof of concept on the first point:


 * AutoWikiBrowser/Sandbox is what I used to test the first point. It does not correct "i.e" but it does correct "ie", "eg" and "e.g". McLerristarr (Mclay1) (talk) 08:29, 11 August 2010 (UTC)
 * Right. It will correct "i.e " but not "i.e"    It's rare if not non-existent in articles (i.e. supposes some text after it so it should have one of the demarcating characters; if it doesn't, it likely isn't the abbreviation). Shadowjams (talk) 08:33, 11 August 2010 (UTC)
 * "i.e" could exist in a list. For example:
 * List of Latin abbreviations:
 * c.
 * e.g.
 * etc.
 * i.e
 * McLerristarr (Mclay1) (talk) 10:02, 11 August 2010 (UTC)
 * I don't think that's at all likely.--BillFlis (talk) 11:26, 11 August 2010 (UTC)
 * It's more likely than "i.e" not being related to "i.e." McLerristarr (Mclay1) (talk) 12:33, 11 August 2010 (UTC)
 * I meant that it's so unlikely that it's not worth making a rule here for. An error in a far-fetched list like that is less likely than someone trying to type "Ile" or "ile" and accidentally hitting the period key for the "l".--BillFlis (talk) 13:45, 11 August 2010 (UTC)
 * If that were true, it would have to be in a list as well, or at the end of a paragraph that is missing a full stop. I just thought that making "i.e" always correct to "i.e." no matter what followed it would only require deleting the code that specified something followed it. I don't know though, I have no idea how this thing works. Either way, what's happening about the second point? McLerristarr (Mclay1) (talk) 03:16, 12 August 2010 (UTC)
 * If "i.e" only corrects to "i.e." if followed by a space, what if "i.e" was followed by a punctuation mark such as a comma or colon? McLerristarr (Mclay1) (talk) 12:33, 13 August 2010 (UTC)
 * It will work if it's followed by a space or any of these characters (in bold): ' :, ; ) & -. My reason for writing it this way was to avoid situations where ie might be used in some different, but correct way. I don't remember what exactly prompted that, maybe I found something testing or maybe I was being overly cautious. It's also important that rules don't catch correct versions of the words, and this helps with that, although you could do it other ways too. Shadowjams (talk) 19:27, 13 August 2010 (UTC)
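
The behaviour described in this thread can be reproduced with a Python port of the Perl one-liner above. This is a sketch: the pattern is the same and only the replacement syntax differs; the fix helper is a made-up name.

```python
import re

# Same pattern as the Perl one-liner in this thread. The captured character
# must be one of: whitespace , : ; ) & -   The trailing look-behind skips
# ".ie" domain endings.
IE = re.compile(r"\bi(?:\.?e|e\.)([\s,:;\)&-])(?<!\.ie.)")


def fix(text):
    return IE.sub(r"i.e.\1", text)


for sample in ["i.e", "i.e ", "ie, that is", "ie. that is",
               "i.e. that is", "see example.ie for details"]:
    print(repr(sample), "->", repr(fix(sample)))
```

A bare "i.e" with nothing after it is left alone because the pattern requires one of the listed following characters, and text that is already "i.e." never matches.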

Metres per seconds?
I use find (\d)(\s)?m/s, which I replace with $1&nbsp;m/s. Hasn't caused me any problems so far. Headbomb {talk / contribs / physics / books} 23:33, 10 August 2010 (UTC)
 * Doesn't AWB already do that internally? I see what you're doing... you're adding spaces in those conversions. If you wanted to expand that rule though you could do: "(\b\d+)\s*m(etere?s)?(/| per | a )s(econd)\b" and replace it with "$1 m/s", although that's more expensive. Shadowjams (talk) 23:46, 10 August 2010 (UTC)
 * I have a few bucks here....--BillFlis (talk) 02:14, 11 August 2010 (UTC)
 * I've no idea why you'd want to clutter the regex that way, but I ain't the AWB guru, so what do I know. Use whatever works, I'll be happy with it. Also this should just cover the symbols, and not the words "metres/second", the point is to add the non-breaking space in before m/s. Headbomb {talk / contribs / physics / books} 07:10, 11 August 2010 (UTC)
 * Your first version's better than my convoluted second. Shadowjams (talk) 08:07, 11 August 2010 (UTC)

Any updates on this? Headbomb {talk / contribs / physics / books} 16:02, 13 August 2010 (UTC)
 * Added to AWB general fixes:  support m/s as an SI unit for non-breaking space insertion. Rjwilmsi  11:34, 18 August 2010 (UTC)
 * Excellent, thanks! Headbomb {talk / contribs / physics / books} 02:31, 19 August 2010 (UTC)
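
For reference, the original find-and-replace from the start of this thread behaves as follows in Python. This is a sketch of that one regex, not of what AWB's general fixes do internally; add_nbsp is a made-up name.

```python
import re

# Find string from the first comment in this thread; the replacement inserts
# a non-breaking space entity between the last digit and "m/s".
M_PER_S = re.compile(r"(\d)(\s)?m/s")


def add_nbsp(text):
    return M_PER_S.sub(r"\1&nbsp;m/s", text)


for sample in ["a speed of 10 m/s", "a speed of 10m/s", "a speed of 10&nbsp;m/s"]:
    print(add_nbsp(sample))
```

Text that already has the entity is not touched, so the replacement is safe to run repeatedly.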

Phrases
Is it entirely a good idea to correct the phrases at the bottom of the project page? If they were part of a quote, they would not need a sic tag since they are technically not incorrect. An editor may not notice they have corrected something that should not have been corrected.  McLerristarr  |  Mclay1  23:40, 17 August 2010 (UTC)
 * When typo fixing all editors have to look out for untemplated quoted material. For such situations if there are problems sic can be used in hidden mode. Rjwilmsi  11:27, 18 August 2010 (UTC)

Fixing decent --> descent
This is a surprisingly common misspelling, in phrases like ".. he is of Asian decent .." , but obviously isn't suitable for a general typo fix. However, I think a regex to pick up anything of the form. (where U represents an uppercase character) would find most of them without any false positives. My regex skills aren't up to it though - could someone more knowledgeable add this to the list? Colonies Chris (talk) 11:08, 18 August 2010 (UTC)
 * I'll do a database scan for this one first, and if it goes well I'll add it as a new rule. Rjwilmsi  11:23, 18 August 2010 (UTC)
 * ✅ New rule added (~140 matches in database scan). Rjwilmsi  14:12, 18 August 2010 (UTC)

Propellor
Can somebody please add 'propellor' to change to 'propeller'.  McLerristarr  |  Mclay1  23:38, 17 August 2010 (UTC)


 * According to Merriam-Webster, "propellor" is an alternative spelling.--BillFlis (talk) 03:44, 18 August 2010 (UTC)
 * Wiktionary says propeller is 'more correct'. On that basis I'd say it's fair to add it. Rjwilmsi  11:25, 18 August 2010 (UTC)
 * Are we sure it's not a WP:ENGVAR issue? – xeno talk 14:15, 18 August 2010 (UTC)
 * I think not an ENGVAR issue – the Concise OED doesn't identify the two variations as being so. Rjwilmsi  15:51, 18 August 2010 (UTC)
 * Alright, thanks. – xeno talk 15:52, 18 August 2010 (UTC)
 * So you're going to follow the guidance of a single person at wiktionary who says it's "considered more correct by most authorities" (without a reference to even a single "authority") instead of Merriam-Webster and the OED? Maybe you want to check back with that wiktionary person first.--BillFlis (talk) 01:48, 19 August 2010 (UTC)
 * The full online OED lists propellor as "nonstandard". Rjwilmsi  09:44, 19 August 2010 (UTC)
 * The free Oxford online dictionary says "Propeller can also be spelled propellor: both are correct, but propeller is much more common."  McLerristarr  |  Mclay1  11:09, 19 August 2010 (UTC)

masturbatch
The "masturbate"-rule, <Typo word="Masturbate" find="\b(M|m)asterbat(\w+)\b" replace="$1asturbat$2" /> , tried to change masterbatch to masturbatch. I found "masterbatch" on five pages. Is that enough to add an exception? I'm not quite sure how to do that myself.--ospalh (talk) 11:57, 20 August 2010 (UTC)
 * Fixed.--BillFlis (talk) 13:38, 20 August 2010 (UTC)
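
The thread does not say how the exception was implemented. One common way to carve out "masterbatch" is a negative lookahead, sketched here in Python (AWB itself uses .NET regexes, where the replacement would be written `$1asturbat$2`). The `(?!ch)` part is an assumption for illustration, not necessarily the fix that was applied.

```python
import re

# The quoted rule plus a hypothetical (?!ch) lookahead that skips "masterbatch".
MASTURBATE_RULE = re.compile(r"\b(M|m)asterbat(?!ch)(\w+)\b")


def fix_masterbat(text: str) -> str:
    """Apply the sketched rule; words beginning 'masterbatch' are left untouched."""
    return MASTURBATE_RULE.sub(r"\1asturbat\2", text)


print(fix_masterbat("masterbate"))
print(fix_masterbat("a colour masterbatch for plastics"))
```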

Commemorate
"<Typo word="Commemorate" find="\b(C|c)ommerat(es|ed|ing|ions?)\b" replace="$1ommemorat$2" /> ": Is "commerates" &c. really the most common misspelling? I thought things like "comemorate" (one m before e) or "comemerate" (e instead of o) would be more common. "<Typo word="Commemorate" find="\b(C|c)om{1,2}e(?:mo|me)?rat(e|es|ed|ing|ions?)\b" replace="$1ommemorat$2" />" would find all of these, but would also change "comerates" to "commemorates". "Comerates" is a bit too close to "Comrades" for my taste. So, "<Typo word="Commemorate" find="\b(C|c)om{1,2}e(?:mo|me)rat(e|es|ed|ing|ions?)\b" replace="$1ommemorat$2" />" would fix "comemorate" and "commemerate", but not "comerates". Any thoughts?--ospalh (talk) 11:52, 25 August 2010 (UTC)
 * (Note to self: research before you type) Looks like a) "commerates" etc. is somewhat common, but b) there seems to be an actor called "Sheridan Comerate", so 'find="\b(C|c)om{1,2}e(?:mo|me)?rat(e|es|ed|ing|ions?)\b"' would give some false positives and 'find="\b(C|c)om{1,2}e(?:mo|me)rat(e|es|ed|ing|ions?)\b"' would miss some misspellings.--ospalh (talk) 12:01, 25 August 2010 (UTC)
 * We can use a lookbehind to specifically exclude "Comerate", so what then is the best rule? Rjwilmsi  07:06, 26 August 2010 (UTC)
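
A minimal sketch of the lookbehind idea, combining the broader pattern proposed above with an exclusion for the surname "Comerate". It is written in Python, whose lookbehind must be fixed-width (which this one is); the exact placement of the lookbehind is an assumption, and the thread does not record which rule was finally chosen.

```python
import re

# Broad pattern from the discussion, followed by a hypothetical lookbehind
# that rejects a match whose last eight characters are exactly "Comerate".
COMMEMORATE_RULE = re.compile(
    r"\b(C|c)om{1,2}e(?:mo|me)?rat(e|es|ed|ing|ions?)\b(?<!Comerate)"
)


def fix_commemorate(text: str) -> str:
    """Normalise the misspellings discussed above to 'commemorat...'."""
    return COMMEMORATE_RULE.sub(r"\1ommemorat\2", text)


for sample in ["commerates", "comemorate", "commemerate", "Sheridan Comerate"]:
    print(sample, "->", fix_commemorate(sample))
```

Note that this sketch still rewrites lower-case "comerates", the case described above as being too close to "comrades", so it only addresses the proper-name false positive.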

Double superlatives
Is this worth it? The most common matches seem to be "most earliest", "most holiest", and "most costliest" (not necessarily in that order). PleaseStand (talk) 19:15, 25 August 2010 (UTC)
 * Is there a point to deleting it? Unless there is an exception to the rule, I don't see a reason not to include something.  McLerristarr /  Mclay1  13:50, 26 August 2010 (UTC)
 * As far as I know, the typo rule does not exist yet. My question is whether it is worth adding. PleaseStand (talk) 17:41, 26 August 2010 (UTC)
 * I don't know how common it is, but I think that kind of fix is legitimate subject matter for the typo fixes. Shadowjams (talk) 19:47, 26 August 2010 (UTC)
 * Ah, I see. I thought you meant is it worth keeping, as in you wanted to delete it. My mistake. One of the many problems of communicating by text.  McLerristarr /  Mclay1  09:28, 27 August 2010 (UTC)
 * It's not always going to work as intended: When "Most" is capitalized, the adjective after correction will not be (will remain as it was). I would leave out the "M"; the error will probably be preceded by "the" anyway.--BillFlis (talk) 13:10, 27 August 2010 (UTC)
 * You could remove the "l" and catch things like "most greediest" too. Shadowjams (talk) 00:35, 28 August 2010 (UTC)

Melbourne
Can somebody please correct 'Melbounre' to 'Melbourne'?  McLerristarr /  Mclay1  06:31, 24 August 2010 (UTC)
 * I'll let others opine on if there's some risk of a false positive, but this should do it:  <find="\b(M|m)elbo(rn|unr)e\b" replace="Melbourne" /> . It should catch "Melborne" and "Melbounre" and will capitalize any lower case versions. Shadowjams (talk) 07:01, 24 August 2010 (UTC)
 * Not safe to correct missing 'u' due to Melborne Camp and Melborne surname. Rjwilmsi  12:41, 28 August 2010 (UTC)
 * ✅ Fixed version done here. Shadowjams (talk) 20:12, 28 August 2010 (UTC)
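
The fixed rule itself is not quoted in the thread. A narrower pattern that respects the objection about Melborne Camp and the Melborne surname might look like the following Python sketch; the pattern is an assumption for illustration only.

```python
import re

# Hypothetical narrowed rule: only the transposed "Melbounre" is corrected,
# so "Melborne" (a real place name and surname) is never touched.
MELBOURNE_RULE = re.compile(r"\b[Mm]elbounre\b")


def fix_melbourne(text: str) -> str:
    """Correct 'Melbounre'/'melbounre' to 'Melbourne'."""
    return MELBOURNE_RULE.sub("Melbourne", text)


print(fix_melbourne("born in melbounre, Victoria"))
print(fix_melbourne("stationed at Melborne Camp"))
```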

Phenonema --> Phenomena
Could someone please add the plural form of Phenomenon? It should be "phenomena" but a very common misspelling is "phenonema", with only two letters, the n and the m, switched around, making it very hard to spot. There's also a fairly large number of search results in Wikipedia for this misspelling. I checked the current entry for "Phenomenon" in the list, and I do believe it does not take into account this particular misspelling of the plural form. -- &oelig; &trade; 23:14, 29 August 2010 (UTC)
 * ✅ BillFlis updated the rules. Rjwilmsi  08:05, 30 August 2010 (UTC)

Do we want to hide italics from typo fixing?
For a feature request I added the capability for AWB to hide text in italics as part of its  function ('Ignore templates, refs, link targets...'). Do we want hiding of italics on or off for typos? We already hide untemplated quotes (text between " and related curly quotes). Rjwilmsi  09:01, 30 August 2010 (UTC)
 * Sometimes we use italics to emphasise a word or a sentence. Italics are used for many reasons. Typo fixing should apply inside italics exactly the same way it applies outside them. -- Magioladitis (talk) 09:03, 30 August 2010 (UTC)


 * Was the original concern over foreign and proper terms (like book/movie titles) or is there something else I'm not thinking of? Shadowjams (talk) 18:23, 30 August 2010 (UTC)
 * Italics hiding was added for a feature request. We now have the option to apply it for typo fixing or not. Rjwilmsi  08:11, 31 August 2010 (UTC)
 * I see. I tend to agree with Magioladitis on this point, there're a lot of these that fit within typo territory, but perhaps it cuts down on false positives. Just something to be aware of, it's obviously not an ideological issue. Shadowjams (talk) 08:51, 31 August 2010 (UTC)

Catepillar → Caterpillar
Could someone please update the entry for Caterpillar to also fix the incorrect "Catepillar" (missing the first "r")? GoingBatty (talk) 03:55, 1 September 2010 (UTC)
 * ✅ - EdoDodo  talk 16:57, 4 September 2010 (UTC)

Apostrophe fix contested
I changed series's to series' using AWB. It was subsequently reverted. Does the rule need to be removed or edited? -- JHunterJ (talk) 12:13, 5 September 2010 (UTC)


 * I think the rule, and your fix, is correct, since the phrase is going to be pronounced "the seeriz antagonist", not "the seeriziz antagonist". The advice at Apostrophe is not at all clear, though. -- John of Reading (talk) 13:24, 5 September 2010 (UTC)
 * The guideline is laid out here: APOSTROPHE. If you pronounce "series'[s] antagonist" as "sireez antagonist", then Wikipedia says not to use the additional s. On the other hand, it says if there are two possible pronunciations, you can use either. I definitely pronounce the phrase "series's antagonist" as "sireeziz antagonist". — the Man in Question (in question)  17:07, 5 September 2010 (UTC)
 * If that's the guideline then the rule should be removed. It was added by on 4th August 2008 apparently without any discussion on this talk page. I've pinged that user's talk page. -- John of Reading (talk) 21:01, 5 September 2010 (UTC)
 * I've removed the rule. Per the guidelines on apostrophes, both versions are potentially correct, as long as usage is consistent (with the 's, without the 's, or with the 's if pronounced as iz) on a given article. -- JHunterJ (talk) 11:29, 6 September 2010 (UTC)

specail -> special
Manually fixed one here. Regards, SunCreator (talk) 18:33, 5 September 2010 (UTC)
 * ✅ - EdoDodo  talk 14:09, 6 September 2010 (UTC)

Besancon
I think a false positive here, AWB changes Besancon -> Besançon, but there is a place in France called Besançon and one in New Haven, Indiana called Besancon. Regards, SunCreator (talk) 21:44, 5 September 2010 (UTC)

Retropective → Retrospectiv
This edit changed Retropective → Retrospectiv instead of Retrospective. I've manually fixed this article, but could someone please update the rule? Thanks! GoingBatty (talk) 05:17, 12 September 2010 (UTC)
 * Fixed.--BillFlis (talk) 06:20, 12 September 2010 (UTC)
 * Thanks BillFlis - I didn't find the rule under the "R" section - should have looked under the new additions section too. GoingBatty (talk) 06:38, 12 September 2010 (UTC)

heavily, 2nd try
AWB tried to replace "heaively" with "heaively", but it should've been "heavily". Please fix. --bender235 (talk) 20:22, 3 July 2010 (UTC) (—bender235 (talk) 00:50, 13 September 2010 (UTC))
 * I can't find the rule that would make such a change, and I can't find any instances of "heaively" (or "heaivly", which seems more likely) in wikipedia. It looks like it's no longer a problem.--BillFlis (talk) 11:19, 13 September 2010 (UTC)


 * Either bender's original post has a typo, or it's replacing "heaively" with itself, which I too can't find a rule that would do. Perhaps you meant it was replacing "heavily" with "heaiviley", which would make sense given this rule: <Typo word="-ively" find="\b(\w+)ivly\b" replace="$1ively" /> . Before changing that, beware that "ively" is an equally, if not more, common version of that ending. Anyone have ideas about how to distinguish which ending is right based on the base? Shadowjams (talk) 17:40, 13 September 2010 (UTC)
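
To see what the quoted "-ively" rule does to the spellings mentioned in this thread, here is the same pattern run in Python (the replacement `$1ively` becomes `\1ively`). Only the pattern comes from the thread; the function name is illustrative.

```python
import re

# The "-ively" rule quoted above, translated to Python replacement syntax.
IVELY_RULE = re.compile(r"\b(\w+)ivly\b")


def apply_ively_rule(text: str) -> str:
    """Insert the missing 'e' in words ending in 'ivly'."""
    return IVELY_RULE.sub(r"\1ively", text)


for word in ["activly", "heaivly", "heavily"]:
    print(word, "->", apply_ively_rule(word))
```

Under this rule "heaivly" becomes "heaively", which is consistent with BillFlis's guess about the original spelling, while a correctly spelled "heavily" is not matched at all.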

Alternation vs. character classes
Hall with Schwartz calls using alternation (A|a) instead of character class [Aa] a "classic mistake" in Effective Perl Programming, and that it takes a speed penalty, perhaps on the order of 4x. Maybe the processing here has gotten smarter since then, and it does save characters when capturing, (A|a) instead of ([Aa]), but we may still want to change it back. -- JHunterJ (talk) 19:25, 13 September 2010 (UTC)
 * I'll investigate what difference, if any, there is for AWB/C#. Rjwilmsi  20:31, 13 September 2010 (UTC)
 * ISBN 0596528124 page 237 has a benchmark for .NET that lists character classes as being 4.7x faster. I don't know how old that is... but worth considering. There are probably other optimizations like this as well. Shadowjams (talk) 00:40, 14 September 2010 (UTC)
 * VB.NET, we use C#: I profiled 1000 replace operations for "\b(R|r)ec(?:ie|ei?)pient(s?)\b" and "\b([Rr])ec(?:ie|ei?)pient(s?)\b" (details on request) and the numbers were 13463 and 12860 ms respectively i.e. around a 5% difference only. So I conclude there's not much difference for C#. We cannot take a 4x or 5x difference in another language and assume it applies for ours. Rjwilmsi  20:54, 14 September 2010 (UTC)
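
A comparable measurement can be sketched in Python, with the same caveat made above: timings from one regex engine say nothing reliable about another, so this only shows the method, not a result for AWB. The two patterns are the ones quoted; the sample text and the replacement string `\1ecipient\2` are assumptions.

```python
import re
import timeit

# The two patterns profiled above; only the first group differs.
ALTERNATION = re.compile(r"\b(R|r)ec(?:ie|ei?)pient(s?)\b")
CHAR_CLASS = re.compile(r"\b([Rr])ec(?:ie|ei?)pient(s?)\b")

# Assumed replacement and sample text, for illustration only.
REPLACEMENT = r"\1ecipient\2"
SAMPLE = "The reciepient thanked the other Receipients. " * 200


def apply_rule(rule: re.Pattern, text: str) -> str:
    """Run one typo rule over the text."""
    return rule.sub(REPLACEMENT, text)


if __name__ == "__main__":
    for name, rule in [("(R|r)", ALTERNATION), ("([Rr])", CHAR_CLASS)]:
        seconds = timeit.timeit(lambda: apply_rule(rule, SAMPLE), number=50)
        print(f"{name}: {seconds:.4f} s for 50 runs")
```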

Opiod --> Opioid
Very common misspelling, hard to spot. Please add, thanks. -- &oelig; &trade; 07:16, 14 September 2010 (UTC)


 * Wow that is common. Added a rule here. I looked around in a few dictionaries thinking it might be an alternative spelling just based on how common it is, but I couldn't find anything. ✅ Shadowjams (talk) 15:28, 14 September 2010 (UTC)

Sargent's cypress
I had typo fixing switched on. It made this error. It is a false positive for Sargent's cypress or Sargent cypress Regards Lightmouse (talk) 09:56, 14 September 2010 (UTC)
 * ❌ Only an error as the article incorrectly had the word in lower case. Rjwilmsi  21:06, 14 September 2010 (UTC)

Thanks for investigating it. Lightmouse (talk) 21:47, 14 September 2010 (UTC)

km/kg corrections OK, but summary incorrect
This edit correctly changed "67 Kg" and "800 Km" to "67 kg" and "800 km". However, the edit summary reads (Typo fixing, typos fixed: 7 Kg → 7 kg (2) using AWB). Anyone want to try updating the rule to make the edit summary better? Thanks! GoingBatty (talk) 04:49, 14 September 2010 (UTC)
 * One could make the summary more accurate by putting a quantifier (+ in this case) on the \d in the rule, but that would increase the time (infinitesimally, albeit) the regex runs across every page scanned. It probably doesn't matter either way; if you want to put it in there that's how one would do it. Shadowjams (talk) 05:48, 14 September 2010 (UTC)
 * Actually, on second look, that's not a Typo rule, that's a built-in program rule. I'm guessing that internal rule uses regex too though, so the same applies. Shadowjams (talk) 05:51, 14 September 2010 (UTC)
 * Typo rule is for Kg to kg (case conversion). Rjwilmsi  07:22, 14 September 2010 (UTC)
 * I see now. Shadowjams (talk) 16:45, 14 September 2010 (UTC)
 * So should I move this from this talk page to a bug report? GoingBatty (talk) 16:34, 14 September 2010 (UTC)
 * No, it is a typo issue. My second point was wrong (Rjwilmsi was correcting me). I was confused because I was looking for a rule that would add &nbsp to the output, and there isn't a rule that did that (that part is internal). However, there is a rule that did the capitalization, and updating that, would fix the OP's issue. It's this one: <Typo word="kg/km (kilogram/kilometer)" find="(\d(?:\s| |-)?)K(g|m)\b" replace="$1k$2" />.


 * Change it to <Typo word="kg/km (kilogram/kilometer)" find="(\d+(?:\s| |-)?)K(g|m)\b" replace="$1k$2" /> and you've fixed the issue (see above for speed considerations). Shadowjams (talk) 16:45, 14 September 2010 (UTC)
 * All of the rules have been updated with the +. Now I see in this edit that AWB accurately changed "16KHZ" → "16  kHz", but the edit summary says: (Typo fixing, typos fixed: 16KHZ → 16kHz using AWB) (without the space) GoingBatty (talk) 03:27, 17 September 2010 (UTC)
 * Also this edit changed "710 KHz" and "970 KHz" to "710 kHz" and "970 kHz", but the edit summary is (Typo fixing, typos fixed: 710 KHz → 710 kHz (2) using AWB) GoingBatty (talk) 03:53, 17 September 2010 (UTC)
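
The effect of the `+` on the reported match can be reproduced with the two Kg/Km patterns quoted above, here run in Python rather than .NET. The corrected text is the same either way; only the matched span, which is presumably what feeds the edit summary, changes.

```python
import re

# The Kg/Km rule as quoted above, before and after adding the + quantifier.
WITHOUT_PLUS = re.compile(r"(\d(?:\s| |-)?)K(g|m)\b")
WITH_PLUS = re.compile(r"(\d+(?:\s| |-)?)K(g|m)\b")


def matched_text(rule: re.Pattern, text: str) -> str:
    """Return the span the rule matches, or an empty string if none."""
    found = rule.search(text)
    return found.group(0) if found else ""


text = "it weighs 67 Kg"
for rule in (WITHOUT_PLUS, WITH_PLUS):
    print(repr(matched_text(rule, text)), "->", rule.sub(r"\1k\2", text))
```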

Supress --> Suppress
Another very common misspelling (over 2000 search results!) Including supressed/supressing/supression and whatever other prefixes there are. I'm surprised this one wasn't in there already..

Actually I did find "(Immuno)Suppress" in the list, but that doesn't seem correct.. it's already got the double-p, so maybe that's just a mistake? or what, but I don't know if maybe the (Immuno) part is affecting the detection somehow too.

Opress --> Oppress is another one we could add, that one is a bit less common but still coming up in search results. Except that the search results come up with the false positive "of-press" for some reason, which is slightly annoying, but I don't think that would affect AWB's typo detection anyway. -- &oelig; &trade; 22:50, 15 September 2010 (UTC)
 * The existing "(Immuno)Suppress" rule already covers all of the suppress variations you've listed. Rule expanded for oppress too. Rjwilmsi  09:32, 16 September 2010 (UTC)


 * Oh! okay. These regexes still confuse me. :) But, is it normal for there to still be so many existing misspellings? I thought that once a typo gets added to the list they usually all get fixed pretty quickly.. Is it just that no one has patrolled these articles yet with AWB? -- &oelig; &trade; 17:05, 16 September 2010 (UTC)
 * The WP:TYPOSCAN project should go through these regularly but it's waiting for new data at the moment. Rjwilmsi  17:10, 16 September 2010 (UTC)

achitecture → architecture
Could someone please update the existing entry for "architecture" so it also catches "achitecture"? Thanks! GoingBatty (talk) 01:53, 17 September 2010 (UTC)
 * I modified the rule for "Architect" to catch this.--BillFlis (talk) 08:55, 18 September 2010 (UTC)

Inconsistent use of formats such as '(C|c)' and '[Cc]'. Propose change all to '[Cc]'
The list is inconsistent in whether the regex uses '(C|c)' or '[Cc]'. I propose changing them all to the format '[Cc]'. It's trivial but using the same format makes it slightly easier to notice the real differences. Any objections? Lightmouse (talk) 15:15, 17 September 2010 (UTC)


 * They are not equivalent. "(C|c)" is equivalent to "([Cc])". Also, I know there was some discussion about speed, but a more important consideration might be space. This page is already huge, and changing every instance of this would add another character to each of the affected rules, which is the large majority of them.--BillFlis (talk) 18:54, 17 September 2010 (UTC)

You're quite right, the pairings are '(C|c)' with '([Cc])', or '(?:C|c)' with '[Cc]'. I agree that compact code is a good thing. I'll leave it to you. Incidentally, I'm sure there are more units of measure that would be useful, also I only see one square unit of length and there could be cubes too. Lightmouse (talk) 20:23, 17 September 2010 (UTC)


 * Bill sums up the issue exactly. I can see positives to both. In some ways I think ([Cc]) is conceptually clearer, but that's a personal preference. I made the changes to all of the New additions thinking the speed tradeoff was more important than later testing demonstrated. There is 1 character difference between the two; I don't see any reason to prefer one over the other. I think it's best to leave them as they're originally created, with whatever idiom the creator chooses. Shadowjams (talk) 21:58, 17 September 2010 (UTC)
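
The pairing described above (capturing versus non-capturing forms) can be checked by counting capture groups. This is a small Python sketch; the helper name `capture_count` is illustrative, and group counting works the same way in the .NET flavour for these four forms.

```python
import re


def capture_count(pattern: str) -> int:
    """Number of capturing groups in a pattern, i.e. how many $1, $2... it provides."""
    return re.compile(pattern).groups


for form in ["(C|c)", "([Cc])", "(?:C|c)", "[Cc]"]:
    print(f"{form:8} capturing groups: {capture_count(form)}")
```

Only the first two forms provide a group for the replacement string to refer to, which is why a rule using `$1` cannot simply swap `(C|c)` for `[Cc]`.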

Units of measure
There is km². Would it also be possible to do km³, m², m³, ft², ft³ ? Lightmouse (talk) 08:50, 18 September 2010 (UTC)

etc... → etc.
Could the Etc. rule be changed so that it would also remove extra periods? (e.g. change "etc..." → to "etc.") Thanks! GoingBatty (talk) 02:44, 17 September 2010 (UTC)
 * I think this should do it. Shadowjams (talk) 03:20, 17 September 2010 (UTC) ✅
 * I think you're on the right track. According to the AWB Regex Tester, that will fix "ect...." (which is great), but not "etc....."  GoingBatty (talk)
 * Ah. That makes sense. Ok, one more try.... Shadowjams (talk) 04:44, 17 September 2010 (UTC)
 * See if that did it. Shadowjams (talk) 04:46, 17 September 2010 (UTC)
 * Sorry - tried the AWB Regex Tester, and it still doesn't fix "etc...." or "etc" (with no periods) GoingBatty (talk) 16:23, 17 September 2010 (UTC)
 * I took another look. What it's doing is it's looking for anything with an "Etc" followed by something that's not either a period or a word character (0-9,a-z). In the case of "etc....." it's skipping it because there's already a period, and not looking at the rest. This is intentional for two reasons. One, it terminates the search early on correct matches (which are the majority) and saves processing time, and second, it allows for unanticipated but correct uses, like an ellipsis. It not fixing "etc" is related... because there's nothing following the c, it doesn't catch. However, in a real article etc won't be alone. It will be followed by something: "etc more words". This sometimes comes up in testing. We try to design rules so they don't catch on correct spellings (even if they correct them back to themselves) because I assume they take more processing (they run entirely, as opposed to stopping midway through). Maybe that's unnecessary, but most of the rules adhere to that format. Shadowjams (talk) 22:10, 17 September 2010 (UTC)
 * I appreciate your reply. I made this request because I thought that "etc." plus an ellipsis was not a correct use.  Why would an ellipsis be necessary?  Thanks!  GoingBatty (talk) 15:26, 19 September 2010 (UTC)
 * That's a good point. I tended towards the cautious with some of these when I started, and I added the etc. rule that's currently in use (although there was a simpler one earlier) earlier on. I think the change you're talking about would be fine. Shadowjams (talk) 05:12, 20 September 2010 (UTC)
 * Thanks Shadowjams. I was playing around with how to edit the rule to fix "etc....", but couldn't get it to skip "etc."  Could you please help me with this?  Thanks!  GoingBatty (talk) 17:07, 20 September 2010 (UTC)
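
One way to collapse extra periods while skipping a correct "etc." is to require at least two periods before the rule fires. The following Python sketch shows that idea; it is an assumption for illustration, not the Etc. rule on the project page, and it does not handle "ect", a bare "etc" or the single-character ellipsis.

```python
import re

# Hypothetical rule: match "etc" only when followed by two or more periods,
# so a correctly punctuated "etc." never matches.
ETC_EXTRA_PERIODS = re.compile(r"\b([Ee])tc\.{2,}")


def collapse_etc_periods(text: str) -> str:
    """Reduce 'etc..', 'etc...', 'etc....' and so on to 'etc.'."""
    return ETC_EXTRA_PERIODS.sub(r"\1tc.", text)


for sample in ["apples, pears, etc....", "apples, pears, etc.", "Etc..."]:
    print(repr(sample), "->", repr(collapse_etc_periods(sample)))
```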

Should regex be using an escape character.
I notice that square kilometre contains: [-.\s]

Should it be: [-\.\s]

Regards Lightmouse (talk) 16:46, 19 September 2010 (UTC)


 * I don't think you need to escape characters inside character classes (says as much). Shadowjams (talk) 21:01, 19 September 2010 (UTC)


 * There's another problem with that though. The - needs to be at the end of the class, otherwise it's looking for a range. I'm not sure what it does in that case, but it might explain any strange effects you're seeing. Shadowjams (talk) 21:02, 19 September 2010 (UTC)


 * No, a hyphen immediately after a "[" counts as a literal hyphen. -- John of Reading (talk) 06:13, 20 September 2010 (UTC)
 * Interesting. That's actually a little new... it doesn't work with grep for instance. Perl calls this version 8 regex (I think). Apparently - at either the beginning or end is fine, but in the middle, of course, it's ambiguous. Shadowjams (talk) 06:17, 20 September 2010 (UTC)

Aha - "the dot is not a metacharacter inside a character class, so we do not need to escape it with a backslash.". Very interesting, thanks. Lightmouse (talk) 17:15, 20 September 2010 (UTC)
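
Both points in this thread (the unescaped dot and the leading hyphen) can be checked directly. The sketch below uses Python's `re` module, which treats these two character classes the same way as described above; the helper name is illustrative.

```python
import re

# The class as it appears in the square kilometre rule, and the escaped variant.
UNESCAPED = re.compile(r"[-.\s]")
ESCAPED = re.compile(r"[-\.\s]")


def matches_both(char: str) -> tuple:
    """Whether each class matches the single character given."""
    return (bool(UNESCAPED.fullmatch(char)), bool(ESCAPED.fullmatch(char)))


for char in ["-", ".", " ", "a", "5"]:
    print(repr(char), matches_both(char))
```

The hyphen, the dot and the space match both classes, while "a" and "5" match neither, so the dot is not acting as a wildcard and the leading hyphen is not starting a range.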

Not fixing "hungarian" ?
Although there's an existing rule for "Hungary" that includes "Hungarian", it doesn't want to fix "hungarian" and "hungarians" in Culture of Hungary. When I tried the rule in the AWB Regex tester, it seems to work fine. Any ideas? GoingBatty (talk) 04:22, 20 September 2010 (UTC)
 * Typo fixing rules are not applied when a wikilink target also matches on the typo rule in order to avoid false positives on uncommon names etc. In this case there's an image linked in the article with a lowercase 'hungarian' in the file name, hence the typo fix is not applied. From looking at the Commons:File Renaming page it would appear that asking for the file to be renamed might be refused. I've now applied the typo fixing to the article. Feel free to try to get the image renamed. Rjwilmsi  16:29, 23 September 2010 (UTC)
 * Thanks for the explanation - having an example makes it more clear than the manual, but I'll try to be more diligent about reading the manual first. GoingBatty (talk) 01:53, 24 September 2010 (UTC)

criticized
AWB replaced "critiziced" with "criticiziced" here, but it should have been "criticized". Please fix. —bender235 (talk) 14:07, 23 September 2010 (UTC)
 * I limited the rule for "Critical", which was evidently making this change, to not make this particular change. We'll need a new rule to correct "critiziced" to "criticized", which I was surprised to find has more than a dozen occurrences on wikipedia.--BillFlis (talk) 16:22, 23 September 2010 (UTC)
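
A separate rule for the transposed spelling could be as narrow as the following Python sketch. The pattern and the set of endings are assumptions for illustration, not a rule taken from the project page.

```python
import re

# Hypothetical rule for the z/c transposition: "critizic" + ending -> "criticiz" + ending.
CRITIZICED_RULE = re.compile(r"\b([Cc])ritizic(e[ds]?|ing)\b")


def fix_critiziced(text: str) -> str:
    """Correct 'critiziced', 'critizices', 'critizicing' and similar forms."""
    return CRITIZICED_RULE.sub(r"\1riticiz\2", text)


for word in ["critiziced", "Critizicing", "criticized"]:
    print(word, "->", fix_critiziced(word))
```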