Wikipedia:Reference desk/Archives/Computing/2022 May 27

= May 27 =

== Regex ==
With regular expressions (I am using the Perl variety), how do you capture a character and then match any character except that one?

I tried /(.)[^\1]/ but apparently backreferences don't work inside character classes. I tried to match (say) "banana" with something like /(.)(.)([^\1])\2\3\2/, which doesn't work. -- SGBailey (talk) 06:43, 27 May 2022 (UTC)
 * Backreferences like \1 are not recognized inside a character class [ ... ]; there, \1 is read as a character escape rather than as a reference to group 1. That is why it doesn't work. Unfortunately, that is the limit of my regex knowledge. 97.82.165.112 (talk) 14:44, 27 May 2022 (UTC)
 * I haven't got a complete answer yet, but (.)(?!\1) will match the first character not followed by the same character. I picked that up from here: "Negative lookahead is indispensable if you want to match something not followed by something else." So (.)(?!\1). matches a character and then any character except that one.
 * OK, I think a parallel of what you tried to write for "banana" is (.)(.)(?!\1)(.)\2\3\2
 * This says "some first character, some second character not followed by a repeat of the first, some third character," and then the \2\3\2 repeats characters 2, 3 and 2 again. This will match banana, but will also match baaaaa. Note that (?!\1) is not a capturing group: it is a zero-width assertion, so it consumes no characters and you can't refer back to it (it's not \3). It just checks, at the position right after the second character, that the next character is not a repeat of the first.
 * Then there's the variation (.)(.)(?!\2)(.)\2\3\2
 * This says "some first character, some second character not followed by itself, some third character," then second, third, second again. So it matches banana, bbnbnb and bababa.
 * My super ultimate banana-matcher in its final form is (.)(?!\1)(.)(?!\1|\2)(.)\2\3\2, which requires all three captured characters to differ from one another. Card Zero (talk) 14:44, 27 May 2022 (UTC)
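
The patterns in the replies above can be checked with a short script. This is only a sketch, written in Python rather than Perl on the assumption that Python's re module treats backreferences and negative lookahead the same way for these particular patterns. The names PATTERNS, WORDS and matching_words are made up for illustration, and re.fullmatch is used so that each test word has to match as a whole.

```python
import re

# Patterns quoted from the replies above; the dictionary keys are
# illustrative labels, not terminology from the thread.
PATTERNS = {
    "third_not_first": r"(.)(.)(?!\1)(.)\2\3\2",
    "third_not_second": r"(.)(.)(?!\2)(.)\2\3\2",
    "all_three_differ": r"(.)(?!\1)(.)(?!\1|\2)(.)\2\3\2",
}

# Test words mentioned in the thread.
WORDS = ["banana", "baaaaa", "bbnbnb", "bababa"]


def matching_words(pattern, words=WORDS):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]


if __name__ == "__main__":
    for label, pattern in PATTERNS.items():
        print(label, pattern, matching_words(pattern))

    # "A character, then any character except that one":
    two_chars = r"(.)(?!\1)."
    print(bool(re.fullmatch(two_chars, "ab")))  # lookahead allows "b" after "a"
    print(bool(re.fullmatch(two_chars, "aa")))  # lookahead rejects a repeat

    # Inside [...] the \1 is not a backreference, so a repeated
    # character is still accepted by the original attempt.
    print(bool(re.fullmatch(r"(.)[^\1]", "aa")))
```

Only the last pattern rejects every test word except banana. The first one also lets through bbnbnb, because it only compares the third character with the first and never compares the first two with each other.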

Thx -- SGBailey (talk) 06:21, 29 May 2022 (UTC)