User:MarkMYoung/regular expressions

Useful Regular Expressions
Incomplete or incorrect regular expressions are scattered all over the Internet, and books are reluctant to list them because the author would likely have to publish errata at some point. So I am compiling a list of regular expressions here (these may also be incorrect, but they are at least in one place). One decent source is the  module available from CPAN. However, I was prompted to maintain this page when I discovered 's (v2.120) regular expression for a decimal IPv4 address unit of   was incorrect: it accepts '05' as a decimal IP unit (a leading zero conventionally marks a number as octal), and the module does not have an IPv6 regular expression.
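To illustrate the leading-zero problem described above, here is a minimal Python sketch of a decimal IPv4 unit that accepts 0 through 255 but rejects forms such as '05'. It is not the CPAN module's expression; the names `OCTET`, `IPV4`, and `is_ipv4` are made up for this example.

```python
import re

# One decimal unit, 0-255. The final alternative allows a single digit
# or a two-digit number that starts with 1-9, so "05" cannot match.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

# Four units separated by literal dots.
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")


def is_ipv4(text: str) -> bool:
    """Return True if the whole string is a dotted-decimal IPv4 address."""
    return IPV4.fullmatch(text) is not None
```

Using `fullmatch` means the whole string must match, which suits validation. Searching inside longer text would need explicit boundaries around the expression instead.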

CSV
This does not remove the double quotes around quoted fields, which are superfluous once the fields have been separated.
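As a rough illustration of that behaviour, the Python sketch below splits one CSV line into fields and leaves the surrounding double quotes in place. The pattern is an assumption made for this example, not necessarily the expression this section refers to, and `FIELD` and `split_csv_line` are hypothetical names. It does not handle quoted fields that span several lines.

```python
import re

# A field starts at the beginning of the line or after a comma. It is
# either a quoted string (where "" stands for a literal quote) or a run
# of characters containing no comma.
FIELD = re.compile(r'(?:^|,)("(?:[^"]|"")*"|[^,]*)')


def split_csv_line(line: str) -> list[str]:
    """Return the fields of one CSV line, quotes left untouched."""
    return FIELD.findall(line)
```

A quoted field such as `"b,c"` comes back with its quotes still attached, so the caller has to strip them and collapse any doubled quotes as a separate step.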

Domain Name
This regular expression merely ensures that the domain contains only valid characters and that each constituent domain (each dot-separated label) is between 1 and 63 characters long. One can use either the specific or the more general top-level domain regular expression. This regular expression excludes hyphens at the beginning, after a dot, consecutively, before a dot, and at the end. Keep in mind that something as simple as the word "a" or the text "0.7-1.2" matches as an unqualified hostname. So, this regular expression is good for validation, but not for searching. This is much better suited for searching. Here is a reasonable one-line regular expression that does not check for an overall length greater than 255 or for misplaced hyphens.
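The validation rules described in this section (labels of 1 to 63 characters, no hyphen at the start or end of a label, no consecutive hyphens, overall length of at most 255) can be sketched in Python as follows. This is an illustrative reconstruction under those stated rules, not the page's own expression. `LABEL`, `HOSTNAME`, and `is_hostname` are hypothetical names, and no top-level domain check is attempted.

```python
import re

# One label: starts and ends with a letter or digit, hyphens only in the
# middle, 1 to 63 characters in total (1 + up to 61 + 1).
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"

# One or more labels separated by dots. The lookahead rejects any
# occurrence of two consecutive hyphens anywhere in the name.
HOSTNAME = re.compile(rf"(?!.*--){LABEL}(?:\.{LABEL})*")


def is_hostname(text: str) -> bool:
    """Validate a whole string as a hostname; not meant for searching."""
    return len(text) <= 255 and HOSTNAME.fullmatch(text) is not None
```

As noted above, both "a" and "0.7-1.2" pass this check, which is why an expression of this kind works for validating a known field but is a poor choice for finding hostnames in free text. Rejecting consecutive hyphens also rejects internationalized labels that begin with "xn--".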