

Wednesday, October 10, 2007

Validation Expressions

These are often hard to find, so I am posting them here:

 

Metacharacter Match
\ the escape character - used to find an instance of a metacharacter like a period, brackets, etc.
. (period) match any character except newline
x match any instance of x
[^x] match any character except x
[x] match any instance of x in the bracketed range - [abxyz] will match any instance of a, b, x, y, or z
| (pipe) an OR operator - x|y or (x|y) will match an instance of x or y (inside square brackets, | is just a literal pipe)
() used to group sequences of characters or matches
{} used to define numeric quantifiers
{x} match must occur exactly x times
{x,} match must occur at least x times
{x,y} match must occur at least x times, but no more than y times
? preceding match is optional (zero or one occurrence), same as {0,1}
* find 0 or more of preceding match, same as {0,}
+ find 1 or more of preceding match, same as {1,}
^ match the beginning of the line
$ match the end of a line
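
To see a few of these metacharacters in action, here is a small sketch using Python's re module. The helper name full and the sample strings are only illustrations I made up; other regex engines may differ in details.

```python
import re

def full(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Quantifiers: {x}, {x,}, {x,y} and ?
results = {
    "a{3} on aaa": full(r"a{3}", "aaa"),          # exactly three
    "a{2,} on aaaaa": full(r"a{2,}", "aaaaa"),    # at least two
    "a{2,4} on aaaaa": full(r"a{2,4}", "aaaaa"),  # five is one too many
    "colou?r on color": full(r"colou?r", "color"),
}

# Escaping, negated class, alternation and line anchors
escaped_dot = re.findall(r"\.", "a.b.c")         # literal periods only
not_x = re.findall(r"[^x]", "xaxb")              # every character except x
either = re.findall(r"cat|dog", "cat dog cow")   # OR between two alternatives
line_starts = re.findall(r"^\w+", "one two\nthree four", flags=re.MULTILINE)

print(results)
print(escaped_dot, not_x, either, line_starts)
```

Note that ^ and $ only match at every line (rather than just at the start and end of the whole string) when Python is given the re.MULTILINE flag.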

POSIX Class Match
[:alnum:] alphabetic and numeric characters
[:alpha:] alphabetic characters
[:blank:] space and tab
[:cntrl:] control characters
[:digit:] digits
[:graph:] visible characters (printable characters other than space; no control characters)
[:lower:] lowercase alphabetic characters
[:print:] any printable characters, including space
[:punct:] punctuation characters
[:space:] all whitespace characters (includes [:blank:], newline, carriage return)
[:upper:] uppercase alphabetic characters
[:xdigit:] digits allowed in a hexadecimal number (i.e. 0-9, a-f, A-F)
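
Not every engine understands the POSIX names. Python's built-in re module, for one, does not support them, so a rough workaround is to spell each class out as an explicit ASCII range. The table POSIX_ASCII and the helper posix_findall below are hypothetical names of my own, and the ranges assume plain ASCII text only.

```python
import re

# Hypothetical lookup table: POSIX class name -> explicit ASCII character class.
POSIX_ASCII = {
    "alnum": r"[0-9A-Za-z]",
    "alpha": r"[A-Za-z]",
    "blank": r"[ \t]",
    "cntrl": r"[\x00-\x1f\x7f]",
    "digit": r"[0-9]",
    "graph": r"[\x21-\x7e]",
    "lower": r"[a-z]",
    "print": r"[\x20-\x7e]",
    "punct": r"[\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e]",
    "space": r"[ \t\n\r\f\v]",
    "upper": r"[A-Z]",
    "xdigit": r"[0-9A-Fa-f]",
}

def posix_findall(name, text):
    """Find runs of characters belonging to the named POSIX class (ASCII only)."""
    return re.findall(POSIX_ASCII[name] + "+", text)

print(posix_findall("digit", "tel 555-1234"))
print(posix_findall("upper", "Hello World"))
print(posix_findall("punct", "a,b;c!"))
print(posix_findall("xdigit", "#FF00aa"))
```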

 

Character class Match
\d matches a digit, same as [0-9]
\D matches a non-digit, same as [^0-9]
\s matches a whitespace character (space, tab, newline, etc.)
\S matches a non-whitespace character
\w matches a word character (letter, digit, or underscore)
\W matches a non-word character
\b matches a word-boundary (NOTE: within a class, matches a backspace)
\B matches a non-word-boundary
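
A quick way to get a feel for the shorthand classes is to try them on a sample string. This is a sketch in Python's re module; the sample text is invented for the demonstration.

```python
import re

text = "Order 66 shipped to dock_7 on 2007-10-10."

digits = re.findall(r"\d+", text)    # runs of digits
words = re.findall(r"\w+", text)     # runs of letters, digits and underscore
non_words = re.findall(r"\W", "a-b c")
fields = re.split(r"\s+", "a  b\tc\nd")   # split on any run of whitespace

# \b only matches at the edge of a word, \B only inside one
whole_word = re.findall(r"\bcat\b", "cat concat cats cat.")
inside_word = re.findall(r"\Bcat", "cat concat")

# Inside square brackets, \b means the backspace character instead
backspaces = re.findall(r"[\b]", "a\bb")

print(digits, words, non_words, fields)
print(whole_word, inside_word, backspaces)
```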

 

  • \
    The backslash escapes any character and can therefore be used to force characters to be matched as literals instead of being treated as characters with special meaning. For example, '\[' matches '[' and '\\' matches '\'.
  • .
    A dot matches any character except a newline. For example, 'go.d' matches 'gold' and 'good'.
  • { }
    {n} ... Match exactly n times
    {n,} ... Match at least n times
    {n,m} ... Match at least n but not more than m times
  • [ ]
    A string enclosed in square brackets matches any single character in that string, but no others. For example, '[xyz]' matches only 'x', 'y', or 'z'. A range of characters may be specified by two characters separated by '-'. Note that '[a-z]' matches lowercase alphabetic characters, while a reversed range such as '[z-a]' never matches (many engines reject it as an error).
  • [-]
    A hyphen within the brackets signifies a range of characters. For example, [b-o] matches any character from b through o.
  • |
    A vertical bar matches either expression on either side of the vertical bar. For example, bar|car will match either bar or car.
  • *
    An asterisk matches any number of occurrences of the preceding character or group, including zero. For example, bo* matches b, bo, boo and booo, but not be.
  • +
    A plus sign matches one or more occurrences of the preceding character or group. For example, bo+ matches bo, boo and booo, but not b or be.
  • \d+
    matches all numbers with one or more digits
  • \d*
    matches zero or more digits, so it also matches the empty string
  • \w+
    matches all words with one or more characters containing a-z, A-Z, 0-9 and the underscore. \w+ will find title, border, width etc. Please note that in many engines \w matches only characters with an ordinal value below 128; others treat \w as Unicode-aware.
  • [a-zA-Z\xA1-\xFF]+
    matches all words with one or more characters containing a-z, A-Z and characters with an ordinal value from 161 to 255 (eg. ä or Ü). If you want to find words with numbers, then add 0-9 to the expression: [0-9a-zA-Z\xA1-\xFF]+
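
The bo* and bo+ cases are easy to mix up, so here is a small check in Python's re module. It is only a sketch: the candidate strings are my own, and the re.ASCII flag is how Python (an assumption about your engine) restricts \w to characters below 128.

```python
import re

candidates = ["b", "bo", "boo", "booo", "be"]

# * allows zero o's, + needs at least one
star = [s for s in candidates if re.fullmatch(r"bo*", s)]
plus = [s for s in candidates if re.fullmatch(r"bo+", s)]

# The dot stands for exactly one character
dot = [s for s in ["gold", "good", "god"] if re.fullmatch(r"go.d", s)]

sample = "title border_width \u00fcber"   # the last word is "über"
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)  # \w limited to ASCII
unicode_words = re.findall(r"\w+", sample)                # Python 3 default

# Explicit Latin-1 range for accented letters; digits are left out
latin1_words = re.findall(r"[a-zA-Z\xA1-\xFF]+", "K\u00e4se \u00dcber 42")

print(star, plus, dot)
print(ascii_words, unicode_words, latin1_words)
```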



Typical examples

  • (bo*)
    will find "b", "bo", "boo", and the "bo" in "bot", but not "a"
  • (bx+)
    will find "bx", "bxx", "bxxxxxxxx", but not "b" or "be"
  • (\d+)
    will find all numbers
  • (\d+ visitors)
    will find "3 visitors" or "243234 visitors" or "2763816 visitors"
  • (\d+ of \d+ messages)
    will find "2 of 1200 messages" or "1 of 10 messages"
  • (\d+ of \d+ messages)
    will filter everything from the last occurrence of "2 of 1200 messages" or "1 of 10 messages" to the end of the page
  • (MyText.{0,20})
    will find "MyText" and up to 20 characters after "MyText"
  • (\d\d.\d\d.\d\d\d\d)
    will find date-strings with format 99.99.9999 or 99-99-9999 (the dot in the regex matches any character)
  • (\d\d\.\d\d\.\d\d\d\d)
    will find date-strings with format 99.99.9999
  • (([_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+))
    will find most common e-mail addresses (it is a simple pattern and does not cover every valid address)
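
Here is how some of these typical examples behave in Python's re module. This is only a sketch: the sample page text, names and addresses are made up, and I use a non-capturing group (?:...) in the e-mail pattern so that findall returns the whole address.

```python
import re

page = ("Today: 243234 visitors, 2 of 1200 messages, "
        "MyText and some trailing words here.")

visitors = re.search(r"(\d+) visitors", page).group(1)
messages = re.search(r"(\d+) of (\d+) messages", page).groups()
snippet = re.search(r"MyText.{0,20}", page).group(0)  # at most 20 extra chars

dates = "10.10.2007 and 10-10-2007"
loose = re.findall(r"\d\d.\d\d.\d\d\d\d", dates)      # dot = any character
strict = re.findall(r"\d\d\.\d\d\.\d\d\d\d", dates)   # escaped dot = literal

email = r"[_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(?:\.[_a-zA-Z\d\-]+)+"
mail_text = "Write to jane.doe@example.com or bob@mail.example.org."
addresses = re.findall(email, mail_text)

print(visitors, messages, snippet)
print(loose, strict)
print(addresses)
```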

thanks
