Regular Expression Cheat Sheet
Match Characters
| [abc] | Matches any single character among a, b, or c |
| [^abc] | Matches any single character NOT among a, b, or c. “^” negates the set only when it is the first character inside the brackets |
| [a-g] | Matches any single character in the range a-g |
| [^a-g] | Matches any single character NOT in the range a-g |
| [H-N] | Matches any single character in the range H-N |
| [0-9] | Matches any single character in the range 0-9 |
| [a-gH-N] | Matches any single character in the range a-g or H-N |
| Shorthand classes: | |
| . | Matches any single character except a line terminator (roughly [^\n\r]; the exact set depends on the engine) |
| \s | Whitespace: matches space, newline, tab, etc. |
| \S | [^\s] (Non-whitespace) |
| \d | Digit: [0-9] |
| \D | [^\d] (Non-digit) |
| \w | Word character: [0-9A-Za-z_] |
| \W | [^\w] (Non-word character) |
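The character classes above can be tried out with a minimal sketch in Python's `re` module. Python is an assumption here, since the table itself is engine-neutral, and the sample string is made up for illustration. The `re.ASCII` flag is passed so that `\d` and `\w` behave like the `[0-9]` and `[0-9A-Za-z_]` definitions in the table; without it, Python also matches Unicode digits and letters.

```python
import re

# Hypothetical sample string, chosen only for illustration.
text = "Hi bag 7_!"

# [abc]: one character out of the set a, b, c.
set_hits = re.findall(r"[abc]", text)            # ['b', 'a']

# [^abc]: one character that is NOT a, b, or c.
negated_hits = re.findall(r"[^abc]", text)

# [a-gH-N]: one character from either range.
range_hits = re.findall(r"[a-gH-N]", text)       # ['H', 'b', 'a', 'g']

# \d and \w restricted to ASCII, matching the table's definitions.
digits = re.findall(r"\d", text, re.ASCII)       # ['7']
words = re.findall(r"\w+", text, re.ASCII)       # ['Hi', 'bag', '7_']

# \S+: runs of non-whitespace characters.
chunks = re.findall(r"\S+", text)                # ['Hi', 'bag', '7_!']
```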
Match Groups
| (a\|b) | Alternation: matches a or b |
| (…) | Capture group |
| (?:…) | Non-capturing group |
| (?<name>…) or (?'name'…) | Named capture group; “name” is a label of your choice |
| (?(condition)true_regex\|false_regex) | Conditional: matches true_regex if the condition holds, otherwise false_regex |
group(0) returns the entire match, while group(n) with n ≥ 1 returns the text captured by the n-th capture group.
Note: Java has supported named capture groups since 1.7 (pass the group name as a string to the group method). JavaScript supports the (?<name>…) syntax since ES2018, and Python supports named groups with its own (?P<name>…) syntax.
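As a rough illustration of the group rows above, here is a sketch using Python's `re` module (an assumed choice of engine). The date string, the `day` group name, and the bracketed-number pattern are all made up for the example. Python spells named groups `(?P<name>…)` and its conditional takes a group number or name as the condition.

```python
import re

# Capturing, non-capturing, and named groups in one pattern.
m = re.search(r"(\d{4})-(?:\d{2})-(?P<day>\d{2})", "date: 2024-03-15")

whole = m.group(0)        # entire match: '2024-03-15'
year = m.group(1)         # first capture group: '2024'
day = m.group("day")      # named group: '15'
# The non-capturing (?:...) group gets no number, so "day" is also group 2.
all_groups = m.groups()   # ('2024', '15')

# Alternation inside a group: findall returns what the group captured.
vowels = re.findall(r"gr(a|e)y", "gray grey groy")   # ['a', 'e']

# Conditional: require ")" only if group 1 matched an opening "(".
bracketed = re.compile(r"^(\()?\d+(?(1)\))$")
results = {s: bool(bracketed.match(s)) for s in ["(12)", "12", "(12", "12)"]}
```

Here `results` maps `"(12)"` and `"12"` to `True` and the two unbalanced strings to `False`.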
Frequency Range (Quantifiers)
| {3} | The preceding item matches exactly 3 times, equivalent to {3,3} |
| {3,6} | The preceding item matches 3 to 6 times |
| {3,} | The preceding item matches 3 or more times |
| {0,6} | The preceding item matches at most 6 times |
| Shorthand quantifiers: | |
| * | {0,} (Zero or more) |
| + | {1,} (One or more) |
| ? | {0,1} (Zero or one) |
| \w* | Greedy mode: matches as many characters as possible |
| \w*? | Non-greedy (lazy) mode: matches as few characters as possible |
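The difference between the quantifier forms is easiest to see side by side. The following is a small sketch in Python's `re` module (an assumed engine); the sample strings are invented for illustration.

```python
import re

# Counted quantifiers: fullmatch succeeds only if the whole string fits.
exactly_three = re.fullmatch(r"\d{3}", "123") is not None       # True
too_long = re.fullmatch(r"\d{3}", "1234") is not None           # False
in_range = re.fullmatch(r"\d{3,6}", "12345") is not None        # True
below_range = re.fullmatch(r"\d{3,6}", "12") is not None        # False

# ? makes the preceding item optional.
spellings = re.findall(r"colou?r", "color colour")   # ['color', 'colour']

# Greedy vs non-greedy on the same input.
html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)    # one match spanning both tags
lazy = re.findall(r"<b>.*?</b>", html)     # ['<b>one</b>', '<b>two</b>']

# \w* takes every word character it can; \w*? is satisfied with none.
greedy_word = re.match(r"\w*", "abc!").group(0)    # 'abc'
lazy_word = re.match(r"\w*?", "abc!").group(0)     # ''
```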
Anchors
| ^ | Start of string (start of each line in multiline mode) |
| $ | End of string (end of each line in multiline mode) |
| \b | Word boundary |
| \B | Non-word boundary |
| Lookaround Assertions: | |
| (?=exp) | Positive lookahead (suffix is exp) |
| (?<=exp) | Positive lookbehind (prefix is exp) |
| (?!exp) | Negative lookahead (suffix is NOT exp) |
| (?<!exp) | Negative lookbehind (prefix is NOT exp) |
\b(\w+)\b is equivalent to (?<!\w)(\w+)(?!\w): the word must be neither preceded nor followed by another word character.
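The anchors and lookaround assertions can be exercised with a short sketch in Python's `re` module (an assumed engine; the sample strings are made up). It also compares `\b(\w+)\b` with a lookaround spelling, `(?<!\w)(\w+)(?!\w)`, which Python accepts because its lookbehind patterns must have a fixed width.

```python
import re

# ^ anchors at the start of the string, or of every line with re.MULTILINE.
first_only = re.findall(r"^\w+", "one\ntwo")                   # ['one']
each_line = re.findall(r"^\w+", "one\ntwo", re.MULTILINE)      # ['one', 'two']

# \B: "cat" with word characters on both sides (only inside "scatter").
inner = re.findall(r"\Bcat\B", "cat concat scatter")

calls = "foo(1) bar baz(2)"
# Positive lookahead: names directly followed by "(".
followed = re.findall(r"\w+(?=\()", calls)                     # ['foo', 'baz']
# Negative lookahead: whole lowercase words NOT followed by "(".
not_followed = re.findall(r"\b[a-z]+\b(?!\()", calls)          # ['bar']

prices = "$42 and 17"
# Positive lookbehind: digits directly preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", prices)               # ['42']
# Negative lookbehind: digit runs preceded by neither "$" nor a digit.
no_dollar = re.findall(r"(?<![$\d])\d+", prices)               # ['17']

# Word boundaries expressed as negative lookarounds.
sample = "foo(1) bar_baz, 2x!"
with_b = re.findall(r"\b(\w+)\b", sample)
with_look = re.findall(r"(?<!\w)(\w+)(?!\w)", sample)
```

Including the digit in the negative lookbehind class matters: `(?<!\$)\d+` alone would still match the `2` in `$42`, because that `2` is preceded by `4` rather than `$`.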