Regular expressions
A regular expression, often shortened to regex, specifies a match pattern using only text.
Syntax
x, y, and z, when used in the Symbol(s) column, are placeholders for text. Capital Xs, Ys, and Zs are placeholders for numbers.
Symbol(s) | Name | Description | Example |
---|---|---|---|
Groups and backreferences | |||
(x) |
Capture group | Groups part of a pattern and reports the text it matches separately in the output. | "Foo Bar" /(Foo)|(Bar)/g -> [ "Foo", "Bar" ]
|
(?:x) |
Non-capture group | Groups part of a pattern without capturing it, so the result is the same as if the parentheses were not there. | "Foo Bar" /(?:Foo)|(?:Bar)/g -> [ "Foo", "Bar" ]
|
(?<y>x) |
Named capture group | Equivalent to (x), except the captured content can be referred to by the name y. |
"Foo Bar" /(?<F>Foo)|(?<B>Bar)/g -> [ "Foo", "Bar" ]
|
\k<y> |
Named backreference | Matches the same text as a previous named capture group. Note that \k is literal. |
"Foo Foo" /(?<Foo>Foo)\s\k<Foo>/g -> [ "Foo Foo" ]
|
Character classes | |||
[x-z] |
Character class | Matches every letter or number from x to z. |
"Foo Bar" /[a-f]/gi -> [ "F", "B", "a" ]
|
[xyz] |
Matches either x, y, or z |
"Foo Bar" /[FB]/g -> [ "F", "B" ]
| |
[^x-z] |
Negated character class | Matches every letter or number not from x to z. |
"Foo Bar" /[^a-f]/gi -> [ "o", "o", " ", "r" ]
|
[^xyz] |
Matches any character that isn't x, y, or z |
"Foo Bar" /[^FB]/g -> [ "o", "o", " ", "a", "r" ]
| |
. |
Wildcard | Matches every character besides line terminators. Line terminators include \n , \r , \u2028 , and \u2029 |
"Foo Bar" /./g -> [ "F", "o", "o", " ", "B", "a", "r" ]
|
x|y |
Disjunction | Match something or something else. | "Foo Bar" /Foo|Bar/g -> [ "Foo", "Bar" ]
|
\ |
Escape character | Makes the next character match literally when it is reserved for regex, such as *, |, or . (a period). Note that \ is itself a reserved character, so to match it, you need to use \\. |
"Foo.bar apple 78.9 banana" /[A-Za-z0-9]*\.[A-Za-z0-9]*/g -> [ "Foo.bar", "78.9" ]
|
\d |
Digit character class escape | Equivalent to [0-9] |
"78 Foo Bars" /\d/g -> [ "7", "8" ]
|
\D |
Non-digit character class escape | Equivalent to [^0-9] |
"78 Foo Bars" /\D/g -> [ " ", "F", "o", "o", " ", "B", "a", "r", "s" ]
|
\w |
Word character class escape | Equivalent to [A-Za-z0-9_] |
"_Foo- Bars+" /\w/g -> [ "_", "F", "o", "o", "B", "a", "r", "s" ]
|
\W |
Non-word character class escape | Equivalent to [^A-Za-z0-9_] |
"_Foo- Bars+" /\W/g -> [ "-", " ", "+" ]
|
\s |
White space character class escape | Matches all whitespace characters. Equivalent to [\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff] |
"_Foo- Bars+" /\s/g -> [ " " ]
|
\S |
Non-white space character class escape | Matches everything but whitespace characters. Equivalent to [^\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff] |
"_Foo- Bars+" /\S/g -> [ "_", "F", "o", "o", "-", "B", "a", "r", "s", "+" ]
|
\t |
Horizontal tab escape | Matches horizontal tab characters. | "a\tb" /\t/g -> [ "\t" ]
|
\n |
Linefeed escape | Matches linefeed/new line characters | "a\nb" /(?:\r?\n)|(?:\v)|(?:\f)/g -> [ "\n" ]
|
\r |
Carriage return escape | Matches carriage return characters | |
\v |
Vertical tab escape | Matches vertical tab characters | |
\f |
Form feed escape | Matches form feed characters | |
[\b] |
Backspace escape | Matches backspace | No example can be provided |
\0 |
NUL escape | Matches the NUL character | |
\u{YYYY} |
Unicode value escape | Matches the character with the given Unicode code point when the u flag is applied. Here each Y represents a hexadecimal digit.
| |
\uYYYY |
Matches the UTF-16 code unit with the provided hexadecimal value, represented with Ys here.
| ||
\p{x} or \P{x} |
Unicode character class escape | Matches a character based on the Unicode property (x). \P{x} matches characters that do not have the property.
| |
\cx |
Caret notation escape | Matches the control character written in caret notation after \c. Note that x represents a single letter from A to Z here; for example, \cM is a carriage return and \cJ is a linefeed. |
"a\r\nb" /\cM\cJ/g -> [ "\r\n" ]
|
Assertions | |||
^ |
Input boundary beginning assertion | Matches the beginning of the input. If the m flag is on, it matches the start of each line. |
"Foo Bar" /(^Foo)|(Bar$)/g -> [ "Foo", "Bar" ]
|
$ |
Input boundary end assertion | Matches the end of the input. If the m flag is on, it matches the end of each line.
| |
\b |
Word boundary assertion | Matches either end of a word. | "Foo Bar" /(\bFoo\b)/ -> [ "Foo" ]
|
\B |
Non-word boundary assertion | Matches the middle of a word. | "Foo Bar" /(B\Bar)/ -> [ "Bar" ]
|
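Several rows of the table can be checked directly in JavaScript, which is where the syntax names above come from. The following is a minimal sketch, assuming the patterns are run with JavaScript's built-in `String.prototype.match`; the variable names are illustrative only, and the way PenguinMod's own blocks report results may differ.

```javascript
// Capture groups: with the g flag, match() returns every full match.
const groups = "Foo Bar".match(/(Foo)|(Bar)/g);
// groups -> ["Foo", "Bar"]

// Named capture group and named backreference: \k<word> must repeat
// exactly what the group named "word" matched.
const repeated = "Foo Foo".match(/(?<word>Foo)\s\k<word>/);
// repeated[0] -> "Foo Foo", repeated.groups.word -> "Foo"

// Character class escapes: \D is the negation of \d.
const digits = "78 Foo Bars".match(/\d/g);
// digits -> ["7", "8"]
const nonDigits = "78 Foo Bars".match(/\D/g);
// nonDigits -> [" ", "F", "o", "o", " ", "B", "a", "r", "s"]

// Input boundary assertions: ^ matches at each line start only with the m flag.
const lineStarts = "Foo\nBar".match(/^\w+/gm);
// lineStarts -> ["Foo", "Bar"]

// Word boundary assertion: \bBar\b matches "Bar" only as a whole word.
const wholeWord = "Foo Bar Barn".match(/\bBar\b/g);
// wholeWord -> ["Bar"]
```

Without the g flag, `match` returns a single result that also exposes the captured groups, which is why the named group example omits it.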
Flags
Flag | Name | Description |
---|---|---|
g |
global |
Search all of a string, rather than stopping once you find an occurrence. |
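The effect of the g flag is easiest to see by running the same pattern with and without it. This is a small sketch in plain JavaScript, assuming the built-in `match` and `replace` string methods; the sample text and variable names are made up for illustration.

```javascript
const text = "Foo Bar Foo";

// Without g, match() stops at the first occurrence.
const first = text.match(/Foo/);
// first[0] -> "Foo", first.index -> 0

// With g, match() collects every occurrence.
const all = text.match(/Foo/g);
// all -> ["Foo", "Foo"]

// replace() follows the same rule.
const replacedOnce = text.replace(/Foo/, "Baz");
// replacedOnce -> "Baz Bar Foo"
const replacedAll = text.replace(/Foo/g, "Baz");
// replacedAll -> "Baz Bar Baz"
```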
See also
External links
- Regular expressions on Wikipedia
- TurboWarp extension gallery featuring TrueFantom's RegExp extension. It can be loaded into PenguinMod using https://extensions.turbowarp.org/true-fantom/regexp.js as the URL in the Load Custom Extensions popup. It adds more regex functionality into PenguinMod.
- regex101, a fairly useful little app with some fun challenges to test your knowledge of regex.
- MDN's Regular expressions documentation for JavaScript. There wasn't a good place to cite this, but I sourced a lot of stuff from here, including pretty much all of the names for each syntax element.