Match () with regex () ()

From PenguinMod Wiki
{{Blocks|block_img=<scratchblocks>(match [foo bar] with regex [foo] [g] :: operators)</scratchblocks>|block_type=Reporter Block|category=[[Operators]]}}
[[File:Block 5 26 2023-3 14 00 PM.png|thumb|590x590px]]
=== Information ===
The match regex block is a reporter block that matches a string against a [[w:Regular expressions|regular expression]] (a regex).

=== Use ===
With the <code>g</code> flag, this block searches the first input for every match of the rule. The rule (the regex) goes into the second input box. The third input box takes the flags (see the Flags section).
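The behaviour described above can be sketched in TypeScript. This is only an approximation under the assumption that the block behaves like JavaScript's <code>String.prototype.match</code>; the helper name <code>matchRegex</code> is hypothetical and not part of PenguinMod.

```typescript
// Hypothetical helper that mimics the block, assuming it behaves like
// JavaScript's String.prototype.match with a RegExp built from the inputs.
function matchRegex(text: string, rule: string, flags: string): string[] {
  const found = text.match(new RegExp(rule, flags));
  // match() returns null when nothing matches; report an empty array instead.
  return found === null ? [] : Array.from(found);
}

// First input: the string, second input: the rule, third input: the flags.
console.log(matchRegex("foo bar", "foo", "g")); // one match: "foo"
```

Returning an empty array for "no match" is a choice made for this sketch; what the block itself reports in that case is not stated on this page.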

=== Examples ===

==== Foo bar example ====
<scratchblocks>
(match [foo bar] with regex [foo] [g] :: operators) // ["foo"]
</scratchblocks>
In this example the block searches for the string "foo" in the string "foo bar", which results in the array ["foo"] since "foo" occurs only once in the string.

==== Fruit example ====
<scratchblocks>
(match [banana orange apple banana banana pear apple] with regex [banana] [g] :: operators) // ["banana","banana","banana"]
</scratchblocks>
For this example the block searches for every occurrence of the word "banana", which results in the array ["banana","banana","banana"].
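The fruit example can be reproduced outside the editor with a short TypeScript sketch, assuming the block follows JavaScript regex semantics. The variable names are illustrative only.

```typescript
const fruits = "banana orange apple banana banana pear apple";

// Global search: every occurrence of "banana" is collected.
const bananas: string[] = fruits.match(new RegExp("banana", "g")) ?? [];
console.log(bananas.length); // 3

// A rule that never occurs makes match() return null, hence the fallback to [].
const kiwis: string[] = fruits.match(new RegExp("kiwi", "g")) ?? [];
console.log(kiwis.length); // 0
```

Taking the length of the result is a simple way to count how often a rule matches.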

=== Regular expression syntax ===
Regular expressions have their own [[w:Regular_expression#Syntax|syntax]].

The examples appear as: [string to be matched] [rule] [flag] -> [result]
{| class="wikitable mw-collapsible mw-collapsed"
|+Syntax Table
!Metacharacter
!Description
!Example
|-
|.
|matches (almost) any single character (see the <code>s</code> flag under Flags)
|["gray grey griy gary"] ["gr'''.'''y"] ["g"] -> ["gray", "grey", "griy"]
|-
|()
|groups a set of characters so that the syntax below can apply to the whole group
|Grouping on its own does not change the result; it is used together with the syntax below
|-
| +
|matches the preceding character or group one or more times
|["abbbbbba aba aa"] ["a'''b+'''a"] ["g"] -> ["abbbbbba","aba"]
|-
|?
|matches the preceding character or group zero or one times
|["aba aa abba abbbba"] ["a'''b?'''a"] ["g"] -> ["aba","aa"]
|-
|*
|matches the preceding character or group zero or more times
|["aba aa abbbbbbbba abbbbbbbbbbbbba"] ["a'''b*'''a"] ["g"] -> ["aba", "aa", "abbbbbbbba", "abbbbbbbbbbbbba"]
|-
|{M,N}
|matches the preceding character or group a minimum of M and a maximum of N times
|["aba abbba abbbbba abbbbbbbbbbba"] ["a'''b{3,5}'''a"] ["g"] -> ["abbba","abbbbba"]
|-
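|{0,N}
|matches the preceding character or group a maximum of N times. JavaScript regular expressions treat <code>{,N}</code> as literal text, so the lower bound 0 has to be written out
|["aba abbba abbbbba abbbbbbbbbbba"] ["a'''b{0,5}'''a"] ["g"] -> ["aba", "abbba", "abbbbba"]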
|-
|{M,}
|matches the preceding character or group a minimum of M times
|["aba abbba abbbbba abbbbbbbbbbba"] ["a'''b{3,}'''a"] ["g"] -> ["abbba", "abbbbba", "abbbbbbbbbbba"]
|-
|{N}
|matches the preceding character or group exactly N times
|["aba abbba abbbbba abbbbbbbbbbba"] ["a'''b{3}'''a"] ["g"] -> ["abbba"]
|-
|[...]
|matches any single character listed inside the square brackets
|["Hello Hallo Hmllo Hkllo"] ["H'''[ea]'''llo"] ["g"] -> ["Hello","Hallo"].
A range can also be used:
["Hello Hallo Hmllo Hkllo"] ["H'''[a-k]'''llo"] ["g"] -> ["Hello","Hallo", "Hkllo"].

Here <code>[a-k]</code> stands for every lowercase letter of the alphabet from a to k.
|-
|<nowiki>|</nowiki>
|separates alternative characters or groups; any one of them may match
|["gray grey griy gary"] ["('''<nowiki>gray|grey|griy</nowiki>''')"] ["g"] -> ["gray", "grey", "griy"]
|-
|\w
|syntax for all [https://en.wikipedia.org/wiki/Alphanumericals Alphanumericals] (a-z A-Z 0-9 and _)
|["This is a_text90§!"] ["'''\w'''"] ["g"] -> ["T","h","i","s","i","s","a","_","t","e","x","t","9","0"]
|-
|\W
|syntax for all non-[https://en.wikipedia.org/wiki/Alphanumericals Alphanumericals] (anything that is '''not''' a-z A-Z 0-9 or _), which includes spaces
|["This is a_text90§!"] ["'''\W'''"] ["g"] -> [" "," ","§","!"]
|-
|\s
|syntax for [https://en.wikipedia.org/wiki/Whitespace_character whitespace], which includes:
ASCII: tab, space, line feed, form feed and carriage return
Unicode: no-break spaces, line and paragraph separators, and the variable-width spaces (amongst others)
|["This is a testing example"] ["'''\s'''"] ["g"] -> [" "," "," "," "]
|-
|\S
|syntax for all non-[https://en.wikipedia.org/wiki/Whitespace_character Whitespaces]
|["This is a testing example"] ["'''\S'''"] ["g"] -> ["T","h","i","s","i","s","a","t","e","s","t","i","n","g","e","x","a","m","p","l","e"]
|-
|\d
|syntax for all [https://en.wikipedia.org/wiki/Numerical_digit Digits] (0-9)
|["My favourite number is 917"] ["'''\d'''"] ["g"] -> ["9","1","7"]
|-
|\D
|syntax for all non-[https://en.wikipedia.org/wiki/Numerical_digit Digits] (anything that is '''not''' 0-9), which includes spaces
|["9 + 10 = 21"] ["'''\D'''"] ["g"] -> [" ","+"," "," ","="," "]
|-
|^
|syntax for the beginning of a string or line
|["Hello World, I say Hello"] ["'''^'''Hello"] ["g"] -> ["Hello"]
|-
|$
|syntax for the end of a string or line
|["Hello World, I say Hello"] ["Hello'''$'''"] ["g"] -> ["Hello"]
|-
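|\A
|syntax for the beginning of the whole string, never the beginning of a line. JavaScript regular expressions (the flavour documented at the MDN link below) do not support <code>\A</code>; there, <code>^</code> without the <code>m</code> flag has this meaning
|["Hello World,
Hello"] ["'''\A'''Hello"] ["g"] -> ["Hello"]
|-
|\z
|syntax for the end of the whole string, never the end of a line. JavaScript regular expressions do not support <code>\z</code>; there, <code>$</code> without the <code>m</code> flag has this meaning
|["Hello World,
Hello"] ["Hello'''\z'''"] ["g"] -> ["Hello"]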
|-
|[^...]
|matches any single character that is '''not''' listed inside the square brackets
|["Hello Hallo Hmllo Hkllo"] ["H'''[^ea]'''llo"] ["g"] -> ["Hmllo","Hkllo"].
A range can also be used:
["Hello Hallo Hmllo Hkllo"] ["H'''[^a-k]'''llo"] ["g"] -> ["Hmllo"].

Here <code>[^a-k]</code> stands for every character except the lowercase letters from a to k.
|-
|\
|syntax for [[w:Escape_character|Escaping Characters]].
For example, <code>\*</code> matches the actual character * instead of acting as syntax
|["I would like to find all *s in this very *-filled *text"] ["'''\*'''"] ["g"] -> ["*","*","*"]
This is also used to escape itself:
["I would like to find all \s in this very \-filled \text"] ["'''\\'''"] ["g"] -> ["\","\","\"]
|}
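A few rows of the table can be checked with the following TypeScript sketch. It assumes the block uses JavaScript regex semantics, and <code>find</code> is a hypothetical stand-in for the block that defaults to the <code>g</code> flag. Note that a backslash has to be doubled inside a TypeScript string literal, while the rule itself contains a single backslash.

```typescript
// Hypothetical stand-in for the block (assumes JavaScript RegExp semantics).
const find = (text: string, rule: string, flags = "g"): string[] =>
  text.match(new RegExp(rule, flags)) ?? [];

const words = "aba abbba abbbbba abbbbbbbbbbba";

// Quantifiers: between 3 and 5 b's, then exactly 3 b's.
console.log(find(words, "ab{3,5}a")); // ["abbba", "abbbbba"]
console.log(find(words, "ab{3}a"));   // ["abbba"]

// Negated character class: any character except e or a after the H.
console.log(find("Hello Hallo Hmllo Hkllo", "H[^ea]llo")); // ["Hmllo", "Hkllo"]

// Grouping combined with alternation: "a" or "e" between "gr" and "y".
console.log(find("gray grey griy gary", "gr(a|e)y")); // ["gray", "grey"]

// \d with + collects whole numbers instead of single digits.
console.log(find("9 + 10 = 21", "\\d+")); // ["9", "10", "21"]

// Escaping: \* matches a literal asterisk.
console.log(find("2*3*4", "\\*")); // ["*", "*"]
```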

=== Flags ===
Flags are used to redefine the behaviour of a regular expression. These flags include:
{| class="wikitable"
|+Flags
!Flag
!Name*
!Description
|-
|g
|Global
|Finds all the matching sub-strings, not only the first one
|-
|i
|Case-insensitive
|Makes the rule case-insensitive. For example, if the rule is [e] and the flag is [ig], every e '''and''' every E gets matched
|-
|m
|Multi-line
|Changes the meanings of <code>^</code> and <code>$</code> to make them match the beginning and end of each line, rather than of the whole string.
|-
|s
|"Dotall"
|Changes the . character to match '''every''' character, including the new-line character
|}
To use multiple flags, write them one after another with '''no delimiter''' (no space or comma between the flags), as seen in the description of the case-insensitive flag. Order does not matter.
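The effect of each flag, and of combining flags, can be sketched in TypeScript. The sketch assumes JavaScript regex semantics; <code>findWithFlags</code> and the sample text are illustrative only.

```typescript
// Hypothetical stand-in for the block (assumes JavaScript RegExp semantics).
const findWithFlags = (text: string, rule: string, flags: string): string[] =>
  text.match(new RegExp(rule, flags)) ?? [];

const sample = "Hello World,\nhello again";

// g alone is case-sensitive; adding i also matches the capitalised "Hello".
console.log(findWithFlags(sample, "hello", "g"));  // ["hello"]
console.log(findWithFlags(sample, "hello", "gi")); // ["Hello", "hello"]

// Flag order does not matter.
console.log(findWithFlags(sample, "hello", "ig")); // ["Hello", "hello"]

// Without g only the first match is reported.
console.log(findWithFlags(sample, "hello", "i"));  // ["Hello"]

// m lets ^ match at the start of every line, not only the start of the string.
console.log(findWithFlags(sample, "^hello", "gi"));  // ["Hello"]
console.log(findWithFlags(sample, "^hello", "gim")); // ["Hello", "hello"]

// s lets . match the new-line character between the comma and the "h".
console.log(findWithFlags(sample, ",.h", "g").length);  // 0
console.log(findWithFlags(sample, ",.h", "gs").length); // 1
```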

More flags can be found at [https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions#advanced_searching_with_flags MDN].

Latest revision as of 04:13, 2 July 2024
