Difference between revisions of "Regular Expressions"
Latest revision as of 13:33, 14 June 2022
Regular expressions (regexp) are used to find strings in text.
Perl, PHP and Python support everything in the table below. Bash and SQL only support the parts marked POSIX.
. | Any character except newline (POSIX) | \c | Control character |
\d | Digit | \D | non Digit |
\s | Whitespace | \S | non Whitespace |
\w | Word character [A-Za-z0-9_] | \W | non Word character |
^ | Start of string (POSIX) | $ | End of string (POSIX) |
* | 0 or more matches of previous expression (POSIX) | ( ) | Subexpression (POSIX) |
+ | 1 or more matches of previous expression (POSIX) | [ ] | Match any of the characters between the [ ]. ^ as first character negates the match (POSIX) |
? | 0 or 1 matches of previous expression (POSIX). Placed after another quantifier (*? or +?) it makes that quantifier non greedy: the search stops as soon as the next expression is found (not POSIX) | {x,y} | Match the previous expression between x and y times. {x} matches exactly x times, {x,} matches x or more times (POSIX) |
\n | Refers to the Nth subexpression, for example \1 for the first (POSIX) | | |
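A few rows of the table can be tried out directly. The snippet below is a minimal sketch in Python (one of the languages listed above as supporting the whole table); the sample strings and variable names are made up for illustration.

```python
import re

# \d+ : one or more digits
digits = re.findall(r"\d+", "order 42, item 7")

# \w also matches the underscore, so "bb_cc" is a single word
words = re.findall(r"\w+", "bb_cc dd")

# Greedy .* runs to the last </b>; non greedy .*? stops at the first one
html = "<b>x</b><b>y</b>"
greedy = re.search(r"<b>.*</b>", html).group(0)
lazy = re.search(r"<b>.*?</b>", html).group(0)

# {2,3} : between 2 and 3 repetitions of the previous expression
three_ok = re.fullmatch(r"a{2,3}", "aaa") is not None
four_ok = re.fullmatch(r"a{2,3}", "aaaa") is not None

# \1 refers back to whatever the first subexpression matched
doubled = re.search(r"(\w)\1", "hello").group(0)

print(digits, words, greedy, lazy, three_ok, four_ok, doubled)
```

Here `lazy` is `<b>x</b>` while `greedy` is the whole string, which is the practical difference between `*` and `*?`.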
Examples
/[^/]*$
- Match last / and everything following
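As a rough illustration of the pattern above, the following Python sketch applies it to a file path. The path is an arbitrary example, not taken from the page.

```python
import re

path = "/usr/local/bin/perl"  # arbitrary example path

# The last "/" followed by every non-slash character up to the end of the string
last_part = re.search(r"/[^/]*$", path).group(0)

# Removing that match leaves the directory part
directory = re.sub(r"/[^/]*$", "", path)

print(last_part)   # /perl
print(directory)   # /usr/local/bin
```

The pattern only works because `[^/]` cannot match a slash, so the match is forced onto the final path component.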
Perl
- perl -lne 'print $1 if (/<regexp(subexp)>/)'
- Command line that prints the first subexpression of every input line that matches.
- $var =~ /<pattern>/
- Generic syntax, this expression is true if the pattern is matched in $var
- @array = $var =~ m/<pattern>/g;
- Put all matches of <pattern> in $var into @array. If the pattern contains subexpressions, the submatches are stored instead of the whole matches.
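For comparison, Python's `re.findall` behaves in a similar way to the list-context match above: it returns the whole matches when the pattern has no subexpression and the submatches when it has one. This is only a sketch with a made-up input line. Note one difference: with several subexpressions Python returns one tuple per match, whereas Perl returns a single flat list.

```python
import re

line = "x=1, y=22, z=333"  # made-up sample input

# No subexpression: every whole match is returned
all_matches = re.findall(r"\w=\d+", line)

# One subexpression: only the submatch of each match is returned
submatches = re.findall(r"\w=(\d+)", line)

# Two subexpressions: one tuple per match
pairs = re.findall(r"(\w)=(\d+)", line)

print(all_matches)
print(submatches)
print(pairs)
```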
The following variables are set when a match is made:
- $&
- Contains the string matched by the last pattern match
- $`
- The string preceding whatever was matched by the last pattern match, not counting patterns matched in nested blocks that have been exited already.
- $'
- The string following whatever was matched by the last pattern match, not counting patterns matched in nested blocks that have been exited already.
Example:
$_ = 'abcdefghi';
/def/;
print "$`:$&:$'";
# prints abc:def:ghi
- $1
- String matched by the first subexpression.
- $+
- The last bracket matched by the last search pattern. This is useful if you don't know which of a set of alternative patterns matched.
Example:
/Version: (.*)|Revision: (.*)/
&& ($rev = $+);