regexp
in a string, locate (and extract) substrings matching a regular expression
Syntax
[start, final, match, foundString] = regexp(input, pattern)
[start, final, match, foundString] = regexp(input, pattern, "once")
Arguments
- input
a string.
- pattern
a character string containing a regular expression, enclosed in delimiters (for example '/ab*/').
- start
the starting index of each substring of input that matches the regular expression pattern.
- final
the ending index of each substring of input that matches the regular expression pattern.
- match
the text of each substring of input that matches pattern.
- foundString
the captured parenthesized subpatterns.
- "once" | "o" flag
match the pattern only once.
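As a minimal sketch of how the output arguments relate to each other, the snippet below runs the same pattern with and without the 'o' flag. The input string and the pattern are borrowed from the Examples section; the names s1, e1 and m1 are only illustrative.

```scilab
// Find every occurrence of "a" followed by zero or more "b".
[start, final, match] = regexp('xabyabbbz', '/ab*/');
disp(start)   // 2 5 : index where each match begins
disp(final)   // 3 8 : index where each match ends
disp(match)   // the matched substrings, "ab" and "abbb"

// part() extracts characters start(1)..final(1), i.e. the first match.
disp(part('xabyabbbz', start(1):final(1)))   // ab

// With the 'o' flag, only the first match is reported.
[s1, e1, m1] = regexp('xabyabbbz', '/ab*/', 'o');
disp([s1, e1])   // 2 3
disp(m1)         // ab
```

The indices are 1-based, so start and final can be passed directly to part() to recover the same text that match returns.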
Description
Regular expressions, often abbreviated as "regex" or "regexp", are powerful tools used in programming and text processing for pattern matching within strings. They provide a concise and flexible means for identifying and manipulating strings of text, such as particular characters, words, or patterns of characters.
A regular expression is a sequence of characters that forms a search pattern, which can be used to search, edit, or manipulate text. A pattern can combine the following features:
- Metacharacters
These are special characters that have a unique meaning within a regex:
. (dot): Matches any single character except a newline.
* (asterisk): Matches zero or more occurrences of the preceding element.
+ (plus): Matches one or more occurrences of the preceding element.
? (question mark): Matches zero or one occurrence of the preceding element.
| (pipe): Acts as a logical OR operator.
^ (caret): Matches the beginning of a line.
$ (dollar sign): Matches the end of a line.
- Character Classes
These allow you to match any one of a set of characters. For example, [abc] will match any one of the characters a, b, or c.
- Quantifiers
These specify how many instances of a character, group, or character class must be present in the input for a match to be found. Examples include {n} (exactly n), {n,} (at least n), and {n,m} (between n and m).
- Groups and Capturing
Parentheses () are used to create groups within a regex. These groups can be used to capture the text matched by the group for further processing.
- Escaping
If you need to match a character that is a metacharacter, you can escape it with a backslash \. For example, \. will match a literal dot.
- Anchors
These are used to specify the position in the text where a match must occur. Common anchors include ^ for the start of a line and $ for the end of a line.
- Modifiers
These are options that change how the regex engine interprets the pattern. Common modifiers include case-insensitive matching and global matching. A modifier is written after the closing delimiter: for example, the i in '/abc/i' requests case-insensitive matching.
For the full syntax specification, see the regular expressions supported by PCRE2.
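The features listed above can be sketched in a few short calls. The sample strings (txt, the version string and the date) are made up for illustration, and the comments describe what the patterns match under standard PCRE semantics.

```scilab
txt = "Order 42 shipped on 2024-03-15, order 7 on 2024-04-02.";

// Character classes and quantifiers: 4 digits, 2 digits, 2 digits.
[s, e, dates] = regexp(txt, '/[0-9]{4}-[0-9]{2}-[0-9]{2}/');
disp(dates)              // 2024-03-15 and 2024-04-02

// Modifier: the trailing i makes the match case-insensitive,
// so both "Order" and "order" are found.
[s, e, words] = regexp(txt, '/order/i');
disp(size(words, "*"))   // 2

// Anchor: ^ only lets the pattern match at the start of the input.
[s, e, first] = regexp(txt, '/^Order/');
disp(s)                  // 1

// Escaping: \. matches a literal dot, not "any character".
dotPos = regexp("v1.2", '/\./');
disp(dotPos)             // 3

// Groups and capturing: each pair of parentheses yields one subpattern.
[s, e, m, parts] = regexp("2024-03-15", '/(\d{4})-(\d{2})-(\d{2})/');
disp(parts(1))           // 2024
disp(parts(3))           // 15
```

Scilab strings do not treat the backslash as an escape character, so sequences such as \d and \. reach the regex engine unchanged.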
Examples
regexp('xabyabbbz','/ab*/','o')
regexp('a!','/((((((((((a))))))))))\041/')
regexp('ABCC','/^abc$/i')
regexp('ABC','/ab|cd/i')
[a,b,c]=regexp('XABYABBBZ','/ab*/i')

piString="3.14"
[a,b,c,piStringSplit]=regexp(piString,"/(\d+)\.(\d+)/")
disp(piStringSplit(1))
disp(piStringSplit(2))

[a,b,c,d]=regexp('xabyabbbz','/ab(.*)b(.*)/')
size(d)

// get host name from URL (the optional scheme may be http:// or https://)
myURL="https://www.scilab.org/download/";
[a,b,c,d]=regexp(myURL,'@^(?:https?://)?([^/]+)@i')

str='foobar: 2012';
// Using named subpatterns
[a,b,c,d]=regexp(str,'/(?P<name>\w+): (?P<digit>\d+)/')
d(1)=="foobar"
d(2)=="2012"
See also
- strindex — search position of a character string in another string
- strsubst — substitute a character string by another in a character string
- strsplit — split a single string at some given positions or patterns
- regular expressions supported by PCRE2
History
Version | Description
2026.0.0 | PCRE2 is used as the regular expression engine.
5.4.0 | A new output argument, foundString, has been added to retrieve subpattern matches.