strsplit
split a single string at some given positions or patterns
Syntax
chunks = strsplit(string)
chunks = strsplit(string, indices)
[chunks, matched_separators] = strsplit(string, separators)
[chunks, matched_separators] = strsplit(string, separators, limit)
[chunks, matched_separators] = strsplit(string, regexp)
[chunks, matched_separators] = strsplit(string, regexp, limit)
Arguments
- string
- a single character string to split. UTF8 extended characters are supported.
- indices
- vector of increasing indices, in the interval [1, length(string)-1].
- separators
- matrix of strings searched for in the string and used as scissors. UTF8 extended characters are supported.
- regexp
- single string starting and ending with "/" and specifying a case-sensitive
                    regular expression pattern used as splitting separator. No regexp option
                    can be used after the trailing "/" delimiter. The regular expression
                    may include UTF8 extended characters. The "/" and "\" characters used
                    in the body of the regexp must be protected as "\/" and "\\".
                    Example: "/k.{2}o/"
- chunks
- column of strings: the split chunks. When indices is used, it has length(indices)+1 elements.
- matched_separators
- column of strings, of size size(chunks,1)-1: matched separators or expression patterns.
- limit
- integer > 0: maximum number of times that separators are searched for and used along the string. If the string includes more separator occurrences, its unsplit tail is returned as the last chunk, in chunks($).
Description
strsplit(string) splits string
            into all its individual characters.
strsplit(string, indices) splits string
            after the character positions given in the indices vector.
            The characters at these indices are the last characters of the returned chunks, except for the final chunk, which holds the remaining tail of string.
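A minimal sketch of index-based splitting on a short ASCII string. Judging from the [1 6 11] example in the Examples section, each listed index is the position of the last character of a chunk; the expected values below rest on that reading. The input string and indices are illustrative only.

```scilab
s = "Scilab";
c = strsplit(s, [2 4]);

// By analogy with the [1 6 11] example: cuts fall after positions 2 and 4
assert_checkequal(c, ["Sc"; "il"; "ab"]);

// The same chunks rebuilt by hand with part()
assert_checkequal(c, [part(s, 1:2); part(s, 3:4); part(s, 5:6)]);
```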
strsplit(string, separators) splits string
            after each match of any of the strings given in
            separators.
            The matched separators are removed from the tails of the chunks.
            strsplit(string, "") is equivalent to
            strsplit(string).
strsplit(string, regexp) does the same,
            except that string is searched for matches of the given regular expression,
            used as a "generic separator", instead of for "constant" separators taken from
            a limited set.
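As a hedged illustration of the difference between a set of constant separators and a regular expression, the sketch below splits the same made-up string both ways. The character class "/[,;]/" is assumed to behave as in standard PCRE syntax.

```scilab
t = "a,b;c";

// Constant separators: each accepted separator is listed explicitly
[c1, s1] = strsplit(t, ["," ";"]);

// Regular expression: one pattern covers both separators
[c2, s2] = strsplit(t, "/[,;]/");

assert_checkequal(c1, ["a"; "b"; "c"]);
assert_checkequal(s1, [","; ";"]);
assert_checkequal(c2, c1);
assert_checkequal(s2, s1);
```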
If string starts with a matching separator or expression,
            chunks(1) is set to "".
If string ends with a matching separator or expression,
            "" is appended as the last element of chunks.
If no matching separator or regexp is found in string,
            string is returned as is in chunks.
            This is notably the case for string="".
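The boundary rules above can be checked on a small made-up input. This is only a sketch of the documented behaviour, not an exhaustive test.

```scilab
// Leading and trailing separators each yield an empty chunk
c = strsplit(",a,b,", ",");
assert_checkequal(c, [""; "a"; "b"; ""]);

// No match: the input comes back unchanged, as a single chunk
assert_checkequal(strsplit("abc", ","), "abc");
```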
Without the limit option, any string
            including n separators will be split into
            n+1 chunks.
strsplit(string, separators, limit) or
            strsplit(string, regexp, limit) will
            search for a matching separator or expression for a maximum of
            limit times. If then there are remaining matches in
            the unprocessed tail of string, this tail is returned
            as is in chunks($).
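A short sketch of the effect of limit, using an illustrative comma-separated string: with limit = 2, only the first two separators are consumed and the rest of the string is kept whole.

```scilab
t = "a,b,c,d,e";

// Without limit: 4 separators => 5 chunks
assert_checkequal(strsplit(t, ","), ["a"; "b"; "c"; "d"; "e"]);

// With limit = 2: 2 splits, then the unprocessed tail as last chunk
[c, s] = strsplit(t, ",", 2);
assert_checkequal(c, ["a"; "b"; "c,d,e"]);
assert_checkequal(s, [","; ","]);
```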
[chunks, matched_separators] = strsplit(string,…)
            returns the column of the matched separators or expressions, in addition to
            chunks.
            Then strcat([chunks' ; [matched_separators' ""]]) should be
            equal to string.
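The round trip can be sketched as follows, with an illustrative timestamp string and a character-class pattern (assumed to follow standard PCRE syntax). Interleaving the chunks with the matched separators, padded with a final "", should rebuild the input.

```scilab
t = "2024-01-15T10:30";
[c, s] = strsplit(t, "/[-T:]/");

// One fewer separator than chunks
assert_checkequal(size(s, 1), size(c, 1) - 1);

// strcat() concatenates column by column: c(1) s(1) c(2) s(2) ... c($) ""
rebuilt = strcat([c' ; [s' ""]]);
assert_checkequal(rebuilt, t);
```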
Examples
Split at given indices:
strsplit("Scilab")'
strsplit("αβδεϵζηθικλμνξοπρστυφϕχψω", [1 6 11])
--> strsplit("Scilab")'
 ans  =
  "S"  "c"  "i"  "l"  "a"  "b"
--> strsplit("αβδεϵζηθικλμνξοπρστυφϕχψω", [1 6 11])
 ans  =
  "α"
  "βδεϵζ"
  "ηθικλ"
  "μνξοπρστυφϕχψω"
Split at matching separators:
strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "aa") // t starts with the separator => heading "" chunk

// Consecutive separators are not squeezed:
strsplit("abbcccdde", "c")'

// With several possible separators:
t = "aabcabbcbaaacacaabbcbccaaabcbc";
[c, s] = strsplit(t, ["aa" "bb"]);
c', s'
strcat([c';[s' ""]]) == t

// Let's limit the number of splits to 4 => 4 chunks + unprocessed tail:
strsplit("aabcabbcbaaacacaabbcbccaaabcbc", ["aa" "bb"], 4)

// Splitting a string ending with a separator yields a final "":
strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "cbc")'
--> strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "aa") // t starts with the separator => heading "" chunk
 ans  =
  ""
  "bcabbcb"
  "acac"
  "bbcbcc"
  "abcbc"
--> // Consecutive separators are not squeezed:
--> strsplit("abbcccdde", "c")'
 ans  =
  "abb"  ""  ""  "dde"
--> // With several possible separators:
--> t = "aabcabbcbaaacacaabbcbccaaabcbc";
--> [c, s] = strsplit(t, ["aa" "bb"]);
--> c', s'
 ans  =
  ""  "bca"  "cb"  "acac"  ""  "cbcc"  "abcbc"
 ans  =
  "aa"  "bb"  "aa"  "aa"  "bb"  "aa"
--> strcat([c';[s' ""]]) == t
 ans  =
  T
--> // Let's limit the number of splits to 4 => 4 chunks + unprocessed tail:
--> strsplit("aabcabbcbaaacacaabbcbccaaabcbc", ["aa" "bb"], 4)'
 ans  =
  ""  "bca"  "cb"  "acac"  "bbcbccaaabcbc"
--> // Splitting a string ending with a separator yields a final "":
--> strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "cbc")'
 ans  =
  "aabcabbcbaaacacaabb"  "caaab"  ""
Use a regular expression as scissors:
[c, s] = strsplit("C:\Windows\System32\OpenSSH\", "/\\|:/");
c', s'
[c, s] = strsplit("abcdef8ghijkl3mnopqr6stuvw7xyz", "/\d+/", 2);
c', s'
--> [c, s] = strsplit("C:\Windows\System32\OpenSSH\",  "/\\|:/");
--> c', s'
 ans  =
  "C"  ""  "Windows"  "System32"  "OpenSSH"  ""
 ans  =
  ":"  "\"  "\"  "\"  "\"
--> [c, s] = strsplit("abcdef8ghijkl3mnopqr6stuvw7xyz", "/\d+/", 2);
--> c', s'
 ans  =
  "abcdef"  "ghijkl"  "mnopqr6stuvw7xyz"
 ans  =
  "8"  "3"