vectorfind

locates occurrences of a (wildcarded) vector in a matrix or hypermatrix

Syntax

ind             = vectorfind(haystack, needle)
ind             = vectorfind(haystack, needle, dimAlong)
ind             = vectorfind(haystack, needle, dimAlong, ,indType)
[ind, matching] = vectorfind(haystack, needle, dimAlong, joker)
[ind, matching] = vectorfind(haystack, needle, dimAlong, joker, indType)

Arguments

haystack

A matrix or hypermatrix of any type, possibly sparse encoded: The array in which the vector will be searched.

needle

The vector to be searched in the haystack, of the same type. If the haystack is sparse-encoded, the needle may be dense. In addition, if the haystack is boolean and a joker is used, the needle must be numerical instead of boolean. In this case, each of its non-zero components stands for %T.

  • Decimal numbers, complex numbers, and encoded integers are considered of the same type: numerical.
  • %nan values are accepted in the needle. They are processed in a regular way, as other values. They are matched only by %nan in the haystack.
dimAlong

Direction inside the haystack array along which the needle vector is searched. Possible values are "r" or 1 (along rows), "c" or 2 (along columns), or for a hypermatrix, any integer such that 2 < dimAlong <= ndims(haystack) representing the index of the scanned dimension. By default, "r" is used.

dimAlong is mandatory when a joker or indType is specified.
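
The equivalence between the literal and numerical forms of dimAlong can be checked on a small matrix. This is a minimal sketch, assuming Scilab 6.1 or later (where numerical directions are accepted); the matrix h and the variable names are made up for illustration.

```scilab
// Hypothetical 3x2 haystack: rows 1 and 3 are [1 2], column 2 is [2 4 2]
h = [1 2
     3 4
     1 2];

rDefault = vectorfind(h, [1 2])          // default direction "r"  => [1 3]
rNumeric = vectorfind(h, [1 2], 1)       // 1 is the same as "r"   => [1 3]
cLiteral = vectorfind(h, [2 4 2], "c")   // search as columns      => 2
cNumeric = vectorfind(h, [2 4 2], 2)     // 2 is the same as "c"   => 2
```
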
joker

Single element of needle's data type. The needle components equal to the joker are ignored (they match/accept any values from the haystack).

When the haystack is boolean, the joker must be a non-zero number.

To use indType without any joker, leave the joker argument empty: vectorfind(haystack, needle, dimAlong, , indType).
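
The two calling forms can be compared side by side. This is a minimal sketch on a made-up text matrix h, using the empty string as joker as the Examples section does for text; the variable names are illustrative only.

```scilab
// Hypothetical 3x3 text haystack
h = ["a" "b" "c"
     "a" "x" "c"
     "c" "b" "a"];

// With a joker: "" in the needle accepts any value, so rows "a * c" match
withJoker = vectorfind(h, ["a" "" "c"], "r", "")            // => [1 2]

// Without a joker but with indType: the joker slot is left empty
noJoker = vectorfind(h, ["a" "b" "c"], "r", , "headIJK")    // => [1 1]
```
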

indType

Single case-insensitive word among "" (empty text = default), "headIJK", and "headN": Specifies the format of the returned indices. See the description of ind below.

ind

  • When the needle is longer than the haystack size along the chosen dimension dimAlong, ind=[] is returned.

  • When the needle's length matches the haystack size along the chosen dimension,

    • By default (indType==""): ind is a row vector containing the indices of matching rows or columns of the haystack. For a hypermatrix, returned indices of matching ranges are linearized across all dimensions but the dimAlong one (see examples).

    • indType="headN": ind is the row vector of linear indices in the haystack of the heading component of its matching rows, columns, or higher ranges.

    • indType="headIJK": ind is a matrix: Each row returns the [i j ..] indices in the haystack of the heading component of its matching ranges (rows, columns, or higher ranges). ind has as many rows as there are matching ranges in the haystack.

  • Otherwise (short needle): By default, ind is the row vector of linear indices of the components of the haystack where matching ranges start. Using the indType="headN" option changes nothing. Using indType="headIJK" returns ind as a matrix of [i j k ..] indices, as described above.

Returned indices are sorted in increasing order.
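
The three index formats can be compared on one haystack. This is a minimal sketch following the rules above, with a made-up matrix h whose needle is as long as its columns; the last line only illustrates how the "headIJK" rows relate to the "headN" linear indices.

```scilab
// Hypothetical 3x4 haystack: columns 1 and 3 are equal to the needle
h = [1 5 1 7
     2 6 2 8
     3 9 3 0];
needle = [1 2 3];

iCol = vectorfind(h, needle, "c")               // matching columns        => [1 3]
iN   = vectorfind(h, needle, "c", , "headN")    // linear index of heads   => [1 7]
iIJK = vectorfind(h, needle, "c", , "headIJK")  // [i j] of heads          => [1 1 ; 1 3]

// Linear index rebuilt by hand from the [i j] rows: i + (j-1)*nrows
lin = iIJK(:,1)' + (iIJK(:,2)' - 1) * size(h, 1)   // => [1 7]
```
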
matching

When a joker is used, the optional output matching is a matrix of the haystack's data type that returns the actual matching ranges: The matching range #i is returned in the row matching(i,:).

When the haystack is sparse-encoded, the matching matrix is sparse as well.
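
The correspondence between ind(i) and matching(i,:) can be sketched with a short wildcarded needle. The matrix h and the joker value -1 (absent from h) are made up for illustration.

```scilab
// Hypothetical 2x5 haystack
h = [7 1 7 0 0
     3 7 4 7 5];

// Rows containing a [7 * 7] range; -1 is the joker
[ind, matching] = vectorfind(h, [7 -1 7], "r", -1)
// ind      => [1 4]            linear indices of h(1,1) and h(2,2)
// matching => [7 1 7 ; 7 4 7]  row #i is the range starting at ind(i)
```
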

Description

vectorfind() looks for a given series of values (needle) in a haystack array, along a given direction/dimension: width (rows), height (columns), thickness (like RGB pixels), etc. The needle may be as long as or shorter than the size of the probed side of the haystack.

A special value called the joker may be specified. This value then works as a wildcard wherever it occurs in the needle vector. Since this value is no longer selective (ANY value from the haystack matches at its position), it cannot simultaneously be used in the needle as a selective one. In practice, any value not present in the haystack necessarily makes a good joker. However, this condition is not mandatory.

Consequence: When the haystack is boolean, the joker -- and so the needle vector as well -- must be numerical. Indeed, it would be otherwise impossible to choose a joker value out of the {%T, %F} limited set of values.

When such a wildcard is used, actual values in matching ranges are not fixed. They can then be retrieved through the optional output matching. Otherwise, matching is empty (it would be a trivial repetition of the needle vector).
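
One possible way to pick a joker that is certainly absent from a numerical haystack is to take a value above its maximum. This is only a sketch of that idea, not a rule imposed by vectorfind(); the matrix h is made up for illustration.

```scilab
// Hypothetical 3x3 haystack
h = [4 9 4
     4 2 6
     5 9 4];

joker = max(h) + 1;     // 10: greater than every element, so absent from h

// Rows matching [* 9 4]
[ind, matching] = vectorfind(h, [joker 9 4], "r", joker)
// ind      => [1 3]
// matching => [4 9 4 ; 5 9 4]
```
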

Search in hypermatrices

Using vectorfind() with a hypermatrix haystack deserves some special attention:

  • About the direction value dimAlong:

    For instance, the haystack array can be probed in "thickness", i.e. across its successive layers haystack(:,:,#,..). To do so, specify dimAlong = 3.

    As with matrices, this kind of high-dimension array can be scanned along its rows or columns. The corresponding dimAlong values are then exceptions to the general numbering of dimensions:

    • Searching the needle as rows is scanning the array across its columns. Therefore, the dimAlong = "r" value should be equivalent to dimAlong = 2 instead of 1!
    • In the same way, searching the needle as columns is scanning the array across its rows: The usual value dimAlong = "c" should be equivalent to dimAlong = 1 instead of 2!

    In order not to break the common convention "r"<=>1 and "c"<=>2 used everywhere in Scilab, vectorfind() keeps it and copes with it. But one should keep in mind the underlying switch, to clearly understand the default indices returned when "r",1 or "c",2 are used.

  • About returned indices of matching rows, columns, "pixels"... when the needle is as long as the haystack side size and no indType option is used:

    Indices of matching ranges are then linear indices of components of the following subspaces:

    • With dimAlong = "r" = 1: in haystack(:,1,:,:..)
    • With dimAlong = "c" = 2: in haystack(1,:,:,:..)
    • With dimAlong = 3: in haystack(:,:,1,:..)
    • With dimAlong = 4: in haystack(:,:,:,1,:..).
    • etc...
    The case of a 3D and of a 4D array is dealt with in the Examples section.

    Although they are easy to understand and use for a simple matrix, these linear indices in the haystack subspace are somewhat hard to work with when actually addressing the matching ranges in an N-dimensional array with N>2. The option indType = "headN" | "headIJK" then returns more workable indices referring to the whole haystack array.
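
The difference between the default subspace indices and the whole-array indices can be sketched on a small 3D array. The hypermatrix h below is made up for illustration; the expected values follow from the rules stated above.

```scilab
// Hypothetical 2x3x2 hypermatrix: the columns [1;2] are h(:,1,1), h(:,3,1), h(:,2,2)
h = zeros(2, 3, 2);
h(:,:,1) = [1 0 1 ; 2 0 2];
h(:,:,2) = [0 1 0 ; 0 2 0];

sub = vectorfind(h, [1 2], "c")               // linear indices in h(1,:,:)  => [1 3 5]
iN  = vectorfind(h, [1 2], "c", , "headN")    // linear indices in h         => [1 5 9]
ijk = vectorfind(h, [1 2], "c", , "headIJK")  // => [1 1 1 ; 1 3 1 ; 1 2 2]

// Addressing the 2nd matching column directly from its [i j k] head
col = h(:, ijk(2,2), ijk(2,3))                // => [1 ; 2]
```
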

Examples

In a matrix of numbers:

m = [ 1  0   1   2  2  1
      2  2   0   1  0  2
      0  2  %nan 2  1  2
      2 %nan 1   0  1  2
    ];
vectorfind(m,[2 0 1 1], "c")            // => 5
vectorfind(m,[2 0 1 1], "c",,"headN")   // => 17
vectorfind(m,[2 0 1 1], "c",,"headIJK") // => [1 5]

// With a short needle:
vectorfind(m,[2 2])                     // => [2 13]
vectorfind(m,[2 2], "r",,"headN")       // same output
vectorfind(m,[2 2], "r",,"headIJK")     // => [2 1 ; 1 4]
vectorfind(m,[2 %nan])                  // => [4 7]

// With a wildcard in the needle:

// ex #1: All columns starting with 1 and ending with 2:
[n, ma] = vectorfind(m,[1 .3 .3 2], "c", .3) // => n = [1 6], ma = [1 2 0 2; 1 2 2 2]

// ex #2: All rows having a [2 * 2] range (wildcarded short needle):
[n, ma] = vectorfind(m,[2 .3  2], "r", .3)   // => n = [7 15], ma = [2 %nan 2; 2 1 2]
vectorfind(m,[2 .3  2], "r", .3, "headIJK")  // => [3 2 ; 3 4]
                                             // Note: The %nan is matched by *

In a boolean matrix:

m = [0  0  0  1  1  0
     0  1  1  1  0  1
     1  1  0  1  1  1
     1  0  1  0  0  1]==1
// m  =
//  F F F T T F
//  F T T T F T
//  T T F T T T
//  T F T F F T
vectorfind(m, [%F %T %T %F], "c")   // => 2
vectorfind(m, [%T %T], "c")         // => [3 6 13 14 22 23]
vectorfind(m, [1 1], "c")           // => error: same type expected
// Joker => the needle is numerical:
[n, ma] = vectorfind(m, [0 %nan 0 %nan 1], "r", %nan) // => n=[1 8], ma=[F F F T T ; F T F F T]

In a tiny 8-color RGB image (3D hypermatrix of uint8 integers):

// Generating the array of color brightnesses:
m = [1  1  1  1  1  0  1  0  0  0  1  0  1  0  0
     1  1  0  0  0  0  1  0  1  0  1  1  1  1  1
     1  1  0  1  0  1  1  0  0  1  1  0  0  1  0];
m = uint8(matrix(m,3,5,3)*255)
// m  =
//(:,:,1)                   // RED layer
//  255  255  255  255  255
//  255  255    0    0    0
//  255  255    0  255    0
//(:,:,2)                   // GREEN layer
//    0  255    0    0    0
//    0  255    0  255    0
//  255  255    0    0  255
//(:,:,3)                   // BLUE layer
//  255    0  255    0    0
//  255  255  255  255  255
//  255    0    0  255    0

// Locates red pixels:
vectorfind(m, [255 0 0], 3)             // => [10 13]
vectorfind(m, [255 0 0], 3,,"headIJK")  // => [1 4 1 ; 1 5 1]

// Pixels with Green & Blue ON, whatever is their Red channel:
//   We may use a decimal-encoded needle (not a uint8).
//   Then, %nan is a possible joker, that can't be in the uint8 image:
vectorfind(m, [%nan 255 255], 3, %nan,"headIJK") // => [3 1 1; 2 2 1; 2 4 1]

// Columns of 255:
vectorfind(m, [255 255 255], "c")      // => [1 2 7 11]

In a 4D hypermatrix of text:

m  = [
  "U"  "C"  "G"  "A"  "A"  "A"  "U"  "U"  "A"  "G"  "A"  "G"
  "A"  "A"  "A"  "A"  "C"  "C"  "U"  "U"  "C"  "G"  "G"  "G"
  "A"  "G"  "A"  "C"  "G"  "C"  "C"  "C"  "G"  "C"  "A"  "G"
  "C"  "U"  "G"  "G"  "G"  "A"  "A"  "G"  "C"  "C"  "C"  "C"
  "C"  "G"  "G"  "A"  "A"  "G"  "U"  "C"  "A"  "U"  "G"  "C"
  ];
m = matrix(m, 3, 5, 2, 2);
// (:,:,1,1)
// !U  C  A  G  A  !
// !A  C  G  G  G  !
// !A  C  U  A  G  !
//(:,:,2,1)
// !A  G  C  A  C  !
// !A  A  G  A  A  !
// !C  A  G  C  G  !
//(:,:,1,2)
// !U  A  U  C  G  !
// !U  U  C  A  C  !
// !C  U  G  C  A  !
//(:,:,2,2)
// !G  C  G  G  G  !
// !G  U  A  G  C  !
// !C  A  C  G  C  !

vectorfind(m, ["A" "A" "C"], "c")       // => [6 9]
vectorfind(m, [""  "G" "G"], "c", "")   // => [5 8 19]

// Joker
[n, ma] = vectorfind(m, ["" "G" "G"], "c", "", "headN") // => n=[13 22 55], ma=[A G G; C G G; G G G]
vectorfind(m, ["" "C" "C"], "c", "", "headIJK") // => [1 2 1 1 ; 1 5 2 2]

// Short needle
vectorfind(m, ["C" "C"], "c",,"headIJK")        // => [1 2 1 1; 2 2 1 1; 2 5 2 2]

// Short needle with joker
vectorfind(m, ["A" "" "A"],"r","","headIJK")    // => [1 3 1 1 ; 2 2 2 1]

See also

  • find — find indices of boolean vector or matrix true elements
  • members — count (and locate) in an array each element or row or column of another array
  • grep — find matches of a string in a vector of strings

History

Version    Description
6.1
  • vectorfind(H,[]) now returns [] instead of an error.
  • When the needle is too long, [] is now returned instead of an error.
  • A needle shorter than the haystack size can now be used.
  • A wildcard value matched by any value of the haystack can now be specified and used in the needle. Then, actual matching ranges can be returned: Options joker and matching added.
  • Any %nan value occurring in the needle is now processed as any other regular value: It is matched by %nan in the haystack. It could formerly never be matched.
  • Hypermatrices can now be processed as haystack.
  • The probing direction dimAlong can now be numerical: 1, 2, ..
  • Option indType added.
Scilab Enterprises
Copyright (c) 2011-2017 (Scilab Enterprises)
Copyright (c) 1989-2012 (INRIA)
Copyright (c) 1989-2007 (ENPC)
with contributors
Last updated:
Tue Feb 25 08:49:19 CET 2020