
# vectorfind

locates occurrences of a (wildcarded) vector in a matrix or hypermatrix

### Syntax

```ind             = vectorfind(haystack, needle)
ind             = vectorfind(haystack, needle, dimAlong)
ind             = vectorfind(haystack, needle, dimAlong, ,indType)
[ind, matching] = vectorfind(haystack, needle, dimAlong, joker)
[ind, matching] = vectorfind(haystack, needle, dimAlong, joker, indType)```

### Arguments

haystack

A matrix or hypermatrix of any type, possibly sparse-encoded: the array in which the vector is searched.

needle

The vector to be searched in the `haystack`, of the same type. If the `haystack` is sparse-encoded, the `needle` may be dense. In addition, if the `haystack` is boolean and a `joker` is used, the `needle` must be numerical instead of boolean. In this case, any of its non-zero components stands for `%T`.

Decimal numbers, complex numbers, and encoded integers are considered of the same type: numerical. `%nan` values are accepted in the `needle`. They are processed like any other value and are matched only by `%nan` in the `haystack`.

dimAlong

Direction inside the `haystack` array along which the `needle` vector is searched. Possible values are `"r"` or `1` (along rows), `"c"` or `2` (along columns), or for a hypermatrix, any integer such that `2 < dimAlong <= ndims(haystack)` representing the index of the scanned dimension. By default, `"r"` is used.

`dimAlong` is mandatory when a `joker` or `indType` is specified.

joker

Single element of `needle`'s data type. The `needle` components equal to the `joker` are ignored (they match/accept any values from the `haystack`).

When the haystack is boolean, the `joker` must be a non-zero number.

To skip the `joker`, specify `..dimAlong, ,indType` with no joker value.

indType

Single case-insensitive word among `""` (empty text = default), `"headIJK"`, and `"headN"`: specifies the format of the returned indices. See the description of `ind` below.

ind

• When the `needle` is longer than the `haystack` size along the chosen dimension `dimAlong`, `ind=[]` is returned.

• When the `needle`'s length matches the `haystack` size along the chosen dimension:

  • By default (`indType==""`): `ind` is a row vector containing the indices of matching rows or columns of the haystack. For a hypermatrix, returned indices of matching ranges are linearized across all dimensions but the `dimAlong` one (see examples).

  • `indType="headN"`: `ind` is the row vector of linear indices in the `haystack` of the heading component of its matching rows, columns, or higher ranges.

  • `indType="headIJK"`: `ind` is a matrix: each row returns the `[i j ..]` indices in the `haystack` of the heading component of a matching range (row, column, or higher range). `ind` has as many rows as there are matching ranges in the `haystack`.

• Otherwise (short needle): By default, `ind` is the row vector of linear indices of the components of the `haystack` where matching ranges start. Using the `indType="headN"` option changes nothing. Using `indType="headIJK"` returns `ind` as a matrix of `[i j k ..]` indices, as described above.

Returned indices are sorted in increasing order.

matching

When a joker is used, the optional `matching` output is a matrix of the haystack's data type that returns the actual matching ranges: matching range #i is returned in the row `matching(i,:)`.

 When the `haystack` is sparse-encoded, the `matching` matrix is sparse as well.
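The index formats described for `ind` can be compared on a small matrix. The following is a minimal sketch: the matrix `H` and its values are made up for illustration, and the commented results are those expected from the rules stated above, not output copied from a session. Note the empty fourth argument, which skips the `joker`.

```scilab
// Illustrative haystack (hypothetical values): columns 1 and 3 are identical
H = [1 4 1
     2 5 2
     3 6 3];

// Needle as long as the columns of H
iDefault = vectorfind(H, [1 2 3], "c")               // expected [1 3]: matching columns
iHeadN   = vectorfind(H, [1 2 3], "c", , "headN")    // expected [1 7]: linear indices of H(1,1) and H(1,3)
iHeadIJK = vectorfind(H, [1 2 3], "c", , "headIJK")  // expected [1 1 ; 1 3]

// Needle shorter than the columns: default indices are already linear ones
iShort    = vectorfind(H, [2 3], "c")                // expected [2 8]
iShortIJK = vectorfind(H, [2 3], "c", , "headIJK")   // expected [2 1 ; 2 3]
```

With a full-length needle, only the default format numbers the matching columns themselves. Both `"headN"` and `"headIJK"` point at the first component of each match inside `H`.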

### Description

`vectorfind()` looks for a given series of values (needle) in a haystack array, along a given direction/dimension: width (rows), height (columns), thickness (like RGB pixels), etc. The needle may be as long as or shorter than the size of the probed side of the haystack.

A special value, called the joker, may be specified. It then works as a wildcard wherever it occurs in the needle vector. Since this value is no longer selective (ANY value from the haystack matches at its position), it cannot simultaneously be used in the needle as a selective one. In practice, any value not present in the haystack necessarily makes a good joker. However, this condition is not mandatory.

Consequence: When the haystack is boolean, the joker, and therefore the needle vector as well, must be numerical. Otherwise, it would be impossible to choose a joker value outside the limited set of values {%T, %F}.

When such a wildcard is used, actual values in matching ranges are not fixed. They can then be retrieved through the optional `matching` output. Without a joker, `matching` is returned empty, since it would be a trivial repetition of the needle vector.
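The joker and the `matching` output can be sketched as follows. The arrays `H` and `B` and their values are hypothetical, chosen only for illustration, and the commented results are those expected from the description above rather than captured output.

```scilab
// Numerical haystack (hypothetical values); -1 does not occur in it,
// so it can safely serve as the joker
H = [1 5 2
     1 7 2
     3 7 2];
[ind, matching] = vectorfind(H, [1 -1 2], "r", -1)
// expected: ind = [1 2], matching = [1 5 2 ; 1 7 2]

// Boolean haystack: the needle and the joker must be numerical.
// Non-zero needle components stand for %T, and %nan is the joker here.
B = [%T %F %T
     %F %F %T];
[indB, matchingB] = vectorfind(B, [1 %nan 1], "r", %nan)
// expected: indB = 1, matchingB = [%T %F %T]
```

In both calls, `matching` shows which actual values filled the wildcarded position of each matching row.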

##### Search in hypermatrices

Using `vectorfind()` with a hypermatrix haystack deserves some special attention:

• About the direction value `dimAlong`:

For instance, we can probe the haystack array in "thickness", i.e. across its successive layers `haystack(:,:,#,..)`. To do so, we specify `dimAlong = 3`.

As with matrices, this kind of high-dimension array can be scanned along its rows or columns. The corresponding `dimAlong` values then call for a caveat:

  • Searching the needle as rows is scanning the array across its columns. Therefore, the `dimAlong = "r"` value should be equivalent to `dimAlong = 2` instead of 1!
  • In the same way, searching the needle as columns is scanning the array across its rows: The usual value `dimAlong = "c"` should be equivalent to `dimAlong = 1` instead of 2!

In order not to break the common convention `"r"<=>1` and `"c"<=>2` used everywhere in Scilab, `vectorfind()` keeps it and copes with it. But one should keep the underlying switch in mind to clearly understand the default indices returned when `"r",1` or `"c",2` are used.

• About returned indices of matching rows, columns, "pixels"... when the needle is as long as the haystack side size and no `indType` option is used:

Indices of matching ranges are then linear indices of components of the following subspaces:

• With `dimAlong = "r" = 1`: in `haystack(:,1,:,:..)`
• With `dimAlong = "c" = 2`: in `haystack(1,:,:,:..)`
• With `dimAlong = 3`: in `haystack(:,:,1,:..)`
• With `dimAlong = 4`: in `haystack(:,:,:,1,:..)`.
• etc.

The cases of a 3D and of a 4D array are dealt with in the Examples section.

Although they are easy to understand and use for a simple matrix, these linear indices in the haystack subspace are somewhat hard to use to actually address the matching ranges in an N-dimensional array with N>2. The option `indType = "headN" | "headIJK"` then returns more workable indices referring to the whole `haystack` array.
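The difference between subspace indices and whole-array indices can be sketched on a tiny hypermatrix. The array `H` and its values are hypothetical, and the commented results are those expected from the rules above rather than captured output.

```scilab
// 2x2x2 haystack (hypothetical values): the column [3;4] occurs in both layers
H = matrix([1 2 3 4 5 6 3 4], 2, 2, 2);
// H(:,:,1) = [1 3 ; 2 4]     H(:,:,2) = [5 3 ; 6 4]

// Default: linear indices within the subspace H(1,:,:), of size 1x2x2
iSub = vectorfind(H, [3 4], "c")               // expected [2 4]

// Indices referring to the whole array H
iN   = vectorfind(H, [3 4], "c", , "headN")    // expected [3 7]
iIJK = vectorfind(H, [3 4], "c", , "headIJK")  // expected [1 2 1 ; 1 2 2]
```

Here `iSub` only makes sense relative to `H(1,:,:)`, whereas `iN` and `iIJK` directly locate the first component of each matching column in `H`.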

### Examples

In a matrix of numbers:

```m = [ 1  0   1   2  2  1
2  2   0   1  0  2
0  2  %nan 2  1  2
2 %nan 1   0  1  2
];
vectorfind(m,[2 0 1 1], "c")            // => 5
vectorfind(m,[2 0 1 1], "c",,"headN")   // => 17
vectorfind(m,[2 0 1 1], "c",,"headIJK") // => [1 5]

// With a short needle:
vectorfind(m,[2 2])                     // => [2 13]
vectorfind(m,[2 2], "r",,"headN")       // same output
vectorfind(m,[2 2], "r",,"headIJK")     // => [2 1 ; 1 4]
vectorfind(m,[2 %nan])                  // => [4 7]

// With a wildcard in the needle:

// ex #1: All columns starting with 1 and ending with 2:
[n, ma] = vectorfind(m,[1 .3 .3 2], "c", .3) // => n = [1 6], ma = [1 2 0 2; 1 2 2 2]

// ex #2: All rows having a [2 * 2] range (wildcarded short needle):
[n, ma] = vectorfind(m,[2 .3  2], "r", .3)   // => n = [7 15], ma = [2 %nan 2; 2 1 2]
vectorfind(m,[2 .3  2], "r", .3, "headIJK")  // => [3 2 ; 3 4]
// Note: The %nan is matched by *```

In a boolean matrix:

```m = [0  0  0  1  1  0
0  1  1  1  0  1
1  1  0  1  1  1
1  0  1  0  0  1]==1
// m  =
//  F F F T T F
//  F T T T F T
//  T T F T T T
//  T F T F F T
vectorfind(m, [%F %T %T %F], "c")   // => 2
vectorfind(m, [%T %T], "c")         // => [3 6 13 14 22 23]
vectorfind(m, [1 1], "c")           // => error: same type expected
// Joker => the needle is numerical:
[n, ma] = vectorfind(m, [0 %nan 0 %nan 1], "r", %nan) // => n=[1 8], ma=[F F F T T ; F T F F T]```

In a tiny 8-color RGB image (3D hypermatrix of uint8 integers):

```// Generating the array of color brightnesses:
m = [1  1  1  1  1  0  1  0  0  0  1  0  1  0  0
1  1  0  0  0  0  1  0  1  0  1  1  1  1  1
1  1  0  1  0  1  1  0  0  1  1  0  0  1  0];
m = uint8(matrix(m,3,5,3)*255)
// m  =
//(:,:,1)                   // RED layer
//  255  255  255  255  255
//  255  255    0    0    0
//  255  255    0  255    0
//(:,:,2)                   // GREEN layer
//    0  255    0    0    0
//    0  255    0  255    0
//  255  255    0    0  255
//(:,:,3)                   // BLUE layer
//  255    0  255    0    0
//  255  255  255  255  255
//  255    0    0  255    0

// Locates red pixels:
vectorfind(m, [255 0 0], 3)             // => [10 13]
vectorfind(m, [255 0 0], 3,,"headIJK")  // => [1 4 1 ; 1 5 1]

// Pixels with Green & Blue ON, whatever is their Red channel:
//   We may use a decimal-encoded needle (not a uint8).
//   Then, %nan is a possible joker, that can't be in the uint8 image:
vectorfind(m, [%nan 255 255], 3, %nan,"headIJK") // => [3 1 1; 2 2 1; 2 4 1]

// Columns of 255:
vectorfind(m, [255 255 255], "c")      // => [1 2 7 11]```

In a 4D hypermatrix of text:

```m  = [
"U"  "C"  "G"  "A"  "A"  "A"  "U"  "U"  "A"  "G"  "A"  "G"
"A"  "A"  "A"  "A"  "C"  "C"  "U"  "U"  "C"  "G"  "G"  "G"
"A"  "G"  "A"  "C"  "G"  "C"  "C"  "C"  "G"  "C"  "A"  "G"
"C"  "U"  "G"  "G"  "G"  "A"  "A"  "G"  "C"  "C"  "C"  "C"
"C"  "G"  "G"  "A"  "A"  "G"  "U"  "C"  "A"  "U"  "G"  "C"
];
m = matrix(m, 3, 5, 2, 2);
// (:,:,1,1)
// !U  C  A  G  A  !
// !A  C  G  G  G  !
// !A  C  U  A  G  !
//(:,:,2,1)
// !A  G  C  A  C  !
// !A  A  G  A  A  !
// !C  A  G  C  G  !
//(:,:,1,2)
// !U  A  U  C  G  !
// !U  U  C  A  C  !
// !C  U  G  C  A  !
//(:,:,2,2)
// !G  C  G  G  G  !
// !G  U  A  G  C  !
// !C  A  C  G  C  !

vectorfind(m, ["A" "A" "C"], "c")       // => [6 9]
vectorfind(m, [""  "G" "G"], "c", "")   // => [5 8 19]

// Joker
[n, ma] = vectorfind(m, ["" "G" "G"], "c", "", "headN") // => n=[13 22 55], ma=[A G G; C G G; G G G]
vectorfind(m, ["" "C" "C"], "c", "", "headIJK") // => [1 2 1 1 ; 1 5 2 2]

// Short needle
vectorfind(m, ["C" "C"], "c",,"headIJK")        // => [1 2 1 1; 2 2 1 1; 2 5 2 2]

// Short needle with joker
vectorfind(m, ["A" "" "A"],"r","","headIJK")    // => [1 3 1 1 ; 2 2 2 1]```

### See also

• find — gives the indices of %T or non-zero elements
• members — count (and locate) in an array each element or row or column of another array
• grep — finds matches of a string in a vector of strings

### History

Version 6.1:

• `vectorfind(H,[])` now returns `[]` instead of an error.
• When the needle is too long, `[]` is now returned instead of an error.
• A needle shorter than the haystack size can now be used.
• A wildcard value matched by any value of the haystack can now be specified and used in the needle. Then, actual matching ranges can be returned: options `joker` and `matching` added.
• Any `%nan` value occurring in the needle is now processed as any other regular value: it is matched by `%nan` in the haystack. It could formerly never be matched.
• Hypermatrices can now be processed as haystack.
• The probing direction `dimAlong` can now be numerical: 1, 2, ..
• Option `indType` added.