kmeans
K-means clustering
Syntax
idx = kmeans(X, k)
[idx, c] = kmeans(X, k)
Arguments
- X
an n-by-p real matrix (n observations, p variables).
- k
a positive integer: the number of clusters.
- idx
an integer column vector containing the cluster index of each observation (row of X).
- c
a k-by-p real matrix containing the cluster centroid locations.
Description
kmeans is an unsupervised learning method for clustering data points. The algorithm iteratively partitions the points (rows) of X into k clusters, aiming to minimize the sum of the distances between each point and the centroid of the cluster it is assigned to.
kmeans uses the squared Euclidean distance metric.
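The quantity being minimized can be written out as follows. This is a sketch of the standard k-means objective rather than a statement about the internals of this function; the symbols x_i (row i of X), c_j (row j of c), S_j (the set of points assigned to cluster j) and J are notation introduced here for illustration.

```latex
J = \sum_{j=1}^{k} \sum_{i \in S_j} \lVert x_i - c_j \rVert^2
```

Each point contributes the squared Euclidean distance to its own cluster centroid, so J decreases as the clusters become more compact.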
idx = kmeans(X, k) returns a column vector containing the cluster index of each point.
[idx, c] = kmeans(X, k) also returns c, the k-by-p matrix containing the k cluster centroid locations.
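As a rough sketch of how the two outputs relate, the snippet below computes the within-cluster sum of squared distances from idx and c. It assumes that idx(i) is the row number in c of the centroid assigned to observation i, which this page does not state explicitly. The variable names d and wcss and the synthetic data are illustrative only.

```scilab
// Sketch only: assumes idx(i) is the row of c assigned to point i
rand("seed", 0)
// Two synthetic 2-D groups centred near (3, 3) and (-3, -3)
X = [rand(50, 2, "normal") + 3; rand(50, 2, "normal") - 3];
[idx, c] = kmeans(X, 2);
d = X - c(idx, :);   // offset of each point from its assigned centroid
wcss = sum(d.^2);    // within-cluster sum of squared Euclidean distances
disp(wcss)
```

A smaller wcss for the same k indicates tighter clusters, which can help when comparing runs.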
Examples
rand("seed", 0)
n = 200;
// Six Gaussian point clouds with different centers
x1 = rand(n, 2, "normal") + 3 * ones(n, 2);
x2 = rand(n, 2, "normal") - 3 * ones(n, 2);
x3 = rand(n, 2, "normal") + [3 -3].*.ones(n, 1);
x4 = rand(n, 2, "normal") + [-3 3].*.ones(n, 1);
x5 = rand(n, 2, "normal") + [1 -1].*.ones(n, 1);
x6 = rand(n, 2, "normal") + [-1 1].*.ones(n, 1);
x = [x1; x2; x3; x4; x5; x6];
scatter(x(:,1), x(:,2), "fill")

[index, c] = kmeans(x, 6);
scf(1);
scatter(x(:,1), x(:,2), [], index, "fill")
plot(c(:,1), c(:,2), "*r") // centroid of each cluster
See also
- pca — Computes principal component analysis for the data matrix X
History
Version | Description |
2025.0.0 | Introduction in Scilab. |