2025.0.0 - English


kmeans

K-means clustering

Syntax

idx = kmeans(X, k)
[idx, c] = kmeans(X, k)

Arguments

X

an n-by-p real matrix (n observations, p variables).

k

a positive integer: the number of clusters.

idx

an integer column vector with one entry per observation (row of X): the index of the cluster that observation is assigned to.

c

a k-by-p real matrix containing the cluster centroid locations, one centroid per row.

Description

kmeans is an unsupervised learning method for clustering data points. The algorithm iteratively partitions the points (rows) of X into k clusters so as to minimize the sum of the distances between each data point and the centroid of the cluster it is assigned to.

kmeans uses the squared Euclidean distance metric.
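The iteration described above can be illustrated with a minimal sketch of one Lloyd-style step: assign each point to its nearest centroid in squared Euclidean distance, then move each centroid to the mean of its points. This page does not state which k-means variant or initialization kmeans uses internally, so the helper lloyd_step below is a hypothetical illustration, not the library's implementation.

```scilab
// Sketch of one Lloyd-style iteration; lloyd_step is a hypothetical helper,
// not a Scilab built-in, and may differ from what kmeans does internally.
function [idx, c] = lloyd_step(X, c)
    n = size(X, 1);
    k = size(c, 1);
    // d2(i, j) = squared Euclidean distance from point i to centroid j
    d2 = zeros(n, k);
    for j = 1:k
        delta = X - ones(n, 1) * c(j, :);
        d2(:, j) = sum(delta .^ 2, "c");
    end
    // Assignment step: index of the nearest centroid for each row
    [dmin, idx] = min(d2, "c");
    // Update step: each non-empty cluster's centroid becomes the mean of its points
    for j = 1:k
        members = find(idx == j);
        if ~isempty(members) then
            c(j, :) = mean(X(members, :), "r");
        end
    end
endfunction

X = [0 0; 0 2; 10 0; 10 2];
c0 = [1 1; 9 1];             // arbitrary starting centroids
[idx, c] = lloyd_step(X, c0);
disp(idx')                   // 1 1 2 2
disp(c)                      // rows [0 1] and [10 1]
```

Repeating this step until idx stops changing gives the classical k-means iteration. The result generally depends on the starting centroids.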

idx = kmeans(X, k) returns idx, the column vector containing the cluster index of each point (row of X).

[idx, c] = kmeans(X, k) also returns c, the k-by-p matrix containing the k cluster centroid locations.

Examples

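// Fix the seed so that the generated data are reproducible
rand("seed", 0)
n = 200;
// Six Gaussian clouds of n points each, centred at
// (3,3), (-3,-3), (3,-3), (-3,3), (1,-1) and (-1,1).
// [a b].*.ones(n, 1) is a Kronecker product: it repeats the row [a b] n times.
x1 = rand(n, 2, "normal") + 3 * ones(n, 2);
x2 = rand(n, 2, "normal") - 3 * ones(n, 2);
x3 = rand(n, 2, "normal") + [3 -3].*.ones(n, 1);
x4 = rand(n, 2, "normal") + [-3 3].*.ones(n, 1);
x5 = rand(n, 2, "normal") + [1 -1].*.ones(n, 1);
x6 = rand(n, 2, "normal") + [-1 1].*.ones(n, 1);
// Stack the clouds into a single (6n)-by-2 data matrix
x = [x1; x2; x3; x4; x5; x6];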
scatter(x(:,1), x(:,2), "fill")

[index, c] = kmeans(x, 6);

scf(1);
scatter(x(:,1), x(:,2), [], index, "fill")
plot(c(:,1), c(:,2), "*r") // centroid of each cluster
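
The outputs can also be inspected numerically. The following is a small sketch, assuming Scilab 2025.0.0 or later (where kmeans is available). It checks the documented sizes of idx and c, counts the points in each cluster, and computes the within-cluster sum of squared Euclidean distances. The variable names X, J and d are illustrative only, and double(idx) is used defensively because this page does not specify the numeric type of idx.

```scilab
rand("seed", 0);
// Two well-separated Gaussian clouds of 50 points each
X = [rand(50, 2, "normal") + 4; rand(50, 2, "normal") - 4];

[idx, c] = kmeans(X, 2);

disp(size(idx))   // one index per row of X
disp(size(c))     // one centroid per cluster, one column per variable

// Number of points assigned to each cluster
for j = 1:2
    mprintf("cluster %d: %d points\n", j, sum(idx == j));
end

// Within-cluster sum of squared distances:
// c(double(idx), :) selects, for each row of X, the centroid of its cluster
d = X - c(double(idx), :);
J = sum(d .^ 2);
mprintf("sum of squared distances: %f\n", J);
```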

See also

  • pca — Computes principal component analysis for the data matrix X

History

Version     Description
2025.0.0    Introduction in Scilab.

Copyright (c) 2022-2024 (Dassault Systèmes)
Copyright (c) 2017-2022 (ESI Group)
Copyright (c) 2011-2017 (Scilab Enterprises)
Copyright (c) 1989-2012 (INRIA)
Copyright (c) 1989-2007 (ENPC)
with contributors
Last updated:
Thu Oct 24 11:13:09 CEST 2024