meanshift

Mean Shift clustering algorithm

Syntax

[centers, labels] = meanshift(X)
[centers, labels] = meanshift(X, bandwidth)
[centers, labels] = meanshift(X, bandwidth, opts)

Arguments

X

is a N x D (N samples, D features) real matrix.

bandwidth

a positive scalar. Radius parameter controlling the neighborhood window. Small values produce many small clusters; large values produce fewer clusters. If not provided or is equal to [], bandwidth is obtained by estimate_bandwidth.

opts

structure with fields:

max_iter: positive integer (default=300). Maximum number of iterations per seed for convergence.
kernel: string (default="flat"). Kernel type for weighting points in the neighborhood. Supported values: "flat" (uniform kernel, all neighbors equally weighted) and "gaussian" (Gaussian kernel, weights decrease with distance.)
seeds: real matrix (M x D) (default=X). Initial seed points for clustering. If not provided, seeds are initialized from the dataset.
bin_seeding: boolean (default=%f). If %t, seeds are initialized from a grid of binned points rather than all samples, which speeds up computation on large datasets.
min_bin_freq: positive integer (default=1). Minimum number of points required in a bin to be considered as a seed (used only when bin_seeding=%t).

centers

real matrix (K x D). Final cluster centers. K is the number of clusters.

labels

integers column vector (N x 1). Cluster assignment for each point (values from 1 to K).

Description

The Mean Shift algorithm is a non-parametric clustering that draws a neighbordhood circle of radius bandwidth around each seed, calculates the barycenter of the points located in this circle, and moves the point towards this barycenter. The iterations continue until the algorithm converges, that is, until the barycenters no longer move. Two points are classified in the same class if they have converged towards the same barycenter.

With flat kernel, all neighbors inside the window are weighted equally.
With gaussian kernel, weights decrease smoothly with distance.

Options seeds, bin_seeding, and min_bin_freq allow control over initialization for performance on large datasets.

Examples

Two well-separated blobs

rand("seed", 0)
n = 20;
X1 = [rand(n, 2) + 1; rand(n, 2) + 5];
[centers1, labels1] = meanshift(X1, 1);
scf(); 
scatter(X1(:,1), X1(:,2), [], labels1, "fill");
plot(centers1(:,1), centers1(:,2), 'r*', "markersize", 10, "thickness", 3);

Three nearby clusters

rand("seed", 0)
n = 20;
X2 = [rand(n, 2) + 1; rand(n, 1) + 3, rand(n, 1) + 1; rand(n, 1) + 2, rand(n, 1) + 3];
[centers2, labels2] = meanshift(X2, 0.8);
scf(); 
scatter(X2(:,1), X2(:,2), [], labels2, "fill");
plot(centers2(:,1), centers2(:,2), 'r*', "markersize", 10, "thickness", 3);

Unequal cluster sizes

rand("seed", 0)
X3 = [rand(50, 2)+2; rand(5,2)+8];
opts = struct("bin_seeding", %t, "min_bin_freq", 2);
[centers3, labels3] = meanshift(X3, 1.2, opts);
scf(); 
scatter(X3(:,1), X3(:,2), [], labels3, "fill");
plot(centers3(:,1), centers3(:,2), 'r*', "markersize", 10, "thickness", 3);

Gaussian kernel

rand("seed", 0)
n = 20;
X4 = [rand(n, 2) + 1; rand(n, 1) + 3 rand(n, 1) + 1];

// estimate bandwidth from data
bw = estimate_bandwidth(X4, 0.3);

opts = struct("kernel", "gaussian");
[centers4, labels4] = meanshift(X4, bw, opts);
scf(); 
scatter(X4(:,1), X4(:,2), [], labels4, "fill");
plot(centers4(:,1), centers4(:,2), 'r*', "markersize", 10, "thickness", 3);

History

Version	Description
2026.0.0	Function added.

Report an issue
<< kmeans	Statistics	polyfit >>

Copyright (c) 2022-2025 (Dassault Systèmes S.E.)
Copyright (c) 2017-2022 (ESI Group)
Copyright (c) 2011-2017 (Scilab Enterprises)
Copyright (c) 1989-2012 (INRIA)
Copyright (c) 1989-2007 (ENPC)
with contributors

Last updated:
Thu Oct 16 09:02:33 CEST 2025