The method is quite simple.

We do a mean-shift-procedure for each vector, i.e. we use some weightening window (a.k.a. “kernel”, typically you use a Gaussian kernel) and compute a weighted mean of all the neighbored vectors around the current vector (the weight values come from the kernel function), then we shift the so called “mode estimate” exactly onto that new weighted sum (position) and continue (iterative procedure) until the mode estimate does not move significantly any longer, i.e. the mean shift vector has a length below some threshold.

Finally we collect all similar modes - it can happen e.g. that two modes lie with one pixel distance to each other.

Variance (sigma^2) of Gaussian kernel = 1000 pixel, i.e. standard deviation (sigma) = sqrt(1000) = 31.62 pixel

Final clusters:

i.e. standard deviation = sqrt(250) = 15.81 pixel

Note the very different final cluster results using var=1000 pixel vs. var=250 pixel for the Gaussian mean shift kernel.

You can see that you indirectly control the nr of clusters by the variance of the Gaussian kernel used for the mean shift procedure:

- small variance (sigma^2) ⇒ many clusters
- large variance (sigma^2) ⇒ few clusters

Have a look at the image above. The final cluster centers (the circles that are not filled) were computed using the final mode estimates. This is incorrect! Have a look at e.g. the dark blue cluster at the top. It has only one sample assigned, but the cluster center is not exactly on that sample!

For computing the final cluster centers we have to average all the positions that are assigned to a cluster, then we get the correct cluster centers:

public/mean_shift_clustering.txt · Last modified: 2012/02/01 17:31 (external edit) · []