Viet-Anh on Software Logo

What is: k-Means Clustering?

Year2000
Data SourceCC BY-SA - https://paperswithcode.com

k-Means Clustering is a clustering algorithm that divides a training set into kk different clusters of examples that are near each other. It works by initializing kk different centroids {μ(1),,μ(k)\mu\left(1\right),\ldots,\mu\left(k\right)} to different values, then alternating between two steps until convergence:

(i) each training example is assigned to cluster ii where ii is the index of the nearest centroid μ(i)\mu^{(i)}

(ii) each centroid μ(i)\mu^{(i)} is updated to the mean of all training examples x(j)x^{(j)} assigned to cluster ii.

Text Source: Deep Learning, Goodfellow et al

Image Source: scikit-learn