K-Means vs K-Means++: Smarter Centroids, Better Clusters
K-Means++ is an upgrade to K-Means that addresses its biggest flaw: purely random initialization. Instead of picking all k centroids uniformly at random, K-Means++ seeds them one at a time:

1. Pick the first centroid uniformly at random from the data points.
2. For each remaining point x, compute D(x), its distance to the nearest centroid chosen so far.
3. Choose the next centroid from the dataset with probability proportional to D(x)^2.
4. Repeat steps 2 and 3 until k centroids are selected.

This spreads centroids out more effectively and leads to: ...
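
The seeding steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the names `kmeans_pp_init` and `squared_distance`, the `seed` parameter, the use of Euclidean distance, and the uniform fallback when every point already coincides with a chosen centroid are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_pp_init(points, k, seed=None):
    """Pick k initial centroids from `points` using K-Means++ style seeding."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)

    # Step 1: the first centroid is chosen uniformly at random.
    centroids = [rng.choice(points)]

    # D(x)^2 for every point, relative to the centroids chosen so far.
    d2 = [squared_distance(p, centroids[0]) for p in points]

    while len(centroids) < k:
        if sum(d2) == 0:
            # Assumption: if every point coincides with a chosen centroid
            # (duplicate data), fall back to a uniform pick.
            nxt = rng.choice(points)
        else:
            # Step 3: sample the next centroid with probability
            # proportional to D(x)^2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centroids.append(nxt)
        # Step 2, updated incrementally: keep each point's squared
        # distance to its nearest centroid.
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]

    return centroids


# Two tight groups far apart: the second seed is very likely to land
# in the group that the first seed did not come from.
data = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
print(kmeans_pp_init(data, k=2, seed=0))
```

Because a point that is already a centroid has D(x) = 0, it gets zero weight and cannot be picked again unless the data contains duplicates. The returned centroids would then be handed to the usual K-Means assignment and update loop.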