K-Means vs K-Means++: Smarter Centroids, Better Clusters

K-Means++ is an upgrade to K-Means that addresses its biggest flaw: random initialization. Instead of picking all k centroids at random, K-Means++:

1. Picks the first centroid uniformly at random from the data points.
2. Computes, for each remaining point x, the distance D(x) to the nearest centroid chosen so far.
3. Chooses the next centroid from the dataset with probability proportional to D(x)^2.
4. Repeats steps 2 and 3 until k centroids are selected.

This spreads centroids out more effectively and leads to: ...
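The seeding procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from the full post: the function name `kmeans_pp_init` and its parameters are hypothetical, distances are assumed to be Euclidean, and the data is assumed to contain at least k distinct points.

```python
import numpy as np

def kmeans_pp_init(X, k, seed=None):
    """Pick k initial centroids from the rows of X using K-Means++ seeding.

    Hypothetical helper for illustration; assumes Euclidean distance
    and at least k distinct rows in X.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # First centroid: chosen uniformly at random from the data points.
    chosen = [rng.integers(n)]

    # d2[i] holds D(x_i)^2, the squared distance from point i
    # to its nearest centroid chosen so far.
    d2 = np.sum((X - X[chosen[0]]) ** 2, axis=1)

    while len(chosen) < k:
        # Sample the next centroid with probability proportional to D(x)^2.
        # Already-chosen points have d2 == 0, so they cannot be picked again.
        probs = d2 / d2.sum()
        idx = rng.choice(n, p=probs)
        chosen.append(idx)

        # Update D(x)^2: keep the smaller of the old value and the
        # squared distance to the newly added centroid.
        d2 = np.minimum(d2, np.sum((X - X[idx]) ** 2, axis=1))

    return X[chosen]
```

Calling `kmeans_pp_init(X, 3, seed=0)` on an `(n, d)` array returns a `(3, d)` array of starting centroids that could then be handed to a standard K-Means loop. In practice a library implementation is usually preferable; scikit-learn's `KMeans`, for example, uses `init="k-means++"` by default.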

June 30, 2025 · 1 min · Shivam Chhuneja

Reducing Churn in E-Commerce: My End-to-End Capstone Project in Predictive Modeling

Customer churn isn’t just a marketing problem; it’s a business survival issue. In competitive industries like e-commerce, losing one customer often means losing several revenue streams, especially when one account can represent multiple users. This post is a breakdown of my churn prediction capstone project for the postgraduate data science program at UT Austin, which is also tied to my master’s in data science at Deakin University. The project was closed-source, so I can’t release the full notebook, but I’ll walk you through everything I did, including code snippets, results, charts, what I learned, and where this project fits in my larger journey into machine learning and MLOps. ...

May 27, 2025 · 5 min · Shivam Chhuneja