This calculator runs the k-means clustering algorithm on two-dimensional data. You provide a list of points in the plane and choose how many clusters k you want. The tool then returns the coordinates of the cluster centroids and the cluster assignment for each point.
Enter one point per line in the format `x,y`. You may use integers or decimals, with optional spaces after the comma (for example, `1,2` or `3.5, -0.2`). The number of clusters must satisfy 1 ≤ k ≤ the number of points. After you click the button to run k-means, the calculator computes the cluster centroids and the cluster assignment for each point.
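If you want to replicate the input handling, the `x,y` format above can be parsed along these lines. This is an illustrative sketch, not the calculator's actual code:

```python
def parse_points(text):
    """Parse one 'x,y' pair per line; float() tolerates surrounding spaces."""
    points = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue                      # skip blank lines
        x_str, y_str = line.split(",")
        points.append((float(x_str), float(y_str)))
    return points

print(parse_points("1,2\n3.5, -0.2"))   # [(1.0, 2.0), (3.5, -0.2)]
```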
K-means is an unsupervised learning method that partitions data into k clusters. Each cluster is represented by a centroid (a point in the same space as the data). The algorithm tries to place centroids so that points in the same cluster are close to each other and far from points in other clusters, using standard Euclidean distance.
Suppose you have $n$ data points in 2D, written as $p_1, p_2, \dots, p_n$, where each point $p_i$ has coordinates $(x_i, y_i)$.

You choose a number of clusters $k$ (with $1 \le k \le n$). The algorithm searches for centroids $c_1, c_2, \dots, c_k$ and a partition of the points into sets (clusters) $S_1, S_2, \dots, S_k$ that minimize the total squared distance from each point to the centroid of its cluster. In symbols, k-means tries to minimize the objective

$$J = \sum_{j=1}^{k} \sum_{p_i \in S_j} \lVert p_i - c_j \rVert^2.$$

Here $\lVert p_i - c_j \rVert$ is the usual Euclidean distance between point $p_i$ and centroid $c_j$. In 2D this distance is

$$\lVert p_i - c_j \rVert = \sqrt{(x_i - x_{c_j})^2 + (y_i - y_{c_j})^2},$$

where $(x_{c_j}, y_{c_j})$ are the coordinates of $c_j$. The centroid of each cluster is simply the average of the points assigned to it:

$$c_j = \frac{1}{\lvert S_j \rvert} \sum_{p_i \in S_j} p_i.$$
In practice, k-means alternates between assigning each point to its nearest centroid and recomputing centroids as these averages, until the assignments stop changing or the improvement becomes negligible.
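The alternating procedure just described (often called Lloyd's algorithm) can be sketched in a few dozen lines. This is a minimal illustration, not the calculator's internals; the initialization strategy (sampling k data points) is one common choice among several:

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Basic Lloyd's algorithm: alternate assignment and update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initialize from k distinct data points
    assignments = None
    for _ in range(max_iter):
        # Assignment step: send each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break                       # assignments stable: converged
        assignments = new_assignments
        # Update step: recompute each centroid as the mean of its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:                 # keep old centroid if a cluster empties
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignments
```

Note that k-means converges to a local optimum that depends on the initial centroids, which is why production implementations usually run several random restarts.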
When you run the calculator, it typically displays two main outputs:
- **Cluster centroids.** For each cluster 1, 2, …, k, you see the centroid coordinates (x_c, y_c). Each centroid is like the “center of mass” of that cluster.
- **Point assignments.** Each input point is labeled with the cluster it was assigned to.

You can use these results to identify natural groupings in your data and to see which points the algorithm considers similar.
If you try multiple values of k, you will notice that the total within-cluster squared distance always decreases as k grows, so a larger k always “fits” better in this narrow sense. Choosing k is therefore a trade-off between fit and simplicity; a common heuristic is to look for an “elbow” where the improvement levels off.
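One way to see this trade-off numerically is to compute the objective (total within-cluster squared distance) for each candidate k. A minimal sketch, with illustrative names rather than the calculator's own:

```python
import math

def objective(points, centroids, assignments):
    """Total within-cluster squared distance: the quantity k-means minimizes."""
    return sum(math.dist(p, centroids[a]) ** 2
               for p, a in zip(points, assignments))

# Three points near the origin, all assigned to one centroid at (1/3, 1/3).
pts = [(0, 0), (0, 1), (1, 0)]
print(objective(pts, [(1/3, 1/3)], [0, 0, 0]))  # ≈ 1.3333 (= 4/3)
```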
Consider this simple dataset of six points:
```
0, 0
0, 1
1, 0
5, 5
5, 6
6, 5
```
There are two obvious groups: three points near (0,0) and three near (5,5). If you set k = 2 and run the calculator, you should see:
Two centroids near (0.33, 0.33) and (5.33, 5.33) (exact values can vary slightly), with the first three points assigned to one cluster and the last three to the other. Interpretation: each centroid sits at the average of the three points in its group, which matches the two visible clusters in the data.
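You can check these centroids by hand, since each one is just the average of the three points in its group:

```python
def mean_point(points):
    """Average of a list of 2D points, coordinate by coordinate."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

group_a = [(0, 0), (0, 1), (1, 0)]
group_b = [(5, 5), (5, 6), (6, 5)]
print(mean_point(group_a))  # ≈ (0.33, 0.33)
print(mean_point(group_b))  # ≈ (5.33, 5.33)
```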
| Method | Key idea | When it works well | Limitations |
|---|---|---|---|
| K-means (this calculator) | Finds k centroids that minimize squared distances within clusters. | Compact, roughly spherical clusters with similar size; numeric 2D data. | Sensitive to outliers and scaling; requires choosing k in advance. |
| Hierarchical clustering | Builds a tree of merges or splits between clusters. | Exploratory analysis when you want to see structure at multiple levels. | Can be slower on large datasets; tree cut choice can be subjective. |
| Density-based (e.g., DBSCAN) | Groups dense regions and marks isolated points as noise. | Irregular shapes and clusters of varying size; noise detection. | Requires density parameters; may struggle with varying densities. |
This calculator is intentionally focused on the classic k-means setting: fixed k, Euclidean distance, and two-dimensional numeric data.
- **Valid input.** Each line must be a numeric `x,y` pair. Non-numeric entries will be ignored or cause errors.
- **Scaling.** K-means uses raw Euclidean distance, so if one coordinate spans a much larger range than the other (for example, x in thousands and y in single digits), that coordinate will dominate the distance. Consider rescaling or standardizing your data before clustering.

Keep these assumptions in mind when interpreting the output. For high-stakes decisions or complex datasets, consider complementing this simple calculator with more advanced statistical or machine learning tools.
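When the two coordinates are on very different scales, a simple z-score standardization before clustering often helps. A sketch under that assumption (not part of the calculator):

```python
def standardize(points):
    """Rescale each coordinate to zero mean and unit variance (z-scores)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def z(vals):
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        sd = var ** 0.5 or 1.0   # avoid division by zero on a constant column
        return [(v - mean) / sd for v in vals]

    return list(zip(z(xs), z(ys)))

# x in thousands, y in single digits: after standardizing, both coordinates
# contribute comparably to the Euclidean distance.
print(standardize([(1000, 1), (2000, 2), (3000, 3)]))
```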