The best examples of k-means clustering: practical examples that actually matter
K-means is popular not because it’s fancy, but because it’s fast, scalable, and easy to interpret. When you look at real examples of k-means clustering in production systems, a pattern emerges: it’s usually applied when you have lots of observations, many variables, and no labels.
Below are some of the best examples from modern practice, organized by domain.
Customer segmentation: the classic example of k-means clustering
If you had to pick the single most common example of k-means clustering, it would be customer segmentation. Marketing teams sit on huge tables of customer data and want to answer a simple question: Which types of customers do we have, and how do they differ?
A typical setup:
- Data: transaction history, average order value, frequency of purchases, time since last purchase, product categories, geography, and sometimes app or web behavior.
- Goal: group customers into segments that can be targeted differently with pricing, promotions, and messaging.
A retailer might discover, say, five clusters:
- High-spend, low-frequency buyers (think expensive holiday shoppers)
- Low-spend, high-frequency buyers (everyday essentials)
- Discount-driven buyers (purchase mostly on promotions)
- New customers with few purchases
- Dormant customers who haven’t purchased in a long time
K-means is used here because it scales well to millions of customers and dozens of features. Once the clusters are created, analysts interpret them using summary statistics and visualizations, then connect them to marketing strategy. This is one of the best practical examples of k-means clustering because the output directly drives decisions about campaigns and budgets.
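The segmentation setup above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a production pipeline: the customer data is synthetic, the three feature columns and the `kmeans` helper are hypothetical, and real projects would more likely rely on a library implementation with smarter initialization.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every row.
        labels = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, labels

rng = np.random.default_rng(42)
# Synthetic customers (assumed columns):
# [average order value in dollars, orders per year, days since last order]
customers = np.vstack([
    rng.normal([250.0, 2.0, 60.0], [40.0, 0.5, 20.0], size=(100, 3)),  # high spend, low frequency
    rng.normal([30.0, 40.0, 7.0], [8.0, 6.0, 3.0], size=(100, 3)),     # low spend, high frequency
    rng.normal([60.0, 1.0, 400.0], [15.0, 0.3, 60.0], size=(100, 3)),  # dormant
])

# Standardize so that dollars, counts, and days contribute on the same scale.
scaled = (customers - customers.mean(axis=0)) / customers.std(axis=0)
centroids, labels = kmeans(scaled, k=3)

# Profile each segment in the original units, as an analyst would.
sizes = np.bincount(labels, minlength=3)
for j in range(3):
    print(j, sizes[j], customers[labels == j].mean(axis=0).round(1))
```

The per-cluster averages printed at the end are the kind of summary statistics analysts use to name and interpret each segment.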
Healthcare risk stratification: k-means clustering for patient groups
Healthcare systems and insurers routinely group patients by risk to allocate resources, design care pathways, and predict hospital utilization. While many methods exist, k-means clustering is often a starting point for exploratory analysis.
A typical workflow:
- Data: age, comorbidities (e.g., diabetes, hypertension), prior hospital visits, medication counts, lab values, and sometimes social determinants of health.
- Goal: identify clinically meaningful patient subgroups that differ in risk of hospitalization, cost, or complications.
For instance, a health system might cluster patients with type 2 diabetes and find:
- Younger, low-complication patients with good lab control
- Older patients with multiple chronic conditions and frequent admissions
- Middle-aged patients with poor medication adherence and unstable lab values
Researchers then evaluate whether these clusters correspond to different outcomes, guided by domain standards such as chronic disease management frameworks from organizations like the National Institutes of Health. While k-means alone doesn’t make treatment decisions, it provides a data-driven starting point for designing tailored care programs and identifying where to invest in prevention.
Credit card fraud and anomaly detection: using k-means as a baseline
Fraud detection is usually framed as a classification or anomaly detection problem, but k-means clustering often appears as a building block or baseline.
How it’s used:
- Feature space: transaction amount, time of day, merchant category, distance from usual location, device fingerprints, and user behavior patterns.
- Core idea: cluster normal behavior and treat points that lie far from every cluster center (that is, far even from their nearest centroid) as potential anomalies.
For example, a bank might run k-means on a large sample of confirmed non-fraud transactions and learn a handful of common spending patterns:
- Small, frequent local purchases
- Regular recurring bills
- Occasional large travel-related charges
New transactions that fall far outside these clusters—say, a sudden, high-value overseas purchase at an odd hour—are flagged for further checks or temporary holds.
Modern fraud systems increasingly use more advanced models, but k-means remains a practical example because it’s easy to explain to risk teams and auditors: each cluster represents a pattern of normal behavior, and distance from those patterns is a simple risk signal.
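The distance-based risk signal described above can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the centroids are hard-coded stand-ins for what k-means would learn from standardized non-fraud transactions, the three feature columns are hypothetical, and the 99th-percentile cutoff is one possible choice rather than a recommended setting.

```python
import numpy as np

def nearest_centroid_distance(X, centroids):
    """Euclidean distance from each row of X to its closest centroid."""
    diffs = X[:, None, :] - centroids[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

# Hypothetical centroids, assumed to come from k-means on standardized
# features of confirmed non-fraud transactions.
# Columns (assumed): [amount, hour of day, distance from usual location]
centroids = np.array([
    [-0.6, 0.1, -0.5],  # small, frequent local purchases
    [0.0, -0.8, -0.4],  # regular recurring bills
    [1.5, 0.3, 1.2],    # occasional large travel-related charges
])

rng = np.random.default_rng(0)
# Stand-in for historical normal transactions: points scattered around the centroids.
normal = centroids[rng.integers(0, 3, size=1000)] + rng.normal(0, 0.3, size=(1000, 3))

# Flag anything farther from its nearest centroid than 99% of normal traffic.
threshold = np.quantile(nearest_centroid_distance(normal, centroids), 0.99)

new_transactions = np.array([
    [-0.5, 0.2, -0.6],  # resembles a small local purchase
    [4.0, -2.5, 5.0],   # high value, odd hour, far from the usual location
])
flags = nearest_centroid_distance(new_transactions, centroids) > threshold
print(flags)
```

Because the score is just a distance to a named pattern of normal behavior, it stays easy to explain to risk teams, which is the main reason this baseline survives alongside more advanced models.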
Image compression and color quantization: a visual example of k-means clustering
A surprisingly intuitive example of k-means clustering comes from digital images. Every pixel has a color, often represented in RGB space (three numbers). High-resolution images can have millions of distinct colors, which is expensive for storage and transmission.
Color quantization with k-means works like this:
- Treat each pixel’s color as a point in 3D space (R, G, B).
- Run k-means clustering to find, say, 16 or 64 representative colors.
- Replace each pixel’s color with the nearest cluster center.
The result: a smaller color palette and a compressed image that still looks reasonably close to the original. This is a textbook example of k-means clustering that’s also used in practice for graphics, icons, and thumbnails where perfect fidelity isn’t necessary.
This is also a good reminder that k-means is not limited to business data. Whenever you can represent your objects as numeric vectors, k-means is a candidate.
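The three quantization steps can be sketched as follows. This is a minimal version under assumptions: the image is random noise standing in for a real photo, `quantize_colors` is a hypothetical helper, and the fixed number of iterations is chosen for brevity rather than quality.

```python
import numpy as np

def quantize_colors(image, k, n_iter=20, seed=0):
    """Reduce an (H, W, 3) uint8 image to at most k colors with a few Lloyd iterations."""
    pixels = image.reshape(-1, 3).astype(float)
    rng = np.random.default_rng(seed)
    # Start the palette from k randomly chosen pixels.
    palette = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every pixel to its nearest palette color.
        labels = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # Move each palette color to the mean of the pixels assigned to it.
        for j in range(k):
            if np.any(labels == j):
                palette[j] = pixels[labels == j].mean(axis=0)
    # Final assignment against the finished palette.
    labels = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    palette = palette.round().astype(np.uint8)
    # Replace each pixel with its palette color and restore the image shape.
    return palette[labels].reshape(image.shape), palette

rng = np.random.default_rng(1)
# Stand-in for a real image: 32 x 32 pixels with random RGB values.
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
quantized, palette = quantize_colors(image, k=16)

n_before = len(np.unique(image.reshape(-1, 3), axis=0))
n_after = len(np.unique(quantized.reshape(-1, 3), axis=0))
print(n_before, "distinct colors reduced to", n_after)
```

For a full-size photo the pixel-by-palette distance array becomes large, so a common workaround is to fit the palette on a random sample of pixels and then assign all pixels to it.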
Document clustering and topic organization: k-means in NLP pipelines
In natural language processing (NLP), k-means clustering is frequently used to group documents, news articles, or research papers by similarity.
A typical modern pipeline:
- Represent each document using embeddings (for example, from transformer models) or TF-IDF vectors.
- Run k-means clustering on these vectors.
- Inspect top keywords or representative documents from each cluster.
Real examples include:
- Grouping customer support tickets into themes (billing, account access, technical bugs).
- Organizing large collections of research abstracts into topic clusters to support literature reviews.
- Segmenting user reviews into product-related themes (quality, shipping, pricing, customer service).
Universities and research libraries sometimes use document clustering to help researchers navigate large literature sets. While more advanced topic models exist, k-means remains attractive because it’s fast and interpretable, especially when combined with keyword extraction or manual review.
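A toy version of the TF-IDF pipeline above might look like this. The ticket texts are invented, the TF-IDF weighting is a hand-rolled smoothed variant rather than any particular library's formula, and the `kmeans` helper with random restarts is a hypothetical minimal implementation.

```python
import numpy as np
from collections import Counter

# Invented support tickets covering three themes.
tickets = [
    "refund for duplicate billing charge",
    "billing charge appears twice on invoice",
    "cannot login password reset fails",
    "password reset email never arrives login blocked",
    "app crashes when uploading photo",
    "upload button crashes the app",
]

# Step 1: represent each document as a TF-IDF vector.
vocab = sorted({word for text in tickets for word in text.split()})
index = {word: i for i, word in enumerate(vocab)}
tf = np.zeros((len(tickets), len(vocab)))
for row, text in enumerate(tickets):
    for word, count in Counter(text.split()).items():
        tf[row, index[word]] = count
doc_freq = (tf > 0).sum(axis=0)
idf = np.log((1 + len(tickets)) / (1 + doc_freq)) + 1
X = tf * idf
# Unit-length rows make Euclidean distance behave like cosine distance.
X /= np.linalg.norm(X, axis=1, keepdims=True)

def kmeans(X, k, n_init=10, n_iter=50, seed=0):
    """Lloyd's algorithm with random restarts; keeps the lowest-WCSS run."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        C = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            labels = ((X[:, None] - C[None]) ** 2).sum(axis=2).argmin(axis=1)
            C = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                          for j in range(k)])
        d = ((X[:, None] - C[None]) ** 2).sum(axis=2)
        wcss = d.min(axis=1).sum()
        if best is None or wcss < best[0]:
            best = (wcss, C, d.argmin(axis=1))
    return best[1], best[2]

# Step 2: cluster the vectors.
centroids, labels = kmeans(X, k=3)

# Step 3: inspect top keywords and member documents for each cluster.
for j, centroid in enumerate(centroids):
    top_terms = [vocab[i] for i in centroid.argsort()[::-1][:3]]
    members = [t for t, l in zip(tickets, labels) if l == j]
    print(j, top_terms, members)
```

The same three steps apply when the TF-IDF vectors are swapped for transformer embeddings; only the representation step changes.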
Retail assortment and store clustering: grouping locations, not people
Another practical example of k-means clustering comes from brick-and-mortar retail. Instead of clustering customers, analysts cluster stores or locations.
Consider a national chain with hundreds or thousands of outlets. For each store, you might have:
- Sales by category (e.g., groceries, electronics, apparel)
- Average basket size and transaction count
- Local demographics and income levels
- Urban vs. suburban vs. rural classification
Running k-means on these features can reveal groups of stores with similar demand patterns, such as:
- High-traffic urban stores with small baskets
- Suburban family-focused stores with large weekly baskets
- Tourist-heavy locations with seasonal spikes
These clusters inform assortment planning, inventory levels, staffing, and local promotions. This is one of the more operational examples of k-means clustering, because each cluster leads to a different playbook for store managers.
Manufacturing and IoT: machine condition and process clusters
In manufacturing and industrial IoT, k-means clustering helps summarize high-dimensional sensor data into interpretable machine states.
Imagine a factory with sensors measuring temperature, vibration, pressure, and throughput on each machine.
Use case:
- Collect time-windowed summaries (e.g., averages, standard deviations, peak values) across sensors.
- Run k-means clustering to identify recurring patterns of machine behavior.
You might find clusters corresponding to:
- Normal steady-state operation
- Warm-up periods with higher variability
- Overloaded states with elevated vibration and temperature
Maintenance teams can then focus on the clusters associated with higher failure rates or quality issues. Agencies like the National Institute of Standards and Technology (NIST) publish guidance and research on smart manufacturing and data-driven process control, where clustering is often part of exploratory analysis.
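The time-windowed summaries mentioned in the use case are the step that turns raw sensor streams into rows k-means can cluster. A small sketch, assuming two hypothetical sensors (temperature and vibration), synthetic readings, and a made-up `window_features` helper:

```python
import numpy as np

def window_features(signal, window):
    """Summarize a (T, n_sensors) array into per-window mean, std, and peak values."""
    n_windows = len(signal) // window
    # Drop the incomplete trailing window, then split into
    # (n_windows, window, n_sensors) chunks.
    chunks = signal[: n_windows * window].reshape(n_windows, window, -1)
    return np.hstack([chunks.mean(axis=1), chunks.std(axis=1), chunks.max(axis=1)])

rng = np.random.default_rng(3)
# Synthetic readings: 600 time steps of [temperature, vibration].
readings = np.column_stack([
    70 + rng.normal(0, 1.0, 600),
    0.2 + rng.normal(0, 0.05, 600),
])

features = window_features(readings, window=60)
print(features.shape)  # (10, 6): mean, std, and peak for each of the 2 sensors
```

Each row of `features` describes one window of machine behavior. After standardizing the columns, those rows would be the input to k-means, and the resulting clusters are the candidate machine states.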
Urban planning and public health: clustering regions by population and risk
K-means clustering also appears in public policy, urban planning, and public health. Researchers may cluster geographic units—census tracts, ZIP codes, or counties—based on demographic, economic, or health indicators.
Typical variables include:
- Population density, age distribution, and household structure
- Income, employment, and education levels
- Disease prevalence, hospitalization rates, or access to care
For example, public health teams might cluster neighborhoods by a combination of chronic disease rates and access to primary care to prioritize interventions. Agencies like the Centers for Disease Control and Prevention (CDC) publish data that can feed directly into this kind of analysis.
This is a powerful practical example of k-means clustering, because each cluster can correspond to a different policy strategy: more clinics, targeted screening, or community outreach.
Modern 2024–2025 trends: where k-means still fits in the ML toolbox
With all the attention on deep learning and large language models, it’s fair to ask: Is k-means still relevant in 2024–2025? The short answer is yes—but its role has shifted.
Here’s how practitioners now commonly use k-means:
- On embeddings: Instead of clustering raw data, teams cluster embeddings from neural networks (for text, images, audio, or users). For example, product teams might cluster user embeddings from recommendation models to discover behavior-based segments.
- As a preprocessing step: K-means can create cluster labels that are then used as features in downstream models, or to initialize more complex algorithms.
- For large-scale summarization: When you have millions of points, clustering with k-means gives you a set of representative centroids that approximate the dataset for faster querying or visualization.
In other words, some of the best practical examples of k-means clustering in 2024–2025 are not standalone pipelines but components inside larger AI systems.
When k-means is a bad fit (and what to watch for)
Real-world examples of k-means clustering also highlight its limitations.
You should be cautious when:
- Clusters are not spherical: K-means assumes clusters are roughly ball-shaped in feature space. If your data has elongated or irregular clusters, other methods (like DBSCAN or Gaussian mixture models) may work better.
- Scales differ wildly: K-means is sensitive to scale. Always standardize or normalize features when units differ (dollars, counts, percentages, etc.).
- There are many outliers: Centroids are means, so outliers can pull them away from the bulk of the data and distort clusters. Preprocessing to handle outliers is often necessary.
- K is arbitrary: Choosing the number of clusters is part art, part science. Analysts often use the “elbow method,” silhouette scores, or domain knowledge to pick a reasonable K, then validate clusters with business stakeholders.
The best practical examples of k-means clustering are the ones where these issues are explicitly addressed—through feature engineering, scaling, and careful validation—rather than ignored.
FAQ: examples of k-means clustering and common questions
What is a simple example of k-means clustering for beginners?
A simple example of k-means clustering is grouping customers by how often they purchase and how much they spend. Plot each customer as a point with “frequency” on one axis and “average order value” on the other, run k-means with a small K (like 3 or 4), and you’ll see segments emerge: high-value frequent buyers, low-value infrequent buyers, and so on.
What are some real examples of k-means clustering in business?
Real business examples include customer segmentation for targeted marketing, store clustering for assortment planning, and transaction clustering as part of fraud detection systems in banks. In all these cases, the algorithm groups similar entities—customers, stores, or transactions—so that each cluster can be managed or monitored differently.
How many clusters should I use in a practical example of k-means?
There’s no single right answer. Analysts often run k-means for a range of K values, inspect metrics like the within-cluster sum of squares and silhouette scores, and then check whether the resulting clusters make sense to domain experts. In practical examples of k-means clustering, the “best” K is usually the smallest number that still produces distinct, actionable groups.
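Running k-means across a range of K values and comparing the within-cluster sum of squares (WCSS) can be sketched like this. The data is synthetic with three planted groups, and `kmeans_wcss` is a hypothetical minimal helper; the point is the shape of the comparison, not the specific numbers.

```python
import numpy as np

def kmeans_wcss(X, k, n_init=5, n_iter=50, seed=0):
    """Lowest within-cluster sum of squares over a few random restarts of Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(n_init):
        C = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Assign points to the nearest centroid, then recompute centroids as means.
            labels = ((X[:, None] - C[None]) ** 2).sum(axis=2).argmin(axis=1)
            C = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                          for j in range(k)])
        # WCSS: squared distance from each point to its nearest final centroid.
        wcss = ((X[:, None] - C[None]) ** 2).sum(axis=2).min(axis=1).sum()
        best = min(best, wcss)
    return best

rng = np.random.default_rng(7)
# Synthetic data: three groups in two already-comparable features.
X = np.vstack([rng.normal(center, 0.5, size=(100, 2))
               for center in ([0, 0], [5, 5], [0, 5])])

wcss = {k: kmeans_wcss(X, k) for k in range(1, 7)}
for k in range(2, 7):
    drop = 1 - wcss[k] / wcss[k - 1]
    print(f"K={k}: WCSS={wcss[k]:.1f}, drop vs K={k - 1}: {drop:.0%}")
```

The "elbow" is the K after which the relative drop becomes small. On real data the curve is rarely this clean, which is why the numbers are usually paired with silhouette scores and a review by domain experts.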
Are there medical examples of k-means clustering being used in research?
Yes. Medical and health services researchers use k-means clustering to identify patient subgroups based on diagnoses, lab results, and utilization patterns. For instance, clusters of patients with similar chronic disease profiles can help target care management programs. Many such studies draw on data and research guidance from organizations like the National Institutes of Health and academic medical centers.
Is k-means still useful now that we have deep learning?
Yes, but often as part of a pipeline rather than the star of the show. In 2024–2025, a lot of the most interesting examples of k-means clustering involve running it on embeddings produced by deep models—for example, clustering documents, images, or users based on learned representations. The algorithm itself hasn’t changed, but the features it sees are much richer.
What are the best examples of k-means clustering for learning the concept?
If you’re learning, start with three types of data: customer purchase behavior, image color quantization, and document clustering using TF-IDF vectors. Together, these examples of k-means clustering cover numeric business data, visual data, and text data, giving you a broad sense of where the algorithm shines and where it starts to struggle.