Brainstorming Session – Unsupervised Learning
1. What is a Centroid?
Imagine a bunch of dots drawn on paper.
Now ask: “If I were to place a sticker right in the middle of these dots, where should I put it?” That “middle sticker” is called the centroid. It’s like the center of a group.
In Even Simpler Terms:
A centroid is just a fancy word for the average position of a group of things.
Kid-Friendly Analogy:
Let’s say 3 friends are sitting on a see-saw:
- One is sitting at position 2
- Another at position 4
- The third at position 6
If we want the see-saw to balance, we need to sit right at position 4, which is the average of 2, 4, and 6. That balanced spot is the centroid of our friends’ positions.
A Tiny Math Example (2D):
Suppose we have 3 points on a grid:
(2, 4), (4, 6), and (6, 8)
To find the centroid:
Average of X values = (2 + 4 + 6) / 3 = 4
Average of Y values = (4 + 6 + 8) / 3 = 6
So, the centroid is (4, 6)
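A quick way to check this in pure Python (a tiny sketch, in the same no-libraries style as the demo later in this document):

# Tiny sketch: centroid of a few 2D points (the example above)
points = [(2, 4), (4, 6), (6, 8)]
centroid_x = sum(p[0] for p in points) / len(points)
centroid_y = sum(p[1] for p in points) / len(points)
print((centroid_x, centroid_y))  # (4.0, 6.0)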
In Machine Learning:
When a computer is trying to group similar things, it finds the “middle” of each group — that middle point is the centroid. The computer keeps updating these centroids to make the groups better and better!
2. When Should We Stop Finding the Centroid?
In K-Means and similar unsupervised learning methods, we need a stopping rule; otherwise the updates could (at least in theory) go on forever.
Here are the Common Stopping Criteria:
1. Fixed Number of Iterations
If we say:
“Hey algorithm, run the centroid update process 10 times, and then stop.” Simple and safe, but not always optimal.
2. When Centroids Stop Moving (Convergence)
If we say:
“Keep running until the centroids don’t change much anymore (or not at all).” Smart! This means the clusters are stable, and further updates don’t improve the grouping.
This is usually measured with a small tolerance, e.g. stop once no centroid moves by more than 0.001 (a tiny check like this is sketched right after this list).
3. When Cluster Assignments Don’t Change
If we say:
“If all points stay in the same group as last time, stop.” This means the grouping has stabilized.
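Here is a minimal, purely illustrative sketch of criterion 2 in plain Python. The function name and the 0.001 tolerance are assumptions for this example, not part of any standard library:

# Sketch: criterion 2 (convergence). Stop when no centroid moved more than
# a small tolerance between two iterations. Names and the 0.001 value are
# illustrative assumptions.
def centroids_converged(old_centroids, new_centroids, tolerance=0.001):
    def distance(p1, p2):
        return ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5
    return all(distance(old, new) < tolerance
               for old, new in zip(old_centroids, new_centroids))

# Example: the centroid barely moved, so we can stop
print(centroids_converged([(4.0, 6.0)], [(4.0005, 6.0002)]))  # True
print(centroids_converged([(4.0, 6.0)], [(5.0, 6.0)]))        # False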
Bonus Control — How Many Clusters (K) Do We Want?
Now, this is very important and ties back to our main concern:
“How do we know how many clusters we should even look for?”
In K-Means: We must decide K (number of clusters) before we start.
This is often set based on:
- Prior knowledge (e.g. “we know there are 3 types of customers”)
- Business goals
- Trial and error using techniques like:
The Elbow Method (for choosing K)
We try:
- K = 1, 2, 3, …, 10
- For each K, calculate how tightly points are grouped (called within-cluster sum of squares)
- Plot it — the curve often bends like an elbow
- Choose the K where the elbow bends — more clusters after that don’t give much improvement
This avoids over-grouping and keeps it practical.
Why Not Infinite Clusters?
We could keep making smaller and smaller groups, even one point per group (K = number of points). But that’s useless — it defeats the purpose of finding patterns or simplifying the data.
So, we strike a balance:
- Not too few clusters (too general)
- Not too many (too detailed or noisy)
We stop when the centroids stop changing, or when we’ve reached a maximum number of tries, and we choose the number of clusters based on what makes the grouping meaningful but simple — using methods like the elbow rule.
3. What is the Elbow Rule?
The Elbow Rule is a way to help us decide: How many clusters (K) should we use for grouping?
It’s like asking: “How many buckets should I use to sort these toys so they’re grouped nicely — but not too many that it gets silly?”
The Idea Behind the Elbow Rule:
When we increase the number of clusters (K), the computer groups things more accurately. But after a point, the improvement becomes very small, even if we add more clusters. This creates a graph that looks like a bent elbow — and that’s where we should stop!
Here’s What We Measure:
We look at a thing called “Within-Cluster Sum of Squares” (WCSS). Don’t worry about the name; it just means: “How far are the points from their group center, all added up?”
Steps of the Elbow Rule
1. Try K = 1, K = 2, K = 3… up to K = 10 or so
2. For each K:
- Group the data
- Measure the “tightness” of each group (WCSS)
3. Plot K vs WCSS:
- X-axis = K (number of clusters)
- Y-axis = WCSS (how scattered the groups are)
4. Look at the graph:
- It goes down steeply at first
- Then slows down
- At the turning point (the “elbow”) — that’s your best K!
What It Looks Like:
WCSS
 │  ●
 │
 │        ●
 │
 │              ●
 │                    ●     ●
 ├────────────────────────────── K
    1     2     3     4     5 ...
                  ↑
                Elbow!
Real-Life Analogy: Sorting Toys
Imagine we’re sorting 100 toys into bins:
- With 1 bin, everything’s a mess
- With 2 bins, it’s better
- With 3 bins, we get cars, animals, and blocks
- With 10 bins, we’re overdoing it (like separating green blocks from red ones)
The elbow is where adding more bins stops making big improvements.
So, the Elbow Rule helps us:
- Avoid too few clusters (bad grouping)
- Avoid too many clusters (overfitting)
- Pick the sweet spot — just enough groups to make sense
Pure Python Elbow Rule Demo (No Libraries)
import random

# Step 1: Create 2D random points (simulate some data)
def generate_data(n_points=20, x_range=(0, 20), y_range=(0, 20)):
    return [(random.randint(*x_range), random.randint(*y_range)) for _ in range(n_points)]

# Step 2: Distance formula (straight-line / Euclidean distance)
def distance(p1, p2):
    return ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5

# Step 3: Assign each point to the closest centroid
def assign_clusters(points, centroids):
    clusters = {i: [] for i in range(len(centroids))}
    for point in points:
        distances = [distance(point, c) for c in centroids]
        closest = distances.index(min(distances))
        clusters[closest].append(point)
    return clusters

# Step 4: Update centroids to the average position of their points
#         (if a cluster ends up empty, keep its old centroid so the
#          centroid indices stay aligned with the cluster keys)
def update_centroids(clusters, old_centroids):
    new_centroids = []
    for idx, group in clusters.items():
        if not group:
            new_centroids.append(old_centroids[idx])
            continue
        avg_x = sum(p[0] for p in group) / len(group)
        avg_y = sum(p[1] for p in group) / len(group)
        new_centroids.append((avg_x, avg_y))
    return new_centroids

# Step 5: Calculate WCSS (total squared distance from points to their centroid)
def calculate_wcss(clusters, centroids):
    wcss = 0
    for idx, points in clusters.items():
        for p in points:
            wcss += distance(p, centroids[idx]) ** 2
    return wcss

# Step 6: Run K-Means manually and simulate the elbow rule
def elbow_method(points, max_k=5):
    print("\nElbow Method Results:\n")
    for k in range(1, max_k + 1):
        # Step A: Randomly pick k initial centroids
        centroids = random.sample(points, k)
        for _ in range(5):  # Run 5 iterations of assigning + updating
            clusters = assign_clusters(points, centroids)
            centroids = update_centroids(clusters, centroids)
        # Step B: Re-assign with the final centroids, then calculate and print WCSS
        clusters = assign_clusters(points, centroids)
        wcss = calculate_wcss(clusters, centroids)
        print(f"K = {k} => WCSS = {round(wcss, 2)}")

# Run the demo
points = generate_data()
elbow_method(points)
What We’ll Get:
A simple printout like:
Elbow Method Results:
K = 1 => WCSS = 1298.23
K = 2 => WCSS = 740.11
K = 3 => WCSS = 450.67
K = 4 => WCSS = 440.12
K = 5 => WCSS = 438.05
We’ll see the WCSS dropping fast, then flattening. The “elbow” is where the drop starts slowing down — like at K = 3 above.
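One simple, purely illustrative way to spot that slowdown in code is to look at how much WCSS drops at each step and stop once the drop becomes small. The sample numbers and the 10% cutoff below are assumptions for this sketch, not a standard rule:

# Sketch: pick the "elbow" as the last K whose WCSS drop is still large
# compared with the first drop (the 10% cutoff is an illustrative assumption).
wcss_by_k = {1: 1298.23, 2: 740.11, 3: 450.67, 4: 440.12, 5: 438.05}

ks = sorted(wcss_by_k)
drops = {k: wcss_by_k[k - 1] - wcss_by_k[k] for k in ks[1:]}
first_drop = drops[ks[1]]

elbow = ks[0]
for k in ks[1:]:
    if drops[k] >= 0.10 * first_drop:  # still a "big" improvement
        elbow = k
    else:
        break

print(elbow)  # 3 for these sample numbers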
4. What is WCSS?
WCSS stands for: Within-Cluster Sum of Squares
Imagine This:
We’ve grouped a bunch of data points into clusters (like groups of toys, or students). For each group, we have a center point (the centroid — remember, the “middle” of the group).
Now, look at how far each point in the group is from the center.
- If all the points are close to the center, the group is tight and clean.
- If the points are far from the center, the group is loose and messy.
What Does WCSS Do?
For each point, it calculates:
“Distance from point to its group center (centroid)”
Then it:
1. Squares that distance (so everything is positive and big distances count more)
2. Adds up all these squared distances
This gives us the WCSS.
Simple Example (1D):
Let’s say we have a group with 3 points:
[2, 4, 6]
The centroid (average) = 4
Now:
- Distance from 2 → 4 = 2 → 2² = 4
- Distance from 4 → 4 = 0 → 0² = 0
- Distance from 6 → 4 = 2 → 2² = 4
- So the WCSS = 4 + 0 + 4 = 8
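The same calculation in a couple of lines of Python (a tiny sketch for this 1D example only):

# Tiny sketch: WCSS for one 1D cluster (the example above)
points = [2, 4, 6]
centroid = sum(points) / len(points)             # 4.0
wcss = sum((p - centroid) ** 2 for p in points)  # (2-4)² + (4-4)² + (6-4)²
print(wcss)  # 8.0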
Why It Matters:
WCSS tells us how good our clusters are.
- Small WCSS = points are close to their centers = tight groups = good!
- Big WCSS = points are scattered = messy groups = bad
In Clustering (like K-Means):
- We want to minimize WCSS
- So we keep adjusting clusters until the total WCSS is as low as possible
- In the Elbow Method, we see how WCSS changes as we try different numbers of groups
In Simple Words:
WCSS is a number that tells us how compact our groups are. Lower WCSS = better grouping.
5. Summary Table of Basic Math Concepts:
| Math Concept | Why It’s Needed |
|---|---|
| Mean (Average) | To calculate centroids (group centers) |
| Distance Formula | To assign points to nearest cluster |
| Squares & Roots | For measuring WCSS (tightness of group) |
| X-Y Coordinates | To visualize and group points |
| Arithmetic Basics | For all calculations |
6. Summary Table of Use Cases:
| Use Case | What’s Grouped? | Why It Helps |
|---|---|---|
| Customer Segmentation | Buyer behavior | Targeted marketing |
| Recommendations | Viewing/listening patterns | Personalized content |
| Fraud Detection | Transactions | Spotting unusual activity |
| Genetic Research | DNA patterns | Discovering hidden traits |
| Urban Planning | City data | Smarter development |
| Image Segmentation | Pixel colors | Compression or photo editing |
| Education Personalization | Student progress | Tailored support and content |
Unsupervised Learning – Summary
