Brainstorming Session – Unsupervised Learning
1. What is a Centroid?
Imagine a bunch of dots drawn on paper.
Now ask: “If I were to place a sticker right in the middle of these dots, where should I put it?” That “middle sticker” is called the centroid. It’s like the center of a group.
In Even Simpler Terms:
A centroid is just a fancy word for the average position of a group of things.
Kid-Friendly Analogy:
Let’s say 3 friends are sitting on a see-saw:
- One is sitting at position 2
- Another at position 4
- The third at position 6
If we want the see-saw to balance, the balance point needs to be right at position 4, which is the average of 2, 4, and 6. That balanced spot is the centroid of our friends’ positions.
A Tiny Math Example (2D):
Suppose we have 3 points on a grid:
(2, 4), (4, 6), and (6, 8)
To find the centroid:
Average of X values = (2 + 4 + 6) / 3 = 4
Average of Y values = (4 + 6 + 8) / 3 = 6
So, the centroid is (4, 6)
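A few lines of Python can check the same arithmetic (the points below are just the ones from the example above):

```python
# Centroid = average of the X values and average of the Y values
points = [(2, 4), (4, 6), (6, 8)]

centroid_x = sum(x for x, _ in points) / len(points)
centroid_y = sum(y for _, y in points) / len(points)

print((centroid_x, centroid_y))  # (4.0, 6.0)
```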
In Machine Learning:
When a computer is trying to group similar things, it finds the “middle” of each group; that middle point is the centroid. The computer keeps updating these centroids to make the groups better and better!
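As a rough sketch of that update loop in plain Python (the function name one_kmeans_round is purely illustrative; a fuller, runnable demo appears in the Elbow Rule section below):

```python
# One round of K-Means: assign every point to its nearest centroid,
# then move each centroid to the average position of its points.
def one_kmeans_round(points, centroids):
    groups = {i: [] for i in range(len(centroids))}
    for x, y in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2)
        groups[nearest].append((x, y))

    new_centroids = []
    for i, group in groups.items():
        if group:
            new_centroids.append((sum(p[0] for p in group) / len(group),
                                  sum(p[1] for p in group) / len(group)))
        else:
            new_centroids.append(centroids[i])  # empty cluster: leave its centroid where it was
    return groups, new_centroids

# Repeating this round over and over is what "keeps updating the centroids"
groups, centroids = one_kmeans_round([(2, 4), (4, 6), (6, 8), (20, 20)], [(0, 0), (20, 20)])
print(centroids)  # the first centroid moves to (4.0, 6.0)
```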
2. When Should We Stop Finding the Centroid?
In K-Means and similar unsupervised learning methods, we do need a stopping rule; otherwise, the updates could (at least in theory) go on forever.
Here are the Common Stopping Criteria:
1. Fixed Number of Iterations
If we say:
“Hey algorithm, run the centroid update process 10 times, and then stop.” Simple and safe, but not always optimal.
2. When Centroids Stop Moving (Convergence)
If we say:
“Keep running until the centroids don’t change much anymore (or not at all).” Smart! This means the clusters are stable, and further updates don’t improve the grouping.
The change is usually measured against a small tolerance, such as a 0.001 difference in position.
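A minimal sketch of that check, assuming 2D centroids and using the small 0.001 figure above as the tolerance:

```python
# Stop once no centroid has moved more than a tiny tolerance since the last round
def centroids_converged(old_centroids, new_centroids, tol=0.001):
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    return all(dist(o, n) <= tol for o, n in zip(old_centroids, new_centroids))

print(centroids_converged([(4.0, 6.0)], [(4.0005, 6.0)]))  # True: it moved by only 0.0005
```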
3. When Cluster Assignments Don’t Change
If we say:
“If all points stay in the same group as last time, stop.” This means the grouping has stabilized.
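And the matching check for assignments, where each label records which cluster a point currently belongs to:

```python
# Stop once every point keeps the same cluster label as in the previous round
def assignments_stable(previous_labels, current_labels):
    return previous_labels == current_labels

print(assignments_stable([0, 0, 1, 1], [0, 0, 1, 1]))  # True: the grouping has stabilized
```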
Bonus Control — How Many Clusters (K) Do We Want?
Now, this is very important and ties back to our main concern:
“How do we know how many clusters we should even look for?”
In K-Means: We must decide K (number of clusters) before we start.
This is often set based on:
- Prior knowledge (e.g. “we know there are 3 types of customers”)
- Business goals
- Trial and error using techniques like:
The Elbow Method (for choosing K)
We try:
- K = 1, 2, 3, …, 10
- For each K, calculate how tightly points are grouped (called within-cluster sum of squares)
- Plot it — the curve often bends like an elbow
- Choose the K where the elbow bends — more clusters after that don’t give much improvement
This avoids over-grouping and keeps it practical.
Why Not Infinite Clusters?
We could keep making smaller and smaller groups, even one point per group (K = number of points). But that’s useless: it defeats the purpose of finding patterns or simplifying the data.
So, we strike a balance:
- Not too few clusters (too general)
- Not too many (too detailed or noisy)
In short: we stop when the centroids stop changing, or when we’ve reached a maximum number of tries, and we choose the number of clusters that makes the grouping meaningful but simple, using methods like the elbow rule.
3. What is the Elbow Rule?
The Elbow Rule is a way to help us decide: How many clusters (K) should we use for grouping?
It’s like asking: “How many buckets should I use to sort these toys so they’re grouped nicely, but not so many that it gets silly?”
The Idea Behind the Elbow Rule:
When we increase the number of clusters (K), the computer groups things more accurately. But after a point, the improvement becomes very small, even if we add more clusters. This creates a graph that looks like a bent elbow, and that’s where we should stop!
Here’s What We Measure:
We look at a number called “Within-Cluster Sum of Squares” (WCSS); don’t worry about the name. It just answers: “How spread out are the points around their group center?”
Steps of the Elbow Rule
1. Try K = 1, K = 2, K = 3… up to K = 10 or so
2. For each K:
- Group the data
- Measure the “tightness” of each group (WCSS)
3. Plot K vs WCSS:
- X-axis = K (number of clusters)
- Y-axis = WCSS (how scattered the groups are)
4. Look at the graph:
- It goes down steeply at first
- Then slows down
- At the turning point (the “elbow”) — that’s your best K!
What It Looks Like:
```
WCSS
  │ ●
  │
  │    ●
  │
  │       ●    ●    ●
  ├──────────────────── K
    1    2    3    4    5
              ↑
            Elbow!
```
Real-Life Analogy: Sorting Toys
Imagine we’re sorting 100 toys into bins:
- With 1 bin, everything’s a mess
- With 2 bins, it’s better
- With 3 bins, we get cars, animals, and blocks
- With 10 bins, we’re overdoing it (like separating green blocks from red ones)
The elbow is where adding more bins stops making big improvements.
So, the Elbow Rule helps us:
- Avoid too few clusters (bad grouping)
- Avoid too many clusters (overfitting)
- Pick the sweet spot — just enough groups to make sense
Pure Python Elbow Rule Demo (No Libraries)
```python
import random

# Step 1: Create 2D random points (simulate some data)
def generate_data(n_points=20, x_range=(0, 20), y_range=(0, 20)):
    return [(random.randint(*x_range), random.randint(*y_range)) for _ in range(n_points)]

# Step 2: Distance formula (straight-line distance between two 2D points)
def distance(p1, p2):
    return ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5

# Step 3: Assign each point to the closest centroid
def assign_clusters(points, centroids):
    clusters = {i: [] for i in range(len(centroids))}
    for point in points:
        distances = [distance(point, c) for c in centroids]
        closest = distances.index(min(distances))
        clusters[closest].append(point)
    return clusters

# Step 4: Update centroids to average positions (empty clusters are dropped)
def update_centroids(clusters):
    new_centroids = []
    for group in clusters.values():
        if not group:
            continue
        avg_x = sum(p[0] for p in group) / len(group)
        avg_y = sum(p[1] for p in group) / len(group)
        new_centroids.append((avg_x, avg_y))
    return new_centroids

# Step 5: Calculate WCSS (total squared distance from points to their centroid)
def calculate_wcss(clusters, centroids):
    wcss = 0
    for idx, points in clusters.items():
        for p in points:
            wcss += distance(p, centroids[idx]) ** 2
    return wcss

# Step 6: Run K-means manually and simulate the elbow rule
def elbow_method(points, max_k=5):
    print("\nElbow Method Results:\n")
    for k in range(1, max_k + 1):
        # Step A: Randomly pick k initial centroids from the data
        centroids = random.sample(points, k)
        for _ in range(5):  # Run 5 iterations of assigning and updating
            clusters = assign_clusters(points, centroids)
            centroids = update_centroids(clusters)
        # Re-assign once more so the assignments match the final centroids
        clusters = assign_clusters(points, centroids)
        # Step B: Calculate and print WCSS
        wcss = calculate_wcss(clusters, centroids)
        print(f"K = {k} => WCSS = {round(wcss, 2)}")

# Run the demo
points = generate_data()
elbow_method(points)
```
What We’ll Get:
A simple printout like:
Elbow Method Results:
K = 1 => WCSS = 1298.23
K = 2 => WCSS = 740.11
K = 3 => WCSS = 450.67
K = 4 => WCSS = 440.12
K = 5 => WCSS = 438.05
We’ll see the WCSS dropping fast, then flattening. The “elbow” is where the drop starts slowing down — like at K = 3 above.
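If we’d rather have the program suggest the elbow instead of eyeballing the printout, one simple (and admittedly rough) heuristic is to pick the K after which the drop in WCSS shrinks the most. The suggest_elbow helper below is illustrative, and the numbers are the sample values from above:

```python
# Suggest the K where the improvement (drop in WCSS) falls off most sharply
def suggest_elbow(wcss_values):
    drops = [wcss_values[i] - wcss_values[i + 1] for i in range(len(wcss_values) - 1)]
    ratios = [drops[i] / drops[i + 1] if drops[i + 1] > 0 else float("inf")
              for i in range(len(drops) - 1)]
    return ratios.index(max(ratios)) + 2  # +2 turns a list index back into a K value

wcss = [1298.23, 740.11, 450.67, 440.12, 438.05]  # K = 1..5 from the sample run above
print(suggest_elbow(wcss))  # 3
```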
4. What is WCSS?
WCSS stands for: Within-Cluster Sum of Squares
Imagine This:
We’ve grouped a bunch of data points into clusters (like groups of toys, or students). For each group, we have a center point: the centroid (remember, the “middle” of the group).
Now, look at how far each point in the group is from the center.
- If all the points are close to the center, the group is tight and clean.
- If the points are far from the center, the group is loose and messy.
What Does WCSS Do?
For each point, it calculates:
“Distance from point to its group center (centroid)”
Then it:
1. Squares that distance (so everything is positive and big distances count more)
2. Adds up all these squared distances
This gives us the WCSS.
Simple Example (1D):
Let’s say we have a group with 3 points:
[2, 4, 6]
The centroid (average) = 4
Now:
- Distance from 2 → 4 = 2 → 2² = 4
- Distance from 4 → 4 = 0 → 0² = 0
- Distance from 6 → 4 = 2 → 2² = 4
- So the WCSS = 4 + 0 + 4 = 8
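The same arithmetic in a few lines of Python:

```python
# WCSS for one 1D cluster: sum of squared distances to the cluster's centroid
points = [2, 4, 6]
centroid = sum(points) / len(points)             # 4.0
wcss = sum((p - centroid) ** 2 for p in points)  # 4 + 0 + 4
print(wcss)  # 8.0
```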
Why It Matters:
WCSS tells us how good our clusters are.
- Small WCSS = points are close to their centers = tight groups = good!
- Big WCSS = points are scattered = messy groups = bad
In Clustering (like K-Means):
- We want to minimize WCSS
- So we keep adjusting clusters until the total WCSS is as low as possible
- In the Elbow Method, we see how WCSS changes as we try different numbers of groups
In Simple Words:
WCSS is a number that tells us how compact our groups are. Lower WCSS = better grouping.
5. Summary Table of Basic Math Concepts:

| Math Concept | Why It’s Needed |
|---|---|
| Mean (Average) | To calculate centroids (group centers) |
| Distance Formula | To assign points to the nearest cluster |
| Squares & Roots | For measuring WCSS (tightness of a group) |
| X-Y Coordinates | To visualize and group points |
| Arithmetic Basics | For all calculations |
6. Summary Table of Use Cases:

| Use Case | What’s Grouped? | Why It Helps |
|---|---|---|
| Customer Segmentation | Buyer behavior | Targeted marketing |
| Recommendations | Viewing/listening patterns | Personalized content |
| Fraud Detection | Transactions | Spotting unusual activity |
| Genetic Research | DNA patterns | Discovering hidden traits |
| Urban Planning | City data | Smarter development |
| Image Segmentation | Pixel colors | Compression or photo editing |
| Education Personalization | Student progress | Tailored support and content |
Unsupervised Learning – Summary