Basic Math Concepts – Semi-Supervised Learning

1. Counting

  • Used to count how many words match between two sentences.
  • Example: “Hi there” and “Hello there” share 1 word (“there”).
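The counting step can be sketched in a few lines of Python. This is only an illustration: the function name count_shared_words is made up for this example, and it assumes words are separated by spaces and compared in lowercase.

```python
def count_shared_words(sentence_a, sentence_b):
    """Count distinct words that appear in both sentences (hypothetical helper)."""
    words_b = sentence_b.lower().split()
    already_counted = []
    count = 0
    for word in sentence_a.lower().split():
        # Count each matching word once, even if it repeats in sentence_a.
        if word in words_b and word not in already_counted:
            already_counted.append(word)
            count += 1
    return count


print(count_shared_words("Hi there", "Hello there"))  # 1
```

Here "hi" and "hello" are different words, so only "there" is counted.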

2. Set Theory (Very Basic)

  • Think of a sentence as a set of its unique words (word order and repeats are ignored).
  • We use:
    • Intersection (common words)
    • Union (all unique words from both)
  • Example:
    • Words in Sentence A = {hi, how, are, you}
    • Words in Sentence B = {how, are, you, today}
    • Intersection = {how, are, you} (3 words)
    • Union = {hi, how, are, you, today} (5 words)
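The same example can be reproduced with Python's built-in set type, where & gives the intersection and | gives the union. The variable names are chosen for this sketch only.

```python
sentence_a = "hi how are you"
sentence_b = "how are you today"

# Splitting on spaces and wrapping in set() keeps each word once.
words_a = set(sentence_a.split())
words_b = set(sentence_b.split())

common = words_a & words_b       # intersection: words in both sentences
all_unique = words_a | words_b   # union: every unique word from either sentence

print(sorted(common))      # ['are', 'how', 'you']
print(len(common))         # 3
print(sorted(all_unique))  # ['are', 'hi', 'how', 'today', 'you']
print(len(all_unique))     # 5
```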

3. Division (Fractions / Ratios)

  • Used to calculate a confidence score:
    • Confidence = Common Words ÷ Total Unique Words
      (intersection size ÷ union size, also known as Jaccard similarity)
  • Example:
    • If 3 words match out of 5 unique total →
      Confidence = 3 / 5 = 0.6 (or 60%)
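Putting the counting, set, and division steps together gives a small scoring function. This is a minimal sketch under simple assumptions (space-separated words, lowercase comparison); the name confidence and the choice to return 0.0 for two empty sentences are decisions made for this example.

```python
def confidence(sentence_a, sentence_b):
    """Common words divided by total unique words (hypothetical helper)."""
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    total_unique = words_a | words_b
    if not total_unique:
        # Both sentences are empty: avoid dividing by zero.
        return 0.0
    return len(words_a & words_b) / len(total_unique)


score = confidence("hi how are you", "how are you today")
print(score)           # 0.6
print(f"{score:.0%}")  # 60%
```

The score is always between 0 (no words in common) and 1 (identical word sets).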

Optional (Bonus) Concepts That Help Later

Concept                  | Where it shows up            | Needed Now?
Basic logic              | If/else checks, comparisons  | Yes
Lowercasing/text rules   | String matching              | Yes
Similarity scores        | Matching texts better        | Yes
Stemming (word endings)  | Cutting “running” to “run”   | Yes (simple rules)
Percentages              | Confidence = 0.6 → 60%       | Easy to grasp
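The lowercasing and stemming rows can be illustrated with a rough text-cleaning step applied before words are compared. The functions simple_stem and normalize are hypothetical helpers written for this sketch; the suffix rules are deliberately crude and will mis-handle many English words, so a real project would more likely use an existing stemmer.

```python
import string


def simple_stem(word):
    """Strip a few common endings using simple rules (crude, illustrative only)."""
    for suffix in ("ing", "ed", "s"):
        # Skip "ss" endings ("pass") and keep at least 3 letters in the stem.
        if word.endswith("ss") or len(word) - len(suffix) < 3:
            continue
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant: "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word


def normalize(sentence):
    """Lowercase, drop punctuation, and stem each word; returns a set of words."""
    no_punct = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    return {simple_stem(word) for word in no_punct.split()}


print(simple_stem("running"))  # run
print(sorted(normalize("Running, today!")))  # ['run', 'today']
```

With this step in place, "Running" and "run" count as the same word when sentences are matched.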

4. A Visual Cheat Sheet

Semi-supervised Learning – Visual Roadmap