Basic Statistics and Their Application in AI
1. Descriptive Statistics
Concept | Description | AI Relevance |
---|---|---|
Mean | Average value | Used in loss calculations (e.g., MSE) |
Median | Middle value when sorted | Robust against outliers |
Mode | Most frequent value | Helps in categorical analysis |
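Standard Deviation | Typical spread of values around the mean | Used in scaling and normalization |
Variance | Square of the standard deviation | Bias-variance trade-off, regularization |
Range/IQR | Spread between min and max / between the 25th and 75th percentiles | Outlier detection |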
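
The measures in the table above can be computed with Python's standard `statistics` module. This is a minimal sketch: the sample `values` is hypothetical and includes one extreme point to show how the mean and median react differently, and the 1.5 × IQR cutoff is a common rule of thumb rather than something this section prescribes.

```python
import statistics

# Hypothetical sample; 95 is a deliberately extreme value.
values = [12, 15, 15, 18, 20, 22, 95]

mean = statistics.mean(values)            # arithmetic average, pulled up by 95
median = statistics.median(values)        # middle value of the sorted data
mode = statistics.mode(values)            # most frequent value
variance = statistics.pvariance(values)   # population variance
std_dev = statistics.pstdev(values)       # population SD = sqrt(variance)
value_range = max(values) - min(values)

# Quartiles (default "exclusive" method); IQR = Q3 - Q1.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Rule of thumb: flag points more than 1.5 * IQR outside the quartiles.
outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.2f} sd={std_dev:.2f} range={value_range} iqr={iqr}")
print("outliers:", outliers)
```

Here the median stays at a typical value while the mean is dragged toward the extreme point, which is why the median is described as robust against outliers.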
2. Inferential Statistics
Concept | Description | AI Relevance |
---|---|---|
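Hypothesis Testing | Check whether data are consistent with a null hypothesis | Used in A/B testing |
p-value | Probability of a result at least as extreme as the one observed, assuming the null hypothesis is true | Feature selection |
Confidence Interval | Range expected to contain the true parameter at a stated confidence level | Uncertainty in predictions |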
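
To make the three concepts above concrete, here is a sketch of an A/B test on two conversion rates using a two-proportion z-test, plus an approximate 95% confidence interval for a single rate. The function names and the counts are hypothetical, and the normal approximation is assumed to be adequate (reasonably large samples).

```python
import math

def normal_cdf(x):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test of H0: groups A and B share the same rate."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Under H0 both groups are pooled to estimate the common rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

def proportion_ci(successes, n, z_crit=1.96):
    """Approximate 95% (Wald) confidence interval for a proportion."""
    p = successes / n
    half_width = z_crit * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical experiment: 100/1000 conversions for A, 130/1000 for B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
low, high = proportion_ci(130, 1000)
print(f"z={z:.3f} p-value={p_value:.4f}")
print(f"95% CI for B's rate: ({low:.3f}, {high:.3f})")
```

With these made-up counts the p-value falls below the conventional 0.05 threshold, so the null hypothesis of equal rates would be rejected at that level.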
3. Probability Distributions
Distribution | Description | AI Use Case |
---|---|---|
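Normal (Gaussian) | Symmetric bell curve defined by mean and standard deviation | Gaussian Naive Bayes (e.g., scikit-learn's GaussianNB) |
Binomial | Number of successes in a fixed number of independent two-outcome trials | Binary classification |
Poisson | Count of events in a fixed interval | Anomaly/event detection |
Uniform | Equal probability across a range | Weight initialization |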
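
The four distributions above have simple closed-form density or mass functions. The sketch below writes them out with the standard `math` module; the function names and default parameters are illustrative choices, not part of any particular library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, success prob p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events in an interval with average rate lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def uniform_pdf(x, low=0.0, high=1.0):
    """Constant density on [low, high], zero elsewhere."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

# A probability mass function must sum to 1 over all outcomes.
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
print(f"binomial total probability: {total:.6f}")
print(f"P(5 heads in 10 fair flips): {binomial_pmf(5, 10, 0.5):.4f}")
```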
4. Applications in AI
Data Preprocessing: Imputation, Scaling (Z-score), Outlier Removal
Feature Selection: Pearson Correlation, Chi-square, ANOVA
Evaluation: Accuracy, Precision, Recall, ROC-AUC, Confusion Matrix
Bayesian Inference: Used in probabilistic models such as Naive Bayes
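
Two of the items above, Z-score scaling and confusion-matrix metrics, are short enough to write by hand. This is a sketch with hypothetical data and helper names; in practice a library such as scikit-learn would usually be used instead.

```python
import statistics

def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # found among actual positives
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy

# Hypothetical feature column and predictions.
scaled = z_scores([2, 4, 4, 4, 5, 5, 7, 9])
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)
print("scaled:", [round(s, 2) for s in scaled])
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```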
5. Real-life AI Example
Loan Default Prediction:
- Use mean and median income to summarize the applicant population
- Use hypothesis testing to compare default rates between groups
- Evaluate the model with precision and recall
- Use Bayes’ rule for final scoring
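
The final scoring step can be illustrated with Bayes' rule applied to a single risk signal. All numbers below are hypothetical, as is the idea of a binary "risk flag"; a real scoring model would combine many features.

```python
def posterior_default_probability(prior_default,
                                  p_flag_given_default,
                                  p_flag_given_repaid):
    """P(default | flag) via Bayes' rule.

    prior_default:        P(default) in the applicant population
    p_flag_given_default: P(flag raised | applicant defaults)
    p_flag_given_repaid:  P(flag raised | applicant repays)
    """
    # Total probability of seeing the flag at all.
    evidence = (p_flag_given_default * prior_default
                + p_flag_given_repaid * (1 - prior_default))
    return p_flag_given_default * prior_default / evidence

# Hypothetical inputs: 5% base default rate; the flag fires for 60% of
# defaulters and 10% of applicants who repay.
score = posterior_default_probability(0.05, 0.60, 0.10)
print(f"P(default | flag) = {score:.2f}")
```

Even with a fairly informative flag, the posterior stays well below the 60% hit rate because defaults are rare to begin with, which is the base-rate effect Bayes' rule accounts for.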
6. Recommended Books
Statistics for Machine Learning by Pratap Dangeti
Think Stats by Allen Downey
Practical Statistics for Data Scientists by Bruce & Bruce
An Introduction to Statistical Learning by Gareth James et al.