Summary – Decision Tree Regression
Step 1: Prepare Our Data
- Organize our data as a table: features → target value
- Features can be numeric (e.g., area) or categorical (e.g., location)
- Target must be a continuous numeric value (e.g., price, rent)
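A minimal sketch of such a table in plain Python (the housing rows, feature names, and values below are made up for illustration):

```python
# Toy table: each row maps features to a target value.
# "area" is numeric, "location" is categorical, and the target
# "price" is continuous.
rows = [
    {"area": 1100, "location": "suburb", "price": 240_000},
    {"area": 1500, "location": "city",   "price": 330_000},
    {"area":  900, "location": "rural",  "price": 150_000},
    {"area": 2000, "location": "city",   "price": 450_000},
]

X = [{k: v for k, v in r.items() if k != "price"} for r in rows]  # features
y = [r["price"] for r in rows]                                    # target
```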
Step 2: Choose a Splitting Criterion
- For every feature:
- Try possible split points (e.g., area > 1200)
- Divide the data into left/right groups
- Calculate the total squared error after the split, where each group's prediction is the mean of its target values (see the sketch below):
  Error = ∑ (actual − predicted)²
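A minimal sketch of this criterion, assuming a single numeric feature (area) and using each group's mean as its prediction; the numbers and the 1200 threshold are illustrative:

```python
def squared_error(values):
    """Sum of squared deviations from the group mean
    (the mean acts as the group's predicted value)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_error(areas, prices, threshold):
    """Total squared error after splitting on 'area > threshold'."""
    left  = [p for a, p in zip(areas, prices) if a <= threshold]
    right = [p for a, p in zip(areas, prices) if a > threshold]
    return squared_error(left) + squared_error(right)

print(split_error([1100, 1500, 900, 2000], [240, 330, 150, 450], 1200))
```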
Step 3: Find the Best Split
- Choose the feature and value that minimizes total error
- This becomes a decision node
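A sketch of the exhaustive search for one numeric feature; taking midpoints between consecutive distinct values as candidate thresholds is one common convention, assumed here:

```python
def best_split(xs, ys):
    """Search one numeric feature for the threshold that
    minimizes total squared error after the split."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    levels = sorted(set(xs))
    best_t, best_err = None, float("inf")
    for lo, hi in zip(levels, levels[1:]):   # midpoints between values
        t = (lo + hi) / 2
        left  = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = sse(left) + sse(right)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err   # the decision node: "feature > best_t"

print(best_split([900, 1100, 1500, 2000], [150, 240, 330, 450]))
```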
Step 4: Recursively Split the Data
- Apply the same logic to each child node (left/right)
- Keep splitting until:
- Maximum depth reached, or
- Too few samples remain in the node, or
- Target values are similar (node is “pure”)
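Putting Steps 2–4 together, a minimal recursive sketch for a single numeric feature (the max_depth and min_samples defaults are illustrative, not canonical):

```python
def build_tree(xs, ys, depth=0, max_depth=3, min_samples=2):
    """Recursively grow a regression tree on one numeric feature,
    stopping on depth, node size, or (near-)pure targets."""
    def sse(vals):
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    # Stopping rules -> leaf storing the mean target (Step 5).
    if depth >= max_depth or len(ys) < min_samples or max(ys) - min(ys) < 1e-9:
        return {"leaf": True, "value": sum(ys) / len(ys)}

    # Find the midpoint threshold with the lowest post-split error.
    levels = sorted(set(xs))
    best_t, best_err = None, float("inf")
    for lo, hi in zip(levels, levels[1:]):
        t = (lo + hi) / 2
        left  = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = sse(left) + sse(right)
        if err < best_err:
            best_t, best_err = t, err

    if best_t is None:          # all feature values identical: stop
        return {"leaf": True, "value": sum(ys) / len(ys)}

    lx, ly = zip(*[(x, y) for x, y in zip(xs, ys) if x <= best_t])
    rx, ry = zip(*[(x, y) for x, y in zip(xs, ys) if x > best_t])
    return {"leaf": False, "threshold": best_t,
            "left":  build_tree(list(lx), list(ly), depth + 1, max_depth, min_samples),
            "right": build_tree(list(rx), list(ry), depth + 1, max_depth, min_samples)}
```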
Step 5: Assign Prediction at Leaf Nodes
- For each leaf (end of a branch):
- Set the prediction as the mean of target values in that group
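In the sketch above, the leaf value is exactly this mean; as a standalone helper (the prices are made up):

```python
def leaf_value(targets):
    """Prediction stored at a leaf: the mean of the target
    values of the training rows that end up in that group."""
    return sum(targets) / len(targets)

print(leaf_value([240_000, 260_000, 250_000]))  # -> 250000.0
```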
Step 6: Make Predictions
- For any new input:
- Start at the root
- Traverse the tree based on feature comparisons
- Reach a leaf and return the prediction stored there (see the sketch below)
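A sketch of the traversal, assuming the dict-based tree produced by the Step 4 sketch:

```python
def predict(node, x):
    """Start at the root and follow feature comparisons
    down to a leaf, then return the stored mean."""
    while not node["leaf"]:
        node = node["left"] if x <= node["threshold"] else node["right"]
    return node["value"]

# With build_tree from the Step 4 sketch:
# tree = build_tree([900, 1100, 1500, 2000], [150, 240, 330, 450])
# predict(tree, 1300)   # returns the mean stored at the reached leaf
```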
Optional Enhancements
- Add pruning to prevent overfitting
- Use categorical feature encoding
- Combine many trees into a Random Forest for ensemble learning
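If scikit-learn is available, all three enhancements come off the shelf; the hyperparameter values below are illustrative, not tuned:

```python
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Pruning: a depth cap plus cost-complexity pruning (ccp_alpha)
# trims branches that do not pay for their complexity.
tree = DecisionTreeRegressor(max_depth=5, ccp_alpha=0.01)

# Ensemble: a Random Forest averages many randomized trees.
forest = RandomForestRegressor(n_estimators=100)

# Both expect numeric inputs, so encode categorical features first,
# e.g. with sklearn.preprocessing.OneHotEncoder or pandas.get_dummies.
# tree.fit(X, y); forest.fit(X, y)
```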
Decision Tree Regression – Visual Roadmap