Phase 2: Supervised Logic
In this phase, we move to Non-Linear models that can handle complex decision boundaries.
🟢 Level 1: Decision Trees
A series of "If-Then" rules.
- Criteria: Gini Impurity or Entropy (Information Gain).
- Pros: Highly interpretable.
- Cons: Prone to overfitting (a single deep tree has high variance).
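The split criteria above are simple to compute by hand. Here is a minimal, stdlib-only sketch of Gini impurity and entropy over a list of class labels (the example labels are illustrative, not from any dataset):

```python
from collections import Counter
import math

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy: -sum(p_k * log2(p_k)) over class proportions p_k."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# A pure node (one class) has zero impurity; a 50/50 node is maximally impure.
print(gini(["a", "a", "a", "a"]))     # -> 0.0
print(gini(["a", "a", "b", "b"]))     # -> 0.5
print(entropy(["a", "a", "b", "b"]))  # -> 1.0
```

Information Gain is then the parent's impurity minus the size-weighted impurity of the children, and the tree greedily picks the split that maximizes it.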
🟡 Level 2: Ensemble Methods (Wisdom of the Crowd)
Combine multiple models to improve performance.
1. Bagging (Bootstrap Aggregating)
Train multiple trees on random bootstrap samples of the data and aggregate their predictions (averaging for regression, majority vote for classification).
- Standard Tool: Random Forest.
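A minimal sketch of bagging in practice, assuming scikit-learn is available; the synthetic dataset and hyperparameter values are arbitrary choices for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (illustrative only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each of the 100 trees is trained on a bootstrap sample; the forest
# aggregates their votes, which reduces the variance of a single tree.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_tr, y_tr)
print(forest.score(X_te, y_te))
```

Random Forest additionally decorrelates the trees by considering only a random subset of features at each split, which is what distinguishes it from plain bagging of full trees.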
2. Boosting
Train models sequentially. Each new model corrects the errors of the previous ones.
- Standard Tools: XGBoost, LightGBM, CatBoost.
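XGBoost, LightGBM, and CatBoost are separate libraries; as a dependency-light sketch of the same sequential error-correcting idea, here is scikit-learn's built-in gradient boosting on an assumed synthetic dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (illustrative only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Trees are fit sequentially: each new shallow tree is fit to the
# residual errors of the ensemble built so far.
booster = GradientBoostingClassifier(
    n_estimators=200, learning_rate=0.1, max_depth=3, random_state=0
)
booster.fit(X_tr, y_tr)
print(booster.score(X_te, y_te))
```

Note the contrast with bagging: boosting trains weak learners in sequence and weights them, rather than training strong learners in parallel and averaging.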
🔴 Level 3: Support Vector Machines (SVM)
Finding the Hyperplane that maximizes the "Margin" between classes.
The Kernel Trick
SVMs can implicitly map data into a higher-dimensional (even infinite-dimensional) space, where a linear boundary corresponds to a non-linear one in the original space.
- RBF Kernel: The most popular for complex clusters.
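To make the trick concrete, here is a hedged sketch using scikit-learn's `SVC` on concentric circles, a classic dataset no linear boundary can separate (the `gamma` value is an arbitrary illustrative choice):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in 2-D.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

# The RBF kernel separates the rings; the linear kernel cannot.
print(linear.score(X, y))
print(rbf.score(X, y))
```

The RBF kernel computes similarities as if the points had been mapped into an infinite-dimensional space, without ever constructing that mapping explicitly.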