Phase 2: Supervised Logic

In this phase, we move to Non-Linear models that can handle complex decision boundaries.


🟢 Level 1: Decision Trees

A series of "If-Then" rules.

  • Criteria: Gini Impurity or Entropy (Information Gain).
  • Pros: Highly interpretable.
  • Cons: Prone to severe overfitting; a deep tree will memorize noise unless its depth is limited or it is pruned.
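A minimal sketch of these ideas using scikit-learn (the Iris dataset, the depth limit, and all hyperparameters here are illustrative choices, not prescribed by this course):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# criterion="gini" uses Gini impurity; criterion="entropy" switches to
# information gain. max_depth=3 curbs the overfitting noted above.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X, y)

# export_text prints the learned "If-Then" rules, which is exactly
# why decision trees are considered highly interpretable.
print(export_text(tree, feature_names=load_iris().feature_names))
```

Printing the rules makes the interpretability claim concrete: every prediction can be traced down one branch of the tree.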

🟡 Level 2: Ensemble Methods (Wisdom of the Crowd)

Combine multiple models to improve performance.

1. Bagging (Bootstrap Aggregating)

Train multiple trees on random bootstrap samples of the data and aggregate their predictions: averaging for regression, majority vote for classification.

  • Standard Tool: Random Forest.
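A short bagging sketch with scikit-learn's Random Forest (the synthetic dataset and parameter values are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each of the 100 trees is trained on its own bootstrap sample
# (plus a random feature subset per split); their votes are
# aggregated, which reduces the variance of any single tree.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_tr, y_tr)
print(f"Test accuracy: {forest.score(X_te, y_te):.2f}")
```

Because the trees are trained independently, bagging parallelizes well; this is one practical reason Random Forest is the standard baseline ensemble.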

2. Boosting

Train models sequentially. Each new model corrects the errors of the previous ones.

  • Standard Tools: XGBoost, LightGBM, CatBoost.
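XGBoost, LightGBM, and CatBoost are external libraries; as a self-contained sketch of the same sequential idea, here is scikit-learn's built-in gradient boosting (dataset and hyperparameters are again illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Trees are fit sequentially: each shallow tree is trained on the
# residual errors (gradients) of the ensemble built so far, and
# learning_rate scales how much each new tree corrects.
boost = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=1
)
boost.fit(X_tr, y_tr)
print(f"Test accuracy: {boost.score(X_te, y_te):.2f}")
```

Unlike bagging, the trees here cannot be trained in parallel, since each depends on the errors of its predecessors.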

🔴 Level 3: Support Vector Machines (SVM)

Finding the Hyperplane that maximizes the "Margin" between classes.

3. The Kernel Trick

Kernels let SVMs compute inner products in a very high-dimensional (even infinite-dimensional) feature space without ever constructing it explicitly. A linear boundary in that space corresponds to a non-linear boundary in the original space.

  • RBF Kernel: The most popular for complex clusters.
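A sketch of the RBF kernel in action on data a linear model cannot separate (the two-moons dataset and the C/gamma values are illustrative assumptions):

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-moons: not linearly separable in 2-D.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# kernel="rbf" implicitly maps points into an infinite-dimensional
# feature space where a separating hyperplane exists. C trades margin
# width against misclassification; gamma controls kernel flexibility.
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
svm.fit(X, y)
print(f"Training accuracy: {svm.score(X, y):.2f}")
```

Swapping in kernel="linear" on the same data would perform noticeably worse, which is the whole point of the kernel trick.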