📊 Data Science Foundations

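Numerical computing in Python gets its performance from vectorization: work is expressed as whole-array operations, so the loops run in compiled code instead of the Python interpreter. Many of these compiled routines also release the GIL while they run, which lets compute-heavy work proceed in parallel threads.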
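As a minimal sketch of what vectorization means in practice, the following compares a pure-Python loop with an equivalent NumPy expression. The function names are illustrative and not part of any library.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: every iteration goes through the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorized(values):
    """Vectorized: the loop over elements runs inside NumPy's compiled code."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


data = np.arange(1_000, dtype=np.float64)
loop_result = sum_of_squares_loop(data)
vec_result = sum_of_squares_vectorized(data)
```

Both functions compute the same value. The vectorized version is usually much faster on large arrays, though the exact speedup depends on array size and hardware.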


🟢 Level 1: Foundations

1. NumPy (Numerical Python)

The foundational library for most numerical computing in Python.

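  • ndarray: An N-dimensional array whose elements all share one fixed dtype, stored in a single memory buffer (contiguous by default).
  • Broadcasting: Element-wise operations on arrays of different but compatible shapes. Shapes are compared from the trailing dimension, and two dimensions are compatible when they are equal or one of them is 1.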
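The two ideas above can be sketched as follows. The array values are arbitrary and chosen only to make the shapes easy to follow.

```python
import numpy as np

# A (3, 4) matrix: 3 rows, 4 columns, every element stored as float64.
matrix = np.arange(12, dtype=np.float64).reshape(3, 4)

# Column means have shape (4,); broadcasting stretches them across all 3 rows.
col_means = matrix.mean(axis=0)
centered = matrix - col_means

# A (3, 1) column and a (4,) row broadcast to a (3, 4) result.
column = np.array([[0.0], [10.0], [20.0]])
row = np.array([1.0, 2.0, 3.0, 4.0])
grid = column + row
```

Neither operation copies the smaller operand into a full-size array first; NumPy applies the broadcasting rule while iterating over the elements.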

2. Pandas (Data Manipulation)

The most widely used Python library for tabular data manipulation.

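  • Series & DataFrames: Labeled data structures. A Series is a one-dimensional labeled array; a DataFrame is a two-dimensional table whose columns are Series sharing one index.
  • Split-Apply-Combine: Grouping and aggregation with groupby: split the rows into groups by key, apply a function to each group, and combine the results.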
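A minimal sketch of split-apply-combine on a small DataFrame follows. The sales data and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical sales records, made up for this example.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south", "east"],
        "units": [10, 5, 7, 3, 8],
        "price": [2.0, 4.0, 2.0, 4.0, 1.5],
    }
)

# Column arithmetic is element-wise and aligned on the shared index.
sales["revenue"] = sales["units"] * sales["price"]

# Split by region, apply one aggregation per output column, combine the results.
summary = sales.groupby("region").agg(
    total_units=("units", "sum"),
    mean_revenue=("revenue", "mean"),
)
```

Each column of `sales` is a Series, and `summary` is a new DataFrame indexed by the group key.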

🟡 Level 2: Visualization

Using matplotlib and seaborn to plot data distributions (for example histograms and box plots) and identify outliers.
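As a rough sketch using matplotlib only, the following plots a distribution and flags outliers with the 1.5 × IQR rule. The data is synthetic, and the two extreme values are planted on purpose so there is something to find.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration: 200 normal draws plus two planted extremes.
rng = np.random.default_rng(seed=0)
values = np.concatenate([rng.normal(loc=50.0, scale=5.0, size=200), [95.0, 2.0]])

# Flag points more than 1.5 * IQR beyond the quartiles.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
is_outlier = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

# Histogram for the overall shape, box plot to make the extremes visible.
fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(8, 3))
ax_hist.hist(values, bins=30)
ax_hist.set_title("Distribution")
ax_box.boxplot(values)
ax_box.set_title("Box plot")
fig.tight_layout()
# fig.savefig("distribution.png")  # uncomment to write the figure to disk
plt.close(fig)
```

The 1.5 × IQR threshold is a common convention, not the only reasonable choice; it matches the default whisker length of matplotlib's box plot.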


🔴 Level 3: Advanced Optimization

  • Polars: A DataFrame library written in Rust, with multi-threaded execution and lazy query evaluation.
  • Numba: JIT compiler for Python that translates a subset of Python and NumPy code into fast machine code.
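To illustrate the kind of code Numba targets, here is a minimal sketch of an explicit nested loop compiled with `njit`. The function name and data are hypothetical, and the `ImportError` fallback is an assumption added so the example still runs (uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, use a no-op decorator so the
    # function still runs as ordinary (slow) Python.
    def njit(func):
        return func


@njit
def pairwise_distance_sum(points):
    """Sum of Euclidean distances between all pairs of rows in a 2-D array."""
    n, dim = points.shape
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            sq = 0.0
            for k in range(dim):
                diff = points[i, k] - points[j, k]
                sq += diff * diff
            total += np.sqrt(sq)
    return total


points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
result = pairwise_distance_sum(points)
```

With Numba installed, the first call triggers compilation for the given argument types, and later calls reuse the compiled machine code. Loops like this are slow in plain Python but are a good fit for JIT compilation.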