Learning Path
Introduction to ML
Foundational concepts. Understand supervised, unsupervised, and reinforcement learning before diving into algorithms.
Data Preprocessing
Clean data, handle missing values, encode categorical variables, and engineer features before training any model.
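A minimal sketch of those three steps with scikit-learn, on a tiny hypothetical dataset (the column names and values are made up for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical data with one missing value and one categorical column.
df = pd.DataFrame({
    "size_m2": [50.0, 80.0, np.nan, 120.0],
    "city": ["berlin", "munich", "berlin", "hamburg"],
})

# 1) Impute the missing numeric value with the column median (80.0 here).
size = SimpleImputer(strategy="median").fit_transform(df[["size_m2"]])

# 2) One-hot encode the categorical column into three dummy columns.
city = OneHotEncoder(handle_unknown="ignore").fit_transform(df[["city"]]).toarray()

# 3) Standardize the numeric feature to zero mean and unit variance.
size_scaled = StandardScaler().fit_transform(size)

X = np.hstack([size_scaled, city])
print(X.shape)  # (4, 4): one scaled numeric column + three city dummies
```

In a real pipeline these steps would be wrapped in a `ColumnTransformer` so they can be fit on the training split only.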
Supervised Learning
Learn from labeled data. Map inputs to known outputs — predicting prices, classifying spam, and more.
Linear Regression: Fit a straight line to predict house prices. y = β₀ + β₁x. Train-test split, MSE and R² evaluation, regression-line visualization.
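A sketch of that workflow on synthetic "house price" data (the coefficients and noise level below are invented, not the notebook's dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: price ≈ 50,000 + 1,200 · size + noise.
rng = np.random.default_rng(42)
size = rng.uniform(40, 200, 200).reshape(-1, 1)
price = 50_000 + 1_200 * size.ravel() + rng.normal(0, 10_000, 200)

X_train, X_test, y_train, y_test = train_test_split(size, price, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)  # learns β₀ (intercept_) and β₁ (coef_)
pred = model.predict(X_test)

mse = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"β1 ≈ {model.coef_[0]:.0f}, MSE = {mse:,.0f}, R² = {r2:.2f}")
```

The recovered slope lands close to the true 1,200 because the noise is small relative to the signal.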
Logistic Regression: Classification using the sigmoid function. Binary and multi-class with one-vs-rest. Decision boundaries and the log-odds transform.
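To make the sigmoid connection concrete, here is a sketch on a toy 1-D problem showing that `predict_proba` is exactly the sigmoid applied to the linear score:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# The sigmoid squashes the linear score z = w·x + b into a probability in (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary problem: class 1 whenever x > 0.
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = (X.ravel() > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# Recompute the model's probabilities by hand from w and b.
z = X @ clf.coef_.T + clf.intercept_
manual = sigmoid(z).ravel()
print(np.allclose(manual, clf.predict_proba(X)[:, 1]))  # True
print(clf.predict([[2.0]]))  # [1]
```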
Decision Trees: Tree-shaped model that splits by asking questions. Gini impurity, entropy, and MSE criteria. Tree-depth control with a visual plot_tree diagram.
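A minimal sketch on the iris dataset; `export_text` prints the same split questions that `plot_tree` would draw:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Limit depth to keep the tree readable and reduce overfitting.
tree = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0).fit(X, y)

print(export_text(tree))   # the learned if/else questions, one per split
print(tree.get_depth())    # 2
```

Even at depth 2 the tree separates iris quite well, which is why depth control is the first overfitting knob to reach for.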
K-Nearest Neighbors (KNN): Classify by majority vote of the K closest neighbors. Euclidean distance, and why feature scaling is critical. Interactive new predictions.
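The scaling point can be demonstrated directly: the wine dataset has features on wildly different scales, so Euclidean distance is dominated by the largest one unless you standardize first (a sketch, not the notebook's exact setup):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, stratify=y)

# Same model twice: once on raw features, once after StandardScaler.
raw = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)).fit(X_train, y_train)

print(f"unscaled: {raw.score(X_test, y_test):.2f}, scaled: {scaled.score(X_test, y_test):.2f}")
```

The scaled pipeline wins by a wide margin because one feature (proline, in the thousands) otherwise swamps the distance computation.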
Random Forest: Ensemble of decision trees with bagging. Basic regression, extended with categorical encoding. Feature importance. R² = 0.96.
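Feature importances are a headline feature of forests; a sketch on the breast-cancer dataset (chosen here for illustration, not necessarily the notebook's data):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

# Importances are impurity decreases averaged over all trees; they sum to 1.
order = np.argsort(forest.feature_importances_)[::-1]
for i in order[:3]:
    print(f"{data.feature_names[i]}: {forest.feature_importances_[i]:.3f}")
```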
Support Vector Machines (SVM): Maximize the margin between classes. Kernel trick (RBF), support vectors, ε-insensitive tube for regression. Apartment rent prediction.
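The kernel trick is easiest to see on data a line cannot separate. A sketch with concentric circles (synthetic data, not the rent example):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no straight line separates the classes, but the RBF
# kernel implicitly maps them into a space where a hyperplane does.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", gamma="scale").fit(X, y)

print(f"linear: {linear.score(X, y):.2f}, rbf: {rbf.score(X, y):.2f}")
print(f"support vectors used by RBF: {rbf.n_support_.sum()}")
```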
Spam Detection Project: End-to-end SMS spam detection. NLTK tokenization, stopword removal, TF-IDF vectorization, GradientBoostingClassifier. 97% accuracy.
XGBoost: Extreme Gradient Boosting with L1 (Lasso) vs L2 (Ridge) regularization. Visualizes how regularization affects weights. Built-in parallelization.
Hyperparameter Tuning: GridSearchCV (exhaustive), RandomizedSearchCV (sampling), and Bayesian optimization with Optuna (intelligent), compared side by side.
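A sketch of the first two approaches on the digits dataset (Optuna is a third-party dependency, so it is left out here; the parameter ranges below are illustrative, not the notebook's):

```python
from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# Exhaustive: tries every combination (3 × 3 = 9 candidates per CV fold).
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": [5, 10, 15], "min_samples_split": [2, 10, 20]},
    cv=3,
).fit(X, y)

# Sampling: draws n_iter random candidates from the distributions instead.
rand = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": randint(3, 20), "min_samples_split": randint(2, 30)},
    n_iter=9, cv=3, random_state=0,
).fit(X, y)

print(grid.best_params_, f"{grid.best_score_:.2f}")
print(rand.best_params_, f"{rand.best_score_:.2f}")
```

With the same budget (9 candidates each), randomized search can cover a much wider range of values, which is why it often wins on high-dimensional search spaces.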
Evaluation Metrics: Classification with Accuracy, Precision, Recall, F1, Confusion Matrix, ROC-AUC. Regression with MAE, MSE, RMSE, R². When to use which.
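The classification metrics are all derived from the confusion matrix, which a tiny hand-made example makes easy to verify:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Hand-made labels: 4 positives, 6 negatives; predictions miss one positive
# (a false negative) and flag one negative (a false positive).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy  = {accuracy_score(y_true, y_pred):.2f}")   # (TP+TN)/total = 8/10
print(f"precision = {precision_score(y_true, y_pred):.2f}")  # TP/(TP+FP) = 3/4
print(f"recall    = {recall_score(y_true, y_pred):.2f}")     # TP/(TP+FN) = 3/4
print(f"f1        = {f1_score(y_true, y_pred):.2f}")         # harmonic mean = 0.75
```

Accuracy looks fine here, but on heavily imbalanced data precision/recall/F1 tell the real story.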
Naive Bayes: Probabilistic classifier based on Bayes' theorem. Gaussian, Multinomial, and Bernoulli variants. Fast and effective for text, spam detection, and sentiment. Laplace smoothing explained.
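A sketch of the Multinomial variant on a six-message hypothetical corpus; `alpha=1.0` is the Laplace smoothing that keeps unseen words from zeroing out a class probability:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Invented mini-corpus: 1 = spam, 0 = ham.
texts = ["win money now", "free prize win", "meeting at noon",
         "lunch tomorrow noon", "win free money prize", "noon meeting tomorrow"]
labels = [1, 1, 0, 0, 1, 0]

vec = CountVectorizer()
X = vec.fit_transform(texts)                 # bag-of-words counts
nb = MultinomialNB(alpha=1.0).fit(X, labels)  # alpha=1.0 → Laplace smoothing

print(nb.predict(vec.transform(["free money"])))        # spam-like words → 1
print(nb.predict(vec.transform(["meeting tomorrow"])))  # ham-like words → 0
```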
Cross-Validation: K-Fold, Stratified K-Fold, Leave-One-Out, Time Series Split. Get reliable performance estimates instead of one lucky train-test split.
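A minimal stratified example; note the shuffle, since the iris file is ordered by class and unshuffled folds would be misleading:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Stratified folds preserve the class ratio in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

print(scores)                                # five estimates, not one lucky split
print(f"{scores.mean():.2f} ± {scores.std():.2f}")
```

Reporting mean ± std across folds is the honest version of a single accuracy number.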
Feature Selection: Filter, wrapper, and embedded methods. SelectKBest, RFE, tree importance, Lasso L1. Pick the best features and drop the noise.
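A sketch of the filter approach: score each feature independently with an ANOVA F-test and keep the top k. On iris, the petal measurements win by a large margin:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Filter method: univariate F-test per feature, keep the k highest scorers.
selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)

print(selector.scores_.round(1))   # petal features score far above sepal ones
print(selector.get_support())      # boolean mask of the kept features
X_reduced = selector.transform(X)
print(X_reduced.shape)             # (150, 2)
```

Filter methods are cheap but ignore feature interactions; that is what wrapper methods like RFE add, at a compute cost.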
Time Series Forecasting: Classical methods. ARIMA(p,d,q), stationarity testing, auto_arima. Facebook Prophet with holidays, multiple seasonality, component plots.
Unsupervised Learning
Discover hidden patterns in unlabeled data. Clustering, dimensionality reduction, and association rules.
K-Means: Centroid-based partitioning. k-means++ initialization, the Elbow method for choosing K, Silhouette score. Iterative assignment and update until convergence.
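A sketch of choosing K by silhouette score on synthetic blobs (the data is generated here, so the "true" K = 3 is known and should score highest):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Three well-separated synthetic clusters.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

scores = {}
for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)  # closer to 1 = tighter, better-separated

print({k: round(v, 3) for k, v in scores.items()})
print("best K:", max(scores, key=scores.get))
```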
Hierarchical Clustering: Agglomerative (bottom-up) and Divisive (top-down). Linkage methods: Single, Complete, Average, Ward. Dendrogram tree diagrams.
DBSCAN: Density-based clustering. No need to specify K; finds arbitrary shapes and marks outliers as noise (-1). ε and MinPts parameters.
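The arbitrary-shapes claim is easiest to see on two interleaved half-moons, which centroid methods cannot separate (the ε and MinPts values below are hand-picked for this synthetic data):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-moons: non-convex shapes K-Means would split wrongly.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# eps = neighborhood radius (ε), min_samples = MinPts.
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print(f"clusters found: {n_clusters}, noise points: {n_noise}")
```

Note that K was never specified; DBSCAN discovered both moons from density alone.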
Gaussian Mixture Models (GMM): Soft probabilistic clustering via the EM algorithm. Each point gets a probability per cluster. Gaussian distributions with μ, Σ, π.
Mean Shift: Density peak seeking. Each point shifts toward the mean of nearby points. Discovers the cluster count automatically. Bandwidth parameter.
PCA: Dimensionality reduction via maximum-variance projection. MNIST 64D → 2D. Eigenvalues, eigenvectors, linear transformation.
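A sketch of that 64D → 2D projection using scikit-learn's 8×8 digits:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Each 8×8 digit image is a 64-dimensional vector; project it onto the two
# directions of maximum variance (top eigenvectors of the covariance matrix).
X, y = load_digits(return_X_y=True)
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)

print(X.shape, "->", X_2d.shape)               # (1797, 64) -> (1797, 2)
print(pca.explained_variance_ratio_.round(3))  # variance captured per component
```

The two components capture only part of the total variance, which is the trade-off any 2-D visualization of high-dimensional data accepts.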
Association Rule Learning: Market basket analysis. Apriori, Eclat, and FP-Growth compared. Support, Confidence, Lift metrics. Real-world product bundling.
Isolation Forest: Anomaly detection by random partitioning. Anomalies are isolated in fewer splits. Anomaly scores, contamination parameter. Fraud and intrusion detection.
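A sketch with planted anomalies, so the contamination parameter has a known true value (the data is synthetic, made up for this demo):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(200, 2))    # dense "normal" cluster at the origin
outliers = rng.uniform(6, 8, size=(5, 2))   # five far-away planted anomalies
X = np.vstack([normal, outliers])

# contamination = expected fraction of anomalies in the data (5 of 205 here).
iso = IsolationForest(contamination=5 / 205, random_state=0).fit(X)
labels = iso.predict(X)                     # +1 = normal, -1 = anomaly

print(int(np.sum(labels == -1)), "points flagged")
print(labels[-5:])                          # the five planted outliers
```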
Recommendation Systems: Collaborative filtering (user-user, item-item), content-based filtering with TF-IDF, matrix factorization. Build Netflix/Amazon-style recommendations.
Reinforcement Learning
An agent learns by trial and error, receiving rewards for good actions and penalties for bad ones.
RL Fundamentals: Core vocabulary (Agent, Environment, State, Action, Reward, Policy, Episode). Model-free vs model-based. Real-world examples.
Q-Learning: Model-free RL with a Q-table. 4×4 grid-world navigation. Bellman equation, epsilon-greedy exploration. 500 episodes to the optimal path.
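A minimal tabular sketch of that setup; the reward scheme is an assumption (+1 at the goal, -0.01 per step), since the notebook's exact values aren't given here:

```python
import random

# Tabular Q-learning on a 4×4 grid: start at (0, 0), goal at (3, 3).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
alpha, gamma, epsilon = 0.5, 0.9, 0.1          # learning rate, discount, exploration
Q = {(r, c): [0.0] * 4 for r in range(4) for c in range(4)}

def step(state, a):
    """Deterministic move; walls clamp you inside the grid."""
    r = min(max(state[0] + ACTIONS[a][0], 0), 3)
    c = min(max(state[1] + ACTIONS[a][1], 0), 3)
    nxt = (r, c)
    return nxt, (1.0 if nxt == (3, 3) else -0.01), nxt == (3, 3)

random.seed(0)
for _ in range(500):                           # 500 training episodes
    s, done = (0, 0), False
    while not done:
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if random.random() < epsilon:
            a = random.randrange(4)
        else:
            a = max(range(4), key=lambda i: Q[s][i])
        nxt, reward, done = step(s, a)
        # Bellman update: nudge Q(s, a) toward r + γ · max_a' Q(s', a').
        Q[s][a] += alpha * (reward + gamma * max(Q[nxt]) - Q[s][a])
        s = nxt

# Follow the greedy policy from the start; the optimal path takes 6 steps.
s, path = (0, 0), [(0, 0)]
while s != (3, 3) and len(path) < 20:
    s, _, _ = step(s, max(range(4), key=lambda i: Q[s][i]))
    path.append(s)
print(len(path) - 1, "steps to the goal")
```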
Deep Q-Networks (DQN): Q-Learning plus neural networks for large state spaces. Experience replay, target networks, epsilon-greedy. The approach that mastered Atari games.
Deep Learning
Neural networks with multiple layers. From basic perceptrons to CNNs for images and LSTMs for sequences.
Neural Network Basics: Fundamentals, activation functions (ReLU, Sigmoid, Tanh, Softmax), backpropagation, common architectures.
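The four activation functions listed above, in plain NumPy (a sketch for intuition, not a framework implementation):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)          # negatives clipped to zero

def sigmoid(z):
    return 1 / (1 + np.exp(-z))      # S-curve into (0, 1)

def tanh(z):
    return np.tanh(z)                # S-curve into (-1, 1)

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()               # outputs form a probability distribution

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))              # [0. 0. 3.]
print(sigmoid(0.0))         # 0.5, the midpoint of the S-curve
print(softmax(z).sum())     # 1.0
```

ReLU is the default for hidden layers (cheap, gradient-friendly); softmax is reserved for multi-class output layers.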
Artificial Neural Networks: MNIST digit classification. 784 → 128 (ReLU) → 64 (ReLU) → 10 (Softmax). Adam optimizer. 97.78% accuracy in 10 epochs.
Convolutional Neural Networks (CNN): CIFAR-10 color images. Conv2D → MaxPool → Conv2D → MaxPool → Conv2D → Dense. Feature extraction, translation invariance. 71.86% accuracy.
Recurrent Neural Networks (RNN): Next-word prediction. Embedding → SimpleRNN(50) → Dense. N-gram sequences, hidden states, the vanishing gradient problem.
LSTM: Gated architecture solving vanishing gradients. Input, Forget, and Output gates. Cell state as long-term memory. Text generation.
LSTM Time Series Forecasting: Sales forecasting with stacked LSTMs. LSTM(50) → LSTM(25) → Dense(1). Sliding window with lookback=5, MinMaxScaler.
Neural Network Regression: MLPRegressor for house prices. Hidden layers (64, 32) with ReLU, Adam. Why StandardScaler is essential for NNs. R² = 0.97.
Generative Adversarial Networks (GAN): Generator vs Discriminator competing to produce realistic data. DCGAN, WGAN, CycleGAN, StyleGAN variants. Training challenges: mode collapse, instability.
Autoencoders: Encoder-decoder networks learning compressed representations. Vanilla, Denoising, Variational (VAE), and Convolutional variants. Anomaly detection, image denoising.
Transfer Learning: Reuse pre-trained models (ResNet, VGG, MobileNet, EfficientNet). Feature extraction vs fine-tuning strategies. When to freeze, when to unfreeze.
Transformers: Self-attention, multi-head attention, positional encoding. The architecture behind GPT, BERT, and all modern LLMs. Encoder-decoder explained step by step.
Regularization Techniques: Prevent overfitting. L1/L2 regularization, Dropout, Batch Normalization, Early Stopping, Data Augmentation, Weight Decay. When to use each.
Advanced Topics
NLP and Prompt Engineering — where classical ML meets modern AI.
NLP Fundamentals: Full pipeline of tokenization, stopwords, stemming vs lemmatization, BoW, and TF-IDF. Word embeddings: Word2Vec, GloVe, FastText, BERT.
Prompt Engineering: The most comprehensive notebook. Zero/few-shot, Chain-of-Thought, Tree of Thoughts, ReAct, Reflexion. APIs for GPT-4, Claude, Gemini. Debugging and domain-specific prompts.
HuggingFace Transformers: Sentiment classification, NER, and QA. Fine-tuning pre-trained BERT with the Trainer API. Pipeline for instant inference.
Model Deployment: Notebook to production. Save models (joblib, ONNX), build APIs (FastAPI, Flask), Dockerize, deploy to the cloud. Input validation and monitoring.