machine learning · 2025
Phase 1 · mathematics for ML
linear algebra · calculus · probability · statistics · weeks 1–8

Linear algebra

  • vectors, matrices, dot product
  • eigenvalues, SVD, PCA intuition
  • matrix factorisation
foundation for DL
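
The eigenvalue bullet above can be made concrete with power iteration, which needs nothing beyond dot products. This is a minimal sketch in plain Python; the 2x2 matrix and the helper names `mat_vec` and `power_iteration` are illustrative assumptions, not part of the roadmap.

```python
import math

def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v using row-wise dot products."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def power_iteration(A, steps=100):
    """Estimate the dominant eigenvalue and a unit eigenvector of a square matrix."""
    v = [1.0] * len(A)
    for _ in range(steps):
        w = mat_vec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v (v already has unit length)
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(A, v)))
    return eigenvalue, v

# Hypothetical symmetric matrix; its eigenvalues are (5 +/- sqrt(5)) / 2
A = [[2.0, 1.0], [1.0, 3.0]]
lam, vec = power_iteration(A)
print(lam, vec)
```

Repeated multiplication amplifies the component along the dominant eigenvector, which is the same intuition behind the first principal component in PCA.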

Calculus & optimisation

  • derivatives, chain rule, gradient
  • backpropagation intuition
  • sgd, adam, learning rates
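
As a rough illustration of the derivative, gradient, and learning-rate bullets, the sketch below runs plain full-batch gradient descent on a one-dimensional toy loss and checks the analytic derivative against a finite difference. It is not sgd or adam; the loss function, learning rate, and step count are assumptions chosen for the example.

```python
def f(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad_f(w):
    """Analytic derivative of f via the chain rule: 2 * (w - 3)."""
    return 2.0 * (w - 3.0)

def numeric_grad(fn, w, eps=1e-6):
    """Central finite difference, used only to sanity-check grad_f."""
    return (fn(w + eps) - fn(w - eps)) / (2 * eps)

def gradient_descent(w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    w = w0
    for _ in range(steps):
        w -= lr * grad_f(w)
    return w

w_final = gradient_descent(w0=0.0)
print(w_final, grad_f(1.0), numeric_grad(f, 1.0))
```

With this loss, any learning rate above 1.0 makes the iterates diverge, which is a quick way to see why the learning rate matters.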

Probability & stats

  • Bayes, distributions, MLE
  • hypothesis testing, p-value
  • entropy, cross‑entropy, KL
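
A small worked example of Bayes' rule, assuming a diagnostic-test setting. The prior, sensitivity, and false-positive rate are made-up numbers for illustration only.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' rule."""
    # Total probability of a positive test (the evidence term)
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence

# Hypothetical numbers: 1% base rate, 95% sensitivity, 5% false positives
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p, 4))
```

Even with a fairly accurate test, the posterior stays well below 50% here because the prior is so small.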

Information theory

  • entropy, mutual information
  • KL divergence, cross‑entropy
loss functions
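
The three quantities listed above are tied together by the identity H(p, q) = H(p) + KL(p || q), which is why minimising cross‑entropy loss against fixed labels also minimises KL divergence. A minimal sketch using natural logarithms; the two distributions are arbitrary examples.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(p, q); assumes q_i > 0 wherever p_i > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(p || q); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]  # "true" distribution (illustrative)
q = [0.5, 0.3, 0.2]  # model distribution (illustrative)
print(entropy(p), cross_entropy(p, q), kl_divergence(p, q))
```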
Phase 2 · classical machine learning
supervised · unsupervised · feature eng · xgboost · weeks 9–20

Regression

  • linear, ridge, lasso, polynomial
  • MAE, RMSE, R²
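
The three regression metrics can be written out directly from their definitions. This sketch uses plain Python and a hypothetical set of four predictions; in practice a library such as sklearn would usually be used instead.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative data only
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```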

Classification

  • logistic regression, decision trees
  • random forest, xgboost, lightgbm
precision/recall
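
Precision and recall follow from three counts: true positives, false positives, and false negatives. A minimal sketch for the binary case; the labels and predictions are invented for the example.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Illustrative labels: 3 true positives, 1 false positive, 1 false negative
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```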

Unsupervised

  • k‑means, DBSCAN, PCA, t‑SNE
  • anomaly detection: isolation forest

Imbalance & tuning

  • SMOTE, class weights
  • cross‑val, optuna, SHAP
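
For the class‑weights bullet, one common heuristic is weight = n_samples / (n_classes * class_count), which is the formula scikit-learn documents for its "balanced" mode. The sketch below reimplements that formula in plain Python rather than calling the library; the 90/10 label split is hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced dataset: 90 negatives, 10 positives
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

The minority class ends up weighted nine times more heavily than the majority class, so each class contributes equally to a weighted loss.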
Phase 3 · deep learning & PyTorch

NN fundamentals

  • neurons, activations, backprop
  • optimizers: sgd, adam, adamw
  • batch norm, dropout

Computer vision

  • CNNs, resnet, efficientnet
  • object detection: YOLO, DETR
transfer learning

NLP & transformers

  • BERT, GPT, attention
  • huggingface, fine‑tuning
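
The attention bullet refers to scaled dot‑product attention, softmax(QK^T / sqrt(d_k)) V. This is a single-head sketch in plain Python with no masking and no learned projections; the tiny Q, K, V matrices are assumptions chosen so the result is easy to inspect.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention; returns (outputs, attention weights)."""
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors
        outputs.append([sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))])
    return outputs, weights

# Illustrative inputs: each query matches exactly one key
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out, w = attention(Q, K, V)
print(out, w)
```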

PyTorch in depth

  • autograd, nn.Module, DataLoader
  • mixed precision, distributed
Phase 4 · generative AI · LLMs · RAG

Prompt engineering

  • zero/few‑shot, chain‑of‑thought
  • system prompts, JSON mode
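
A few‑shot prompt is just an instruction followed by worked input/output pairs and then the new input. The sketch below builds one as a string; the "Input:" / "Output:" layout, the function name, and the sentiment examples are all assumptions, and different models may respond better to other formats.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble instruction, labelled examples, and the unanswered query."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)

# Hypothetical sentiment task
prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("great battery life", "positive"), ("screen cracked in a week", "negative")],
    "fast shipping and works well",
)
print(prompt)
```

Passing an empty examples list gives the zero‑shot version of the same prompt.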

RAG pipelines

  • chunking, embeddings, vector DB
  • chroma, pinecone, hybrid search
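
The retrieval half of a RAG pipeline can be sketched end to end without any external service: split text into overlapping chunks, embed each chunk, and rank by cosine similarity. Here the "embedding" is a toy bag‑of‑words count and the "vector DB" is a Python list; both are stand-ins, and a real pipeline would use an embedding model and a store such as chroma or pinecone.

```python
import math
from collections import Counter

def chunk_words(text, size=8, overlap=2):
    """Split text into windows of `size` words that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]

def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Illustrative document and query
doc = ("the quick brown fox jumps over the lazy dog near "
       "the quiet river bank at dawn today")
chunks = chunk_words(doc)
best = retrieve("river bank at dawn", chunks, k=1)
print(chunks, best)
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.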

LangChain / LlamaIndex

  • chains, agents, tools
  • ReAct, multi‑agent (crewai)

Fine‑tuning (LoRA, QLoRA)

  • PEFT, TRL, Axolotl
  • DPO, RLHF basics
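
LoRA freezes a weight matrix W of shape d_out x d_in and trains a low-rank update B A, with B of shape d_out x r and A of shape r x d_in. The arithmetic below shows why that is cheap; the 4096 x 4096 layer size and rank 8 are hypothetical values picked for illustration.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one linear layer."""
    full = d_out * d_in            # parameters in the frozen weight W
    lora = rank * (d_in + d_out)   # parameters in B (d_out x r) plus A (r x d_in)
    return full, lora

# Hypothetical layer size and rank
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, lora / full)
```

At these sizes the adapter trains under half a percent of the layer's parameters. QLoRA applies the same idea on top of a quantized frozen base model.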
Phase 5 · MLOps · production · serving

Experiment tracking

  • MLflow, WandB, DVC
  • model registry, data versioning

Serving & inference

  • FastAPI, BentoML, Triton
  • vLLM, quantization, ONNX
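
To make the quantization bullet concrete, the sketch below applies symmetric per-tensor int8 quantization to a short list of weights and measures the round-trip error. It is one simple scheme among many, not what vLLM or ONNX do internally; the weight values are invented.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to the integer range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map integers back to approximate float values."""
    return [qi * scale for qi in q]

# Illustrative weights
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The worst-case rounding error is half a quantization step (scale / 2), so a single large outlier weight inflates the error for every other value in the tensor.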

Monitoring & CI/CD

  • Evidently, data drift, concept drift
  • GitHub Actions, Airflow
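
Data drift is usually detected by comparing a feature's distribution in production against a reference window. One statistic often used for this is the Population Stability Index (PSI); the sketch below is a hand-rolled version with equal-width bins and does not use Evidently's API. The bin count, the `eps` floor, and the synthetic data are assumptions.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp into the edge bins
        # Floor at eps so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Synthetic data: the shifted sample only covers the upper half of the range
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]
print(psi(reference, reference), psi(reference, shifted))
```

PSI only looks at input distributions, so it can flag data drift but not concept drift, where the relationship between inputs and the target changes.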
structured weekly roadmap
Phase | Timeline | Core topics | Deliverable
Phase 1 | Weeks 1‑8 | Math: linear algebra, calculus, stats | math problem sets + notebook
Phase 2 | Weeks 9‑20 | Classical ML: sklearn, XGBoost, feature eng | 4 kaggle‑style projects
Phase 3 | Weeks 21‑34 | Deep learning: PyTorch, CNNs, transformers | image classifier + nlp pipeline
Phase 4A | Weeks 35‑46 | Gen AI: prompt, RAG, agents, fine‑tuning | RAG chatbot + agent app
Phase 4B | Weeks 35‑46 | MLOps: experiment tracking, serving, monitor | automated ML pipeline
Phase 5 | Weeks 47‑58 | Specialisation: recsys / time series / GNN | specialised project
Phase 6 | Weeks 55‑65 | Industry capstone: full‑stack ML system | production‑grade capstone

⚡ industry capstone tracks

🔍 recommendation engine

  • two‑tower neural CF + matrix factor
  • feast feature store, redis cache
  • A/B testing, drift monitoring

📄 RAG + LLM document QA

  • LangChain, GPT‑4o, pinecone
  • RAGAS eval, streaming FastAPI
  • QLoRA fine‑tuning, langsmith
pytorch 2.x · huggingface · vLLM · mlflow · kubeflow · evidently

📌 must‑have (first job)

  • python, numpy, pandas, sklearn
  • pytorch: train neural nets
  • 3+ projects with github + demo
  • sql, evaluation metrics

📚 top free resources

  • fast.ai / cs229 (youtube)
  • huggingface course
  • karpathy's zero‑to‑hero
  • kaggle learn

🧠 interview prep

  • bias‑variance, backprop
  • attention / transformers
  • ML system design