ml · industry roadmap 2025

machine learning · 2025

Phase 1 · mathematics for ML

linear algebra · calculus · probability · statistics · weeks 1–8

Linear algebra

vectors, matrices, dot product
eigenvalues, SVD, PCA intuition
matrix factorisation

foundation for DL

Calculus & optimisation

derivatives, chain rule, gradient
backpropagation intuition
SGD, Adam, learning rates
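
A minimal sketch of the gradient step behind the optimisers listed above. It is plain full-batch gradient descent on a one-dimensional quadratic, not SGD or Adam themselves; the function name `grad_descent` and the default learning rate are illustrative assumptions.

```python
def grad_descent(grad, x0, lr=0.1, steps=100):
    """Plain gradient descent: repeatedly step against the gradient.

    `grad` is a function returning the derivative at x (hypothetical helper,
    not a library API).
    """
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x


# f(x) = (x - 3)^2, so f'(x) = 2 * (x - 3); the minimum sits at x = 3.
x_min = grad_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With this learning rate the distance to the minimum shrinks by a constant factor each step; a rate of 1.0 or more would stall or diverge on this function, which is one reason learning rates are on the list.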

Probability & stats

Bayes, distributions, MLE
hypothesis testing, p-value
entropy, cross‑entropy, KL
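
For the hypothesis testing and p-value item, a small sketch of a two-sided, two-proportion z-test using only the standard library. The function name and the sample counts are made up for illustration, and the normal approximation assumes reasonably large samples.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function, then the two-sided tail.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Illustrative counts: 100/1000 conversions versus 130/1000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
```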

Information theory

entropy, mutual information
KL divergence, cross‑entropy
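
The three quantities above are tied together by the identity H(p, q) = H(p) + KL(p ‖ q). A sketch for discrete distributions, using natural logarithms and assuming q is non-zero wherever p is; the function names are illustrative.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q); assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q); assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]
gap = cross_entropy(p, q) - (entropy(p) + kl_divergence(p, q))  # ~0 up to rounding
```

Because H(p) does not depend on the model q, minimising cross-entropy over q is the same as minimising KL(p ‖ q), which is why cross-entropy shows up as a training loss.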

Phase 2 · classical machine learning

supervised · unsupervised · feature eng · xgboost · weeks 9–20

Regression

linear, ridge, lasso, polynomial
MAE, RMSE, R²
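
A sketch of the three metrics above in plain Python, assuming equal-length lists and a non-constant target (otherwise R² is undefined). The function name and sample values are illustrative.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mae, rmse, r2


y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
```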

Classification

logistic regression, decision trees
random forest, XGBoost, LightGBM

Unsupervised

k‑means, DBSCAN, PCA, t‑SNE
anomaly detection: isolation forest

Imbalance & tuning

SMOTE, class weights
cross‑val, optuna, SHAP
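
For the class weights item, a sketch of the common "balanced" heuristic, weight = n_samples / (n_classes × class_count), which is the formula scikit-learn documents for `class_weight="balanced"`. The helper name and the 90/10 split are illustrative.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative imbalanced labels: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
```

The rare class ends up with the larger weight, so each of its errors costs more in the loss.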

Phase 3 · deep learning & PyTorch

NN fundamentals

neurons, activations, backprop
optimizers: SGD, Adam, AdamW
batch norm, dropout

Computer vision

CNNs, ResNet, EfficientNet
object detection: YOLO, DETR

NLP & transformers

BERT, GPT, attention
huggingface, fine‑tuning
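
The attention item can be sketched as single-head scaled dot-product attention, softmax(QKᵀ / √d_k)·V, in plain Python lists. This leaves out masking, batching, multiple heads, and learned projections; all names and the toy vectors are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Each output row is a weighted average of the value rows."""
    d_k = len(keys[0])
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)])
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[5.0, 0.0]]  # points along the first key
result = attention(queries, keys, values)
```

The query aligns with the first key, so most of the weight lands on the first value row.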

PyTorch in depth

autograd, nn.Module, DataLoader
mixed precision, distributed

Phase 4 · generative AI · LLMs · RAG

Prompt engineering

zero/few‑shot, chain‑of‑thought
system prompts, JSON mode

RAG pipelines

chunking, embeddings, vector DB
chroma, pinecone, hybrid search
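
A toy sketch of the retrieval half of such a pipeline: chunk, embed, rank by cosine similarity. The bag-of-words "embedding" is a stand-in for a real embedding model, and an in-memory list is a stand-in for a vector DB; every name here is hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into word windows of `size` words sharing `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    chunks, i = [], 0
    while True:
        chunks.append(" ".join(words[i:i + size]))
        if i + size >= len(words):
            return chunks
        i += size - overlap


def embed(text):
    """Stand-in embedding: lowercase word counts (an assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "lora adds low rank adapters to frozen weights",
    "k-means groups points around centroids",
    "vector databases store embeddings for similarity search",
]
top = retrieve("how do vector databases search embeddings", chunks, k=1)
```

In a real pipeline the retrieved chunks would be placed into the LLM prompt as context; hybrid search would combine a keyword score with the vector score.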

LangChain / LlamaIndex

chains, agents, tools
ReAct, multi‑agent (crewai)

Fine‑tuning (LoRA, QLoRA)

PEFT, TRL, Axolotl
DPO, RLHF basics

Phase 5 · MLOps · production · serving

Experiment tracking

MLflow, WandB, DVC
model registry, data versioning

Serving & inference

FastAPI, BentoML, Triton
vLLM, quantization, ONNX

Monitoring & CI/CD

Evidently, data drift, concept drift
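
One simple data drift check is the population stability index (PSI) between a reference sample and a live sample of one numeric feature. The sketch below is a hand-rolled illustration, not Evidently's API; the bin count, the `eps` floor for empty bins, and the sample data are assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index using equal-width bins from `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip out-of-range values
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]
unchanged = list(reference)
shifted = [x + 0.5 for x in reference]  # simulated drift

psi_same = psi(reference, unchanged)
psi_shifted = psi(reference, shifted)
```

This only looks at the input distribution. Concept drift, where the relationship between inputs and target changes, needs labelled outcomes or performance monitoring to detect.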
GitHub Actions, Airflow

structured weekly roadmap

Phase	Timeline	Core topics	Deliverable
Phase 1	Weeks 1‑8	Math: linear algebra, calculus, stats	math problem sets + notebook
Phase 2	Weeks 9‑20	Classical ML: sklearn, XGBoost, feature eng	4 kaggle‑style projects
Phase 3	Weeks 21‑34	Deep learning: PyTorch, CNNs, transformers	image classifier + nlp pipeline
Phase 4A	Weeks 35‑46	Gen AI: prompting, RAG, agents, fine‑tuning	RAG chatbot + agent app
Phase 4B	Weeks 35‑46	MLOps: experiment tracking, serving, monitoring	automated ML pipeline
Phase 5	Weeks 47‑58	Specialisation: recsys / time series / GNN	specialised project
Phase 6	Weeks 55‑65	Industry capstone: full‑stack ML system	production‑grade capstone

⚡ industry capstone tracks

🔍 recommendation engine

two‑tower neural CF + matrix factorisation
feast feature store, redis cache
A/B testing, drift monitoring

📄 RAG + LLM document QA

LangChain, GPT‑4o, pinecone
RAGAS eval, streaming FastAPI
QLoRA fine‑tuning, langsmith

pytorch 2.x · huggingface · vLLM · mlflow · kubeflow · evidently

📌 must‑have (first job)

python, numpy, pandas, sklearn
pytorch: train neural nets
3+ projects with github + demo
sql, evaluation metrics

📚 top free resources

fast.ai / cs229 (youtube)
huggingface course
karpathy's zero‑to‑hero
kaggle learn

🧠 interview prep

bias‑variance, backprop
attention / transformers
ML system design