MSc in Data Science
Semester-wise Syllabus for an MSc in Data Science
Semester 1: Foundations of Data Science
- Programming for Data Science (Python/R)
  - Python basics (NumPy, Pandas), R (tidyverse)
  - Data structures, loops, functions, and OOP concepts
- Mathematics for Data Science
  - Linear algebra (vectors, matrices, eigenvalues)
  - Calculus (gradients, optimization), Probability (distributions, Bayes’ theorem)
- Statistics for Data Science
  - Descriptive/inferential statistics, hypothesis testing
  - Regression analysis, ANOVA, non-parametric tests
- Data Wrangling & Visualization
  - Data cleaning (missing values, outliers)
  - Visualization tools (Matplotlib, Seaborn, ggplot2, Tableau)
- Database Management (SQL/NoSQL)
  - SQL queries (joins, subqueries), MongoDB basics
  - ETL processes, data pipelines
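As a small illustration of the probability material listed under Mathematics for Data Science, Bayes' theorem can be worked through in a few lines of Python. This is only a sketch and not part of the syllabus itself: the function name `posterior` and the screening-test numbers are hypothetical, chosen purely for the example.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) using Bayes' theorem."""
    # P(positive), by the law of total probability.
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")
```

With these assumed inputs the posterior stays well below the test's sensitivity, because the low prior dominates the result.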
Semester 2: Machine Learning & Big Data
- Machine Learning Fundamentals
  - Supervised learning (Linear Regression, Decision Trees, SVM)
  - Unsupervised learning (Clustering, PCA, K-means)
  - Model evaluation (cross-validation, ROC curves)
- Big Data Technologies
  - Hadoop ecosystem (HDFS, MapReduce)
  - Spark (PySpark, Spark SQL), distributed computing
- Advanced Statistics
  - Bayesian methods, time series analysis (ARIMA)
  - Experimental design (A/B testing)
- Cloud Computing for Data Science
  - AWS/GCP/Azure for data storage & processing
  - Serverless architectures (Lambda, BigQuery)
- Domain Elective (Choose 1)
  - Healthcare Analytics: EHR data, predictive modeling
  - Financial Data Science: Risk modeling, algorithmic trading
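To make the cross-validation item under Machine Learning Fundamentals concrete, the sketch below splits sample indices into k folds using only the Python standard library. It is a minimal illustration under stated assumptions, not a prescribed course implementation: the helper name `k_fold_indices` is hypothetical, and the folds are contiguous with no shuffling or stratification.

```python
from typing import Iterator


def k_fold_indices(n_samples: int, k: int) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


for train, test in k_fold_indices(n_samples=7, k=3):
    print("train:", train, "test:", test)
```

Each index appears in exactly one test fold, so every sample is used for evaluation once and for training in the remaining k - 1 rounds.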
Semester 3: Advanced Topics & Specializations
- Deep Learning
  - Neural networks (CNNs, RNNs, Transformers)
  - Frameworks: TensorFlow, PyTorch
- Natural Language Processing (NLP)
  - Text preprocessing, sentiment analysis, BERT
  - Topic modeling (LDA), chatbots
- Data Engineering
  - Airflow for workflow automation
  - Kafka for real-time data streaming
- Electives (Choose 2–3)
  - Computer Vision: Image classification, YOLO
  - Reinforcement Learning: Q-learning, Deep Q Networks
  - Graph Analytics: Network analysis, GNNs
  - Ethics in AI: Bias, fairness, GDPR compliance
- Industry Case Studies
  - Capstone project kickoff (problem statement, data sourcing)
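For the Q-learning topic in the Reinforcement Learning elective, a single tabular update step can be sketched as follows. This is an illustrative example only: the function name `q_update`, the two-action toy setup, and the default values of `alpha` (learning rate) and `gamma` (discount factor) are assumptions made for the sketch.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical toy problem: unseen (state, action) pairs default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
q_update(q, state=0, action="right", reward=1.0, next_state=1, actions=actions)
print(q[(0, "right")])
```

Repeating this update over many observed transitions moves the table towards the action values of the greedy policy; Deep Q Networks replace the table with a neural network.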
Semester 4: Capstone Project & Deployment
- Scalable Machine Learning
  - Model deployment (Flask, FastAPI, Docker)
  - MLOps (MLflow, Kubeflow)
- Business Intelligence & Storytelling
  - Dashboarding (Power BI, Dash)
  - Communicating insights to stakeholders
- Capstone Project
  - End-to-end project (e.g., recommendation engine, fraud detection)
  - GitHub portfolio, research paper (optional)
- Internship (Optional)
  - 6–8 weeks with industry partners (tech firms, startups)