2023 2024 MBA

Arvind Kumar 30th November 2020 09:56 AM

Sathyabama Institute of Science and Technology BE CSE SCSA3016 Data Science Syllabus
 
SATHYABAMA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL OF COMPUTING

SCSA3016 DATA SCIENCE
L T P Credits Total Marks
3 0 0 3 100

UNIT 1 LINEAR ALGEBRA 9 Hrs.
Algebraic view - vectors in 2D, 3D and nD, matrices, product of matrix and vector, rank, null space, solution of an overdetermined
set of equations and pseudo-inverse. Geometric view - vectors, distance, projections, eigenvalue decomposition, equations
of line, plane, hyperplane, circle, sphere, hypersphere.
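
As an illustration of the Unit 1 topic "solution of an overdetermined set of equations and pseudo-inverse", here is a minimal sketch using NumPy. The matrix A and vector b are made-up example data, not taken from the syllabus; the calls used (np.linalg.pinv, np.linalg.lstsq, np.linalg.matrix_rank) are standard NumPy functions.

```python
import numpy as np

# Illustrative overdetermined system: 4 equations, 2 unknowns.
# Each row is [1, t]; b was chosen so that b = 1 + 2*t exactly.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 3.0, 5.0, 7.0])

# Least-squares solution via the Moore-Penrose pseudo-inverse.
x_pinv = np.linalg.pinv(A) @ b

# The same solution from NumPy's least-squares solver, for comparison.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# Rank of A: full column rank (2) means the least-squares solution is unique.
rank = np.linalg.matrix_rank(A)

print("pseudo-inverse solution:", x_pinv)
print("lstsq solution:         ", x_lstsq)
print("rank of A:              ", rank)
```

When A has full column rank, the pseudo-inverse solution coincides with the ordinary least-squares solution, which is why both approaches agree here.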

UNIT 2 PROBABILITY AND STATISTICS 9 Hrs.
Introduction to probability and statistics, Population and sample, Normal (Gaussian) distribution, Probability Density
Function, Descriptive statistics, notion of probability, distributions, mean, variance, covariance, covariance matrix,
understanding univariate and multivariate normal distributions, introduction to hypothesis testing, confidence interval for
estimates.
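
To make the Unit 2 terms concrete (mean, variance, covariance matrix, confidence interval), here is a small sketch. The data are synthetic and the 95% interval uses a normal approximation; both are assumptions for illustration only.

```python
import numpy as np
from statistics import NormalDist

# Synthetic two-variable sample (illustrative only): 500 draws,
# column 0 ~ N(0, 1), column 1 ~ N(5, 2^2).
rng = np.random.default_rng(0)
data = rng.normal(loc=[0.0, 5.0], scale=[1.0, 2.0], size=(500, 2))

mean = data.mean(axis=0)            # sample mean of each column
var = data.var(axis=0, ddof=1)      # unbiased sample variance of each column
cov = np.cov(data, rowvar=False)    # 2x2 sample covariance matrix

# 95% confidence interval for the mean of column 0,
# using the normal approximation (reasonable for n = 500).
n = data.shape[0]
z = NormalDist().inv_cdf(0.975)     # two-sided 95% critical value
se = np.sqrt(var[0] / n)            # standard error of the mean
ci_low, ci_high = mean[0] - z * se, mean[0] + z * se

print("mean:", mean)
print("covariance matrix:\n", cov)
print("95% CI for mean of column 0:", (ci_low, ci_high))
```

The diagonal of the covariance matrix holds the per-variable variances, and the off-diagonal entries hold the covariance between the two variables.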

UNIT 3 EXPLORATORY DATA ANALYSIS AND THE DATA SCIENCE PROCESS 9 Hrs.
Exploratory Data Analysis and the Data Science Process - Basic tools (plots, graphs and summary statistics) of EDA -
Philosophy of EDA - The Data Science Process - Data Visualization - Basic principles, ideas and tools for data visualization
- Examples of exciting projects - Data Visualization using Tableau.
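
The "summary statistics" part of EDA in Unit 3 can be sketched with Python's standard library alone. The helper name summarize and the sample values are hypothetical, chosen only to show how one outlier pulls the mean away from the median.

```python
import statistics

def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    ordered = sorted(values)
    # Quartile cut points (default 'exclusive' method of statistics.quantiles).
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "stdev": statistics.stdev(ordered),
    }

# Illustrative sample containing one outlier (40).
sample = [12, 15, 11, 19, 14, 13, 40, 16]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

Comparing the mean with the median is a quick first check for skew before plotting the data.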

UNIT 4 MACHINE LEARNING TOOLS, TECHNIQUES AND APPLICATIONS 9 Hrs.
Supervised Learning, Unsupervised Learning, Reinforcement Learning, Dimensionality Reduction, Principal Component
Analysis, Classification and Regression models, Tree and Bayesian network models, Neural Networks, Testing, Evaluation
and Validation of Models.
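
Principal Component Analysis from Unit 4 ties back to the eigenvalue decomposition and covariance matrix of Units 1 and 2. The following is one possible sketch using NumPy; the function name pca and the synthetic data are assumptions, and production code would more commonly use a library implementation.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components.

    Returns the projected data and the variance explained by each component.
    """
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is for symmetric matrices; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    components = eigvecs[:, order]
    return X_centered @ components, eigvals[order]

# Synthetic 2D data lying close to the line y = 2x (illustrative only).
rng = np.random.default_rng(1)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=200)])

projected, variances = pca(X, n_components=1)
total_variance = np.trace(np.cov(X, rowvar=False))
print("fraction of variance kept:", variances[0] / total_variance)
```

Because the two columns are almost perfectly correlated, a single component retains nearly all of the variance, which is the sense in which PCA reduces dimensionality.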

UNIT 5 INTRODUCTION TO PYTHON 9 Hrs.
Data structures - Functions - NumPy - Matplotlib - Pandas - Problems based on computational complexity - Simple case studies
based on Python (binary search, common elements in a list), hash tables, dictionary.
Max. 45 Hrs.
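
The two case studies named in Unit 5, binary search and common elements in a list, can be sketched as below. The function names are hypothetical, and the common-elements version uses a hash-based set as one reasonable way to connect the problem to the "hash tables" topic.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent.

    Runs in O(log n) time; the input must already be sorted ascending.
    """
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1      # target can only be in the right half
        else:
            hi = mid - 1      # target can only be in the left half
    return -1


def common_elements(a, b):
    """Return the elements of a that also occur in b, without duplicates.

    Building a set from b gives average O(1) membership tests, so the
    whole function is O(len(a) + len(b)) rather than O(len(a) * len(b)).
    """
    in_b = set(b)
    emitted = set()
    result = []
    for item in a:
        if item in in_b and item not in emitted:
            result.append(item)
            emitted.add(item)
    return result


print(binary_search([1, 3, 5, 7, 9], 7))
print(common_elements([1, 2, 2, 3, 4], [2, 4, 6]))
```

The nested-loop alternative for common elements is simpler to write but quadratic, which makes the pair a convenient example for the "problems based on computational complexity" item.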

COURSE OUTCOMES
On completion of the course, the student will be able to
CO1 - Explain the basic terms of Linear Algebra and Statistical Inference.
CO2 - Describe the Data Science process and how its components interact.
CO3 - Apply EDA and the Data Science process in a case study.
CO4 - Classify Data Science problems.
CO5 - Analyse and correlate the results to the solutions.
CO6 - Simulate Data Visualization in exciting projects.

TEXT / REFERENCE BOOKS
1. Cathy O’Neil and Rachel Schutt. Doing Data Science: Straight Talk from the Frontline. O’Reilly. 2014.
2. Gilbert Strang. Introduction to Linear Algebra, 5th Edition. Wellesley-Cambridge Press. 2016.
3. Douglas Montgomery. Applied Statistics and Probability for Engineers. 2016.
4. Jure Leskovec, Anand Rajaraman and Jeffrey Ullman. Mining of Massive Datasets. v2.1, Cambridge University Press. 2014. (free online)
5. Avrim Blum, John Hopcroft and Ravindran Kannan. Foundations of Data Science.
6. Jiawei Han, Micheline Kamber and Jian Pei. Data Mining: Concepts and Techniques, 3rd Edition. ISBN 0123814790. 2011.
7. Trevor Hastie, Robert Tibshirani and Jerome Friedman. Elements of Statistical Learning, 2nd Edition. ISBN 0387952845. 2009. (free online)

END SEMESTER EXAMINATION QUESTION PAPER PATTERN
Max. Marks : 100 Exam Duration : 3 Hrs.
PART A : 10 Questions of 2 marks each-No choice 20 Marks
PART B : 2 Questions from each unit with internal choice, each carrying 16 marks 80 Marks

