 
 
Sathyabama Institute of Science and Technology BE CSE SCSA3016 Data Science Syllabus

SATHYABAMA INSTITUTE OF SCIENCE AND TECHNOLOGY
SCHOOL OF COMPUTING

SCSA3016 DATA SCIENCE

L  T  P  Credits  Total Marks
3  0  0  3        100

UNIT 1 LINEAR ALGEBRA 9 Hrs.
Algebraic view – vectors 2D, 3D and nD, matrices, product of matrix & vector, rank, null space, solution of overdetermined set of equations and pseudoinverse. Geometric view – vectors, distance, projections, eigenvalue decomposition, equations of line, plane, hyperplane, circle, sphere, hypersphere.

UNIT 2 PROBABILITY AND STATISTICS 9 Hrs.
Introduction to probability and statistics, Population and sample, Normal and Gaussian distributions, Probability Density Function, Descriptive statistics, notion of probability, distributions, mean, variance, covariance, covariance matrix, understanding univariate and multivariate normal distributions, introduction to hypothesis testing, confidence interval for estimates.

UNIT 3 EXPLORATORY DATA ANALYSIS AND THE DATA SCIENCE PROCESS 9 Hrs.
Exploratory Data Analysis and the Data Science Process – Basic tools (plots, graphs and summary statistics) of EDA – Philosophy of EDA – The Data Science Process – Data Visualization – Basic principles, ideas and tools for data visualization – Examples of exciting projects – Data Visualization using Tableau.

UNIT 4 MACHINE LEARNING TOOLS, TECHNIQUES AND APPLICATIONS 9 Hrs.
Supervised Learning, Unsupervised Learning, Reinforcement Learning, Dimensionality Reduction, Principal Component Analysis, Classification and Regression models, Tree and Bayesian network models, Neural Networks, Testing, Evaluation and Validation of Models.

UNIT 5 INTRODUCTION TO PYTHON 9 Hrs.
Data structures – Functions – NumPy – Matplotlib – Pandas – Problems based on computational complexity – Simple case studies based on Python (binary search, common elements in list), hash tables, dictionary.

Max. 45 Hrs.
COURSE OUTCOMES
On completion of the course, student will be able to
CO1 – Explain the basic terms of Linear Algebra and Statistical Inference.
CO2 – Describe the Data Science process and how its components interact.
CO3 – Apply EDA and the Data Science process in a case study.
CO4 – Classify Data Science problems.
CO5 – Analyse and correlate the results to the solutions.
CO6 – Simulate Data Visualization in exciting projects.

TEXT / REFERENCE BOOKS
1. Cathy O’Neil and Rachel Schutt. Doing Data Science, Straight Talk From The Frontline. O’Reilly. 2014.
2. Gilbert Strang. Introduction to Linear Algebra, 5th Edition. Wellesley-Cambridge Press. 2016.
3. Douglas Montgomery. Applied Statistics and Probability For Engineers. 2016.
4. Jure Leskovec, Anand Rajaraman and Jeffrey Ullman. Mining of Massive Datasets. v2.1, Cambridge University Press. 2014. (free online)
5. Avrim Blum, John Hopcroft and Ravindran Kannan. Foundations of Data Science.
6. Jiawei Han, Micheline Kamber and Jian Pei. Data Mining: Concepts and Techniques, 3rd Edition. ISBN 0123814790. 2011.
7. Trevor Hastie, Robert Tibshirani and Jerome Friedman. Elements of Statistical Learning, 2nd Edition. ISBN 0387952845. 2009. (free online)

END SEMESTER EXAMINATION QUESTION PAPER PATTERN
Max. Marks : 100    Exam Duration : 3 Hrs.
PART A : 10 Questions of 2 marks each – No choice    20 Marks
PART B : 2 Questions from each unit with internal choice, each carrying 16 marks    80 Marks
