# DATA SCIENCE WITH PYTHON

## Machine Learning & Analytics

# Course Description

In recent years, Python has become one of the most widely used languages in Data Science, and it provides a powerful and flexible platform for building Machine Learning systems. This course gives you the opportunity to experiment with a wide variety of Data Science and Machine Learning algorithms. We believe that a practical, hands-on approach is the key to meaningful learning and skills advancement, so we integrate real-life exercises and activities throughout our training to support long-term retention.

# Duration

40 Hours

- Introducing data science, with a focus on the job outlook and market requirements
- Data Science Project Life Cycle
- Basics of Statistics – Measures of Central Tendency and Measures of Dispersion
- Discrete and Continuous Distribution Functions
- Advanced Statistics Concepts – Sampling, Statistical Inference and Testing of Hypothesis
- Introduction to Python Programming, Anaconda and Spyder
- Installation and Configuration of Python
- Control Structures and Data Structures in Python
- Hands-on Applied Statistics Concepts using Python
- Functions and Packages in Python
- Graphics and Data Visualization Libraries in Python
- Introduction to Machine Learning
- Machine Learning Models and Case Studies with Python

# Target Audience

- Software developers and programmers who want to reap the benefits of a lucrative Data Science and Machine Learning career
- Data Analysts or Financial Analysts from the non-IT industry who want to make a transition to the IT industry
- Individuals, students and corporate professionals who want to upgrade their technical skill set

# Prerequisites

None

**Introduction to Data Science**

- Data Science Introduction
- Data Science Toolkit
- Job outlook
- Prerequisite, Target Audience
- Data Science Project Lifecycle – CRISP-DM Model

**Basics of Statistics**

- Statistics Concepts
- Random variable
- Types of random variables
- Measures of central tendency – Mean, Median, Mode
- Probability, Probability Distribution
- Random variables, PMF, PDF, CDF
- Types of variables (levels of measurement) – Nominal, Ordinal, Interval, Ratio
- Measures of dispersion – Variance, Standard Deviation
- Normal Distribution, Standard Normal Distribution
- Binomial Distribution
- Poisson Distribution
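
As a preview of the kind of exercise this module leads to, the sketch below computes the measures of central tendency and dispersion listed above and queries the standard normal distribution. It uses only Python's built-in `statistics` module. The `sales` figures are invented for illustration and are not course data.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample: daily sales counts invented for illustration.
sales = [12, 15, 15, 18, 20, 22, 15, 30]

mean = statistics.mean(sales)          # arithmetic mean
median = statistics.median(sales)      # middle value of the sorted data
mode = statistics.mode(sales)          # most frequent value
variance = statistics.variance(sales)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(sales)      # sample standard deviation

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)
# Probability that a value falls within one standard deviation of the mean.
p_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(mean, median, mode, round(std_dev, 2), round(p_within_one_sd, 4))
```

The standard deviation is the square root of the variance, which is why the two measures are normally taught together.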

**Advanced Statistics**

- Sampling
- Inferential Statistics
- Sampling Distribution
- Central Limit Theorem
- Simulation
- Null and Alternative Hypothesis
- Hypothesis Testing
- One-tailed and two-tailed tests, Type I and Type II errors
- z-test and t-test
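
The topics above can be illustrated with a short simulation. The sketch below is a minimal example under stated assumptions. It first shows the Central Limit Theorem by averaging repeated samples from a uniform population, then runs a one-sample, two-tailed z-test. The function name `two_tailed_z_test` and all the numbers (population mean 100, standard deviation 15, sample mean 106, sample size 36) are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the simulation is repeatable

# Central Limit Theorem: means of samples drawn from a non-normal (uniform)
# population cluster around the population mean of 0.5.
sample_means = [mean(random.random() for _ in range(30)) for _ in range(2000)]
clt_center = mean(sample_means)

def two_tailed_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z statistic, two-tailed p-value) for a one-sample z-test."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical numbers: H0 says the population mean is 100 (sd 15);
# a sample of 36 observations has mean 106.
z_stat, p_value = two_tailed_z_test(106, 100, 15, 36)
reject_null = p_value < 0.05  # 5% significance level (Type I error rate)

print(round(clt_center, 3), round(z_stat, 2), round(p_value, 4), reject_null)
```

A z-test assumes the population standard deviation is known. When it has to be estimated from the sample, a t-test is the usual choice.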

**Python Programming for Data Science (Lab)**

- Introduction to Python, Anaconda & Spyder, Installation & Configuration
- Data Structures in Python
- Lists
- Tuples
- Arrays in NumPy
- Matrices
- DataFrames in Pandas
- Control Structures & Functions – If-Else, For loop, While loop
- Slicing, dicing & filter operations
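
The following sketch gives a flavour of the lab material using only built-in Python types. NumPy arrays and Pandas DataFrames, which the lab also covers, are left out so the example runs without extra packages. The `scores` data and the `grade` function are hypothetical.

```python
# Hypothetical exam scores used only for illustration.
scores = [72, 88, 95, 61, 79, 84]        # list: ordered and mutable
point = (3, 4)                            # tuple: ordered and immutable

first_three = scores[:3]                  # slicing: items at index 0, 1, 2
every_other = scores[::2]                 # slicing with a step of 2

def grade(score):
    """Map a numeric score to a letter grade using if/elif/else."""
    if score >= 85:
        return "A"
    elif score >= 70:
        return "B"
    else:
        return "C"

grades = []
for s in scores:                          # for loop over a list
    grades.append(grade(s))

passed = [s for s in scores if s >= 70]   # filter with a list comprehension

total, i = 0, 0
while i < len(scores):                    # while loop with an explicit counter
    total += scores[i]
    i += 1

print(first_three, every_other, grades, passed, total)
```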

**Applied Statistics in Python (Lab)**

- Normal distribution
- Simulation
- Hypothesis testing
- Other statistical concepts using Python

**Graphics and Data Visualization, Exploratory Data Analysis in Python (Lab)**

- Graphics and Data Visualization libraries in Python
- Plotly
- Matplotlib
- Seaborn
- Other useful packages/functions in Python
- Exploratory Data Analysis Exercise in Python

**Machine Learning Concepts**

- Introduction to Machine Learning
- Supervised and Unsupervised ML, Parametric and Non-parametric Machine Learning Algorithms
- Machine Learning Models
- Linear Regression
- Logistic Regression
- Classification & KNN
- Decision trees
- Random Forest
- Clustering – K-Means and Hierarchical Clustering
- Time Series Analysis
- ARIMA Models
- Support Vector Machine
- Model Validation/Cross validation techniques, Parameter tuning
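
To make the ideas of classification, cross-validation and parameter tuning concrete, here is a minimal from-scratch sketch of a K-Nearest Neighbours classifier evaluated with leave-one-out cross-validation. It is an illustration under stated assumptions, not the course's implementation. In practice a library such as scikit-learn would typically be used. The toy `points` and `labels` and the helper names `knn_predict` and `leave_one_out_accuracy` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: (feature_1, feature_2) -> class label.
points = [(1.0, 1.2), (1.5, 0.8), (0.9, 1.0), (1.2, 1.4),
          (4.0, 4.2), (4.5, 3.8), (3.9, 4.1), (4.2, 4.4)]
labels = ["low", "low", "low", "low", "high", "high", "high", "high"]

def knn_predict(train_x, train_y, query, k):
    """Predict the majority label among the k nearest training points."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    nearest = by_distance[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(xs, ys, k):
    """Cross-validation: hold out each point once and predict it from the rest."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return hits / len(xs)

# Parameter tuning: compare a few values of k by cross-validated accuracy.
accuracy_by_k = {k: leave_one_out_accuracy(points, labels, k) for k in (1, 3, 5)}
prediction = knn_predict(points, labels, (4.1, 4.0), k=3)

print(accuracy_by_k, prediction)
```

Evaluating each candidate `k` on held-out points, rather than on the data used for prediction, is what keeps the accuracy estimate honest.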

**ML Case Studies on**

- Regression
- Classification
- Decision Tree
- Random Forest
- Clustering
- Time Series Analysis