### Course Overview

Data Science, in simple terms, is about generating critical business value from data in creative ways. It can also be defined as a mix of data research, algorithms and technology used to solve complex analytical problems. Companies are generating data at an exponential pace. The usable form of that data differs for different groups of people working in an organization. Data Science helps us explore data at a granular level and find the insights we need. It is about being analytical and inquisitive: asking new questions, exploring, and continually learning are all part of the job for Data Scientists.

Our Data Science Training Course in Pune helps you understand and acquire in-depth Data Analytics skills and techniques using the R and Python languages. With our experienced, professional trainers and a hands-on Data Science Course, we make sure that candidates are well versed in the techniques and get the maximum benefit from the course. The Data Analytics Course with R prepares you for jobs at large MNCs, and our placement team supports your job search by arranging placement calls.

### Course Features

- Resume & Interview Preparation Support
- Hands-on Project Experience
- 100% Placement Assistance
- Multiple Flexible Batches
- Missed Sessions Covered
- Practice Course Material

### At the end of the Data Science Training Course, participants will be able to:

- Understand what Data Science is and the skill sets needed to be a data scientist.
- Understand basic terms such as statistical inference and probability distributions.
- Understand the Data Science Process and how its components interact.
- Understand basic machine learning algorithms (Linear Regression, k-Nearest Neighbours (k-NN), k-means, Naive Bayes) for predictive modeling.
- Explain why Linear Regression and k-NN are poor choices for filtering spam, and why Naive Bayes is a better alternative.
- Identify common approaches used for Feature Generation. Identify basic Feature Selection algorithms (Filters, Wrappers, Decision Trees, Random Forests) and use them in applications.
- Identify and explain fundamental mathematical and algorithmic ingredients that constitute a Recommendation Engine (dimensionality reduction, singular value decomposition, principal component analysis). Build their own recommendation system using existing components.
- Work effectively (and synergistically) in teams on data science projects.

### Course Duration

- Weekends: 40-50 hours

### Prerequisites

- Basic knowledge of any programming Language.
- Basic knowledge of Database (SQL) and files (MS Excel, CSV etc.)
- Basic high school Algebra and Geometry

### Who Should Attend?

- Developers aspiring to become Data Scientists.
- Freshers / Experienced IT Professionals.
- Big Data Professionals.
- Professionals who want to understand machine learning.

### Course

**1.1 Statistics**

- Basic Statistics
- Types of Distributions
- Measures of Central Tendency
- Arithmetic Mean / Average
- Mean, Mode, Median
- Statistical Inference: Estimation.
- Statistical Inference: Decision Making.
- Decision Making: Hypothesis Testing.
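
As a small illustration of the measures of central tendency listed above, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up sample data, not course material.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean_score = statistics.mean(scores)      # arithmetic mean: sum / count
median_score = statistics.median(scores)  # middle value (average of the two middle values here)
mode_score = statistics.mode(scores)      # most frequent value

print(mean_score, median_score, mode_score)
```

The mean is pulled upward by the largest value (10), while the median and mode are not, which is one reason all three measures are usually examined together.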

**1.2 Probability and Probability Distributions**

- Standard Normal Distribution
- Normal Distribution
- Geometric Distribution
- Poisson Distribution
- Binomial Distribution

**2.1 Introduction: What is Data Science?**

- What is Data Science?
- Importance of Data Science.
- Demand for Data Science Professionals.
- Brief Introduction to Big Data and Data Analytics.
- Lifecycle of Data Science.
- Tools and Technologies used in Data Science.
- Business Intelligence vs Data Science.
- Role of a data scientist.

**2.2 Introduction to R**

- Understanding R
- Which Companies Use R?
- Understanding Comprehensive R Archive Network (CRAN)
- How to Install R on Operating Systems?
- How to Install R on Windows from CRAN Website?
- IDEs for R
- R Packages: Installation and Practice
- Understanding R Programming
- Studying Operators in R
- Operators: Arithmetic, Relational, Logical, Assignments
- Statements in R Programming
- Conditional Statements in R
- Break and Next Statement
- ifelse() Function
- switch() Function
- scan() Function
- Loops in R
- How to Run an R Script and Batch Script?
- R Functions: Commonly Used and String Functions

**2.3 Basic Operations in R Programming**

- Types of Objects in R
- Naming standards in R
- Creating Objects in R
- Data Structure in R
- Matrix, Data Frame, String, Vectors
- Understanding Vectors & Data input in R
- Lists, Data Elements
- Creating Data Files using R

**2.4 Data Handling in R Programming**

- Basic Operations in R – Expressions, Constant Values, Arithmetic, Function Calls, Symbols
- Sub-setting Data
- Selecting (Keeping) Variables
- Excluding (Dropping) Variables
- Selecting Observations and Selection using Subset Function
- Merging Data
- Sorting Data
- Adding Rows
- Visualization using R
- Data Type Conversion
- Built-In Numeric Functions
- Built-In Character Functions
- User Built Functions
- Control Structures
- Loop Functions

**2.5 R Data Structures**

- Variables in R
- Scalars
- Vectors
- Matrices
- List
- Data frames
- cbind, rbind, attach and detach functions in R
- Factors
- Getting a subset of Data
- Missing values
- Converting between vector types
- How to Import Files in R?
- How to Import an Excel File?
- How to Import Minitab File?
- Importing Table and CSV Files
- Importing Data from SQL Databases
- Types of Apply Functions
- apply() Function: lapply, sapply, tapply
- vapply() Function, mapply() Function
- Understanding the dplyr Package

**2.6 Using Functions in R**

- Apply Function Family
- Commonly used Mathematical Functions
- Commonly used Summary Functions
- Commonly used String Functions
- User defined functions
- Local and global variables
- Working with dates

**2.7 Introduction To Machine Learning**

- What is Machine Learning?
- What is the Challenge?
- Why Learn?
- When is Learning required?
- Data Mining
- Application Areas and Roles
- Supervised Learning
- Unsupervised Learning
- Reinforcement learning

**2.8 Machine Learning Concepts & Terminologies**

- Key tasks of Machine Learning
- Modelling Terminologies
- Learning a Class from Examples
- Probability and Inference
- PAC (Probably Approximately Correct) Learning
- Noise
- Noise and Model Complexity
- Triple Trade-Off
- Association Rules
- Association Measures

**2.9 Linear Regression**

- Introduction to Linear Regression
- Linear Regression with Multiple Variables
- Disadvantage of Linear Models
- Interpretation of Model Outputs
- Understanding *Covariance* and *Collinearity*

**2.10 Hypothesis Testing**

- Hypothesis Testing
- Null Hypothesis, P-Value
- Need for Hypothesis Testing in Business
- Two-tailed, Left-tailed & Right-tailed tests
- Hypothesis Testing Outcomes: Type I & II errors
- Parametric vs Non-Parametric Testing
- Parametric Tests, t-Tests: One-sample, Two-sample, Paired
- One Way ANOVA
- Importance of Parametric Tests
- Non-Parametric Tests: Chi-Square, Mann-Whitney, Kruskal-Wallis etc.
- Which Test to Choose?
- Ascertaining accuracy of data
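
As a rough illustration of the one-sample t-test named above, the sketch below computes only the test statistic, t = (sample mean - mu0) / (s / sqrt(n)), with the Python standard library. The helper name `one_sample_t` and the measurements are hypothetical. Turning t into a p-value needs the t distribution with n - 1 degrees of freedom, which is left out here.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

# Hypothetical measurements; the null hypothesis is that the true mean is 5.0.
measurements = [5.1, 4.9, 5.3, 5.2, 5.0]
t_stat = one_sample_t(measurements, mu0=5.0)

print(round(t_stat, 3))
```

A larger absolute value of t means the sample mean lies further from the hypothesised mean relative to the sampling noise.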

**2.11 Decision Trees And Supervised Learning**

- Decision Tree – data set
- How to build decision tree?
- Understanding the CART Model
- Classification Rules and the Overfitting Problem
- Stopping Criteria and Pruning
- How to find the final size of trees?
- Modelling a Decision Tree.
- Naive Bayes
- Random Forests and Support Vector Machines
- Interpretation of Model Outputs

**2.12 Unsupervised Classification Algorithms**

- Understanding the working of the k-Means Algorithm
- Cluster Size Optimization vs Definition Optimization
- Hierarchical and non-hierarchical
- k-Medoids and Fuzzy k-Means
- Case study for clustering
- Hierarchical Clustering
- k-Means algorithm for clustering: groupings of unlabeled data points.
- Principal Component Analysis (PCA) - Data
- Independent Component Analysis (ICA)
- Anomaly Detection
- Recommender System: collaborative filtering algorithm
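
To give a feel for how k-means groups unlabeled points, here is a minimal sketch of the standard assign-then-update loop (Lloyd's algorithm) on one-dimensional data in plain Python. The function name `kmeans_1d`, the starting centroids and the data points are illustrative assumptions. Real projects would normally use a library implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Repeat two steps: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Keep the old centroid if a cluster received no points.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data with two visibly separate groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
final_centroids, final_clusters = kmeans_1d(points, centroids=[1.0, 9.0])

print(final_centroids)
print(final_clusters)
```

The result depends on the starting centroids, which is why k-means is usually run several times with different initial values.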

**2.13 Introduction to Deep Learning**

- Neural Network
- Understanding the Neural Network Model
- Understanding Tuning of Neural Network

### FAQ

There are many job opportunities for freshers on the various job portals. The key thing an employer wants to know is whether you have the conceptual knowledge. The projects provided by ExcelR across various concepts reinforce your learning and help make you market-ready for these jobs.

R has approximately 50% market share and is open source (free of cost), which makes it very attractive in the analytics space. Almost all of these jobs ask for experience with and exposure to R. Demand for other statistical tools is decreasing steadily, so we recommend looking ahead and investing time in learning R.

Yes, you need to carry your own laptop. To start with, you need R and RStudio installed on your system. Both are open source, and in the first class the trainer will help you set up the environment on your system.