Data Science with R Online Training

Data Science with R Programming Course Content

Essentials of R Programming

An Introduction to R

  • History of R
  • Introduction to R
  • The R environment
  • What is Statistical Programming?
  • Why use a command line?
  • Your first R session

Introduction to the R language

  • Starting and quitting R
  • Recording your work

Basic features of R

  • Calculating with R
  • Named storage
  • Functions
  • Exact or approximate?
  • R is case-sensitive
  • Listing the objects in the workspace
  • Vectors
  • Extracting elements from vectors
  • Vector arithmetic
  • Simple patterned vectors
  • Missing values and other special values
  • Character vectors
  • Factors
  • More on extracting elements from vectors
  • Matrices and arrays
  • Data frames
  • Dates and times

Import and Export data in R

Importing data into R

  • CSV File
  • Excel File
  • Import data from text table
  • SAS and SPSS datasets

Exporting Data from R

  • CSV File
  • Text Table
  • Excel File
  • SAS dataset
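
As a rough illustration of the CSV and text-table topics above, the sketch below writes a small made-up data frame to temporary files and reads it back using base R only. The data frame and file paths are hypothetical. Excel, SAS, and SPSS files generally need add-on packages (for example readxl, haven, or foreign), which are not shown here.

```r
# Hypothetical example data; any data frame would do.
df <- data.frame(id = 1:3, score = c(9.5, 7.25, 8))

# Temporary files so the example leaves nothing behind.
csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")

# Export: comma-separated and tab-separated text.
write.csv(df, csv_path, row.names = FALSE)
write.table(df, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Import: read both files back into data frames.
from_csv <- read.csv(csv_path)
from_txt <- read.table(txt_path, header = TRUE, sep = "\t")
```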

Merge / Join

  • Inner Join
  • Left Join
  • Right Join
  • Full Join
  • Anti Join
  • Semi Join
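
The join types listed above can be sketched with base R's merge(). The two tables and their column names are invented for illustration. Base R has no dedicated semi or anti join function, so those two are approximated here with logical subscripts on the key column.

```r
# Two small, made-up tables sharing the key column `id`.
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"),
                        stringsAsFactors = FALSE)
orders    <- data.frame(id = c(2, 3, 4), amount = c(50, 75, 20))

inner <- merge(customers, orders, by = "id")                # ids present in both
left  <- merge(customers, orders, by = "id", all.x = TRUE)  # keep every customer
right <- merge(customers, orders, by = "id", all.y = TRUE)  # keep every order
full  <- merge(customers, orders, by = "id", all = TRUE)    # keep every id

# Semi join: customers that have a matching order (no order columns added).
semi <- customers[customers$id %in% orders$id, ]
# Anti join: customers with no matching order.
anti <- customers[!(customers$id %in% orders$id), ]
```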

Programming statistical graphics

High-level plots

  • Bar charts and dot charts
  • Pie charts
  • Histograms
  • Box plots
  • Scatterplots
  • QQ plots
  • Density plots

Choosing a high-level graphic

Low-level graphics functions

  • The plotting region and margins
  • Adding to plots
  • Setting graphical parameters

Programming with R

Flow control

  • The for() loop
  • The if() statement
  • The while() loop
  • The repeat loop, and the break and next statements
  • apply()
  • sapply()
  • lapply()
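
A minimal sketch of the flow-control topics above, using made-up values. It compares an explicit for() loop with the apply family and shows break and next inside a while() loop.

```r
x <- c(1, 2, 3, 4)

# for() loop: fill a preallocated result vector one element at a time.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# sapply() applies a function to each element and simplifies to a vector.
squares_sapply <- sapply(x, function(v) v^2)

# lapply() does the same but always returns a list.
squares_list <- lapply(x, function(v) v^2)

# apply() works over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)

# while() with break and next: add up the odd numbers below 10.
total <- 0
n <- 0
while (TRUE) {
  n <- n + 1
  if (n >= 10) break      # leave the loop
  if (n %% 2 == 0) next   # skip even numbers
  total <- total + n
}
```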

Managing complexity through functions

  • What are functions?
  • Scope of variables

Data Manipulation Techniques using R programming

Data in R

  • Modes and Classes
  • Data Storage in R
  • Testing for Modes and Classes
  • Structure of R Objects
  • Conversion of Objects
  • Missing Values
  • Working with Missing Values

Reading and Writing Data

  • Reading Vectors and Matrices
  • Data Frames: read.table
  • Comma- and Tab-Delimited Input Files
  • Fixed-Width Input Files
  • Extracting Data from R Objects
  • Connections
  • Reading Large Data Files
  • Generating Data
      1. Sequences
      2. Random Numbers
      3. Permutations
      4. Random Permutations
      5. Enumerating All Permutations
  • Working with Sequences
  • Spreadsheets
      1. The RODBC Package on Windows
      2. The gdata Package (All Platforms)
  • Saving and Loading R Data Objects
  • Working with Binary Files
  • Writing R Objects to Files in ASCII Format
      1. The write Function
      2. The write.table Function
  • Reading Data from Other Programs

Dates

  • Date
  • The chron Package
  • POSIX Classes
  • Working with Dates
  • Time Intervals
  • Time Sequences
  • Current time
  • Present date
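
The date topics above might look like this in base R. This is a sketch with arbitrary example dates; the chron package is not used here.

```r
# Date class: parse and reformat a calendar date.
d  <- as.Date("2024-03-15")
d2 <- as.Date("2024-04-01")
formatted <- format(d, "%d/%m/%Y")

# Time interval: subtracting Dates gives a difftime measured in days.
gap_days <- as.numeric(d2 - d)

# Time sequence: three dates one week apart.
weekly <- seq(d, by = "week", length.out = 3)

# Present date (Date) and current time (POSIXct).
today <- Sys.Date()
now   <- Sys.time()
```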

Factors

  • Using Factors
  • Numeric Factors
  • Manipulating Factors
  • Creating Factors from Continuous Variables
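
To make the factor topics above concrete, here is a small sketch with invented ages. The break points and labels are assumptions chosen only for illustration.

```r
ages <- c(5, 17, 30, 64, 80)

# cut() bins a continuous variable into a factor.
# Intervals are right-closed by default: (0,18], (18,65], (65,Inf].
age_group <- cut(ages, breaks = c(0, 18, 65, Inf),
                 labels = c("child", "adult", "senior"))
counts <- table(age_group)

# A factor whose levels look like numbers: as.numeric() alone
# returns the internal level codes, not the original values.
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)
values <- as.numeric(as.character(f))
```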

Subscripting

  • Basics of Subscripting
  • Numeric Subscripts
  • Character Subscripts
  • Logical Subscripts
  • Subscripting Matrices and Arrays
  • Specialized Functions for Matrices
  • Lists
  • Subscripting Data Frames

Character Manipulation

  • Basics of Character Data
  • Displaying and Concatenating Character Strings
  • Working with Parts of Character Values
  • Regular Expressions in R
  • Basics of Regular Expressions
  • Breaking Apart Character Values
  • Using Regular Expressions in R
  • Substitutions and Tagging
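
A brief sketch of the character-manipulation topics above, using base R functions on invented strings:

```r
x <- c("apple-01", "banana-22", "cherry")

# Concatenating and taking parts of character values.
joined <- paste("R", "lang", sep = "-")
prefix <- substr("statistics", 1, 4)

# Regular expressions: test for a match, then substitute.
ends_in_digits <- grepl("[0-9]+$", x)
stripped       <- sub("-[0-9]+$", "", x)

# Breaking apart character values on a comma or semicolon.
parts <- strsplit("a,b;c", "[,;]")[[1]]

# Tagging: parenthesised groups are reused as \\1 and \\2 in the replacement.
swapped <- gsub("([a-z]+)-([0-9]+)", "\\2_\\1", x)
```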

Reshaping Data

  • Modifying Data Frame Variables
  • Recoding Variables
  • The recode Function
  • Reshaping Data Frames
  • The reshape Package
  • Combining Data Frames

Data Manipulation

  • Random Selection of rows and columns
  • Summarization
  • Sort, Arrange
  • Group by
  • Filter

Missing Values and Outliers

  • Identify missing values
  • Impute missing values
  • Identify outliers
  • Cap outliers
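
One possible way to carry out the steps above in base R is sketched below. Median imputation and the 1.5 × IQR fences are assumptions picked for illustration; the course may use other rules.

```r
x <- c(2, 4, NA, 6, 8, 100)

# Identify missing values.
missing_flags <- is.na(x)
n_missing     <- sum(missing_flags)

# Impute with the median of the observed values (one simple choice among many).
x_imp <- x
x_imp[missing_flags] <- median(x, na.rm = TRUE)

# Identify outliers with the 1.5 * IQR rule.
q <- unname(quantile(x_imp, c(0.25, 0.75)))
iqr <- q[2] - q[1]
fence_low  <- q[1] - 1.5 * iqr
fence_high <- q[2] + 1.5 * iqr
is_outlier <- x_imp < fence_low | x_imp > fence_high

# Cap outliers: pull values outside the fences back to the nearest fence.
x_capped <- pmin(pmax(x_imp, fence_low), fence_high)
```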