Bishwajit Ghose

Course outline

Installing Jupyter

Basics of data analysis with Pandas

  • Introduction to Python packages: Pandas, NumPy
  • Importing non-Python datasets (Excel, JSON, Parquet)
  • Basic functions (shape, head, tail, dtype, loc/iloc)
  • Data cleaning (duplicates, outliers and missing values)
  • Data types (conversion between data types)
  • Renaming and deleting columns 
  • Concatenating and transforming 
  • Subsetting data frames 
  • Filtering and selecting
  • Aggregation and grouping 
  • Groupby function 
  • Reshaping data frames: stack() and unstack()
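
As an illustration of several topics in this list, here is a minimal sketch in Pandas. The toy data and the column names (city, year, sales) are hypothetical and are not taken from the course material; the sketch only shows one possible way to chain these steps.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "city": ["Dhaka", "Dhaka", "Khulna", "Khulna", "Khulna"],
    "year": ["2020", "2021", "2020", "2021", "2021"],
    "sales": [10.0, None, 7.0, 9.0, 9.0],
})

# Basic inspection: number of rows and columns, first rows.
n_rows, n_cols = df.shape
first_rows = df.head(2)

clean = (
    df.drop_duplicates()                     # drop the repeated Khulna/2021 row
      .fillna({"sales": 0.0})                # replace the missing sales value with 0
      .astype({"year": int})                 # convert year from string to integer
      .rename(columns={"sales": "revenue"})  # rename a column
)

# Filtering and selecting with a boolean mask and loc.
khulna = clean.loc[clean["city"] == "Khulna", ["year", "revenue"]]

# Aggregation with groupby: total revenue per city.
totals = clean.groupby("city")["revenue"].sum()

# Reshaping: unstack() moves the inner index level (year) into columns,
# and stack() moves the columns back into the index.
wide = clean.set_index(["city", "year"])["revenue"].unstack()
long_form = wide.stack()
```

Filling missing values with 0 is only one choice made for this sketch; dropping the rows or imputing a mean are equally plausible depending on the data.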

Correlation analysis (parametric and non-parametric)
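
A minimal sketch of the distinction, assuming Pearson's coefficient as the parametric measure and Spearman's rank coefficient as the non-parametric one (the outline does not name specific coefficients). The data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: y grows monotonically with x, but not linearly.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [1, 4, 9, 16, 25]})

# Pearson measures linear association on the raw values.
pearson = df.corr(method="pearson").loc["x", "y"]

# Spearman measures monotonic association on the ranks.
spearman = df.corr(method="spearman").loc["x", "y"]
```

Because the relationship is perfectly monotonic but curved, the Spearman coefficient reaches 1 while the Pearson coefficient stays slightly below it.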

Predictive modelling

  • Multiple linear regression
  • Logistic regression
  • Decision tree 
  • Random forest 

Dimension reduction methods 

  • Exploratory factor analysis
  • Principal component analysis

Numerical Python (NumPy)

  • Converting data frames to arrays
  • Basic arithmetic operations of arrays
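
The two NumPy topics above can be sketched as follows. The data frame and its column names (a, b) are hypothetical, and DataFrame.to_numpy() is used here as one common way to do the conversion.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data frame.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Convert the data frame to a 2-D NumPy array (rows x columns).
arr = df.to_numpy()

# Element-wise arithmetic applies to every entry of the array.
doubled = arr * 2
product = arr[:, 0] * arr[:, 1]   # column a times column b, element by element

# Reductions along an axis: axis=1 works across columns, axis=0 down rows.
row_sums = arr.sum(axis=1)
col_means = np.mean(arr, axis=0)
```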