Course outline
Installing Jupyter
Basics of data analysis with Pandas
- Introduction to Python packages: Pandas, NumPy
- Importing non-Python datasets (Excel, JSON, Parquet)
- Basic functions (shape, head, tail, dtype, loc/iloc)
- Data cleaning (duplicates, outliers and missing values)
- Data types (conversion of data types)
- Renaming and deleting columns
- Concatenating and transforming
- Subsetting data frames
- Filtering and selecting
- Aggregation and grouping
- Groupby function
- Reshaping data frames: stack() and unstack()
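As a rough preview of the Pandas topics above, the sketch below runs through inspection, cleaning, filtering, grouping and reshaping on a tiny made-up table. The column names (city, year, sales) and the choice to fill missing values with the median are assumptions for illustration only, not part of the course material.

```python
import pandas as pd

# Hypothetical sample data; names and values are illustrative only.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen", "Oslo"],
    "year": [2020, 2021, 2020, 2021, 2021],
    "sales": [10.0, 12.0, 7.0, None, 12.0],
})

print(df.shape)    # (number of rows, number of columns)
print(df.head(2))  # first two rows
print(df.dtypes)   # data type of each column

# Cleaning: drop exact duplicate rows, then fill the missing value
# with the column median (one possible strategy among several).
clean = df.drop_duplicates().copy()
clean["sales"] = clean["sales"].fillna(clean["sales"].median())

# Filtering and selecting with loc: rows for Oslo, two columns only.
oslo = clean.loc[clean["city"] == "Oslo", ["year", "sales"]]

# Aggregation with groupby: total sales per city.
totals = clean.groupby("city")["sales"].sum()

# Reshaping: unstack() moves the year level into columns,
# stack() moves it back into the row index.
wide = clean.set_index(["city", "year"])["sales"].unstack("year")
long = wide.stack()

print(totals)
print(wide)
```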
Correlation analysis (parametric and non-parametric)
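To illustrate the difference between the two families, the sketch below compares Pearson (parametric) and Spearman (non-parametric, rank-based) correlation on made-up data where y grows monotonically but not linearly with x. The data and variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data: y is a monotonic but non-linear function of x.
data = pd.DataFrame({"x": [1, 2, 3, 4, 5]})
data["y"] = data["x"] ** 3

# Pearson measures linear association.
pearson = data["x"].corr(data["y"], method="pearson")

# Spearman works on ranks, so any strictly increasing
# relationship gives a coefficient of 1.
spearman = data["x"].corr(data["y"], method="spearman")

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```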
Predictive modelling
- Multiple linear regression
- Logistic regression
- Decision tree
- Random forest
Dimension reduction methods
- Exploratory factor analysis
- Principal component analysis
Numerical Python: NumPy
- Converting data frames to arrays
- Basic arithmetic operations of arrays
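A minimal sketch of the two NumPy topics above, assuming a small made-up data frame with numeric columns x and y: it converts the frame to an array with to_numpy() and applies a few element-wise operations.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data frame; column names are illustrative.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})

# Convert the data frame to a 2-D NumPy array (rows x columns).
arr = df.to_numpy()

# Arithmetic on arrays is applied element by element.
doubled = arr * 2                  # multiply every element by 2
product = arr[:, 0] * arr[:, 1]    # x times y, row by row
row_sums = arr.sum(axis=1)         # sum across the columns of each row

# Broadcasting: subtract each column's mean from that column.
centered = arr - arr.mean(axis=0)

print(arr.shape)
print(row_sums)
print(centered)
```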