Course curriculum

  • 1

    Foundation

    • Course Introduction

    • Course Outcomes

    • Course Structure

    • Imbalanced Classification Defined

    • Causes of Class Imbalance

    • Challenge of Imbalanced Classification

    • Examples of Class Imbalance

  • 2

    Understanding Class Imbalance

    • Create Synthetic Dataset with Class Distribution

    • Effect of Skewed Class Distributions

    • Visualizing Extreme Skew

    • Why Imbalanced Classification Is Hard

    • Compounding Effect of Dataset Size

    • Compounding Effect of Label Noise

    • Compounding Effect of Data Distribution

  • 3

    Model Evaluation

    • Evaluation Metrics and Imbalance

    • Taxonomy of Classifier Evaluation Metrics

    • Ranking Metrics for Imbalanced Classification

    • Probabilistic Metrics for Imbalanced Classification

    • How to Choose an Evaluation Metric

    • Accuracy Fails for Imbalanced Classification

    • Accuracy Paradox

    • Demo: Accuracy for Imbalanced Classification

    • Precision for Imbalanced Classification

    • Precision for Multi-Class Classification

    • Recall for Imbalanced Classification

    • Demo: Recall for Imbalanced Classification

    • F-Measure for Imbalanced Classification

    • Demo: F-Measure for Imbalanced Classification

    • ROC Curves and Precision-Recall Curves

    • ROC Curve

    • Demo: ROC Curve

    • ROC Area Under Curve (AUC) Score

    • Precision-Recall Curves

    • Precision-Recall Area Under Curve (AUC) Score

    • ROC AUC with Severe Imbalance

    • ROC and Precision-Recall Curves With a Severe Imbalance

    • Probability Scoring Methods in Python

    • Log Loss Score

    • Brier Score

    • Cross-Validation for Imbalanced Classification

    • Challenge of Evaluating Classifiers

    • Failure of k-Fold Cross-Validation

  • 4

    Data Sampling

    • Data Sampling Methods for Imbalanced Classification

    • Oversampling Techniques

    • Undersampling Techniques

    • Combinations of Techniques

    • Random Resampling Imbalanced Datasets

    • Demo: Random Oversampling Imbalanced Datasets

    • Demo: Random Undersampling Imbalanced Datasets

    • Demo: Combining Random Oversampling and Undersampling Techniques

    • Synthetic Minority Oversampling Technique (SMOTE)

    • SMOTE for Balancing Data

    • SMOTE for Classification

    • Borderline-SMOTE SVM

    • Adaptive Synthetic Sampling (ADASYN)

    • Undersampling Methods

    • Near Miss Undersampling (NearMiss-1)

    • Near Miss Undersampling (NearMiss-2 and NearMiss-3)

    • Condensed Nearest Neighbor Rule Undersampling

    • Tomek Links for Undersampling

    • Edited Nearest Neighbors Rule for Undersampling (ENN)

    • Neighborhood Cleaning Rule for Undersampling

  • 5

    Cost-Sensitive Learning

    • Cost-Sensitive Learning for Imbalanced Classification

    • Not All Classification Errors Are Equal

    • Cost-Sensitive Learning

    • Cost-Sensitive Imbalanced Classification

    • Cost-Sensitive Methods

    • Cost-Sensitive Algorithms

    • Cost-Sensitive Ensembles

    • Cost-Sensitive Logistic Regression

    • Logistic Regression for Imbalanced Classification

    • Weighted Logistic Regression with Scikit-Learn

    • Grid Search Weighted Logistic Regression

    • Cost-Sensitive Decision Trees for Imbalanced Classification

    • Decision Trees for Imbalanced Classification

    • Weighted Decision Tree With Scikit-Learn

    • Grid Search Weighted Decision Tree

    • Develop a Cost-Sensitive Neural Network for Imbalanced Classification

    • Neural Network Model in Keras

    • Deep Learning for Imbalanced Classification

    • Weighted Neural Network With Keras

  • 6

    Project

    • Project: Breast Cancer Dataset

    • Haberman Breast Cancer Survival Dataset

    • Dataset Exploration

    • Model Test and Baseline Result

    • Evaluate Probabilistic Models

    • Model Evaluation With Scaled Inputs

    • Model Evaluation With Power Transform