LendingClub Loan Extension Analysis

This project analyzes Lending Club loan data from 2007 to 2015, available on Kaggle. The dataset contains over 887K observations and 74 variables, one of which describes the loan status.

Scope

  • Conducted regression analysis to predict loan interest rates from initial borrower characteristics, applying L1 and L2 regularization and dimension reduction techniques.

  • Applied feature selection to identify the features associated with applicants likely to default on their loans, and extended the analysis to determine which loan category was most likely to default.

  • Performed cross-validated repeated undersampling to address class imbalance and implemented Logistic Regression, LDA and Random Forest models to identify charged-off loans, obtaining an accuracy of 95% on test data.
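
The repeated undersampling step in the last item can be illustrated with a minimal sketch. This is not the project's actual code, which is not shown here: the function name `repeated_undersample`, its parameters, and the toy labels are all hypothetical, and it uses only the Python standard library. The idea is that each repeat keeps every minority-class row and draws an equally sized random subset from the majority class, producing several balanced training subsets. In a cross-validated setup, this would typically be applied to the training folds only.

```python
import random
from collections import Counter


def repeated_undersample(labels, n_repeats=5, seed=0):
    """Yield balanced index lists by randomly undersampling larger classes.

    Hypothetical helper for illustration: each repeat keeps every row of the
    smallest class and draws, without replacement, an equally sized random
    subset from every other class.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    minority_size = min(len(idxs) for idxs in by_class.values())

    for _ in range(n_repeats):
        sample = []
        for idxs in by_class.values():
            # For the smallest class this selects every row.
            sample.extend(rng.sample(idxs, minority_size))
        rng.shuffle(sample)
        yield sample


# Toy labels made up for illustration: 1 = charged off, 0 = not charged off.
labels = [1] * 10 + [0] * 90
subsets = list(repeated_undersample(labels, n_repeats=3))
balance = [Counter(labels[i] for i in subset) for subset in subsets]
```

Each entry in `subsets` could then be used to fit one model, with the per-subset results combined or compared; how the results were combined in this project is not stated.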