FinTech Final Project — Credit Default Prediction

This project builds credit default prediction models using the UCI Default of Credit Card Clients dataset. It uses decision-threshold tuning to improve detection of defaulters and SHAP to explain model predictions.

What you can do with this repo

Train and evaluate a Random Forest model with threshold tuning (targeting Recall ≈ 0.6).
Generate evaluation artifacts: confusion matrix, ROC curve / AUC, threshold sensitivity analysis, and SHAP plots.
(Optional) Train & save Logistic Regression / Random Forest / XGBoost models.
(Optional) Run an interactive Streamlit dashboard for credit risk scoring.

Environment & Dependencies

Recommended: Python 3.9+
Install dependencies:

pip install -U pip
pip install pandas numpy scikit-learn matplotlib seaborn shap joblib xgboost streamlit plotly
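
After installing, a quick import check can catch a missing package before a training run fails halfway. The sketch below is not part of the repo: `missing_modules` is a hypothetical helper, and the module list simply mirrors the pip command above (note that scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above.
# scikit-learn installs under the import name "sklearn".
REQUIRED_MODULES = [
    "pandas", "numpy", "sklearn", "matplotlib", "seaborn",
    "shap", "joblib", "xgboost", "streamlit", "plotly",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```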

Quick Start

1) Train & evaluate Random Forest (with threshold tuning)

Run from the project root:

python train_rf.py

This script will:

Load uci_default_cleaned.csv
Split train/test sets
Train a Random Forest (class-weighted for imbalance)
Find a decision threshold that gives Recall ≈ 0.6 for the defaulter class
Print classification metrics
Export plots and CSV outputs (see below)
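
The threshold step above can be sketched in plain Python. This is one reasonable selection rule, not necessarily the one train_rf.py uses: it scans candidate thresholds from high to low and keeps the first one whose recall reaches the target. The names `recall_at` and `pick_threshold` are hypothetical, and the scores are assumed to be predicted default probabilities (for a scikit-learn classifier, something like `model.predict_proba(X_test)[:, 1]`).

```python
def recall_at(y_true, scores, threshold):
    """Recall for the positive class (1 = default) at a given threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


def pick_threshold(y_true, scores, target_recall=0.6):
    """Return the highest candidate threshold whose recall meets the target.

    Candidates are 0.01, 0.02, ..., 0.99. Lowering the threshold can only
    keep or raise recall, so the first hit when scanning downward is the
    largest candidate that satisfies the target.
    """
    for step in range(99, 0, -1):
        threshold = step / 100
        if recall_at(y_true, scores, threshold) >= target_recall:
            return threshold
    return 0.0  # Fall back to flagging every client as a defaulter.


# Toy labels and scores, made up for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.55, 0.60, 0.30, 0.05, 0.90]
print(pick_threshold(y_true, scores, target_recall=0.6))  # 0.55
```

Choosing the highest qualifying threshold keeps false positives as low as the recall target allows; a rule that picks the threshold with recall closest to the target would be an equally plausible reading of "Recall ≈ 0.6".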

2) (Optional) Train & save all models for the dashboard

python "Train and Save All Models.py"

It will generate:

lr_model.pkl, rf_model.pkl, xgb_model.pkl
feature_names.pkl, reference_data.csv
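
Scoring a new applicant generally requires the inputs in the same column order the models were trained on, which is presumably why feature_names.pkl is saved alongside the models. The sketch below illustrates that alignment step in plain Python. `to_feature_row` and the feature names are hypothetical, and the actual contents of feature_names.pkl are an assumption here.

```python
def to_feature_row(record, feature_names):
    """Order a dict of inputs to match the training column order.

    Raises KeyError listing any feature the record does not supply,
    rather than silently scoring a misaligned or incomplete row.
    Keys in the record that are not features are ignored.
    """
    missing = [name for name in feature_names if name not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [record[name] for name in feature_names]


# Illustrative names only; in the repo the list would presumably be
# loaded from feature_names.pkl (for example with joblib.load).
feature_names = ["LIMIT_BAL", "AGE", "PAY_0"]
applicant = {"AGE": 35, "PAY_0": 2, "LIMIT_BAL": 50000, "note": "ignored"}
row = to_feature_row(applicant, feature_names)
print(row)  # [50000, 35, 2]
```

The resulting row could then be wrapped in a list to form a single-sample batch for a loaded model, for example `model.predict_proba([row])`.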

3) (Optional) Run the Streamlit dashboard

streamlit run web_app.py

Output Files

Generated by python train_rf.py (saved to the project root):

| File | Description |
| --- | --- |
| `confusion_matrix_final.png` | Confusion matrix (threshold-adjusted) |
| `roc_curve_final.png` | ROC curve & AUC |
| `shap_importance_bar.png` | SHAP feature importance (bar) |
| `shap_summary_plot.png` | SHAP summary plot |
| `threshold_comparison.png` | Recall/Precision/Accuracy vs. threshold |
| `threshold_sensitivity_analysis.csv` | Threshold performance table |
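
To make the threshold performance table concrete, here is a minimal sketch of how recall, precision, and accuracy can be tabulated per threshold and written as CSV. The column names and the helpers `threshold_table` and `write_table` are assumptions for illustration; the real file's layout may differ. The example writes to an in-memory buffer so it does not overwrite any generated output.

```python
import csv
import io

FIELDS = ["threshold", "recall", "precision", "accuracy"]


def threshold_table(y_true, scores, thresholds):
    """Compute recall, precision, and accuracy at each threshold.

    Assumes y_true is non-empty and uses 1 for the defaulter class.
    """
    rows = []
    for t in thresholds:
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
        fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
        fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
        tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
        rows.append({
            "threshold": t,
            "recall": tp / (tp + fn) if (tp + fn) else 0.0,
            "precision": tp / (tp + fp) if (tp + fp) else 0.0,
            "accuracy": (tp + tn) / len(y_true),
        })
    return rows


def write_table(rows, handle):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Toy labels and scores, made up for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.55, 0.60, 0.30, 0.05, 0.90]

buffer = io.StringIO()
write_table(threshold_table(y_true, scores, [0.3, 0.5, 0.7]), buffer)
print(buffer.getvalue())
```

On this toy data, raising the threshold from 0.3 to 0.7 lowers recall from 1.0 to 0.4 while precision rises, which is the trade-off the threshold comparison plot is meant to show.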

Repository Layout (high-level)

Key files/folders:

final_project/
├─ Readme.md
├─ train_rf.py
├─ train_rf.ipynb
├─ Train and Save All Models.py
├─ web_app.py
├─ uci_default_cleaned.csv
├─ Dataset/                  # raw + reference CSVs
├─ Random_Forest/            # additional RF experiments/results
├─ Logistic_Regression/      # LR report/code
└─ web_source/               # web app bundle (models + assets)

Notes

Threshold tuning is used to prioritize recall for the defaulter class.
Model artifacts (*.pkl) are included to make the dashboard runnable without retraining.
