Predicting Possible Loan Default Using Machine Learning

Sep 10, 2023

Project Title: Predicting Possible Loan Default Using Machine Learning

Tools and Libraries: Python, pandas, sweetviz, scikit-learn, TPOT, matplotlib, seaborn

GitHub Repository: The full code of the analysis can be found here - Link

Approach:

  1. Data Exploration: Analyzed the dataset and identified missing values, duplicates, and key insights.

  2. Auto-EDA: Automated data exploration with Sweetviz.

  3. Feature Correlation Analysis: Examined feature relationships with loan approval status.

  4. Model Development:

    • Random Forest Classifier: Trained and evaluated, reaching 77.2% accuracy.

    • AutoML using TPOT: Achieved 78.9% accuracy.
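
As a rough illustration of steps 1 and 4, here is a minimal sketch of the missing-value and duplicate checks followed by a Random Forest baseline. It is not the project's actual code: the data is synthetic, the target column name Loan_Status is an assumption, and the hyperparameters are arbitrary. Only the feature names (Credit_History, CoapplicantIncome, Married, Education) come from the analysis.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the project's real dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Credit_History": rng.choice([0.0, 1.0], size=n, p=[0.2, 0.8]),
    "CoapplicantIncome": rng.gamma(2.0, 800.0, size=n),
    "Married": rng.choice(["Yes", "No"], size=n),
    "Education": rng.choice(["Graduate", "Not Graduate"], size=n),
})
# Hypothetical target (assumed name): approval depends mostly on credit history.
approve_prob = np.where(df["Credit_History"] == 1.0, 0.8, 0.1)
df["Loan_Status"] = (rng.random(n) < approve_prob).astype(int)

# Inject missing values and a duplicate row so the checks have something to find.
df.loc[:4, "CoapplicantIncome"] = np.nan
df = pd.concat([df, df.iloc[[10]]], ignore_index=True)

# Step 1: basic exploration checks.
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())

# Simple cleaning: drop duplicates, fill the numeric gap with the median.
clean = df.drop_duplicates().copy()
median_income = clean["CoapplicantIncome"].median()
clean["CoapplicantIncome"] = clean["CoapplicantIncome"].fillna(median_income)

# Step 4: one-hot encode the categorical columns and train a Random Forest.
X = pd.get_dummies(clean.drop(columns="Loan_Status"))
y = clean["Loan_Status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = RandomForestClassifier(n_estimators=200, max_depth=5, random_state=42)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

print(f"Missing values per column:\n{missing_per_column}")
print(f"Duplicate rows: {duplicate_rows}")
print(f"Test accuracy: {accuracy:.3f}")
```

The accuracy printed here reflects the synthetic data only and is not comparable to the figures reported above.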

Visualisations:
The correlation analysis shows that Credit_History, Property_Area_Semiurban, Education_Graduate, Married_Yes, and Dependents_2 are positively correlated with loan approval, while Married_No, Education_Not Graduate, and CoapplicantIncome are negatively correlated with it.
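
A correlation ranking of this kind can be sketched as follows. This is a hedged example on synthetic data, not the project's code: the target name Loan_Status is an assumption, and the feature names simply mirror the one-hot encoded columns mentioned above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data for illustration only.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "Credit_History": rng.choice([0.0, 1.0], size=n, p=[0.2, 0.8]),
    "Married": rng.choice(["Yes", "No"], size=n),
    "Education": rng.choice(["Graduate", "Not Graduate"], size=n),
    "CoapplicantIncome": rng.gamma(2.0, 800.0, size=n),
})
# Hypothetical target (assumed name), driven mainly by credit history.
approve_prob = np.where(df["Credit_History"] == 1.0, 0.8, 0.1)
df["Loan_Status"] = (rng.random(n) < approve_prob).astype(int)

# One-hot encode categoricals, producing columns such as Married_Yes / Married_No.
encoded = pd.get_dummies(df, dtype=float)

# Pearson correlation of every feature with the target, strongest positive first.
correlations = (
    encoded.corr()["Loan_Status"]
    .drop("Loan_Status")
    .sort_values(ascending=False)
)
print(correlations)
```

Note that the two dummy columns of a binary category are exact complements, so their correlations with the target are equal in magnitude and opposite in sign. Assuming the category has no missing entries, a positive Married_Yes and a negative Married_No therefore carry the same information.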

Results:

  • The TPOT-based model outperformed the Random Forest in both accuracy (78.9% vs. 77.2%) and recall (99% vs. 95%).
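
To make the two metrics concrete, here is a small sketch of how accuracy and recall are derived from a confusion matrix with scikit-learn. The labels are made up for illustration, and treating "approved" as the positive class (1) is an assumption, since the write-up does not say which class the recall figures refer to.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Made-up labels for illustration only (1 = approved, 0 = not approved).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

# For binary labels, ravel() yields counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = accuracy_score(y_true, y_pred)  # (tp + tn) / all predictions
recall = recall_score(y_true, y_pred)      # tp / (tp + fn)

print(f"tn={tn} fp={fp} fn={fn} tp={tp}")
print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

In this toy case every true approval is caught (recall 1.0) while two rejections are misclassified as approvals, which pulls accuracy down to 0.8. A gap between very high recall and moderate accuracy can point to a similar pattern, so it is worth checking precision and the confusion matrix alongside these two numbers.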

Conclusion: This project developed predictive models for loan approval. Ongoing monitoring, periodic retraining, and good data quality are critical for keeping the models useful.
