Multi-Algorithm Machine Learning Approach for Loan Risk Prediction: Comparative Analysis of Performance of Different Models

https://doi.org/10.47194/orics.v6i2.348

Authors

Keywords:

Finance, machine learning, loan risk, prediction models

Abstract

This study predicts the probability of customer default on consumer loan products from historical customer behavior data. The dataset includes attributes such as income, age, work experience, marital status, home ownership, car ownership, and others. Dimensionality-reduction techniques such as Principal Component Analysis (PCA) are used to compress correlated features. Several machine learning algorithms are applied, including Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, XGBoost, and others. The results show that XGBoost performs best as measured by AUC score, while Random Forest achieves the highest accuracy.
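The abstract reports that one model ranks first on AUC while another ranks first on accuracy. The following minimal sketch illustrates how that can happen. It is not the study's code or data: the labels, the scores, and the names `model_a`, `model_b`, `accuracy`, and `auc` are hypothetical, and the 0.5 decision threshold is an assumption.

```python
# Illustrative only: toy labels and scores, not results from the study.

def accuracy(y_true, scores, threshold=0.5):
    """Fraction of correct predictions after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)

def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as half), computed over all pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 1 = default, 0 = no default (hypothetical, imbalanced sample)
y = [0, 0, 0, 0, 1, 1]

# Hypothetical model A: ranks every defaulter above every non-defaulter,
# but all scores sit below the 0.5 threshold.
model_a = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45]

# Hypothetical model B: one non-defaulter outranks a defaulter,
# but its scores are better placed around the threshold.
model_b = [0.10, 0.20, 0.30, 0.60, 0.55, 0.90]

for name, scores in [("A", model_a), ("B", model_b)]:
    print(f"model {name}: AUC={auc(y, scores):.3f}, "
          f"accuracy={accuracy(y, scores):.3f}")
```

In this toy case, model A has the higher AUC and model B the higher accuracy. AUC depends only on how the scores order the cases, whereas accuracy depends on where the scores fall relative to a chosen threshold, so the two metrics need not agree on a single best model.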

Published

2025-06-30

How to Cite

Sinaga, H. S. V. (2025). Multi-Algorithm Machine Learning Approach for Loan Risk Prediction: Comparative Analysis of Performance of Different Models. Operations Research: International Conference Series, 6(2), 73–89. https://doi.org/10.47194/orics.v6i2.348