Hybrid Decision Tree – Adaptive Boosting Method for Credit Scoring Classification

Nurul Aini, Dewi Rahmatin, Entit Puspita

Abstract


Bank Indonesia reported signs of an increase in new credit disbursement by banks in August 2023. As the number of credit transactions rises, so does the risk of problem loans. Credit providers must therefore be more careful in selecting creditworthy borrowers to reduce credit risk. One way to do so is credit scoring, a widely used risk assessment system that helps financial institutions and credit providers evaluate prospective borrowers, whether individuals or companies. This study aims to identify the better of two models, Decision Tree C4.5 – AdaBoost and Logistic Regression – AdaBoost, by comparing their accuracy in credit scoring classification for Home Credit. The Decision Tree – AdaBoost model performed best, with a good balance of accuracy, precision, recall, F1-score, and ROC-AUC, and it outperformed the Logistic Regression – AdaBoost model. The best Decision Tree – AdaBoost model reached an accuracy of 70%, which indicates that it is reasonably effective at credit scoring classification for Home Credit.

 

Keywords: AdaBoost, Classification, Credit Scoring, Decision Tree, Logistic Regression.
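The abstract describes boosting a weak classifier with AdaBoost and judging the result by accuracy, precision, recall, F1-score, and ROC-AUC. The following minimal Python sketch illustrates that workflow under loudly stated assumptions: it is not the authors' implementation, it uses one-feature threshold stumps in place of C4.5 trees, it runs on a ten-point toy dataset rather than Home Credit data, and every function name (`fit_stump`, `adaboost_fit`, `evaluate`, `report`) is hypothetical.

```python
import math

def stump_predict(stump, x):
    """A stump is (threshold, polarity): predict polarity if x <= threshold, else -polarity."""
    threshold, polarity = stump
    return polarity if x <= threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, stump) for the stump with the lowest weighted error."""
    best_err, best_stump = None, None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(stump, x) != y)
            if best_err is None or err < best_err:
                best_err, best_stump = err, stump
    return best_err, best_stump

def adaboost_fit(xs, ys, rounds):
    """Discrete AdaBoost with labels in {-1, +1}; returns a list of (alpha, stump)."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, stump = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified points, lower the rest, then renormalise.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def score(model, x):
    """Weighted vote of all stumps; the sign gives the predicted class."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in model)

def predict(model, x):
    return 1 if score(model, x) >= 0 else -1

def evaluate(ys, preds, scores):
    """Accuracy, precision, recall, F1 and pairwise ROC-AUC for labels in {-1, +1}."""
    tp = sum(1 for y, p in zip(ys, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(ys, preds) if y == -1 and p == 1)
    fn = sum(1 for y, p in zip(ys, preds) if y == 1 and p == -1)
    tn = sum(1 for y, p in zip(ys, preds) if y == -1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    pos = [s for s, y in zip(scores, ys) if y == 1]
    neg = [s for s, y in zip(scores, ys) if y == -1]
    # AUC as the share of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return {"accuracy": (tp + tn) / len(ys), "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": wins / (len(pos) * len(neg))}

# Toy data (an assumption for illustration): one feature, labels a single stump cannot separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

def report(rounds):
    model = adaboost_fit(xs, ys, rounds)
    return evaluate(ys, [predict(model, x) for x in xs], [score(model, x) for x in xs])

for rounds in (1, 3):
    print(rounds, report(rounds))
```

On this toy data a single stump classifies 7 of the 10 points correctly, while three boosting rounds classify all 10. These are in-sample figures on synthetic data and say nothing about the 70% accuracy reported in the abstract; a real comparison would evaluate both boosted models on held-out data.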


DOI: https://doi.org/10.17509/jem.v12i2.75401


Copyright (c) 2024 Mathematics Program Study, Universitas Pendidikan Indonesia

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


