TY - GEN
T1 - Analysis Accuracy of XGBoost Model for Multiclass Classification - A Case Study of Applicant Level Risk Prediction for Life Insurance
AU - Mustika, Widya Fajar
AU - Murfi, Hendri
AU - Widyaningsih, Yekti
PY - 2019/10
Y1 - 2019/10
N2 - Risk level assessment for insurance applicants is an important part of life insurance, so applicants need to be classified by risk level. The level of claim risk in life insurance is determined from the applicant's historical data. An application for life insurance membership must be processed in a short time, and a machine learning model can help classify prospective applicants by risk level quickly. One such model is extreme gradient boosting (XGBoost), a decision-tree-based model, which is used here to predict risk in life insurance. Missing values in the data are handled with several strategies during data processing to increase the accuracy of the XGBoost model. The results of this study show that the XGBoost model achieves an accuracy of 0.60730, measured in kappa units, which indicates that the model performs very well and can be applied to the problem of predicting the claim risk level of life insurance applicants. Compared with the decision tree, random forest, and Bayesian ridge models, the XGBoost model still performs best when handling missing values in the data used.
AB - Risk level assessment for insurance applicants is an important part of life insurance, so applicants need to be classified by risk level. The level of claim risk in life insurance is determined from the applicant's historical data. An application for life insurance membership must be processed in a short time, and a machine learning model can help classify prospective applicants by risk level quickly. One such model is extreme gradient boosting (XGBoost), a decision-tree-based model, which is used here to predict risk in life insurance. Missing values in the data are handled with several strategies during data processing to increase the accuracy of the XGBoost model. The results of this study show that the XGBoost model achieves an accuracy of 0.60730, measured in kappa units, which indicates that the model performs very well and can be applied to the problem of predicting the claim risk level of life insurance applicants. Compared with the decision tree, random forest, and Bayesian ridge models, the XGBoost model still performs best when handling missing values in the data used.
KW - big data
KW - ensemble learning
KW - machine learning
KW - multi-class classification
KW - risk level prediction
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85080899395&partnerID=8YFLogxK
U2 - 10.1109/ICSITech46713.2019.8987474
DO - 10.1109/ICSITech46713.2019.8987474
M3 - Conference contribution
T3 - Proceeding - 2019 5th International Conference on Science in Information Technology: Embracing Industry 4.0: Towards Innovation in Cyber Physical System, ICSITech 2019
SP - 71
EP - 77
BT - Proceeding - 2019 5th International Conference on Science in Information Technology
A2 - Pratomo, Awang Hendrianto
A2 - Pranolo, Andri
A2 - Hernandez, Leonel
A2 - Drezewski, Rafal
A2 - Voliansky, Roman
A2 - Zakaria, Mohamad Shanudin
A2 - Akbar, Bagus Muhammad
A2 - Saifullah, Shoffan
A2 - Akbar, Ahmad Taufiq
A2 - Husaini, Rochmat
A2 - Heriyanto, Heriyanto
A2 - Suryotomo, Andiko Putro
A2 - Permadi, Vynska Amalia
A2 - Tahalea, Sylvert Prian
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th International Conference on Science in Information Technology, ICSITech 2019
Y2 - 23 October 2019 through 24 October 2019
ER -