TY - JOUR
T1 - The accuracy of XGBoost for insurance claim prediction
AU - Fauzan, Muhammad Arief
AU - Murfi, Hendri
N1 - Funding Information:
This work was supported by Universitas Indonesia under PITTA 2018 grant. Any opinions, findings, and conclusions or recommendations are the authors' and do not necessarily reflect those of the sponsor.
Publisher Copyright:
© 2018, International Center for Scientific Research and Studies.
PY - 2018
Y1 - 2018
N2 - The increasing trend of claim frequency and claim severity in auto insurance results in a need for methods that file claims quickly while maintaining accuracy. One such method is machine learning, which treats the problem as supervised learning. The volume of historical claim data is usually large, and many features of the data have missing values. Therefore, we need machine learning models that can handle both data characteristics. XGBoost is a new ensemble learning method that is well suited to both. In this paper, we apply XGBoost to the problem of claim prediction and analyze its accuracy. We also compare the performance of XGBoost with that of other ensemble learning methods, i.e., AdaBoost, Stochastic Gradient Boosting, and Random Forest, and an online learning-based method, i.e., a Neural Network. Our simulations show that XGBoost gives better accuracy in terms of the normalized Gini coefficient than the other methods.
AB - The increasing trend of claim frequency and claim severity in auto insurance results in a need for methods that file claims quickly while maintaining accuracy. One such method is machine learning, which treats the problem as supervised learning. The volume of historical claim data is usually large, and many features of the data have missing values. Therefore, we need machine learning models that can handle both data characteristics. XGBoost is a new ensemble learning method that is well suited to both. In this paper, we apply XGBoost to the problem of claim prediction and analyze its accuracy. We also compare the performance of XGBoost with that of other ensemble learning methods, i.e., AdaBoost, Stochastic Gradient Boosting, and Random Forest, and an online learning-based method, i.e., a Neural Network. Our simulations show that XGBoost gives better accuracy in terms of the normalized Gini coefficient than the other methods.
KW - Claim prediction
KW - Ensemble learning
KW - Large volume
KW - Missing values
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85050668061&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85050668061
VL - 10
SP - 159
EP - 171
JO - International Journal of Advances in Soft Computing and its Applications
JF - International Journal of Advances in Soft Computing and its Applications
SN - 2074-8523
IS - 2
ER -