The accuracy of XGBoost for insurance claim prediction

Muhammad Arief Fauzan, Hendri Murfi

Research output: Contribution to journal › Article › peer-review

48 Citations (Scopus)


The increasing trend in claim frequency and claim severity for auto insurance results in a need for methods to quickly file claims while maintaining accuracy. One such method is machine learning, which treats the problem as a supervised learning task. The volume of historical claim data is usually large. Moreover, many features of the data have many missing values. Therefore, we need machine learning models that can handle both data characteristics. XGBoost is a new ensemble learning method that should be well suited to both. In this paper, we apply XGBoost to the problem of claim prediction and analyze its accuracy. We also compare the performance of XGBoost with that of other ensemble learning methods, namely AdaBoost, Stochastic GB, and Random Forest, and with an online learning-based method, namely Neural Network. Our simulations show that XGBoost gives better accuracy in terms of normalized Gini than the other methods.
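The abstract names normalized Gini as the evaluation metric but does not define it. The following is a minimal sketch of the definition commonly used for claim-prediction tasks, assuming binary claim labels with at least one positive case; the function names `gini` and `normalized_gini` are illustrative and not taken from the paper.

```python
def gini(actual, pred):
    """Unnormalized Gini: how well `pred` ranks the positive `actual` values.

    Assumes `actual` contains at least one non-zero value.
    """
    n = len(actual)
    # Rank cases from highest to lowest predicted score;
    # ties are broken by original position.
    order = sorted(range(n), key=lambda i: (-pred[i], i))
    total = float(sum(actual))
    cumulative = 0.0
    lorenz_sum = 0.0
    for i in order:
        cumulative += actual[i]
        lorenz_sum += cumulative / total
    # Subtract the area expected from a random ordering.
    lorenz_sum -= (n + 1) / 2.0
    return lorenz_sum / n


def normalized_gini(actual, pred):
    """Gini of the predictions divided by the Gini of a perfect ranking."""
    return gini(actual, pred) / gini(actual, actual)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]              # 1 = claim filed (toy data)
    scores = [0.1, 0.4, 0.35, 0.8]     # hypothetical model outputs
    print(normalized_gini(labels, scores))
```

Under this definition a perfect ranking scores 1, a fully reversed ranking scores -1, and for binary labels without tied predictions the value equals 2 × AUC − 1.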

Original language: English
Pages (from-to): 159-171
Number of pages: 13
Journal: International Journal of Advances in Soft Computing and its Applications
Issue number: 2
Publication status: Published - 2018


  • Claim prediction
  • Ensemble learning
  • Large volume
  • Missing values
  • XGBoost

