XGBoost in handling missing values for life insurance risk prediction

Deandra Aulia Rusdah, Hendri Murfi

Research output: Contribution to journal › Article › peer-review

39 Citations (Scopus)


Insurance risk prediction is carried out to classify levels of risk in the insurance industry. From a machine learning point of view, risk-level prediction is a multi-class classification problem: a model predicts an applicant's risk level based on historical data. Historical applicant data may contain missing values, which need to be handled for the model to perform well. XGBoost is a machine learning method that is widely used for classification problems and can handle missing values without imputation preprocessing. This paper analyzes the performance of the XGBoost model in handling missing values for risk prediction in life insurance. The simulations show that the XGBoost model without any imputation preprocessing achieves accuracy comparable to that of the XGBoost models with imputation preprocessing.
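As a rough illustration of why XGBoost needs no imputation step: when it evaluates a candidate split, it tries sending the rows with a missing feature value to the left child and then to the right child, and keeps whichever "default direction" gives the larger gain. The sketch below reproduces that idea in plain Python for a single feature and a single threshold. It is neither the paper's code nor XGBoost's actual implementation; the function names, the toy data, and the squared-error gradients are all assumptions made for illustration, and the complexity penalty gamma is left out.

```python
import math


def split_gain(g_left, h_left, g_right, h_right, lam=1.0):
    """Gain of a split from gradient/Hessian sums (gamma penalty omitted)."""
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent)


def choose_default_direction(values, grads, hess, threshold, lam=1.0):
    """Hypothetical helper: pick the branch that missing values should follow.

    Rows with value < threshold go left, the rest go right; rows whose value
    is None or NaN are tried on both sides and assigned to the better one.
    """
    gl = hl = gr = hr = gm = hm = 0.0
    for x, g, h in zip(values, grads, hess):
        if x is None or math.isnan(x):
            gm += g
            hm += h
        elif x < threshold:
            gl += g
            hl += h
        else:
            gr += g
            hr += h

    gain_if_left = split_gain(gl + gm, hl + hm, gr, hr, lam)
    gain_if_right = split_gain(gl, hl, gr + gm, hr + hm, lam)
    if gain_if_left >= gain_if_right:
        return "left", gain_if_left
    return "right", gain_if_right


# Toy data (made up): one feature with two missing entries.
nan = float("nan")
values = [1.0, 2.0, nan, 8.0, 9.0, nan]
labels = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]

# Squared-error loss with an initial prediction of 0:
# gradient = prediction - label, Hessian = 1.
grads = [0.0 - y for y in labels]
hess = [1.0] * len(labels)

direction, gain = choose_default_direction(values, grads, hess, threshold=5.0)
print(direction, round(gain, 3))
```

In this toy data the rows with missing values carry the same label as the high-valued rows, so grouping them with the right child yields the larger gain. An imputation step would instead replace the missing entries with a fixed value before training, which is the alternative the paper compares against.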

Original language: English
Article number: 1336
Journal: SN Applied Sciences
Issue number: 8
Publication status: Published - Aug 2020


  • Life insurance
  • Machine learning
  • Missing values
  • Multi-class classification
  • Risk prediction
  • XGBoost

