An analysis of the proportion of feature subsampling on XG boost - A case study of claim prediction in car insurance

Wafiyatul Khusna, Hendri Murfi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Claim prediction is an important element of the insurance business. As claims become more frequent, the volume of claim data grows to the scale of big data, so insurance companies need suitable machine learning methods to manage this data efficiently. XGBoost is a machine learning model based on gradient-boosted decision trees, and it can be applied to claim prediction framed as a two-class or multi-class classification problem. When building an XGBoost model, we may select a subset of the features, especially for data with a large number of features. In this paper, we examine the influence of the proportion of features on the accuracy of the XGBoost model. Our simulations show that an XGBoost model built on a randomly selected 1/5 of the features can achieve accuracy comparable to that of a model that uses all features. This suggests that the XGBoost model is scalable with respect to the proportion of features.
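
As a rough illustration of what randomly using a proportion of the features means, the sketch below draws a random subset of column indices for each tree. It is a minimal standard-library example, not the authors' experimental code: the helper name `subsample_features`, the feature count of 50, and the seeds are all hypothetical. In the XGBoost library itself, column subsampling is controlled by parameters such as `colsample_bytree`; the abstract does not say which setting the authors used.

```python
import random


def subsample_features(n_features, proportion, seed=None):
    """Return a sorted random subset of feature (column) indices.

    Hypothetical helper for illustration only. The subset size is
    round(n_features * proportion), with a minimum of one feature.
    """
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    k = max(1, round(n_features * proportion))
    rng = random.Random(seed)
    # Sample k distinct column indices without replacement.
    return sorted(rng.sample(range(n_features), k))


if __name__ == "__main__":
    # Assumed setting: 50 features, each tree sees a random 1/5 of them.
    for tree in range(3):
        cols = subsample_features(50, 0.2, seed=tree)
        print(f"tree {tree}: {len(cols)} features -> {cols}")
```

Each tree then searches for splits only among its sampled columns, which is why a smaller proportion reduces the work per tree.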

Original language: English
Title of host publication: International Conference on Science and Applied Science, ICSAS 2020
Editors: Budi Purnama, Dewanta Arya Nugraha, Fuad Anwar
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735440302
Publication status: Published - 16 Nov 2020
Event: 2020 International Conference on Science and Applied Science, ICSAS 2020 - Surakarta, Indonesia
Duration: 7 Jul 2020 → …

Publication series

Name: AIP Conference Proceedings
Volume: 2296
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 2020 International Conference on Science and Applied Science, ICSAS 2020
Country/Territory: Indonesia
City: Surakarta
Period: 7/07/20 → …
