Gram-Schmidt Orthogonalization for feature ranking and selection - A case study of claim prediction

Yuni Rosita Dewi, Hendri Murfi, Yudi Satria

Research output: Contribution to journal, Article, peer-review

Abstract

Claim prediction is an important process in the insurance industry because it helps insurers prepare the right type of insurance policy for each potential policyholder. The frequency of claim predictions is increasing rapidly, which leads to a big data problem in terms of both the number of features and the number of policyholders. One machine learning approach to the big data problem is dimensionality reduction by feature selection. In this paper, we examine a new feature selection method for claim prediction based on Gram-Schmidt orthogonalization. In this method, each next feature is selected iteratively as the one farthest from the space spanned by the currently selected features. An advantage of the Gram-Schmidt orthogonalization method is therefore that it can produce a ranked subset of features without ordering all features. Our simulation shows that, using only about 26% of the features, the predictor reaches accuracy comparable to that obtained with all features. This means that the Gram-Schmidt orthogonalization-based feature selection method may need only about 26% of the memory, which is very significant in the context of the big data problem.
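
The selection rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration built only from the abstract's wording and may differ from the authors' implementation: the function name `gram_schmidt_rank`, the choice of the largest-norm feature as the starting point, the tie-breaking by lowest index, and the tolerance `1e-12` are all assumptions. Features are passed as columns, and in practice they would probably be standardized first, since the raw norms depend on scale.

```python
import math


def gram_schmidt_rank(columns, k):
    """Return the indices of up to k features, in selection order.

    columns: list of feature vectors (each a list of floats of equal length).
    Hypothetical helper: it follows the abstract's description, not the paper's code.
    """
    # residuals[j] holds the part of feature j that is orthogonal to the
    # span of the features selected so far.
    residuals = [list(col) for col in columns]
    remaining = set(range(len(columns)))
    selected = []

    def norm(vec):
        return math.sqrt(sum(v * v for v in vec))

    for _ in range(min(k, len(columns))):
        # The distance from a feature to the current span equals the norm
        # of its residual; pick the farthest one (lowest index wins ties).
        best = max(sorted(remaining), key=lambda j: norm(residuals[j]))
        best_norm = norm(residuals[best])
        if best_norm < 1e-12:
            break  # every remaining feature already lies in the span
        selected.append(best)
        remaining.remove(best)

        # Gram-Schmidt step: remove the new direction from the other features.
        q = [v / best_norm for v in residuals[best]]
        for j in remaining:
            dot = sum(a * b for a, b in zip(residuals[j], q))
            residuals[j] = [a - dot * b for a, b in zip(residuals[j], q)]

    return selected
```

Because the loop stops after `k` selections, only the top `k` features are ever ranked; the remaining features are never ordered, which matches the advantage the abstract highlights. A call such as `gram_schmidt_rank(columns, 3)` returns a list of three column indices when at least three features are linearly independent.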

Original language: English
Pages (from-to): 57-62
Number of pages: 6
Journal: International Journal of Machine Learning and Computing
Volume: 10
Issue number: 1
Publication status: Published - 1 Jan 2020

Keywords

  • Big data
  • Claim prediction
  • Feature ranking
  • Feature selection
  • Gram-Schmidt orthogonalization
