TY - GEN
T1 - Analysis Accuracy of Random Forest Model for Big Data - A Case Study of Claim Severity Prediction in Car Insurance
AU - Dewi, Kartika Chandra
AU - Murfi, Hendri
AU - Abdullah, Sarini
PY - 2019/10
Y1 - 2019/10
N2 - Insurance claims are an important element of insurance services. Claim severity refers to the amount of money that must be spent to repair the damage. The amount of an insurance claim is influenced by many factors, which makes the volume of data very large. Therefore, a suitable method is required. Random Forest, a machine learning method, can be applied to handle this problem. This thesis applies the Random Forest model to predict claim severity in car insurance. Furthermore, the effect of the number of features used on model accuracy is analyzed. The simulation results show that the Random Forest model can be applied to claim severity prediction, which is a regression problem in the context of machine learning. Using only 1/3 of the overall features, the Random Forest model achieves accuracy comparable to that obtained when using all features, which is around 99%. This result confirms the scalability of Random Forest, especially in terms of the number of features. Hence, the Random Forest model can be used as a solution to Big Data problems related to data volume.
AB - Insurance claims are an important element of insurance services. Claim severity refers to the amount of money that must be spent to repair the damage. The amount of an insurance claim is influenced by many factors, which makes the volume of data very large. Therefore, a suitable method is required. Random Forest, a machine learning method, can be applied to handle this problem. This thesis applies the Random Forest model to predict claim severity in car insurance. Furthermore, the effect of the number of features used on model accuracy is analyzed. The simulation results show that the Random Forest model can be applied to claim severity prediction, which is a regression problem in the context of machine learning. Using only 1/3 of the overall features, the Random Forest model achieves accuracy comparable to that obtained when using all features, which is around 99%. This result confirms the scalability of Random Forest, especially in terms of the number of features. Hence, the Random Forest model can be used as a solution to Big Data problems related to data volume.
KW - Big Data
KW - claim severity prediction
KW - ensemble learning
KW - machine learning
KW - Random Forest
KW - scalability
UR - http://www.scopus.com/inward/record.url?scp=85080941450&partnerID=8YFLogxK
U2 - 10.1109/ICSITech46713.2019.8987520
DO - 10.1109/ICSITech46713.2019.8987520
M3 - Conference contribution
T3 - Proceeding - 2019 5th International Conference on Science in Information Technology: Embracing Industry 4.0: Towards Innovation in Cyber Physical System, ICSITech 2019
SP - 60
EP - 65
BT - Proceeding - 2019 5th International Conference on Science in Information Technology
A2 - Pratomo, Awang Hendrianto
A2 - Pranolo, Andri
A2 - Hernandez, Leonel
A2 - Drezewski, Rafal
A2 - Voliansky, Roman
A2 - Zakaria, Mohamad Shanudin
A2 - Akbar, Bagus Muhammad
A2 - Saifullah, Shoffan
A2 - Akbar, Ahmad Taufiq
A2 - Husaini, Rochmat
A2 - Heriyanto, Heriyanto
A2 - Suryotomo, Andiko Putro
A2 - Permadi, Vynska Amalia
A2 - Tahalea, Sylvert Prian
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th International Conference on Science in Information Technology, ICSITech 2019
Y2 - 23 October 2019 through 24 October 2019
ER -