A Comparative Performance Evaluation of Random Forest Feature Selection on Classification of Hepatocellular Carcinoma Gene Expression Data

Moh Abdul Latief, Titin Siswantining, Alhadi Bustamam, Devvi Sarwinda

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

10 Citations (Scopus)

Abstract

Hepatocellular carcinoma is one of the cancers that cause death worldwide. We use hepatocellular carcinoma microarray gene expression data obtained from the National Center for Biotechnology Information website, consisting of 40 samples and 54675 features. The main purpose of this research is to compare the classification performance of several algorithms on hepatocellular carcinoma data with and without feature selection. The Random Forest feature selection method is paired with the following classification algorithms: Support Vector Classification, Neural Network Classification, Random Forest, Logistic Regression, and Naïve Bayes. This study uses 5-fold cross-validation as the evaluation method. The results show that Random Forest, Neural Network, Support Vector Classification, and Naïve Bayes achieve higher classification performance with Random Forest feature selection than without it, while the Logistic Regression model performs better without Random Forest feature selection. With feature selection, Support Vector Classification achieves the highest performance of the five algorithms; without feature selection, Logistic Regression performs better than the other classification algorithms.
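The comparison described in the abstract can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (40 samples, but only 500 features so it runs quickly), the cutoff of 50 selected features and all hyperparameters are arbitrary choices, and placing the selector inside the pipeline (so it is refit on each training fold) is a design decision the abstract does not confirm.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the microarray data (assumption: NOT the NCBI dataset).
X, y = make_classification(
    n_samples=40, n_features=500, n_informative=10, n_redundant=0, random_state=0
)

# The five classifier families named in the abstract; hyperparameters are assumed.
classifiers = {
    "Support Vector Classification": SVC(),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
}


def make_selector(n_keep=50):
    """Keep the n_keep features with the largest Random Forest importances."""
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    # threshold=-inf disables the importance cutoff so only max_features applies.
    return SelectFromModel(forest, threshold=-np.inf, max_features=n_keep)


# 5-fold cross-validation, stratified so each fold keeps the class balance.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, clf in classifiers.items():
    without_fs = make_pipeline(StandardScaler(), clf)
    with_fs = make_pipeline(StandardScaler(), make_selector(), clf)
    results[name] = (
        cross_val_score(without_fs, X, y, cv=cv).mean(),  # mean accuracy, no selection
        cross_val_score(with_fs, X, y, cv=cv).mean(),     # mean accuracy, RF selection
    )

for name, (acc_without, acc_with) in results.items():
    print(f"{name}: without FS = {acc_without:.3f}, with RF FS = {acc_with:.3f}")
```

Because the data here are synthetic, the printed accuracies say nothing about the paper's findings; the sketch only shows how each classifier can be scored with and without Random Forest feature selection under the same 5-fold split.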

Original language: English
Title of host publication: ICICOS 2019 - 3rd International Conference on Informatics and Computational Sciences
Subtitle of host publication: Accelerating Informatics and Computational Research for Smarter Society in The Era of Industry 4.0, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728146102
Publication status: Published - Oct 2019
Event: 3rd International Conference on Informatics and Computational Sciences, ICICOS 2019 - Semarang, Indonesia
Duration: 29 Oct 2019 to 30 Oct 2019

Publication series

Name: ICICOS 2019 - 3rd International Conference on Informatics and Computational Sciences: Accelerating Informatics and Computational Research for Smarter Society in The Era of Industry 4.0, Proceedings

Conference

Conference: 3rd International Conference on Informatics and Computational Sciences, ICICOS 2019
Country/Territory: Indonesia
City: Semarang
Period: 29/10/19 to 30/10/19

Keywords

  • cross-validation
  • logistic regression
  • naïve bayes
  • neural networks
  • random forest
  • support vector classification
