Multiclass classification of breast cancer large scale datasets for detecting cancer drivers

A. R. Bagasta, Z. Rustam

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Over the past decade, scientists have found that even healthy genes can cause cancer as a result of hormonal growth disorders. Multiclass classification in data mining has recently become an important research topic, especially in the health sector. Classification of cancer cells is also important for almost all types of cancer; here we focus on breast cancer. Studying multiclass classification is therefore crucial for experts diagnosing cancer. Because datasets on breast cancer cell types are plentiful and large, the method used to process them must be as efficient as possible. Building on big data technologies, this study proposes a feature selection step for high-dimensional classification problems and for datasets with dozens of features. Multiclass classification supports the adoption of big data solutions in such a study. These machine learning techniques analyze a breast mass from the digitized image of a fine needle aspirate (FNA), which describes characteristics of the cell nuclei present in breast cancer. Datasets covering various classifications of breast mass are investigated further to determine their active role in cancer. In particular, this research aims to identify and analyze the ability of the Support Vector Machine (SVM) as a classification method and ReliefF-based feature selection as a selection method for diagnosing breast cancer drivers. This approach could be an efficient method for cancer classification, with an accuracy of 91%.
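
The abstract names ReliefF-based feature selection but does not describe how it was implemented. As a minimal sketch only, assuming a common multiclass ReliefF formulation (nearest hits lower a feature's weight, nearest misses from each other class raise it in proportion to that class's prior), feature scoring might look like the following. The function name `relieff`, the parameter `k`, and the toy data are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def relieff(X, y, k=2):
    """Return one ReliefF-style weight per feature (higher = more relevant).

    Illustrative sketch: every instance is used once (no random sampling),
    and differences are scaled by each feature's observed range.
    """
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(d))

    prior = {c: count / n for c, count in Counter(y).items()}
    weights = [0.0] * d

    for i in range(n):
        for c in prior:
            # Up to k nearest neighbours of X[i] inside class c (excluding itself).
            candidates = [m for m in range(n) if m != i and y[m] == c]
            near = sorted(candidates, key=lambda m: dist(X[i], X[m]))[:k]
            if not near:
                continue
            if c == y[i]:
                scale = -1.0  # nearest hits: penalise features that differ
            else:
                # nearest misses: reward features that differ, weighted by prior
                scale = prior[c] / (1.0 - prior[y[i]])
            for j in range(d):
                mean_diff = sum(diff(j, X[i], X[m]) for m in near) / len(near)
                weights[j] += scale * mean_diff / n
    return weights


# Hypothetical toy data: feature 0 separates the three classes,
# feature 1 takes the same values in every class (uninformative).
X = [
    [0.0, 0.0], [0.1, 0.5], [0.2, 1.0],
    [1.0, 0.5], [1.1, 1.0], [1.2, 0.0],
    [2.0, 1.0], [2.1, 0.0], [2.2, 0.5],
]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

weights = relieff(X, y, k=2)
# Feature indices ordered from most to least relevant.
ranking = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
print(weights, ranking)
```

In a pipeline of the kind the abstract outlines, the top-ranked features would then be passed to a classifier such as an SVM; that step is omitted here because the abstract gives no kernel or parameter details.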

Original language: English
Title of host publication: Proceedings of the 4th International Symposium on Current Progress in Mathematics and Sciences, ISCPMS 2018
Editors: Terry Mart, Djoko Triyono, Ivandini T. Anggraningrum
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735419155
DOI: https://doi.org/10.1063/1.5132478
Publication status: Published - 4 Nov 2019
Event: 4th International Symposium on Current Progress in Mathematics and Sciences 2018, ISCPMS 2018 - Depok, Indonesia
Duration: 30 Oct 2018 to 31 Oct 2018

Publication series

Name: AIP Conference Proceedings
Volume: 2168
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 4th International Symposium on Current Progress in Mathematics and Sciences 2018, ISCPMS 2018
Country: Indonesia
City: Depok
Period: 30/10/18 to 31/10/18

Keywords

  • discovery in cancer
  • feature selection
  • Hormonal growth disorder
  • Relief Feature
  • SVM

Cite this

Bagasta, A. R., & Rustam, Z. (2019). Multiclass classification of breast cancer large scale datasets for detecting cancer drivers. In T. Mart, D. Triyono, & I. T. Anggraningrum (Eds.), Proceedings of the 4th International Symposium on Current Progress in Mathematics and Sciences, ISCPMS 2018 [020051] (AIP Conference Proceedings; Vol. 2168). American Institute of Physics Inc. https://doi.org/10.1063/1.5132478