TY - JOUR
T1 - Comparing random forest and support vector machines for breast cancer classification
AU - Aroef, Chelvian
AU - Rivan, Yuda
AU - Rustam, Zuherman
N1 - Funding Information:
This research was supported financially by the Ministry of Research and Higher Education of the Republic of Indonesia (KEMENRISTEKDIKTI) through the PTUPT 2020 research grant scheme, ID number 1621/UN2.R3.1/HKP.05.00/2019.
Publisher Copyright:
© 2020, Universitas Ahmad Dahlan.
PY - 2020/4/1
Y1 - 2020/4/1
N2 - There are more than 100 types of cancer, each with different symptoms, and the onset of cancer in a person is difficult to predict because it can appear suddenly and seemingly at random. However, cancer is generally marked by the growth of abnormal cells. A patient may be diagnosed early and treated quickly, but cancerous cells often remain hidden in the body and later reappear, which can be fatal. One of the most common cancers is breast cancer. According to the Ministry of Health, in 2018 breast cancer affected 42 out of every 100,000 people in Indonesia, with approximately 17 deaths. In addition, the Ministry recorded a yearly increase in cancer patients. Therefore, there is a clear need to identify those affected by this disease. This study applied Boruta feature selection to determine the most important features for building a machine learning model. Random Forest (RF) and Support Vector Machines (SVM) were the machine learning models used, with highest accuracies of 90% and 95%, respectively. Based on these results, SVM is a better model than RF in terms of accuracy.
AB - There are more than 100 types of cancer, each with different symptoms, and the onset of cancer in a person is difficult to predict because it can appear suddenly and seemingly at random. However, cancer is generally marked by the growth of abnormal cells. A patient may be diagnosed early and treated quickly, but cancerous cells often remain hidden in the body and later reappear, which can be fatal. One of the most common cancers is breast cancer. According to the Ministry of Health, in 2018 breast cancer affected 42 out of every 100,000 people in Indonesia, with approximately 17 deaths. In addition, the Ministry recorded a yearly increase in cancer patients. Therefore, there is a clear need to identify those affected by this disease. This study applied Boruta feature selection to determine the most important features for building a machine learning model. Random Forest (RF) and Support Vector Machines (SVM) were the machine learning models used, with highest accuracies of 90% and 95%, respectively. Based on these results, SVM is a better model than RF in terms of accuracy.
KW - Breast cancer
KW - Random forest
KW - Support vector machines
UR - http://www.scopus.com/inward/record.url?scp=85084037666&partnerID=8YFLogxK
U2 - 10.12928/TELKOMNIKA.V18I2.14785
DO - 10.12928/TELKOMNIKA.V18I2.14785
M3 - Article
AN - SCOPUS:85084037666
VL - 18
SP - 815
EP - 821
JO - Telkomnika
JF - Telkomnika
SN - 1693-6930
IS - 2
ER -