TY - GEN
T1 - Integration of Bagging and greedy forward selection on image pap smear classification using Naïve Bayes
AU - Riana, Dwiza
AU - Hidayanto, Achmad Nizar
AU - Fitriyani
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/10/27
Y1 - 2017/10/27
AB - The Herlev dataset consists of seven cervical cell classes: superficial squamous, intermediate squamous, columnar, mild dysplasia, moderate dysplasia, severe dysplasia, and carcinoma in situ. In this study the dataset is tested on two classification tasks: a two-class task that separates normal from abnormal cells, and a seven-class task in which the seven cell types are treated as three normal and four abnormal classes. Classifying the dataset into seven classes remains difficult. The class sizes of this Pap smear image dataset are unequal and imbalanced, and some features are suspected to be irrelevant, which makes classification difficult, especially for the abnormal classes. To handle the class imbalance, this study used an ensemble method (Bagging). To handle features that make no contribution, we applied Greedy Forward Selection for feature selection. Naïve Bayes was used as the learning algorithm. The highest accuracy for the two-class (normal versus abnormal) classification, 92.15%, was obtained using the Naïve Bayes model with Greedy Forward Selection. For the seven-class classification, the Naïve Bayes model with Greedy Forward Selection and Bagging reached 63.25%, which is fairly good but still needs improvement.
KW - Bagging
KW - Naïve Bayes
KW - Pap smear images
KW - classification
KW - feature selection
UR - http://www.scopus.com/inward/record.url?scp=85040225381&partnerID=8YFLogxK
U2 - 10.1109/CITSM.2017.8089320
DO - 10.1109/CITSM.2017.8089320
M3 - Conference contribution
AN - SCOPUS:85040225381
T3 - 2017 5th International Conference on Cyber and IT Service Management, CITSM 2017
BT - 2017 5th International Conference on Cyber and IT Service Management, CITSM 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th International Conference on Cyber and IT Service Management, CITSM 2017
Y2 - 8 August 2017 through 10 August 2017
ER -