The implementation of hybrid clustering using fuzzy c-means and divisive algorithm for analyzing DNA human Papillomavirus cause of cervical cancer

Diyah Septi Andryani, Alhadi B., Dian Lestari

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Clustering groups patterns into sets called clusters. In this work, the distance matrix is computed from n-mer frequencies, which is considered more accurate than using DNA alignment. The clustering results can be used to discover biologically important sub-sections and groups of genes. Many clustering methods have been developed; hard clustering methods are considered less accurate than fuzzy clustering methods, especially on data that contain outliers. Among fuzzy clustering methods, fuzzy c-means is one of the best known for its accuracy and simplicity. Fuzzy c-means assigns each data point a membership value that expresses how strongly it belongs to each cluster, and it works by minimizing an objective function. The membership values are weighted by an exponent parameter, also called the fuzzifier. In this study we implement a hybrid clustering that combines fuzzy c-means with a divisive algorithm, which can improve the accuracy of cluster membership compared with a traditional partitional approach alone. Fuzzy c-means is used in the first step to obtain the partition. A divisive algorithm then runs in the second step to find sub-clusters and the dendrogram of the phylogenetic tree. The best number of clusters is the one that minimizes the Davies-Bouldin Index (DBI) of the clustering results. The results show that the method introduced in this paper performs better than other partitioning methods. We found 3 clusters with a DBI value of 1.126628 at the first step of clustering. Moreover, the DBI values after the second step are always smaller than those obtained with the first step alone. This indicates that the hybrid approach produces better clustering results in terms of DBI.
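The first step described in the abstract (n-mer frequency vectors followed by fuzzy c-means) can be illustrated with a minimal sketch. This is not the authors' implementation: the word length k=2, the fuzzifier m=2, the squared Euclidean distance, the function names, and the four toy sequences are all assumptions made for illustration, and the divisive second step and the DBI computation are left out.

```python
from itertools import product
import random


def kmer_frequencies(seq, k=2):
    """Relative frequency of every k-mer over the alphabet ACGT (k is an assumed choice)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # words with ambiguous bases such as N are skipped
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed metric, not stated in the abstract)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, iters=200, tol=1e-9, seed=0):
    """Standard fuzzy c-means: alternate center and membership updates."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Center j is the mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            tw = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / tw for d in range(dim)]
            )
        # Membership u_ij is proportional to d_ij^(-2/(m-1)); d2 holds squared distances.
        new_u = []
        for p in points:
            d2 = [max(sq_dist(p, ctr), 1e-18) for ctr in centers]
            inv = [d ** (-1.0 / (m - 1)) for d in d2]
            s = sum(inv)
            new_u.append([v / s for v in inv])
        shift = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:  # stop once memberships no longer change
            break
    return centers, u


# Toy sequences invented for this sketch: two AT-rich and two GC-rich.
sequences = ["ATATATATAT", "ATTATATAAT", "GCGCGCGCGC", "GGCGCGCCGC"]
points = [kmer_frequencies(s, k=2) for s in sequences]
centers, u = fuzzy_c_means(points, c=2)
labels = [row.index(max(row)) for row in u]  # hard label = cluster with largest membership
print(labels)
```

Each row of `u` sums to 1, so a sequence near a cluster boundary keeps noticeable membership in more than one cluster instead of being forced into a single group. In a hybrid scheme like the one described above, a divisive algorithm would then be run inside each resulting cluster.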

Original language: English
Title of host publication: Symposium on Biomathematics, SYMOMATH 2016
Editors: Beben Benyamin, Kasbawati
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735414938
Publication status: Published - 27 Mar 2017
Event: 4th International Symposium on Biomathematics, SYMOMATH 2016 - Makassar, Indonesia
Duration: 7 Oct 2016 to 9 Oct 2016

Publication series

Name: AIP Conference Proceedings
Volume: 1825
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 4th International Symposium on Biomathematics, SYMOMATH 2016
Country/Territory: Indonesia
City: Makassar
Period: 7/10/16 to 9/10/16
