TY - GEN
T1 - Implementation of spectral clustering on microarray data of carcinoma using k -means algorithm
AU - Frisca,
AU - B., Alhadi
AU - Siswantining, Titin
N1 - Publisher Copyright:
© 2017 Author(s).
PY - 2017/3/27
Y1 - 2017/3/27
N2 - Clustering is one of data analysis methods that aims to classify data which have similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, spectral clustering method emerged from the concepts of spectral graph theory. Spectral clustering method needs partitioning algorithm. There are some partitioning methods including PAM, SOM, Fuzzy c-means, and k-means. Based on the research that has been done by Capital and Choudhury in 2013, when using Euclidian distance k-means algorithm provide better accuracy than PAM algorithm. So in this paper we use k-means as our partition algorithm. The major advantage of spectral clustering is in reducing data dimension, especially in this case to reduce the dimension of large microarray dataset. Microarray data is a small-sized chip made of a glass plate containing thousands and even tens of thousands kinds of genes in the DNA fragments derived from doubling cDNA. Application of microarray data is widely used to detect cancer, for the example is carcinoma, in which cancer cells express the abnormalities in his genes. The purpose of this research is to classify the data that have high similarity in the same group and the data that have low similarity in the others. In this research, Carcinoma microarray data using 7457 genes. The result of partitioning using k-means algorithm is two clusters.
AB - Clustering is one of data analysis methods that aims to classify data which have similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, spectral clustering method emerged from the concepts of spectral graph theory. Spectral clustering method needs partitioning algorithm. There are some partitioning methods including PAM, SOM, Fuzzy c-means, and k-means. Based on the research that has been done by Capital and Choudhury in 2013, when using Euclidian distance k-means algorithm provide better accuracy than PAM algorithm. So in this paper we use k-means as our partition algorithm. The major advantage of spectral clustering is in reducing data dimension, especially in this case to reduce the dimension of large microarray dataset. Microarray data is a small-sized chip made of a glass plate containing thousands and even tens of thousands kinds of genes in the DNA fragments derived from doubling cDNA. Application of microarray data is widely used to detect cancer, for the example is carcinoma, in which cancer cells express the abnormalities in his genes. The purpose of this research is to classify the data that have high similarity in the same group and the data that have low similarity in the others. In this research, Carcinoma microarray data using 7457 genes. The result of partitioning using k-means algorithm is two clusters.
UR - http://www.scopus.com/inward/record.url?scp=85017605824&partnerID=8YFLogxK
U2 - 10.1063/1.4978977
DO - 10.1063/1.4978977
M3 - Conference contribution
AN - SCOPUS:85017605824
T3 - AIP Conference Proceedings
BT - Symposium on Biomathematics, SYMOMATH 2016
A2 - Benyamin, Beben
A2 - Kasbawati, null
PB - American Institute of Physics Inc.
T2 - 4th International Symposium on Biomathematics, SYMOMATH 2016
Y2 - 7 October 2016 through 9 October 2016
ER -