Subtype of cancer identification for patient survival prediction using semi supervised method

Ito Wasito, Ionia Veritawati

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


A number of techniques for cancer subtype identification exist in the literature. However, most of them assume that the different subtypes of cancer are already known. Methods for discovering such subtypes do exist, but they work well only on specific datasets. It would therefore be desirable to develop a procedure, applicable in a wide variety of circumstances, that finds such subtypes from a small set of genes relevant to the clinical data. The identified subtypes would in turn be useful for accurately predicting future patient survival. We used experimental data from [13], consisting of 1,536 genes measured in 100 colorectal carcinoma tissues and 11 normal tissues. First, we identify the relevant genes, those correlated with patient survival time. Genes are selected using Cox regression, and only those with a p-value below 0.01 are retained for further analysis. By our computation, the 63 best genes were identified for patient survival prediction. Then, 2-means clustering is applied to those 63 genes to find patient subgroups. Once the subgroups are identified, we apply Support Vector Machines (SVM) to classify future patients into the appropriate subgroup for survival prediction. For the tumour clinical data, we successfully identify 2 subgroups of patients with significant p-values based on Kaplan-Meier graphs. For the metastasis clinical data, we also discover 2 subgroups for each group. Even though no prior subtype information exists, we are still able to predict the survival time of cancer patients using a combination of unsupervised and supervised methods, referred to as a "semi supervised" method. The results show that the proposed method successfully unveils subgroups across various colorectal carcinoma parameters. The only partial success is on the lymph node parameter, where the proposed method identifies different survival times with a significant p-value for lymphnode-0 and lymphnode-1.
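The first two steps of the pipeline described in the abstract (screening genes against survival at p < 0.01, then splitting patients with 2-means) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-gene p-value comes from a univariate Cox score test, used here as a simple stand-in because the abstract does not say which Cox regression test was applied; the 2-means starting centres are an arbitrary deterministic choice; the function names and the toy data are hypothetical.

```python
import math


def cox_score_pvalue(x, time, event):
    """P-value of the score test (beta = 0) for one covariate in a Cox model.

    Assumption: a stand-in for the paper's Cox regression test.
    Tied event times are handled with the Breslow convention.
    """
    n = len(x)
    u = 0.0  # score: sum over events of (x_i - risk-set mean)
    v = 0.0  # information: sum over events of the risk-set variance
    for i in range(n):
        if not event[i]:
            continue  # censored patients contribute only through risk sets
        risk = [x[j] for j in range(n) if time[j] >= time[i]]
        mean = sum(risk) / len(risk)
        u += x[i] - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    if v <= 0.0:
        return 1.0  # the covariate carries no information
    # Under the null, u^2 / v is chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(u * u / v / 2.0))


def select_genes(expr, time, event, alpha=0.01):
    """Indices of genes (columns of expr) with score-test p-value below alpha."""
    return [g for g in range(len(expr[0]))
            if cox_score_pvalue([row[g] for row in expr], time, event) < alpha]


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def two_means(rows, max_iter=100):
    """Lloyd's algorithm with k = 2; returns one label (0 or 1) per row."""
    # Assumed initialisation: the first row and the row farthest from it.
    centers = [list(rows[0]),
               list(max(rows, key=lambda r: sqdist(r, rows[0])))]
    labels = None
    for _ in range(max_iter):
        new_labels = [0 if sqdist(r, centers[0]) <= sqdist(r, centers[1]) else 1
                      for r in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data, made up for illustration: 10 patients, 2 genes, no censoring.
time = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
event = [1] * 10
expr = [[10 - i, 3] for i in range(10)]  # gene 0 tracks survival; gene 1 is constant
selected = select_genes(expr, time, event)
subgroups = two_means([[row[g] for g in selected] for row in expr])
```

The later steps in the abstract are not covered by this sketch: the resulting subgroup labels would serve as training targets for a classifier such as an SVM, and the separation between subgroups would be checked on Kaplan-Meier curves.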

Original language: English
Pages (from-to): 215-222
Number of pages: 8
Journal: Journal of Convergence Information Technology
Issue number: 14
Publication status: Published - Aug 2012


  • Cancer subtype
  • Cox regression
  • K-Means clustering
  • Patient survival
  • Semi supervised

