Abstract
This research aims to identify strongly correlated genes and conditions in diabetes mellitus gene expression data from obese and lean people using three-phase biclustering. In the first phase, Singular Value Decomposition (SVD) decomposes the gene expression matrix into two global matrices, one gene-based and one condition-based. In the second phase, Partitioning Around Medoids (PAM) with Euclidean distance clusters the gene-based and condition-based matrices, forming several candidate biclusters. In the third phase, these biclusters are evaluated with the Modified Lift Algorithm (MLA), which is based on Pearson correlation and is well suited to detecting additive-multiplicative biclusters. All steps are run in the open-source R software. The resulting biclusters show strong correlation among genes and samples, which suggests that the method has high potential for future medical research.
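To illustrate why a Pearson-based criterion suits additive-multiplicative biclusters, the following minimal sketch scores a candidate bicluster by the mean absolute pairwise Pearson correlation of its gene rows. It is written in Python rather than the R used in the study, the data and the function names (`pearson`, `mean_abs_correlation`) are hypothetical, and the score is a simple stand-in for illustration, not the Modified Lift Algorithm itself.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_correlation(rows):
    """Average |r| over all pairs of rows (genes) in a candidate bicluster."""
    pairs = list(combinations(rows, 2))
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


# Hypothetical expression profile across five conditions (made-up values).
base = [1.0, 3.0, 2.0, 5.0, 4.0]

# Additive-multiplicative pattern: each gene is scale * base + shift.
coherent = [[scale * v + shift for v in base]
            for scale, shift in [(1.0, 0.0), (2.0, 1.0), (0.5, -3.0)]]

# A gene whose profile does not follow the shared pattern.
unrelated = [4.0, 1.0, 5.0, 2.0, 3.0]

score_coherent = mean_abs_correlation(coherent)
score_mixed = mean_abs_correlation(coherent + [unrelated])

print(score_coherent, score_mixed)
```

Because Pearson correlation is invariant to positive scaling and shifting, the three scaled and shifted genes score as perfectly correlated even though their raw values are far apart in Euclidean distance, and adding the unrelated gene lowers the score.
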
Original language | English |
---|---|
Pages (from-to) | 326-343 |
Number of pages | 18 |
Journal | International Journal of Data Mining and Bioinformatics |
Volume | 24 |
Issue number | 4 |
Publication status | Published - 2020 |
Keywords
- Correlated bicluster
- Diabetes mellitus
- Microarray data
- MLA
- Modified lift algorithm
- R software