TY - GEN
T1 - Triclustering analysis using extended dimension iterative signature algorithm (EDISA) on lung disease gene expression data
AU - Apriana, Dwi Aji
AU - Siswantining, Titin
AU - Sarwinda, Devvi
AU - Soemartojo, Saskya Mary
N1 - Funding Information:
This research is supported by the PUTI Proceedings Research Grant from Universitas Indonesia: NKB-962/UN.2RST/HKP.05.00/2020. The authors are thankful to all parties involved in the process of writing this paper.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/10/6
Y1 - 2020/10/6
AB - Triclustering has been applied to three-dimensional gene expression data (gene, condition, and time) to group the dataset into sub-matrices whose elements are similar. One triclustering algorithm is the Extended Dimension Iterative Signature Algorithm (EDISA). It uses the Pearson distance between each gene or condition and the mean vector as a measure of similarity. The core of EDISA is an iterative process that deletes every gene and condition whose Pearson distance to the mean vector exceeds a certain threshold. This distance measures how similar a gene or condition is to the mean of the tricluster candidate. EDISA was applied to lung disease gene expression data in several scenarios with different thresholds. The higher the threshold for genes and conditions, the more genes and conditions the resulting triclusters contain. The scenarios were also evaluated with the Tricluster Diffusion (TD) Score, and the scenario with the smallest TD Score was taken as the best. Applied to lung disease data, the algorithm generates triclusters that can detect genes distinguishing patients with lung disease from healthy patients.
KW - EDISA
KW - Gene Expression Data
KW - Pearson Distance
KW - Threshold Value
KW - Tricluster Diffusion Score
UR - http://www.scopus.com/inward/record.url?scp=85112620399&partnerID=8YFLogxK
U2 - 10.1109/IBIOMED50285.2020.9487606
DO - 10.1109/IBIOMED50285.2020.9487606
M3 - Conference contribution
AN - SCOPUS:85112620399
T3 - IBIOMED 2020 - Proceedings of the 37th International Conference on Biomedical Engineering
SP - 7
EP - 12
BT - IBIOMED 2020 - Proceedings of the 37th International Conference on Biomedical Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 37th International Conference on Biomedical Engineering, IBIOMED 2020
Y2 - 6 October 2020 through 8 October 2020
ER -