Analyzing Kinship in Severe Acute Respiratory Syndrome Coronavirus 2 DNA Sequences Based on Hierarchical and K-Means Clustering Methods Using Multiple Encoding Vector

Evander Banjarnahor, Alhadi Bustamam, Titin Siswantining, Patuan Tampubolon

Research output: Contribution to journal, Article, peer-review

2 Citations (Scopus)

Abstract

According to World Health Organization data from mid-April 2021, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) had infected more than 134.9 million people worldwide. The virus attacks the human respiratory system and can cause lung infections and even death. More than 2.9 million people worldwide have died from the infection. In Indonesia, more than 1.5 million people have been infected and about 42.5 thousand have died. Given these figures, a kinship analysis of the coronavirus is important for reducing its spread. The kinship of the virus and its spread can be identified by constructing a phylogenetic tree and by clustering. This study uses the Multiple Encoding Vector method to analyze the sequences and the Euclidean distance to build the distance matrix. It then uses hierarchical clustering to determine the number of initial centroids, which the K-Means clustering method subsequently uses to analyze kinship in the SARS-CoV-2 DNA sequences. The samples are SARS-CoV-2 DNA sequences taken from several infected countries. The simulation results indicate that the ancestors of SARS-CoV-2 came from China and that the closest ancestors of the virus found in Indonesia came from India. The SARS-CoV-2 DNA sequences form nine clusters, of which the sixth has the most members.
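
The pipeline described in the abstract (encode each sequence as a vector, measure Euclidean distances, cluster hierarchically, then seed K-Means from the resulting groups) can be illustrated with a minimal sketch. Several parts are assumptions rather than the authors' method: the encoding is a plain normalized 2-mer frequency vector standing in for the Multiple Encoding Vector, the linkage is average linkage, the number of groups is fixed by hand, and the four sequences are made-up toy strings, not SARS-CoV-2 data. All function names are hypothetical.

```python
from itertools import product
from math import dist  # Euclidean distance between two points (Python 3.8+)


def kmer_vector(seq, k=2):
    """Stand-in encoding (assumption): normalized k-mer frequencies.

    This is NOT the paper's Multiple Encoding Vector; it only plays
    the same role of turning a sequence into a numeric vector.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub in counts:
            counts[sub] += 1
    total = sum(counts.values()) or 1
    return [counts[km] / total for km in kmers]


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def hierarchical_groups(vectors, n_clusters):
    """Agglomerative clustering with average linkage (assumed linkage).

    Returns a list of groups, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pair_sum = sum(dist(vectors[i], vectors[j])
                               for i in clusters[a] for j in clusters[b])
                d = pair_sum / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


def kmeans(vectors, initial_centroids, max_iter=100):
    """Lloyd's K-Means iteration started from the given centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda c: dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


# Hypothetical toy sequences, for illustration only.
seqs = ["ACACACACAC", "ACACACACAT", "GGGGGTGGGG", "GGGGGGGGGT"]
vectors = [kmer_vector(s) for s in seqs]

# Hierarchical step supplies the groups whose means seed K-Means.
groups = hierarchical_groups(vectors, n_clusters=2)
initial = [mean([vectors[i] for i in g]) for g in groups]
labels, centroids = kmeans(vectors, initial)
```

Seeding K-Means from hierarchical groups removes its dependence on random initial centroids, which is presumably why the two methods are combined. How the cut level (nine clusters in the study) is chosen is not stated in the abstract, so the sketch takes the number of groups as an input.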

Original language: English
Pages (from-to): 2237-2247
Number of pages: 11
Journal: International Journal on Advanced Science, Engineering and Information Technology
Volume: 12
Issue number: 6
Publication status: Published - 2022

Keywords

  • Bioinformatics
  • Clustering
  • DNA kinship
  • Phylogenetic analysis
  • Sequence alignment
