TY - JOUR
T1 - Enabling Artificial Intelligence for Genome Sequence Analysis of COVID-19 and Alike Viruses
AU - Ahmed, Imran
AU - Jeon, Gwanggil
N1 - Publisher Copyright:
© 2021, International Association of Scientists in the Interdisciplinary Areas.
PY - 2022/6
Y1 - 2022/6
N2 - Recent pandemic of COVID-19 (Coronavirus) caused by severe acute respiratory syndrome Coronavirus 2 (SARS-CoV-2) has been growing lethally with unusual speed. It has infected millions of people and continues a mortifying influence on the global population’s health and well-being. In this situation, genome sequence analysis and advanced artificial intelligence techniques may help researchers and medical experts to understand the genetic variants of COVID-19 or SARS-CoV-2. Genome sequence analysis of COVID-19 is crucial to understand the virus’s origin, behavior, and structure, which might help produce/develop vaccines, antiviral drugs, and efficient preventive strategies. This paper introduces an artificial intelligence based system to perform genome sequence analysis of COVID-19 and alike viruses, e.g., SARS, middle east respiratory syndrome, and Ebola. The system helps to get important information from the genome sequences of different viruses. We perform comparative data analysis by extracting basic information of COVID-19 and other genome sequences, including information of nucleotides composition and their frequency, tri-nucleotide compositions, count of amino acids, alignment between genome sequences, and their DNA similarity information. We use different visualization methods to analyze these viruses’ genome sequences and, finally, apply machine learning based classifier support vector machine to classify different genome sequences. The data set of different virus genome sequences are obtained from an online publicly accessible data center repository. The system achieves good classification results with an accuracy of 97% for COVID-19, 96%, SARS, and 95% for MERS and Ebola genome sequences, respectively. Graphic Abstract: [Figure not available: see fulltext.]
AB - Recent pandemic of COVID-19 (Coronavirus) caused by severe acute respiratory syndrome Coronavirus 2 (SARS-CoV-2) has been growing lethally with unusual speed. It has infected millions of people and continues a mortifying influence on the global population’s health and well-being. In this situation, genome sequence analysis and advanced artificial intelligence techniques may help researchers and medical experts to understand the genetic variants of COVID-19 or SARS-CoV-2. Genome sequence analysis of COVID-19 is crucial to understand the virus’s origin, behavior, and structure, which might help produce/develop vaccines, antiviral drugs, and efficient preventive strategies. This paper introduces an artificial intelligence based system to perform genome sequence analysis of COVID-19 and alike viruses, e.g., SARS, middle east respiratory syndrome, and Ebola. The system helps to get important information from the genome sequences of different viruses. We perform comparative data analysis by extracting basic information of COVID-19 and other genome sequences, including information of nucleotides composition and their frequency, tri-nucleotide compositions, count of amino acids, alignment between genome sequences, and their DNA similarity information. We use different visualization methods to analyze these viruses’ genome sequences and, finally, apply machine learning based classifier support vector machine to classify different genome sequences. The data set of different virus genome sequences are obtained from an online publicly accessible data center repository. The system achieves good classification results with an accuracy of 97% for COVID-19, 96%, SARS, and 95% for MERS and Ebola genome sequences, respectively. Graphic Abstract: [Figure not available: see fulltext.]
KW - Artificial intelligence
KW - COVID-19
KW - Genome sequence analysis
KW - Machine learning
KW - SVM
UR - http://www.scopus.com/inward/record.url?scp=85112604900&partnerID=8YFLogxK
U2 - 10.1007/s12539-021-00465-0
DO - 10.1007/s12539-021-00465-0
M3 - Article
C2 - 34357528
AN - SCOPUS:85112604900
SN - 1913-2751
VL - 14
SP - 504
EP - 519
JO - Interdisciplinary sciences, computational life sciences
JF - Interdisciplinary sciences, computational life sciences
IS - 2
ER -