TY - JOUR
T1 - Higher entropy observed in SARS-CoV-2 genomes from the first COVID-19 wave in Pakistan
AU - Ghanchi, Najia Karim
AU - Nasir, Asghar
AU - Masood, Kiran Iqbal
AU - Abidi, Syed Hani
AU - Mahmood, Syed Faisal
AU - Kanji, Akbar
AU - Razzak, Safina
AU - Khan, Waqasuddin
AU - Shahid, Saba
AU - Yameen, Maliha
AU - Raza, Ali
AU - Ashraf, Javaria
AU - Ansar, Zeeshan
AU - Dharejo, Mohammad Buksh
AU - Islam, Nazneen
AU - Hasan, Zahra
AU - Hasan, Rumina
N1 - Publisher Copyright: © 2021 Ghanchi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2021/8
Y1 - 2021/8
N2 - Background: We investigated the genome diversity of SARS-CoV-2 associated with the early COVID-19 period to study the evolution of the virus in Pakistan. Materials and methods: We studied ninety SARS-CoV-2 strains isolated between March and October 2020. Whole genome sequences from our laboratory and available genomes were used to investigate phylogeny, genetic variation and mutation rates of SARS-CoV-2 strains in Pakistan. Site-specific entropy analysis compared mutation rates between strains isolated before and after June 2020. Results: In March, strains belonging to L, S, V and GH clades were observed but by October, only L and GH strains were present. The highest diversity of clades was present in Sindh and Islamabad Capital Territory and the least in Punjab province. Initial introductions of SARS-CoV-2 GH (B.1.255, B.1) and S (A) clades were associated with overseas travelers. Additionally, GH (B.1.255, B.1, B.1.160, B.1.36), L (B, B.6, B.4), V (B.4) and S (A) clades were transmitted locally. SARS-CoV-2 genomes clustered with global strains except for ten which matched Pakistani isolates. RNA substitution rates were estimated at 5.86 × 10^-4. The most frequent mutations were 5' UTR 241C > T, Spike glycoprotein D614G, RNA-dependent RNA polymerase (RdRp) P4715L and Orf3a Q57H. Strains up until June 2020 exhibited an overall higher mean and site-specific entropy as compared with sequences after June. Relative entropy was higher across GH as compared with GR and L clades. More sites were under selection pressure in GH strains but this was not significant for any particular site. Conclusions: The higher entropy and diversity observed in early pandemic strains as compared with later strains suggest increasing stability of the genomes in subsequent COVID-19 waves. This would likely lead to the selection of site-specific changes that are advantageous to the virus, as has been currently observed through the pandemic.
AB - Background: We investigated the genome diversity of SARS-CoV-2 associated with the early COVID-19 period to study the evolution of the virus in Pakistan. Materials and methods: We studied ninety SARS-CoV-2 strains isolated between March and October 2020. Whole genome sequences from our laboratory and available genomes were used to investigate phylogeny, genetic variation and mutation rates of SARS-CoV-2 strains in Pakistan. Site-specific entropy analysis compared mutation rates between strains isolated before and after June 2020. Results: In March, strains belonging to L, S, V and GH clades were observed but by October, only L and GH strains were present. The highest diversity of clades was present in Sindh and Islamabad Capital Territory and the least in Punjab province. Initial introductions of SARS-CoV-2 GH (B.1.255, B.1) and S (A) clades were associated with overseas travelers. Additionally, GH (B.1.255, B.1, B.1.160, B.1.36), L (B, B.6, B.4), V (B.4) and S (A) clades were transmitted locally. SARS-CoV-2 genomes clustered with global strains except for ten which matched Pakistani isolates. RNA substitution rates were estimated at 5.86 × 10^-4. The most frequent mutations were 5' UTR 241C > T, Spike glycoprotein D614G, RNA-dependent RNA polymerase (RdRp) P4715L and Orf3a Q57H. Strains up until June 2020 exhibited an overall higher mean and site-specific entropy as compared with sequences after June. Relative entropy was higher across GH as compared with GR and L clades. More sites were under selection pressure in GH strains but this was not significant for any particular site. Conclusions: The higher entropy and diversity observed in early pandemic strains as compared with later strains suggest increasing stability of the genomes in subsequent COVID-19 waves. This would likely lead to the selection of site-specific changes that are advantageous to the virus, as has been currently observed through the pandemic.
UR - http://www.scopus.com/inward/record.url?scp=85114336886&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0256451
DO - 10.1371/journal.pone.0256451
M3 - Article
C2 - 34464419
AN - SCOPUS:85114336886
SN - 1932-6203
VL - 16
JO - PLoS ONE
JF - PLoS ONE
IS - 8
M1 - e0256451
ER -