
Observed strong pervasive positive selection in the N-terminal domain, receptor-binding domain and furin-cleavage sites of SARS-CoV-2 Spike protein sampled from Zimbabwean COVID-19 patients.

Kambarami, M. S.; Manasa, J.; Mushiri, T.

2022-04-28 infectious diseases
10.1101/2022.04.27.22274357 medRxiv

Mutations, primarily in the Spike (S) gene, resulted in the emergence of many SARS-CoV-2 variants such as Alpha, Beta, Delta and Omicron. These variants have also caused a number of COVID-19 pandemic waves, which have impacted human lives in different ways because of the restriction measures put in place to curb the spread of the virus. In this study, evolutionary patterns found in SARS-CoV-2 sequences of samples collected from Zimbabwean COVID-19 patients were investigated. High-coverage SARS-CoV-2 whole genome sequences were downloaded from the GISAID database along with the GISAID S gene reference sequence. The Biopython, NumPy and Pandas packages were used to load, slice and clean the whole genome sequences, outputting a FASTA file with approximate Spike (S) gene sequences. The sliced dataset was aligned with the GISAID reference sequence using Jalview 2.11.1.3 to find the exact sequences of the SARS-CoV-2 S gene. Evidence of recombination signals was investigated using RDP 4.1, and pervasive selection in the S gene was investigated using the FUBAR algorithm hosted on the Datamonkey webserver. The Matplotlib and Seaborn Python packages were used for data visualisation. A plot of the Bayes factor for the hypothesis that the non-synonymous substitution rate exceeds the synonymous substitution rate (β > α) at S protein sites showed 3 peaks with evidence of strong divergence. These 3 diverging S protein sites were found to be D142G, D614G and P681R. No evidence of recombination was detected by 9 methods of RDP, which use different approaches to detect recombination signals. This study is useful in guiding drug, vaccine and diagnostic innovations toward better control of the pandemic. Additionally, this study can guide other non-biological interventions as we better understand the changes in various viral characteristics driven by the observed evolutionary patterns.
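
The slicing step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it uses only the Python standard library instead of Biopython, NumPy and Pandas. The S gene coordinates are those of the Wuhan-Hu-1 reference genome (NC_045512.2). The padding margin, the N-content threshold and all function names are assumptions introduced here.

```python
from typing import Dict, Iterable, Iterator, Tuple

# S gene coordinates on the Wuhan-Hu-1 reference (NC_045512.2), 1-based inclusive.
# Assumption: the study's actual slicing window is not given in the abstract.
S_START, S_END = 21563, 25384
PADDING = 200  # hypothetical margin to absorb indels before exact alignment


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def slice_approx_spike(genome: str, padding: int = PADDING) -> str:
    """Return the approximate S gene region, widened by `padding` on each side."""
    start = max(S_START - 1 - padding, 0)  # convert to 0-based index
    end = min(S_END + padding, len(genome))
    return genome[start:end]


def slice_records(lines: Iterable[str], padding: int = PADDING,
                  max_n_fraction: float = 0.01) -> Dict[str, str]:
    """Slice every genome and drop regions with too many ambiguous (N) bases."""
    sliced = {}
    for header, seq in read_fasta(lines):
        region = slice_approx_spike(seq.upper(), padding)
        if region and region.count("N") / len(region) <= max_n_fraction:
            sliced[header] = region
    return sliced
```

The padded slices are only approximate, which is why the abstract follows this step with an alignment against the reference S gene to recover exact boundaries.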

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Infection, Genetics and Evolution (43 papers in training set; Top 0.1%; 22.1%)
2. Scientific Reports (3102 papers in training set; Top 10%; 8.3%)
3. PLOS ONE (4510 papers in training set; Top 26%; 6.7%)
4. Viruses (318 papers in training set; Top 0.7%; 6.7%)
5. Journal of Medical Virology (137 papers in training set; Top 0.6%; 4.8%)
6. Frontiers in Microbiology (375 papers in training set; Top 3%; 3.5%)
50% of probability mass above
7. Archives of Virology (14 papers in training set; Top 0.1%; 3.5%)
8. Heliyon (146 papers in training set; Top 0.8%; 2.6%)
9. Frontiers in Genetics (197 papers in training set; Top 3%; 2.3%)
10. PeerJ (261 papers in training set; Top 5%; 2.0%)
11. Frontiers in Cellular and Infection Microbiology (98 papers in training set; Top 2%; 2.0%)
12. Virus Research (36 papers in training set; Top 0.4%; 2.0%)
13. Pathogens (53 papers in training set; Top 0.5%; 1.8%)
14. Gene Reports (13 papers in training set; Top 0.3%; 1.7%)
15. Frontiers in Medicine (113 papers in training set; Top 4%; 1.6%)
16. Access Microbiology (22 papers in training set; Top 0.3%; 1.5%)
17. Journal of General Virology (46 papers in training set; Top 0.5%; 1.5%)
18. Computers in Biology and Medicine (120 papers in training set; Top 3%; 1.3%)
19. F1000Research (79 papers in training set; Top 3%; 1.2%)
20. Wellcome Open Research (57 papers in training set; Top 1%; 1.2%)
21. Microbiology Spectrum (435 papers in training set; Top 5%; 0.8%)
22. Frontiers in Public Health (140 papers in training set; Top 8%; 0.7%)
23. Virology (56 papers in training set; Top 0.8%; 0.7%)
24. Genomics (60 papers in training set; Top 3%; 0.7%)
25. Gene (41 papers in training set; Top 2%; 0.7%)
26. Journal of Clinical Virology (62 papers in training set; Top 0.9%; 0.7%)
27. Life Science Alliance (263 papers in training set; Top 2%; 0.7%)
28. PLOS Pathogens (721 papers in training set; Top 10%; 0.6%)