Evidence for ZAP-independent CpG reduction in SARS-CoV-2 genome, and pangolin coronavirus origin of 5'UTR

Afrasiabi, A.; Alinejad-Rokny, H.; Lovell, N.; Xu, Z.; Ebrahimi, D.

2020-10-24 genomics
10.1101/2020.10.23.351353 bioRxiv

SARS-CoV-2, the causative agent of COVID-19, has an RNA genome that is, overall, closely related to the bat coronavirus sequence RaTG13. However, the ACE2-binding domain of this virus is more similar to that of a coronavirus isolated from a Guangdong pangolin. In addition to this unique feature, the genome of SARS-CoV-2 (and of its closely related coronaviruses) has a low CpG content. This has been postulated to be the signature of an evolutionary pressure exerted by the host antiviral protein ZAP. Here, we analyzed the sequences of a wide range of viruses using both alignment-based and alignment-free approaches to investigate the origin of the SARS-CoV-2 genome. Our analyses revealed a high level of similarity between the 5'UTR of SARS-CoV-2 and that of the Guangdong pangolin coronavirus. This suggests that bat and pangolin coronaviruses might have recombined at least twice (in the 5'UTR and the ACE2-binding region) to seed the formation of SARS-CoV-2. An alternative hypothesis is that the lineage preceding SARS-CoV-2 is a yet-to-be-sampled bat coronavirus whose ACE2-binding domain and 5'UTR are distinct from those of other known bat coronaviruses. Additionally, we performed a detailed analysis of viral genome compositions, as well as expression and RNA-binding data of ZAP, to show that the low CpG abundance in SARS-CoV-2 is not related to an evolutionary pressure from ZAP.
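For readers unfamiliar with the term, "low CpG content" is commonly quantified as an observed/expected ratio: the frequency of the CG dinucleotide divided by the product of the C and G mononucleotide frequencies. The sketch below illustrates that generic measure only. It is not taken from the preprint, the function name `cpg_obs_exp` is hypothetical, and the authors' actual composition analysis may differ.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G)).

    Illustrative helper (assumed name); values well below 1 indicate
    CpG depletion relative to the base composition.
    """
    # Treat RNA and DNA alphabets alike.
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    counts = Counter(seq)
    c, g = counts["C"], counts["G"]
    if c == 0 or g == 0:
        # No C or no G means no CpG is possible; report 0 by convention.
        return 0.0

    # Count CG dinucleotides over the n - 1 overlapping positions.
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return (cpg / (n - 1)) / ((c / n) * (g / n))
```

Applied to a full genome sequence read from a FASTA file, a ratio well below 1 would correspond to the kind of CpG depletion the abstract describes.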

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1 | Journal of Virology | 456 papers in training set | Top 0.6% | 10.1%
2 | Virus Research | 36 papers in training set | Top 0.1% | 10.1%
3 | Frontiers in Genetics | 197 papers in training set | Top 0.3% | 10.1%
4 | Viruses | 318 papers in training set | Top 0.5% | 8.4%
5 | Scientific Reports | 3102 papers in training set | Top 14% | 6.8%
6 | PLOS ONE | 4510 papers in training set | Top 32% | 4.9%
50% of probability mass above
7 | BMC Genomics | 328 papers in training set | Top 0.8% | 3.7%
8 | PLOS Pathogens | 721 papers in training set | Top 4% | 3.6%
9 | Frontiers in Microbiology | 375 papers in training set | Top 4% | 2.6%
10 | Archives of Virology | 14 papers in training set | Top 0.2% | 2.1%
11 | Genes | 126 papers in training set | Top 0.6% | 2.1%
12 | Virus Evolution | 140 papers in training set | Top 0.8% | 1.7%
13 | Heliyon | 146 papers in training set | Top 2% | 1.7%
14 | Gene Reports | 13 papers in training set | Top 0.3% | 1.7%
15 | Genomics | 60 papers in training set | Top 1% | 1.7%
16 | Communications Biology | 886 papers in training set | Top 14% | 1.2%
17 | Journal of Medical Virology | 137 papers in training set | Top 3% | 1.1%
18 | mSystems | 361 papers in training set | Top 6% | 0.9%
19 | Virology | 56 papers in training set | Top 0.6% | 0.9%
20 | iScience | 1063 papers in training set | Top 27% | 0.9%
21 | Gene | 41 papers in training set | Top 2% | 0.9%
22 | PeerJ | 261 papers in training set | Top 12% | 0.9%
23 | International Journal of Molecular Sciences | 453 papers in training set | Top 14% | 0.8%
24 | Antiviral Research | 49 papers in training set | Top 0.4% | 0.8%
25 | F1000Research | 79 papers in training set | Top 4% | 0.8%
26 | Genome Biology and Evolution | 280 papers in training set | Top 2% | 0.7%
27 | Biology | 43 papers in training set | Top 3% | 0.7%
28 | mBio | 750 papers in training set | Top 11% | 0.7%
29 | PLOS Genetics | 756 papers in training set | Top 16% | 0.7%
30 | Microbiology Spectrum | 435 papers in training set | Top 6% | 0.6%