
Comparative genomic analysis of Mycobacterium tuberculosis reveals evolution and genomic instability within Uganda I sub-lineage

Kanyerezi, S.; Nabisubi, P.

2020-10-25 bioinformatics
10.1101/2020.10.24.353425 bioRxiv

Introduction: Tuberculosis (TB) is the leading cause of morbidity and mortality among infectious diseases globally, responsible for an estimated 10.0 million new cases and 1.3 million deaths annually, with Africa contributing a quarter of these cases in 2019. Classification of Mycobacterium tuberculosis (MTB) strains is important for understanding their geographical predominance and pathogenicity. Previous studies have classified MTB using several methods, including RFLP, spoligotyping, MIRU-VNTR and SNP set based phylogeny. The SNP set based classification has been found to agree with the region of difference (RD) analysis used in the MTB complex classification system. In Uganda, the most common cause of pulmonary tuberculosis (PTB) is the Uganda genotype of MTB, which accounts for up to 70% of isolates.

Methods: Sequenced MTB genomes were retrieved from NCBI and from local sequencing projects. The genomes were processed with snippy (a rapid haploid variant calling and core genome alignment tool) to call and annotate variants. The snippy outputs were used to classify the isolates into Uganda genotypes and non-Ugandan genotypes based on a 62 SNP set. The Ugandan genotype isolates were then analysed with a 413 SNP set, followed by a pan-genome-wide association analysis.

Results: Six Uganda genotype isolates did not classify as either Uganda I or Uganda II genotypes based on the 62 SNP set. Using the 413 SNP set, these six isolates were found to carry only one of the 7 SNPs that classify the Uganda I genotypes. They also carried both missense and frameshift mutations within the ctpH gene, whereas the other Uganda I isolates with a mutation in this gene carried only a missense mutation.

Conclusion: Among the Uganda genotype genomes, Uganda I genomes are unstable.
We used publicly available datasets to perform analyses such as mapping, variant calling, mixed-infection analysis and pan-genome analysis to investigate and compare the evolution of the Ugandan genotype.
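
The SNP set based classification step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker positions and alleles below are hypothetical placeholders, not the published 62 or 413 SNP sets, and the input is assumed to be tab-separated variant text with POS and ALT columns. An isolate is assigned to a genotype only if it carries all markers for that genotype; otherwise it is reported as unclassified, together with the number of markers it matched.

```python
import csv
import io


def read_variants(tab_text):
    """Parse tab-separated variant text (assumed to have POS and ALT
    columns) into a set of (position, alternate_allele) pairs."""
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    return {(int(row["POS"]), row["ALT"]) for row in reader}


def marker_matches(variants, marker_sets):
    """For each genotype label, count how many of its markers the isolate carries."""
    return {
        label: (len(markers & variants), len(markers))
        for label, markers in marker_sets.items()
    }


def classify(variants, marker_sets):
    """Return the first label whose markers are all present, else 'unclassified'."""
    for label, (found, total) in marker_matches(variants, marker_sets).items():
        if total > 0 and found == total:
            return label
    return "unclassified"


# Hypothetical marker sets, for illustration only.
MARKERS = {
    "Uganda I": {(1000, "A"), (2000, "T")},
    "Uganda II": {(3000, "G"), (4000, "C")},
}

# Hypothetical isolates: one carries both Uganda I markers, one carries only one.
full_isolate = "CHROM\tPOS\tREF\tALT\nref\t1000\tG\tA\nref\t2000\tC\tT\n"
partial_isolate = "CHROM\tPOS\tREF\tALT\nref\t1000\tG\tA\nref\t5000\tC\tG\n"

full_call = classify(read_variants(full_isolate), MARKERS)
partial_call = classify(read_variants(partial_isolate), MARKERS)
partial_counts = marker_matches(read_variants(partial_isolate), MARKERS)
```

Reporting the per-genotype match counts alongside the call makes partial matches visible, such as an isolate carrying only some of the markers for a genotype.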

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1 | Tuberculosis | 11 papers in training set | Top 0.1% | 24.1%
2 | PLOS ONE | 4510 papers in training set | Top 5% | 24.1%
3 | F1000Research | 79 papers in training set | Top 0.2% | 5.2%
50% of probability mass above
4 | Scientific Reports | 3102 papers in training set | Top 21% | 5.2%
5 | PLOS Neglected Tropical Diseases | 378 papers in training set | Top 2% | 3.8%
6 | Frontiers in Cellular and Infection Microbiology | 98 papers in training set | Top 1.0% | 3.8%
7 | Microbial Genomics | 204 papers in training set | Top 0.7% | 3.3%
8 | Genomics | 60 papers in training set | Top 0.7% | 2.0%
9 | PeerJ | 261 papers in training set | Top 5% | 2.0%
10 | Frontiers in Medicine | 113 papers in training set | Top 4% | 1.6%
11 | BMC Microbiology | 35 papers in training set | Top 0.7% | 1.4%
12 | Clinical Infectious Diseases | 231 papers in training set | Top 4% | 1.2%
13 | Journal of Clinical Microbiology | 120 papers in training set | Top 1% | 1.0%
14 | Journal of Medical Virology | 137 papers in training set | Top 3% | 1.0%
15 | Microbiology Spectrum | 435 papers in training set | Top 5% | 0.8%
16 | Journal of Fungi | 31 papers in training set | Top 0.5% | 0.8%
17 | BMC Bioinformatics | 383 papers in training set | Top 6% | 0.8%
18 | Microorganisms | 101 papers in training set | Top 2% | 0.8%
19 | The Journal of Infectious Diseases | 182 papers in training set | Top 6% | 0.7%
20 | Gigabyte | 60 papers in training set | Top 2% | 0.7%
21 | mSphere | 281 papers in training set | Top 7% | 0.5%
22 | Journal of Infection and Public Health | 15 papers in training set | Top 1.0% | 0.5%
23 | Infection, Genetics and Evolution | 43 papers in training set | Top 1% | 0.5%