Evaluation of somatic variant calling methods on high coverage tumour-only amplicon sequencing data in a clinical environment

Bharne, D.; Gaston, D.

2026-04-11 bioinformatics
10.64898/2026.04.08.717310 bioRxiv

Amplicon-based targeted sequencing panels are one of the current workhorses of next-generation sequencing in clinical molecular diagnostics laboratories for profiling somatic mutations in tumours. Many open-source somatic variant callers are available; however, their use in clinical applications remains underexplored. We therefore integrated the outputs of six variant callers (FreeBayes, MuTect2, Pisces, Platypus, VarDict and VarScan) into a Snakemake pipeline and evaluated tumour-only data from the HD789 commercial reference standard, sequenced in triplicate on three different sequencing runs using the Illumina AmpliSeq Focus panel on MiSeq and NextSeq 2000. A 1:4 dilution sample was sequenced to evaluate the limits of variant detection. The called variants were analysed with respect to depth, allele frequency, and other sequencing metrics. The variant callers were evaluated by their level of concordance and by their performance on known somatic variants. FreeBayes consistently called the largest number of somatic variants in each sample but also included more potential artifacts. Overall, FreeBayes, VarScan, MuTect2, and Pisces had the best performance on HD789 data.
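
The abstract does not say how concordance or performance on known variants was computed. As a minimal sketch of one common approach, the Python below treats each caller's output as a set of (chromosome, position, ref, alt) keys, measures pairwise concordance as Jaccard overlap, and measures sensitivity against a truth set. The function names, the Jaccard metric, and the toy variants are all assumptions made for illustration; they are not taken from the paper, and the toy variants are not real HD789 variants.

```python
from itertools import combinations

# A variant is identified by (chromosome, position, ref allele, alt allele).
# This keying is an assumption; real pipelines usually normalise variants first.


def jaccard(a, b):
    """Jaccard overlap of two call sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(calls):
    """Map each unordered pair of caller names to the Jaccard overlap of their calls."""
    return {
        (x, y): jaccard(calls[x], calls[y])
        for x, y in combinations(sorted(calls), 2)
    }


def sensitivity(called, truth):
    """Fraction of known (truth) variants recovered by a caller."""
    return len(called & truth) / len(truth) if truth else float("nan")


# Hypothetical toy data, NOT real HD789 variants or real caller output.
v1 = ("chrA", 100, "A", "T")
v2 = ("chrA", 250, "G", "C")
v3 = ("chrB", 42, "C", "T")
v4 = ("chrB", 900, "T", "G")  # stands in for a possible artifact

calls = {
    "FreeBayes": {v1, v2, v3, v4},
    "MuTect2": {v1, v2, v3},
    "VarScan": {v1, v2},
}
truth = {v1, v2, v3}

for pair, score in pairwise_concordance(calls).items():
    print(pair, round(score, 3))
for name, called in calls.items():
    print(name, "sensitivity:", round(sensitivity(called, truth), 3))
```

In this toy data the caller with the most calls recovers every truth variant but also reports one call outside the truth set, which is why sensitivity is usually read alongside a count of calls not in the truth set.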

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Clinical Chemistry | 22 papers in training set | Top 0.1% | 10.4%
2. Scientific Reports | 3102 papers in training set | Top 5% | 10.4%
3. Genome Medicine | 154 papers in training set | Top 0.6% | 8.7%
4. The Journal of Molecular Diagnostics | 36 papers in training set | Top 0.1% | 7.4%
5. BMC Bioinformatics | 383 papers in training set | Top 1% | 7.4%
6. PLOS ONE | 4510 papers in training set | Top 26% | 6.6%

50% of probability mass above

7. Journal of Clinical Microbiology | 120 papers in training set | Top 0.5% | 3.7%
8. BMC Genomics | 328 papers in training set | Top 1% | 2.7%
9. Diagnostics | 48 papers in training set | Top 0.7% | 2.1%
10. PLOS Computational Biology | 1633 papers in training set | Top 13% | 2.1%
11. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 3% | 1.9%
12. Frontiers in Molecular Biosciences | 100 papers in training set | Top 1% | 1.7%
13. Frontiers in Bioinformatics | 45 papers in training set | Top 0.2% | 1.7%
14. Nature Communications | 4913 papers in training set | Top 51% | 1.7%
15. NAR Genomics and Bioinformatics | 214 papers in training set | Top 2% | 1.5%
16. Briefings in Bioinformatics | 326 papers in training set | Top 4% | 1.4%
17. Bioinformatics | 1061 papers in training set | Top 8% | 1.4%
18. BMC Medical Genomics | 36 papers in training set | Top 0.6% | 1.4%
19. Human Mutation | 29 papers in training set | Top 0.5% | 1.1%
20. Biology Methods and Protocols | 53 papers in training set | Top 2% | 1.0%
21. GigaScience | 172 papers in training set | Top 2% | 0.9%
22. PeerJ | 261 papers in training set | Top 12% | 0.9%
23. Communications Biology | 886 papers in training set | Top 20% | 0.8%
24. International Journal of Molecular Sciences | 453 papers in training set | Top 17% | 0.7%
25. iScience | 1063 papers in training set | Top 36% | 0.7%
26. Journal of Translational Medicine | 46 papers in training set | Top 3% | 0.7%
27. Genome Biology | 555 papers in training set | Top 8% | 0.7%
28. Frontiers in Oncology | 95 papers in training set | Top 4% | 0.5%
29. Journal of Medical Virology | 137 papers in training set | Top 5% | 0.5%
30. Laboratory Investigation | 13 papers in training set | Top 0.4% | 0.5%