The performance of in silico prediction tools for variant curation in a panel of cancer genes.

Nelson, N.; Niaz, A.; Fairfax, K.; Bryan, T. M.; Lucas, S.; Dickinson, J.

2025-07-30 genetic and genomic medicine
10.1101/2025.07.29.25331316 medRxiv
Rare single-base-pair changes in genes are an important cause of disease, as they can reside in key regions of a gene and influence biological function by altering protein conformation and protein interactions. Generating the experimental evidence needed to define the consequences of these gene variants is time-consuming and costly. These challenges have led to the development of a plethora of in silico prediction tools. These tools frequently use similar sources of information and are trained on overlapping multi-gene truth datasets. However, the performance of these in silico tools has frequently not been quantitatively validated for individual genes. Here we have applied the ClinGen Sequence Variant Interpretation Working Group's recommended in silico score thresholds to a set of predisposition gene variants with established pathogenicity/benignity. Of the genes assessed (BRCA1, BRCA2, TP53, TERT and ATM), in silico tool predictions showed inferior sensitivity (<65%) for pathogenic TERT variants and inferior sensitivity (≤81%) for benign TP53 variants. This validation study highlights that in silico tool performance can be gene-specific and depends on the training set on which the algorithm is built. Where there are sufficient numbers of established benign and pathogenic missense variants based on clinical and functional evidence, the use of in silico tool scores should be validated for individual genes. For genes where this is not possible and gene-agnostic in silico score cutoffs are used, consideration of the relationship between missense variants and their protein structural impact is suggested.
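The per-gene validation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two cutoffs are hypothetical placeholders (not the ClinGen-recommended thresholds, which the abstract does not state), the gene names are dummies, and the toy scores are invented rather than taken from the study. Following the abstract's wording, "sensitivity" is computed separately for known pathogenic and known benign variants.

```python
from collections import defaultdict

# Hypothetical cutoffs for an unnamed tool; NOT the ClinGen-recommended values.
PATHOGENIC_CUTOFF = 0.7  # score >= cutoff -> predicted pathogenic
BENIGN_CUTOFF = 0.3      # score <= cutoff -> predicted benign


def per_gene_performance(variants):
    """Compute per-gene sensitivity for pathogenic and benign variants.

    variants: iterable of (gene, score, truth), with truth "P" or "B".
    Returns {gene: {"pathogenic_sensitivity": float or None,
                    "benign_sensitivity": float or None}}.
    """
    counts = defaultdict(lambda: {"P": 0, "P_hit": 0, "B": 0, "B_hit": 0})
    for gene, score, truth in variants:
        c = counts[gene]
        c[truth] += 1
        if truth == "P" and score >= PATHOGENIC_CUTOFF:
            c["P_hit"] += 1
        elif truth == "B" and score <= BENIGN_CUTOFF:
            c["B_hit"] += 1

    result = {}
    for gene, c in counts.items():
        result[gene] = {
            # None when a gene has no variants of that class.
            "pathogenic_sensitivity": c["P_hit"] / c["P"] if c["P"] else None,
            "benign_sensitivity": c["B_hit"] / c["B"] if c["B"] else None,
        }
    return result


# Invented toy data, for illustration only.
toy_variants = [
    ("GENE_A", 0.9, "P"), ("GENE_A", 0.8, "P"),
    ("GENE_A", 0.1, "B"), ("GENE_A", 0.2, "B"),
    ("GENE_B", 0.9, "P"), ("GENE_B", 0.5, "P"), ("GENE_B", 0.4, "P"),
    ("GENE_B", 0.1, "B"),
]

for gene, metrics in per_gene_performance(toy_variants).items():
    print(gene, metrics)
```

In this toy set the same cutoffs recover every known pathogenic variant in GENE_A but only one of three in GENE_B, which is the kind of gene-specific gap that a pooled, multi-gene evaluation would hide.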

Matching journals

The top 3 journals together account for at least 50% of the predicted probability mass (56.3% combined).

1. Human Mutation | 29 papers in training set | Top 0.1% | 33.4%
2. Genetics in Medicine | 69 papers in training set | Top 0.2% | 12.7%
3. Journal of Medical Genetics | 28 papers in training set | Top 0.1% | 10.2%
(50% of probability mass above)
4. Scientific Reports | 3102 papers in training set | Top 23% | 4.9%
5. npj Genomic Medicine | 33 papers in training set | Top 0.1% | 3.6%
6. Frontiers in Bioinformatics | 45 papers in training set | Top 0.1% | 2.9%
7. Genome Medicine | 154 papers in training set | Top 3% | 2.6%
8. Frontiers in Molecular Biosciences | 100 papers in training set | Top 1% | 2.1%
9. European Journal of Human Genetics | 49 papers in training set | Top 0.4% | 2.1%
10. The American Journal of Human Genetics | 206 papers in training set | Top 2% | 1.7%
11. International Journal of Molecular Sciences | 453 papers in training set | Top 9% | 1.5%
12. PLOS ONE | 4510 papers in training set | Top 56% | 1.5%
13. Human Genetics | 25 papers in training set | Top 0.2% | 1.5%
14. Human Genomics | 21 papers in training set | Top 0.2% | 1.3%
15. PLOS Computational Biology | 1633 papers in training set | Top 19% | 1.2%
16. Genomics | 60 papers in training set | Top 2% | 1.2%
17. The Journal of Molecular Diagnostics | 36 papers in training set | Top 0.3% | 1.0%
18. Human Genetics and Genomics Advances | 70 papers in training set | Top 0.8% | 0.8%
19. Cancer Research | 116 papers in training set | Top 4% | 0.7%
20. Journal of the American Medical Informatics Association | 61 papers in training set | Top 2% | 0.7%
21. JCO Clinical Cancer Informatics | 18 papers in training set | Top 1.0% | 0.7%
22. npj Breast Cancer | 18 papers in training set | Top 0.3% | 0.5%
23. Genetic Epidemiology | 46 papers in training set | Top 1% | 0.5%
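
The "50% of probability mass" marker in the listing above can be checked with a short cumulative sum. This is only a sketch of one plausible reading (the marker sits after the first rank at which the running total reaches 50%); the function name `journals_to_reach` is made up for illustration, and the percentages are the values listed for ranks 1 to 5.

```python
def journals_to_reach(probabilities, target):
    """Return (count, cumulative) for the first rank whose running total
    reaches `target`, or (None, total) if the target is never reached."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= target:
            return count, total
    return None, total


# Predicted probabilities (in percent) for ranks 1-5, as listed above.
top_probs = [33.4, 12.7, 10.2, 4.9, 3.6]

count, cumulative = journals_to_reach(top_probs, 50.0)
print(f"{count} journals reach {cumulative:.1f}% of the probability mass")
```

The first two journals sum to 46.1%, so the third is needed to cross the 50% line, bringing the running total to 56.3%.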