Evo 2 Predicts Cardiomyopathy-Associated Variants and Elucidates Their Underlying Mechanisms

Kurozumi, A.; Otsuka, N.; Masamichi, I.; Kawakami, T.; Isagawa, T.; Kodera, S.; Takeda, N.

2026-05-17 genomics
10.64898/2026.05.15.725304 bioRxiv

Background: Although advances in next-generation sequencing have accelerated the identification of genetic variants in cardiomyopathy, interpreting variants of uncertain significance (VUS) remains a clinical challenge. Evo 2 is a high-resolution genomic artificial intelligence model capable of predicting pathogenicity across large sequence contexts and enabling mechanistic interpretation; however, its application in cardiovascular genetics has been limited. Here, we evaluated the utility of Evo 2 for assessing the pathogenicity and underlying mechanisms of cardiomyopathy-associated variants.

Methods: We used Evo 2 to predict the pathogenicity of single-nucleotide variants in cardiomyopathy-related genes listed in ClinVar. Using internal representations such as embeddings, we assessed the ability of the model to identify characteristic structural features in both coding and noncoding regions and to infer the molecular mechanisms of variants within these regions.

Results: Evo 2 demonstrated high predictive accuracy for pathogenicity, achieving an AUROC of 0.983 and an AUPRC of 0.915. Notably, sparse autoencoders (SAEs) derived from the embeddings identified features corresponding to higher-order protein structures, including coiled-coil and actin-binding domains characteristic of cardiomyopathy-related proteins, and accurately detected mutations known to disrupt these domains. With SAEs, the model also recognized the binding motif of the cardiac-enriched transcription factor TBX5, and after supervised fine-tuning it accurately predicted a single-nucleotide polymorphism affecting TBX5 binding affinity.

Conclusions: Evo 2 demonstrated strong performance in both predicting pathogenicity and extracting biological features of cardiomyopathy-associated variants. It may represent a powerful emerging tool for evaluating VUS in cardiovascular medicine.

Matching journals

The top 10 journals account for 50% of the predicted probability mass.

1 | Genome Medicine | 154 papers in training set | Top 0.4% | 10.6%
2 | PLOS Computational Biology | 1633 papers in training set | Top 4% | 8.6%
3 | Bioinformatics | 1061 papers in training set | Top 4% | 4.9%
4 | Human Genomics | 21 papers in training set | Top 0.1% | 4.4%
5 | Frontiers in Genetics | 197 papers in training set | Top 1% | 4.4%
6 | BMC Medical Genomics | 36 papers in training set | Top 0.1% | 3.7%
7 | Genetics in Medicine | 69 papers in training set | Top 0.4% | 3.7%
8 | Scientific Reports | 3102 papers in training set | Top 34% | 3.7%
9 | Circulation | 66 papers in training set | Top 1.0% | 3.7%
10 | European Heart Journal - Digital Health | 15 papers in training set | Top 0.2% | 3.1%
50% of probability mass above
11 | Database | 51 papers in training set | Top 0.2% | 2.7%
12 | Circulation Research | 39 papers in training set | Top 0.4% | 2.7%
13 | BMC Genomics | 328 papers in training set | Top 1% | 2.5%
14 | Human Genetics | 25 papers in training set | Top 0.1% | 2.1%
15 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 3% | 1.9%
16 | Bioinformatics Advances | 184 papers in training set | Top 3% | 1.7%
17 | European Journal of Human Genetics | 49 papers in training set | Top 0.6% | 1.7%
18 | PLOS ONE | 4510 papers in training set | Top 52% | 1.7%
19 | Journal of Translational Medicine | 46 papers in training set | Top 0.8% | 1.7%
20 | BioData Mining | 15 papers in training set | Top 0.3% | 1.7%
21 | Nature Machine Intelligence | 61 papers in training set | Top 3% | 1.2%
22 | European Heart Journal | 16 papers in training set | Top 0.6% | 1.0%
23 | Genetic Epidemiology | 46 papers in training set | Top 0.7% | 0.9%
24 | BMC Bioinformatics | 383 papers in training set | Top 6% | 0.9%
25 | Circulation: Genomic and Precision Medicine | 42 papers in training set | Top 1% | 0.9%
26 | NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 0.9%
27 | iScience | 1063 papers in training set | Top 26% | 0.9%
28 | Journal of the American Heart Association | 119 papers in training set | Top 4% | 0.8%
29 | Patterns | 70 papers in training set | Top 2% | 0.8%
30 | Journal of Personalized Medicine | 28 papers in training set | Top 1% | 0.8%
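
The 50% cutoff marked in the ranking can be checked by summing the listed probabilities in rank order. The following is a minimal Python sketch, assuming the percentages shown on the page are rounded displays of the underlying predicted probabilities, so the running total is only approximate. The function name `rank_reaching` is hypothetical and introduced here for illustration only.

```python
from itertools import accumulate

# Probabilities (in percent) of the ten top-ranked journals, copied from the ranking.
# Assumption: these are rounded display values, so sums are approximate.
top_probs = [10.6, 8.6, 4.9, 4.4, 4.4, 3.7, 3.7, 3.7, 3.7, 3.1]


def rank_reaching(probs, threshold):
    """Return the 1-based rank at which the running total first reaches
    `threshold`, or None if it is never reached (hypothetical helper)."""
    for rank, total in enumerate(accumulate(probs), start=1):
        # Small tolerance guards against floating-point rounding in the sum.
        if total >= threshold - 1e-9:
            return rank
    return None


cutoff_rank = rank_reaching(top_probs, 50.0)
cumulative = round(sum(top_probs), 1)
print(f"Cumulative mass reaches 50% at rank {cutoff_rank} ({cumulative}% in total)")
```

With the values listed, the running total is 47.7% after rank 9 and 50.8% after rank 10, which agrees with the marker placed after the tenth journal.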