Harnessing AI to Build Virtual Cells

Cheng, X.; Li, P.; Guo, H.; Liang, Y.; Gong, J.; de Vazelhes, W.; Gou, C.; Xie, P.; Song, L.; Xing, E. P.

2026-04-30 bioinformatics
10.64898/2026.04.11.717183 bioRxiv

A virtual cell is a world model of a cell: a computational system that predicts, simulates and programs cellular processes across modalities and scales. An important path toward this goal is to model how genetic and chemical perturbations give rise to transcriptional responses, a core capability for disease understanding and drug discovery. However, current approaches remain expert-intensive, relying on iterative manual model design, training and debugging over months. Here we present VCHarness, an autonomous AI system that constructs perturbation-response models by combining an AI coding agent with multimodal biological foundation models. The system explores large spaces of architectures and training pipelines with minimal human intervention, iteratively generating, evaluating and refining candidate models. Across multiple perturbation-response benchmarks, VCHarness identifies architectures that outperform expert-designed approaches while reducing development time from months to days. It further uncovers non-obvious architectural patterns associated with improved performance, indicating that automated search can extend beyond conventional design strategies. These results suggest a shift from manually engineered models toward autonomous systems for constructing components of virtual cell world models, enabling scalable and data-driven exploration of cellular systems.

Matching journals

The top-ranked journal alone accounts for more than 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Cell Systems | 167 | Top 0.1% | 52.0% |
| 2 | PLOS Computational Biology | 1633 | Top 7% | 4.9% |
| 3 | Nature Communications | 4913 | Top 44% | 2.7% |
| 4 | iScience | 1063 | Top 9% | 2.4% |
| 5 | Nature Methods | 336 | Top 4% | 2.4% |
| 6 | Proceedings of the National Academy of Sciences | 2130 | Top 28% | 2.1% |
| 7 | Nature | 575 | Top 9% | 2.1% |
| 8 | Bioinformatics Advances | 184 | Top 3% | 1.8% |
| 9 | Bioinformatics | 1061 | Top 7% | 1.7% |
| 10 | PLOS ONE | 4510 | Top 53% | 1.7% |
| 11 | Cell Genomics | 162 | Top 3% | 1.7% |
| 12 | Science | 429 | Top 14% | 1.7% |
| 13 | Nature Machine Intelligence | 61 | Top 2% | 1.5% |
| 14 | eLife | 5422 | Top 50% | 1.1% |
| 15 | Cell | 370 | Top 15% | 1.0% |
| 16 | Scientific Reports | 3102 | Top 69% | 1.0% |
| 17 | Nature Computational Science | 50 | Top 2% | 0.8% |
| 18 | npj Digital Medicine | 97 | Top 4% | 0.7% |
| 19 | Nature Biotechnology | 147 | Top 8% | 0.7% |
| 20 | Nucleic Acids Research | 1128 | Top 18% | 0.7% |
| 21 | BMC Bioinformatics | 383 | Top 7% | 0.7% |
| 22 | Genome Research | 409 | Top 4% | 0.7% |
| 23 | Frontiers in Genetics | 197 | Top 9% | 0.7% |
| 24 | Frontiers in Computational Neuroscience | 53 | Top 2% | 0.7% |
| 25 | Science Advances | 1098 | Top 31% | 0.7% |
| 26 | Advanced Science | 249 | Top 20% | 0.7% |
| 27 | Computers in Biology and Medicine | 120 | Top 5% | 0.6% |
| 28 | Computational and Structural Biotechnology Journal | 216 | Top 11% | 0.6% |
| 29 | Development | 440 | Top 4% | 0.6% |
| 30 | ACS Synthetic Biology | 256 | Top 4% | 0.6% |

The 50% probability-mass cutoff falls after rank 1.
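
The page does not say how the 50% cutoff is computed. A minimal sketch, assuming it is a cumulative sum over the ranked probabilities, is shown here; the function name `journals_to_reach` is hypothetical, and the values are the first five predicted probabilities from the listing.

```python
# Predicted probabilities (%) for the five top-ranked journals in the listing.
probabilities = [52.0, 4.9, 2.7, 2.4, 2.4]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running total
    to reach `threshold`, or None if the entries never reach it.

    Assumption: the cutoff is a plain cumulative sum over ranked entries;
    the page itself does not document the method.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count
    return None


print(journals_to_reach(probabilities, 50.0))
```

Under this assumption a single journal is enough to cross 50%, which matches the cutoff reported by the page.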