
Using a GPT-5-driven autonomous lab to optimize the cost and titer of cell-free protein synthesis

Smith, A. A.; Wong, E. L.; Donovan, R. C.; Chapman, B. A.; Harry, R.; Tirandazi, P.; Kanigowska, P.; Gendreau, E. A.; Dahl, R. H.; Jastrzebski, M.; Cortez, J. E.; Bremner, C. J.; Hemuda, J. C. M.; Dooner, J.; Graves, I.; Karandikar, R.; Lionetti, C.; Christopher, K.; Consiglio, A. L.; Tran, A.; McCusker, W.; Nguyen, D. X.; Nunes da Silva, I. B.; Bautista-Ayala, A. R.; McNerney, M. P.; Atkins, S.; McDuffie, M.; Serber, W.; Barber, B. P.; Thanongsinh, T.; Nesson, A.; Lama, B.; Nichols, B.; LaFrance, C.; Nyima, T.; Byrn, A.; Thornhill, R.; Cai, B.; Ayala-Valdez, L.; Wong, A.; Che, A. J.; Thavaraj

2026-02-05 synthetic biology
10.64898/2026.02.05.703998 bioRxiv

We used an autonomous lab, comprising a large language model (LLM) and a fully automated cloud laboratory, to optimize the cost efficiency of cell-free protein synthesis (CFPS). By conducting iterative optimization, the LLM-driven autonomous lab was able to achieve a 40% reduction in the specific cost ($/g protein) of CFPS relative to the state of the art (SOTA). This cost reduction was accompanied by a 27% increase in protein production titer (g/L). Iterative experimental design, experiment execution, data capture and analysis, data interpretation, and new hypothesis generation were all handled by the LLM-driven autonomous lab. The interface between OpenAI's GPT-5 LLM and Ginkgo Bioworks' cloud laboratory incorporated built-in validation checks via a Pydantic schema to ensure that AI-designed experiments were properly specified. Experimental designs were translated into programmatic specification of multi-instrument biological workflows by Ginkgo's Catalyst software and executed on Ginkgo's Reconfigurable Automation Cart (RAC) laboratory automation platform, with human intervention largely limited to reagent and consumables preparation, loading and unloading. By integrating LLMs with programmatic control of a cloud lab, we demonstrate that an LLM-driven autonomous lab can successfully perform a real-world scientific task, highlighting the potential of AI-driven autonomous labs for scientific advancement.
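
The abstract mentions that AI-designed experiments pass through a Pydantic schema before execution. The sketch below illustrates what such a check could look like. It is not the authors' schema: the class names, field names, units, and bounds (`CfpsWell`, `reaction_volume_ul`, the 100 µL limit, and so on) are all hypothetical and chosen only for illustration.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class ReagentLevel(BaseModel):
    """One reagent in a reaction mix (hypothetical fields)."""
    name: str
    concentration_mm: float = Field(ge=0)  # negative concentrations are rejected


class CfpsWell(BaseModel):
    """One CFPS reaction in a plate well (hypothetical fields and bounds)."""
    well: str
    reaction_volume_ul: float = Field(gt=0, le=100)  # assumed volume limit
    reagents: List[ReagentLevel]


def validate_design(raw_wells):
    """Split an LLM-proposed design into valid wells and error reports."""
    valid, errors = [], []
    for raw in raw_wells:
        try:
            valid.append(CfpsWell(**raw))
        except ValidationError as exc:
            # Keep the message so it can be returned to the designer for repair.
            errors.append((raw.get("well"), str(exc)))
    return valid, errors


proposed = [
    {"well": "A1", "reaction_volume_ul": 10,
     "reagents": [{"name": "Mg-glutamate", "concentration_mm": 8.0}]},
    {"well": "A2", "reaction_volume_ul": 10,
     "reagents": [{"name": "Mg-glutamate", "concentration_mm": -1.0}]},
]
valid_wells, design_errors = validate_design(proposed)
```

In this sketch a malformed well is reported rather than silently dropped, so the error text could be fed back to the model that proposed the design.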

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | ACS Synthetic Biology | 256 | Top 0.4% | 12.3% |
| 2 | Communications Biology | 886 | Top 0.1% | 10.1% |
| 3 | Nature Communications | 4913 | Top 26% | 6.8% |
| 4 | ACS Omega | 90 | Top 0.2% | 4.8% |
| 5 | Computational and Structural Biotechnology Journal | 216 | Top 2% | 3.6% |
| 6 | Synthetic Biology | 21 | Top 0.1% | 3.6% |
| 7 | Advanced Science | 249 | Top 6% | 3.2% |
| 8 | iScience | 1063 | Top 6% | 3.1% |
| 9 | Nucleic Acids Research | 1128 | Top 7% | 2.7% |

50% of probability mass above

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 10 | npj Systems Biology and Applications | 99 | Top 0.7% | 2.7% |
| 11 | eLife | 5422 | Top 31% | 2.7% |
| 12 | Scientific Reports | 3102 | Top 46% | 2.6% |
| 13 | PLOS ONE | 4510 | Top 48% | 2.1% |
| 14 | Metabolic Engineering | 68 | Top 0.3% | 2.1% |
| 15 | Bioinformatics | 1061 | Top 6% | 2.1% |
| 16 | The Plant Journal | 197 | Top 2% | 1.7% |
| 17 | Frontiers in Bioengineering and Biotechnology | 88 | Top 1% | 1.7% |
| 18 | BMC Bioinformatics | 383 | Top 5% | 1.7% |
| 19 | Metabolites | 50 | Top 0.6% | 1.5% |
| 20 | Nature Machine Intelligence | 61 | Top 2% | 1.5% |
| 21 | Briefings in Bioinformatics | 326 | Top 5% | 1.3% |
| 22 | Journal of Chemical Information and Modeling | 207 | Top 2% | 1.3% |
| 23 | Analytical Chemistry | 205 | Top 2% | 1.3% |
| 24 | Cell Systems | 167 | Top 9% | 1.1% |
| 25 | Molecular Systems Biology | 142 | Top 1% | 0.9% |
| 26 | Journal of Proteome Research | 215 | Top 2% | 0.9% |
| 27 | Frontiers in Pharmacology | 100 | Top 4% | 0.9% |
| 28 | Communications Chemistry | 39 | Top 0.8% | 0.9% |
| 29 | PLOS Computational Biology | 1633 | Top 23% | 0.8% |
| 30 | SLAS Technology | 11 | Top 0.2% | 0.8% |
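
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The short sketch below does that; the probabilities are copied from the ranking above, and the function name `journals_to_reach` is a hypothetical helper, not part of the site.

```python
# Predicted probabilities (%) for ranks 1..30, copied from the ranking above.
PROBABILITIES = [
    12.3, 10.1, 6.8, 4.8, 3.6, 3.6, 3.2, 3.1, 2.7, 2.7,
    2.7, 2.6, 2.1, 2.1, 2.1, 1.7, 1.7, 1.7, 1.5, 1.5,
    1.3, 1.3, 1.3, 1.1, 0.9, 0.9, 0.9, 0.9, 0.8, 0.8,
]


def journals_to_reach(probabilities, threshold):
    """Return (rank, cumulative %) at which the running sum first reaches threshold.

    Returns None if the listed probabilities never reach the threshold.
    """
    total = 0.0
    for rank, probability in enumerate(probabilities, start=1):
        total += probability
        if total >= threshold:
            return rank, total
    return None


rank, cumulative = journals_to_reach(PROBABILITIES, 50.0)
print(f"{rank} journals cover {cumulative:.1f}% of the predicted probability mass")
```

The running sum first crosses 50% at rank 9, which agrees with the marker in the ranking.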