Optimization of PURE system composition using automation and active learning

Bernard-Lapeyre, Y.; Cleij, C.; Sakai, A.; Huguet, M.-J.; Danelon, C.

2026-03-25 synthetic biology

10.64898/2026.03.23.713685 bioRxiv


The protein synthesis using recombinant elements (PURE) system has been widely applied in various fields of biological research and in synthetic cell construction. Efforts to enhance PURE system performance by adjusting its individual components have remained limited to the expression of single genes, with only a small number of compositions tested, making it difficult to link component composition to system-level performance across different DNA contexts. Here, we combine automated acoustic liquid handling with an active learning framework to broadly explore the compositional landscape of the PURE system. By grouping the 69 individual components (including proteins and tRNAs) into 21 functional sets and iteratively guiding experiments with active learning, we rapidly identify improved compositions and demonstrate up to 3-fold enhancement in protein yield and translation rate for a single reporter gene. We further show that optimization drivers differ between low and high DNA concentrations, revealing that optimal PURE compositions are DNA concentration-dependent. We then apply this optimization strategy to enhance the expression of a 41-kb synthetic chromosome containing 15 genes by maximizing the fluorescence intensities of two reporter proteins. While a 3-fold improvement could be reached for the two gene products guiding learning, a full proteomic analysis revealed that optimization is gene-specific, i.e., changes in PURE system composition differentially affect the amounts of synthesized proteins encoded on the same DNA template. Together, this work establishes active learning as an efficient strategy to navigate the high-dimensional PURE compositional space and provides mechanistic insight into the DNA context-dependence of gene expression optimization.
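
The iterative loop described in the abstract (propose compositions, measure, update, propose again) can be illustrated with a minimal sketch. The abstract does not state which surrogate model or selection rule was used, so everything below is an assumption made for illustration: the bootstrap k-nearest-neighbour ensemble, the upper-confidence-bound style score, the fold-change range of 0.25 to 2.0, and the function `simulated_yield`, which is a hypothetical stand-in for a real fluorescence measurement. Only the count of 21 functional sets comes from the text.

```python
import random
import statistics

N_SETS = 21  # number of functional sets, as stated in the abstract


def simulated_yield(x, rng):
    # Hypothetical stand-in for a measurement; NOT the real PURE response.
    optimum = [1.5 if i % 3 == 0 else 1.0 for i in range(N_SETS)]
    score = -sum((a - b) ** 2 for a, b in zip(x, optimum))
    return score + rng.gauss(0, 0.05)


def random_composition(rng):
    # One fold-change value per functional set (assumed range, for illustration).
    return [rng.uniform(0.25, 2.0) for _ in range(N_SETS)]


def knn_predict(x, data, k=3):
    # Mean yield of the k measured compositions closest to x.
    def dist(d):
        return sum((a - b) ** 2 for a, b in zip(x, d[0]))
    nearest = sorted(data, key=dist)[:k]
    return statistics.fmean([y for _, y in nearest])


def acquisition(x, data, rng, n_models=5, kappa=1.0):
    # Bootstrap ensemble: predicted mean plus kappa times ensemble spread.
    preds = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        preds.append(knn_predict(x, sample))
    return statistics.fmean(preds) + kappa * statistics.pstdev(preds)


def active_learning(n_rounds=5, batch=8, pool=100, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(batch):  # initial round: random compositions
        x = random_composition(rng)
        data.append((x, simulated_yield(x, rng)))
    best_per_round = [max(y for _, y in data)]
    for _ in range(n_rounds):
        # Score a random candidate pool and "measure" the top-scoring batch.
        candidates = [random_composition(rng) for _ in range(pool)]
        scored = [(acquisition(c, data, rng), c) for c in candidates]
        scored.sort(key=lambda t: t[0], reverse=True)
        for _, x in scored[:batch]:
            data.append((x, simulated_yield(x, rng)))
        best_per_round.append(max(y for _, y in data))
    return data, best_per_round


if __name__ == "__main__":
    _, best = active_learning()
    print(best)  # best simulated yield after each round
```

In a real workflow, `simulated_yield` would be replaced by dispensing the proposed compositions and reading out the reporter signal; the sketch only shows how each round's measurements feed the choice of the next batch.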

