
Validation and analysis of 12,000 AI-driven CAR-T designs in the Bits to Binders competition

Kosonocky, C. W.; Abel, A. M.; Feller, A. L.; Cifuentes Rieffer, A. E.; Woolley, P. R.; Lala, J.; Barth, D. R.; Gardner, T.; Ekker, S. C.; Ellington, A. D.; Wierson, W. A.; Marcotte, E. M.

2026-03-03 bioinformatics
10.64898/2026.03.03.709355 bioRxiv

Artificial intelligence (AI) methods for proteins have advanced rapidly, improving structure prediction and design, particularly for de novo binders. However, most evaluations emphasize binding affinity rather than higher-order biological function. We present Bits to Binders, a global competition benchmarking de novo binder design in the context of chimeric antigen receptor (CAR) T cells. Teams from 42 countries submitted 12,000 designs of 80-amino-acid binders targeting human CD20 as CAR binding domains. Designs were screened by pooled CAR-T proliferation, identifying 707 designs exhibiting significant CD20-specific enrichment, with team hit rates from 0.6% to 38.4%. Top-performing candidates were validated as individual constructs, measuring CD20-specific proliferation, expansion, cytokine production, and targeted cell lysis. We identified common design methodologies and factors correlated with DNA synthesis, expression, and target-specific T cell activation, which nearly double success rates when applied as a retrospective filter. We release this dataset as an open resource, with practical recommendations to support more effective AI-driven binder design.

Matching journals

The top 8 journals together account for at least 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set, Top 8%, 17.1%
2. Cell Systems: 167 papers in training set, Top 2%, 8.0%
3. Nucleic Acids Research: 1128 papers in training set, Top 3%, 6.2%
4. mAbs: 28 papers in training set, Top 0.1%, 6.2%
5. Bioinformatics: 1061 papers in training set, Top 5%, 4.2%
6. Nature Methods: 336 papers in training set, Top 3%, 3.9%
7. Nature Machine Intelligence: 61 papers in training set, Top 0.8%, 3.9%
8. Briefings in Bioinformatics: 326 papers in training set, Top 2%, 3.5%

50% of probability mass above

9. PLOS Computational Biology: 1633 papers in training set, Top 11%, 3.0%
10. Nature Biotechnology: 147 papers in training set, Top 4%, 2.0%
11. Structure: 175 papers in training set, Top 1%, 2.0%
12. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 4%, 1.8%
13. Bioinformatics Advances: 184 papers in training set, Top 3%, 1.7%
14. BMC Bioinformatics: 383 papers in training set, Top 4%, 1.7%
15. Communications Biology: 886 papers in training set, Top 10%, 1.7%
16. Cell Genomics: 162 papers in training set, Top 3%, 1.7%
17. Journal of Cheminformatics: 25 papers in training set, Top 0.3%, 1.7%
18. Journal of Chemical Information and Modeling: 207 papers in training set, Top 2%, 1.6%
19. Nature: 575 papers in training set, Top 11%, 1.6%
20. Scientific Reports: 3102 papers in training set, Top 63%, 1.4%
21. Advanced Science: 249 papers in training set, Top 13%, 1.3%
22. Frontiers in Immunology: 586 papers in training set, Top 5%, 1.3%
23. Science: 429 papers in training set, Top 16%, 1.3%
24. Patterns: 70 papers in training set, Top 2%, 1.1%
25. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 40%, 0.9%
26. Genome Medicine: 154 papers in training set, Top 7%, 0.9%
27. eLife: 5422 papers in training set, Top 54%, 0.9%
28. iScience: 1063 papers in training set, Top 30%, 0.8%
29. npj Systems Biology and Applications: 99 papers in training set, Top 2%, 0.8%
30. Protein Science: 221 papers in training set, Top 2%, 0.8%
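
The position of the 50% cutoff in the list above can be checked by accumulating the displayed probabilities in rank order. The following is a minimal sketch, not the page's own method: the function name `journals_to_reach` is hypothetical, and the inputs are the rounded percentages shown for the first ten journals, so the result may differ slightly from a calculation on unrounded values.

```python
from itertools import accumulate

# Displayed probabilities (%) for the ten top-ranked journals, copied from the list above.
# These are rounded values; the unrounded model outputs are not available here.
probs = [17.1, 8.0, 6.2, 6.2, 4.2, 3.9, 3.9, 3.5, 3.0, 2.0]


def journals_to_reach(probabilities, threshold=50.0):
    """Return (rank, cumulative %) at the first rank whose running total reaches threshold.

    Hypothetical helper for illustration; returns None if the threshold is never reached.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = journals_to_reach(probs)
print(f"Threshold reached at rank {rank} with cumulative {total:.1f}%")
```

With the displayed values, the running total first reaches 50% at the eighth journal, which matches where the page places its cutoff marker.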