mAbs
○ Informa UK Limited
Preprints posted in the last 90 days, ranked by how well they match mAbs's content profile, based on 28 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Zhao, Y.; Yilmaz, M.; Lee, E.; Teh, C.; Guo, L.; Sonmez, K.; Giancardo, L.; Trang, G.; Xu, F.; Espinosa-Cotton, M.; Cheung, N.-K.; Kim, J.; Cheng, X.
Therapeutic antibody discovery remains slow and resource-intensive, with traditional methods providing limited control over epitope selection. We present a workflow for de novo nanobody design applied to a novel Desmoplastic Small Round Cell Tumor target. The workflow encompasses four stages: (1) epitope identification guided by our hotspot recommendation agent using physical chemistry-based structure and sequence analysis tools with two curated databases (IEDB, PFAM); (2) de novo nanobody generation using three independent methods (RFantibody, IgGM, mBER) across multiple predicted antigen structures and nanobody frameworks; (3) multi-metric scoring, including structural metrics from folding models and in silico binding affinity from our sequence-based predictor; and (4) high-throughput yeast surface display (YSD) screening followed by surface plasmon resonance (SPR) characterization of the specific binders. We generated 288,000 nanobody designs spanning eight target epitope regions and three variable domains of heavy chain-only antibody (VHH) frameworks. Multi-objective Pareto filtering with our candidate selection agent yielded 100,000 candidates for YSD screening with fluorescence-activated cell sorting (FACS). Of 116 enriched candidates advanced to SPR characterization, 46/116 (39.7%) produced reliable kinetic fits with Rmax ≥ 30 RU, yielding KD values from 0.66 nM to 305 nM (median 31.7 nM). These results show that an agent-guided computational workflow can design nanomolar to sub-nanomolar nanobody binders against a novel target without experimental structure or prior antibody information.
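The abstract above does not say which metrics or which algorithm the candidate selection agent uses for multi-objective Pareto filtering. As a minimal sketch of the general idea only, the following keeps every design that no other design beats on all metrics at once. The two metric columns, the example values, and the function name `pareto_front` are illustrative assumptions, not the authors' pipeline.

```python
from typing import Sequence


def pareto_front(scores: Sequence[Sequence[float]]) -> list[int]:
    """Return indices of non-dominated rows.

    Assumption: every metric is oriented so that higher is better.
    Row b dominates row a when b is >= a on every metric and
    strictly > a on at least one metric.
    """
    front = []
    for i, a in enumerate(scores):
        dominated = any(
            all(bk >= ak for ak, bk in zip(a, b))
            and any(bk > ak for ak, bk in zip(a, b))
            for j, b in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical designs: (structure-confidence score, predicted-affinity score).
designs = [
    (0.92, 0.80),
    (0.85, 0.90),
    (0.80, 0.70),  # worse than the first two designs on both metrics
    (0.92, 0.60),  # ties the first design on one metric, loses on the other
]

if __name__ == "__main__":
    print(pareto_front(designs))
```

This quadratic comparison is adequate for a small illustration; filtering hundreds of thousands of designs would normally call for a sort-based or vectorized non-dominated sort.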
Zhou, Q.; Chomicz, D.; Melvin, D.; Griffiths, M.; Yahiya, S.; Reece, S.; Le Pannerer, M.-M.; Krawczyk, K.
Preclinical antibody discovery relies on progressive screening and down-selection of candidate antibodies from large immune repertoires, yet this critical process is poorly represented in existing public databases. Here we introduce KyDab (Kymouse Antibody Database), a well-curated database of antibody discovery selection data generated using standardized workflows on the Kymouse humanized mouse platform. The current release includes 11 immunisation studies in Kymouse platform mice covering 51 immunogens, more than 120,000 paired heavy-light chain sequences, and binding measurements for a selected subset of experimentally characterized clones. By capturing full-funnel selection data with consistent metadata and both positive and negative experimental outcomes, KyDab provides a valuable data resource for the development and evaluation of artificial intelligence models for antibody discovery. KyDab is accessible at https://kydab.naturalantibody.com, and the database will be continuously updated as new datasets become available.
Kim, C.; Gaballa, M.; Lee, D.; Jouanguy, E.; Zhang, S.-Y.; Casanova, J.-L.; Yatim, A.
The binding of transmembrane (TM) ligands to their cognate TM receptors on neighboring cells governs intercellular adhesion and direct cell-cell communication. However, these interactions are difficult to study in vitro because they depend on membrane presentation, ligand orientation, receptor clustering, and avidity, features often not captured by soluble recombinant ligands or cell-free assays. Here, we describe a flow cytometry-based assay using fluorescent, lentiviral-derived virus-like particles (VLPs) displaying TM ligands to quantify binding to their receptors on target cells. Fluorescent VLPs are generated in-house by plasmid transfection in HEK293T cells and enable direct fluorescent detection without fluorochrome-conjugated secondary antibodies. The system is modular and readily accommodates engineered ligand constructs, including patient-derived variants. We applied this platform to generate ICAM-1-displaying fluorescent VLPs and to study human LFA-1 function in patient-derived leukocytes. This protocol provides a detailed workflow for VLP production and in vitro binding assays, offering a simple, quantitative, and cost-effective approach for studying TM ligand-receptor interactions in a membrane context. The system is well suited for mechanistic studies, functional assessment of patient-derived variants, and direct binding assays using patient-derived cells. Integrating the assay into multicolor flow cytometry panels enables simultaneous immunophenotyping and quantification of up to four ligand-receptor interactions at single-cell resolution.
Key features
- Quantifies TM ligand-receptor binding in a membrane context using fluorescent VLPs and flow cytometry.
- Fully in-house, modular system based on plasmid transfection in HEK293T cells, without reliance on recombinant ligands or fluorochrome-conjugated secondary antibodies.
- Supports testing of engineered ligand variants, including patient-derived alleles, and direct functional studies on patient-derived cells.
- Compatible with multicolor flow cytometry panels, enabling simultaneous immunophenotyping and quantification of up to four ligand-receptor interactions at single-cell resolution.
Graphical overview: Figure 1.
Kim, Y.; Kwon, H.; Hong, J.; Kang, C. K.; Park, W. B.; Kim, H.-R.; Lee, C.-H.
Background: Combinatorial fragment antigen-binding (Fab) libraries encode an immense heavy-light chain pairing space, often exceeding 10{superscript 1} possible combinations, which far surpasses the diversity that can be experimentally constructed and screened in display systems. As a result, direct Fab screening samples only a small fraction of the theoretical search space, creating a practical bottleneck for functional binder discovery. Results: Here, we frame Fab discovery as a staged search problem by decoupling heavy-chain (HC) and light-chain (LC) exploration. We implemented a sequential HC preselection-remating workflow in yeast surface display, in which antigen-reactive HC variants are first enriched and subsequently recombined with a diverse LC repertoire to reconstruct a focused Fab library. In a SARS-CoV-2 spike-targeted campaign, HC and LC libraries of 2.05 x 10 and 2.33 x 10 members corresponded to a theoretical pairing space of approximately 4.8 x 10{superscript 1} combinations. Sequential HC enrichment followed by LC remating allowed recovery of multiple functional Fab clones from a tractable library scale of approximately 10, including clones that shared a common HC scaffold but carried distinct LC partners. A representative recombinant IgG output showed broad but heterogeneous spike/RBD binding, measurable pseudovirus neutralization activity (EC = 11.1 nM), and compatibility with standard early biophysical characterization after full-length IgG reformatting. Conclusions: These results provide proof of principle that combinatorial Fab discovery can be approached as a staged exploration problem under realistic library-size constraints. By focusing downstream Fab reconstruction on an antigen-compatible HC subspace, sequential HC preselection followed by LC remating offers a practical strategy for exploring otherwise intractable antibody pairing landscapes in eukaryotic display systems.
Hossain, D.; Abir, F. A.; Zhang, S.; Chen, J. Y.
Despite major advances in computational antibody engineering, no systematic comparison of modern open-source LLM backbone families for antibody sequence generation exists, nor is it known whether architectural differences matter at compact model scales. In this study, five compact transformer variants inspired by prominent open-source LLM families (Llama-4, Gemma-3, DeepSeek-V3, Mistral 7B, and NVIDIA Nemotron-3) were customized and trained from scratch for de novo VH single-domain antibody (sdAb) design. All five models were pretrained on 15 million sequences from the Observed Antibody Space (OAS) database. Pretraining yielded uniformly high generative fidelity across architectures: sequence diversity 0.507-0.516 (CV=0.8%), uniqueness approaching 1.0, and novelty 0.925-0.977 (CV=2.2%). The models were subsequently fine-tuned on disease-stratified repertoires spanning SARS-CoV-2 (n=4,688), HIV (n=430), HER2 (n=22,778), and Ebola virus (n=2,868). Structural assessment of top-ranked candidates from these case studies via AlphaFold-2, Boltz-2, RoseTTAFold-2, and ESMFold produced mean pLDDT scores of 92.88±1.54 to 93.77±2.16, with no statistically significant inter-model differences (Kruskal-Wallis H=2.06, p>0.05; N=100) in a single-seed experiment. This suggests that, at this compact scale, generative capacity is determined primarily by training data and model scale rather than by family-specific design elements.
Computational docking yielded predicted binding free energies of -36.34 to -65.60 kcal/mol. Independent validation through IMGT-defined CDR-H3 extraction, BLASTp novelty assessment, and NetMHCIIpan 4.3 MHC-II immunogenicity profiling collectively confirmed antigen-binding loop novelty (CDR-H3 identity 0-29% to closest database hits), germline-consistent humanness (77-90% VH germline content), and immunogenically silent antigen-binding surfaces, with no strong MHC-II binders detected across CDR regions in any candidate. We further introduce a proof-of-concept agentic evaluation pipeline leveraging the Model Context Protocol (MCP) with Claude Sonnet 4.6, enabling automated structural profiling and candidate prioritization across disease targets.
White, W. L.; Moseley, E.; Tremblay, J. M.; Reilly, J.; Da'Darah, A. A.; Skelly, P.; Cowen, L. J.; Shoemaker, C. B.
Nanobodies have recently emerged as alternatives to classical antibodies in therapeutic and diagnostic contexts from parasites to bacteria to viruses, promising improved stability and simpler manufacturing. To improve nanobody discovery efficiency, we developed an integrated experimental and computational pipeline for detailed characterization of the target binding properties of complete alpaca immune repertoires using our custom Nanobody Meta-clustering Analysis Platform (NanoMAP). We tested our pipeline on three distinct pools of targets, immunizing two alpacas with each pool and generating cDNA and phage display libraries from their immune repertoires. We then panned the phage libraries on each target. To produce more detailed binding information, we performed panning variations using subunits, natural variants, intact pathogens, and binding site competitors. Deep sequencing reads from nanobody libraries before and after each panning were pooled and analyzed with NanoMAP to identify nanobody clonal families and assess their levels of enrichment from the library in each panning, reflecting their affinities. NanoMAP outperformed standard clustering methods, producing clonal families that are coherent in sequence and function and detecting rare but high affinity families. By aggregating sequencing data within clonal families, NanoMAP produced reliable and rich data on nanobody repertoire binding phenotypes for each antigen, enhancing nanobody discovery capabilities.
Flores-Mora, F. E.; Brodsky, J.; Cerna, G. M.; Tse, A.; Hoover, R. L.; Bartelle, B. B.
Despite >50 years of methods development, specific antibodies are still generated at low throughput and remain in high demand across biotechnology. Most biologics and immunoprobes are monoclonal antibodies, developed using a combination of inoculating animals with a target antigen, engineered candidate libraries, and multiple rounds of selection using phage or yeast display. Here we introduce a synthetic biology scheme to eliminate the need for nearly all of these steps, by combining Surface display on E. coli and Phage display with the microvirus ΦX174, Assisting Continuous Evolution (SurPhACE). Instead of building libraries for screening, SurPhACE runs a closed evolutionary program. A typical experiment can have 10^11 mutant candidates under active selection, with complete turnover of the mutant population every 30 min, or >5 x 10^12 unique mutants per day, using less than 100 mL of bacterial culture media. We demonstrate SurPhACE for optimizing a nanobody to a related epitope, and develop novel nanobodies for an arbitrary target using a minimal starting library to establish a proof of concept and identify best practices for this scalable method for generating protein binders.
de Kanter, J. K.; Smorodina, E.; Minnegalieva, A.; Arts, M.; Blaabjerg, L. M.; Frolenkova, M.; Rawat, P.; Wolfram, L.; Britze, H.; Wilke, Y.; Weissenborn, L.; Lindenburg, L.; Engelhart, E.; McGowan, K. L.; Emerson, R.; Lopez, R.; van Bemmel, J. G.; Demharter, S.; Spreafico, R.; Greiff, V.
Accurately modeling antibody-antigen interactions requires distinguishing intrinsic binding affinity ("protein-interaction") from protein biophysical properties ("protein-quality"), including folding, stability, and expression. However, high-throughput mutational measurements commonly used to train and benchmark computational models often conflate these effects, obscuring the true determinants of molecular recognition. Here, we present an experimental and analytical framework to disentangle protein-interaction effects from protein-quality effects in single-domain antibody (VHH)-antigen binding. Using a large-scale deep mutational scanning (DMS) dataset spanning four VHH-antigen complexes, with single and double mutations in both partners, we introduce control binders to quantify protein-quality changes independently of protein-interaction. This enables decomposition of experimentally measured affinity into protein-interaction and protein-quality components at scale. Leveraging the disentangled dataset, we evaluated state-of-the-art structure- and sequence-based models for protein-quality and protein-interaction prediction and show that their performance largely reflects protein-quality rather than protein-interaction effects. Our results highlight a major confounder in current datasets and suggest that accounting for protein-quality will be essential for training next-generation affinity-prediction models.
Nomenclature
Antibody-related terms
- Primary VHH: The VHH of a VHH-antigen complex for which the paratope and the epitope were mutated.
- Control VHH: A second VHH that binds to the same antigen as the primary VHH but has non-overlapping epitope positions and therefore does not bind to any of the mutated antigen positions.
Affinity-related terms
- Real affinity: "The strength of the interaction between two [...] molecules that bind reversibly (interact)" [1]. In the context of antibody-antigen binding, it quantifies interactions between active proteins, which are expressed and correctly folded [2] and are therefore functionally and biologically active (see below). It is commonly quantified by the equilibrium dissociation constant, KD.
- Observed affinity (°KD): The interaction strength experimentally measured between two molecules. Unlike real affinity, this value is confounded by the biophysical properties of the individual binding partners, specifically their folding, stability, and expression levels. Consequently, the observed affinity often differs from the real/intrinsic affinity if a significant fraction of the protein population is inactive [3]. NOTE: Unless otherwise specified, °KD is reported in - log10 space. For example, a °KD of -9 corresponds to 10^-9 M or 1 nM.
- Change in observed affinity (Δ°KD): The shift in the observed affinity between two proteins upon mutation, reported as the log10-transformed fold change. A value of 1 reflects a 10-fold difference, a value of 2 a 100-fold difference, etc. This aggregate change resolves into two distinct biophysical components [2, 4]:
  - Protein-interaction change: The change in the intrinsic thermodynamic affinity between the two binding partners, each in its active state (i.e., the specific change in interface Gibbs free energy because both enthalpy and entropy are considered).
  - Protein-quality change: The change in the fraction of the mutated protein population that is biologically active, meaning it is expressed, correctly folded, and stable [2, 5].
    - Folding: The process that guides the polypeptide chain toward its native conformation, which is a prerequisite for forming a functional binding site.
    - Stability: The thermodynamic capacity to maintain the folded structure over time and under physiological conditions. Stability (decrease in Gibbs free energy from the unfolded to the folded state) ensures the binding interface remains intact and prevents competing processes such as aggregation [6].
    - Expression: The steady-state abundance of the protein. This is largely dependent on proper folding and stability, as cellular quality control mechanisms degrade proteins that fail to fold or remain stable at functional concentrations.
- Change in relative affinity (ΔΔ°KD): The difference between the Δ°KD of the primary VHH compared to the control VHH for a given epitope mutation.
Model-related terms
- ESM-IF1 sc: Single-chain (sc) structure-conditioned inverse folding model (ESM-IF1), using the isolated monomer structure of the mutated protein: either the VHH or the antigen [7].
- ESM-IF1 mc: Multi-chain (mc) structure-conditioned model (ESM-IF1), using the full complex structure (both antibody and antigen) [7].
- Stability prediction score: Score that represents the predicted change in stability based on a single mutation, normally represented as ΔΔG.
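The affinity definitions in the nomenclature above can be made concrete with a small worked example. This is a sketch under stated assumptions rather than the authors' analysis code: it takes °KD as log10 of the molar KD (matching the example in which -9 corresponds to 1 nM), assumes the sign convention "mutant minus wild type" so that positive values mean weaker binding, and uses invented KD values and hypothetical function names.

```python
import math


def log10_kd(kd_molar: float) -> float:
    """Observed affinity in log10 space (assumption: log10 of KD in molar)."""
    return math.log10(kd_molar)


def delta_observed(kd_wt_molar: float, kd_mut_molar: float) -> float:
    """Change in observed affinity as a log10 fold change.

    Assumed sign convention: mutant minus wild type, so +1 means a
    10-fold weaker observed affinity and +2 a 100-fold weaker one.
    """
    return log10_kd(kd_mut_molar) - log10_kd(kd_wt_molar)


def delta_delta_observed(primary_wt: float, primary_mut: float,
                         control_wt: float, control_mut: float) -> float:
    """Change in relative affinity: primary-VHH change minus control-VHH change.

    The control VHH does not contact the mutated epitope positions, so its
    change is read here as a proxy for the protein-quality effect of the
    antigen mutation.
    """
    return (delta_observed(primary_wt, primary_mut)
            - delta_observed(control_wt, control_mut))


if __name__ == "__main__":
    # Invented values for one epitope mutation (KD in molar).
    primary = delta_observed(1e-9, 1e-7)   # 1 nM -> 100 nM, 100-fold weaker
    control = delta_observed(2e-9, 2e-8)   # 2 nM -> 20 nM, 10-fold weaker
    relative = delta_delta_observed(1e-9, 1e-7, 2e-9, 2e-8)
    print(round(primary, 3), round(control, 3), round(relative, 3))
```

In this invented case the primary VHH loses about two log units and the control about one, so roughly one log unit of the loss would be attributed to the interface itself and the rest to antigen quality.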
Hoormann, M. J.; Becza, N.; Yao, L.; Kuerten, S.; Tary-Lehmann, M.; Sautto, G. A.; Lehmann, P. v.; Kirchenbaum, G. A.
The biological efficacy of an antibody is largely defined by its affinity. Moreover, because the binding affinity of an antibody can span orders of magnitude, each antigen-specific B cell would not be expected to contribute equally to humoral defense: high-affinity antibodies are likely to possess increased potency in comparison to those with lower affinities. Hence, assessing the affinity spectrum of a person's antigen-specific B cell repertoire would provide valuable information on their immune competence. Currently, cloning and expression of large numbers of monoclonal antibodies (mAbs) per test subject would be required to gain such insights, but this is impractical in the context of large-scale immune monitoring efforts. Here, we introduce a variant of the B cell ImmunoSpot assay that can simultaneously assess the relative affinity distribution of hundreds of individual B cells in a test sample. We also demonstrate its suitability for high-throughput assay workflows that require minimal labor and exploit machine-assisted image analysis software tools. Specifically, as proof of principle, we verified that B cell hybridomas secreting mAbs of different affinities for the SARS-CoV-2 Spike protein could readily be distinguished through simple titration of the soluble antigen detection probe. Furthermore, using this assay methodology we provide evidence for affinity maturation within the Spike-specific memory B cell repertoire following a second COVID-19 mRNA vaccination. Collectively, we introduce a scalable, high-throughput-compatible methodology with the potential to fill a major gap in the immune monitoring field: characterizing the affinity distribution of antigen-specific B cells in large study cohorts.
Agu, C. V.; Martelly, W.; Cook, R. L.; Gushgari, L. R.; Kesiraju, S.; Moreno, S.; Yapici, E.; Mohan, M.; Takulapalli, B.
Epitope mapping is central to rational antibody drug design, affinity optimization and the anticipation of therapeutic resistance mechanisms. Here, we demonstrate the use of Sensor Integrated Proteome on Chip (SPOC) technology for single amino acid resolution epitope mapping. By generating high throughput (HTP) binding kinetics data, we identify important residues within the target epitope whose mutations alter drug-target interactions. The SPOC platform integrates simultaneous HTP cell-free production of folded proteins in nanowells from immobilized plasmid DNAs or linear expression cassettes and capture onto biosensor chips for subsequent label-free binding kinetic analysis using surface plasmon resonance (SPR). The model system comprised the extracellular domain (ECD) of CD20, a membrane-spanning 4-domain family protein, screened against its FDA-approved therapeutic monoclonal antibodies (thAbs) - rituximab and ocrelizumab. Using our proprietary POC protein nanofactory system, a partial deep mutationally scanned (DMS) CD20 ECD mutant library of 79 variants was produced on SPOC biosensor chips via rational single amino acid substitutions of the epitope and surrounding residues with alanine, aspartic acid, lysine, and serine, collectively representing four broad classes of amino acid side chain chemistries: nonpolar, acidic, basic, and polar neutral. The SPOC protein biosensor chip was then screened with both thAbs using SPOC SPR to generate kinetic affinity data, evaluate mutations that led to affinity loss or gain, and ultimately identify critical epitope residues that interface with the antibodies. Most mutations within the rituximab and ocrelizumab epitopes - EPANPSEK and YNCEPANPSEKNSPST, respectively - resulted in complete loss of binding or >25% increase in apparent KD. 
Notably, N171, P172, and S173 mutations, irrespective of side chain substitution, resulted in complete loss of rituximab binding while at least three diverse side chain substitutions at E168, P169, N171, P172, S173, E174, K175, and T180, led to complete loss of binding for ocrelizumab. These outcomes identify the listed residues as the most critical contact points for their respective antibodies. Interestingly, we also found that functional side-chain substitutions at some residues flanking the epitope increased affinity. This indicates that these non-epitope residues contribute to antibody contact, and that polarity at these sites is a tractable lever for affinity modulation by targeting the corresponding contact residues on the antibody CDRs. The proposed SPOC approach of screening drug candidates against on-chip library of mutationally-scanned therapeutic targets is relevant in the early phase of drug development to resolve epitopes at the residue-level to support more informed down-selection of candidates. It facilitates cost-effective improvement of thAbs, enhancing therapeutic efficacy across a wide array of therapeutic targets, including rare variants that might otherwise lead to therapeutic resistance.
Meda, R. S.; Doshi, J.; Iyer, E.; Shastry, S.; Mysore, V.
Therapeutic nanobodies must combine target binding with biophysical and chemical properties that determine manufacturability, stability, and clinical viability, collectively termed developability, yet most computational design pipelines still treat developability as a post-hoc filter rather than an integrated training objective. We present Aiki-GeNano, a three-stage language-model alignment pipeline for epitope-conditioned nanobody generation that integrates multiple developability signals directly into training, using only sequence information and previously published predictors. Across 65 target epitopes and relative to the supervised baseline, the combined pipeline raised predicted mean melting temperature by 6.6 °C, halved isomerization-motif severity, reduced deamidation, N-glycosylation sequons and CDR methionine-oxidation motifs, and preserved predicted humanness and solubility. On a shared 10-target GPCR benchmark, Aiki-GeNano achieved the highest predicted melting temperature and the lowest isomerization severity among five contemporary VHH generators. Starting from ProtGPT2 and a 1.35-million-pair binder dataset generated on an mRNA-display platform, the pipeline applies supervised fine-tuning, Direct Preference Optimization on 522,800 pairs ranked by a composite of selectivity, predicted thermal stability, solubility, and humanness, and Group Reward-Decoupled Policy Optimization against six sequence-based rewards (FR2 hydrophobicity, hydrophobic-patch coverage, chemical-liability motifs, Wilkinson-Harrison expression probability, VHH hallmark residues, scaffold integrity). Generated sequences differ from the nearest training sequence by a mean of 8.1-9.0 amino acids out of 126, and two alternative training trajectories converge to distinct amino-acid-composition strategies with similar liability outcomes but different thermal-stability gains, indicating initialization-dependent convergence of the reward-optimized policy.
Predicted humanness was preserved at the level of the camelid VHH scaffold of the training library, a data-side limitation rather than a methodological one, since the framework was effectively constant across all preference pairs. Applicability to the drug discovery and development pipeline, limitations of predicted-property evaluation, and future work are discussed.
Melo, R.; Viegas, T.
Single-chain variable fragments (scFvs) are widely used in diagnostic and therapeutic applications. These antibody fragments comprise two antibody variable domains connected by a flexible peptide linker whose properties critically influence folding, stability, oligomeric state, and antigen-binding. Therefore, careful linker selection represents a key step in scFv design. Guanylyl Cyclase C (GUCY2C) is a tumor-associated cell surface receptor expressed in gastrointestinal malignancies, including more than 90% of colorectal cancer (CRC) cases across all disease stages. Its restricted physiological expression pattern makes GUCY2C an attractive target for immunotherapy and precision oncology therapies. Here, we investigated the structural and functional consequences of incorporating alternative linker designs into an anti-GUCY2C scFv. Using molecular modeling, protein-protein docking, and molecular dynamics (MD) simulations, we evaluated the conformational stability, interdomain organization, and antigen-binding interactions of each construct. Our results provide a dynamic, structure-based assessment of how linker composition influences GUCY2C recognition and scFv structural behavior. Furthermore, this work establishes a computational framework for the rational optimization of GUCY2C-targeted antibody fragments.
Bajgain, Y.; Guo, M.; Hager, K. M.; Nguyen, A. W.; Zhang, Y.; Maynard, J. A.
Antibody-dependent cellular cytotoxicity (ADCC) is a major mechanism of action for many FDA-approved therapeutic antibodies that is driven by interactions between the antibody Fc and Fcγ receptors (FcγRs) on immune effector cells. Murine models used for preclinical antibody evaluation currently have limited predictive value for clinical ADCC performance due to interspecies differences in Fc-FcγR interactions. The molecular determinants governing Fc-FcγR engagement in mice remain poorly defined, complicating the interpretation of murine ADCC data and its clinical relevance. To address this, we present the high-resolution crystal structure of the receptor that regulates Fc-mediated cytotoxicity in mice, mouse FcγRIV, alone and in complex with mouse IgG2a Fc. This complex preserves key features of the human IgG1 Fc-human FcγRIIIa interface, which mediates ADCC in humans, including salt bridges, hydrogen bonds, and a proline sandwich. However, subtle variations in receptor orientation, Fc-FcγR electrostatics, and glycan positions reduce human IgG1 Fc-mouse FcγRIV binding affinity, resulting in species-restricted Fc-FcγR mediated immune responses. Modeling of human IgG1 Fc interactions with mouse FcγRIV predicted steric clashes, suggesting opportunities to modulate the interaction. One structure-guided substitution variant of human IgG1, Fchumo, maintains comparable human FcγRIIIa engagement with enhanced binding to and activation of mouse FcγRIV, relative to human IgG1 Fc. This study provides proof-of-concept for engineering human Fc domains for cross-species FcγR recognition and provides a strategic framework to improve the predictive power of in vivo preclinical models.
Grabarczyk, D.; Kocikowski, M.; Parys, M.; Cohen, S. B.; Alfaro, J. A.
Motivation: Encoding antibodies (Abs) and nanobodies (Nbs) as mRNA enables in vivo production of therapeutic proteins. However, this approach requires meeting two species-dependent requirements: the mRNA encoding must support efficient expression in the host species, and the encoded protein sequence must resemble the natural Ab repertoire of the recipient species to minimize immunogenicity. These requirements motivate species-conditioned generative models for joint mRNA and protein design. Results: We propose SpeciefAI, a transformer-based model for multi-species harmonisation of Ab and Nb sequences by generating novel Framework Regions (FRs) tailored to input Complementarity-Determining Regions (CDRs). Our model works directly in the mRNA space and learns the correspondence between FRs and CDRs in six species. The model is capable of generating sequences with a highly similar distribution to natural sequences and a mean absolute difference in codon adaptation index (CAI) of 0.013 and 0.033 for humans and dogs, respectively. We show that the generated human sequences are highly human (0.95 T20 score) and canine sequences highly canine (0.95 cT20 score). We furthermore demonstrate that we can generate diverse candidate sequences using our method. Availability and Implementation: Source code is available at https://github.com/Dominko/SpeciefAI. OAS and COGNANO data are publicly available at https://opig.stats.ox.ac.uk/webapps/oas/ and https://cognanous.com/datasets/vhh-corpus (preprocessed versions available upon request). Canine data is available at https://zenodo.org/records/18301526.
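For readers unfamiliar with the codon adaptation index mentioned above, the following sketch computes CAI in its standard form: the geometric mean of per-codon relative-adaptiveness weights. It is not taken from the SpeciefAI code base. The weight table is a toy with made-up values, the function name is hypothetical, and how the authors aggregate CAI differences between generated and natural sequences is not stated in the abstract.

```python
import math

# Toy relative-adaptiveness weights (assumed values for illustration only).
# Each weight is a codon's usage frequency divided by the frequency of the
# most-used synonymous codon; real weights come from a species-specific
# codon usage table.
TOY_WEIGHTS = {
    "GCC": 1.00, "GCT": 0.66,  # Ala
    "AAG": 1.00, "AAA": 0.75,  # Lys
    "CTG": 1.00, "CTC": 0.50,  # Leu
}


def codon_adaptation_index(seq: str, weights: dict[str, float]) -> float:
    """Geometric mean of the weights of all scorable codons in `seq`.

    Accepts DNA or mRNA letters (U is mapped to T). Codons missing from
    `weights` are skipped; trailing bases that do not fill a codon are
    ignored.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    optimal = codon_adaptation_index("GCCAAGCUG", TOY_WEIGHTS)
    suboptimal = codon_adaptation_index("GCTAAACTC", TOY_WEIGHTS)
    # Absolute CAI difference between two sequences encoding the same peptide.
    print(optimal, round(suboptimal, 3), round(abs(optimal - suboptimal), 3))
```

A sequence built only from the most-used codons scores 1.0 under this definition, so the absolute difference printed at the end shows how far a synonymous recoding drifts from that optimum.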
Ahmed, S.; Devalle, F.; Leisen, L.; Pham, T.; Amofah, B.; Lee, A.; Hutchinson, M.; Chakiath, C.; DiChiara, J.; Farzandh, S.; Kreitz, M.; Hinton, A.; Mody, N.; Dippel, A.; Kaplan, G.; Pouryahya, M.
Antibody-based biologics are expanding rapidly, yet challenges in development from self-association, high viscosity, aggregation, and unfavorable clearance underscore the need for accurate in silico screening. Clone self-interaction biolayer interferometry (CSI-BLI) is a plate-based, low-material assay of weak, reversible self-association that serves as an early proxy for high-concentration viscosity and a complementary predictor of in vivo clearance. In a 246-mAb panel, CSI-BLI moderately correlates with viscosity; further, in hFcRn Tg32 mice (41 antibodies), CSI-BLI strongly associates with clearance. Here, we present an end-to-end framework that distinguishes high versus low self-interacting clones (CSI-BLI class) by coupling a fine-tuned protein language model (ESM-2) with residue-aligned 3D context from AlphaFold-predicted structures encoded as residue graphs. Disentangled multi-stream attention fuses sequence content, chain-aware positional information, and structural signals to capture spatially proximate interactions that are distant in sequence. Edit-distance-controlled splits across 1499 IgGs and 988 VHHs assess generalization. The structure-aware model achieves the highest hold-out performance (VHH-Fc F1 = 0.76; IgG F1 = 0.57), while a sequence-only disentangled variant outperforms a standard PLM baseline without structural inputs. Complementary biophysical feature-based models, built from AlphaFold structures and sequence/structure-derived physicochemical descriptors with cluster-aware selection, deliver robust, interpretable performance (VHH F1 = 0.72; IgG F1 = 0.57), with SHAP analyses highlighting charge/dipole, hydrophobicity, and aggregation-propensity drivers across CDRs and frameworks. 
This interaction-aware sequence-structure framework, supported by interpretable feature models, is extensible to other developability endpoints and broader protein classification tasks where joint modeling of language-derived representations and residue-level geometry is advantageous.
Stefanius, K.; Raut, S.; Presley, B.; Dave, D. P.
Traditional clonogenic assays remain central to evaluating the self-renewal capacity of tumor cells. However, the assay relies on subjective endpoint measurements, is restricted to two-dimensional monolayer growth, and lacks the single cell resolution required to resolve heterogeneous expansion behaviors. We describe a high-density microwell array-based platform for quantitative assessment of single cell clonogenic growth outcomes, defined by cell count distributions spanning non-dividing, slow-dividing, and fast-dividing three-dimensional colony forming phenotypes. This approach links initial single-cell occupancy to defined growth outcomes across thousands of indexed microwells per well. The platform integrates high-density, low-adhesion microwell arrays within industry standard device plate formats and an automated image analysis pipeline incorporating machine learning, enabling parallel quantification of spatially indexed founder-derived microwells using widely accessible automated imaging systems. The assay was implemented in both 4-well and 96-well plate formats to evaluate reproducibility and scalability across different plate configurations. Using three glioblastoma cell lines as model systems, we demonstrate reproducible single founder occupancy and consistent clonal growth outcome distributions across replicate formats. This integrated microscale assay platform enables systematic quantitative characterization of clonogenic expansion capacity at single cell resolution and is compatible with applications in cancer biology, therapeutic testing, and functional single cell phenotyping. By resolving single-cell persistence, limited expansion and high expansion outcomes within a scalable high-density format, this approach expands the analytical resolution of single cell clonogenic profiling beyond traditional binary colony scoring.
Liu, G.; He, M.; Sun, L.; Cheng, F.; Zhang, Y.
Large language model (LLM) agents have automated tool use in chemistry, but orchestrating multi-step computational biology workflows, spanning structure prediction, protein design, and covalent conjugation, remains manually intensive. Here we present Open Intelligence Hub (OIH), an autonomous LLM-agent platform that dynamically plans and executes 32 containerised tools for protein binder design and antibody-drug conjugate (ADC) prioritization. OIH introduces tier-based decision routing, ipSAE-guided interface filtering, and failure-to-knowledge distillation from 265 curated cases. Across five oncology targets, the agent correctly classified all five and required human correction for hotspot selection in only one case, producing binders ranked by ipSAE (Nectin-4 ipTM = 0.87, HER2 ipTM = 0.85). A controlled ablation suggests that the agent's PPI-informed routing yields higher downstream ipTM and ipSAE scores than epitope-guided alternatives. The LLM-agnostic architecture enables deployment with local or commercial models without pipeline changes. All results are computational predictions awaiting experimental validation.
JIA, S.; Lysenko, A.; Boroevich, K. A.; Sharma, A.; Tsunoda, T.
Prognostic stratification in multiple myeloma (MM) relies on staging systems that assign patients to fixed categories at diagnosis and discard the temporal information that accumulates during treatment. We developed a dynamic multimodal framework that predicts residual overall survival using observation windows ranging from 1 to 18 months post-diagnosis. The model integrates DeepInsight-transformed gene expression representation, longitudinal laboratory measurement trajectories across 10 analytes, and treatment history for three drug classes through an adaptive fusion mechanism that accounts for missing clinical observations. On the MMRF CoMMpass cohort (n = 752), five-fold cross-validation yielded a concordance index (C-index) of 0.773 ± 0.024 and a time-dependent AUC at a 1-year prediction horizon (tdAUC1yr) of 0.789 ± 0.021, outperforming all evaluated baseline methods including DeepSurv (0.633 ± 0.095) and random survival forests (0.636 ± 0.024) on matched cross-validation splits. Modality ablation identified longitudinal laboratory measurements as the strongest individual contributor (C-index 0.693); the DeepInsight spatial encoding of gene expression yielded higher discrimination than a multilayer perceptron (MLP) baseline operating on the same features (0.624 vs. 0.596). Kaplan-Meier analysis showed significant prognostic group separation at all primary landmarks (log-rank p < 0.001; hazard ratios 3.46-3.93). A distilled student model retaining only the DeepInsight representation and five baseline clinical features achieved C-index 0.672 and tdAUC1yr 0.740 on an independent microarray cohort (GSE24080, n = 507) without retraining. Interpretability analysis identified prognostic associations consistent with established myeloma biology, including ubiquitin-proteasome pathway genes, endoplasmic reticulum stress markers, and Interferon Alpha Response pathway enrichment.
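For readers unfamiliar with the concordance index quoted above, the following is a minimal sketch of Harrell's C-index on right-censored data. It is a generic textbook formulation and not the authors' evaluation code; the function name and the convention that a higher score means higher risk are assumptions (a model that predicts survival time would be scored on its negated output).

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores, higher meaning earlier expected event
    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's observed time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: three patients, the last one censored.
toy_times = [1, 2, 3]
toy_events = [1, 1, 0]
perfect = concordance_index(toy_times, toy_events, [3, 2, 1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.773 and the baseline values near 0.63 should be read.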
Nevarez-Mejia, J.; Trevizani, R.; Abawi, A.; Johansson, E.; Sutherland, A.; Grifoni, A.; da Silva Antunes, R.; Sette, A.
Defining HLA restriction of T cell epitopes is essential for understanding immune responses in infectious disease, autoimmunity, and vaccine design. Current bioinformatic programs, including the IEDB RATE tool, enable inference of single-HLA restrictions from immune response data of HLA-typed donors. However, T cell epitopes are frequently presented by multiple HLA alleles, a phenomenon termed promiscuous restriction, limiting the utility of single-allele approaches. To address this limitation, we developed COMBO-RATE, an extension of RATE that systematically evaluates combinations of HLA alleles to identify multi-allelic restriction patterns. Analysis of three independent datasets spanning distinct antigen systems and different epitope discovery strategies revealed that promiscuous restriction is a near-universal feature of immunodominant epitopes. Focusing on the 43 immunodominant CD4 T cell epitopes identified in a B. pertussis genome-wide screen, COMBO-RATE outperformed conventional RATE, identifying restrictions for 35 of 43 epitopes, compared to 24 by RATE alone, and uncovered 64 additional allele restrictions, including 29 unique alleles. Experimental validation using single-HLA transfected cell lines and antigen presentation assays confirmed COMBO-RATE-inferred restrictions, demonstrating that a single epitope can be independently presented by distinct HLA alleles. Overall, COMBO-RATE provides a robust and scalable framework for defining complete HLA restriction profiles from existing population response data, with important implications for the design of vaccines requiring broad HLA coverage across genetically diverse populations. This pipeline is available as both a Python package and a user-friendly web application.
Gudbergsson, J. M.; Etzerodt, A.
With the introduction of dedicated nanoscale flow cytometers, the need for suitable compensation beads has emerged. Here, we present a rapid and cost-effective method to generate ~100 nm antibody-binding compensation beads compatible with a wide range of antibody species for use in nanoscale flow cytometry. This approach may provide a practical interim solution until commercial alternatives become available.