Structure
○ Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match Structure's content profile, based on 175 papers previously published here. The average preprint has a 0.14% match score for this journal, so anything above that is already an above-average fit.
Ha, B. H.; Tkacik, E.; Gazgalis, D.; Kang, H.; Jang, D. M.; Chakraborty, S.; Jeon, H.; Eck, M. J.
Upon RAS-driven membrane recruitment, RAF kinases ARAF, BRAF and CRAF are activated via formation of homo- or hetero-dimers to initiate signaling through the MAP kinase cascade. Although RAF heterodimers are important for both physiologic and oncogenic signaling, they have been little studied at a structural and biochemical level. Here we report the preparation, biochemical characterization, and the cryo-EM structure of a 14-3-3-bound BRAF/CRAF heterodimer complex. The heterodimer exhibited kinetic parameters and sensitivity to a panel of twelve structurally diverse RAF inhibitors that were closely similar to, or intermediate between, those of BRAF and CRAF homodimers. Cryo-EM structures of the heterodimer with and without MEK1 revealed an overall organization essentially identical to that of RAF homodimers, but with an asymmetric interaction in the MEK1-bound structure in which the BRAF N-terminal acidic (NtA) motif extends across the dimer interface to engage the CRAF RKTR motif. Mutagenesis of this interface unexpectedly revealed that replacing the acidic NtA sequence with a basic RARA sequence yields highly active RAF homodimers and heterodimers, demonstrating that negative charge in the NtA motif is not required for activity. Collectively, our findings suggest that the charge state of the NtA motif influences RAF activity through effects on local backbone dynamics and the stability of the inactive kinase conformation, rather than via stereospecific recognition across the dimer interface.
Liu, Y.; Lee, K.-Y.; He, Y.; Kim, D.; Chang, H.; Cherezov, V.; Feigon, J.; Qin, P. Z.
Double-stranded DNA minicircles have been observed in a variety of biological settings and are also widely employed in biotechnology, therapeutic applications, and basic research. Here, we report a cryo-EM structure of a 95-basepair minicircle (dsMC95) at 5.3 Å resolution. dsMC95 forms a closed ring as designed and no local deformation is observed. The two DNA strands are fully resolved, with the major and minor grooves clearly distinguishable. Analysis reveals a nine-fold periodicity in the helical twist, which corresponds to approximately 10.56 base pairs per turn. Together with groove width analysis, the data indicate that dsMC95 maintains a B-DNA configuration. The dsMC95 ring exhibits an in-plane ellipticity of 1.13 and an out-of-plane displacement of 15°, with differences in out-of-plane displacements observed between the two half-segments. The dsMC95 structure, which is the only free DNA cryo-EM structure with a resolution better than 6 Å to date, allows comparison to other structures to better understand DNA physical features such as bending. The findings advance our understanding of DNA structure under topological constraints and may inform studies of naturally occurring small circular DNA as well as the manipulation of DNA in nanotechnology applications.
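The reported helical repeat appears to follow directly from the ring size and the observed periodicity: a closed 95-bp duplex whose twist pattern repeats nine times around the ring has 95/9 base pairs per turn. A minimal sketch of that arithmetic is given here; the function name is illustrative and not taken from the preprint.

```python
def base_pairs_per_turn(n_base_pairs: int, n_turns: int) -> float:
    """Helical repeat of a closed duplex that completes n_turns around the ring."""
    if n_turns <= 0:
        raise ValueError("n_turns must be positive")
    return n_base_pairs / n_turns


# Values from the abstract: a 95-bp minicircle with nine-fold twist periodicity.
repeat = base_pairs_per_turn(95, 9)

# Average twist angle per base-pair step implied by that repeat.
twist_per_bp = 360.0 / repeat

print(f"{repeat:.2f} bp/turn, {twist_per_bp:.1f} degrees/bp")
```

The result is close to the value usually quoted for relaxed B-DNA in solution (about 10.5 bp per turn), which is consistent with the abstract's conclusion that dsMC95 keeps a B-DNA configuration.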
Briggs, D. C.; Duffy, R. T.; Ateaque, S.; Maslen, S.; Naharaj, H.; Barde, Y.-A.; DiStefano, P. S.; Lindsay, R. M.; Armstrong, P. C.; Peach, C. J.; McDonald, N. Q.
The brain-derived neurotrophic factor (BDNF)-tropomyosin receptor kinase B (TrkB) signalling axis is a key effector of synaptic plasticity and neuroprotection. While TrkB activation is a major objective towards preventing dysfunction of the nervous system, it cannot be reached with exogenous BDNF administration given the unfavourable physicochemical properties of BDNF. In addition, BDNF also activates a tumour necrosis factor pathway by binding to the neurotrophin receptor p75. The TrkB agonist ZEB85 provides an alternative route to the selective activation of TrkB. We report here the structural basis for the interaction between human TrkB and both ZEB85 and BDNF, and reveal that a sulfated tyrosine modification is indispensable for ZEB85 activation of TrkB signalling. Using structure-guided BDNF- and ZEB85-binding deficient TrkB mutants, we assessed their ability to sequester ligands from full-length TrkB in cultured human neurons. We found that the BDNF binding site extends into the extracellular juxtamembrane domain of TrkB but does not require the sulfotyrosine at residue 400 to activate TrkB. Together with biophysical analysis and AlphaFold modelling, these results also explain how BDNF can displace ZEB85 from TrkB through an overlapping epitope. Our findings reveal unique features of TrkB, not present in the related neurotrophin receptors TrkA and TrkC, and suggest new directions to explore the role of sulfotyrosine in TrkB signalling and identify new TrkB-specific protein ligands. One Sentence Summary: Investigation of the mechanism of action of TrkB agonist ZEB85 extends molecular understanding of TrkB activation.
Kinman, L. F.; Grassetti, A. V.; Carreira, M. V.; Davis, J. H.
The emergence of single-particle cryoEM as a powerful method for structure determination has in large part been fueled by its ability to resolve both single static structures and complex conformational landscapes. Indeed, modern approaches to the heterogeneous reconstruction task can resolve 100s-1,000s of different maps from a single cryoEM dataset. How accurate these algorithms are, however, has proven difficult to rigorously assess, due to a lack of suitable benchmark datasets containing both realistic noise features and ground-truth labels. To address this obstacle, we recently developed a series of benchmark datasets that leverage the targeting power of Cas9 and the programmable heterogeneity of DNA to newly offer access to ground-truth per-particle structural labels in real data. Here, we challenged two popular heterogeneous reconstruction algorithms with mixed particle stacks resampled in silico from these datasets, finding that existing approaches resolve the encoded heterogeneity with limited accuracy. In particular, in realistic particle stacks with complex, multi-scale, and multi-axis heterogeneity, we observed that reconstruction of encoded heterogeneity depended strongly on the application of prior information about where heterogeneity was expected, and that individual particle assignments were made with significant error even when the correct structural states were reconstructed. Both molecular breathing motions and data collection features, such as defocus and projection angle, contributed to the observed particle assignment error. These results highlight important shortcomings of existing heterogeneous reconstruction methods and suggest new avenues for method development in both data collection strategies and in heterogeneous classification and reconstruction algorithms.
Zhang, S.; Maddipatla, S. A.; Vedula, S.; Marx, A.; Bronstein, A. M.
β-turns are among the most common structural motifs in proteins, yet their conformational dynamics and sequence determinants remain incompletely understood. Here we present a data-driven classification and dynamic analysis of β-turn conformations using large-scale molecular dynamics trajectories from the mdCATH database. Clustering of backbone dihedral angles using a cross-bond Ramachandran representation identifies six β-turn types, including a previously uncharacterized hybrid I/I' cluster that combines geometric features of canonical type I and I' conformations. Time-resolved analysis indicates that this hybrid state acts as a transient intermediate state of β-turns. Transitions observed in molecular dynamics simulations closely match NMR ensembles and altlocs detected in X-ray crystal structures, with the most dominant exchanges occurring between type I and II, and between type I' and II' turns. Sequence analysis shows that each turn type exhibits characteristic amino acid preferences at the central residues (i + 1 and i + 2). Within these overall preferences, specific residue pairs display distinct biases toward static or dynamic behavior. Targeted in silico substitutions that interchange dynamic- and static-enriched residue pairs shift the conformational behavior of turns accordingly, providing direct support for these sequence-dynamics relationships. Analysis of flanking secondary-structure environments reveals that structural context further modulates turn flexibility, with strand- and coil-associated turns exhibiting higher dynamic propensity than helix-associated turns. Together, these results reveal how sequence composition and structural context jointly shape the conformational landscape of β-turns.
Vangos, N. E.; DeLear, P. E.; Thomas, E. C.; Verhey, K.; DeSantis, M. E.; Zanic, M.; Sept, D.; Cianfrocco, M. A.
Microtubules are dynamic filaments of tubulin heterodimers that comprise an essential part of the eukaryotic cytoskeleton [1]. The nucleotide state of tubulin controls microtubule dynamics: stable GTP-microtubules favor polymerization, whereas unstable GDP-microtubules drive depolymerization [2]. Anticancer compounds such as Taxol (paclitaxel) target microtubule dynamicity by preventing microtubule depolymerization [3,4]. Despite decades of work, the molecular basis of microtubule dynamics remains poorly defined. Using cryo-EM, we determined ~2.2 Å structures of human microtubules in GTP-like (GMPCPP) and GDP states. Comparison of these two states revealed switch-like structural changes as tubulins transition from the pre-hydrolysis (GMPCPP) to the post-hydrolysis (GDP) state. Additional structure determination of Taxol-bound microtubules at ~2.2 Å showed that Taxol binding converts the microtubule lattice into a pre-hydrolysis state by reversing the structural switches flipped during GTP hydrolysis. Focusing our analysis on the microtubule seam shows that the pre-hydrolysis conformation of GMPCPP or Taxol-GDP exhibits favorable lateral interactions at the seam, with lattice deformations clearly visible at the GDP seam. Together, our data show the existence of structural switches in tubulin that are coupled to the nucleotide state and are exploited by Taxol to stabilize microtubules into a pre-hydrolysis-like state.
Weinert, T.; Standfuss, J.; Seidel, H. P.
Macromolecular crystallographic refinement underpins structural biology, yet existing software packages often lack accessible, modular codebases amenable to rapid method development. Here, we introduce TorchRef, a PyTorch-based crystallographic refinement framework that exposes all refinable parameters (atomic coordinates, displacement parameters, occupancies, and scale factors) to automatic differentiation. The framework implements FFT-based structure-factor calculations, the French-Wilson treatment of intensities, bulk-solvent modeling with established mask parameters, and stereochemical restraints from the CCP4 Monomer Library. A modular target architecture allows loss functions to be combined, weighted, and extended independently of the core refinement machinery. Validation against 1,000 PDB structures demonstrates that TorchRef-based refinement reproduces a median R-free within 1% of Phenix while maintaining comparable model quality. Structure factor calculation in TorchRef scales readily across multiple CPU cores and is over 100 times faster on modern GPUs than CCTBX. To showcase how modern methods like time-resolved crystallography can benefit from the flexibility that TorchRef provides, we implemented direct refinement of a typical time-resolved model against amplitude differences, a use case currently not explored by classic refinement programs. TorchRef is released under the MIT license with full API documentation and tutorials, providing an accessible platform for developing and testing new crystallographic refinement protocols. Synopsis: TorchRef is an open-source PyTorch-based crystallographic refinement framework that exposes all refinable parameters to automatic differentiation, delivers GPU-accelerated structure-factor evaluation more than 100x faster than CCTBX, and enables new workflows, such as direct refinement against amplitude differences in time-resolved crystallography.
Cho, Y.; Tsuboyama, K.; Litberg, T. J.; Jung, M. D.; Obisesan, A.; Wang, Q.; Phoumyvong, C. M.; Thibeault, J.; Ovchinnikov, S.; Rocklin, G. J.
Predicting absolute protein folding stability is a long-standing challenge in biophysics, with broad applications in protein design and in understanding genetic variation and evolution. Physics-based simulations have shown limited success at predicting stability and are often computationally intractable, and machine learning methods have been constrained by the lack of sufficiently large experimental datasets. We recently introduced cDNA display proteolysis, a cell-free approach that can measure folding stability for nearly one million protein domains in parallel. Here, we applied this method to measure stability for 1.8 million diverse protein domains 60-80 amino acids in length primarily taken from the MGnify metagenomic database and spanning over 200,000 sequence families. Using this new "MGnify Stability dataset", we developed the predictive models SaProtΔG and ESM3ΔG, which accurately predict absolute folding stability for small domains with root mean squared error of 0.8 kcal/mol over a 6 kcal/mol range (Spearman rank correlation of 0.88). These predictors show high accuracy at predicting effects of substitutions, insertions, and deletions, successfully identify global trends toward higher stability in thermophilic organisms, and improve discrimination of stable and unstable computationally designed proteins. Our results illustrate how megascale biophysical measurements can complement existing evolutionary and structural data to enable accurate absolute stability prediction for small domains.
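The two accuracy figures quoted above, root mean squared error and Spearman rank correlation, measure different things: the first is the typical size of the error in kcal/mol, the second is how well the predicted ordering matches the measured ordering. The sketch below shows one common way to compute both in plain Python. The stability values are invented toy numbers for illustration only, and the helper names are hypothetical rather than part of the preprint's code.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between paired predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy folding stabilities in kcal/mol (invented for illustration).
measured = [0.5, 1.2, 2.0, 3.1, 4.4]
predicted = [0.9, 1.0, 2.9, 2.6, 4.0]

error = rmse(predicted, measured)
rho = spearman(predicted, measured)
print(f"RMSE = {error:.2f} kcal/mol, Spearman rho = {rho:.2f}")
```

Reporting both is useful because a model can rank domains well while being systematically offset in absolute terms, or the reverse.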
Bajgain, Y.; Guo, M.; Hager, K. M.; Nguyen, A. W.; Zhang, Y.; Maynard, J. A.
Antibody-dependent cellular cytotoxicity (ADCC) is a major mechanism of action for many FDA-approved therapeutic antibodies that is driven by interactions between the antibody Fc and Fcγ receptors (FcγRs) on immune effector cells. Murine models used for preclinical antibody evaluation currently have limited predictive value for clinical ADCC performance due to interspecies differences in Fc-FcγR interactions. The molecular determinants governing Fc-FcγR engagement in mice remain poorly defined, complicating the interpretation of murine ADCC data and its clinical relevance. To address this, we present the high-resolution crystal structure of the receptor that regulates Fc-mediated cytotoxicity in mice, mouse FcγRIV, alone and in complex with mouse IgG2a Fc. This complex preserves key features of the human IgG1 Fc-human FcγRIIIa interface, which mediates ADCC in humans, including salt bridges, hydrogen bonds, and a proline sandwich. However, subtle variations in receptor orientation, Fc-FcγR electrostatics, and glycan positions reduce human IgG1 Fc-mouse FcγRIV binding affinity, resulting in species-restricted Fc-FcγR mediated immune responses. Modeling of human IgG1 Fc interactions with mouse FcγRIV predicted steric clashes, suggesting opportunities to modulate the interaction. One structure-guided substitution variant of human IgG1, Fchumo, maintains comparable human FcγRIIIa engagement with enhanced binding to and activation of mouse FcγRIV, relative to human IgG1 Fc. This study provides proof-of-concept for engineering human Fc domains for cross-species FcγR recognition and provides a strategic framework to improve the predictive power of in vivo preclinical models.
Qian, J.; Gong, Y.; Liu, F.; Huang, Y.; Guo, G.; Zhu, Y.; Huang, Q.
Accurate particle picking from noisy cryo-EM micrographs is essential for high-resolution reconstruction. Current deep learning methods rely on manually annotated data, which is labor-intensive, subjective, and limits particle recall under low signal-to-noise ratio (SNR). Here we introduce ParSeek, an automated picker trained entirely on synthetic data without human annotation. Synthetic micrographs are generated by projecting known 3D structures into realistic background patches that reproduce experimental noise. On seven public cryo-EM datasets, ParSeek outperformed Topaz and CryoSegNet on four datasets, achieving the highest F1-score (up to 0.82) and reaching 0.63 on a challenging membrane protein dataset. Density maps from ParSeek-picked particles showed cross-correlation coefficients up to 0.995 with the reference and a minimal resolution difference of 0.1 Å. ParSeek also overcame severe orientation bias on an influenza dataset, yielding a reasonable reconstruction. Applied to three experimental datasets (an antibody-antigen complex and two GPCRs), ParSeek enabled reconstructions at 5.0 Å, 4.0 Å, and 2.8 Å, respectively. The 2.8 Å map resolved side-chain densities and ligand flexibility. This study establishes a fully synthetic-data-driven strategy that eliminates manual annotation for training cryo-EM deep-learning models, paving the way for automated, unbiased particle picking.
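For readers unfamiliar with how an F1-score is obtained for a particle picker, the sketch below shows one plausible scheme: each picked coordinate is paired with the nearest unused reference particle within a distance threshold, and precision and recall are computed from the resulting matches. The abstract does not describe the evaluation protocol, so the greedy matching rule, the threshold, the coordinates, and the function names are all assumptions made for illustration.

```python
import math


def count_matches(picks, truth, max_dist):
    """Greedily pair each pick with the nearest unused reference particle.

    A pick counts as a true positive if an unused reference particle lies
    within max_dist of it; each reference particle can be matched only once.
    """
    unused = set(range(len(truth)))
    true_positives = 0
    for px, py in picks:
        best, best_dist = None, max_dist
        for t in unused:
            dist = math.hypot(px - truth[t][0], py - truth[t][1])
            if dist <= best_dist:
                best, best_dist = t, dist
        if best is not None:
            unused.discard(best)
            true_positives += 1
    return true_positives


def f1_score(true_positives, n_picks, n_truth):
    """Harmonic mean of precision (TP / picks) and recall (TP / reference)."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / n_picks
    recall = true_positives / n_truth
    return 2 * precision * recall / (precision + recall)


# Hypothetical pixel coordinates on a single micrograph.
truth = [(10, 10), (50, 50), (90, 20), (30, 80)]
picks = [(11, 9), (52, 49), (70, 70)]

tp = count_matches(picks, truth, max_dist=5.0)
score = f1_score(tp, len(picks), len(truth))
print(f"true positives = {tp}, F1 = {score:.2f}")
```

Because F1 penalizes both missed particles and false picks, it is a stricter summary than recall alone, which matters when low signal-to-noise micrographs tempt a picker to over-select.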
Powell, W.; Yan, N.; Tse, E.; Sin, N.; Melo, A.; Southworth, D. R.; Gestwicki, J. E.
Tau hyperphosphorylation is linked to tauopathy aggregates, but the effects of individual sites on tau assembly remain unclear. Here, we used protein semisynthesis to generate defined Tau(297-407) proteoforms containing specific combinations of phosphorylation within the PHF1 epitope. Spontaneous aggregation revealed that phosphorylation of S396, and S400 to a lesser extent, promoted nucleation, while phosphorylation of T403 and S404 suppressed assembly. A similar reactivity trend of pS396 > pS400 > WT > pT403 > pS404 was observed for seeded assembly, and all proteoforms, even anti-aggregation ones, were incorporated. The fibrils have similar thermodynamic stability, suggesting that phosphorylation selectively tunes reactivity and not thermodynamics. Cryo-EM revealed that pS400 produces a chronic traumatic encephalopathy (CTE) protofilament conformation. Strikingly, one of the structures observed in the pS400 sample appeared to capture a secondary nucleation step. Together, these studies reveal the importance of positional effects of phosphorylation on tau self-assembly.
Osumi, K. M.; Murray, D. T.
GFAP is a type III intermediate filament primarily found within astrocytes and is known to maintain proper cell structure and mechanical strength. Mutations in GFAP are implicated in the pathology of Alexander disease, a neurodegenerative disease characterized by cytoplasmic inclusions of protein, known as Rosenthal fibers. GFAP has a typical type III intermediate filament domain structure, consisting of a highly conserved alpha-helical rod domain bracketed by an intrinsically disordered N-terminal head and C-terminal tail domains. While the general domain organization of monomeric GFAP and the assembly process for higher order quaternary structures are known, we lack an atomic resolution mechanistic understanding of GFAP assembly into mature filaments. Understanding the structure of GFAP filaments and how mutations disrupt this structure will provide vital information into how mutations produce Alexander disease pathology. As a first step towards a mechanistic description, we characterized GFAP wild type tetrameric and filamentous assemblies using solid state NMR and compared the results to those obtained from an assembly-deficient GFAP mutant. For wild-type GFAP, we observe surprisingly uniform rigid alpha helical structure and can spectroscopically resolve highly mobile intrinsically disordered regions in the filament assemblies. Wild type tetramers show increased mobility, likely arising from the head and tail domains. Mutation of the highly conserved cysteine at position 294 to serine results in an inability to form full-length filament assemblies. We show that the rigid regions of the C294S mutant assemblies largely remain structurally consistent with wild type tetrameric assemblies but differ from wild-type filament assemblies. There is an increase in highly mobile regions for the C294S mutant relative to the wild-type. 
Our results provide a foundation for developing solid state NMR approaches to characterize intermediate filament assembly mechanisms and the interfering effect of disease mutations.
Mainan, A.; Roy, S.; Kirmizialtin, S.
Discrepancies between biomolecular structures resolved by cryo-electron microscopy (cryo-EM) and X-ray crystallography (XRD) often arise from differences in ionic conditions and construct design, yet their mechanistic impact on RNA folding remains unclear. In the SARS-CoV-2 frameshifting stimulatory element, cryo-EM and XRD structures reveal distinct pseudoknot conformations, a bent and a coaxially stacked state, complicating its structure-function relationship. Here, combining all-atom explicit-solvent simulation results with a structure-based electrostatic model, we show that Mg²⁺ ions drive transitions between these states by stabilizing long-range tertiary interactions and modulating local dynamical coupling involving the slippery site and stem 3. Energy landscape analysis reveals distinct folding pathways, while deletion of the slippery segment in crystallographic constructs alters intermediates and produces pathways inconsistent with single-molecule optical tweezer experiments. This study demonstrates how condition-dependent experiments encode complementary interaction-level information and how physics-based computational approaches integrate these to yield a coherent, mechanistic picture of RNA folding.
Kervadec, J.; De Sciscio, M. L.; Mahler, A.; Dufossee, M.; Dupont, C.; Rufin, Y.; Boudier-Lemosquet, A.; James, C.; Teletchea, S.; Manon, S.; D'Abramo, M.; Priault, M.
Multi-domain Bcl-2 family proteins share the ability to form dimers and oligomers, regardless of their pro- or anti-apoptotic activity. Homotypic interactions (pro-pro and anti-anti) and heterotypic interactions (pro-anti) are well-documented, but the role of higher-order organization in their survival/death functions and membrane interactions remains largely unresolved. For anti-apoptotic Bcl-xL, mostly engineered or truncated proteoforms lacking the disordered loop and/or the hydrophobic C-terminal helix have been used as proxies for the full-length (FL) protein to study structural transitions and intermediate states between monomers and homooligomers prior to membrane insertion. Using a minimalist approach with recombinant FL-Bcl-xL (aa 1-233) and artificial nano-membranes, we demonstrate that both the loop and the C-terminal helix are potent contributors to Bcl-xL structural plasticity. Unlike the 3D domain swapping (3DDS) dimers resolved with the C-terminal truncated protein, FL-Bcl-xL organized in solution as dimers bridging the unique Cys151 from two monomers. This spontaneous fold indicates that the C-terminal helix drives FL-Bcl-xL to explore different conformations than truncated Bcl-xL. Yet, dimerization was not a prerequisite for membrane insertion into nanodiscs and Cys151 did not contribute to Bcl-xL survival functions in cells. These data support monomeric Bcl-xL as the minimal functional unit in membranes. Further exploring the frequently deleted disordered loop, we discovered that deamidation of Asn52 and Asn66 to isoAsp, but not to Asp, impairs membrane insertion into nanodiscs. Thus, this reductionist biochemical approach clarifies the loss of tumorigenic function we observed for deamidated Bcl-xL in xenograft experiments in vivo.
Guo, X.
Building and refining cryo-EM atomic models often requires long, project-specific workflows that combine map inspection, prior structural knowledge, restraints, refinement, validation and expert review. Existing programs perform many individual operations, but coordinating them across iterative model-building sessions remains manual and difficult to audit. We present StructAgent, a user-guided multi-agent resource for cryo-EM model building and refinement. StructAgent couples a domain agent for literature-grounded structural reasoning with an execution agent that runs local software, tracks state, recovers from failures and records provenance. Expert approval gates control major model-changing actions. In three case studies, StructAgent refitted a 64-chain proteasome from an earlier template, audited 530 ribosomal metal-ion sites and guided a chemically ambiguous ligand fit in a folate-metabolism enzyme from ongoing work. These demonstrations show that agentic orchestration can convert modeling intent into auditable, reviewable software workflows while preserving expert control and final scientific judgment.
Risi, C. M.; Larrinaga, T.; Kostyukova, A. S.; Gregorio, C. C.; Galkin, V. E.
Cardiac contraction depends on synchronized interactions between myosin-based thick filaments and actin-based thin filaments (TFs). Precise regulation of TF length is vital for cardiac function, as any alteration in length leads to severe myopathies. Actin filaments form the backbone of the TF and have two unequal ends: a fast-growing barbed end and a slow-growing pointed end. In muscle, the barbed end is capped at the Z-line, while the pointed end is regulated by the tropomodulin family of proteins. Tropomodulin caps the pointed end, while leiomodin-2 (Lmod2) promotes actin nucleation and pointed end elongation. Lmod2 has a unique C-terminal extension (CTE) that is important for actin nucleation and binds to the sides of matured TFs. The structural mechanism by which Lmod2 promotes elongation remains elusive. We employed cryo-electron microscopy to visualize the structure of growing pointed ends nucleated by Lmod2 from profilactin. We show that Lmod2's leucine-rich repeat domain (LRR) stabilizes terminal actin subunits by binding across the helical groove of actin. We identified two distinct populations of pointed-end LRR-containing complexes on one or both actin strands. LRR binding pushes the terminal actins outward from their ideal positions in the actin filament, introducing strain at the pointed end that squeezes LRR from the filament's exterior. We also show that the Lmod2 CTE may stabilize Lmod2 binding to the pointed end. We suggest that Lmod2 promotes the addition of new actins to the pointed end but is expelled from the growing filament, thereby maintaining the concentration of Lmod2 required for further elongation.
Ali, M.; Hutchings, J.; Dutta, T.; Jean, N.; Greenan, G.; Montabana, E. A.; Schwartz, J.; Finn, M. G.; Haury, M.; Agard, D.; Carragher, B.; Kopylov, M.; Paraan, M.
Standardized biological specimens are essential for optimizing cryoEM workflows and benchmarking instrument performance. While apoferritin fulfills this role for single-particle analysis, no equivalent exists for cryo-electron tomography. Ribosomes are frequently used but require large datasets due to C1 symmetry and structural heterogeneity, limiting rapid optimization and standardized comparison of workflows. Here, we present PP7 virus-like particles (VLPs) overexpressed in E. coli as a scalable in situ benchmark. VLPs have high orders of symmetry enabling rapid, high-resolution validation of tomographic pipelines from minimal datasets, while their distinct structural features across low to high resolutions provide a practical resolution metric.
Grassetti, A. V.; Kinman, L. F.; Davis, J. H.
Single-particle cryoEM is increasingly used to resolve conformational and compositional ensembles, yet objective evaluation of heterogeneous reconstruction methods remains limited by the scarcity of experimental benchmarks with per-particle ground-truth labels. Indeed, many widely used experimental "benchmark" datasets necessarily validate observed states retrospectively, while purely synthetic datasets provide ground-truth labels but typically fail to capture experimentally realistic complexities including confounding structural heterogeneity, imaging noise, contaminants, and orientation biases, which dominate real-world analyses. Here we develop an experimentally grounded benchmark dataset for heterogeneous reconstruction using catalytically inactive Streptococcus pyogenes Cas9 bound to a constant sgRNA and to target DNA duplexes engineered to carry extensions of defined length. We assembled, purified, vitrified, and imaged thirteen complexes independently, such that the dataset-of-origin provides an unambiguous label for each particle's encoded state while preserving the full experimental complexity of cryoEM data. Independent refinements of the pure datasets recover the engineered DNA-extension signal and define a simple quantitative readout, DNA-extension occupancy, that increases monotonically with designed extension length. The same reconstructions also reveal substantial non-encoded conformational variability within the Cas9 core, showing that this benchmark embeds a known structural signal within broader structural heterogeneity that methods must confront in practice. To separate these axes of variation, we used systematic deep classification to generate curated particle subsets depleted of selected domain motions while retaining the encoded labels.
We further provide pooled particle stacks with standardized per-particle poses in a common reference frame and a lightweight framework for in silico particle pooling to generate challenge datasets with user-defined ground-truth distributions of encoded and non-encoded structural heterogeneity. Together, this resource supports robust benchmarking of heterogeneous reconstruction algorithms and provides a biochemically tractable model system for evaluating entire cryoEM pipelines, including alternative data-collection and preprocessing approaches, under experimentally realistic conditions.
Hungerland, J.; Kostritski, A.; Koch, K.-W.; Solov'yov, I.
Avian phototransduction and magnetoreception have been proposed to involve shared retinal proteins, including interactions between long-wavelength opsin (LWO), the cone-specific heterotrimeric G protein (Gt), and cryptochrome 4a (Cry4a), yet structural information on avian phototransduction complexes is lacking. Here we present and critically assess two atomistic models of the European robin LWO-Gt complex generated by distinct modelling strategies. A full-complex prediction using AlphaFold3 yields a tightly packed, structurally stable interface but exhibits pronounced activation-like conformational features of the Gt-subunit that persist in simulations of the isolated protein, revealing a strong bias toward the active state. In contrast, a template-guided assembly based on single-chain predictions and an experimental rhodopsin-Gt reference structure forms a weaker interface and shows no intrinsic activation bias, while still displaying subtle activation-related dynamics. These results demonstrate that machine-learned complex prediction can encode functional states independently of the local interaction environment, thereby limiting its interpretability for signalling mechanisms that hinge on activation equilibria. Our findings highlight the need for explicit assessment of conformational-state bias when modelling regulatory protein assemblies and provide a structural framework for future studies of Cry4a-dependent modulation of retinal G-protein signalling in avian magnetoreception.
So-Last, M. G. F.; Hale, T.; Burt, A.; Allegretti, M.
Cellular cryo-electron tomography (cryo-ET) reveals high-resolution details of macromolecules within their native cellular environment. However, in situ cryo-ET datasets are large and highly heterogeneous, which makes comprehensive identification and extraction of the many different elements of cellular architecture for high-resolution analysis a challenging, time-consuming and often tedious task. Here we present easymode, a library of pretrained general segmentation networks for cryo-ET, trained on over 4,000 tilt series spanning a large and diverse variety of sources. Easymode enables in situ structural determination workflows by rendering tomogram content computationally accessible, without requiring any per-dataset training. Beyond directly facilitating high-resolution subtomogram averaging of a selection of widely prevalent complexes, we show how easymode can be used to leverage cellular context in subtomogram averaging workflows, helping identify, align, or filter particle sets, and enabling automated mapping of the cellular landscape surrounding target proteins. We use easymode to determine the in situ structure of rare inosine monophosphate dehydrogenase (IMPDH) filaments at 4.0 Å resolution, and to map and visualize the surrounding cellular environment.