Science — Latest Matching Preprints

1

Misleading Success: Genomes Reveal Critical Risks to European Gray Wolves

Ravagni, S.; Battilani, D.; Salado, I.; Lobo, D.; Sarabia, C.; Leiva, C.; Caniglia, R.; Fabbri, E.; Ciucci, P.; Girardi, M.; Santos, F. I.; Kusak, J.; Mattucci, F.; Naderi, M.; Nowak, C.; Sekercioglu, C.; Skrbinsek, T.; Velli, E.; Stronen, A. V.; Vila, C.; Godinho, R.; Leonard, J.; Vernesi, C.

2026-03-23 evolutionary biology 10.64898/2026.03.20.713253 medRxiv

Top 0.1%

44.6%

Have European gray wolves recovered? Despite an increase to ~21,000 wolves (Canis lupus), our genomic analyses reveal significant risks to their long-term viability. We analyzed over 200 whole genomes spanning five major European populations. Rather than a single recovering population, European wolves form a mosaic of isolated, independently evolving lineages, mostly diverging in the late Pleistocene. All lineages have contemporary effective population sizes below the threshold for long-term viability (Ne ≥ 500) and show extensive inbreeding. Runs of homozygosity reveal population-specific inbreeding histories spanning recent to deep timeframes. Most lineages exhibit higher realized than masked genetic load, indicating emerging inbreeding depression. These findings challenge claims that downlisting European wolves is biologically warranted: none of these populations currently meets thresholds associated with favorable conservation status.

2

A deep learning predictor of bindable protein surfaces to guide generative synthetic biology

Almeida-Souza, L.

2026-04-16 synthetic biology 10.64898/2026.04.16.718848 medRxiv

Top 0.1%

41.1%

The advent of generative machine learning models has revolutionized de novo design of protein binders. However, the wide adoption of this revolution is bottlenecked by computational cost. For many targets, binder design commonly requires computationally intensive sampling across structures, often wasting days of GPU time on unwanted or geometrically inviable regions. Here, IARA (Interface Analysis and Recognition Architecture) is introduced, a deep learning Graph Neural Network designed as a rapid structural filter to triage protein binder generative pipelines. IARA is trained entirely on BindCraft trajectories generated against s RFdiffusion-generated targets. Based on a slim network with only seven residue features, IARA maps the binder designability of input proteins in a matter of seconds. On validation runs using BindCraft, RFdiffusion and BoltzGen, IARA successfully identified the optimal binding interface for practically all targets. By instantly pinpointing the highest-probability binding pockets, IARA democratizes synthetic biology, drastically reducing the exploratory GPU compute required for successful de novo binder generation.

3

Genomic surveillance of a deeply sampled local population reveals age-specific drivers of RSV transmission

Kwon, J.; de Vries, E. M.; Lemey, P.; Li, K.; Breban, M.; Laing, K.; Ferguson, D.; Schulz, W. L.; Oliveira, C. R.; Bont, L. J.; Pitzer, V. E.; Weinberger, D. M.; Grubaugh, N. D.; Hill, V.; Redmond, S.

2026-05-18 epidemiology 10.64898/2026.05.07.26350887 medRxiv

Top 0.1%

37.3%

Respiratory syncytial virus (RSV) disproportionately causes severe infections among infants and older adults, yet the key age group responsible for viral spread to other age groups remains poorly defined. While current immunization approaches effectively reduce disease severity among the most vulnerable, identifying the core drivers of infection is essential to effectively disrupt population-level transmission. By generating 910 whole-genome viral sequences of RSV from all age groups (<1 to 65+ years) in Connecticut, we identified that children aged 12-35 months are the primary drivers of viral transmission to other age groups. This group significantly shapes the genetic diversity of circulating strains. Furthermore, we found that RSV is introduced into the community through frequent and independent entries from other US regions throughout the year, rather than through a single explosive seasonal introduction or long-term local persistence. Ultimately, our findings justify prevention strategies that expand beyond reducing disease burden to actively prioritizing the reduction of transmission and infection.

4

Whole-genome phylogenomics and synteny resolve a single origin of body-plan asymmetry in flatfishes

Gallego-Garcia, J.; Hays, D.; Tongboonkua, P.; Minich, J. J.; Hilgers, L.; Michael, T. P.; Hiller, M.; Zhang, C.; Orti, G.; Arcila, D.; Pfeiffer, W.; Duarte-Ribeiro, E.; Mirarab, S.; Betancur-R., R.

2026-05-26 evolutionary biology 10.64898/2026.05.25.727411 medRxiv

Top 0.1%

36.8%

Flatfishes display the most dramatic asymmetric body plan in vertebrates, yet whether this rare innovation evolved once (flatfish monophyly, FM) or multiple times (flatfish polyphyly, FP) has remained contentious. A recent genome-wide study supported FP by placing Psettodes, the earliest-diverging flatfish lineage, among symmetric relatives within Carangaria, the clade that also includes billfishes, jacks, mahi-mahi, and barracudas. Subsequent work traced this to base-composition artifacts and inadequate substitution modeling. Here we revisit the question using whole-genome phylogenomic and synteny data from 17 carangarian species spanning flatfishes and carangarian relatives. We contribute three new chromosome-level assemblies, including the first for Psettodes. Nucleotide-based coalescent analyses (e.g., ROADIES, CASTER) yield strong support for FM, with Psettodes sister to all other flatfishes. Microsynteny analyses built from conserved gene-order blocks corroborate this result: topology tests, cluster-profile counts, and rearrangement-based trees favor FM over two competing FP topologies. Macrosynteny, based on chromosome-scale rearrangements, yields a more mixed signal, with support for FM depending on the metric and taxon-sampling scheme. We interpret this scale-dependent pattern in the context of the explosive post-Cretaceous radiation of Carangaria. The short intervals between speciation events that characterize rapid radiations appear to have left sufficient signal in fine-grained microsyntenic rearrangements, while chromosome-scale rearrangements were too rare to consistently resolve these closely spaced splits. When integrated with evidence from conserved developmental mechanisms active during metamorphosis, the stage at which flatfish asymmetry first emerges, and from the exceptionally complete fossil record, our multi-scale genomic evidence supports a single evolutionary origin of flatfish asymmetry.

5

Integrated surveillance resolves Darien paradox of Oropouche virus emergence in Panama migration corridor

Rodriguez, X.; Perez-Jimenez, J. G.; Alexander, L. W.; Lezcano-Coba, C.; Galue, J.; Juarez, Y.; Beltran, D.; Smith, D. R.; Kadir, M.; Ali, D. W.; Corrales, R.; Trujillo Rodriguez, L.; Valdiviezo, G. E.; Thomas, Q. K.; Cicalo, A.; Fitzpatrick, M. C.; Luquette, A. E.; Cameron Sayer, L.; Cer, R. Z.; Malagon, F.; Grajales, I. A.; Rivera, L. F.; Gonzalez-R, Z.; Antioco, J.; Walters-Valdes, E.; Meneghello-Ponce, N.; Vittor, A. Y.; Escobar-Lee, K.; Abouganem-Shaw, A.; Rodriguez, F.; Aguirre, E.; Loyola, S.; Tinoco, Y.; Moreno, B.; Chen-German, M.; Ampuero, S.; Gomez-Angelo, A.; Correa-Duarte, S.; Ace

2026-06-01 epidemiology 10.64898/2026.05.28.26354376 medRxiv

Top 0.1%

36.4%

Oropouche virus (OROV) spread across the Americas in 2024, yet Panama's Darien migration corridor saw no outbreak until nearly a year after Brazil's January 2024 peak, raising two hypotheses: cryptic circulation masked by diagnostic gaps, or recent introduction under permissive climatic conditions. Here we resolve this paradox using integrated clinical, genomic, and climate-informed surveillance. Among 1,040 individuals tested, 43% were OROV-positive and showed a clinical signature distinct from co-circulating arboviruses, including headache more frequent than in dengue (RR 2.38, 95% CI 1.74-3.24). The household secondary attack rate was 56%, and waste burning independently predicted infection. Phylogeographic reconstruction identified a single recent introduction in October 2024 with no evidence of adaptive evolution, excluding prolonged cryptic persistence. Climate-informed models indicate broad outbreak susceptibility across Panama, with Bocas del Toro and Los Santos as the next highest-risk provinces. These findings identify a Central American foothold for OROV with potential for further northward spread.

6

From receptor binding to biogeography: Multi-scale prediction of filovirus hosts in bats

Castellanos, A. A.; Anthony, S. J.; Chandran, K.; Lasso, G.; Wells, H. L.; Han, B. A.

2026-05-19 ecology 10.64898/2026.05.18.726005 medRxiv

Top 0.1%

36.3%

Forecasting zoonotic risk requires identifying which host species are biologically susceptible to infection, yet susceptibility is rarely predicted using frameworks that integrate molecular mechanisms with macroecology. Filoviruses, a diverse group of bat-associated viruses that include Ebola and Marburg viruses, illustrate this challenge: viral entry depends on interactions between viral glycoproteins and the host receptor NPC1, and host ecology and distribution determine opportunity of viral entry. Additionally, receptor sequence data used for informing viral entry are available for only a small fraction of bat species. Here, we extend virus-specific susceptibility prediction across the global diversity of bats by integrating experimentally measured and physicochemically inferred virus-receptor binding strengths with phylogenetic, ecological, and environmental data. Using boosted regression models trained on binding assay labels, we generate predictions of NPC1-mediated binding strength for more than 1,300 bat species. Predicted susceptibility is strongly structured by evolutionary relationships, with high binding concentrated in particular bat lineages, but is further differentiated within clades by morphology, life-history strategy, and environmental context. Strikingly, macroevolutionary structure alone recovers interaction patterns originally derived from amino acid-level physicochemistry, indicating that information about receptor-mediated compatibility is recoverable from host evolutionary history and ecological traits. Predicted high binding strength extends well beyond historically recognized outbreak regions, suggesting that the fundamental host range of filoviruses may be substantially broader than their currently realized distribution. 
By scaling receptor biology to global host diversity, this multi-scale framework expands mechanistic susceptibility forecasting beyond species with available molecular data and provides a generalizable approach for integrating molecular and ecological information in zoonotic prediction.

7

Evolution as Active Geometry: The Geometric State Equation of the Tree of Life

Fenn, R.; Fenn, A.

2026-03-13 evolutionary biology 10.64898/2026.03.09.710612 medRxiv

Top 0.1%

35.9%

Any process that generates information at a constant rate into a branching hierarchy faces a geometric packing problem: the number of distinguishable lineages grows exponentially, but Euclidean space grows only polynomially. We show that this tension forces a unique resolution. By deriving a geometric state equation from three physical postulates (information flux, hierarchical topology, and geometric fidelity), we prove that any such system must embed into a hyperbolic manifold of curvature κ = (h ln 2/(n − 1))², where h is the entropy rate and n the embedding dimension. The equation has zero adjustable parameters, a unique positive solution, and a globally stable equilibrium. For the tree of life, back-solving across all systems tested (from decade-old viral outbreaks to 3.8-billion-year cellular lineages) yields a universal embedding dimension of n = 2.00 ± 0.05 despite orders-of-magnitude variation in mutation rate and timescale. This topological invariant, combined with the effective entropy of the genetic code (h ≈ 1.61 bits), predicts a curvature of κ = 1.245. Five independent neural networks trained on 5,550 genomes from all domains of life, receiving no phylogenetic supervision, converge to κ = 1.247 ± 0.003 (CV = 0.24%), confirming the prediction within 0.2%. Independent validation across 15 viral families spanning 10¹-10⁸ years of divergence yields Pearson r = 0.996 between predicted and measured curvatures. Extending the test to the 20-letter amino acid alphabet, we embed 15 protein family phylogenies into ℍ² and measure κ_protein = 3.80 ± 0.60, confirming the predicted 3.1x curvature increase (κ = 3.90) to within 2.6%, while recovering n = 2.03 ± 0.10 across alphabets. The curvature of the tree of life is not a historical accident but a geometric constraint imposed by the information capacity of the genetic code.
Graphical Abstract (figure not shown): The tree of life embeds optimally into 2D hyperbolic space with curvature κ = 1.247 ± 0.003, matching the prediction κ = (h ln 2)² = 1.245 from the geometric state equation to within 0.2%. Top: Voronoi tessellation of 5,550 genome embeddings in the Poincaré disk, colored by domain (Bacteria, Archaea, Eukarya). LUCA occupies the center; cell boundaries are hyperbolic geodesics (circular arcs orthogonal to the disk boundary). Bottom: Five independent neural networks converge to the same curvature (CV = 0.24%), the state equation predicts curvature across both DNA and protein alphabets (3.1x curvature increase) with zero adjustable parameters, and cross-system validation confirms the curvature-entropy relationship (r = 0.996).
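
The arithmetic quoted in this abstract can be checked directly. The sketch below is not the authors' code: it only evaluates the stated formula κ = (h ln 2/(n − 1))² with the reported values h ≈ 1.61 bits and n = 2, and back-solves n from the reported network curvature of 1.247. The function names `predicted_curvature` and `implied_dimension` are hypothetical, chosen for illustration.

```python
import math


def predicted_curvature(h_bits: float, n: float) -> float:
    """Evaluate kappa = (h * ln 2 / (n - 1))**2 as quoted in the abstract."""
    if n <= 1:
        raise ValueError("embedding dimension n must exceed 1")
    return (h_bits * math.log(2) / (n - 1)) ** 2


def implied_dimension(h_bits: float, kappa: float) -> float:
    """Back-solve the same formula for n: n = 1 + h * ln 2 / sqrt(kappa)."""
    if kappa <= 0:
        raise ValueError("curvature magnitude kappa must be positive")
    return 1 + h_bits * math.log(2) / math.sqrt(kappa)


# Values reported in the abstract (h in bits, n dimensionless).
kappa_dna = predicted_curvature(1.61, 2)

# Relative gap between the reported network curvature and the prediction.
rel_gap = abs(1.247 - kappa_dna) / kappa_dna

# Embedding dimension implied by the reported network curvature.
n_back = implied_dimension(1.61, 1.247)

# Ratio of the predicted protein curvature to the predicted DNA curvature.
ratio = 3.90 / kappa_dna

print(f"kappa_dna = {kappa_dna:.4f}")
print(f"relative gap = {rel_gap:.2%}")
print(f"implied n = {n_back:.3f}")
print(f"protein/DNA ratio = {ratio:.2f}")
```

With these inputs the predicted curvature comes out near 1.245 and the implied dimension near 2, consistent with the figures stated in the abstract. The check says nothing about whether the underlying postulates hold; it only confirms that the quoted numbers agree with the quoted formula.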

8

MrtR of Mesorhizobium tianshanense reveals both activation and inhibition mechanisms of a LuxR-type quorum sensing receptor

Stoutland, I. M.; Blackwell, H. E.

2026-05-29 biochemistry 10.64898/2026.05.28.728602 medRxiv

Top 0.1%

32.8%

Quorum sensing (QS) enables common gram-negative bacteria to coordinate collective behaviors through small molecule signals, yet how these signals tune receptor activity remains incompletely understood. Here, we define a mechanism by which ligand structure controls function in a LuxR-type QS receptor. Using structural and biochemical analyses, we investigate MrtR from Mesorhizobium tianshanense and show that ligand acyl-chain length governs receptor assembly and activity. We present full-length structures of MrtR bound to activating and inhibitory ligands, revealing a switch in oligomeric state. Long-chain (C14) N-acyl L-homoserine lactones (AHLs) act as agonists by promoting intra- and inter-subunit interactions that lead to homodimerization and DNA binding. In contrast, shorter (C8) AHLs fail to promote these contacts, favoring a monomeric, inactive state. Ligands of intermediate length produce graded responses consistent with partial dimer stabilization. Biochemical measurements of DNA binding, thermostability, and oligomerization, together with targeted mutagenesis, support this model and establish the functional importance of key structural contacts. These findings provide the first structural comparison of a full-length LuxR-type receptor bound to both agonist and antagonist. Our findings expand the known structural and mechanistic diversity of the LuxR family and suggest mechanistic similarities between structurally distinct receptors.

Significance: Quorum sensing (QS) regulates diverse bacterial behaviors, and LuxR-type receptors are attractive targets for applications ranging from antivirulence strategies to synthetic biology and agriculture. Despite intense interest in developing chemical modulators of these systems, the molecular basis by which small molecules agonize or antagonize LuxR-type receptors remains poorly understood. Here, we investigate the LuxR-type receptor MrtR and report crystal structures of the full-length receptor bound to an agonist and an antagonist, revealing how structurally similar compounds produce opposing outcomes. Notably, MrtR exhibits an unprecedented dimerization interface mediated by a ligand-responsive loop that undergoes large conformational changes. These findings establish a new structural framework for understanding signal discrimination in LuxR-type receptors and may enable rational reprogramming of QS in natural and engineered systems.

9

Global comparison of influenza A and B epidemiology identifies consistent geographic and socio-demographic predictors

Gunning, C. E.; Rezaeimalek, S.; Rohani, P.

2026-03-16 epidemiology 10.64898/2026.03.14.26348363 medRxiv

Top 0.1%

32.7%

Seasonal influenza outbreaks are caused by types A and B that together account for an estimated 3-5 million severe cases each year. Most attention has focused on influenza A viruses (IAVs) due to their rapid evolutionary dynamics and high disease burden, and has been concentrated in well-observed high-income regions. Here, we use a macroecological approach to compare and contrast the global epidemiology of IAVs and influenza B viruses (IBVs) across 111 countries and 15 influenza seasons (2010-2024). We first show how temporal correlations between countries depend on both distance and geographic region. For both IAV and IBV, we find high overall synchrony among northern temperate countries, whereas tropical countries display marked heterogeneity. At the longer time scale of influenza seasons, we next quantify sampling intensity, positivity, seasonality, fade-out dynamics and the timing and variability of epidemic peaks. We then describe how these long-term epidemiological outcomes change in association with a suite of 17 geographic, climatic, and socio-economic variables. In addition, we document persistent surveillance gaps, particularly in Africa, and highlight ongoing but spatially variable impacts of the SARS-CoV-2 pandemic era on sampling. Overall, we find strong correspondence between the macroscopic features of IAV and IBV epidemiology, with critical roles played by geography and climate (especially latitude and temperature), economics (per capita GDP) and demographics (population size and per capita birth rate).

Significance Statement: The global circulation of seasonal influenza A and B viruses (IAV and IBV) imposes major human health impacts each year that vary widely across space and time. An improved understanding of these dynamics could improve public health preparedness, response, and intervention efforts. Here we offer a comprehensive comparison of IAV and IBV dynamics across 15 seasons, 111 countries, and six continents. We demonstrate the impact of distance and region on temporal correlation, quantify how measures of influenza seasonality change with geographic and socioeconomic factors, and predict how frequently influenza cases are absent from countries. Our study finds widespread similarities between IAV and IBV (along with key differences), documents notable geographic clusters of countries with shared dynamics, and highlights persistent gaps in global influenza surveillance.

10

Cultural transmission and genomic co-divergence in the willow tit across the Palearctic

Syarifa, A.; Martens, J.; Päckert, M.; Kvist, L.; Wu, L.; Sun, Y.-H.; Wolf, J. B. W.; Knief, U.

2026-04-24 evolutionary biology 10.64898/2026.04.23.720297 medRxiv

Top 0.1%

32.3%

Cultural transmission of mating behaviour can both promote and constrain genetic divergence, yet its long-term population-genetic consequences remain unclear. In songbirds, learned song can generate behavioural isolation, but its potential to shape genome-wide differentiation at continental scales is rarely assessed. Willow tits provide a compelling system, as three culturally transmitted song types are largely allopatric across the Palearctic, while northern populations exhibit mixed repertoires. Here, we combined a chromosome-level reference genome, whole-genome resequencing of 88 willow tits (Poecile montanus) spanning all 14 subspecies, and palaeodistribution modelling to reconstruct the species' evolutionary history across the Palearctic. Phylogenetic and demographic analyses indicate an origin in Asia during the Late Pliocene to Early Pleistocene, followed by expansions that yielded three deeply diverged genomic lineages in the Asian, Central European, and Northern Palearctic regions. The boundaries of these lineages coincide with major song-type divisions. Tests of historical allele sharing show that gene flow occurred preferentially among lineages that share the same or similar song types, even after accounting for geography, consistent with learned song contributing to prezygotic isolation. Peripheral, song-monotypic populations exhibit signatures of repeated bottlenecks associated with glacial isolation, whereas large northern populations retained broader song repertoires and signals of long-term connectivity. These results provide genome-wide continental evidence that culturally transmitted song mirrors and likely reinforces genomic structure through time in a widespread passerine bird.

11

The genetic legacy of archaic hominins in Central and Southeast Asia uncovers three distinct Denisovan populations

Antoine-Derouet, C.; Adam Doucet, J.; Leakhena Phoeung, C.; Dorzhu, C.; Hegay, T.; Heyer, E.; Chaix, R.; Bon, C.; Detroit, F.; Toupance, B.; Laurent, R.

2026-05-07 evolutionary biology 10.64898/2026.05.06.723201 medRxiv

Top 0.1%

32.3%

The sequencing of Neanderthal and Denisovan genomes has provided new insights into human evolution. Today, interactions between Neanderthals, Denisovans, and populations of European, East Asian, and Oceanian descent are well documented. However, neighboring regions such as Central and Southeast Asia remain understudied for archaic admixture despite their key geographic location and complex migration histories. To fill this gap, we investigate archaic ancestry in 16 populations from Central Asia and 14 from mainland Southeast Asia. Our results show that Neanderthal and Denisovan ancestry in these populations is of the same order of magnitude as in other Eurasian populations. However, although Denisovan ancestry accounts for less than 1% in mainland Asian populations, it originates from several admixture events involving different Denisovan populations. In particular, we find in Southeast Asia that Denisovan ancestry results from three distinct admixture events with three different Denisovan populations, highlighting the complexity of Denisovan contact with the ancestors of present-day Southeast Asian populations and providing new insights into the extensive geographic distribution of Denisovan populations.

12

A domesticated totivirus-like tandem array undergoes interspecific transfer and asymmetric evolution

Taylor, D.; Tringali, D. A.

2026-05-25 evolutionary biology 10.64898/2026.05.24.726934 medRxiv

Top 0.1%

32.3%

RNA paleoviruses are expected to evolve more slowly than their exogenous viral progenitors. We show that a four-gene tandem array (STORM, Scheffersomyces Totivirus-like Responsive Module; genes TLC1-TLC4) in wood-associated yeasts violates this expectation, evolving faster at the protein level than its exogenous totiviral relatives while persisting for over 15 million years. STORM has accumulated greater amino-acid divergence than its exogenous totiviral relatives over a much shorter host phylogenetic window (~54 MY of Scheffersomyces history versus ~225 MY for exogenous totivirus diversification), under significant relaxation of selective constraint (RELAX K < 1). Tandem duplication resulted in asymmetric evolution within the array. For example, TLC4 alone has retained the predicted decapping loop motif (lost from TLC1, TLC2, and TLC3) and a totivirus-like capsid fold. Other copies remain more constrained in structure and sequence, indicating functional partitioning. All four genes are transcriptionally active, embedded in host antiviral and RNA-decay regulatory neighborhoods, with condition-dependent expression. Hundreds of reference gene trees for Scheffersomyces are concordant with the species tree, with only two unrelated singleton exceptions; the STORM array is the only locus where all paralogs share a well-supported, locus-coherent discordance. Distance-based tests are inconsistent with incomplete lineage sorting, and shared discordance with an adjacent ATP10 pseudogene and a transposase (Tc1/mariner superfamily) implicates transposon-mediated co-mobilization. We infer at least two interspecific transfers of STORM. Our results reveal how hosts can domesticate a mobile virus-like module whose paralogs escape strong purifying selection and explore sequence space while the core fold is conserved.

13

A haplotype-resolved bluethroat (Luscinia s. svecica) genome assembly uncovers the complex MHC region

Strand, M. A.; Enevoldsen, E. L. G.; Toerresen, O. K.; Skage, M.; Ferrari, G.; Tooming-Klunderud, A.; Leder, E. H.; Lifjeld, J. T.; Johnsen, A.; Jakobsen, K. S.

2026-03-30 genomics 10.64898/2026.03.26.714473 medRxiv

Top 0.1%

32.2%

We describe a chromosome-level, haplotype-resolved genome assembly from a female bluethroat (Luscinia s. svecica). The assembly comprises two pseudo-haplotypes of 1461 Mb and 1171 Mb, with 77.4% and 88.4% scaffolded into 40 autosomal chromosomes and the W and Z sex chromosomes (haplotype one). Assembly completeness is high (BUSCO 99.2% and 94.9%), with 22,462 and 18,769 annotated protein-coding genes for haplotypes one and two, respectively. The use of Oxford Nanopore Technologies sequencing enables resolution of genomic regions that are often fragmented in genome assemblies, including the hypervariable Major Histocompatibility Complex (MHC). We find that MHC loci include both the canonical organization of tandemly duplicated MHCIIβ genes with a single MHCIIA, and a distinct arrangement in which MHCI and MHCIIβ loci are interspersed in intermixed arrays, and that substantial structural differences between haplotypes are directly resolved in the assembly.

14

The Neanderthal population history and the introgression landscape inferred from the UK Biobank

Morez Jacobs, A.; Soltantouyeh, A.; Zeloni, R.; Carollo, F.; Mezzavilla, M.; Marnetto, D.; Pagani, L.

2026-04-04 evolutionary biology 10.64898/2026.04.03.716297 medRxiv

Top 0.1%

32.2%

Neanderthal haplotypes in present-day Eurasians are unevenly distributed across the genome, forming introgression deserts and high-frequency segments consistent with adaptive introgression, with additional random variation affected by genetic drift. However, current estimates are limited by modest sample sizes and analyses restricted to subsets of the genome, given that any individual carries only 1-2% Neanderthal ancestry. Here we extract and analyse Neanderthal haplotypes from 45,000 imputed and phased genomes in the UK Biobank. Even at this scale, the number of sites overlapping Neanderthal haplotypes approaches, but does not reach, saturation, with rare haplotypes still being discovered. Using the derived allele frequency spectrum within the surviving Neanderthal segments, we infer a divergence time of 2,061 generations between the introgressed lineage and the Vindija Neanderthal, and estimate the effective population size of the introgressed lineage at Ne = 6,564. Individual-level resolution allows identification of 545 independent loci with excess Neanderthal homozygosity, consistent with ongoing selection. Despite the extensive dataset, a substantial portion of the genome remains a Neanderthal desert. Within these regions, we detect seven Human Accelerated Regions affected by recent human selective sweeps (TMRCA <650 kya), four located within introns of cerebellum-expressed genes, providing further support for their potential as modern human-specific adaptation.

15

Rapid centromere turnover and the adaptive radiation of lemurs

Trivedi, M.; Gianfrate, F.; de Gennaro, L.; Ayllon, M.; Munson, K. M.; Hoekzema, K.; Yoo, D.; Ehmke, E.; Yoder, A. D.; Chang, S.; Lalgudi, C.; Krasnow, M. A.; Ventura, M.; Eichler, E. E.

2026-05-19 genomics 10.64898/2026.05.16.725662 medRxiv

Top 0.1%

32.1%

Centromeres represent essential chromosomal structures required for faithful chromosome segregation during cell division but are paradoxically hypermutable, leading to centromere drive and reproductive isolation in closely related species. Using long-read sequencing, we generate nearly complete genomes (2.1-2.5 Gbp) from eight lemur species and characterize the sequence, epigenetic and cytogenetic structure of 223 strepsirrhine centromeres, providing an alternative primate perspective of centromere evolution. No lemur centromere consists of α-satellite DNA that typifies the haplorhine lineage; instead, each species evolved its own distinct higher-order centromeric repeat sequence, varying substantially in both monomer length (ranging from 41-548 bp) and primary sequence composition (GC percentages 28.7-67.9%), including centromere cooption of telomeric repeats in brown lemurs. Most centromeres show characteristic hypomethylation dip regions (110-300 kbp) as candidates for kinetochore attachment. The centromere sequence motif shows no apparent sequence homology among lemur genera, even for species separated by less than 15 million years (Lemur and Eulemur). We estimate a >6-fold increased rate in primary centromeric motif turnover in strepsirrhines when compared to haplorhines and this occurred in conjunction with positive selection of the CENP-B protein in lemur lineages. We propose that lemur radiation and centromere diversification are linked, whereby accelerated motif turnover provides a stasipatric barrier contributing to rapid chromosomal evolution.

16

Convergent natural selection at both ends of Eurasia during parallel radical lifestyle shifts in the last ten millennia

Barton, A. R.; Rohland, N.; Mallick, S.; Pinhasi, R.; Akbari, A.; Reich, D.

2026-04-04 evolutionary biology 10.64898/2026.04.03.716344 medRxiv

Top 0.1%

31.8%

Ancient DNA-based studies of natural selection have focused on West Eurasia due to the availability of large sample sizes, but rich insights are expected to come from comparative studies that can reveal which patterns are shared and which are region-specific. We test around seven million variants for selection in 1,862 ancient East Eurasians (867 with new data) distributed over the last ten millennia. Using a generalized linear mixed model to control for population structure, we identify 40 genome-wide significant signals of selection, which have a particularly strong impact on immune and cardiometabolic traits just as in West Eurasia. East and West Eurasia show highly correlated signals of adaptation both for individual alleles and for complex traits, showing how these geographically separate groups experienced convergent evolution in response to parallel transitions to food-producing economies and the accompanying lifestyle changes. An exception is the genetic determinants of light skin color: West Eurasians depigmented in the last 10,000 years, but most skin lightening in East Asians arose prior to the Holocene.

17

Punctuated Evolution of Endomembrane Compartments in Proto-Eukaryotes

Shridhar, S.; Kumari, K.; Thattai, M.

2026-04-14 evolutionary biology 10.64898/2026.04.13.718263 medRxiv

Top 0.1%

31.7%

Eukaryotic cells are defined by their endomembranes: compartments such as the endoplasmic reticulum (ER), Golgi and endosomes, exchanging cargo via vesicles. The evolutionary origins of endomembrane compartments remain unclear. Here we construct molecular-evolutionary trajectories for the stepwise addition of compartments after the emergence of the proto-ER in an ancestral eukaryote. We represent compartments and vesicles as nodes and edges of a directed graph. Vesicle budding and fusion regulators such as coats and SNAREs control cargo flows and determine compartment compositions. We computationally sample billions of possible graphs, and enumerate how duplication, deletion and mutation of regulators drive graph transitions. We find that evolutionary trajectories display punctuated shifts in compartment composition and number, interspersed with thousands of neutral mutations. The first added compartment inherits functions from the proto-ER or plasma membrane, or gains novel functions. Our results show how, given a billion years, simple molecular steps can generate complex endomembrane systems.

Significance Statement: Eukaryotic cells contain a system of endomembrane compartments that sort, process and deliver molecules to precise cellular destinations. This endomembrane system is a defining feature of all complex life, yet its evolutionary origins remain obscure. How did a proto-eukaryote with a single ancestral endomembrane compartment evolve into a cell with a Golgi, endosomes, lysosomes and other compartments characteristic of modern eukaryotes? We model this process from first principles, connecting the duplication, deletion and mutation of molecular regulators to compartment gain or loss. We find a punctuated pattern of endomembrane elaboration: a long phase of neutral exploration, driven by the mutation of duplicate gene copies, precedes the emergence of new compartments and functions.

18

Effects of introgressed Neanderthal alleles on present-day brain morphology

Zeloni, R.; Amaolo, A.; Morez Jacobs, A.; Zapparoli, E.; Akl, Y.; Shafie, M.; Huerta-Sanchez, E.; Pizzagalli, F.; Provero, P.; Pagani, L.; Marnetto, D.

2026-04-17 genomics 10.64898/2026.04.14.718380 medRxiv

Top 0.1%

31.7%

Neanderthal introgression contributed a small fraction of genetic variants to present-day non-African genomes. While differences in cranial globularity between Neanderthals and modern humans are well documented from endocasts, the phenotypic consequences of these introgressed alleles can illuminate otherwise inaccessible genetically divergent brain structures. We analyzed 370 MRI-derived brain traits--including cortical and subcortical regional measurements, cortical folding metrics, and diffusion tracts--in nearly 40,000 UK Biobank participants. To quantify the impact of Neanderthal ancestry, we intersected trait-associated loci with Neanderthal-derived variants identified from introgressed segments imputed in the same subjects. Low-frequency introgressed variants were depleted for detectable effects on brain phenotypes, whereas common introgressed variants showed no comparable depletion. Conversely, Neanderthal deserts were consistently enriched for functional effects. Eight associations were fine-mapped to Neanderthal-derived variants: one locus near the gene DAAM1 was especially prominent across multiple traits, including opposite effects in the cuneus and precuneus mediated by introgressed regulatory variants. Genome-wide directional alignment of Neanderthal effects was limited but became evident when focusing on suggestive loci: frontal and parietal areas were the most consistently affected traits, though not in a direction that obviously mirrors known modern-archaic morphology divergence. Several of these loci also influenced neuropsychiatric traits, with detectable polygenic consequences against schizophrenia and towards major depression, linking neuroanatomical and neuropsychiatric impact of Neanderthal introgression. These findings suggest that while introgressed alleles affecting divergent neuroanatomy between modern humans and Neanderthals were largely purged, a subset of tolerated alleles continues to shape human brain morphology and mental health.

19

Genome-wide genealogies reveal deep admixtures forming modern humans

Loya, H.; Gupta Hinch, A.; Palamara, P. F.; Speidel, L.; Myers, S. R.

2026-04-17 evolutionary biology 10.64898/2026.04.17.719197 medRxiv

Top 0.1%

31.5%

Over the past decade, genomic modelling has revealed a rich tapestry of admixtures shaping present-day human populations. These have largely focused on the past few thousand years, when ancestral populations are either well characterised by present-day genomic diversity or directly observed through ancient DNA. Genomic modelling and fossil evidence have so far only provided a fragmented picture of the coexistence and mixing of human groups in the deeper past. Here, we propose a new method, GhostBuster, that leverages inferred genome-wide genealogies to detect admixture events of unsampled ghost populations, while simultaneously inferring accurate local ancestry. Local ancestry enables us to identify ancestry-specific genomic signatures that independently corroborate the events. We identify at least three waves of "back-to-Africa" migrations starting ~14,000 years ago. Applying GhostBuster to deeper timescales reveals that modern humans were shaped by repeated episodes of mixture. Around 50,000 years ago, we identify a human lineage that expanded to form present-day non-Africans, while also expanding within Africa, mixing with the other local African group in varying proportions. These ancient groups help explain polygenic score portability differences within Africa, and exhibit differences in population size and recombination landscapes. Extending our analysis further back to between 300,000 and 1 million years ago reveals two deeply diverged ancestral lineages. These lineages evolved profoundly different recombination landscapes, with different PRDM9 alleles (PRDM9-A and C) and recombination hotspots. We demonstrate that both Neanderthals and ancestral modern humans are formed through a mixture of these two lineages, with no evidence of gene flow from the PRDM9-A-carrying group into Denisovans.

20

Chemical suppression of a bacterial immune system revives repressed phages

Zhang, C.; Sabonis, D.; Cai, Y.; Zang, Z.; Tamulaitiene, G.; Gerdt, J. P.

2026-04-28 biochemistry 10.64898/2026.04.28.721336 medRxiv

Top 0.1%

31.4%

Many antiviral immune systems have recently been discovered in bacteria. The mechanisms of several are obscure, as is their individual significance for antiphage defense. To shed light on the mechanism and significance of the two-component type I Thoeris antiphage immune system, we leveraged high-throughput phenotypic screening to identify three small-molecule inhibitors. The inhibitors target the ThsA NADase component, inhibiting its 3'-cADPR-activated filamentation. The temporal control afforded by the small-molecule inhibitors allowed us to answer an outstanding question in antiviral immunity--is persistent immunity required to repress phage titers, or do immune systems become unnecessary after eradicating infectious phages? We found that Thoeris immunity must be maintained, as chemical inhibition enabled repressed phages to revive and overtake the bacterial population. Furthermore, due to the cooperative nature of antiviral immunity, we found that Thoeris must be inhibited in only 10% of the bacteria to cause phage-induced lysis of the entire population.