
GENETICS

Oxford University Press (OUP)

Preprints posted in the last 90 days, ranked by how well they match GENETICS's content profile, based on 189 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.

1
Genotype frequency dynamics in finite-sized, partially clonal population with mutation

Stoeckel, S.; Masson, J.-P.

2026-04-13 genetics 10.64898/2026.04.10.717696 medRxiv
Top 0.1%
26.1%

Most eukaryotes reproduce using partial clonality, for which appropriate population genetic models remain limited. This gap constrains our ability to accurately reconstruct past population dynamics, predict future trajectories, and infer the evolutionary processes involved. We present a Wright-Fisher-like model tailored for tracking the mean and the variance of genotype frequencies over generations at one locus with multiple alleles in a single finite-sized population with mutation. Different initial conditions and rates of clonality generate unique mean trajectories of genotype frequencies. Partially clonal populations converge to the same unique stable equilibrium as exclusively sexual populations, which depends only on the reciprocal mutation rates between alleles. The dynamics unfold in two phases: first, genotype frequencies move towards Hardy-Weinberg proportions; then, they iterate along the Hardy-Weinberg proportions until reaching the stable equilibrium. Mean allele frequencies and gene diversity remain unchanged by different rates of clonality along the trajectories. Instead, clonality influences the speed at which populations return to Hardy-Weinberg proportions and thus shapes the temporal sequence of genotype frequency distributions over generations. Variance around each mean trajectory depends only on parental genotype frequency distributions and population size, not on clonality. Taken together, these results explain why both negative and positive Fis values are expected in partially clonal populations, and why variance of Fis across loci is a reliable proxy for inferring clonal rates. Our model will enable the analysis and prediction of changes in genotype frequencies within monitored populations, and will support future inference methods relying on time-series genotyping data from a target population.
Highlights:
- Out of equilibrium, sexual and clonal populations share the same two-step dynamics.
- First, a return to the Hardy-Weinberg parabola, at a speed affected by the rate of clonality; then, iteration along this parabola until reaching an equilibrium that depends only on mutation rates.
- Increasing clonality changes the speed and direction of the mean dynamics outside the Hardy-Weinberg parabola without affecting mean allele frequencies.
- Variance around the mean dynamics depends on parental genotype frequencies and population size but is not affected by clonality.
[Graphical abstract: Figure 1]
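The two-phase dynamics described in this abstract can be illustrated with a small sketch. This is not the authors' model: it assumes a single biallelic locus, a clonality rate `c`, per-allele mutation probabilities `u` (A to a) and `v` (a to A), and multinomial sampling of `N` offspring. All function and parameter names are hypothetical.

```python
import numpy as np

def mutation_matrix(u, v):
    """Genotype-level mutation matrix (rows/cols: AA, Aa, aa).

    Assumption: each allele copy mutates independently,
    A -> a with probability u and a -> A with probability v.
    """
    return np.array([
        [(1 - u) ** 2, 2 * u * (1 - u), u ** 2],
        [(1 - u) * v, (1 - u) * (1 - v) + u * v, u * (1 - v)],
        [v ** 2, 2 * v * (1 - v), (1 - v) ** 2],
    ])

def expected_next(g, c, u, v):
    """Expected genotype frequencies after one generation.

    A fraction c of offspring are clonal copies of (mutated) parents;
    the remaining 1 - c come from random mating, i.e. Hardy-Weinberg
    proportions at the post-mutation allele frequency.
    """
    g_mut = g @ mutation_matrix(u, v)
    p = g_mut[0] + 0.5 * g_mut[1]  # frequency of allele A
    hw = np.array([p * p, 2 * p * (1 - p), (1 - p) ** 2])
    return c * g_mut + (1 - c) * hw

def simulate(g0, c, u, v, N, generations, rng):
    """Stochastic trajectory: multinomial draw of N offspring per generation."""
    g = np.asarray(g0, dtype=float)
    traj = [g]
    for _ in range(generations):
        counts = rng.multinomial(N, expected_next(g, c, u, v))
        g = counts / N
        traj.append(g)
    return np.array(traj)
```

Under these assumptions, a call such as `simulate([0.7, 0.0, 0.3], c=0.5, u=1e-2, v=2e-2, N=200, generations=100, rng=np.random.default_rng(1))` returns one trajectory of genotype frequencies. Iterating `expected_next` alone gives the mean dynamics, whose fixed point in this sketch is the Hardy-Weinberg distribution at allele frequency `v / (u + v)` for any `c`.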

2
Parameterizing the genetic architecture under stabilizing selection

Lee, H.; Terhorst, J.

2026-03-27 genetics 10.64898/2026.03.27.714826 medRxiv
Top 0.1%
18.3%

Across many complex traits, genetic variants with larger effect sizes tend to occur at lower frequencies, which is often interpreted as a signature of stabilizing selection. In statistical genetics, the so-called α-model captures this relationship by assuming that effect size variance is inversely proportional to heterozygosity raised to a power 0 ≤ α ≤ 1. Although empirically useful, the α-model is phenomenological rather than mechanistic and lacks a direct population-genetic interpretation. In this paper, we derive an alternative to the α-model based on evolutionary theory. Our approach yields a linear mixed model in which the frequency dependence of effect size emerges naturally as a function of interpretable evolutionary quantities describing mutational variance, selection intensity, and coupling between the focal and selected traits. These quantities enter through two identifiable variance components that can be estimated by restricted maximum likelihood (REML). The resulting framework links a fitness-landscape model to standard mixed-model methodology, enabling both inference on evolutionary parameters and downstream prediction by best linear unbiased prediction (BLUP). In forward simulations, the model accurately recovers the focal-trait variance and generally improves genetic prediction relative to conventional α-model baselines.

3
An exact formula for the contribution of sampling error to r2, a common measure of linkage disequilibrium

Waples, R. S.

2026-05-21 evolutionary biology 10.64898/2026.05.19.726388 medRxiv
Top 0.1%
18.1%

Interest in quantifying linkage disequilibrium (LD, non-random associations of alleles at different loci) has skyrocketed in recent years as researchers have focused on use of LD in genome-wide association studies (GWAS), for studying historical demography, and for estimating effective population size (Ne). The most widely used LD metric is r2 = the squared correlation of alleles at a pair of loci. Despite a half century of efforts, developing an unbiased expectation of r2 as a function of the many factors that can affect it (physical linkage, genetic drift, selection, migration, mutation, mating systems) remains elusive. Furthermore, even when all of these other factors are absent, empirical estimates of r2 are upwardly biased by sampling a finite number (S) of individuals, and that must be accounted for if one wants to focus on the desired signal of LD. Previous approaches to estimate [Formula] have been shown to be biased to greater or lesser degrees. The purpose of this short paper is to demonstrate that a simple and apparently exact expression for [Formula] does exist for the special case where sampling error is the only factor contributing to r2, in which case [Formula] = 1/(S - 1). When other factors contribute heavily to LD, [Formula] shrinks toward 0 as empirical r2 → 1. However, for estimating contemporary Ne with unlinked markers, empirical r2 will generally be small and 1/(S - 1) will provide a robust estimate of [Formula].
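The 1/(S - 1) value quoted above for the sampling-only case can be checked numerically with a short simulation. The sketch below is an illustration and not the paper's derivation: it assumes two independent biallelic loci, genotypes coded as allele counts (0, 1, 2) drawn under Hardy-Weinberg proportions, and r2 computed as the squared Pearson correlation of those counts across S sampled individuals. The function name and defaults are hypothetical.

```python
import numpy as np

def mean_r2_sampling_only(S, p1=0.5, p2=0.5, reps=20000, seed=0):
    """Average r^2 between two independent loci in samples of S individuals.

    Assumptions (illustrative): genotypes are allele counts drawn as
    Binomial(2, p) independently at each locus, so any nonzero r^2
    arises from finite sampling alone.
    """
    rng = np.random.default_rng(seed)
    g1 = rng.binomial(2, p1, size=(reps, S)).astype(float)
    g2 = rng.binomial(2, p2, size=(reps, S)).astype(float)

    # Centre each replicate sample, then form the Pearson correlation.
    g1c = g1 - g1.mean(axis=1, keepdims=True)
    g2c = g2 - g2.mean(axis=1, keepdims=True)
    cov = (g1c * g2c).sum(axis=1)
    ss1 = (g1c ** 2).sum(axis=1)
    ss2 = (g2c ** 2).sum(axis=1)

    # Skip replicates where either locus is monomorphic (r undefined).
    ok = (ss1 > 0) & (ss2 > 0)
    r2 = cov[ok] ** 2 / (ss1[ok] * ss2[ok])
    return r2.mean()
```

Comparing `mean_r2_sampling_only(50)` with `1 / (50 - 1)` gives a quick sanity check of the sampling-only expectation under these assumptions.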

4
Beyond single-slope Mendelian randomization: structural representation of genetic heterogeneity in joint effect space

Hao, H.; Chen, D.; Qian, C.; Zhou, X.; Huang, H.; Zuo, J.; Wang, G.; Peng, X.; Liu, H.-X.

2026-03-14 genetic and genomic medicine 10.64898/2026.03.12.26348288 medRxiv
Top 0.1%
17.2%

Causal effects in complex traits are typically represented by a single linear slope. While conventional Mendelian randomization (MR) provides efficient scalar estimates, projection-based summaries do not explicitly capture structural organisation in joint effect space under genetic heterogeneity. We introduce MR-UBRA (Mendelian randomization-Unified Bayesian Risk Architecture), a probabilistic framework that decomposes instrumental variants into genetic risk fragments (GRFs) and quantifies extreme deviations using tail-risk metrics defined on the standardised residual magnitude |e|. MR-UBRA preserves the classical MR estimand while offering a structurally resolved representation of genetic heterogeneity. Across stroke subtypes, AF→CES, smoking→lung cancer, and BMI→T2D, effect-space distributions exhibit reproducible asymmetry, amplitude stratification, and multi-modal structure. MR-UBRA resolves component-level organisation, separating tail-dominant contributions from the causal core while maintaining consistency with the classical MR estimand. Simulations and boundary regimes demonstrate adaptive model complexity: MR-UBRA selects K>1 when multi-component structure is present and collapses to K=1 under homogeneous conditions, avoiding spurious stratification. These results support viewing causal effects in complex traits as structured distributions in joint effect space, enhancing causal representation without altering the MR estimand.
Graphical Abstract: Mendelian randomization (MR) typically represents causal effects with a single linear slope. Under genetic heterogeneity, instrumental effects in joint (βX, βY) space may exhibit multi-component structure and amplitude stratification that cannot be captured by a scalar summary. MR-UBRA fits a standard error-weighted mixture model to decompose instruments into genetic risk fragments (GRFs), estimates GRF-specific effects using posterior-weighted soft-IVW, and quantifies extreme deviations through tail-risk metrics (VaR/CVaR). Across empirical analyses and boundary regimes, MR-UBRA adapts model complexity (K) to structural signal, collapsing to K=1 under homogeneous conditions.
[Graphical abstract: Figure 1]

5
Omitted familial extrinsic risk inflates inferred intrinsic lifespan heritability

Kornilov, S. A.

2026-04-06 genetics 10.64898/2026.04.02.716222 medRxiv
Top 0.1%
12.6%

Shenhar et al. (2026) report 50% "intrinsic" lifespan heritability after calibrating a one-component correlated-frailty survival model to Scandinavian twin lifespans. Their framework is mathematically coherent, but the intrinsic component is not identified if heritable, mortality-relevant extrinsic susceptibility is omitted at calibration. We show that one-component calibration absorbs omitted familial extrinsic structure into the intrinsic frailty scale parameter σ_θ, and that this variance absorption is visible through separate diagnostics. (1) Variance absorption. Under misspecification, σ_θ is inflated by +22.1% (95% CI: 21.5-22.7%), corresponding to +49% inflation in [Formula]. Falconer h2 is downstream of calibration and inherits a +9.2 pp bias (95% CI: 8.7-9.7). The σ_θ inflation is model-general: +22% (GM), +18% (MGG), +14% (SR); any dependence summary that is strictly increasing in σ_θ inherits this inflation, so Falconer h2 is one affected downstream quantity among many (Corollary B3). (2) Structural fingerprint. In the joint twin survival surface S(t1, t2), misspecification produces systematic dependence errors (ISE 48x that of the recovery model). Conditional twin dependence is inflated at all ages, peaking at age 80 (Δr = 0.048). (3) Specificity. The bias requires an omitted component that is both heritable and mortality-relevant. Three negative controls, a boundary check (ρ = 0), and a two-component recovery refit (σ_θ restored to within -3.2%) establish specificity. ACE decomposition yields C ≈ 0 throughout: the omitted extrinsic component loads onto A (because it is shared 1.0/0.5 in MZ/DZ), so switching summary statistics does not restore identification. (4) Sensitivity and falsifiability. Over an empirically anchored regime (σ_γ ∈ [0.30, 0.65], ρ ∈ [0.20, 0.50]), Falconer bias ranges from +2.8 to +18.9 pp (mean 9 pp). If ρ is sufficiently negative, the bias reverses sign in all three model families (Corollary B4). A full-likelihood robustness check shows that this upward pull is partly structural and partly estimator-specific: in the same misspecified one-component model, ML still inflates σ_θ (+3%), whereas matching only rMZ inflates it much more (+21%). These results do not resolve true intrinsic heritability but establish that Shenhar's 50% estimate carries a structured, model-general upward bias originating in the fitted latent variance σ_θ.
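For readers unfamiliar with the summary statistics named in this abstract, the sketch below shows the textbook Falconer estimator and the matching moment-based ACE split computed from twin correlations. It is not the paper's frailty model or calibration procedure. The function names are hypothetical and the correlations in the usage note are made-up illustrative values.

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's heritability estimate from MZ and DZ twin correlations."""
    return 2.0 * (r_mz - r_dz)

def ace_from_correlations(r_mz, r_dz):
    """Classical moment-based ACE decomposition.

    A (additive genetic)    = 2 * (rMZ - rDZ)
    C (shared environment)  = 2 * rDZ - rMZ
    E (unique environment)  = 1 - rMZ
    """
    a = 2.0 * (r_mz - r_dz)
    c = 2.0 * r_dz - r_mz
    e = 1.0 - r_mz
    return a, c, e
```

With illustrative correlations `r_mz = 0.5` and `r_dz = 0.25`, the DZ correlation is exactly half the MZ correlation, so `ace_from_correlations` assigns everything familial to A and returns C = 0. That is the pattern the abstract describes for a component shared 1.0/0.5 between MZ and DZ twins.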

6
The phenotypic nonspecificity of cell-to-cell signalling in Drosophila melanogaster.

Percival-Smith, A.; Brabrook, C.

2026-05-21 genetics 10.64898/2026.05.19.726339 medRxiv
Top 0.1%
10.3%

An expectation of a hypothesis that proposes cell-to-cell signalling pathways are redundant due to the redundancy of pathway terminal transcription factors (TFs) was tested by screening 35 signalling ligands (SLs) for rescue of a decapentaplegic (dpp) hypomorphic wing growth phenotype. The screen identified three examples of partial rescue: Hedgehog (HH), Semaphorin 1a (SEMA1A) and Wnt ortholog 2 (WNT2). HH overexpression with dppGAL4 may increase the expression of DPP activity from the hypomorphic dpp alleles. However, SEMA1A and WNT2 did not phenocopy ectopic expression of HH or DPP, and neither SEMA1A nor WNT2 was required for wing growth, suggesting substitution of DPP for partial restoration of wing growth. The WNT2 rescue was dependent on the Frizzled 4 (FZ4) WNT receptor, excluding the possibility that WNT2 weakly binds the DPP receptor. Although examples of phenotypic nonspecificity of SL function were identified, this is an expectation, and not direct proof, of the hypothesis of TF redundancy.
Screen Report Summary: An expectation of a hypothesis proposing that cell-to-cell signalling pathways are redundant due to the redundancy of the pathway terminal transcription factors was tested by screening for replacement of one signalling ligand (DPP; SLa) with another SLb for wing growth. Three non-DPP SLs were identified in the screen of 35 SLs: HH, SEMA1A and WNT2. Genetic analysis of Sema1a and Wnt2 suggests functional complementation of dpp for wing growth, indicating that SEMA1A and WNT2 partially replace DPP for wing growth. Therefore, an expectation of the hypothesis is met.

7
The contribution of non-additive genetic effects to the genetic variance of polyploid species.

Clo, J.

2026-05-14 genetics 10.64898/2026.05.12.724556 medRxiv
Top 0.1%
9.9%

Whole genome duplication is a common mutation in eukaryotes with far-reaching phenotypic effects. The resulting morphological, physiological, and fitness consequences and how they affect the survival probability of newly polyploid lineages are intensively studied, but very little is known about the effect of genome doubling on the short-term evolvability of populations. Understanding the effect of polyploidization on the adaptive potential of populations is of crucial importance to predict the future of polyploid populations. In this paper, I investigate the immediate consequences of genome doubling on the genetic variance of populations. To do so, I performed numerical iterations and simulations of how the genetic variance of a quantitative trait changes after polyploidization, under different genetic architectures (additivity, dominance, and epistasis). I found that genetic variance generally decreases after genome doubling. Non-additive gene actions can make autotetraploid populations genetically more diverse than their diploid progenitors in rare cases, notably with overdominance and directional epistasis. By collecting estimates from the agronomic literature, I found that both dominance and epistatic variance contribute to the genetic variance of polyploid populations. These results bring new insights into the adaptive potential of newly formed tetraploid populations, and call for further experimental investigations of how polyploidization is associated with a short-term decrease in evolvability.

8
Telomeric amplicons of SUL1 and Y' in yeast are generated by microhomology-mediated break induced replication occurring in cis

Brewer, B. J.; Martin, R.; Ramage, E.; Payen, C.; Di Rienzi, S. C.; Zhao, Y.; Zane, K.; Verhey, J.; Galey, M.; Miller, D. E.; Ong, G. T.; McKee, J. L.; Alvino, G. M.; Dunham, M. J.; Raghuraman, M. K.

2026-04-09 genetics 10.64898/2026.04.07.716220 medRxiv
Top 0.1%
9.9%

Gene amplification is a potent driver of evolution and is thought to contribute to genetic diseases, including cancer. The yeast Saccharomyces cerevisiae is a powerful organism for understanding amplification mechanisms. When yeast is grown long term in sulfate-limiting chemostats, amplification of the gene that encodes the primary sulfate transporter, SUL1, is a common outcome. Here we describe a form of SUL1 amplification in which multiple copies of the right terminal region of chromosome II are appended in tandem to a native telomere. We find this form of amplicon when we delete the origin of replication next to SUL1 or delete a variety of genes involved in DNA metabolism. It is the only form of amplification found in a yku70Δ mutant, suggesting that unprotected telomeres are involved. We propose that these terminal addition events occur when the unprotected 3' G1-3T telomeric sequence invades a short (~7 bp) internal telomere sequence (ITS) to begin a form of microhomology-mediated break-induced replication (mmBIR) that has been documented in type-I survivors of telomerase mutants. In addition to amplification of the right end of chromosome II, we also find that telomeres containing the sub-telomeric repeat Y' experience similar tandem amplification events and show that their formation is reduced in a pol32Δ mutant, a gene required for mmBIR. Within individual amplicons the ITSs and Y's are nearly identical, suggesting that the multiple copies of the amplified region are generated in a single mmBIR event that we describe as pseudo-rolling circle mmBIR. A similar amplification event at the P-telomere of human chromosome 18 has four copies of a ~54 kb region separated by ITSs of nearly identical size. This finding suggests that these additional copies of the terminal fragment of human chromosome 18 arose by the same pseudo-rolling circle mechanism, perhaps during a period of telomeric stress.
AUTHOR SUMMARY: The human genome is peppered with duplicates (or higher numbers) of segments that are located at sites both nearby and distant from the original, ancestral segments. These Copy Number Variants, or CNVs, appear to be highly variable among different individuals and are being examined with great interest as potential loci associated with genetic disease. Experimentally determining how these CNVs arise and become distributed across the genome is nearly impossible using humans. We are using budding yeast as the model organism to explore mechanisms of gene amplification. In this work we show that destabilizing the ends of yeast chromosomes (telomeres), or interfering with genes involved in the replication, repair, or recombination of DNA, results in a specific form of segmental copy number increase that is initiated at telomeres. We propose that a telomere invades an internal chromosome site and sets up a pseudo-circular template for conservative DNA replication. The outcome is a chromosome with multiple, identical copies of a chromosome end arranged in tandem. We believe that it is also a major mechanism used by cells to repair telomeres that have become eroded during aging.

9
Inherited long telomeres induce a genome-wide transcriptional response in budding yeast

Sidarava, V.; Lydall, D.

2026-04-19 genetics 10.64898/2026.04.15.718807 medRxiv
Top 0.1%
8.8%

Eukaryotes typically maintain telomere length within a defined range. While short telomeres are known to activate DNA damage responses and limit cell proliferation, long telomeres are associated with extended proliferative capacity. The broader cellular consequences of long telomeres are comparatively less well understood. In budding yeast Saccharomyces cerevisiae, long telomeres have been shown to influence gene expression at specific loci, but whether long telomeres affect transcription genome-wide has not been reported. Here, we analysed transcriptomes in a lineage that inherited long telomeres (originally due to a rif2Δ mutation). Transcriptomes were assessed over two rounds of mitosis and meiosis in the absence of the rif2Δ mutation. We show that strains with long telomeres exhibit a distinct gene expression profile, including upregulation of membrane transporters and downregulation of a smaller subset of genes. Both up- and down-regulated genes were distributed across the genome, arguing against a purely telomere-proximal effect on gene expression. Affected genes were enriched for Rap1 binding sites, consistent with a model in which long telomeres sequester telomere-associated transcriptional regulators, such as Rap1, and thereby affect gene expression at non-telomeric binding sites for these regulators. Accordingly, the magnitude of transcriptional changes was greatest in strains with the longest telomeres. Together, our findings demonstrate that long telomeres induce a genome-wide transcriptional response that can accompany inherited long telomeres across generations. Similar effects of long telomeres are likely to occur in other eukaryotes, including humans, where long telomeres are associated with disease.
Article summary: Telomeres protect chromosome ends, and their length is tightly regulated. While short telomeres are known to be harmful, the effects of long telomeres are less well understood. Using budding yeast, we show that inherited long telomeres alter the expression of dozens of genes across the genome, particularly membrane transporters. These changes are consistent with a model in which long telomeres sequester regulatory proteins away from other loci. Our findings may have broader implications in more complex organisms, including humans.

10
The causes of signed linkage disequilibrium within genomic datasets

Stetsenko, R.; Merot, C.; Glemin, S.; Roze, D.

2026-04-21 genomics 10.64898/2026.04.17.719204 medRxiv
Top 0.1%
8.3%

Several recent studies have quantified signed linkage disequilibrium (LD) among mutations in genomic datasets, often reporting positive LD, particularly among mutations presumed to be less deleterious, such as synonymous variants. In this article, we investigate two potential sources of this positive LD: the focus on rare alleles, as adopted in several previous studies, and errors arising in the mapping of short-read sequences onto a reference genome. Using coalescent simulations, we extend previous theoretical results of the effect of focusing on rare alleles, and show that derived alleles present at similar frequencies tend to be in positive LD. Reanalyzing datasets from Capsella grandiflora and Drosophila melanogaster, we show that LD among synonymous derived alleles vanishes in the absence of any conditioning on frequency, while LD between mutations categorized as potentially deleterious by the SIFT4G program stays positive. However, we show that in both cases, this positive LD may be at least partly caused by the potential mismapping of a small fraction of sequences in some individuals, which could be a consequence of structural variants that are absent from the reference genome. Overall, these results show that average signed LD among mutations can be strongly affected by technical artifacts even if these concern only a minority of variants. Finally, we discuss other possible sources of positive LD among deleterious mutations.

11
Laboratory yeast crosses reveal limited epistasis in the genetic basis of complex traits

Gupta, M.; Holmes, C. M.; Belousova, J.; Gopalakrishnan, S.; Rego-Costa, A.; Desai, M. M.

2026-04-06 genetics 10.64898/2026.04.04.716439 medRxiv
Top 0.1%
6.5%

Mapping the genetic basis of complex traits is complicated by the presence of epistatic interactions between loci. While work in molecular genetics identifies numerous specific genetic interactions, statistical analyses of quantitative traits frequently conclude that additive (nonepistatic) models explain most heritable variation. However, these conclusions are typically limited by the narrow range of genetic relatedness (e.g. in F1 offspring of a biparental or circular cross). Here, we use a barcoded panel of Saccharomyces cerevisiae genotypes with a broad range of relatedness to quantify the effects of epistasis on the genetic architecture of seven complex traits. We find limited contributions of epistasis to the genetic basis of these traits. These results indicate that epistasis beyond that detected in standard yeast crosses may exist, yet it contributes little to phenotypic variance in these systems.

12
The effect of a reduction in population size on mean fitness and inbreeding depression

Lopez-Cortegano, E.; Charlesworth, B.

2026-05-21 genetics 10.64898/2026.05.15.725556 medRxiv
Top 0.1%
6.4%

A sudden reduction in population size increases the rate of genetic drift, reducing variability and increasing the mean level of homozygosity. The resulting increased exposure of recessive or partially recessive, strongly deleterious alleles to selection against homozygotes may lead to their being purged from the population, potentially allowing mean fitness to increase after an initial decline, and accelerating the decline in inbreeding depression associated with reduced variability. However, detailed population genetic theory on the effects of population bottlenecks on mean fitness and inbreeding depression remains limited. We develop a theoretical framework for small, randomly mating populations founded from a large population near mutation-selection-drift equilibrium, using both simulations and approximate analytical predictions. These provide quantitative predictions for the dynamics of the population's mean fitness and level of inbreeding depression following a bottleneck. In particular, we derive an approximate expression for the time needed for mean fitness to recover after an initial decline; such a recovery requires selection to be sufficiently strong relative to drift and mutations to be sufficiently recessive. In contrast, weakly deleterious mutations cause reductions in mean fitness and inbreeding depression that are similar in size to those predicted from increases in neutral homozygosity.

13
Environmental impacts on gene expression noise and its relationship with fitness

Haque, T.; Siddiq, M. A.; Duveau, F. M.; Wittkopp, P.

2026-05-18 evolutionary biology 10.64898/2026.05.18.725919 medRxiv
Top 0.1%
6.3%

Genetically identical cells grown in the same environment show variation in gene expression known as expression noise. Expression noise can be heritable and impact fitness, making it subject to natural selection. Increasing expression noise for the Saccharomyces cerevisiae TDH3 gene was shown to be beneficial in glucose-based media when mean TDH3 expression was far from the fitness optimum but deleterious when it was close to this optimum. Here, we show that growth on different carbon sources alters the effects of new mutations on TDH3 expression noise and examine the fitness effects of changing expression noise. In galactose-based media, we observed the same relationship between expression noise and fitness seen in glucose-based media, but in glycerol- and ethanol-based media, we observed the opposite relationship or no significant relationship, respectively. Using simulations of single-cell organisms, we found that these differences were most likely explained by environment-specific relationships between gene expression and fitness. We also found that, far from the optimum, the fitness effects of noise were greatest when expression was highly heritable between mother and daughter cells. The empirical observations and simulations reported in this study show how environments influence both the production of expression noise and its impacts on fitness.

14
Epistatic fitness landscapes emerge from parallel adaptive walks in breeding network metapopulations

Monyak, T.; Morris, G.

2026-03-20 genetics 10.64898/2026.03.18.712732 medRxiv
Top 0.1%
6.3%

Global networks of crop breeding programs leverage diverse germplasm, but diversity increases the complexity of maintaining stability in their elite genepools. To characterize genetic heterogeneity in breeding metapopulations and develop insights on how to manage it, we simulated the evolution of breeding populations on fitness landscapes. We revealed the geometric decrease in the average effect size of alleles segregating as standing variation that become fixed along an adaptive walk. We also demonstrated how independent adaptive walks of subpopulations are influenced by genetic drift, leading to cryptic genetic heterogeneity among elite genepools. This variation is released when elite lines derived from independent subpopulations are crossed, leading to segregation for 2-4X more major QTL in admixed families than in unadmixed families, and 2-4X more epistatic interactions. The emergent property of fitness epistasis for traits under stabilizing selection is well-understood in evolutionary genetics, but under-appreciated in crop quantitative genetics. To highlight the importance of this phenomenon, we constructed an empirical genotype-to-fitness landscape from the sorghum NAM, a global admixed prebreeding resource, demonstrating the utility of fitness landscapes for inferring genetic compatibilities within metapopulations. Our findings suggest that in breeding networks, strategies for effective germplasm exchange must account for epistasis in the oligogenic component of the genetic architecture of locally-adapted traits.
Article summary: Modern public sector crop improvement happens in networks of breeding programs that routinely exchange genetic information. Traditional models for understanding quantitative traits have limited predictiveness in situations with such genetic heterogeneity. This study uses breeding simulations and empirical data to show the utility of the fitness landscape framework for characterizing the genetic architecture of complex traits in breeding metapopulations. By simulating the evolution of breeding programs and integration into networks, it demonstrates how epistatic interactions between large-effect alleles are a fundamental property that must be accounted for when exchanging germplasm.
[Graphical Abstract: Figure 1]

15
A lethal ORC ATPase mutation is suppressed by alterations in ORC and RNA Pol II transcription components

Martinez-Rodriguez, L. E.; Bell, S. P.

2026-05-05 genetics 10.64898/2026.05.01.722367 medRxiv
Top 0.1%
6.3%

The origin recognition complex (ORC) selects origins of replication and directs the loading of the Mcm2-7 replicative helicase at these sites. Five of the six ORC subunits are related to the AAA+ family of ATPases. Although functions for ATP hydrolysis by Cdc6 and the Mcm2-7 complex have been described, the essential role of ORC ATP hydrolysis remains unclear. We performed a genetic screen in Saccharomyces cerevisiae for suppressors of the lethal phenotype of the orc4-R267A allele, which disrupts ORC ATP hydrolysis in vitro. We identified six causative mutations, five of which are distributed across different ORC subunits. The suppressor mutations in Orc1 and Orc4, but not the other ORC subunits, increase the in vitro helicase loading activity of ATPase-defective ORC (ORC4R). Allele specificity studies showed the alleles specifically suppress defects at ATPase interfaces within the ORC-Cdc6 complex. The sixth allele is a mutation in TOA2, a subunit of the TFIIA general transcription factor. Mutations in the general transcription factors TBP and TFIIB, and the large subunit of RNA Polymerase II also suppressed the orc4-R267A lethality, suggesting that reducing transcription is sufficient for suppression. Our study identifies multiple ways to suppress the lethal phenotype of an ATPase defective ORC allele and reveals a connection between ORC ATP hydrolysis and transcription.

16
C. elegans models of Alternating Hemiplegia of Childhood have dominant neuromuscular junction defects

Wall, D.; Friedberg, A.; Lins, J.; Khalifa, R.; Partipilo, S.; Hart, A. C.

2026-04-26 neuroscience 10.64898/2026.04.22.720250 medRxiv
Top 0.1%
6.3%

Dominant missense mutations in ATP1A3, encoding a Na+/K+ ATPase α3 subunit, can cause Alternating Hemiplegia of Childhood (AHC), but how these mutations lead to AHC remains unclear. Here, we establish the first C. elegans AHC models by introducing AHC-causing ATP1A3 patient mutations (D801N, E815K, L839P, and G947R) into the orthologous gene, eat-6, using CRISPR/Cas9. Homozygous C. elegans AHC model animals have recessive developmental defects. Heterozygous AHC model animals have dominant defects in neuromuscular junction (NMJ) function that are inconsistent with haploinsufficiency and dominant sleep or arousal defects. Previous work in a Drosophila G755S AHC model found that loss of a K+-dependent Na+/Ca2+ exchanger exacerbated neuronal defects. We introduced a loss-of-function allele of the orthologous C. elegans gene, ncx-4, into C. elegans AHC models; loss of ncx-4 function did not consistently alter C. elegans AHC model defects across alleles. Our results establish novel C. elegans models of AHC with robust phenotypes, demonstrate that AHC mutations disrupt NMJ function, and provide proof-of-concept for discovering cross-species modifiers of AHC-related phenotypes. Summary Statement: We report the first C. elegans models of Alternating Hemiplegia of Childhood. D801N, E815K, L839P, and G947R AHC model animals have recessive developmental defects and dominant neuromuscular defects.

17
Seasonal fluctuations in fitness result in severe reductions in effective population size

Johnson, O. L.; Tobler, R.; Schmidt, J. M.; Huber, C. D.

2026-04-01 evolutionary biology 10.64898/2026.03.30.715388 medRxiv
Top 0.1%
6.0%

Genetic evidence for fluctuating selection has begun to accumulate for different species over the past few decades, especially for the Drosophila genus where studies have reported hundreds of loci undergoing putatively adaptive oscillations across successive seasons. However, most theoretical and simulation studies of fluctuating selection have relied on abstract or weakly parameterized models, making it difficult to assess their relevance for natural populations. In this study, we simulate multilocus seasonally fluctuating selection under a recently developed model and examine its effect on the variance effective population size (Ne) at a genome-wide scale. By recapitulating genomic, demographic, and evolutionary parameters from natural Drosophila populations in our simulations, we were able to reproduce allele frequency oscillations reported in recent studies and show that these lead to ~50% genome-wide reductions in Ne. We also demonstrate that Ne reductions are well predicted by the maximum frequency amplitude among all adaptively fluctuating loci, and that the frequency amplitudes are largely determined by the number of adaptively fluctuating loci and the strength of their epistatic interactions. Our results demonstrate that fluctuating selection can substantially reduce effective population size and underscore the importance of temporally variable selection in shaping genome-wide patterns of variation beyond classical models. Article Summary: Genetic studies of fluctuating selection in natural populations have grown steadily over the past decade, with reports suggesting that hundreds of loci undergo adaptive oscillations over seasonal timescales in cosmopolitan Drosophila populations. By simulating seasonally fluctuating selection under a recently developed model and ecological scenarios informed by published studies, the authors show that this mode of selection can reduce effective population size by ~50%, with the magnitude of the reduction correlated with the locus exhibiting the largest allele frequency fluctuations. These findings highlight fluctuating selection as an important factor shaping genome-wide patterns of genetic variation and effective population size.

18
Pitfalls in estimating and interpreting the contribution of ultra-rare genetic variants to the heritability of complex traits

Wang, H.; Wainschtein, P.; Sidorenko, J.; Fikere, M.; Zhang, Y.; Kemper, K. E.; Zheng, Z.; Hivert, V.; Zeng, J.; Goddard, M. E.; Visscher, P. M.; Yengo, L.

2026-04-07 genetic and genomic medicine 10.64898/2026.04.06.26350278 medRxiv
Top 0.1%
5.0%

Assessing the contribution of ultra-rare variants (minor allele frequency <0.01%) to the heritability of complex traits remains challenging due to limited understanding of potential biases. Here, we focus on singletons (that is, variants observed only once in the study sample), the most abundant class of ultra-rare variants, to showcase various confounders of heritability estimates and underline pitfalls in their interpretation. We show through theory, simulations, and analysis of 5,330,210 exome-sequenced singletons in 305,813 unrelated European-ancestry individuals in the UK Biobank that (i) population stratification induces both upward and downward biases in singleton-based heritability estimates, (ii) estimates capture non-additive genetic effects, and (iii) asymptotic standard errors of estimates from likelihood-based procedures are generally mis-calibrated when traits are not normally distributed. We further showcase these biases in real-data analyses of 22 quantitative phenotypes and report, after accounting for these pitfalls, significant estimates for number of children (3.4%), peak expiratory flow (1.9%), red blood cell count (2.5%), white blood cell count (1.9%) and heel bone mineral density (2.4%). Overall, our study provides recommendations for robust inference of heritability from ultra-rare variants and underscores that reliable estimates for ordinal and binary traits will require far larger sample sizes and improved methods, given that confounding in these traits remains difficult to detect and correct.

19
Transforming Semi-structured Variant Assessments into Computable Clinical Assertions: A Pilot Study for AI-Assisted Curation

Cannon, M. J.; Bratulin, A.; Kuzma, K.; Puthawala, D.; Corsmeier, D.; Schieffer, K.; Kelly, B.; Cottrell, C.; Wagner, A. H.

2026-05-08 health informatics 10.64898/2026.05.07.26352456 medRxiv
Top 0.1%
4.9%

Genomic medicine relies on expert evaluation of genomic variants, but this process is dramatically slowed by a lack of readily accessible genomic knowledge. Although genomic knowledge resources such as ClinVar and CIViC support structured data sharing and provide interfaces for adding structure, much of the variant interpretation data generated upstream of these resources is not readily interoperable with them, limiting the ability of clinical labs to share data and creating knowledge silos. Here we evaluate a strategy for breaking down these knowledge silos in a pilot study to transform semi-structured variant classification knowledge into computable clinical assertions leveraging the Global Alliance for Genomics and Health (GA4GH) Genomic Knowledge Standards specifications. We programmatically mapped previously captured somatic cancer clinical significance classifications from spreadsheets to the GA4GH Variant Annotation specification. For diagnostic classification data, this approach enabled reuse of standards-aware submission tooling to share 1,499 records to ClinVar. We then studied whether AI-assisted curation approaches could overcome gaps in unstructured text and enable scalable curation of prior classifications. Using this approach, we were able to accurately classify clinical significance for 71.8% (117/163) of randomly sampled prognostic evidence statements. We conclude with an overview of how this work may be generalized to make computationally inaccessible variant evidence from other clinical laboratories broadly reusable in downstream knowledgebases such as CIViC and ClinVar.

20
Measurement strategy alters inferred age-dependent accumulation and mortality risk of mosaic Y loss

Ware, A.; Weyrich, M.; Fatima, S.; Xu, T.; Radhakrishnan, S.; Kapfer, P.; Yang, X.; Schiethe, L.; Zanders, L.; Cremer, S.; Mas-Peiro, S.; Dimmeler, S.; Speer, T.; Zeiher, A.; Abplanalp, W.

2026-03-10 health informatics 10.64898/2026.03.09.26347951 medRxiv
Top 0.1%
4.9%

Mosaic loss of Y chromosome (mLOY) is a widely used biomarker of biological aging, yet whether its inferred age-dependent accumulation and associated clinical risk are invariant to measurement strategy remains unclear. We compared intensity-based and phase-based quantification approaches in 223,251 men from the UK Biobank to determine how analytic definitions influence estimates of mLOY burden, risk thresholds and population prevalence. Phase-based quantification revealed a steeper and more stable age-dependent accumulation of mLOY and identified excess mortality risk at lower mosaic burdens than intensity-based metrics. These differences shifted the inferred onset of biological risk and expanded the proportion of individuals classified as affected from 5.3% to 19.2%. Conventional thresholding preferentially excluded low-burden mosaicism, compressing risk gradients and reducing statistical resolution for downstream associations. These findings show that analytic definitions materially alter inferred accumulation dynamics, risk thresholds and population prevalence of mosaic Y loss.