Genetics
Oxford University Press (OUP)
Preprints posted in the last 30 days, ranked by how well they match Genetics's content profile, based on 225 papers previously published here. The average preprint has a 0.21% match score for this journal, so anything above that is already an above-average fit.
Lopez-Cortegano, E.; Charlesworth, B.
A sudden reduction in population size increases the rate of genetic drift, reducing variability and increasing the mean level of homozygosity. The resulting increased exposure of recessive or partially recessive, strongly deleterious alleles to selection against homozygotes may lead to their being purged from the population, potentially allowing mean fitness to increase after an initial decline, and accelerating the decline in inbreeding depression associated with reduced variability. However, detailed population genetic theory on the effects of population bottlenecks on mean fitness and inbreeding depression remains limited. We develop a theoretical framework for small, randomly mating populations founded from a large population near mutation-selection-drift equilibrium, using both simulations and approximate analytical predictions. These provide quantitative predictions for the dynamics of the population's mean fitness and level of inbreeding depression following a bottleneck. In particular, we derive an approximate expression for the time needed for mean fitness to recover after an initial decline; such a recovery requires selection to be sufficiently strong relative to drift and mutations to be sufficiently recessive. In contrast, weakly deleterious mutations cause reductions in mean fitness and inbreeding depression that are similar in size to those predicted from increases in neutral homozygosity.
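The neutral baseline mentioned at the end of this abstract can be illustrated with the textbook drift recursion for the expected inbreeding coefficient, F_t = 1 - (1 - 1/(2N))^t, in an idealized randomly mating diploid population of constant size N. This is only a minimal sketch of that standard result, not the authors' framework; the function name and the example values of N and t are illustrative assumptions.

```python
def neutral_inbreeding(n_individuals: int, generations: int) -> float:
    """Expected inbreeding coefficient F_t after `generations` of drift.

    Assumes an idealized diploid Wright-Fisher population of constant
    size `n_individuals`, starting from F_0 = 0 (hypothetical helper,
    not taken from the preprint).
    """
    if n_individuals < 1:
        raise ValueError("n_individuals must be at least 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    # Each generation, expected heterozygosity shrinks by a factor 1 - 1/(2N).
    return 1.0 - (1.0 - 1.0 / (2.0 * n_individuals)) ** generations


if __name__ == "__main__":
    # Illustrative bottleneck size of N = 25 individuals.
    for t in (1, 10, 50):
        print(t, neutral_inbreeding(25, t))
```

Under these assumptions, the expected increase in neutral homozygosity depends only on N and the number of generations since the bottleneck.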
Martinez-Rodriguez, L. E.; Bell, S. P.
The origin recognition complex (ORC) selects origins of replication and directs the loading of the Mcm2-7 replicative helicase at these sites. Five of the six ORC subunits are related to the AAA+ family of ATPases. Although functions for ATP hydrolysis by Cdc6 and the Mcm2-7 complex have been described, the essential role of ORC ATP hydrolysis remains unclear. We performed a genetic screen in Saccharomyces cerevisiae for suppressors of the lethal phenotype of the orc4-R267A allele, which disrupts ORC ATP hydrolysis in vitro. We identified six causative mutations, five of which are distributed across different ORC subunits. The suppressor mutations in Orc1 and Orc4, but not the other ORC subunits, increase the in vitro helicase loading activity of ATPase-defective ORC (ORC4R). Allele specificity studies showed the alleles specifically suppress defects at ATPase interfaces within the ORC-Cdc6 complex. The sixth allele is a mutation in TOA2, a subunit of the TFIIA general transcription factor. Mutations in the general transcription factors TBP and TFIIB, and the large subunit of RNA Polymerase II also suppressed the orc4-R267A lethality, suggesting that reducing transcription is sufficient for suppression. Our study identifies multiple ways to suppress the lethal phenotype of an ATPase defective ORC allele and reveals a connection between ORC ATP hydrolysis and transcription.
Kocik, R. A.; Ahrens, J.; Gasch, A. P.
Yeast responding to acute stress reallocate cellular resources, in part via the Environmental Stress Response (ESR) that induces stress-defense genes while repressing ribosome-biogenesis and growth genes. The purpose and regulation of coordinated induction and repression is incompletely understood, but both responses are influenced by ESR transcription factors Msn2 and Msn4 (Msn2/4). Here we used single-cell microscopy and transcriptomic analysis to investigate the role of upstream regulator Pde2 in ESR regulation and post-stress fitness. Loss of PDE2 weakened and shortened Msn2 activation following salt stress and produced muted induction of Msn2/4 targets, similar to an msn2Δ msn4Δ strain. In contrast, Pde2 had at most a minor impact on ESR repressor Dot6, yet was important for repression of its targets beyond Msn2/4 influence. Consistent with our recent resource-reallocation model, pde2Δ cells had normal or faster post-stress growth rates, despite weaker activation of the ESR. We discuss implications for ESR regulation and function.
Chandra, S.; Gao, Z.
Recent studies have reported consistent inter-population differences in GC content at polymorphic sites in multiple species, including humans. Specifically, populations that experienced recent bottlenecks exhibit lower average GC content (GC%) at common polymorphic sites compared to non-bottlenecked groups--an observation previously interpreted as an indication of rapid evolution of base composition. In this study, we investigate the evolutionary and technical factors driving these patterns across humans, mice, maize, and silkworm. We find that GC% at polymorphic sites is highly sensitive to the allele frequency threshold applied. Relaxing this threshold reduces inter-population differences to negligible levels in humans and significantly attenuates similar signals in other species. We further observe substantial GC% variation across allele frequency bins, a pattern driven by the differential abundance of different mutation types. We demonstrate that these observations are collectively driven by an interaction between demographic history and a universal excess of strong-to-weak mutations relative to weak-to-strong mutations, which is counteracted by GC-biased gene conversion (gBGC) over long evolutionary timescales. Forward-in-time simulations with realistic parameters recapitulate observed patterns of GC% variation across both populations and allele frequency bins. Overall, our findings reveal that the base composition at polymorphic sites is strongly shaped by the interaction between demographic history, mutation bias, and gBGC, and does not represent stable, genome-wide trends. Consequently, inter-population differences in GC content--especially at common variants--should not be interpreted as evidence of ongoing divergence in base composition or shifts in mutation patterns.
Miao, X.; Edge, M. D.; Harpak, A.
Standard genome-wide association studies (GWASs) are vulnerable to confounding factors, including stratification, assortative mating, and dynastic effects. Family studies such as sibling-based GWAS (sib-GWAS) mitigate such confounding and are becoming the tool of choice for teasing apart direct genetic effects--causal effects of one's genotype on one's own phenotype--from other factors. However, due in part to their smaller sample sizes, sib-GWAS allelic effect estimates are substantially more variable than standard (i.e., population-based) GWAS estimates. The quantification of this uncertainty is essential for many uses of sib-GWAS, including polygenic scoring, causal inference (e.g., Mendelian randomization), disentangling direct from indirect familial effects, and measuring assortative mating. Here, we investigate sources of uncertainty in sib-GWAS allelic effect estimators. We study their impacts on the biases of three uncertainty measurement methods, including two that are commonly used and a new resampling-based approach we propose. We find that heterogeneity in allelic effects or heteroskedasticity across families (e.g., due to variation in genetic backgrounds or environments) can bias existing methods, and that this bias is more severe for small samples and rare variants. In contrast, the resampling-based approach we propose is approximately unbiased under all scenarios we considered. We validate our theoretical predictions, as well as the importance of effect heterogeneity and heteroskedasticity, using simulations and empirical analysis in the UK Biobank. In sum, this study helps understand the sources of uncertainty in family-based genotype-phenotype association studies and provides a robust method to estimate uncertainty.
Treaster, M.; White, M. A.
Many taxa have evolved heteromorphic sex chromosomes like the XY system found in mammals. In addition to the sex determination gene, which directs development of the gonad into an ovary or testis, sex chromosomes can have drastically different gene content, leading to substantial genetic differences between genetic males and females beyond their gonad identity. Studying the effects of these genetic differences is challenging, as the sex chromosomes and sex determination gene are inherited together, so the effects of genetic differences between the X and Y cannot be easily isolated from the hormonal differences produced by the ovary and testis. The threespine stickleback fish has a heteromorphic XY sex chromosome system and a wide range of well-documented sex differences in morphology and behaviors, including complex mating behaviors and male-only parental care. Through genetic manipulation of amhy, the newly identified sex determination gene in threespine stickleback, we are able to generate gonadal males and females with either the XX or XY sex chromosome complement and analyze the separate effects of gonadal sex and sex chromosome complement on sexually dimorphic gene expression. We find that sex chromosomes have a larger effect on gene expression than gonadal sex in somatic tissues, while gonadal sex has a larger effect on expression in the gonads. We also find that the X and Y chromosomes are enriched for genes that show differential expression between females and males. Our findings demonstrate the significant biological impact of sex chromosomes outside of primary sex determination and showcase the utility of the threespine stickleback for studying the genetic basis of sex differences.
Zhang, L.; Paterson, A. D.; Sun, L.
Testing for Hardy-Weinberg equilibrium (HWE) is a fundamental component of genetic data analysis, widely used for quality control and model validation. Although HWE testing is well established for autosomal loci, inference on the X chromosome is more complex due to sex-specific genotype structures and potential sex differences in minor allele frequency (sdMAF). Existing tests differ in their assumptions about sdMAF and male sample inclusion, often leading to distinct but poorly characterized null hypotheses. We develop a general statistical framework for HWE inference using the robust allele-based regression model. By formulating HWE testing as an assessment of allele-level dependence, the framework directly parameterizes Hardy-Weinberg disequilibrium, unifies existing Pearson χ²-based tests under explicit modeling assumptions, and clarifies their null hypotheses, degrees of freedom, and sensitivity to sdMAF. The framework also accommodates covariate and population-structure adjustment within a unified regression-based formulation. The proposed framework provides robust, interpretable, and flexible inference, establishing a unified statistical foundation for HWE testing across autosomal and X-chromosomal regions. Simulation studies and analysis of high-coverage 1000 Genomes Project data demonstrate that commonly used X-chromosome tests can exhibit inflated type I error or misleading inference when sdMAF is present.
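For orientation, the well-established autosomal case that this abstract takes as its starting point can be sketched as a classical Pearson chi-square goodness-of-fit statistic for one biallelic locus. The code below is a minimal illustration of that textbook test only, not the regression framework or the X-chromosome tests described in the preprint; the function name and genotype labels are assumptions made for the example.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Pearson chi-square statistic for Hardy-Weinberg proportions at an
    autosomal biallelic locus (hypothetical helper for illustration).

    Arguments are observed counts of the three genotypes AA, AB and BB.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Counts in exact Hardy-Weinberg proportions versus a sample with no heterozygotes.
    print(hwe_chi_square(25, 50, 25))
    print(hwe_chi_square(50, 0, 50))
```

For a biallelic autosomal locus this statistic is conventionally compared with a chi-square distribution with one degree of freedom.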
Waples, R. S.
Interest in quantifying linkage disequilibrium (LD, non-random associations of alleles at different loci) has skyrocketed in recent years as researchers have focused on use of LD in genome-wide association studies (GWAS), for studying historical demography, and for estimating effective population size (Ne). The most widely used LD metric is r² = the squared correlation of alleles at a pair of loci. Despite a half century of efforts, developing an unbiased expectation of r² as a function of the many factors that can affect it (physical linkage, genetic drift, selection, migration, mutation, mating systems) remains elusive. Furthermore, even when all of these other factors are absent, empirical estimates of r² are upwardly biased by sampling a finite number (S) of individuals, and that must be accounted for if one wants to focus on the desired signal of LD. Previous approaches to estimate [Formula] have been shown to be biased to greater or lesser degrees. The purpose of this short paper is to demonstrate that a simple and apparently exact expression for [Formula] does exist for the special case where sampling error is the only factor contributing to r², in which case [Formula] = 1/(S - 1). When other factors contribute heavily to LD, [Formula] shrinks toward 0 as empirical r² → 1. However, for estimating contemporary Ne with unlinked markers, empirical r² will generally be small and 1/(S - 1) will provide a robust estimate of [Formula].
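The 1/(S - 1) value quoted in this abstract can be checked numerically in a simple setting. The sketch below draws genotypes (0, 1 or 2 copies of an allele) at two independent loci for S individuals, computes the squared Pearson correlation between them, and averages it over replicates that are polymorphic at both loci. Whether this genotype-level correlation matches the exact definition used in the preprint is an assumption; all function names, the allele frequency and the seed are illustrative.

```python
import random


def squared_correlation(x, y):
    """Squared Pearson correlation of two equal-length lists,
    or None if either list has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def random_genotypes(rng, sample_size, allele_freq):
    """Diploid genotypes (0, 1 or 2 allele copies) for one locus."""
    return [
        int(rng.random() < allele_freq) + int(rng.random() < allele_freq)
        for _ in range(sample_size)
    ]


def mean_r2_under_sampling(sample_size, replicates, allele_freq=0.5, seed=1):
    """Average r^2 between two independent loci, so that finite sampling
    is the only source of non-zero correlation."""
    rng = random.Random(seed)
    total = 0.0
    used = 0
    for _ in range(replicates):
        x = random_genotypes(rng, sample_size, allele_freq)
        y = random_genotypes(rng, sample_size, allele_freq)
        r2 = squared_correlation(x, y)
        if r2 is not None:  # skip replicates that are monomorphic at a locus
            total += r2
            used += 1
    return total / used


if __name__ == "__main__":
    s = 20
    print("simulated mean r^2:", mean_r2_under_sampling(s, 10000))
    print("1/(S - 1):         ", 1 / (s - 1))
```

With independent loci, the simulated mean should sit close to 1/(S - 1), up to Monte Carlo error.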
Kalra, S.; Sanchez, G.; Stubin, A.; Le, A.; Bakshian, A.; Ortiz Diaz, B.; Mark, B. M.; Pena, C.; Parker, E.; Johnston, E.; Hsu, E.; Brangham, G.; Bala-Mehta, I.; Perez, L.; Milrod, M.; Stanten, M.; Nakamura, M.; Hwang, P.; Ptaszynska, S.; Cander, S.; Park, S.; Tan, T. L.; Zhou, Y.; Coolon, J.
Gene-by-environment (GxE) interactions play a major role in shaping both phenotypic and molecular variation, with important implications for human health and disease. In this study, we used the Doxycycline (Dox) regulated, tetracycline-responsive (Tet-Off) promoter system to sequentially reduce or titrate gene expression levels of the essential yeast transcription factor Repressor Activator Protein 1 (RAP1) similar to a hypomorph allele series, across three distinct environments: Yeast Peptone Dextrose (YPD) media, YPD media with Heat Shock (HS), and Yeast Peptone Acetate (YPAC) media. We then performed RNA sequencing (RNA Seq) to assess global transcriptional responses to RAP1 reduction in these different growth environments. Our analysis first focused on the independent effects of varying RAP1 expression levels within and across environments. We then explored GxE interactions, revealing a subset of genes with significant consequences of reduced levels of RAP1 and environment-specific expression patterns. Notably, many genes exhibited opposite effects of RAP1 titration on gene expression when yeast were grown in YPAC media compared to YPD media and/or HS, suggesting environment-dependent regulatory architecture. This design reveals how cells integrate internal transcriptional and regulatory changes with external environmental cues, providing a deeper view of GxE architecture. Using Weighted Gene Co-expression Network Analysis (WGCNA), we identified co-regulated gene modules, and by combining this with transcription factor motif enrichment tests, our study identified candidate regulators driving their dynamics. Our findings demonstrate that gene regulatory networks can vary dramatically depending on the environmental context an organism experiences, which can then influence the specific phenotypes produced by a particular genetic perturbation. 
This illustrates the complexity of genotype-environment interactions and the importance of studying gene function in multiple environments to gain a truly comprehensive understanding of a gene's sometimes numerous and diverse functions.
Stevenson, E.-L.; Kelliher, C. M.; Kettenbach, A. N.; Loros, J. J.; Dunlap, J. C.
Circadian rhythms, ~24-hour biological cycles, enable organisms to anticipate rhythmic environmental cycles so they can assign proper day and night functions that align with those cycles. Circadian rhythms are defined by their ability to be reset by external cues, their capacity to continue to oscillate in the absence of those cues, and their capacity to maintain the rate of the clock across a range of ambient temperatures, a property known as temperature compensation. In the Neurospora clock, the White Collar Complex (WCC) drives expression of FRQ, which nucleates a complex including FRH and CK1a that phosphorylates and thereby represses WCC activity. Work to date has suggested that kinases may be involved in temperature compensation and that in Neurospora the primary target of these is FRQ. Here we investigate the genetic relationship between two clock kinases, Casein Kinase I (ck-1a) and Casein Kinase II (cka), in their regulation of temperature compensation using novel alleles, ck-1aD135G and Δcka. We find that the clock relies on Casein Kinase I more at cold temperatures, but this changes as temperature increases, and the clock relies more on Casein Kinase II at warm temperatures. Using quantitative proteomics on FRQ across temperatures, we find that the FRQ phosphorylation landscape is dependent on temperature and is altered in temperature compensation mutants. This leads to the development of a phosphorylation-driven model for temperature compensation, where key temperature compensation-specific domains on FRQ are phosphorylated to regulate period length in response to temperature, including by Casein Kinase I and Casein Kinase II.
Fernandez-Fernandez, J.; Martin-Villanueva, S.; Ayers, T. N.; Galmozzi, C. V.; Woolford, J. L.; de la Cruz, J.
Ribosome biogenesis is a highly coordinated pathway that involves the assembly of ribosomal RNAs (rRNAs) with ribosomal proteins (r-proteins) to generate functional ribosomal subunits (r-subunits). The Saccharomyces cerevisiae (yeast) large 60S r-subunit consists of three rRNA molecules and 46 r-proteins. The contributions of nearly all r-proteins of the yeast large r-subunit have been characterized; however, a few non-essential proteins remain poorly understood. Although non-essential, human eL22 has been identified as a key player in p53 regulation during ribosomal stress and as a highly mutated target in cancers. Despite this function, the role of eL22 in ribosome maturation is still ill-defined. In this study, we characterized yeast eL22 r-protein. Our results show that eL22 assembles into intermediate nucleolar pre-60S ribosomal particles. Loss of eL22 impairs cell growth and reduces 60S r-subunit accumulation, phenotypes that are exacerbated at low temperatures. Analysis of pre-rRNA processing by pulse-chase labeling, northern blot hybridization, and primer extension reveals a defect in 27S pre-rRNA maturation, specifically at the level of 27SB pre-rRNA processing. Consequently, nuclear export of eL22-deficient pre-60S particles is mildly impaired. Furthermore, we identify genetic interactions between eL22 and neighboring r-proteins, eL38 and eL31. We conclude that eL22 assembly is required for optimal pre-60S maturation during middle nucleolar stages, particularly at low temperatures, a function likely supported by the cooperative action of other r-proteins associated with common elements of 25S rRNA.
Highlights
- We have studied the role of r-protein eL22 in yeast ribosome assembly.
- eL22 is required for 60S ribosomal subunit production.
- The absence of eL22 is critical at low temperatures.
- eL22 is important for 27SB pre-rRNA processing and nuclear export of pre-ribosomes.
- eL22 functionally interacts with r-proteins eL38 and eL31 in domain III of 25S rRNA.
Sequeira, A. N.; Szpiech, Z. A.; Huber, C. D.
Identifying signatures of positive selection in humans is complicated by demographic processes such as bottlenecks, migration and admixture, all of which can distort or obscure the genomic patterns produced by selective sweeps. Ancient DNA offers a direct window into past allele and haplotype frequencies, yet most sweep scans in ancient populations rely on allele-frequency or site frequency spectrum (SFS) summaries, with limited use of haplotype-based approaches. Here, we evaluate the performance of haplotype and SFS-based methods for detecting selective sweeps under demographic scenarios that reflect the complex history of ancient and modern Europeans. We extend the haplotype-based likelihood framework saltiLASSI to accommodate pseudohaploid ancient genomes, enabling the use of truncated haplotype frequency spectra and their spatial decay to detect sweeps without requiring phased data. Using forward-in-time simulations, we examine sweeps of varying ages, two pulses of admixture with different source proportions, and cases where selection continues or ceases after admixture. We compare saltiLASSI to a widely used SFS-based approach (SweepFinder2). Our results show that haplotype-based likelihood models retain higher power than SFS methods in admixed populations, particularly when sweep haplotypes are introduced through migration or when selection has not had sufficient time to regenerate a clear SFS signature after admixture. These findings highlight the promise of haplotype-based inference for ancient DNA and demonstrate how model-based approaches can improve the detection of historical selective sweeps in populations with complex demographic histories.
Thrikawala, S.; Naples, B.; Rosowski, E.
One feature key to the versatility of zebrafish as an animal model for biomedical research is the breadth of genetic tools available, including for transgenesis. While the Tol2 transposase system remains the gold standard, its efficiency can be highly variable. Here, we explored the potential of a complementary transgenesis system, Cp36, a large serine recombinase identified from Clostridium perfringens previously found to efficiently integrate target cargo into the human genome without a preinstalled attB site. We generated Cp36-based plasmid constructs for zebrafish transgenesis and compared their performance to matched Tol2 plasmids across multiple experimental contexts, including transient expression, germline transmission, and multi-transgene expression. Cp36 integrates small ~3.5 kb cargo into the zebrafish genome and transmits to the next generation as efficiently as Tol2, but Cp36 performance declines substantially for larger ~7.5 kb constructs. Both Cp36 and Tol2 have comparable efficiency in transiently expressing a second construct regardless of the transposase/recombinase used to integrate the first construct, indicating compatibility with sequential transgenesis strategies. In summary, we demonstrate that Cp36 functions as a new alternative transgenesis method in zebrafish.
Green, L.; Hajiarbabi, S.; Kelleher, E. S.
Organismal tolerance of ionizing radiation is a complex trait whose genetic basis has been studied extensively, in large part due to its significance to human health and technological advancement. Conventional mutant screens in model organisms have revealed the paramount role of DNA damage response (DDR) and repair pathways in determining tolerance to ionizing radiation. However, uncovering natural genetic variation in radiotolerance is also of critical importance, as individual differences are associated with differential susceptibility to cancer as well as differential response to radiation treatment. Genetic variation that underlies phenotypic differences in natural populations often occurs in distinct genes and pathways as compared to the genes of major effect revealed by mutant screens, owing to the impact of natural selection on the former. We therefore sought to isolate natural variation in radiotolerance of Drosophila melanogaster by performing extreme QTL mapping. We generated a large, genetically diverse multiparental population and exposed 3rd instar larvae to a semi-lethal dose of ionizing radiation. By sequencing surviving adults and comparing their haplotypes to unexposed controls from the same population, we identified a single major-effect QTL spanning the 3rd chromosome centromere. The QTL contains 34 genes, none of which have previously been implicated in radiotolerance. We interrogated the impact of these genes on radiotolerance through forward genetic analysis and RNA-seq. Our findings implicate diverse processes in radiotolerance, including cell-cycle regulation and innate immune function.
Clo, J.
Whole genome duplication is a common mutation in eukaryotes with far-reaching phenotypic effects. The resulting morphological, physiological, and fitness consequences and how they affect the survival probability of newly polyploid lineages are intensively studied, but very little is known about the effect of genome doubling on the short-term evolvability of populations. Understanding the effect of polyploidization on the adaptive potential of populations is of crucial importance to predict the future of polyploid populations. In this paper, I investigate the immediate consequences of genome doubling on the genetic variance of populations. To do so, I performed numerical iterations and simulations of how the genetic variance of a quantitative trait changes after polyploidization, under different genetic architectures (additivity, dominance, and epistasis). I found that genetic variance generally decreases after genome doubling. Non-additive gene actions can make autotetraploid populations genetically more diverse than their diploid progenitors in rare cases, notably with overdominance and directional epistasis. By collecting estimates from the agronomic literature, I found that both dominance and epistatic variance contribute to the genetic variance of polyploid populations. These results bring new insights into the adaptive potential of newly formed tetraploid populations, and call for further experimental investigations of how polyploidization is associated with a short-term decrease in evolvability.
Xiao, W. F.; Farjo, M. N.; Lowen, A. C.; Koelle, K.
The ecological and evolutionary dynamics of populations, including viral populations, are known to be jointly shaped by deterministic and stochastic processes. While the impact of stochastic processes has been rigorously explored for viral dynamics at the level of the host population, most dynamic models for acutely-infecting respiratory viral pathogens at the within-host scale remain deterministic in their formulation. While this may be reasonable for identifying key processes shaping their within-host viral population dynamics, recent studies indicate that stochastic processes need to be invoked for understanding patterns of within-host viral evolution. Specifically, several studies have shown that viral allele frequencies can change dramatically over the time course of days in acute infections. Here, we use stochastic dynamic models to explore the role of environmental noise in shaping observed patterns of virus evolution in acute respiratory virus infections. We summarize ways in which environmental stochasticity can be biologically realized in these acute viral infections and describe within-host models that can be implemented to jointly yield viral population dynamics and evolutionary dynamics. We further develop a statistical approach to estimate the extent of environmental noise from observed within-host allele frequency changes. We test this approach on simulated data and apply it to existing influenza A virus and SARS-CoV-2 within-host data. With these applications, we show that environmental stochasticity can parsimoniously reproduce key features of empirically observed allele frequency changes without needing to invoke demographic stochasticity or to adopt Wright-Fisher model formulations with a constant effective population size. 
Finally, we show that purifying selection and positive selection can both still contribute to within-host viral evolution in the context of a noisy environment, providing theoretical support for studies that have found purifying and positive selection in acutely-infecting respiratory virus populations.
Gilmour, S. E.; Fagen, B. L.; Salim, D.; Bravo Nunez, M. A.; Lange, J. J.; Wood, C.; Price, A.; Eickbush, M. T.; Billmyre, R. B.; Cockrell, A. J.; McCroskey, S.; Searcy, M.; Koren, K.; Ramirez-Sanchez, L. F.; Gerton, J. L.; Zanders, S. E.
Centromeres are essential for chromosome segregation, yet in many genomes they are composed entirely of rapidly evolving repetitive DNA, embedded in other repetitive DNA that forms pericentromeric heterochromatin. Due to the difficulties of manipulating these repeat-rich regions, how the relative size of pericentromeric repeat regions influences chromosome segregation remains an open question. Here, we take advantage of the tractable Schizosaccharomyces pombe system by combining population-level analysis, complete long-read assemblies, and engineered near-isogenic strains to test how pericentromeric repeat copy number affects chromosome biology in its native context. We find that pericentromeric dh/dg arrays on chromosome 3 vary almost tenfold in size among natural S. pombe isolates, ranging from 35 to 265 kb. We converted this natural diversity into an experimental system of nearly isogenic strains that primarily differ in pericentromere size (35 to >350 kb). We found that pericentromere size does not alter baseline growth under standard conditions. However, larger pericentromeres alter transcriptional output and sensitize cells to spindle stress. We show that this spindle-stress phenotype depends on heterochromatin: loss of the H3K9 methyltransferase Clr4 abolishes size-dependent differences, whereas artificial targeting of the Chromosomal Passenger Complex to heterochromatin partially rescues the defect. Thus, we find that larger pericentromeres act as sinks for limiting regulatory factors, weakening their effective concentration at centromeres and compromising faithful chromosome segregation under stress. These results establish that naturally occurring copy-number variation within repetitive pericentromeric DNA is not merely noise, but a functional source of variation in chromosome segregation and gene regulation. 
Our work provides an experimentally tractable framework for understanding how repeat expansion in centromere-proximal heterochromatin influences chromosome behavior across eukaryotes.
Sanchez-Escabias, E.; Rico, D.; Reyes, J. C.
Understanding cis-regulatory elements (CREs) at the single-cell level is fundamental to deciphering transcriptional changes during development, cell differentiation, and homeostasis. Recent studies have shown that arbitrary peak-calling thresholds complicate data interpretation and cross-study comparisons. Furthermore, due to the inherent sparsity of single-nuclei ATAC-seq (snATAC-seq) data, distinguishing between truly inaccessible regions and technical dropouts remains challenging. Our analysis of snATAC-seq experiments performed in a well-established cell line suggests that the dichotomy between accessible (open) and inaccessible (closed) CREs is misleading. Thousands of accessible regions are present in a very small fraction of cells of the population but are repeatedly identified, suggesting that they have low accessibility or are only transiently accessible. However, depending on the detection threshold selected, they could be considered either genuine CREs or noise. To resolve this inconsistency, we propose a model where chromatin accessibility is treated as a continuum, defined by a probability of accessibility (Pa) for each accessible region across cell types and conditions. Through computational simulations, we demonstrate that snATAC-seq results can be explained by a simple "balls into bins" probability model, offering a theoretical framework for calculating Pa distributions from any snATAC-seq dataset. Furthermore, we examine how Pa distributions shift following activation of the TGFβ signaling pathway. This probabilistic approach removes the reliance on arbitrary thresholds and supports a more quantitative and dynamic understanding of accessible region function.
Madrigal Roca, L. J.; Kelly, J. K.
- Synonymous nucleotide variation, which is remarkably high in Mimulus guttatus, can be affected by both codon usage selection (translational efficiency) and linked selection (hitchhiking effects).
- Codon usage reflects a genome-wide tug-of-war between mutational pressure toward A/T-ending codons and weak selection favoring G/C-ending codons. The outcome is determined largely by gene expression level and localized variation in recombination rate.
- Using both mechanistic (ROC-SEMPPR) and population genetic models, we find that most genes are weakly selected for codon usage, about 76% yielding scaled selection coefficients (S = 4Nes) in the range of 0 to 1. Additionally, 4029 genes, primarily involved in photosynthesis, translation, defense, and phosphate scavenging, experience strong selection (S > 1).
- Levels of nucleotide variation within genes indicate a strong effect of linked selection. Non-synonymous polymorphism declines in genes with strong purifying selection, and as the rate of (intra-genic) recombination declines. Levels of synonymous polymorphism usually track non-synonymous (owing to background selection), except in genes under the strongest translational selection.
- Counterintuitively, we find that codon usage selection has a generally positive effect on synonymous nucleotide diversity at 4-fold degenerate positions. Since mutation strongly disfavors the optimal base in M. guttatus, codon selection in the range of 0 < S < 2 evens the balance (between selection and mutation) and thus inflates heterozygosity.
Roy, J.; Torkamaneh, D.; Monthony, A. S.
Sex expression in Cannabis sativa is determined by XX/XY sex chromosomes but remains plastic, with ethylene inhibition inducing male flowers on XX plants and ethylene release inducing female flowers on XY plants. Although ethylene is a central regulator of this process, the contribution of gibberellin signaling to cannabis sex reversal remains poorly defined. Here, we reconstructed the GA biosynthesis, regulation, and signaling pathway in C. sativa and profiled GA-related gene expression during chemically induced sex reversal. Orthology-based searches identified 50 putative C. sativa GA-related genes, widely distributed across the genome, with the X chromosome harboring 11 genes, including six within the non-recombining region. Transcriptomic analyses across vegetative baseline, early post-treatment leaves, and developing flowers showed that expression profiles were broadly similar between XX and XY plants at day 0, weakly perturbed at day 1, and strongly structured by floral phenotype at day 14. Early responses were limited to downregulation of CsGA3ox1 in ethephon-treated XY plants and CsGASA1 in STS-treated XX plants. By day 14, sex reversal was associated with differential expression of key genes, including CsGA1, multiple GA20ox orthologs, CsGID1B, CsSLY2, and several GASA genes, indicating broad remodeling of GA regulation. Our findings position the GA pathway as a downstream module of ethylene-driven sex reversal in C. sativa, with GA activity tracking floral sexual identity, extending the framework of sexual plasticity beyond ethylene, and identifying candidate genes for functional validation and the development of sex-stable cultivars.