Genetics
Oxford University Press (OUP)
Preprints posted in the last 90 days, ranked by how well they match Genetics's content profile, based on 225 papers previously published here. The average preprint has a 0.21% match score for this journal, so anything above that is already an above-average fit.
Gupta, M.; Holmes, C. M.; Belousova, J.; Gopalakrishnan, S.; Rego-Costa, A.; Desai, M. M.
Mapping the genetic basis of complex traits is complicated by the presence of epistatic interactions between loci. While work in molecular genetics identifies numerous specific genetic interactions, statistical analyses of quantitative traits frequently conclude that additive (nonepistatic) models explain most heritable variation. However, these conclusions are typically limited by the narrow range of genetic relatedness (e.g., in F1 offspring of a biparental or circular cross). Here, we use a barcoded panel of Saccharomyces cerevisiae genotypes with a broad range of relatedness to quantify the effects of epistasis on the genetic architecture of seven complex traits. We find limited contributions of epistasis to the genetic basis of these traits. These results indicate that epistasis beyond that detected in standard yeast crosses may exist, yet it contributes little to phenotypic variance in these systems.
Lopez-Cortegano, E.; Charlesworth, B.
A sudden reduction in population size increases the rate of genetic drift, reducing variability and increasing the mean level of homozygosity. The resulting increased exposure of recessive or partially recessive, strongly deleterious alleles to selection against homozygotes may lead to their being purged from the population, potentially allowing mean fitness to increase after an initial decline, and accelerating the decline in inbreeding depression associated with reduced variability. However, detailed population genetic theory on the effects of population bottlenecks on mean fitness and inbreeding depression remains limited. We develop a theoretical framework for small, randomly mating populations founded from a large population near mutation-selection-drift equilibrium, using both simulations and approximate analytical predictions. These provide quantitative predictions for the dynamics of the population's mean fitness and level of inbreeding depression following a bottleneck. In particular, we derive an approximate expression for the time needed for mean fitness to recover after an initial decline; such a recovery requires selection to be sufficiently strong relative to drift and mutations to be sufficiently recessive. In contrast, weakly deleterious mutations cause reductions in mean fitness and inbreeding depression that are similar in size to those predicted from increases in neutral homozygosity.
Li, J.; Hermisson, J.; Sachdeva, H.
We study one of the simplest scenarios of polygenic selection that can be imagined: a subdivided population of diploid individuals expressing an additive trait under spatially homogeneous stabilizing selection. We are interested in the amounts of variation that can be maintained at mutation-selection-migration-drift equilibrium, at individual loci and at the level of the trait, within and among subpopulations. We derive analytical approximations for variance components and summary statistics such as FST and QST under the assumptions of the infinite-island model and compare these with individual-based simulations. We find that: (i) There is a critical migration threshold (which depends on effect sizes of trait loci) below which population structure strongly inflates genic variance in the subdivided population to levels well above those in a panmictic population. Variation within each subpopulation is maximized close to the critical migration rate. (ii) The genetic basis of trait variation across subpopulations is most similar close to this migration threshold and (counter-intuitively) decreases for higher migration rates. This has consequences for the portability of Genome-Wide Association Studies (GWAS) between subpopulations, i.e., the extent to which loci with large contributions to variance in one subpopulation explain variance in other subpopulations. (iii) An analytical mean-field approach based on the single-locus diffusion approximation, together with effective migration and selection parameters (to account for associations between loci), very accurately predicts various quantities.
Lee, H.; Terhorst, J.
Across many complex traits, genetic variants with larger effect sizes tend to occur at lower frequencies, which is often interpreted as a signature of stabilizing selection. In statistical genetics, the so-called α-model captures this relationship by assuming that effect size variance is inversely proportional to heterozygosity raised to a power 0 ≤ α ≤ 1. Although empirically useful, the α-model is phenomenological rather than mechanistic and lacks a direct population-genetic interpretation. In this paper, we derive an alternative to the α-model based on evolutionary theory. Our approach yields a linear mixed model in which the frequency dependence of effect size emerges naturally as a function of interpretable evolutionary quantities describing mutational variance, selection intensity, and coupling between the focal and selected traits. These quantities enter through two identifiable variance components that can be estimated by restricted maximum likelihood (REML). The resulting framework links a fitness-landscape model to standard mixed-model methodology, enabling both inference on evolutionary parameters and downstream prediction by best linear unbiased prediction (BLUP). In forward simulations, the model accurately recovers the focal-trait variance and generally improves genetic prediction relative to conventional α-model baselines.
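For reference, the frequency dependence that this abstract attributes to the α-model can be sketched in symbols. The notation is an assumption introduced here for illustration and is not taken from the preprint: β_j is the per-allele effect of variant j, p_j is its allele frequency, and σ² is a scale constant.

```latex
\operatorname{Var}(\beta_j \mid p_j) \;=\; \sigma^2 \,\bigl[\, 2 p_j (1 - p_j) \,\bigr]^{-\alpha},
\qquad 0 \le \alpha \le 1 .
```

Under this form, α = 0 makes effect-size variance independent of frequency, while α = 1 makes the expected contribution 2p_j(1 - p_j)β_j² of each variant to trait variance the same at every frequency (assuming additive effects and Hardy-Weinberg proportions).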
Offenstadt, A.; Billiard, S.; Giraud, T.; Veber, A.; Jay, P.
Understanding how mutations evolve on Y chromosomes is central to explaining the origin, diversity and persistence of sex chromosomes. Mutations occurring on the Y chromosome in sexual populations experience selective dynamics that differ markedly from those on autosomes, due to a reduced effective population size and the presence of large non-recombining regions containing alleles maintained in a permanently heterozygous state. These specific features alter gene transmission in the Y chromosome population compared to autosomes, even within the same pedigree. Here, we provide a two-sex diploid Wright-Fisher model that explicitly incorporates both sex chromosomes and autosomes within a unified population framework, in order to capture the influence of these specificities on the fate of mutations, not only considering fixation probabilities but also segregation times. We use diffusion approximations and provide analytical and numerical tools to compute these quantities across a wide range of parameters and selection regimes. We recover classical results on fixation probabilities in various scenarios, including purely beneficial, deleterious or overdominant mutations, and extend them in the light of mean segregation time, a key but often overlooked determinant of evolutionary outcomes over finite timescales. In particular, our analyses show that overdominant mutations are overall more likely to fix in observable time windows on the Y chromosome than on autosomes. Individual-based simulations corroborate our approximations and highlight parameter regimes where the theoretical approach is particularly useful, especially for parameter values inducing long segregation times or small fixation probabilities, for which simulations are impractical. Our results provide a comprehensive and tractable framework for clarifying how chromosome-specific features shape evolutionary dynamics beyond fixation probabilities alone.
Bush, Z. D.; Conery, J. S.; Wilson, H. R.; Naftaly, A. F.; Dinwiddie, D.; Hillers, K. J.; Libuda, D. E.
Crossover recombination events during meiosis repair double-strand DNA breaks and ensure accurate chromosome segregation in most organisms. For many species, the genomic distribution of crossovers is nonrandom and sexually dimorphic. While many species evolved kilobase-scale "hotspots" for crossover formation, the Caenorhabditis elegans genome lacks hotspots, and crossovers are enriched across megabase-scale domains. Further, genetic and cytological studies indicate the crossover frequency in C. elegans spermatogenesis is higher relative to oogenesis in many but not all genetic intervals. To determine the genomic features that contribute to the sexually dimorphic recombination landscape in the absence of hot spots, we defined and analyzed the recombination landscape across the whole genome in C. elegans using whole-genome sequencing and high-resolution recombination mapping in single worms bearing recombinant chromosomes from individual sperm and oocytes. We find that the spatial distribution of crossovers is sexually dimorphic on chromosomes I, II, and III, and that the global rate of double-crossover events is 4.7-fold higher in spermatocytes. Additionally, we find that pairing and synapsis may contribute to the sexually dimorphic crossover landscape. In comparison to the spermatocyte crossover landscape, a higher proportion of oocyte crossovers are formed in the domains directly adjacent to the pairing centers of each chromosome. Further, reducing the genetic dosage of the synaptonemal complex central region protein SYP-2, which is a meiotic chromosome structural protein required for homologous chromosome synapsis, reshapes the oocyte crossover landscape to resemble observations in wild-type spermatocytes. Finally, we found that spermatocyte crossovers are partially enriched in H3K36me3-marked euchromatic regions, while many oocyte crossovers are enriched in H3K27me3-marked heterochromatic regions. 
Taken together, our studies reveal how synaptonemal complex component dosage and local chromatin states influence crossover placement and the sex-specific regulation of meiotic recombination.
Author Summary: Production of viable eggs and sperm depends on accurate chromosome segregation during meiosis. Segregation of parental copies of homologous chromosomes requires the reciprocal exchange and physical linkage of DNA that arises through crossover recombination. Increasing evidence indicates the existence of sexual dimorphisms during meiotic recombination. In this study, we generated and analyzed high-resolution recombination maps specific to spermatogenesis and oogenesis in the nematode C. elegans, which reveal sex-specific crossover distributions and a higher rate of crossing over in sperm cells. Further, we show how specific chromosomal features and structures differentially affect the crossover landscape in eggs versus sperm. Our work highlights how, in a system lacking pre-defined "hotspots" for recombination, local chromatin structures, chromosomal pairing domains, and the abundance of synaptonemal complex proteins are potential drivers for establishing the observable sex differences in crossover recombination.
Courau, P.; Schertzer, E.; Lambert, A.
We study a polygenic trait under stabilizing selection at statistical equilibrium, where genetic effect, mutation rate and mutational bias are heterogeneous across loci. The model assumes L biallelic sites subject to reversible mutations, each allele described by its frequency in the population. Using a diffusion approximation, a mean-field approximation and neglecting linkage disequilibrium, we predict consistent phenomena across several regimes of selection: (1) a small deviation Δ* of the trait mean from its optimal value appears and persists due to genetic mutations not aligned with selection; (2) while this deviation is often undetectable at the trait level, it leaves a substantial signature at the locus level by favoring alleles reducing it, resulting in genic selection with mean coefficient s* proportional to -Δ* acting pervasively; (3) with stronger selection on the trait, (3a) the value of Δ* is decreased but the intensity of genic selection is increased in inverse proportion, resulting in an essentially constant, non-negligible value of s*. We show how the stationary distribution of allelic frequencies can be obtained from Δ*. The latter can then be characterized as the solution to a fixed-point equation. Finally, we quantify several macroscopic observables of interest (genetic variance, description of the fluctuations of the trait mean as an Ornstein-Uhlenbeck process). The orders of magnitude of the macroscopic observables can be derived on a wide region of the parameter space. The model shows good fit and can straightforwardly be extended to accommodate pleiotropy, dominance, and some forms of epistasis. We also discuss the different breakdowns that may occur (Bulmer effect, Hill-Robertson effect, breakdown of the Ornstein-Uhlenbeck approximation for the dynamics of the trait mean, depletion of genetic variability due to low mutation rates).
Martinez-Rodriguez, L. E.; Bell, S. P.
The origin recognition complex (ORC) selects origins of replication and directs the loading of the Mcm2-7 replicative helicase at these sites. Five of the six ORC subunits are related to the AAA+ family of ATPases. Although functions for ATP hydrolysis by Cdc6 and the Mcm2-7 complex have been described, the essential role of ORC ATP hydrolysis remains unclear. We performed a genetic screen in Saccharomyces cerevisiae for suppressors of the lethal phenotype of the orc4-R267A allele, which disrupts ORC ATP hydrolysis in vitro. We identified six causative mutations, five of which are distributed across different ORC subunits. The suppressor mutations in Orc1 and Orc4, but not the other ORC subunits, increase the in vitro helicase loading activity of ATPase-defective ORC (ORC4R). Allele specificity studies showed the alleles specifically suppress defects at ATPase interfaces within the ORC-Cdc6 complex. The sixth allele is a mutation in TOA2, a subunit of the TFIIA general transcription factor. Mutations in the general transcription factors TBP and TFIIB, and the large subunit of RNA Polymerase II also suppressed the orc4-R267A lethality, suggesting that reducing transcription is sufficient for suppression. Our study identifies multiple ways to suppress the lethal phenotype of an ATPase defective ORC allele and reveals a connection between ORC ATP hydrolysis and transcription.
Stetsenko, R.; Merot, C.; Glemin, S.; Roze, D.
Several recent studies have quantified signed linkage disequilibrium (LD) among mutations in genomic datasets, often reporting positive LD, particularly among mutations presumed to be less deleterious, such as synonymous variants. In this article, we investigate two potential sources of this positive LD: the focus on rare alleles, as adopted in several previous studies, and errors arising in the mapping of short-read sequences onto a reference genome. Using coalescent simulations, we extend previous theoretical results of the effect of focusing on rare alleles, and show that derived alleles present at similar frequencies tend to be in positive LD. Reanalyzing datasets from Capsella grandiflora and Drosophila melanogaster, we show that LD among synonymous derived alleles vanishes in the absence of any conditioning on frequency, while LD between mutations categorized as potentially deleterious by the SIFT4G program stays positive. However, we show that in both cases, this positive LD may be at least partly caused by the potential mismapping of a small fraction of sequences in some individuals, which could be a consequence of structural variants that are absent from the reference genome. Overall, these results show that average signed LD among mutations can be strongly affected by technical artifacts even if these concern only a minority of variants. Finally, we discuss other possible sources of positive LD among deleterious mutations.
Kocik, R. A.; Ahrens, J.; Gasch, A. P.
Yeast responding to acute stress reallocate cellular resources, in part via the Environmental Stress Response (ESR) that induces stress-defense genes while repressing ribosome-biogenesis and growth genes. The purpose and regulation of coordinated induction and repression is incompletely understood, but both responses are influenced by ESR transcription factors Msn2 and Msn4 (Msn2/4). Here we used single-cell microscopy and transcriptomic analysis to investigate the role of upstream regulator Pde2 in ESR regulation and post-stress fitness. Loss of PDE2 weakened and shortened Msn2 activation following salt stress and produced muted induction of Msn2/4 targets, similar to an msn2Δ msn4Δ strain. In contrast, Pde2 had at most a minor impact on ESR repressor Dot6, yet was important for repression of its targets beyond Msn2/4 influence. Consistent with our recent resource-reallocation model, pde2Δ cells had normal or faster post-stress growth rates, despite weaker activation of the ESR. We discuss implications for ESR regulation and function.
Chandra, S.; Gao, Z.
Recent studies have reported consistent inter-population differences in GC content at polymorphic sites in multiple species, including humans. Specifically, populations that experienced recent bottlenecks exhibit lower average GC content (GC%) at common polymorphic sites compared to non-bottlenecked groups, an observation previously interpreted as an indication of rapid evolution of base composition. In this study, we investigate the evolutionary and technical factors driving these patterns across humans, mice, maize, and silkworm. We find that GC% at polymorphic sites is highly sensitive to the allele frequency threshold applied. Relaxing this threshold reduces inter-population differences to negligible levels in humans and significantly attenuates similar signals in other species. We further observe substantial GC% variation across allele frequency bins, a pattern driven by the differential abundance of different mutation types. We demonstrate that these observations are collectively driven by an interaction between demographic history and a universal excess of strong-to-weak mutations relative to weak-to-strong mutations, which is counteracted by GC-biased gene conversion (gBGC) over long evolutionary timescales. Forward-in-time simulations with realistic parameters recapitulate observed patterns of GC% variation across both populations and allele frequency bins. Overall, our findings reveal that the base composition at polymorphic sites is strongly shaped by the interaction between demographic history, mutation bias, and gBGC, and does not represent stable, genome-wide trends. Consequently, inter-population differences in GC content, especially at common variants, should not be interpreted as evidence of ongoing divergence in base composition or shifts in mutation patterns.
Tarkington, J. A.; Sherlock, G. J.; Mahadevan, A.
Stationary phase in yeast and other microorganisms begins when a limiting nutrient in the environment is exhausted and cell division ceases. Most cells subsequently enter quiescence and lose viability. In spent media, without metabolic byproducts being diluted, cellular processes can modify the environment and cause the relative growth rates of different genotypes to vary over the course of stationary phase. In this work we experimentally evolve S. cerevisiae in batch culture, varying the time spent in stationary phase between growth cycles. We measure the relative fitness of the resulting adaptive clones across a range of environments: with different amounts of time in stationary phase and in two different carbon sources. By comparing the inferred performance (relative growth rate during a period of the growth cycle) of a mutant to that of its ancestor, we can estimate the effects of each observed mutation on performance during various phases of growth. We show that when an adaptive mutation emerges in growth cycles that include a stationary phase, its effect on stationary phase performance is largely independent of the type of carbon source provided. However, for the same group of mutants, mutational effects on performance in early stationary phase are negatively correlated with those effects in late stationary phase, suggesting a trade-off. We also show that increased intervals of stationary phase result in larger fitness effects of adaptive mutations and distinct routes of adaptation. Together, these results demonstrate that stationary phase consists of more than one distinct fitness-related phenotype, and that the phenotypes that allow for high performance in the first few days of stationary phase trade off with those that allow for high performance in later stationary phase.
Miao, X.; Edge, M. D.; Harpak, A.
Standard genome-wide association studies (GWASs) are vulnerable to confounding factors, including stratification, assortative mating, and dynastic effects. Family studies such as sibling-based GWAS (sib-GWAS) mitigate such confounding and are becoming the tool of choice for teasing apart direct genetic effects (causal effects of one's genotype on one's own phenotype) from other factors. However, due in part to their smaller sample sizes, sib-GWAS allelic effect estimates are substantially more variable than standard (i.e., population-based) GWAS estimates. The quantification of this uncertainty is essential for many uses of sib-GWAS, including polygenic scoring, causal inference (e.g., Mendelian randomization), disentangling direct from indirect familial effects, and measuring assortative mating. Here, we investigate sources of uncertainty in sib-GWAS allelic effect estimators. We study their impacts on the biases of three uncertainty measurement methods, including two that are commonly used and a new resampling-based approach we propose. We find that heterogeneity in allelic effects or heteroskedasticity across families (e.g., due to variation in genetic backgrounds or environments) can bias existing methods, and that this bias is more severe for small samples and rare variants. In contrast, the resampling-based approach we propose is approximately unbiased under all scenarios we considered. We validate our theoretical predictions, as well as the importance of effect heterogeneity and heteroskedasticity, using simulations and empirical analysis in the UK Biobank. In sum, this study helps understand the sources of uncertainty in family-based genotype-phenotype association studies and provides a robust method to estimate uncertainty.
Sidarava, V.; Lydall, D.
Eukaryotes typically maintain telomere length within a defined range. While short telomeres are known to activate DNA damage responses and limit cell proliferation, long telomeres are associated with extended proliferative capacity. The broader cellular consequences of long telomeres are comparatively less well understood. In budding yeast Saccharomyces cerevisiae, long telomeres have been shown to influence gene expression at specific loci, but whether long telomeres affect transcription genome-wide has not been reported. Here, we analysed transcriptomes in a lineage that inherited long telomeres (originally due to a rif2Δ mutation). Transcriptomes were assessed over two rounds of mitosis and meiosis in the absence of the rif2Δ mutation. We show that strains with long telomeres exhibit a distinct gene expression profile, including upregulation of membrane transporters and downregulation of a smaller subset of genes. Both up- and down-regulated genes were distributed across the genome, arguing against a purely telomere-proximal effect on gene expression. Affected genes were enriched for Rap1 binding sites, consistent with a model in which long telomeres sequester telomere-associated transcriptional regulators, such as Rap1, and thereby affect gene expression at non-telomeric binding sites for these regulators. Accordingly, the magnitude of transcriptional changes was greatest in strains with the longest telomeres. Together, our findings demonstrate that long telomeres induce a genome-wide transcriptional response that can accompany inherited long telomeres across generations. Similar effects of long telomeres are likely to occur in other eukaryotes, including humans, where long telomeres are associated with disease.
Article summary: Telomeres protect chromosome ends, and their length is tightly regulated. While short telomeres are known to be harmful, the effects of long telomeres are less well understood.
Using budding yeast, we show that inherited long telomeres alter the expression of dozens of genes across the genome, particularly membrane transporters. These changes are consistent with a model in which long telomeres sequester regulatory proteins away from other loci. Our findings may have broader implications in more complex organisms, including humans.
Wang, Z.; Champer, J.
Gene drives are genetic elements that can rapidly spread through populations, offering potential solutions for controlling disease vectors and pests. In some scenarios, it is necessary to utilize drives that can be confined to only target populations. The success of these threshold-dependent gene drives, which require a minimum local frequency to establish, depends critically on the spatial strategy used for introduction. Here, we use a reaction-diffusion model to systematically identify optimal release patterns that maximize the per-capita efficiency for four distinct gene drive designs as well as use of Wolbachia bacteria, which spread similarly to frequency-dependent gene drives. We find that the most efficient release strategy is highly dynamic, transitioning from a broad "everywhere" release for short timeframes to a "multiple-ring" pattern for intermediate times, and finally to a focused "center" release for longer timeframes. These timeframes depend on the specific type of drive, with more powerful variants transitioning more quickly to center releases. Our results demonstrate that these optimized, variable release strategies can be substantially more effective than simple uniform releases. This study provides a quantitative framework for designing effective gene drive implementations, highlighting that a carefully planned spatial strategy is essential for maximizing impact, making optimal use of available resources.
Treaster, M.; White, M. A.
Many taxa have evolved heteromorphic sex chromosomes like the XY system found in mammals. In addition to the sex determination gene, which directs development of the gonad into an ovary or testis, sex chromosomes can have drastically different gene content, leading to substantial genetic differences between genetic males and females beyond their gonad identity. Studying the effects of these genetic differences is challenging, as the sex chromosomes and sex determination gene are inherited together, so the effects of genetic differences between the X and Y cannot be easily isolated from the hormonal differences produced by the ovary and testis. The threespine stickleback fish has a heteromorphic XY sex chromosome system and a wide range of well-documented sex differences in morphology and behaviors, including complex mating behaviors and male-only parental care. Through genetic manipulation of amhy, the newly identified sex determination gene in threespine stickleback, we are able to generate gonadal males and females with either the XX or XY sex chromosome complement and analyze the separate effects of gonadal sex and sex chromosome complement on sexually dimorphic gene expression. We find that sex chromosomes have a larger effect on gene expression than gonadal sex in somatic tissues, while gonadal sex has a larger effect on expression in the gonads. We also find that the X and Y chromosomes are enriched for genes that show differential expression between females and males. Our findings demonstrate the significant biological impact of sex chromosomes outside of primary sex determination and showcase the utility of the threespine stickleback for studying the genetic basis of sex differences.
Kahraman, A.; Wirth, M.; Hammoud, H.; Reslan, M.; Haidar, M. A.; Djuhadi, G.; Mathejzyk, T.; Reifenstein, E.; Balke, J.; von Kleist, M.; Linneweber, G. A.
Phenotypic variation arises from the interplay of genetic, environmental, and stochastic developmental factors. Quantitative genetics predicts that reducing genetic variation through inbreeding or clonality should reduce phenotypic variation, an assumption that underlies the widespread use of inbred and clonal model organisms in biomedical research. Here, we test this assumption in the facultatively parthenogenic fly Drosophila mercatorum, in which parthenogenesis results in complete homozygosity and clonality after a single generation. Contrary to expectation, clonal parthenogenic flies showed broad shifts in trait means, altered interindividual variability, increased fluctuating asymmetry, and reduced behavioral and developmental canalization relative to sexually reproducing controls. Inbreeding reproduced substantial parts of this phenotype, whereas outcrossing restored robustness, identifying loss of heterozygosity as a major driver of the effect. Our findings show that extreme genetic uniformity can amplify rather than constrain stochastic phenotypic divergence, suggesting that controlled heterozygosity may, in some contexts, provide a more robust and reproducible experimental substrate than highly inbred, isogenic, or clonal animals.
Brewer, B. J.; Martin, R.; Ramage, E.; Payen, C.; Di Rienzi, S. C.; Zhao, Y.; Zane, K.; Verhey, J.; Galey, M.; Miller, D. E.; Ong, G. T.; McKee, J. L.; Alvino, G. M.; Dunham, M. J.; Raghuraman, M. K.
Gene amplification is a potent driver of evolution and is thought to contribute to genetic diseases, including cancer. The yeast Saccharomyces cerevisiae is a powerful organism for understanding amplification mechanisms. When yeast is grown long term in sulfate-limiting chemostats, amplification of the gene that encodes the primary sulfate transporter, SUL1, is a common outcome. Here we describe a form of SUL1 amplification in which multiple copies of the right terminal region of chromosome II are appended in tandem to a native telomere. We find this form of amplicon when we delete the origin of replication next to SUL1 or delete a variety of genes involved in DNA metabolism. It is the only form of amplification found in a yku70Δ mutant, suggesting that unprotected telomeres are involved. We propose that these terminal addition events occur when the unprotected 3′ G1-3T telomeric sequence invades a short (~7 bp) internal telomere sequence (ITS) to begin a form of microhomology-mediated break-induced replication (mmBIR) that has been documented in type-I survivors of telomerase mutants. In addition to amplification of the right end of chromosome II, we also find that telomeres containing the sub-telomeric repeat Y′ experience similar tandem amplification events and show that their formation is reduced in a pol32Δ mutant, which lacks a gene required for mmBIR. Within individual amplicons the ITSs and Y′s are nearly identical, suggesting that the multiple copies of the amplified region are generated in a single mmBIR event that we describe as pseudo-rolling circle mmBIR. A similar amplification event at the P-telomere of human chromosome 18 has four copies of a ~54 kb region separated by ITSs of nearly identical size. This finding suggests that these additional copies of the terminal fragment of human chromosome 18 arose by the same pseudo-rolling circle mechanism, perhaps during a period of telomeric stress.
AUTHOR SUMMARY: The human genome is peppered with duplicates (or higher numbers) of segments that are located at sites both nearby and distant from the original, ancestral segments. These Copy Number Variants, or CNVs, appear to be highly variable among different individuals and are being examined with great interest as potential loci associated with genetic disease. Experimentally determining how these CNVs arise and become distributed across the genome is nearly impossible using humans. We are using budding yeast as the model organism to explore mechanisms of gene amplification. In this work we show that destabilizing the ends of yeast chromosomes (telomeres) or interfering with genes involved in the replication, repair, or recombination of DNA results in a specific form of segmental copy number increase that is initiated at telomeres. We propose that a telomere invades an internal chromosome site and sets up a pseudo-circular template for conservative DNA replication. The outcome is a chromosome with multiple, identical copies of a chromosome end arranged in tandem. We believe that it is also a major mechanism used by cells to repair telomeres that have become eroded during aging.
Mischler, M.; Vigue, L.; Croce, G.; Weigt, M.; Tenaillon, O.
Quantifying the selective effects of individual mutations is essential to understand how their population-wise frequencies evolve under natural selection and genetic drift. Large genomic datasets provide a real-life experiment that we exploit to characterize the efficiency of selection across different mutation types and populations. Using Direct Coupling Analysis, a model from statistical physics, we derive protein-informed scores for individual non-synonymous mutations identified in 81,440 Escherichia coli genomes. We show that these scores act as a latent variable capturing the probability that a mutation is beneficial, neutral, or mildly to highly deleterious. We contribute to the debate on the importance of synonymous mutations by demonstrating that their selection intensities span a single order of magnitude in the E. coli species, whereas non-synonymous mutations span six orders of magnitude. We further relate selection efficiency to genetic drift, defined as the inverse of population size, and to ecological lifestyle, and we identify a 10,000-fold reduction in selection efficiency between the entire E. coli species and its most pathogenic populations. Together, these results highlight how population genetics and protein variant fitness predictors inform one another: variation in selection efficiency is associated with shifts in the distribution of mutation scores, and population genetics data provide a benchmark to assess the accuracy of these scores.
Graphical abstract: Schematic representation of the analysis of polymorphism in 81,440 Escherichia coli genomes. 458,443 polymorphic codon sites were identified and oriented using homologous sequences from closely related species. Mutations can be classified as synonymous or non-synonymous based on whether they alter the encoded amino-acid sequence, and real-valued scores predictive of fitness effects can be attributed to mutations within each of these classes. Codon scores reflect the global codon usage preference within the E. coli genome. DCA scores capture position- and amino-acid-specific preference as well as epistatic constraints and are obtained for each protein from a set of distantly related homologous sequences. Coupled with the abundance of polymorphic sites within different E. coli subpopulations, these different polymorphism classifications allow a precise comparison of the intensity of selection between different types of mutations and across populations with distinct lifestyles, illustrated here by their pathogenic power.
Aninta, S. I.; Tewhey, R.; de Boer, C. G.
Gene expression is governed by the DNA sequence, which is read out through complex interactions between transcription factors (TFs), co-activators, and chromatin. Massively Parallel Reporter Assays (MPRAs) provide a high-throughput framework for functionally characterizing how regulatory DNA sequences impact the expression of a model gene. MPRAs have also proven to be useful for measuring the effects of genetic variation, where each allele is typically tested in the center of ~200 bp of genomic context cloned into the MPRA; but the impact of variant position and local context remains largely unexplored. In this study, we systematically investigate how shifting the position of a variant within an MPRA probe influences its regulatory activity using models that predict expression in MPRAs from DNA sequence. We find that while the direction of variant effects is usually preserved across positions, the magnitude of expression changes can vary substantially depending on where the variant is placed within the construct. This positional bias appears to be largely explained by the strong position-dependent activity of TFs whose binding the variants perturb. In a subset of cases, interactions consistent with cooperativity between TFs also contribute to position-specific effects. Approximately 1% of variants appear to disrupt RNA polymerase III (Pol III) promoters within Alu elements, resulting in position-specificity because both A and B boxes are required for function and exclusion of either motif due to window shifts disrupts the variants' effects. However, we saw little evidence to support the hypothesis that the positional dependence of variant effects resulted from redundancy of motifs. Overall, our study demonstrates the complexity of cis-regulatory grammar and how it can confound the interpretation of regulatory variants.
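The idea of shifting a variant's position within a fixed-length probe can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `shifted_probes`, the 20 bp step, and the restriction to single-nucleotide variants are assumptions made for illustration, and the expression-prediction model that would score each probe is not shown.

```python
def shifted_probes(genome_seq, variant_pos, ref, alt, probe_len=200, step=20):
    """Build reference/alternate probe pairs with the variant at varying offsets.

    Hypothetical helper: assumes a single-nucleotide variant at the 0-based
    index `variant_pos` of `genome_seq`. Returns a list of
    (offset, ref_probe, alt_probe) tuples, where `offset` is the variant's
    position inside the probe.
    """
    if len(ref) != 1 or len(alt) != 1:
        raise ValueError("this sketch handles single-nucleotide variants only")
    if genome_seq[variant_pos] != ref:
        raise ValueError("reference allele does not match the sequence")

    probes = []
    for offset in range(0, probe_len, step):
        start = variant_pos - offset
        end = start + probe_len
        # Skip windows that would run off either end of the sequence.
        if start < 0 or end > len(genome_seq):
            continue
        ref_probe = genome_seq[start:end]
        # Substitute the alternate allele at the variant's in-probe position.
        alt_probe = ref_probe[:offset] + alt + ref_probe[offset + 1:]
        probes.append((offset, ref_probe, alt_probe))
    return probes
```

Under these assumptions, each reference/alternate pair would be passed to a sequence-to-expression model, and the predicted difference compared across offsets to see how the variant's apparent effect depends on its placement in the probe.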