Genes
MDPI AG
Preprints posted in the last 90 days, ranked by how well they match Genes's content profile, based on 126 papers previously published here. The average preprint has a 0.10% match score for this journal, so anything above that is already an above-average fit.
Ancelin, K.; Somasundaram, P.; Galupa, R.
The X chromosome (chrX) is the eighth largest human chromosome, harbouring an estimated total of 839 protein-coding genes. Historically, the chrX has been described as enriched for genes related to brain development, sexual differentiation and reproduction, earning the epithet of "smart and sexy chromosome". Many studies have confirmed that the chrX is indeed "smart", including a recent systematic analysis of human chrX genes which found an enrichment in genes relevant to brain functions. However, it is less clear whether the chrX being "sexy" still holds true. Here we reviewed the origins of this idea and we evaluated human X-linked genes in terms of their expression across several tissues, their annotated functions and their association with monogenic disorders related to sexual differentiation and reproduction (SDR). We found that sex-specific tissues show higher expression levels from chrX genes than from autosomal genes except in testis, but that X-linked genes are significantly enriched among the most highly expressed genes in testis, specifically within spermatogonia and Sertoli cells. Yet, we found no evidence for an enrichment of genes on the X with annotated functions related to male or female SDR. When analysing SDR-related monogenic disorders, we found a significant enrichment of genes on chrX associated with clinical terms related to male SDR but not with clinical terms related to female or general SDR. Overall, our results support the notion of a somewhat "sexy" X chromosome, shaped by X-linked expression patterns and clinical associations rather than current annotated gene functions.
Yasar, B.; Org, T.; Ivask, M.; Yazgeldi Gunaydin, G.; Boskovic, N.; Jaakma, U.; Kere, J.; Kurg, A.; Katayama, S.
Background: DUXC is a multi-copy transcription factor gene found within a long tandem repeat locus in several Laurasiatherians. It is suggested to be functionally similar to human DUX4 because of its shared C-terminal domain and its close phylogenetic relationship to DUX4. DUX family genes are transiently expressed in preimplantation embryos of placental mammals. However, early embryo-derived cDNA proof for DUXC, which is needed for its further functional characterization, has not been reported so far. Results: Our study provides a full-length sequence of DUXC mRNA, derived from 8-cell stage in vitro fertilization (IVF) bovine embryos, containing double homeobox and 9aa transactivation domain (9aaTAD)-encoding sequences. The identified DUXC sequence uncovered a first exon that was not previously annotated. We showed that DUXC mRNA levels are independent of embryonic transcription at the 2-, 4-, and 8-cell stages, whereas its decline, observed from the 8-cell stage onwards, depends on minor embryonic genome activation (EGA). We also investigated the genomic organisation of the DUXC array in eight different cattle breed assemblies, revealing polymorphic internal repeats flanked by an incomplete distal unit at the telomeric end and a much shorter unit at the proximal end of the DUXC array. Despite the presence of a putative polyadenylation signal downstream of the distal unit, we presented evidence for the expression of internal but not distal DUXC in early bovine IVF embryos. Conclusions: DUXC is a potential bovine EGA inducer, supported by its expression at peak levels at pre-EGA stages followed by a decrease with a dependency on minor EGA.
Shen, J.; Tang, S.; Xia, Y.; Qin, J.; Xu, H.; Tan, Z.
Background: Conventional models of human ribosomal DNA (rDNA) array organization have historically depended on transcription-centric boundaries, partitioning the unit into a ~13 kb rDNA transcription region and a monolithic ~31 kb intergenic spacer (IGS). While our previous identification of Duplication Segment Units (DSUs) mapped these arrays based on an intuitive analysis of the microsatellite density landscape of the complete reference human genome, our present deep mining of this landscape has revealed a more accurate rDNA Gene Unit Pattern. Methods & Results: In this study, we conducted a deep mining analysis of our previously established microsatellite density landscape of the T2T-CHM13 assembly, focusing specifically on nucleolar organizing regions (NORs). We suggest a more accurate rDNA Gene Unit Pattern containing a (CTTT)n microsatellite aggregation ahead of the rDNA gene and a (CT)n microsatellite aggregation behind the gene, rather than a pattern featuring an IGS region inserted between two rDNA genes. Conclusions: A correct rDNA gene pattern of the human genome probably includes a (CTTT)n microsatellite aggregation ahead of the gene and a (CT)n microsatellite aggregation behind it, which possibly constitute cis- and trans-regulating regions; the (CTTT)n and (CT)n microsatellite aggregations may provide two different local stable DNA structures for regulatory protein binding.
Nair, S.; Singh, D.; Saha, A.; Datta, B.; Majumdar, S.
Long non-coding RNAs (lncRNAs) account for a major proportion of the transcriptional output in complex organismal genomes. Their emergence as auxiliary regulators of gene expression as well as their roles in metastasis and cancer progression has put them in the limelight. LncRNAs perform multitudes of functions and often moonlight as regulators, scaffolds and guides. Most lncRNAs are cell and tissue specific and can act as markers for diseases as well as targets for therapeutic interventions. LncRNAs are also known to make use of higher order structures such as G-quadruplexes (G4) to facilitate complex functions and interactions. THAP9-antisense1 (AS1) is a lncRNA coding gene (recently annotated by Ensembl) that codes for 12 lncRNA transcripts and has been implicated in many disease pathologies like gastric cancer, spontaneous neutrophil apoptosis, hepatocellular carcinoma, and the progression of oesophageal cancer. It is the antisense gene pair of the THAP9 gene ( a transposase derived gene) with which it shares a promoter. THAP9-AS1 has been reported to be dysregulated during stress and several cancers. However, the exact role of the lncRNA is not well understood. Bioinformatics driven strategies are used to identify putative quadruplex forming sequences (PQSs) within the lncRNA THAP9-AS1. The identified PQSs are further validated using biophysical, spectroscopic and molecular biology driven techniques. The importance of each G-tract in the formation of a particular RNA G-quadruplex (rG4) is studied via the investigation of several deletion mutants. The findings demonstrate the rG4 forming potential of the identified PQSs within THAP9-AS1.
Li, T.; Wang, Y.; Zhang, Z.; Chen, C.; Zheng, N.; Wang, J.; Ning, M.; Wang, J.; Ai, H.; Huang, Y.
Background: Although the biological mechanism for heterosis has been debated for a long time, heterosis is widely utilized to increase the global productivity of crops and livestock. Recently, the mechanism has been well characterized in crops and livestock with a male-heterogametic XY system due to genomic assembly advancements, especially the availability of haploid genomes. However, the biological mechanism for heterosis remains unclear in poultry possessing the female-heterogametic ZW system. Results: Here, we assembled chromosome-level diploid and haploid genomes of the Muscovy duck. We developed an efficient and cost-effective method to assemble 12 variation graph-haploid Muscovy duck genomes from three full-sibling pairs with high quality using short-read Illumina sequences. We further characterized genetic, expression and regulatory patterns of parental alleles at multiple scales. We found that maternal haploid genomes generally had more open chromatin organization, higher accessibility, and higher levels of gene expression, while showing similar DNA methylation levels when compared to paternal haploid genomes. In contrast, the female paternal Z chromosome showed the most relaxed chromatin organization, the highest chromatin accessibility and the highest gene expression, and the male paternal Z chromosome showed more of each than the male maternal Z chromosome. Thus, the ZW system largely relies on compensation and balance to regulate gene expression on the sex Z chromosome. Moreover, we identified non-Mendelian regions covering 0.26% of the genome (~3.18 Mb). These regions contained lower gene density, GC content, and repeat sequence frequency, but were enriched for DNA motifs bound by transcription factors, likely leading to a compacted chromatin structure and lower chromatin accessibility. Conclusions: Our work provides a comprehensive profile of the genetic, expression and regulatory patterns of parental alleles in the female-heterogametic ZW system, and might be useful for the utilization of heterosis in poultry.
Cacheux, L.; Dutrillaux, B.; Gerbault-Seureau, M.; Nicolas, V.; Ponger, L.; Bed'Hom, B.; Escude, C.
Background: Alpha satellites, a superfamily of AT-rich tandem repeats, are the primary DNA component of centromeres in Platyrrhini and Catarrhini. Analyses of the human genome suggest that centromeres behave like biological ridges, with new alpha satellite families expanding at the centromere core, splitting and displacing older ones towards the pericentromeres. The Cercopithecini tribe, which displays an unusual chromosomal evolution involving multiple chromosomal fissions and centromere formations, represents a promising model to enhance our understanding of alpha satellite DNA evolutionary history. We previously applied targeted sequencing to centromere DNA from two distant species drawn from the Cercopithecini terrestrial and arboreal lineages, and characterized six alpha satellite families exhibiting varying mean sequence identities. Methods: Combining classical and molecular cytogenetics, we mapped the chromosomal distribution of these alpha satellite families across 13 Cercopithecini, one Papionini, and one Colobinae species. A nuclear marker-based phylogeny provided an evolutionary framework for interpretation. Results: Our phylogeny identifies the terrestrial and arboreal lineages, and a newly designated swamp clade. We observed significant interspecies variations in alpha satellite patterns, including differences in presence/absence and distinct chromosomal distribution patterns (centromeric, pericentromeric, or subtelomeric). Families previously described as heterogeneous (83-87% mean sequence identity) exhibit a centromeric position in the swamp lineage, which is characterized by conserved karyotypes. In contrast, these families show a pericentromeric distribution in the terrestrial and arboreal lineages, replaced at the centromere core by more homogeneous families (95-98% mean sequence identity). In the arboreal clade, which is characterized by highly fissioned karyotypes, putative evolutionary new centromeres show a unique co-occurrence of highly homogeneous and heterogeneous families. Conclusion & Implications: We propose a comprehensive evolutionary scenario for alpha satellite DNA in Cercopithecini, where younger families arise at the centromere core, shift toward the pericentromeres as they age, and eventually face extinction. Our study suggests that alpha satellite DNA and chromosomes evolve in an interdependent manner, with satellite diversification and displacement occurring in parallel with chromosome fissions and centromere repositioning. This comparative cytogenomic approach provides both support for the human-based evolutionary model for alpha satellite DNA and novel temporal insights into its diversification dynamics. Beyond evolutionary genomics, our findings highlight the potential of alpha satellite DNA to complement systematic studies in deciphering complex primate evolutionary histories.
Axelsson, J.; Bruhn-Olszewska, B.; Sarkysian, D.; Markljung, E.; Horbacz, M.; Pla, I.; Sanchez, A.; Nenonen, H.; Elenkov, A.; Dumanski, J. P.; Giwercman, A.
Cancer-related genomic instability (GI) may cause genetic alterations in spermatozoa, implying health issues not only in cancer survivors, but also in their children [1, 2]. We therefore studied loss of Y chromosome (LOY), considered a hallmark of GI [3-15], in spermatozoa and blood from survivors of childhood and testicular cancer (CC, TC), and controls (CTRL). We found that LOY was statistically significantly more frequent in spermatozoa from cancer survivors than in controls (Odds Ratio [OR]=2.2 for CC vs. CTRL and OR=2.4 for TC vs. CTRL). Furthermore, LOY was about an order of magnitude more prevalent in spermatozoa than in blood among 18-53-year-old males within all cohorts. Our findings suggest that LOY in spermatozoa might be a clinically useful marker of GI, reduced fertility and disease predisposition in males. Introducing LOY in spermatozoa as a biomarker opens a new research avenue into disease prevention and the causes and consequences of LOY.
Laskowski, L. F.; Gruys, M. L.; Huber, R.; DiGeronimo, A.; Arsham, A. M.; Chandrasekaran, V.; Rele, C. P.; Boies, L.
Gene Model for Insulin-like peptide 4 (Ilp4) in the D. simulans DsimGB2 assembly (GCA_000754195.3). The characterization of this ortholog was carried out as part of a larger, ongoing dataset designed to explore the evolution of the insulin/insulin-like growth factor signaling (IIS) pathway across the genus Drosophila, utilizing the Genomics Education Partnership gene annotation protocol within Course-based Undergraduate Research Experiences.
Fernandez Figueroa, V.; Quercia, C. A.; Gallastegui-Ulloa, J.; Robeson, L.; Brauchi, S. E.
G-protein coupled receptors (GPCRs) are responsible for translating environmental signals of various types into cellular signals. Over 40 thousand GPCR orthologs have been discovered in the supergroup Unikonta, and around 800 genes encode GPCRs in the human genome. In contrast to this astonishing variety, only a handful of GPCR-related genes have been reported in vascular plants, a major group within land plants. In an attempt to advance our understanding of plant GPCRs as well as their role in plant cellular signaling, here we present a comprehensive bioinformatic analysis that includes phylogenetic hypotheses, in silico structural analysis, and tissue distribution of transcripts. Altogether, our work strongly suggests that GCR1 is the sole genuine GPCR expressed in Embryophyta. Finally, we briefly discuss the potential role of GCR1 in root hairs, the tubular outgrowths in root epidermal cells that are involved in nutrient absorption, environmental interaction, and root development.
Pellegrini, M.; Kim, R.; Rubbi, L.; Kislik, G.; Smith, D.
The measurement of inbreeding has gained significance across diverse fields, including population and conservation genetics, agricultural genetics, breeding programs for animals and plants, and wildlife management. This is due to the fact that inbreeding leads to increased homozygosity and results in lower genetic diversity, rendering populations more vulnerable to environmental changes, diseases, and other stressors. High or mid-coverage whole genome sequencing (WGS) has been widely used for inbreeding estimation, but it is resource-intensive. We aimed to investigate the use of ultra low-coverage whole genome sequencing (ulcWGS) as a cost-effective alternative for inbreeding analysis. Domestic dogs were used for our study as their extensive breeding histories lead to populations with a wide range of inbreeding levels. We constructed a multi-breed reference panel from high-coverage WGS samples. Inbreeding in independent ulcWGS samples was then estimated using runs of homozygosity (RoH) and inbreeding coefficients (F). We modeled the relationship between these measures and sequencing depth using nonlinear regression, to generate inbreeding estimates relative to sequencing depth. Resulting relative RoH and F measurements were significantly correlated, with purebred dogs exhibiting more runs of homozygosity and higher inbreeding coefficients compared to mixed-breed dogs. Our findings demonstrate that ulcWGS can provide reliable and economical estimations of inbreeding, expanding accessibility to genetic monitoring.
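As a rough illustration of how runs of homozygosity can be turned into a genome-wide inbreeding measure, the sketch below computes F_ROH, the fraction of the genome covered by RoH segments. This is a commonly used definition, but the abstract does not state that the preprint uses it; the function name `f_roh`, the `min_length_bp` filter, and all numbers are hypothetical and chosen only for illustration.

```python
def f_roh(roh_segments, genome_length_bp, min_length_bp=0):
    """Fraction of the genome covered by runs of homozygosity (F_ROH).

    roh_segments: iterable of (start_bp, end_bp) pairs, assumed non-overlapping.
    genome_length_bp: length of the genome region that was scanned.
    min_length_bp: hypothetical filter that ignores very short segments.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return covered / genome_length_bp


# Made-up segments on a made-up 100 Mb genome (not data from the preprint).
segments = [(0, 2_000_000), (5_000_000, 6_000_000)]
value = f_roh(segments, genome_length_bp=100_000_000)
print(f"F_ROH = {value:.3f}")
```

At ultra-low coverage fewer segments are detectable, which is presumably why the abstract describes modelling these measures against sequencing depth; that regression step is not reproduced here.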
Sattler, M. C.; Singh, A.; Bass, H. W.; Mondin, M.
Background: Maize knobs are regions of constitutive heterochromatin that are readily identified in both meiotic and somatic chromosomes. These structures have been characterized as stable throughout the cell cycle, exhibiting late replication during the S-phase, and are composed of two specific families of highly repetitive DNA sequences: K180 and TR-1. Although widely used as cytogenetic markers due to their variability in number and chromosomal position across inbred lines, hybrids, and landraces, little is known about their chromatin structure and dynamics. In this study, we analyzed chromatin accessibility of knobs using DNS-seq data across four maize tissues representing distinct developmental stages. Results: Our results reveal that K180 knobs exhibit tissue-specific variation in chromatin accessibility, transitioning between open and closed states during development. In contrast, the TR-1 knob of chromosome 4 remained consistently inaccessible across all tissues analyzed. A knob composed of both K180 and TR-1 further supported this observation, with only the K180 region showing dynamic accessibility. To validate these findings, we also analyzed other repetitive regions such as centromeres, which showed a uniformly closed chromatin structure similar to TR-1. These results suggest a unique developmental modulation of chromatin accessibility associated with K180 repeats. While the chromatin accessibility of knobs does not reach the levels observed at Transcription Start Sites (TSS), the comparison among different classes of repetitive DNA within maize constitutive heterochromatin provides compelling evidence for sequence-specific and tissue-specific chromatin dynamics. Conclusions: Our findings uncover a previously unrecognized property of maize knobs and establish a reference for future studies on chromatin organization and epigenetic regulation of repetitive DNA in plant genomes.
Cooper, H. B.; Rojas Lopez, K. E.; Schiavinato, D.; Black, M. A.; Gardner, P. P.
Proteins and non-coding RNAs are functional products of the genome that are central to crucial cellular processes. With recent technological advances, researchers can sequence genomes in the thousands and probe numerous genomic activities of many species and conditions. Such studies have identified thousands of potential proteins, RNAs and associated activities. However, there are conflicting interpretations of the results, and therefore of which regions of the genome are "functional". Here we investigate the relative strengths of associations between coding and non-coding gene functionality and genomic features, by comparing reliably annotated functional genes to non-genic regions of the genome. We find that the strongest and most consistent associations between functional genes and genomic features are with transcriptional activity and evolutionary conservation. We also evaluated sequence-based statistics, genomic repeats, epigenetic and population variation data. Other features strongly associated with function include histone marks, chromatin accessibility, genomic copy-number, and sequence alignment statistics such as coding potential and covariation. We also identify potential issues with SNP annotations in short non-coding RNAs, as some highly conserved ncRNAs have significantly higher than expected SNP densities. Our results demonstrate the importance of evolutionary conservation and transcription activity for indicating protein-coding and non-coding gene function. Both should be taken into consideration when differentiating between functional sequences and biological or experimental noise.
Patial, R.; Ray, S.; Singh, K.; Sobti, R. C.
Infertility is a complex condition affecting both the male and female population. Influenced by multiple factors, it remains a constant challenge due to limited understanding of endometrial abnormalities. With this study we aim to investigate the molecular basis of infertility using transcriptomic analysis of endometrial tissue from the NCBI GEO dataset GSE92324. We performed exploratory data analysis using Principal Component Analysis (PCA) to assess sample variance, followed by differential gene expression (DGE) analysis using the DESeq2 package, where we identified 168 significant genes with adjusted p-value < 0.05 and |log2FC| > 2. Upregulated genes included GPX3, CXCL14, and PPARGC1A, and downregulated genes included WNK4, GJB2, and TRPM6. Functional enrichment using KEGG and GO showed that differentially expressed genes (DEGs) are involved in immune-inflammatory pathways, lipid metabolism and steroid biosynthesis pathways. Through Ingenuity Pathway Analysis (IPA) we identified affected canonical pathways such as increased innate immune responses, altered lipid metabolism and inhibition of mitochondrial dysfunction. Upstream regulator analysis highlighted PTEN, PRKAA1, HDAC4, IL10RA, and RAD51, which were impacting metabolic pathways and anti-inflammatory signalling. Further, through Weighted Gene Co-expression Network Analysis (WGCNA) we found a turquoise module that had a very strong and highly significant negative correlation (cor = -0.84, P < 0.0001) with traits of interest. This led to the discovery of C7orf50 as a novel candidate involved in cholesterol metabolism linked to infertility. This integrative approach reveals crucial genes, co-expression modules, and underlying pathways involved in female infertility.
Highlights: From the dataset GSE92324, a total of 168 significant DEGs associated with unexplained infertility were identified using adjusted p-value < 0.05 and |log2FC| > 2. In comparison with the CTD list, we identified five genes (C1orf106, C15orf59, LINC00461, C15orf48, and C10orf99) not previously known to have direct evidence of involvement in infertility. WGCNA highlighted the turquoise module as highly associated and yielded the novel gene C7orf50, associated with cholesterol metabolism. IPA revealed PTEN, PRKAA1, IL10RA, and RAD51 as potential upstream regulators, and inflammatory pathways and mitochondrial dysfunction as canonical pathways. The study highlights a novel link between GI inflammation and endometrial receptivity.
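The DEG selection rule quoted in this abstract (adjusted p-value < 0.05 and |log2FC| > 2) can be sketched as a simple filter. The code below is a minimal illustration in plain Python, not the authors' DESeq2 workflow; the `GeneResult` record, the `filter_degs` helper, and the gene names and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2fc: float  # log2 fold change
    padj: float    # multiple-testing adjusted p-value


def filter_degs(results, padj_cutoff=0.05, lfc_cutoff=2.0):
    """Return (upregulated, downregulated) gene names passing both cutoffs."""
    up, down = [], []
    for r in results:
        if r.padj < padj_cutoff and abs(r.log2fc) > lfc_cutoff:
            (up if r.log2fc > 0 else down).append(r.gene)
    return up, down


# Made-up values for illustration only; not taken from GSE92324.
example = [
    GeneResult("GENE_A", 3.1, 0.001),
    GeneResult("GENE_B", -2.6, 0.02),
    GeneResult("GENE_C", 1.4, 0.0001),  # fails the fold-change cutoff
    GeneResult("GENE_D", -4.0, 0.2),    # fails the adjusted p-value cutoff
]
up, down = filter_degs(example)
print("up:", up, "down:", down)
```

Requiring both conditions means a gene with a large fold change but weak statistical support, or strong support but a small effect, is excluded.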
Mays, A.; Cabrera, F.; Macias-Munoz, A.
Background: Transposable elements (TEs) are repetitive genetic elements that can jump to new loci, causing genome expansions and structural rearrangements, and can ultimately propel the evolution of genomes. Despite their significance, the role of TEs in the evolution of genomes and phylogenetic groups remains largely understudied in early diverging lineages. Further, the extent to which TE content varies across species is still an open question. Medusozoa, a group within Cnidaria encompassing jellyfish and hydroids, exhibits an exceptional diversity of life history strategies, body plans, and physiological capabilities. These characteristics, along with its early-diverging phylogenetic position, establish Medusozoa as an ideal system for investigating the composition and evolutionary history of TEs within the group. Results: We generated a custom repeat library built from annotations of 25 Medusozoan genomes and used it to characterize TEs, aiming to identify lineage-specific TE content and activity that may correlate with the diversity observed within the group. We found that repetitive element percentage and genome size varied considerably, with Hydrozoa exhibiting the most variation among classes in both respects. DNA transposons were the most prevalent TE classification in all but two genomes, averaging 28% of all genomes. Intra-genus comparisons revealed a surprising degree of differences in TE content. In the genus Aurelia, the expansion of a single DNA transposon superfamily accounted for much of the difference in repetitive element percentage between two species, whereas in the genus Turritopsis, a similar divergence resulted from the proliferation of multiple superfamilies. Interestingly, most genomes showed evidence of recent TE expansions, suggesting ongoing activity in many medusozoan species. Conclusion: We present the first comparative analysis of TEs across all medusozoan classes. Our results reveal class-specific TE dynamics and highlight cases of TE proliferations as lineages diverge. This research provides data on TE activity and diversity that can be used as a resource for future study and fills important gaps in our understanding of TEs in early diverging animal lineages.
He, Z.; Li, Y.; Shkurat, T. P.; Butenko, E. V.; Derevyanchuk, E. G.; Lomteva, S. V.; Chen, L.; Lipovich, L.
Background: Polycystic ovary syndrome (PCOS) is a prevalent endocrine disorder and a leading cause of female infertility, with complex genetic, metabolic, and hormonal etiologies. Long non-coding RNAs (lncRNAs) have emerged as important regulators of diverse biological processes, yet their roles in PCOS remain underexplored. Here, we identified and characterized PCOS differentially expressed gene-associated lncRNAs (PDEGAL) with an integrative approach combining expression data, genetic association, and evolutionary analysis. Methods: Thirty-three PCOS-associated protein-coding genes were obtained from our prior study, and all their nearby and overlapping lncRNAs were annotated. These candidates were analyzed using UCSC Genome Browser-mapped annotations and datasets, including NCBI RefSeq, GENCODE, GTEx, GWAS SNPs, and conservation, as well as the FANTOM5 cap analysis of gene expression (CAGE) promoter data, to assess their expression, regulatory potential, genetic variant overlaps, and evolutionary conservation. Results: Twenty-three PDEGALs (18 antisense to, and 5 sharing bidirectional promoters with, known PCOS-associated protein-coding genes) were identified. Seventeen PDEGALs contained GWAS SNPs with statistically significant disease associations, 9 of which were associated with PCOS-related traits. Five PDEGALs demonstrated expression in the KGN granulosa cell model of PCOS. Key gene structure element (KGSE) analysis revealed that most PDEGALs are primate-specific. Integrating four criteria (GTEx expression, GWAS SNPs, FANTOM promoterome, and KGSE conservation) highlighted HELLPAR as the only lncRNA fulfilling all four, while five others (PGR-AS1, MTOR-AS1, ENSG00000265179, ENSG00000256218, and LOC105377276) fulfilled three of the four criteria. Conclusions: We have systematically identified candidate PCOS regulatory lncRNAs with convergent genetic, expression, and evolutionary evidence. These results provide a framework for functional validation and highlight lncRNAs as potential biomarkers and therapeutic targets in PCOS that function by regulating their nearby and overlapping protein-coding genes.
Rodriguez Felizzola, J. J.; Soriano Bermudez, J. J.; Blanco Pastor, J. L.
Aim: The commercial interest of grapevines (Vitis vinifera L.) has prompted numerous studies on their origin and genetic resources in the context of global change. However, genomic-scale information on diversity patterns and genetic structure in southwestern Europe remains scarce. This study infers the genetic structure, gene flow events between genetic groups, and genetic refugia of Vitis vinifera ssp. sylvestris in the Iberian Peninsula. Location: The Iberian Peninsula. Taxon: The wild grapevine, Vitis vinifera L. ssp. sylvestris. Methods: We reanalyzed a set of 137 complete genomes of V. vinifera ssp. sylvestris. After variant calling, validation and annotation, we obtained a high-quality SNP dataset. Using these markers, we performed phylogenetic and population structure analyses to determine the number and spatial distribution of genetic groups and their contact zones. Next, we inferred the timing and directionality of gene flow events between groups. Finally, heterozygosity and allele rarity were estimated to identify populations with high conservation value. Results: We detected three major ancestral populations and four putative genetic refugia in the south of the Iberian Peninsula. Demographic analyses indicate sustained gene flow between ~21,000 and ~7,000 years ago from a North African ancestral group into Iberian wild populations in the south. Heterozygosity and allele rarity analyses identified populations of high conservation value in a variety of areas within the Iberian Peninsula. Main Conclusions: We identify the biogeographical factors behind the long-known singularity of wild Iberian grapevines. The southern Iberian Peninsula is a hotspot of genetic diversity for wild grapevines, hosting three ancestral populations and multiple contact zones that acted as micro-refugia. The current genetic variability of Iberian wild grapevines is best explained by natural, climate-driven gene flow between African lineages with Middle Eastern origin and Iberian groups. These contacts were favored by climatic conditions during the late Pleistocene (~21,000 years) and early Holocene (~8,300 years). Our results dismiss a significant anthropogenic influence during Neolithic domestication for explaining the genetic composition of Iberian wild grapevine genotypes.
Durante, A.; Feve, K.; Naylies, C.; Labrune, Y.; Gress, L.; Lippi, Y.; Legoueix, S.; Milan, D.; Gourdine, J.-L.; Gilbert, H.; Renaudeau, D.; Riquet, J.; Devailly, G.
Background: Gene expression levels are affected by genetics and environmental effects. However, quantification of the influence of genetics and environmental effects on gene expression remains limited, especially in farm animals. Here, the relative influence of genetic and heat-related environmental variations on gene expression levels was investigated in pigs, using a backcross herd of diverse heat adaptation levels. Backcross animals were raised in either a tropical or temperate environment. Animals raised in the temperate environment were subjected to an experimental heat stress at the end of their growth. Results: We identified 1,967 differentially expressed genes (DEGs) between pigs raised in the tropical (n = 181) and temperate (n = 180) facilities, and 472 DEGs throughout a 3-week experimental heat stress. A transcriptome-wide association study (TWAS) identified 139 associations between gene expression levels and thermoregulation/production traits. We detected 6,014 expression quantitative trait loci (eQTLs) associated with the expression level of 3,297 genes. Genetic variance was estimated to explain 36.3% of gene expression variance on average, and was the main source of variance for 27.7% of transcripts. Most eQTLs found are located in proximal regions (cis-eQTLs) and few within distal regions (trans-eQTLs) to their assigned genes. A trans-eQTL hotspot highlighted a hematopoietic mechanism driven by GPATCH8. An integration of GWAS and TWAS pointed to TMCO1 and ZNF184 as candidate genes for backfat thickness. Conclusions: This study provides a better understanding of the impact of climate, heat stress and genetic influences on the pig whole blood transcriptome.
Montoliu-Nerin, M.; Strunov, A.; Heyworth, E.; Schneider, D. I.; Thoma, J.; Hua-Van, A.; Courret, C.; Klasson, L. J.; Miller, W. J.
Background: Although strict maternal transmission of mitochondria is a general feature of animals, including humans, that ensures homogeneity in mitochondrial DNA (mtDNA) across generations, exceptions have been reported recently. For example, some extremely rare but spectacular cases of heteroplasmy and paternal transmission in humans have called this universal evolutionary principle into question. As an alternative explanation for these observations, the Mega-NUMT concept was coined and was thereafter partly proven to exist. This concept expands on the quite common transfer of mtDNA fragments to the nucleus (NUMTs) by considering the existence of multicopy mitochondrial nuclear insertions. Mega-NUMT reports are currently restricted to a few cases in animals, including humans. However, even in humans, their detailed genomic organization, natural prevalence, and potential biological functions remain unclear. Methodology/Principal Findings: Here, using long-read sequencing, we discovered that up to 60 full-sized mitochondrial genomes are integrated into the nuclear genome of the neotropical fruit fly Drosophila paulistorum, and we confirmed their presence by in situ hybridization. The copies are organized in one cluster on chromosome 3, which we, due to its similarity with the Mega-NUMT concept, designated the "Dpau Mega-NUMT". Contrary to the rarity in humans, this Mega-NUMT is found at high prevalence (40%) in both long-term laboratory lines and natural D. paulistorum populations of different semispecies. Additionally, the mitochondrial copies in the Mega-NUMT cluster are phylogenetically separated from the current mitotypes of D. paulistorum. Together, these observations suggest long-term maintenance of the Mega-NUMT in nature. Hence, we propose that the Dpau Mega-NUMT may have been transferred to the nuclear genome before D. paulistorum semispecies radiation and maintained at relatively high prevalence in nature by balancing selection due to yet undetermined functions.
Conclusions/Significance: To our knowledge, this is the first verified existence and detailed dissection of a Mega-NUMT outside cats and humans. We show that Mega-NUMTs can be persistent in nature, even at high prevalence, potentially due to balancing selection. Our findings strengthen the importance of high-quality long-read sequencing technologies for deciphering complex repeat-rich genomic regions to deepen our understanding of the dynamics of genome evolution within genomic "dark matter".
Watcharapalakorn, A.; Poyomtip, T.; Tawonkasiwattanakun, P.; Dewi, P. K. K.; Thomrongsuwannakij, T.; Mahawan, T.
Purpose: To determine whether circadian timing defines critical molecular windows in myopia development and to assess the transferability of circadian gene programs across ocular tissues, disease stages, and species. Methods: Publicly available retinal and choroidal RNA-seq datasets from chick models of form-deprivation myopia were analyzed using unsupervised transcriptomic profiling and multistage machine-learning classification. Circadian windows were defined based on Zeitgeber time, and samples were grouped accordingly for downstream analyses. Classification model robustness was evaluated through cross-tissue and cross-stage validation and further assessed using external validation in an independent dataset. Functional translation to humans was examined using ortholog-based Gene Ontology enrichment analysis to identify conserved biological processes and higher-order regulatory pathways. Results: A circadian critical window at ZT8-ZT12 exhibited the strongest transcriptional divergence during both myopia onset and progression. Gene signatures derived from this window generalized across retina and choroid and remained predictive across disease stages, supporting coordinated molecular regulation between ocular tissues. External validation confirmed the reproducibility of these signatures despite differences in experimental design and gene coverage. Functional mapping revealed that conserved molecular components in chicks are reorganized into more complex neuroendocrine and regulatory networks in humans, indicating cross-species conservation with increased functional complexity. Conclusions: Circadian timing strongly shapes myopia-related gene expression and underlies coordinated retina-choroid signaling. These findings highlight circadian biology as a key factor in refractive development and suggest that time-dependent mechanisms may influence myopia susceptibility, progression, and response to treatment.
von Hardenberg, S.; Niehaus, I.; Wiemers, A.; Rothoeft, T.; Schaeffer, V.; Huang, K.; Petree, C.; Phillipe, C.; Bruel, A.-L.; Warnatz, K.; Zamani, M.; Ahmadi, R.; Sedaghat, A.; Bahram, S.; Sedighzadeh, S.; Sareh, E.; Khalilian, S.; Landwehr-Kenzel, S.; Schwerk, N.; Abdulwahab, E.; Roesler, J.; Lin, S.-J.; Sabu, S.; Strenzke, N.; Sogkas, G.; Vona, B.; Varshney, G. K.; DiDonato, N.; Bernd, A.
Background: The transport of transfer RNAs (tRNAs) from the nucleus to the cytoplasm is a crucial step in the regulation of gene expression and protein synthesis. This process is mediated by specialized export molecules, among which XPOT (Exportin-t, XPO3) plays a central role by recognizing and transporting mature tRNAs through the nuclear pore complex. XPOT is not essential for RNA trafficking in simple organisms; however, the potential impact of XPOT deficiency on human health remains unresolved. Methods: We identified eight patients from five unrelated families with rare biallelic germline variants in XPOT resulting in putative loss-of-function. Functional analyses were carried out in patient-derived fibroblasts, lymphoblastoid cells and zebrafish models. Ex vivo immunohistochemical stainings for Xpot were performed in the mouse cochlea. xpot knockout zebrafish models were generated to assess morphology and hearing ability. Results: All patients presented with a uniform clinical phenotype that included increased susceptibility to infection, bronchiectasis, severe sensorineural hearing loss, developmental delay, and growth retardation. We demonstrated a complete absence of XPOT protein expression in three patient-derived cell lines. XPOT deficiency leads to disruptions in protein synthesis of the cytokine TNF pathway upon cellular stimulation. Additional XPO1 inhibition in XPOT-deficient cells had little effect on cellular functions, suggesting alternative tRNA nuclear transporter pathways. Increased XPOT immunoreactivity was observed in type I spiral ganglion neurons and hair cells of the mouse cochlea, with enrichment in stereocilia. The xpot knockout zebrafish model showed dysmorphic features and reduced hearing, recapitulating key patient phenotypes. Conclusions: Our findings establish a direct connection between impaired XPOT-dependent tRNA export and human pathology. They illustrate that perturbations in nuclear export pathways lead to disease, and they raise the possibility that other nuclear transport receptors may play similarly underappreciated roles in human health and disease. The identification of XPOT as a disease-associated gene opens up new research directions and potential targets for therapeutic intervention.