Genes

MDPI AG

Preprints posted in the last 7 days, ranked by how well they match Genes's content profile, based on 126 papers previously published here. The average preprint has a 0.10% match score for this journal, so anything above that is already an above-average fit.

1
Evolutionary history of alpha satellite DNA in Cercopithecini: comparative cytogenomics highlights the diversification pattern of primate centromere repeats

Cacheux, L.; Dutrillaux, B.; Gerbault-Seureau, M.; Nicolas, V.; Ponger, L.; Bed'Hom, B.; Escude, C.

2026-04-21 evolutionary biology 10.64898/2026.04.19.719437 medRxiv
Top 0.1%
6.3%

Background: Alpha satellites, a superfamily of AT-rich tandem repeats, are the primary DNA component of centromeres in Platyrrhini and Catarrhini. Analyses of the human genome suggest that centromeres behave like biological ridges, with new alpha satellite families expanding at the centromere core, splitting and displacing older ones towards the pericentromeres. The Cercopithecini tribe, which displays an unusual chromosomal evolution involving multiple chromosomal fissions and centromere formations, represents a promising model to enhance our understanding of alpha satellite DNA evolutionary history. We previously applied targeted sequencing to centromere DNA from two distant species drawn from the Cercopithecini terrestrial and arboreal lineages, and characterized six alpha satellite families exhibiting varying mean sequence identities. Methods: Combining classical and molecular cytogenetics, we mapped the chromosomal distribution of these alpha satellite families across 13 Cercopithecini, one Papionini, and one Colobinae species. A nuclear marker-based phylogeny provided an evolutionary framework for interpretation. Results: Our phylogeny identifies the terrestrial and arboreal lineages, and a newly designated swamp clade. We observed significant interspecies variations in alpha satellite patterns, including differences in presence/absence and distinct chromosomal distribution patterns (centromeric, pericentromeric, or subtelomeric). Families previously described as heterogeneous (83-87% mean sequence identity) exhibit a centromeric position in the swamp lineage, which is characterized by conserved karyotypes. In contrast, these families show a pericentromeric distribution in the terrestrial and arboreal lineages, replaced at the centromere core by more homogeneous families (95-98% mean sequence identity). In the arboreal clade, which is characterized by highly fissioned karyotypes, putative evolutionary new centromeres show a unique co-occurrence of highly homogeneous and heterogeneous families. Conclusion & Implications: We propose a comprehensive evolutionary scenario for alpha satellite DNA in Cercopithecini, where younger families arise at the centromere core, shift toward the pericentromeres as they age, and eventually face extinction. Our study suggests that alpha satellite DNA and chromosomes evolve in an interdependent manner, with satellite diversification and displacement occurring in parallel with chromosome fissions and centromere repositioning. This comparative cytogenomic approach provides both support for the human-based evolutionary model for alpha satellite DNA and novel temporal insights into its diversification dynamics. Beyond evolutionary genomics, our findings highlight the potential of alpha satellite DNA to complement systematic studies in deciphering complex primate evolutionary histories.

2
Comparative analysis of transposable elements in jellyfish and hydroid species (Cnidaria: Medusozoa)

Mays, A.; Cabrera, F.; Macias-Munoz, A.

2026-04-21 evolutionary biology 10.64898/2026.04.17.719288 medRxiv
Top 0.2%
4.0%

Background: Transposable elements (TEs) are repetitive genetic elements that can jump to new loci, causing genome expansions and structural rearrangements, and can ultimately propel the evolution of genomes. Despite their significance, the role of TEs in the evolution of genomes and phylogenetic groups remains largely understudied in early diverging lineages. Further, the extent to which TE content varies across species is still an open question. Medusozoa, a group within Cnidaria encompassing jellyfish and hydroids, exhibits an exceptional diversity of life history strategies, body plans, and physiological capabilities. These characteristics, along with its early-diverging phylogenetic position, establish Medusozoa as an ideal system for investigating the composition and evolutionary history of TEs within the group. Results: We generated a custom repeat library built from annotations of 25 medusozoan genomes and used it to characterize TEs, aiming to identify lineage-specific TE content and activity that may correlate with the diversity observed within the group. We found that repetitive element percentage and genome size varied considerably, with Hydrozoa exhibiting the most variation among classes in both respects. DNA transposons were the most prevalent TE classification in all but two genomes, averaging 28% of all genomes. Intra-genus comparisons revealed a surprising degree of difference in TE content. In the genus Aurelia, the expansion of a single DNA transposon superfamily accounted for much of the difference in repetitive element percentage between two species, whereas in the genus Turritopsis, a similar divergence resulted from the proliferation of multiple superfamilies. Interestingly, most genomes showed evidence of recent TE expansions, suggesting ongoing activity in many medusozoan species. Conclusion: We present the first comparative analysis of TEs across all medusozoan classes. Our results reveal class-specific TE dynamics and highlight cases of TE proliferations as lineages diverge. This research provides data on TE activity and diversity that can be used as a resource for future study and fills important gaps in our understanding of TEs in early diverging animal lineages.

3
Investigating Uptake and Impact of Genetic and Genomic Evaluation Following a Perinatal Demise

Mossler, K.; D'Orazio, E.; Hall, K.; Osann, K.; Kimonis, V.; Quintero-Rivera, F.

2026-04-23 genetic and genomic medicine 10.64898/2026.04.22.26347546 medRxiv
Top 0.9%
1.7%

Objective: The decline of the perinatal demise rate is slowing and demises are often unexplained. Significant research has been done regarding diagnostic yield and genetic causes of demise, but little is known about how geneticist involvement impacts outcomes. The goal of the study was to evaluate post-mortem genetic testing practices and the effects of geneticist involvement. Methods: Retrospective data from 111 perinatal demise cases were examined, including rates of prenatal genetic counseling, post-delivery genetics consult, genetic testing, and autopsy investigation. Results: In this cohort, 54% received genetic testing and 25% received a genetics consult. When compared to those without, cases with genetic specialist involvement were associated with significant increases in testing uptake (p=0.007), diagnostic yield (p<0.001), and patient education (p<0.001). Second trimester stillbirths and those with fewer ultrasound (US) abnormalities were less likely to receive genetic testing (both p values <0.001) and consults (p<0.001, p=0.020). Conclusion: Though it was not possible to avoid ascertainment bias, these data demonstrate that geneticist involvement correlates with a higher rate of testing, greater diagnostic yield, and more thorough counseling. These findings underscore the importance of integrating genetics providers into perinatal postmortem healthcare teams.

4
Pseudouridylation of rRNA by specific snoRNA disrupts ribosomal machinery and consequently affects metabolism, longevity and neurodegeneration

Gauvrit, T.; Minquilan, P.; Marchand, V.; Motorin, Y.; MARTIN, J.-R.

2026-04-21 neuroscience 10.64898/2026.04.17.719250 medRxiv
Top 0.9%
1.7%

In our society, ageing, longevity, and neurodegenerative diseases are major public health concerns. Recently, in Drosophila, we identified a new cluster of three snoRNAs, including jouvence, and showed that each of them affects longevity and neurodegeneration. As these snoRNAs are required in the epithelium of the gut, these results point to a causal relationship between the gut epithelium and neurodegenerative lesions through metabolic parameters, indicating a gut-brain axis. Here, we demonstrate that each snoRNA pseudouridylates a specific site on ribosomal RNA, which consequently affects the amount of ribosomes as well as translational efficacy. Moreover, using a TRAP assay, we also show that the lack of these pseudouridylations modifies the translation of specific genes involved in lipid metabolism. Consequently, this leads to a chronic deregulation of triglyceride and sterol levels, which correlates with an increase in neurodegenerative lesions in old flies, as well as with a modification of longevity.

5
A bidirectional interaction between the SREBP pathway and the LINC complex component nesprin-4 controls lipid metabolism

Al-Sammak, B. F.; Mahmood, H. M.; Bengoechea-Alonso, M. T.; Horn, H. F.; Ericsson, J.

2026-04-21 cell biology 10.64898/2026.04.18.719359 medRxiv
Top 1.0%
1.7%

This report identifies a bidirectional signaling axis connecting lipid metabolism to nuclear mechanotransduction, with the potential to control fatty acid/triglyceride metabolism. The sterol regulatory element-binding protein (SREBP) family of transcription factors controls fatty acid, triglyceride and cholesterol synthesis and metabolism. The family consists of three members, SREBP1a, SREBP1c, and SREBP2, that are regulated by intracellular cholesterol levels and insulin signaling. The SREBP2-dependent control of the LDL receptor gene is a well-established target for cholesterol-lowering therapeutics and the activity of SREBP1c is an attractive target in metabolic disease. In the current report, we identify SYNE4 (nesprin-4), a component of the Linker of Nucleoskeleton and Cytoskeleton (LINC) complex, as a direct target of the SREBP family of transcription factors, and show that nesprin-4 in turn supports SREBP1c function. We identify functional SREBP binding sites in the human SYNE4 promoter and demonstrate that these are required for the sterol- and SREBP-dependent regulation of the promoter. Furthermore, we show that the endogenous SYNE4 gene is also regulated by SREBP1/2 and intracellular sterol levels. Interestingly, SREBP2 is responsible for the sterol regulation of the SYNE4 gene in HepG2 cells, while SREBP1 is the major regulator in MCF7 cells, demonstrating that different cell types use different SREBP paralogs to regulate the same promoter/gene. Importantly, we find that nesprin-4 is a positive regulator of SREBP1c expression and function in HepG2 cells and during the differentiation of human adipose-derived stem cells. In summary, the current report identifies a novel regulatory interaction between lipid metabolism and the LINC complex. Importantly, we demonstrate that this signaling axis is bidirectional, forming a closed loop that has the potential to control SREBP1c activity and thereby fatty acid and triglyceride synthesis/metabolism. Based on our data, we propose that the nesprin-4-dependent regulation of SREBP1c could represent a novel therapeutic target in metabolic disease.

6
Variation at COMT, ADH1B-ADH1C and HTR2A loci is associated with genetic predisposition to substance use disorders in Ukrainians

Bashynska, V.; Zahorodnia, O.; Borysovych, Y.; Zaplatnikov, Y.; Vasilyeva, V.; Arefiev, I.; Darvishov, N.; Osychanska, D.; Karapetov, A.; Melnychuk, O.; Boiko, O.; Zil'berblat, G.; Turos, O.; Prokopenko, I.; Kaakinen, M.

2026-04-24 genetic and genomic medicine 10.64898/2026.04.23.26351594 medRxiv
Top 1%
1.7%

Background: Substance use disorders (SUDs), including alcohol and drug dependence, and smoking, pose a public health threat through their high prevalence, comorbidity with other diseases, and contribution to mortality. SUDs are highly correlated, and their genetic background is shared to some degree. Objectives: We aimed to investigate the genetic associations of previously reported loci for a wide range of SUDs in an unstudied Ukrainian population. Methods: We collected data from 595 individuals (339 women, 253 men), including 321 participants from two rehab centres. Based on clinical review and questionnaire data we defined drug dependence, alcohol dependence, alcohol abuse, binge drinking, smoking, opiate, amphetamine, cannabis, and hallucinogen use, along with several intermediary alcohol use and smoking variables considering the amount of use and the level of dependence. We genotyped COMT-rs4680, ADH1B-ADH1C-rs1789891, and HTR2A-rs6313, and applied logistic and ordered logistic regression assuming an additive inheritance model, controlling for the recruitment group, other substance uses, age, and sex, in the association analyses. Results: We replicate (P<0.05) the associations at COMT-rs4680 with smoking status (OR[95% CI]=1.56[1.01-2.41], P=0.047) and heaviness (1.37[1.04-1.80], P=0.026), and at ADH1B-ADH1C-rs1789891 and HTR2A-rs6313 with alcohol dependence (1.69[1.03-2.76], P=0.038 and 0.66[0.47-0.92], P=0.016, respectively). Furthermore, we provide evidence for an association at HTR2A-rs6313 with hallucinogen use (0.58[0.35-0.98], P=0.040). Conclusion: In this study on multiple SUDs we shed light on the genetic background of SUDs in Ukrainians and provide further evidence that variation at COMT is mainly associated with smoking, at ADH1B-ADH1C with alcohol-related variables, whereas HTR2A is a more general SUD-associated locus.
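The odds ratios in this abstract are reported together with 95% confidence intervals and P values. As a rough consistency check, a Wald-type P value can be approximated from an odds ratio and its interval alone. The sketch below assumes the interval is symmetric on the log-odds scale and uses a normal approximation; the helper name `wald_p_from_or` is hypothetical and this is not the authors' analysis code.

```python
import math

def wald_p_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate a two-sided Wald P value from an OR and its 95% CI.

    Assumption: the CI is symmetric around log(OR), so its width on the
    log scale equals 2 * z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # SE of log(OR)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))

# COMT-rs4680 and smoking status, values as printed in the abstract
p_smoking = wald_p_from_or(1.56, 1.01, 2.41)
print(f"approximate P = {p_smoking:.3f}")
```

Because the published estimates are rounded to two decimals, the approximation should land near the reported P value rather than reproduce it exactly.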

7
Indirect Genetic Effects on Alcohol Use Disorder and Nicotine Dependence

Luo, M.; Trindade Pons, V.; Zakharin, M.; Pingault, J.-B.; Gillespie, N. A.; van Loo, H. M.

2026-04-19 addiction medicine 10.64898/2026.04.17.26351089 medRxiv
Top 1%
1.7%

Substance use disorders run in families, yet the mechanisms underlying intergenerational transmission remain unclear. We investigated indirect genetic effects, pathways through which parental genotypes influence offspring phenotypes via the family environment, for alcohol use disorder (AUD), nicotine dependence (ND), and related quantitative outcomes, and aimed to identify family environmental factors through which such effects may operate. Using transmitted and non-transmitted polygenic scores (PGS) constructed for problematic alcohol use, tobacco use disorder, and general addiction liability, we analyzed 5972 European-ancestry adult offspring with at least one genotyped parent from the population-based Lifelines cohort (Netherlands). Offspring outcomes included lifetime DSM-5 AUD diagnosis, AUD symptom count, maximum drinks in 24 hours, Fagerstrom Test for Nicotine Dependence score, and cigarettes per day. AUD findings were meta-analyzed with data from the Brisbane Longitudinal Twin Study (N = 1368; Australia). We also examined parent-of-origin effects and mediation by parental substance use and socioeconomic status using structural equation modeling. Transmitted PGS robustly predicted all AUD and ND outcomes (β = 0.07-0.16; OR = 1.20 for AUD diagnosis). Non-transmitted PGS, indexing indirect genetic effects, were negligible for all clinical syndrome outcomes. The only significant indirect genetic effect was on cigarettes per day (β = 0.03, p = 0.01), mediated by parental smoking behavior but not socioeconomic status. These findings indicate that intergenerational transmission of risk for AUD and ND is driven primarily by direct genetic effects, with modest indirect genetic effects on smoking quantity. Larger samples and cross-trait analyses are needed to further elucidate these mechanisms.
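To make the distinction between transmitted and non-transmitted polygenic scores concrete, here is a minimal sketch for a single parent-offspring pair. It assumes that, for each variant, the effect-allele count in the parent and the allele actually passed on are already known (for example from phased data). The function name, inputs, and weights are illustrative assumptions and do not describe the pipeline used in the study.

```python
def split_polygenic_score(weights, parent_dosage, transmitted):
    """Return (transmitted_pgs, non_transmitted_pgs) for one parent.

    weights       : per-variant effect sizes (hypothetical values)
    parent_dosage : effect-allele count in the parent per variant (0, 1 or 2)
    transmitted   : effect-allele count passed to the offspring (0 or 1)
    """
    t_pgs = 0.0
    nt_pgs = 0.0
    for w, dosage, t in zip(weights, parent_dosage, transmitted):
        non_transmitted = dosage - t
        # A parent passes on exactly one of two alleles at each variant
        if t not in (0, 1) or non_transmitted not in (0, 1):
            raise ValueError("inconsistent dosage/transmission pair")
        t_pgs += w * t
        nt_pgs += w * non_transmitted
    return t_pgs, nt_pgs

# Three illustrative variants with made-up weights
t_score, nt_score = split_polygenic_score(
    weights=[0.1, 0.2, 0.3],
    parent_dosage=[2, 1, 0],
    transmitted=[1, 0, 0],
)
print(t_score, nt_score)
```

The non-transmitted score captures parental alleles the offspring did not inherit, which is why an association between it and an offspring outcome is read as an indirect, environmentally mediated effect.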

8
Daily feeding rhythms may play a role in the genetic variability of feed efficiency in growing pigs

Gilbert, H.; Foury, A.; Agboola, L.; Devailly, G.; Gondret, F.; Moisan, M.-P.

2026-04-21 zoology 10.64898/2026.04.17.719142 medRxiv
Top 1%
1.5%

Improving feed efficiency in pigs is essential for reducing production costs and environmental impacts. This study examines the influence of circadian feeding rhythms and genetic polymorphisms on feed efficiency variability using two pig lines divergently selected for Residual Feed Intake (RFI) over ten generations. Feeding behavior was monitored using automatic concentrate dispensers, recording 6,494,097 visits from 3,824 pigs to analyze meal frequency, duration, and diurnal patterns. LRFI (low-RFI) pigs ate less frequently, with larger meals and longer durations. They exhibited two distinct feeding peaks, one around 8:00 AM and a higher one at 5:00 PM, and they consumed more feed during the diurnal period and less at night. HRFI (high-RFI) pigs showed a smoother, less rhythmic feeding behavior with increased nocturnal intake. The differences between the two RFI lines became more pronounced as the number of generations of selection increased, suggesting a genetic basis. Feeding behaviors, including intake during the two main diurnal peaks, were found to be heritable (heritability estimates: 0.30-0.40), and genetic correlations were observed between feed intake and RFI, especially for intake between the two peaks. We then investigated the evolution of allele frequencies of single nucleotide polymorphisms (SNPs) in DNA sequences surrounding 10 core clock genes (ARNTL, CLOCK, CRY1, CRY2, NPAS2, NR1D1, PER1, PER2, PER3, RORA) across generations of selection. SNPs with significant frequency changes were mapped to regulatory regions and transposable elements, especially in the HRFI line, suggesting potential functional impacts on circadian regulation. These results underscore the role of feeding behavior and genetic variation in feed efficiency, offering insights for breeding programs aimed at improving metabolic efficiency and sustainability in pig production.

9
Genome-wide identification and characterization of the NAC transcription factor family in Cynodon dactylon and their expression during abiotic stresses

Poudel, A.; Wu, Y.

2026-04-20 bioinformatics 10.64898/2026.04.15.718725 medRxiv
Top 1%
1.3%

Common bermudagrass (Cynodon dactylon) is a highly resilient and cosmopolitan grass widely used for turf, forage, and soil stabilization. Although its genome has been sequenced, few studies have focused on characterizing the genes underlying its resilience, including the NAC transcription factor (TF) family, which is well known for its physiological and stress-related functions. This study aimed to systematically characterize NAC TF genes in the bermudagrass genome and assess their potential roles in abiotic stress tolerance. A total of 237 CdNAC genes were identified and phylogenetically classified into 14 groups, including 40 members in the NAM/NAC1 class, which is associated with plant growth and development, and 23 members in the SNAC class, which is associated with stress responses. Tissue-specific RNA-seq analysis indicated that about one-fourth of CdNAC genes were expressed across all tissues, whereas 13 genes showed relatively higher expression in roots and 9 in inflorescence, suggesting both essential and specialized functions. Stress-responsive expression profiling revealed that 35 CdNAC genes were upregulated in response to drought, 43 to heat, 10 to salt, and 42 to submergence stress. Notably, CdNAC122, 149, and 155, all members of the SNAC class, were consistently upregulated across all stress conditions, while others exhibited stress-specific expression, such as CdNAC37, 130, 145, and 199 in drought, CdNAC7, 12, 18, and 29 in heat, CdNAC46 and 151 in salt, and CdNAC9 and 31 in submergence. In contrast, 53 genes were downregulated during different stresses, with most belonging to the NAM/NAC1, TERN, or OsNAC7 classes, possibly reflecting suppression of photosynthesis and development-related processes under stress. These results provide the first comprehensive characterization of CdNAC genes, reveal their distinct regulatory roles in abiotic stress responses, and establish a foundation for future functional validation and applications in breeding of stress-resilient bermudagrass.

10
Exploring the Relationship Between Non-Suicidal Self-Injury and Problematic Sexual Behaviour

Jiang, S.; Foo, J. C.; Roper, L.; Yang, E.; Green, B.; Arnau, R.; Behavioral Addictions Studies and Insights Consortium; Lodhi, R. J.; Isenberg, R.; Wishart, D. S.; Fujiwara, E.; Carnes, P. J.; Aitchison, K. J.

2026-04-25 addiction medicine 10.64898/2026.04.17.26351044 medRxiv
Top 2%
0.9%

Objectives: Non-suicidal self-injury (NSSI) and self-harming sexual behaviours share functional and behavioural overlaps. However, the relationship between NSSI and problematic sexual behaviour (PSB) remains underexplored. This study aimed to investigate the association between NSSI and PSB in two cohorts: a non-clinical university cohort and a clinical PSB patient cohort. Methods: Data were collected from 2,189 university participants and 477 clinical PSB patients. NSSI was assessed via self-report, and PSB was measured with the Sexual Addiction Screening Test-Revised (SAST-R) Core. The four core addictive dimensions of PSB (relationship disturbance, loss of control, preoccupation, and affect disturbance) were also evaluated. Logistic regression analyses were conducted to examine the association between PSB (presence/absence and severity) and NSSI, looking at effects of gender and contributions of addictive dimensions of PSB. Results: Rates of NSSI were similar in the university (7.1%) and patient (5.7%) cohorts; stratified by gender, a higher proportion of women had NSSI among PSB patients than in the university cohort (29.3% vs 9.3%). In the university group, who had milder PSB than patients, PSB was associated with NSSI (OR=2.11, p<0.001); a significant gender by PSB interaction was found showing that women with PSB were over four times more likely to have NSSI than men without PSB (OR=4.44, p=0.037). In contrast, PSB severity was not associated with NSSI in PSB patients (OR=1.10, p=0.25). Associations of the addictive dimensions of PSB with NSSI were observed only in the subgroup of university women, in the 'preoccupation' dimension (p<0.001). Conclusions: Our findings highlight gender-specific patterns in the association between PSB and NSSI, suggesting the need for further research and possibly targeted prevention and intervention strategies in women.

11
Closely related, yet phenotypically different - Genome assemblies of two sister species of widow spiders: Latrodectus hasselti and L. katipo, Theridiidae

Ivanov, V.; Uludag, K. O.; Schöneberg, Y.; Schneider, J. M.; Kennedy, S.; Hamadou, A. B.; Vink, C. J.; Krehenwinkel, H.

2026-04-21 genomics 10.64898/2026.04.17.719154 medRxiv
Top 2%
0.9%

Widow spiders of the genus Latrodectus are important animals for biomedical, pest and conservation research. Here, we present the assembled genomes of two closely related Latrodectus species: the Australian L. hasselti and the New Zealand endemic L. katipo. The genome of L. katipo consists of 13 scaffolds likely corresponding to chromosomes (90% of the total length) and 1267 short scaffolds (10%). It has a total length of 1.5 Gbp and a BUSCO score of 94.9%. The genome of L. hasselti consists of 379 scaffolds and has a total length of 1.7 Gbp and a BUSCO score of 95.4%. The repeat content is very similar in both genomes, with a total proportion of 37.2% for L. katipo and 39.9% for L. hasselti. Genome annotation predicted 12706 and 15111 genes for L. katipo and L. hasselti, respectively. An ortholog analysis shows a large overlap between orthogroups, suggesting either duplication events in L. hasselti or loss of genes in L. katipo.

12
In Silico study of clinical implication of markers associated with PTHrP regulatory mechanisms and linked to angiogenesis and EMT program of colorectal cancer

Carriere, P. M.; Novoa Diaz, M. B.; Birkenstok, C.; Gentili, C.

2026-04-20 cancer biology 10.64898/2026.04.15.718767 medRxiv
Top 2%
0.9%

Parathyroid hormone-related peptide (PTHrP), encoded by PTHLH, has been implicated in tumor progression through its involvement in epithelial-mesenchymal transition (EMT), angiogenesis, and tumor cell migration. Previous experimental studies suggest that PTHrP may promote these processes in colorectal cancer (CRC), partly through the modulation of factors such as secreted protein acidic and rich in cysteine (SPARC) and vascular endothelial growth factor (VEGFA). These events play a key role in the acquisition of an aggressive phenotype in our experimental models. In this study, we performed an integrative in silico analysis of multiple transcriptomic datasets to investigate the potential role of PTHLH in CRC. Differential expression analysis identified a set of consistently dysregulated genes across independent datasets. Functional enrichment and network analyses revealed that PTHLH expression is associated with biological processes related to extracellular matrix remodeling, EMT, and angiogenesis. Correlation analyses showed a positive association between PTHLH and SPARC expression, while network-based approaches suggested a potential functional connection with VEGFA. To assess the clinical relevance of these findings, survival analysis was performed using publicly available datasets. High expression levels of PTHLH, SPARC, and VEGFA were significantly associated with reduced overall survival in patients. Notably, a combined gene signature based on these three factors demonstrated a stronger prognostic effect than individual genes, indicating enhanced predictive value. These findings suggest that PTHrP is associated with molecular pathways involved in tumor progression and, together with SPARC and VEGF, may contribute to a coordinated regulatory axis with prognostic relevance in CRC, warranting further experimental validation.

13
Diminished sex hormone levels influence the risk of skewed X chromosome inactivation

Roberts, A. L.; Osterdahl, M. F.; Sahoo, A.; Pickles, J.; Franklin-Cheung, C.; Wadge, S.; Mohamoud, N. A.; Morea, A.; Amar, A.; Morris, D. L.; Vyse, T. J.; Steves, C. J.; Small, K. S.

2026-04-22 genetic and genomic medicine 10.64898/2026.04.20.26351303 medRxiv
Top 3%
0.8%

Background: X chromosome inactivation (XCI) is the mechanism which randomly silences one X chromosome to equalise gene expression between 46, XX females and 46, XY males. Though XCI is expected to result in a random pattern of mosaicism across tissues, some females display a significantly unbalanced ratio in immune cells, termed XCI-skew, in which ≥75% of cells have the same X inactivated. XCI-skew is associated with adverse health outcomes and its prevalence increases with age - particularly after midlife - yet the specific risk factors have yet to be identified. The menopausal transition, which is driven by profound shifts in sex hormone levels, has significant impact on chronic disease risk yet the molecular and cellular effects are incompletely understood. We hypothesised that the menopausal transition may impact XCI-skew. Methods: Using XCI data measured in blood-derived DNA from 1,395 females from the TwinsUK population cohort, along with questionnaires, genetic data, and sex hormone measures, we carried out a cross-sectional study to assess the impact of the menopausal transition and sex hormones on XCI-skew. Results: We demonstrate that early menopause (<45yrs) is associated with increased risk of XCI-skew. In subset analyses across those who had a surgically induced or natural menopause, we find the association restricted to those who underwent a surgical menopause. We next show that a low polygenic score (PGS) for testosterone levels is significantly associated with XCI-skew, which we replicate in an independent dataset (n=149), while a PGS for age at natural menopause is not associated. Finally, using longitudinal measures across two time points spanning ~18 years we show XCI-skew is a stable cellular phenotype that typically increases over time. Discussion: These data represent the first environmental and genetic risk factors of XCI-skew, both of which implicate endogenous sex hormone levels, particularly testosterone. We propose XCI-skew may have clinical relevance in postmenopausal females.
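The skew definition given in this abstract (at least 75% of cells with the same X inactivated) can be written as a small classification rule. The sketch below assumes the input is the fraction of cells in which one arbitrarily labelled X chromosome is inactive; the function name and the way that fraction is measured are assumptions, not details from the study.

```python
def is_xci_skewed(fraction_x1_inactive, threshold=0.75):
    """Return True if at least `threshold` of cells silence the same X.

    fraction_x1_inactive: share of cells (0.0-1.0) in which the X labelled
    "X1" is the inactive one. The labelling is arbitrary, so skew toward
    either chromosome counts.
    """
    if not 0.0 <= fraction_x1_inactive <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    dominant = max(fraction_x1_inactive, 1.0 - fraction_x1_inactive)
    return dominant >= threshold

for fraction in (0.50, 0.70, 0.75, 0.20):
    print(fraction, is_xci_skewed(fraction))
```

Taking the larger of the two fractions keeps the rule symmetric, so a sample with 20% of cells silencing one X is treated the same as one with 80%.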

14
Epithelial NCAPD3 expression protects against stress-induced intestinal injury in mice

Johnston, I.; Johnson, E. E.; Khan, A.; Longworth, M. S.; McDonald, C.

2026-04-21 cell biology 10.64898/2026.04.21.719792 medRxiv
Top 3%
0.7%

Intestinal epithelial cells are central players in mucosal barrier integrity and host-microbe interactions. Genetic studies have revealed that epithelial dysfunction is a key contributor to the pathogenesis of inflammatory bowel disease. Non-SMC condensin II complex subunit D3 (NCAPD3) is essential for chromatin organization and stability. NCAPD3 also promotes antimicrobial defense and autophagy responses in vitro. NCAPD3 expression is decreased in intestinal epithelial cells from patients with ulcerative colitis; however, it is not known whether loss of NCAPD3 expression drives intestinal barrier dysfunction or is a result of disease-associated inflammation. To investigate this relationship in vivo, a tissue-specific approach was required, as global constitutive knockout of NCAPD3 is embryonic lethal. Therefore, a transgenic mouse line with doxycycline-inducible expression of a short hairpin RNA targeting NCAPD3 restricted to villin-expressing cells was generated (NCAPD3KD mice) to enable the study of NCAPD3 function in the intestinal epithelium. Treatment of NCAPD3KD mice with 9-tert-butyl doxycycline resulted in ~75% reduction of NCAPD3 protein in EpCAM intestinal cells. Short-term epithelial NCAPD3 knockdown did not induce spontaneous colitis but was associated with increased serum amyloid A and a trend towards increased intestinal permeability. Upon dextran sodium sulfate or Salmonella enterica serovar Typhimurium ΔAroA challenge, NCAPD3KD mice exhibited exacerbated weight loss, higher disease activity, increased histopathological damage, abnormal colonic cytokines and chemokines, and significantly increased intestinal permeability. These results indicate that NCAPD3 expression in the intestinal epithelium is required for optimal barrier maintenance and antimicrobial defense under chemical or microbial stress. These findings support prior in vitro observations and solidify NCAPD3 as a regulator of intestinal epithelial barrier function and mucosal host defense. Author Summary: NCAPD3 is a multifunctional protein with established roles in chromatin organization, genome stability, mitochondrial function, and antimicrobial defense. Dysregulated NCAPD3 is implicated in human diseases, such as inflammatory bowel disease (IBD) and microcephaly; however, due to its essential role in cellular division, determination of whether NCAPD3 loss drives these pathologies in vivo has been lacking. Using a new transgenic mouse model that selectively reduces NCAPD3 expression in intestinal epithelial cells, our study establishes NCAPD3 as an epithelial regulator of the mammalian intestine that enhances epithelial barrier resilience and antimicrobial defense during stress. Although dispensable for short-term basal homeostasis, NCAPD3 function becomes critical during epithelial injury and enteric infection. Reduced NCAPD3 expression may therefore lower the threshold for inflammatory disease by weakening barrier integrity, amplifying inflammatory cascades, and impairing antimicrobial defenses. These findings position NCAPD3 as a potential modulator of IBD susceptibility and highlight chromatin organization as an important, previously underappreciated layer of intestinal epithelial regulation.

15
Proteomic Insights into Lp(a) Cardiovascular Mechanisms: A Mendelian Randomization Study

Tomasi, J.; Xu, H.; Zhang, L.; Carey, C. E.; Schoenberger, M.; Yates, D. P.; Casas, J.

2026-04-22 genetic and genomic medicine 10.64898/2026.04.20.26351299 medRxiv
Top 3%
0.7%

Background: Elevated lipoprotein(a) [Lp(a)] is a known risk factor for several cardiovascular diseases, as established by multiple genetic and observational studies. However, the mechanisms mediating the effects of Lp(a) levels on cardiovascular disease risk and major adverse cardiovascular events (MACE) are unclear. The aim of this study was to identify proteins downstream of Lp(a) using Mendelian randomization (MR), a genetic causal inference approach. Methods: A two-sample MR was performed by first identifying Lp(a) genetic instruments from genome-wide association studies (GWAS) of Lp(a) blood concentrations. These instruments were then tested for association with proteins in proteomic pQTL data (Olink from UK Biobank, 2940 proteins, and SomaScan from deCODE, 4907 proteins). Results: A total of 521 proteins associated with Lp(a) were identified. Pathway enrichment analysis identified the following MACE-relevant pathways, comprising a total of 91 Lp(a) downstream proteins: oxidized phospholipid-related, chemotaxis of immune cells and endothelial cell activation, pro-inflammatory monocyte activation, neutrophil activity, coagulation, and lipid metabolism. Conclusion: The results suggest that the influence of Lp(a) treatments is primarily through modifying inflammation rather than lipid-lowering, thus providing insight into the mechanistic framework that mediates the effects of elevated Lp(a) on atherosclerotic cardiovascular disease.
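
Editor's illustration: the two-sample MR step described above can be sketched with the inverse-variance weighted (IVW) estimator, which combines per-variant effects on the exposure (here Lp(a)) and on the outcome (here a protein level). The abstract does not state which estimator the authors used, so IVW is an assumption chosen because it is a common default; the function name and the toy effect sizes below are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted MR estimate from per-variant summary statistics.

    beta_exposure: variant effects on the exposure (assumed: Lp(a) level)
    beta_outcome:  variant effects on the outcome (assumed: one protein level)
    se_outcome:    standard errors of the outcome effects
    Returns (causal_estimate, standard_error) under a fixed-effects model.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per genetic variant")
    if not beta_exposure:
        raise ValueError("at least one genetic variant is required")

    # Each variant is weighted by beta_exposure^2 / se_outcome^2.
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical toy values: three variants whose outcome effect is exactly
# twice their exposure effect.
estimate, std_error = ivw_estimate(
    beta_exposure=[0.1, 0.2, 0.3],
    beta_outcome=[0.2, 0.4, 0.6],
    se_outcome=[0.05, 0.05, 0.05],
)
```

In a setting like the one described, a function of this kind would be applied once per protein, with multiple-testing correction across the full protein panel; that correction is not shown here.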

16
Reveal Principles of Codon Optimization via Machine Learning

Deng, F.; Li, H.; Sun, D.; Duan, G.; Sun, Z.; Xue, G.

2026-04-21 bioinformatics 10.64898/2026.04.16.718958 medRxiv
Top 3%
0.7%

High levels of protein expression are usually desirable in industry and research, and codon optimization is widely used to achieve them. Codon optimization methods fall into two branches: classical methods, which build cost functions from empirical rules, and AI methods, which use neural networks to learn codon choice principles from endogenous genes. Here we develop two codon optimization tools, one from each branch, namely OptimWiz 2.1 and OptimWiz 3.0. Results of fusion protein fluorescence detection indicate that both OptimWiz 2.1 and OptimWiz 3.0 are superior to all the other commercially available codon optimization tools. Principles of codon optimization are revealed in the process of machine learning on both tools.

17
Echocardiographic characterization and markers of cardiovascular risk in adults with sickle cell disease in a Colombian tertiary referral centre: a cross-sectional study

Arrieta-Mendoza, M. E.; Barbosa-Balaguera, S.; Betancourt, J. R.; Ayala-Zapata, S.; Messu-Llanos, C. D.; Rosales-Melo, J. P.; Andrade-Hoyos, D. F.; Herrera-Escandon, A.; Aguilar-Molina, O. E.

2026-04-20 cardiovascular medicine 10.64898/2026.04.16.26351071 medRxiv
Top 3%
0.7%

Sickle cell disease (SCD) is associated with substantial cardiovascular morbidity, but echocardiographic data from Latin American populations remain scarce. We aimed to characterise the structural, functional, and haemodynamic echocardiographic profile of adults with SCD attending a tertiary referral centre in Cali, Colombia. We conducted an observational, cross-sectional study based on systematic review of medical records and transthoracic echocardiography reports of consecutive adult patients (≥18 years) with confirmed SCD evaluated between January 2022 and December 2024. Patients with complex congenital heart disease, severe valvular disease of unrelated aetiology, pregnancy, or echocardiograms of insufficient quality were excluded. Of 669 patients screened, 57 met inclusion criteria. Reporting followed STROBE recommendations. The median age was 24 years (interquartile range [IQR] 21-32) and 59.6% were female; the SS genotype was the most frequent (76.4%) and 71.4% were on hydroxyurea. Median haemoglobin was 10.2 g/dL (IQR 9.3-11.4) and median NT-proBNP 491 pg/mL (IQR 98-1290). Most patients had preserved left ventricular dimensions and systolic function (median ejection fraction 63%, IQR 57-66.5; mean global longitudinal strain -18.9% ± 2.9). Right ventricular function was preserved (mean tricuspid annular plane systolic excursion 25.4 ± 4.6 mm). Left ventricular geometry was normal in 42.1%, with concentric remodelling in 24.6%, concentric hypertrophy in 21.1%, and eccentric hypertrophy in 12.3%. Diastolic function was normal in 71.4%. Valvular disease, when present, was predominantly mild. Tricuspid regurgitation velocity exceeded 2.5 m/s in 29.8% of patients and exceeded 3.0 m/s in 10.5%, identifying a substantial subgroup at intermediate-to-high probability of pulmonary hypertension.
In this Colombian cohort of relatively young adults with SCD, cardiac structure and biventricular function were largely preserved, but nearly one-third of patients had echocardiographic findings suggestive of pulmonary hypertension. These findings support the routine use of transthoracic echocardiography as an accessible tool for early cardiovascular risk stratification in adults with SCD in low- and middle-income settings.

18
Genetic and Environmental Predictors of Seasonality and Seasonal Affective Disorder in Individuals with Depression

Huider, F.; Crouse, J.; Medland, S.; Hickie, I.; Martin, N.; Thomas, J. T.; Mitchell, B. L.

2026-04-24 genetic and genomic medicine 10.64898/2026.04.22.26351539 medRxiv
Top 3%
0.7%

Background: The etiology and nosological status of seasonal affective disorder (SAD) as a specifier of depressive episodes versus a transdiagnostic disorder are the subject of debate. In this study, we investigated the underlying etiology of SAD and dimensional seasonality by examining their association with latitude and genetic risk for a range of traits, and investigated gene-environment interactions. Methods: This study included 12,460 adults aged 18-90 with a history of depression from the Australian Genetics of Depression Study. Regression models included predictors for latitude (distance from equator) and polygenic scores for eight traits: major depressive disorder, bipolar disorder, anxiety disorders, chronotype, sleep duration, body mass index, vitamin D levels, and educational attainment. Outcomes were SAD status and general seasonality score. Results: SAD was positively associated with latitude (OR[95%CI] = 1.05[1.03-1.06], padjusted<0.001), and there was nominal evidence of additive and multiplicative interactions between chronotype genetic risk and latitude (OR = 0.99[0.99-0.99], padjusted=0.381; OR=0.98[0.97-0.99], padjusted=0.489). General seasonality score was associated with latitude (IRR=1.01[1.01-1.01], padjusted 0.001) and genetic risk for major depressive disorder (IRR =1.02[1.01-1.03], padjusted<0.001), bipolar disorder (IRR=1.02[1.01-1.03], padjusted=0.001), anxiety disorders (IRR=1.03[1.01-1.04], padjusted<0.001), vitamin D levels (OR=0.89[0.80-0.95], padjusted=0.048), and educational attainment (IRR=0.97[0.96-0.99], padjusted<0.001). Conclusions: These findings enhance understanding of SAD etiology, highlighting contributions of psychiatric genetic risk and geographic measures to seasonal behavior, and support examining seasonality as a continuous dimension.

19
Systematic evaluation of 24 extraction and library preparation combinations for metagenomic sequencing of SARS-CoV-2 in saliva

Qian, K.; Abhyankar, V.; Keo, D.; Zarceno, P.; Toy, T.; Eskin, E.; Arboleda, V. A.

2026-04-20 genomics 10.64898/2026.04.16.719115 medRxiv
Top 4%
0.5%

Sequencing the respiratory tract transcriptome has the potential to provide insights into infectious pathogens and the host's immune response. While DNA-based sequencing is more standard in clinical laboratories due to its stability, RNA assays offer unique advantages. RNA reflects dynamic physiological changes, and for RNA viruses, viral RNA particles directly represent copies of the viral genome, enabling greater diagnostic sensitivity. However, RNA's susceptibility to degradation remains a significant challenge, particularly in RNase-rich specimens like saliva. To address this, we conducted a systematic, combinatorial evaluation of 24 distinct mNGS workflows, crossing eight nucleic acid extraction methods with three RNA-Seq library preparation protocols. Remnant saliva samples (n = 6) were pooled and spiked with MS2 phage as a control. The SARS-CoV-2 virus was spiked into half of the samples, which were extracted using the eight different extraction methods (n = 3) and compared using RNA Integrity Number equivalent (RINe) scores and RNA concentration. The extracted RNA was then processed across the three library construction methods and subjected to short-read sequencing to assess all 24 combinations head-to-head. We compared methods based on viral read recovery and found that RINe and concentration did not correlate with viral detection. The Zymo Quick-RNA Magbead kit and the Tecan Revelo RNA-Seq High-Sensitivity RNA library kit were the extraction and library-preparation kits that yielded the most SARS-CoV-2 reads, respectively. Importantly, our combinatorial analysis revealed that any small variability attributable to different nucleic acid extraction methods was heavily overshadowed by differences in quality attributable to the RNA-Seq library preparation methods. These findings challenge the reliance on conventional RNA quality metrics for clinical metagenomics and underscore the need to redefine extraction quality standards for mNGS applications.
IMPORTANCE: mNGS is a powerful and unbiased approach towards pathogen detection that has mostly been applied to blood and cerebrospinal fluid samples. However, mNGS has recently been applied to more areas including the respiratory pathogen detection space, with potential applications in both in-patient diagnostics and public health surveillance. Saliva samples are an ideal sample type for these use cases since they can be collected non-invasively. However, saliva is also a challenging sample type due to its high RNase activity and often yields low-quality nucleic acid. This study explores the feasibility of using saliva specimens in mNGS with contrived SARS-CoV-2 samples to optimize the combination of two factors: nucleic acid extraction and RNA-seq library preparation. Exploration in this area could enhance the sensitivity of saliva-based mNGS assays, with the goal of future expansion of this specimen type in clinical diagnostics and public health surveillance.
Key Points:
- The choice of RNA-Seq library preparation kit has a greater impact on pathogen detection than the nucleic acid extraction method.
- The combination of Zymo Quick-RNA Magbead extraction kit and TECAN Revelo RNA-Seq High Sensitivity RNA library kit recovered the highest percentage of total SARS-CoV-2 reads.
- RNA quantity and RINe score do not correlate with viral read capture, indicating a need for an alternative metric to assess RNA quality for downstream mNGS clinical diagnostics.
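
Editor's illustration: the full-factorial design described above (eight extraction methods crossed with three library preparation protocols, giving 24 workflows) can be enumerated as in the sketch below. The abstract names only one kit of each type, so the method labels here are hypothetical placeholders, and the helper name is likewise an assumption.

```python
from itertools import product


def build_workflows(extraction_methods, library_preps):
    """Return every (extraction, library prep) pairing of a full-factorial design."""
    return list(product(extraction_methods, library_preps))


# Placeholder labels only; the real kit names are not all given in the abstract.
extraction_methods = [f"extraction_{i}" for i in range(1, 9)]   # 8 methods
library_preps = [f"library_prep_{j}" for j in range(1, 4)]      # 3 protocols

workflows = build_workflows(extraction_methods, library_preps)
```

Enumerating the grid up front makes it easy to confirm that every extraction method is paired with every library protocol before comparing viral read recovery across combinations.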

20
A Seychelles warbler genomic toolkit

Lee, K. G. L.; Bartleet-Cross, C.; Gonzalez-Mollinedo, S.; Dong, S.; Pinto, A.; Lee, C. Z.; Sparks, A.; van de Velde, M.; Manarelli, M.-E.; Holden, T.; Tucker, R.; Maher, K. H.; Hipperson, H.; Slate, J.; Komdeur, J.; Richardson, D.; Dugdale, H.; Burke, T.

2026-04-21 genomics 10.64898/2026.04.16.719046 medRxiv
Top 5%
0.5%

Understanding evolutionary processes is greatly facilitated by high-quality data on genetic variation. We report the development of a genomic toolkit for a recently bottlenecked, long-term studied species, the Seychelles warbler (Ptimerl dezil; Acrocephalus sechellensis). This toolkit comprises a reference genome assembled into 31 chromosomes, together with functional annotations and reference-panel-free imputation of whole-genome sequences from 1,935 individuals. The genomic data have been used to assign the sequenced individuals to a genetic pedigree. Individual genomic data are associated with a suite of phenotypic metadata, amassed from three decades of fieldwork in this closed, long-term monitored population. We compared sex and parentage assigned using the genomic data with the previously recorded sex and parentage metadata to identify and correct 41 DNA samples labelled with the wrong identity. This population resource enables a wide range of analyses, including but not limited to phylogenetics, metabarcoding, recombination rates, linkage patterns, adaptation, heritability, demographic history, selection, and inbreeding estimates. We wish to encourage interest from researchers seeking to collaborate on future analyses and data collection. Overall, our methods demonstrate the potential of next-generation sequencing and statistical tools to provide dense genomic datasets at large sample sizes for wild populations.