Genes
MDPI AG
Preprints posted in the last 7 days, ranked by how well they match Genes's content profile, based on 126 papers previously published here. The average preprint has a 0.09% match score for this journal, so anything above that is already an above-average fit.
Kullyev, A.; Avdeichik, S.; Akimenkova, A.; Kartuesov, A.; Kardymon, O.; Goikhman, Y.
Abstract Purpose: Published clinical outcome data on preconception carrier screening (PCS) in Central Asia are limited. We report the first clinical implementation study from Uzbekistan of a whole-exome sequencing (WES)-based multi-platform PCS program combining exome sequencing with targeted SMA, FMR1, and DMD assays. Methods: We retrospectively analyzed anonymized data from 65 individuals (19 couples, 27 singletons) screened at IMC Genomics, Tashkent, between January 2024 and May 2026. WES covering the protein-coding regions of approximately 20,000 genes was followed by exome-wide bioinformatics filtering and clinical geneticist interpretation. Partly overlapping cohorts underwent SMA carrier screening (n=179), FMR1 CGG-repeat analysis in females (n=155), and DMD deletion/duplication testing in preconception females (n=29). Variants were classified by ACMG/AMP criteria against gnomAD v4.1. Results: Sixty-one of 65 WES-screened individuals (93.8%; 95% CI 85.2 - 97.6%) carried at least one reportable variant (152 instances across 126 genes). Four of 19 couples (21.1%; 95% CI 8.5 - 43.3%) were concordant for pathogenic or likely pathogenic variants in the same autosomal recessive gene; two were referred for preimplantation genetic testing for monogenic disease. SMA screening identified four carriers, including two 2+0 silent carriers; FMR1 analysis identified one intermediate allele; DMD MLPA identified no exonic rearrangements. Conclusion: This first reported WES-based multi-platform PCS program in Uzbekistan was feasible and clinically informative, identifying actionable couple-level reproductive risks and supporting structured implementation of reproductive genetic screening in Central Asia.
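The abstract does not name the method behind its confidence intervals, but both reported intervals (61/65 and 4/19) are reproduced by the Wilson score interval for a binomial proportion. The sketch below is illustrative only: the function name `wilson_interval` is ours, and the choice of the Wilson method is an assumption based on the matching numbers, not something the authors state.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Counts taken from the abstract: carriers among WES-screened individuals,
    # and couples concordant for the same autosomal recessive gene.
    for label, k, n in [("individuals", 61, 65), ("couples", 4, 19)]:
        lo, hi = wilson_interval(k, n)
        print(f"{label}: {k}/{n} = {k / n:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

With small denominators such as 19 couples, the Wilson interval is a common choice because it stays inside [0, 1] and behaves better than the simple normal approximation near the boundaries.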
Russell, J. B. W.; Smith, M.; Alhassan, Y.; Coker, J. M.; Tejan, E. A.; Bharat, K.; Meena Kumari, M. K.; Mahdi, O. Z.; Lisk, D. R.
Abstract Background: Heart failure (HF) is a complex clinical syndrome of growing public health concern in sub-Saharan Africa, yet data from Sierra Leone are absent. The aim of the study was to characterise the clinical profile and the etiological and temporal trends of hospitalised HF patients at Choithrams Memorial Hospital (CMH), Freetown, Sierra Leone, to confirm specific management strategies. Methods: This single-center, retrospective observational cohort study analysed data on HF patients (>18 years) admitted to the CMH between January 2021 and 31 December 2025. The clinical definition of HF was based on the Framingham criteria and the European Society of Cardiology (ESC) guidelines, including standard echocardiographic parameters. All variables, including patient demographics, HF phenotype, aetiology, medical history, and hospital outcomes, were extracted from the digital record. Non-parametric tests (the Wilcoxon rank-sum test to compare groups and the Kruskal-Wallis test to analyse trends over time) and multivariable logistic regression to identify variables associated with etiology were used. Results: A total of 765 patients were included in the study, with a median age of 53 years (IQR 42-61) and male predominance of 55.3%. Patients with recurrent HF (60.9%) were more common than those with de novo HF (39.1%), were older (54 years vs 53 years), had a higher comorbidity burden (34% vs 4%, p < 0.001), and presented with a cold-wet hemodynamic profile (18.4% vs 8.4%, p < 0.001). HFrEF (61.3%) was the predominant phenotype, though HFpEF increased with age. Dilated Cardiomyopathy (37.0%), Hypertensive Heart Disease (31.2%) and Valvular Heart Failure (17.1%) were the leading etiologies, while ischemic heart disease (6.3%) was relatively uncommon. A majority of the patients were referred (77.9%), and 50.8% presented with NYHA IV. The strongest independent predictor for HF was hypertensive heart disease [AOR = 17.81; 95% CI: 3.13-48.76; p < 0.001].
An analysis of the trends in etiologies and demographics over the five-year period demonstrated no significant changes (all p-values > 0.05 for age, sex, aetiology, and most comorbidities). Conclusion: HF affects the younger adult population in Sierra Leone and is mainly caused by DCM and HHD. The late case presentations, the high prevalence of recurrent HF, and the associated high burden of comorbidities emphasize an urgent need to develop and implement improved strategies for the prevention, early detection, and long-term management of HF within Sierra Leone's healthcare system.
Xiang, J.; Zhu, B.; Xu, H.; Chen, Y.; Sun, X.; xiang, r.; Zhao, Y.; Liu, W.; Zhang, L.; He, J.; liu, j.; Chen, Y.; Fan, Z.; Zhang, H.; Tan, J.; Pang, L.; Shi, L.; Kong, Y.; Cai, A.
Background Thalassemia is one of the most common monogenic disorders worldwide, yet current screening strategies combining hematological testing with molecular assays still carry a risk of missed diagnoses and suboptimal efficiency, particularly for complex structural variants and rare mutations. Methods In this prospective double-blind, multicenter cohort study of 3,842 participants (3,362 pregnant women and 480 male partners), we conducted a head-to-head comparison to systematically evaluate the incremental clinical value and detection performance of single-molecule nanopore sequencing in thalassemia (SMITH) against conventional hematological testing and next-generation sequencing (NGS). Findings The overall concordance rate between NGS and SMITH was 98.6% (3789/3842). The discrepant cases (n=53) were directly attributed to the superior detection capabilities of SMITH, which successfully identified complex structural rearrangements-including 45 -globin gene triplications and four HK alleles-that were missed by NGS. Furthermore, SMITH accurately detected four rare variants (c.134_135insT/, c.-22(C>T)/, βN/βc.316-290delinsAGGGCAATAATTT and β3.5 kb deletion/βN) and resolved ten trans and three cis configurations within the globin gene allele. Clinically, these technical advantages translated to a 9.3% (5/54) increase in the detection rate of high-risk prenatal couples, effectively preventing one birth affected by moderate-to-severe thalassemia. Additionally, SMITH corrected a diagnostic discrepancy in one case (HK vs. -3.7), sparing the couple from an unnecessary invasive procedure. Interpretation Our findings demonstrate that SMITH provides a powerful platform for resolving globin gene rearrangements, detecting rare variants, and enabling direct haplotype phasing. By effectively eliminating diagnostic blind spots, SMITH is expected to become an optimal method for thalassemia prevention programs.
Funding This study was supported by Chinese National Natural Science Foundation Projects 81760037 and 82271894.
Verbrugge, J.; Fiallos, K.; Cook, L.; Miller, M.; Head, K. J.
As genetic testing becomes increasingly integrated into Parkinson disease (PD) research, including targeted testing for variants in LRRK2 and GBA1, the return of individual research results is becoming more common. However, limited qualitative data exist regarding how research participants experience genetic results disclosure and post-test genetic counseling in PD research settings. We conducted semi-structured qualitative interviews with participants (n=13) enrolled in the Parkinson Precision Medicine Initiative (formerly Parkinson Progression Markers Initiative; PPMI) who had received PD-related genetic test results and post-test genetic counseling. Interviews were conducted 1 to 3 weeks following result disclosure and analyzed using thematic analysis with a primarily deductive coding approach informed by study aims and inductive identification of emergent themes. Four primary themes were identified: (1) personal connection and motivations for participation, (2) centrality of result disclosure and information preferences, (3) emotional experiences and support needs, and (4) communication quality and alignment with participant needs. Overall, our findings underscore the importance of person-centered genetic counseling within PD research. As return of genetic and biomarker results in research and clinical trial contexts expands, thoughtful integration of relational, informational, and communication-focused practices will be essential to support participant engagement and trust.
O'Donoghue, C.; Kacar, E.; Gomes, T.; Costello, E.; Pender, N.; Peelo, C.; Ryan, M.; Heverin, M.; Byrne, S.; Bede, P.; Hardiman, O.; McLaughlin, R. L.; Byrne, R. P.
Background: Neurological, neuropsychiatric, and neurodevelopmental disorders cluster in ALS families, sharing a common genetic architecture with ALS. Pathogenic variants in genes associated with other neurological, neurodevelopmental, or neuropsychiatric disorders may also co-occur in ALS and modify phenotype. We have sought to determine the prevalence and clinical pattern of likely-pathogenic/pathogenic (LP/P) non-ALS neurological, neurodevelopmental, and neuropsychiatric variants, alone and in combination with ALS-gene variants, in two large ALS cohorts. Methods: Whole-genome sequencing (WGS) of 469 Irish and 774 Answer ALS people with ALS (pwALS) was analysed for ClinVar LP/P variants associated with other neurological (n = 15541), neurodevelopmental (n = 9761), and neuropsychiatric (n = 321) phenotypes. Inheritance patterns for associated genes (autosomal recessive/autosomal dominant) along with the associated phenotype were validated using OMIM. Standardised clinical data included family history, site and age of onset, El Escorial category, survival, motor decline, and cognitive and behavioural assessments. Known ALS-gene variants and C9orf72 repeat expansion status were included for each cohort. Results: Non-ALS neurological variants were identified in 47/469 (10.0%) Irish and 69/774 (8.9%) Answer ALS participants, most frequently in hereditary spastic paraplegia-associated genes (3.2% Irish; 2.8% Answer ALS). Irish neurological variant carriers showed higher frequency of respiratory onset (10.6% vs 1.2%, Fisher's exact p = 0.002, Φ = 0.20) and fewer premorbid behavioural symptoms (0.92 ± 0.56 vs 3.08 ± 0.97, Cohen's d = -0.40). Neurodevelopmental variants occurred in 12/469 (2.6%) Irish and 20/774 (2.6%) Answer ALS participants.
In the Irish cohort, neurodevelopmental variant carriers had significantly shorter survival in Cox proportional hazards model (log-rank p = 0.005), corresponding to a more than two-fold increased hazard of death (HR = 2.25, 95% CI 1.26-4.00), and had significantly increased familial burden of neuropsychiatric disorders among first- and second-degree relatives (negative binomial IRR for carriers = 2.41, 95% CI: 1.12-5.18, p = 0.025). Across combined cohorts, 18 individuals (Irish n = 8; Answer ALS n = 10) carried ≥2 LP/P variants spanning ALS and non-ALS genes. Conclusion: Rare LP/P variants in genes associated with other neurological and neurodevelopmental disorders occur in up to 12% of pwALS across two independent cohorts. Carriers show distinct phenotypes, shorter survival, and characteristic family history patterns. These findings suggest that extended pleiotropic and oligogenic architectures may contribute to ALS heterogeneity.
Krooss, S. A.; Yang, T.; Yuan, Q.; Drick, N.; Sgodda, M.; Held, J.; Behrendt, P.; Hartleben, B.; Koczulla, R.; Ma, X.; Liu, Y.; Wedemeyer, H.; Janciauskiene, S.; Di Donato, N.; Cantz, T.; Wang, E.; Wu, Y.; Hoeper, M.; Xia, Q.; Ott, M.
Background: Alpha-1 antitrypsin deficiency (AATD) caused by the PI*ZZ mutation (Glu342Lys) results in hepatic accumulation of misfolded AAT-Z protein and reduced circulating AAT levels, leading to progressive liver disease and emphysema. Gene correction therapy represents a potentially curative approach by directly correcting the underlying genetic defect. We report the first case of successful hepatic gene correction with early histological and functional assessment. Methods/Case presentation: We report the case of a 66-year-old male patient with PI*ZZ AATD who underwent gene correction therapy within the YOLT-202 phase I/Ia clinical trial (ClinicalTrials.gov ID NCT07193615). Ten weeks post treatment a liver biopsy was performed to re-evaluate pre-existing F2 liver fibrosis as measured by elastography before entering the study. Serum samples allowed functional assessment of the AAT-mediated elastase inhibition. Results: Liver biopsy did not show signs of hepatic inflammation and demonstrated 54% (Sanger) and 57% (Illumina) gene correction rate of the PI*ZZ variant on the DNA level with no bystander edits or off-target effects. Following a transient elevation of transaminases during the early post-treatment period, liver enzymes normalized. Monthly serum AAT measurements demonstrated biologically active and stable therapeutic levels throughout follow-up. Conclusions: This case demonstrates efficient and precise hepatic gene correction without concerning histological alterations and with substantial improvement of functional parameters, supporting the feasibility and safety of gene editing approaches for AATD.
Preussner, A.; Leinonen, J. T.; FinnGen, ; Pirinen, M.; Tukiainen, T.
Although the Y chromosome represents roughly 2% of the male genome, it is often ignored in genome-wide association studies (GWAS). Consequently, the potential health impacts of Y-chromosomal genetic variation remain incompletely understood. To fill this gap, we performed a phenome-wide association study (PheWAS) in FinnGen across 1,426 binary and quantitative traits using Y-chromosomal variation (frequency ≥ 1%) in 104,334 genotyped men. As Y chromosome variation is prone to population stratification, we performed carefully adjusted association analyses and further examined these through kin-based validation in 19,275 female and 24,712 male 1st degree relatives. We found 121 suggestive (p < 5.6×10⁻³) phenotypic associations in the Y chromosome, yet none of these were strong enough to reach phenome-wide significance (p < 3.9×10⁻⁶). While only 38 associations were supported in the kin-based validation, intriguingly we found support for a previously suggested link between haplogroup I1 and coronary heart disease (CHD; OR=1.06, 95%CI=1.02-1.11, p=3.7×10⁻³; male validation OR=1.05; female validation OR=0.97). The I1-CHD association was detected across distinct geographical areas within Finland and was independent from Loss of Y (LOY) and the autosomal risk to CHD, suggesting a link between germline Y-chromosomal variation and heart disease risk. Overall, this study presents a comprehensive phenome-wide analysis of Y-chromosomal associations, highlighting the potential relevance of Y-chromosomal variation beyond sex determination. Our findings further emphasize the need for improved capture of Y-chromosomal variants and further analyses in biobank-scale data to allow for deeper exploration of male-specific genetic architecture of complex diseases.
Yerukala Sathipati, S.; Scott, H.
Importance: Hereditary breast and ovarian cancer (HBOC) variant carriers benefit from risk-reducing interventions, but only if identified. The extent to which carriers are clinically recognized, and whether recognition is equitable across diverse populations, is poorly characterized in a single large U.S. cohort. Objective: To estimate P/LP HBOC carrier prevalence across genetic ancestry groups, quantify documented clinical genetic testing among carriers, and evaluate ancestry and socioeconomic disparities in testing. Design, Setting, and Participants: Cross-sectional analysis of the All of Us Research Program Controlled Tier (Curated Data Repository v8/C2024Q3R9), comprising participants with short-read whole genome sequencing and linked electronic health record (EHR) and survey data. Carriers were ascertained from research genomic data independent of clinical testing. Exposures: Genetically inferred ancestry (African [AFR], Admixed American [AMR], East Asian [EAS], European [EUR], Middle Eastern [MID], South Asian [SAS]); self-reported household income and educational attainment. Main Outcomes and Measures: (1) Carrier prevalence with Wilson 95% CIs; (2) documented clinical genetic testing (procedure codes) among carriers; (3) adjusted odds of documented testing among women, by ancestry, before and after socioeconomic adjustment, using multivariable logistic regression. Results: Among 414,830 participants, P/LP HBOC carrier prevalence was 1.42% (95% CI, 1.38-1.45) overall and similar across ancestry groups (AFR 1.24%, AMR 1.32%, EAS 1.19%, EUR 1.52%, MID 1.68%, SAS 1.33%; overlapping CIs). Among 250,071 women in the testing analysis, documented clinical genetic testing was rare: only 74 of 5,878 carriers overall (1.3%) and 59 of 3,572 European-ancestry carriers (1.7%) had a documented test, with counts below reportable thresholds in all other ancestry groups. 
African-ancestry women had lower adjusted odds of documented testing than European-ancestry women (Model 1 adjusted odds ratio [aOR], 0.32; 95% CI, 0.27-0.39), an association that attenuated but persisted after adjustment for income and education (Model 2 aOR, 0.48; 95% CI, 0.40-0.58; P < 0.001); Admixed American women also had reduced adjusted odds (aOR, 0.71; 95% CI, 0.61-0.84). Lower income and lower education were independently and dose-dependently associated with lower testing odds (income <$25,000 aOR, 0.46; high-school education aOR, 0.54). Conclusions and Relevance: High-risk HBOC variant carriers are present across all ancestry groups at similar frequencies, yet documented clinical genetic testing was disparate in the different ancestry groups. African-ancestry women experience a testing gap that is not fully explained by socioeconomic position, implicating structural barriers in access and referral. Population-level strategies that decouple carrier identification from current referral pathways may be required to close this gap.
Wolfram, T.; Ahangari, M.; Davidson, I.; Wartschinski, L.; Li, J. H.; Eyre, M.; Stern, D.; Schleede, J.; Haghighi, A.; Carmi, S.; Christensen, M.
Consanguinity is a reproductive union between individuals who share a recent common ancestor. These unions are common in many regions of the world and increase the burden of rare recessive disorders by elevating autozygosity in offspring. Current reproductive genetic screening focuses on a limited set of known pathogenic variants, leaving most recessive risk unaddressed. Here we argue that embryo-level autozygosity, quantified as the fraction of the genome in long runs of homozygosity (FROH), is a potentially actionable genomic biomarker that can be integrated into routine preimplantation genetic testing as a homozygosity-informed embryo-prioritization framework (PGT-H) that can be layered onto existing embryo biopsy workflows when couples are already undergoing IVF with PGT-A or PGT-M. Using forward simulations of first-cousin and double-first-cousin couples, we show that siblings conceived by the same couple span a wide range of FROH; selecting the lowest-FROH candidate from a cohort of five embryos reduces FROH by approximately 40% on average. Combining these reductions with empirical effect-size estimates, we estimate that for first-cousin couples this strategy could reduce risk of intellectual disability by roughly 35-45% (corresponding to an absolute risk reduction of about 1.8-2.2%) and potentially reduce excess recessive disease burden, while also modestly reducing risk of common diseases such as type 2 diabetes. We outline how existing PGT-A and PGT-M workflows could potentially be extended to report embryo-level FROH and discuss ethical and counseling considerations. Autozygosity-based embryo prioritization offers a principled way to address a component of recessive risk that current variant-centric approaches miss.
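FROH, as described above, is the fraction of the genome lying in long runs of homozygosity, and the proposed prioritization picks the embryo with the lowest value. The following is a minimal sketch of that arithmetic only, not the authors' simulation or calling pipeline: the segment coordinates, the 1 Mb length cutoff, the genome length, and the names `froh` and `lowest_froh_embryo` are all illustrative assumptions.

```python
# Hypothetical helper names and thresholds; segments are (start_bp, end_bp)
# pairs assumed to be non-overlapping runs of homozygosity already called
# by some upstream tool.
Segment = tuple[int, int]


def froh(segments: list[Segment], genome_length_bp: int,
         min_length_bp: int = 1_000_000) -> float:
    """Fraction of the genome covered by ROH segments of at least min_length_bp."""
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    covered = sum(end - start for start, end in segments
                  if end - start >= min_length_bp)
    return covered / genome_length_bp


def lowest_froh_embryo(embryos: dict[str, list[Segment]],
                       genome_length_bp: int) -> str:
    """Return the label of the embryo with the smallest FROH."""
    return min(embryos, key=lambda name: froh(embryos[name], genome_length_bp))


if __name__ == "__main__":
    # Toy 100 Mb "genome" with made-up ROH calls for three embryos.
    toy_genome = 100_000_000
    toy_embryos = {
        "embryo_A": [(0, 5_000_000), (20_000_000, 23_000_000)],
        "embryo_B": [(10_000_000, 12_000_000)],
        "embryo_C": [(0, 500_000), (40_000_000, 49_000_000)],
    }
    for name, segs in toy_embryos.items():
        print(name, f"FROH = {froh(segs, toy_genome):.3f}")
    print("lowest:", lowest_froh_embryo(toy_embryos, toy_genome))
```

The length cutoff matters in practice because short homozygous stretches are common in every genome; only long runs are informative about recent shared ancestry.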
Metselaar, P. I.; Mol, F.; Weiss, R.; van der Hoff, M. J.; Welting, O.; de Jonge, W. J.; Henneman, P.; te Velde, A. A.; Lowenberg, M.; Li Yim, A. Y. F.
Background and Aims: Fatigue is a prevalent and disabling symptom in inflammatory bowel disease (IBD), yet its underlying biological mechanisms remain poorly understood. We aimed to characterize fatigue-associated molecular signatures in IBD patients by integrating DNA methylation and mRNA expression analyses. Methods: Peripheral blood was collected from 40 patients with Crohn's disease (CD), 29 with ulcerative colitis (UC), and 10 healthy controls. Fatigue severity was assessed continuously using the Multidimensional Fatigue Inventory (MFI). Epigenome-wide DNA methylation profiling and mRNA sequencing were performed, identifying differentially methylated regions (DMRs) and differentially expressed genes (DEGs) for active and quiescent CD and UC, adjusting for age, sex, and smoking status. Pathway enrichment analysis was performed on genes with differential methylation and expression. Results: In active CD, more severe fatigue was associated with transcriptional suppression of immune and metabolic pathways (246 DMRs; 1,090 DEGs), versus upregulation of mitochondrial and metabolic processes in quiescent CD (200 DMRs; 1,619 DEGs). In active UC, fatigue was associated with anabolic pathway upregulation and epigenetic silencing of neuroactive pathways (6,927 DMRs; 343 DEGs; 56 concordant genes). Quiescent UC showed transcriptional changes without significant epigenetic pathway enrichment (1,710 DMRs; 3,224 DEGs). Healthy controls exhibited a distinct profile spanning metabolic, immune, and neuronal pathways (8,621 DMRs; 395 DEGs). Fatigue-associated signatures were largely non-overlapping across all five groups. Conclusions: Fatigue-associated molecular profiles differed substantially by disease subtype and activity state, highlighting the biological heterogeneity of IBD-related fatigue and laying the foundation for multi-omics approaches to identify biomarkers and potential therapeutic targets.
Yi, B.
Despite a well-established global immune landscape, SARS-CoV-2 continues to spread and cause infection waves. The reasons are poorly understood, and it remains difficult to predict the evolution or spreading trend of SARS-CoV-2. It is therefore necessary to investigate whether the establishment of population immunity has changed the evolution or spreading pattern of the virus. In this investigation, an overall analysis of SARS-CoV-2 spread over the past several years was carried out through a genomic epidemiology study, with Germany chosen as a representative location in view of its systematic genomic surveillance efforts. The growth advantage of a few predominant variants in their early spreading period was evaluated with a logistic regression model. The results revealed that the major circulating SARS-CoV-2 variants since 2023 are mainly derived from the Omicron BA.2 family. Since the middle of 2024, most predominant variants have arisen primarily through recombination, indicating that recombination-driven evolution might be the major driving force for the continued spread of SARS-CoV-2 despite population immunity. Furthermore, the lower growth advantage of recently emerged variants might lead to a trend of less frequent infection waves. These findings suggest that although the short-term spreading trend can be affected by specific virus features and the local immunity landscape, the long-term spreading trend is mainly determined by the genomic diversity of the viruses and can be predicted through phylogenetic and genomic epidemiology investigation. The results emphasize the importance of maintaining genomic surveillance of SARS-CoV-2, which is essential from both medical and research perspectives.
Karaca, S.; Cabrera Mendoza, B.; He, J.; Qiu, D.; Davtian, D.; Lacobelle, A.; Nunez, Y. Z.; Krystal, J. H.; Pietrzak, R. H.; Gelernter, J.; Polimanti, R.
Background: The biological mechanisms linking generalized anxiety disorder (GAD) and COVID-19 remain poorly understood, despite substantial evidence of their comorbidity. To address this gap, we examined genetic and epigenetic factors underlying their co-occurrence. Methods: In a multi-ancestry sample of 893 participants, we conducted genome-wide and epigenome-wide analyses of GAD and COVID-19 severity. Integrating large-scale genome-wide datasets and information regarding methylation quantitative trait loci, complementary analytic approaches were used to identify regional methylation patterns, assess genetically regulated DNA methylation in blood and brain tissue, and evaluate causal loci shared between GAD and COVID-19. Results: GAD was associated with epigenome-wide significant variation in loci involved in chromatin regulation and synaptic signaling. Conversely, COVID-19-related epigenetic signals were enriched in immune-inflammatory and host-response pathways. Mild COVID-19 was epigenetically related to endothelial-inflammatory signals, while severe COVID-19 was linked to epigenetic changes implicated in myeloid and thrombo-inflammatory pathways. Epigenetic signals shared between GAD and COVID-19 implicated processes related to stress adaptation and tissue homeostasis. Genetically informed analyses identified 60 shared loci, including MAPT, ZFP57, and FBXL18, indicating pleiotropy between GAD and COVID-19 in genetically regulated DNA methylation variation. Brain-specific analyses further highlighted convergence in additional loci (i.e., MICB and HLA-DPB1), suggesting neuroimmune mechanisms underlying GAD-COVID-19 shared methylation patterns. Conclusions: These findings support that GAD and COVID-19 share epigenetic and genetic architecture involving pathways related to vascular integrity, immune function, and cellular adaptation, highlighting a potential neuroimmune basis for their co-occurrence.
Rajeev, M.; Narayan, A.
Background: Unstructured data represent about 80% of total electronic health records (EHR) data. Structuring this free text is essential for advancing clinical research, including cohort selection for trials, retrospective studies, and the development of disease registries. While manual chart review (MCR) remains the gold standard for extracting this clinical data, the process is inherently slow, resource-intensive, and susceptible to errors from human fatigue. We evaluated the extraction accuracy, safety, and efficiency of the HeLIX (Hepatology Logic-Integrated Extraction) framework, a Large Language Model (LLM) protocol using Google Gemini 3 Pro, compared to a gold-standard Manual Chart Review (MCR). Methods: A prospective validation study was conducted using 50 high-complexity, simulated hepatology discharge summaries designed to replicate the real-world heterogeneity of EHRs. The HeLIX framework employed a Zero-Shot, Structured Chain-of-Thought (CoT) prompting strategy enforced by a three-layer architecture: Clinical Reasoning Trace, Schema Enforcement, and Evidence Verification. The model extracted 45 distinct clinical variables. Performance was benchmarked against a consensus MCR. Results: Across 2,250 evaluated data points, the model achieved an overall Extraction Accuracy of 99.24% (95% CI: 98.8%-99.5%), with perfect concordance in 35/45 (77.8%) variables. For binary diagnostic variables, the model demonstrated an overall F1-score of 0.98, Recall of 0.99 and substantial inter-rater reliability (Cohen's κ = 0.97). Hallucinations were exceptionally rare (2/2250; 0.08%). Critical errors affecting clinical management occurred in only 2 instances (<0.1% of total data), both involving etiological misattribution in complex multifactorial diagnoses. The AI workflow was 13.4-fold faster and 95.1% more cost-effective than manual extraction.
Conclusion: The HeLIX framework demonstrates physician-level accuracy and reliability in extracting complex hepatology data. It offers a scalable, efficient, and economical alternative to manual chart review. Such frameworks could accelerate clinical research, enabling healthcare systems globally to build comprehensive patient registries for a fraction of the traditional cost.
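The abstract reports F1, recall, and Cohen's κ for binary variables without giving the underlying confusion matrix. As a reminder of how these quantities relate, here is a small sketch that derives them from a 2×2 table of model output versus manual chart review. The counts in the usage example are invented for illustration and are not the study's data; `binary_agreement_metrics` is a hypothetical name.

```python
def binary_agreement_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Precision, recall, F1 and Cohen's kappa from a 2x2 confusion matrix.

    tp/fp/fn/tn count model calls against the reference standard
    (here, imagined as manual chart review).
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("need at least one predicted and one true positive case")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the row and column marginals.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (total * total)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return {"precision": precision, "recall": recall, "f1": f1, "kappa": kappa}


if __name__ == "__main__":
    # Invented counts, chosen only to show the calculation.
    metrics = binary_agreement_metrics(tp=40, fp=10, fn=10, tn=40)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

Because κ discounts agreement expected by chance, it can sit well below raw accuracy when one class dominates, which is why it is usually reported alongside accuracy rather than instead of it.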
Ricard, J.; Dubeau, A.; Moreau, C.; Boisvert, M.-C.; Maziade, M.; Bureau, A.; Girard, S. L.
In the past two decades, the focus on genome-wide association studies in large samples of unrelated patients has overshadowed family genetic studies. Consequently, little is known about the levels and effects of the transmission of polygenic risk scores (PRS) among familial cases of schizophrenia (SZ) or bipolar disorder (BD) and their unaffected relatives. Prior research has shown that PRS are elevated in both patients and young individuals at familial risk for BD and SZ. We sought to study the transmission of PRS in affected multigenerational families and non-affected adult relatives (NAARs) with or without other non-mood nonpsychotic DSM-IV diagnoses and unrelated non-affected individuals from the same population. We genotyped 1,117 participants from 48 families in the Eastern Quebec Schizophrenia and Bipolar Disorder Kindreds. PRS for both SZ and BD were computed using Multivariate Lassosum. For both SZ PRS and BD PRS, SZ and BD cases presented higher PRS than controls, replicating previous findings. Regardless of a diagnosis of other non-psychotic and non-mood conditions, NAARs presented higher PRS than the unrelated cohort. Crucially, a subset of families presented consistently low PRS transmission profiles across generations, falling below expectations from our polygenic inheritance model. When the effect of individual PRS was accounted for, we observed sex-specific associations between familial PRS and patients' symptom dimensions. Our results clearly demonstrate that polygenic inheritance alone does not adequately explain disease transmission in families. Such an approach may also clarify why some families exhibit dense clustering of cases despite minimal polygenic burden.
Hu, L.; Bass, M.; Patridge, E.; Molusky, M.; Antoine, G.; Vuyisich, M.; Banavar, G.
Background: Chronic diseases and symptom syndromes often develop after prolonged biological changes that may precede formal diagnosis. RNA-based metatranscriptomics captures active microbial and human gene expression and may provide a functional layer for disease risk evaluation. To address this translational gap, we developed and validated a Disease Risk Score (DRS) framework that integrates metatranscriptome-derived pathway activity scores from stool, saliva, and blood samples, and evaluated its potential clinical utility as an adjunct risk-evaluation tool. Methods: DRS uses disease-specific sets of pathway activity scores derived from stool and saliva microbial functions, stool and saliva microbial taxa, and blood human gene expression. For each disease, 'not optimal' pathway scores are aggregated into a normalized cumulative odds ratio, or cOR, using score-level odds ratios, statistical significance, and literature-supported biological relevance derived from a Development Cohort of 22,369 individuals. A cOR ≥ 5 is defined as high risk. Performance is evaluated in an independent Validation Cohort of 15,908 individuals using self-reported diseases as the reference. Disease support requires both significant cOR separation between self-reported and not-reported (Cohen's d ≥ 0.2) and risk ratio enrichment of self-reported disease among individuals classified as high risk (95% CI of Risk Ratio > 1). Results: Of 20 initially evaluated diseases, 15 meet the prespecified validation criteria on the independent validation cohort: ADHD, anxiety, chronic fatigue syndrome, depression, GERD, hypertension, inflammatory bowel disease, IBS-C, IBS-D, insomnia, MASLD, obesity, obstructive sleep apnea, Sjogren's syndrome, and type 2 diabetes.
Five selected clinical scenarios illustrate how DRS can support clinician-mediated decision making, including IBS subtype reclassification, improved diagnostic acceptance in IBS-D, personalized lifestyle counseling in MASLD and early type 2 diabetes, and diagnostic uncertainty in atypical GERD. Conclusions: DRS is a metatranscriptomics-based risk-stratification framework that aggregates active microbial and human pathway signals into interpretable disease-specific risk estimates across a wide range of disease conditions. Validation against self-reported disease labels in an independent cohort shows significant risk enrichment for each of 15 diseases. DRS is intended as an adjunct to clinical evaluation: a decision support tool in situations where routine care encounters uncertainty, delay, or low patient engagement. Future prospective studies using clinically adjudicated endpoints are needed to assess calibration and clinical outcomes.
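The two validation criteria named in the Methods (Cohen's d ≥ 0.2 and a risk-ratio confidence interval above 1) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say how the interval was computed, so the log-scale (Katz) interval used here is an assumption, "95% CI of Risk Ratio > 1" is read as "lower bound above 1", and all numbers and function names are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def risk_ratio_ci(a, n1, c, n0, z=1.96):
    """Risk ratio of a/n1 versus c/n0 with a log-scale (Katz) confidence interval."""
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


def supports_disease(d, rr_lower, d_min=0.2):
    """Both criteria: effect size at least d_min and RR interval lower bound above 1."""
    return d >= d_min and rr_lower > 1.0


# Hypothetical cOR values for people who did and did not self-report a disease.
reported = [6.0, 7.0, 5.0, 8.0]
not_reported = [3.0, 4.0, 2.0, 5.0]
d = cohens_d(reported, not_reported)

# Hypothetical counts: 30 of 100 high-risk people report the disease, 10 of 100 others do.
rr, rr_low, rr_high = risk_ratio_ci(30, 100, 10, 100)

print(round(d, 2), round(rr, 2), supports_disease(d, rr_low))
```

The abstract also requires the cOR separation to be statistically significant; a significance test is left out of this sketch.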
Kadivar, M.; Alyamani, M.; Mori, M.; Kadivar, M.; Jonsson, J.; Hertervig, E.; Grip, O.; Svensson, L.; Erjefalt, J. S.; Marsal, J.
Background: Histological examination of mucosal tissue in inflammatory bowel diseases (IBD) is a sensitive tool to measure disease activity, and histological remission is emerging as a potentially important treatment target. Several histopathological indices exist, but they have limitations: many were not primarily designed to measure the degree of inflammation, they include subjective components with poor intra- and interindividual reproducibility, and they require expert pathologists, who are scarce, resulting in extended response times. Aim: To construct a new computerized, automated index to objectively measure histological disease activity in the ileal and colonic mucosa, applicable to both Crohn's disease (CD) and ulcerative colitis (UC). Materials and methods: Ileocolonic biopsies were collected from control subjects and patients with CD or UC. A group of CD patients was sampled before and after 12 weeks of anti-TNF therapy. Another group of CD and UC patients functioned as a small validation cohort. Epithelial cells, neutrophils, macrophages, and T cells were immunohistochemically stained, followed by digitization of the color signal and computerized delineation of the epithelial and lamina propria compartments. The various immune cell types within the epithelium and the lamina propria, respectively, were enumerated, and the numbers were compared between control subjects and patients with CD or UC. Results: The numbers of neutrophils and macrophages in the epithelium, and neutrophils in the lamina propria, showed the highest sensitivity and specificity for distinguishing control-subject tissues from CD and UC tissues. These three parameters were thus chosen to construct a new index, named QiC3 1.0, that could separate tissues from control subjects and patients with CD or UC with high precision. It performed equally well in a small validation cohort of patients. 
The QiC3 index correlated well with previously described histopathological indices, fecal calprotectin, and endoscopic scores in UC, but showed worse correlation with endoscopic scores in CD and symptomatic scores. When applying the new index to tissues from CD patients before and after therapy, it showed good responsiveness, demonstrating a distinct amelioration in the microscopic inflammatory status that corresponded well to improvements in histopathological scores. Conclusion: We describe a new quantitative, computerized, automated, non-subjective, and response-sensitive immunohistological index (QiC3) for measuring disease activity in ileal and colonic mucosal biopsies, suitable for both CD and UC.
Beck, S. E.; Deak, J. D.; Levey, D. F.; Ge, T.; Jeffries, P. W.; Lai, D.; Mallard, T. T.; Degenhardt, L.; Lind, P. A.; Tollerup Nielsen, T.; Tubbs, J. D.; Wetherill, L.; Johnson, E. C.; Hatoum, A. S.; The SUD Working Group of the Psychiatric Genomics Consortium, ; COGA Collaborators, ; Yale-Penn Collaboration, ; The VA Million Veteran Program, ; Borglum, A.; Demontis, D.; Medland, S. E.; Martin, N. G.; Nelson, E. C.; Smoller, J. W.; Kranzler, H. R.; Gaziano, J. M.; Stein, M. B.; Agrawal, A.; Edenberg, H. J.; Gelernter, J.
Stimulant use disorder (StimUD) is a significant public health problem, but genetic studies have been limited by small sample sizes. We conducted genome-wide association studies (GWAS) of StimUD in the Million Veteran Program (MVP) and All of Us (AOU), followed by meta-analysis with FinnGen and 10 additional datasets, for a total of 709,369 individuals (Ncases=33,977, Ncontrols=675,392) in four broad ancestry groups: European (EUR) (Ncases=22,564, Ncontrols=624,672), African (AFR) (Ncases=7,574, Ncontrols=34,189), Admixed American (AMR) (Ncases=3,657, Ncontrols=15,698), and East Asian (EAS) (Ncases=182, Ncontrols=833). Population-specific SNP heritability was 6.1% in EUR and 2.4% in AFR. We discovered a total of 19 genome-wide-significant loci: six in EUR, including DRD2*rs5794864, P=7.32E-10; one in AFR; five in a multi-ancestry meta-analysis, including CHRNA5*rs55781567, P=3.27E-9; two in a male-only meta-analysis, including FTO*rs8057044, P=9.50E-9; and five in a meta-analysis of sex-stratified results. In a hold-out AOU subsample (NEUR=18,841, NAFR=12,263, NAMR=9,739), ancestry-specific polygenic risk scores were significantly associated with StimUD in EUR (OR=3.28, 95% confidence interval (CI)=2.89-3.71) and AMR (OR=2.01, 95% CI=1.71-2.37). Transcriptome-wide association studies, fine-mapping, and colocalization analyses prioritized additional genes (e.g., GPX1, BSN). Genetic correlation, Mendelian randomization, and causal mixture analyses revealed relationships with other substance use and use disorder phenotypes, including cannabis use disorder (rg=0.94, P=5.43E-237) and opioid use disorder (rg=1.01, P=4.40E-107), and other psychiatric traits, including anxiety, depression, neuroticism, and attention-deficit/hyperactivity disorder. This is the first well-powered GWAS of StimUD, and it offers significant insights into disease biology.
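The sample-size figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names and the derived case fractions are illustrative additions.

```python
# Case and control counts per ancestry group, as stated in the abstract.
counts = {
    "EUR": (22_564, 624_672),
    "AFR": (7_574, 34_189),
    "AMR": (3_657, 15_698),
    "EAS": (182, 833),
}

total_cases = sum(cases for cases, _ in counts.values())
total_controls = sum(controls for _, controls in counts.values())
total = total_cases + total_controls

# Share of cases within each ancestry group (derived here, not reported in the abstract).
case_fraction = {
    group: cases / (cases + controls) for group, (cases, controls) in counts.items()
}

print(total_cases, total_controls, total)
for group, fraction in case_fraction.items():
    print(group, round(100 * fraction, 1))
```

The per-ancestry counts add up to the reported totals of 33,977 cases, 675,392 controls, and 709,369 individuals.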
You, Y.; McAdams, T.; Oginni, O.; Liu, C.; Herle, M.; Zavos, H.
Objective: ADHD has been associated with obesity indicators, including BMI, across the lifespan. A possible mechanism linking ADHD and BMI is binge eating. Previous research has found associations between ADHD, binge eating and BMI. However, the role of genetic and environmental influences on these associations remains unclear. Method: We utilized data from the Twins Early Development Study (TEDS), comprising 3,675 monozygotic and 7,063 dizygotic twin pairs. ADHD symptoms in childhood and adolescence were assessed using parent-reported questionnaires. Adult ADHD symptoms were measured using both self-report and parent-report questionnaires. Phenotypic mediation models examined whether binge eating mediated the association between ADHD and BMI, without controlling for genetic confounding. Subsequently, the etiological architecture underlying the associations among the three traits across childhood, adolescence, and adulthood was investigated by incorporating genetic and environmental influences into the models. Results: Binge eating significantly mediated the association between ADHD symptoms and BMI in both adolescence and adulthood. However, these mediation effects were no longer present once genetic and environmental influences were incorporated into the models. The best-fitting models in childhood, adolescence and adulthood were Cholesky decomposition models, in which covariance between traits was explained by shared aetiology. Conclusions: This twin study reveals shared liability across ADHD, binge eating, and BMI. The mediating role of binge eating in the relationship between ADHD symptoms and BMI was largely confounded by shared genetic influences. Intervention strategies could focus more on common underlying behavioural and self-regulatory mechanisms across these traits, as well as placing more emphasis on symptom patterns within families.
Tay, Y. W.; Elsayed, I.; Yeow, D.; James, M.; Kung, P.-J.; Screven, L.; Dilliott, A. A.; Alcalay, R. N.; Fang, Z.-H.; Tan, A. H.; Global Parkinson's Genetics Program (GP2), ; Sue, C. M.; Lange, L. M.; Perinan, M. T.
Introduction: Variants in the polymerase gamma (POLG) gene are associated with a wide range of mitochondrial disorders. Emerging evidence suggests a potential link between POLG variants and Parkinson's disease (PD); yet, results remain inconclusive. Objectives: To investigate the genetic spectrum and prevalence of POLG variants in PD across diverse ancestries. Methods: We leveraged multi-ancestry genetic data from the Global Parkinson's Genetics Program (GP2), including genotyping data from 98,589 and short-read sequencing data from 36,022 individuals. We performed a POLG rare variant screen, case-control association, and gene-level burden analyses. Results: Five PD cases carried potentially biallelic rare pathogenic/likely pathogenic POLG variants. Additionally, 228 individuals (<1%; 161 PD cases, 28 individuals with other neurological disorders, and 39 controls) carried 34 distinct rare pathogenic/likely pathogenic heterozygous variants, with no significant frequency differences between cases and controls, except for the p.Ala467Thr variant in the European population. The co-inherited pathogenic variants p.Thr251Ile and p.Pro587Leu were present in <1% of both cases and controls, with no significant group differences. Burden and variant-level association analyses showed no association between rare POLG variant burden or common POLG variant enrichment and PD. Conclusions: POLG variants are overall rare in PD. The identification of rare pathogenic variants among PD cases suggests that POLG-related mitochondrial dysfunction may contribute to PD in isolated instances, particularly under recessive inheritance. Our findings support a role for POLG variants in select cases and underscore the need for larger-scale sequencing and functional studies.
Bedwell, G. J.; Madden, V. J.; Isaacs, A.; Khorommbi, H.; Moloi, N.; Papaioannou, G.; Solomons, S.; Sudan, S.; Parker, R.
Introduction: Dysmenorrhoea is highly prevalent globally and interferes with engagement in education, work, social participation, and quality of life. Although evidence suggests that sociocultural beliefs influence how menstrual pain is understood and managed, relatively little research has explored dysmenorrhoea-related knowledge and beliefs within South Africa. This study aimed to (1) determine the frequency of dysmenorrhoea, (2) assess dysmenorrhoea-related knowledge and compare knowledge between menstruating and non-menstruating individuals, and (3) explore commonly held generational, cultural, and religious beliefs related to dysmenorrhoea in a South African university cohort. Methods: We analysed data collected as part of a cross-sectional survey conducted among staff and students at a South African university. Participants completed demographic questions, items assessing dysmenorrhoea-related knowledge, and an adapted Working Ability, Location, Intensity, Days of Pain, Dysmenorrhoea (WaLIDD) questionnaire. Participants were also invited to provide free-text responses describing generational, cultural, and religious beliefs about dysmenorrhoea. Quantitative data were analysed descriptively and compared between menstruating and non-menstruating participants. Free-text responses were analysed using reflexive thematic analysis. Results: A total of 863 participants completed the survey, including 578 current or past menstruators. The frequency (95% CI) of dysmenorrhoea was 75.4% (71.7-78.9). Most participants were classified as having moderate (53%) or severe (31%) dysmenorrhoea on the WaLIDD scale. Awareness of dysmenorrhoea was higher among participants who had menstruated than among those who had never menstruated (80.4% vs 55.3%, p<0.001). Most participants (85.1%) reported wanting more education about dysmenorrhoea and its impact. 
Reflexive thematic analysis of 246 free-text responses identified five themes: (1) menstrual pain is normalised, dismissed, and expected to endure, (2) reproductive meanings attached to menstrual pain, (3) moral, spiritual, and cultural interpretations of menstrual pain, (4) negotiating competing explanations for menstrual pain, and (5) managing and controlling menstrual pain symptoms. Across themes, dysmenorrhoea was interpreted through social, cultural, reproductive, spiritual, and biomedical frameworks that shaped how pain was understood, communicated, and managed. Conclusion: Dysmenorrhoea is common in this South African university cohort, and is rarely understood as a purely biological symptom. Instead, menstrual pain is understood and managed through broader social, cultural, reproductive, moral, and biomedical narratives, which shape how pain is recognised, disclosed, legitimised, and treated. These findings highlight the importance of considering sociocultural beliefs alongside clinical factors when developing menstrual health education, support strategies, and healthcare services.
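The reported dysmenorrhoea frequency of 75.4% (95% CI 71.7-78.9) can be approximately reproduced with a standard binomial interval. The abstract does not name the interval method or the exact numerator, so the sketch below makes two assumptions: the denominator is the 578 current or past menstruators, and a Wilson score interval is used. The count of 436 is back-calculated from the percentage and is not stated in the source.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


n = 578                      # assumed denominator: current or past menstruators
affected = round(0.754 * n)  # back-calculated count (436), not given in the abstract

low, high = wilson_interval(affected, n)
print(f"{100 * affected / n:.1f}% ({100 * low:.1f}-{100 * high:.1f})")
```

The result lies close to the published interval; small differences are expected if the authors used another method, such as an exact binomial interval.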