
Genomics

Elsevier BV

Preprints posted in the last 7 days, ranked by how well they match Genomics's content profile, based on 60 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit.

1
Comparative LUSZ Therapeutic Study (LUSZ_AVIST) of Antiviral, Antiretroviral, and Immunosuppressive Treatments in Hospitalized COVID-19 Patients with High-Risk Factors, Biomarkers, and Disease Progression.

Makdissy, N.; Makdessi, E. W.; Fenianos, F.; Nasreddine, N.; Daher, W.; El Hamoui, S.

2026-04-13 respiratory medicine 10.64898/2026.04.10.26350587 medRxiv
Top 0.5%
2.7%

COVID-19 spread rapidly and caused a global pandemic, one of the deadliest in history. Early identification of patients with coronavirus disease 2019 who may develop critical illness is of immense importance. Novel biomarkers were therefore needed to identify patients who will suffer rapid disease progression to severe complications and death. Many treatments were adopted, including the antiviral Remdesivir, the antiretroviral Lopinavir/Ritonavir, and Tocilizumab. Our study aimed not only to specify high-risk factors and biomarkers of fatal outcome in hospitalized patients with COVID-19 but also to compare the efficacy of these three treatments, to help clinicians choose a therapeutic strategy and reduce mortality. We divided the population (n=711) into four main groups according to the WHO ordinal severity scale. We measured in-hospital and out-of-hospital mortality, length of hospital stay, pulmonary inflammatory lesions and their distribution, SARS-CoV-2 IgM and IgG variations at admission, inflammatory markers, complete blood count, coagulation factors and enzymes, protein and electrolyte profiles, glucose and lipid profiles, and other relevant markers. The significance of the observed variation was assessed by multivariate analysis and ANOVA. We established a novel predictive scoring model of disease progression based on a cohort of Lebanese hospitalized patients. The model relies on pulmonary inflammatory lesions, inflammation biomarkers such as LDH, D-dimer, CRP, and IL-6, the lymphocyte count, the number of comorbidities, and the patient's age, all of which were significantly correlated with illness severity. The results showed the best outcomes with immunomodulatory and anticoagulant treatments. Tocilizumab was more effective than the two other treatments in non-severe cases, but none of the treatments used was highly effective alone in reducing mortality in severe cases.

2
Identification, evolutionary history and characteristics of orphan genes in root-knot nematodes

Seckin, E.; Colinet, D.; Bailly-Bechet, M.; Seassau, A.; Bottini, S.; Sarti, E.; Danchin, E. G.

2026-04-11 bioinformatics 10.64898/2025.12.19.695360 medRxiv
Top 2%
1.2%

Orphan genes, lacking homologs in other species, are systematically found across genomes. Their presence may result from extensive divergence from pre-existing genes or from de novo gene birth, which occurs when a gene emerges from a previously non-genic region. In this study, we identified orphan genes in the genomes of globally distributed plant-parasitic nematodes of the genus Meloidogyne and investigated their origins, evolution, and characteristics. Using a comparative genomics framework across 85 nematode species, we found that 18% of Meloidogyne genes are genus-specific, transcriptionally supported orphans. By combining ancestral sequence reconstruction and synteny-based approaches, we inferred that 20% of these orphan genes originated through high divergence, while 18% likely emerged de novo. Proteomic and translatomic evidence confirmed the translation of a subset of these genes, and feature analyses revealed distinctive molecular signatures, including shorter length, signal peptide enrichment, and a tendency for extracellular localization. These findings highlight orphan genes as a substantial and previously underexplored component of the Meloidogyne genome, with potential roles in their worldwide parasitism.

3
A safer fluorescent in situ hybridization protocol for cryosections

Chihara, A.; Mizuno, R.; Kagawa, N.; Takayama, A.; Okumura, A.; Suzuki, M.; Shibata, Y.; Mochii, M.; Ohuchi, H.; Sato, K.; Suzuki, K.-i. T.

2026-04-16 molecular biology 10.1101/2025.05.25.655994 medRxiv
Top 4%
0.5%

Fluorescent in situ hybridization (FISH) enables highly sensitive, high-resolution detection of gene transcripts. Moreover, by employing multiple probes, this technique allows for multiplexed, simultaneous detection of distinct gene expression patterns spatiotemporally, making it a valuable spatial transcriptomics approach. Owing to these advantages, FISH techniques are rapidly being adopted across diverse areas of basic biology. However, conventional protocols often rely on volatile, toxic reagents such as formalin or methanol, posing potential health risks to researchers. Here, we present a safer protocol that replaces these chemicals with low-toxicity alternatives, without compromising the high detection sensitivity of FISH. We validated this protocol using both in situ hybridization chain reaction (HCR) and signal amplification by exchange reaction (SABER)-FISH in frozen sections of various model organisms, including mouse (Mus musculus), amphibians (Xenopus laevis and Pleurodeles waltl), and medaka (Oryzias latipes). Our results demonstrate successful multiplexed detection of morphogenetic and cell-type marker genes in these model animals using this safer protocol. The protocol has the additional advantage of requiring no proteolytic enzyme treatment, thus preserving tissue integrity. Furthermore, we show that this protocol is fully compatible with EGFP immunostaining, allowing for the simultaneous detection of mRNAs and reporter proteins in transgenic animals. This protocol retains the benefits of highly sensitive, multiplexed, and multimodal detection afforded by integrating in situ HCR and SABER-FISH with immunohistochemistry, while providing a safer option for researchers, thereby offering a valuable tool for basic biology.

4
Characterization of a pancreatic cancer GWAS signal suggests PDX1 buffers stress in the exocrine pancreas

Hoskins, J. W.; Christensen, T. A.; Eiser, D.; Char, E.; Mobaraki, M.; O'Brien, A.; Collins, I.; Zhong, J.; Patel, M. B.; Prasad, G.; Pancreatic Cancer Cohort Consortium and Pancreatic Cancer Case-Control Consortium (PanScan/PanC4); Arda, E.; Connelly, K. E.; Amundadottir, L. T.

2026-04-15 genetic and genomic medicine 10.64898/2026.04.13.26350790 medRxiv
Top 4%
0.5%

Pancreatic ductal adenocarcinoma (PDAC) remains one of the deadliest human cancers. The current largest published PDAC Genome-Wide Association Study (GWAS) identified 23 genetic risk signals, but most lack sufficient characterization. This study aimed to functionally characterize the chr13q12.2 (PLUT/PDX1) PDAC GWAS risk locus. Fine-mapping, luciferase reporter assays, and electrophoretic mobility shift assays implicated rs9581943, a PDX1 promoter SNP, as a functional variant underlying this GWAS signal. GTEx expression QTL analyses identified rs9581943 as a significant PDX1 eQTL in pancreas, and CRISPR/Cas9 editing in PDAC-derived cell lines confirmed a functional relationship. PDX1 is a transcription factor involved in early pancreas development and β-cell homeostasis, but its role in exocrine pancreatic cells is unclear. Single-nucleus RNA-seq analyses of pancreatic acinar and ductal cells from neonatal, adult, and chronic pancreatitis donors suggested PDX1 activity alleviates high secretory load and ER stress in acinar cells and biases ductal cells toward homeostatic phenotypes. Similarly, scRNA-seq analyses of pancreatic tumors suggested PDX1 activity reduces biosynthetic and inflammatory stress and promotes epithelial differentiation. Our study therefore implicates rs9581943 as a causal variant for the chr13q12.2 PDAC GWAS signal wherein the risk allele reduces PDX1 expression, eroding PDX1's capacity to buffer stress and stabilize epithelial cell fate in the exocrine compartment.

5
Pleuroparenchymal fibroelastosis in monogenic DGUOK-associated mitochondriopathy

von Hardenberg, S.; Maier, P.; Christian, L.; Das, A. M.; Neubert, L.; Ruwisch, J.; Peters, K.; Schramm, D.; Griese, M.; Skawran, B.; Eilers, M.; Jonigk, D.; Junge, N.; Haghikia, A.; Hegelmaier, T.; Hofmann, W.; Seeliger, B.; Renz, D. M.; Stalke, A.; Hartmayer, L.; Duscha, A.; Schulze, M.; DiDonato, N.; Prokisch, H.; Auber, B.; Knudsen, L.; Schupp, J. C.; Schwerk, N.

2026-04-11 respiratory medicine 10.64898/2026.04.08.26349275 medRxiv
Top 6%
0.3%

Background: Pleuroparenchymal fibroelastosis (PPFE) is a rare fibrotic lung disease with poor prognosis that usually affects adults and is most commonly idiopathic. Biallelic pathogenic variants in DGUOK cause mitochondrial DNA (mtDNA) depletion syndrome, predominantly affecting infants with severe hepatic and neurological symptoms. Detailed descriptions of pulmonary manifestations with late-onset presentation have not been reported. Methods: We describe nine patients with PPFE and DGUOK-associated mitochondriopathy. Clinical, radiological, histopathological, and genetic data were systematically collected from all patients. Functional studies, single nucleus RNA sequencing (snRNAseq), immunofluorescence staining, transmission electron microscopy and respiratory chain enzyme activity assays were conducted on patient-derived fibroblasts, muscle or lung tissues. mtDNA content quantification was performed on whole genome sequencing (WGS) data. Results: All patients (ages 5-36) presented with progressive dyspnea and weight loss, and some with spontaneous pneumothoraces. Chest computed tomography and lung biopsies showed features of PPFE. Biallelic pathogenic DGUOK variants were identified in all patients; seven of them carry an unreported intronic variant leading to mtDNA depletion. snRNAseq of lung tissue from four pediatric patients identified Aberrant Basaloid cells, and intermediate cells as their precursors, localized at the fibrotic edge. Mitochondrial alterations were identified by electron microscopy. Conclusion: PPFE in children and young adults is associated with DGUOK-related mitochondriopathy. For the first time, we demonstrate Aberrant Basaloid cells in pediatric fibrotic lung tissue. Since pulmonary involvement may be underrecognized or misinterpreted and the clinical presentation may not always be typical of a mitochondriopathy, we recommend genetic testing in all patients with PPFE of unknown origin.

6
Ventilator triggering control with an LSTM-Based Model

Liu, J.; Fan, J.; Deng, Z.; Tang, X.; Zhang, H.; Sharma, A.; Li, Q.; Liang, C.; Wang, A. Y.; Liu, L.; Luo, K.; Liu, H.; Qiu, H.

2026-04-11 respiratory medicine 10.64898/2026.04.10.26350573 medRxiv
Top 7%
0.2%

Background: Patient-ventilator synchrony, an essential prerequisite for non-invasive mechanical ventilation, requires accurate matching of every phase of respiration between the patient and the ventilator. Methods: We developed a long short-term memory (LSTM)-based model that can predict the inspiratory and expiratory time of the patient. This model consisted of two hidden layers, each with eight LSTM units, and was trained using a dataset of approximately 27,000 flow signals, each 500 ms long, that captured both inspiratory and expiratory events. Results: The LSTM model achieved 97% accuracy and F1 score in the test data, and the average trigger error was less than 2.20%. In the first trial, 10 volunteers were enrolled. In "Compliance" mode, 78.6% of the triggering by the LSTM model was compatible with neuronal respiration, which was higher than with the Auto-Trak model (74.2%). The Auto-Trak model performed marginally better in the pressure support modes of 5 and 10 cmH2O. Given the success of the first clinical trial, we further tested the models in five patients with acute respiratory distress syndrome (ARDS). The LSTM model exhibited 60.6% of the triggering in the 33%-box, better than the 49.0% of the Auto-Trak model, and the PVI index of the LSTM model was significantly lower than that of the Auto-Trak model (36.5% vs 52.9%). Conclusions: Overall, the LSTM model performed comparably to, or even better than, the Auto-Trak model in both latency and PVI index. While other mathematical models have been developed, our model was effectively embedded in the chip to control ventilator triggering. Trial registration: Approval Number: 2023ZDSYLL348-P01; Approval Date: 28/09/2023. Clinical Trial Registration Number: ChiCTR2500097446; Registration Date: 19/02/2025.

7
Adherence to International Pharmacogenomic Recommendations in Paediatric Cancer Care: A Cohort Analysis Embedded Within the MARVEL-PIC Randomised Trial

Chawla, A.; Carter, S.; Dyas, R.; Williams, E.; Moore, C.; Conyers, R.

2026-04-16 genetic and genomic medicine 10.64898/2026.04.15.26348678 medRxiv
Top 9%
0.2%

Background: Pharmacogenomic testing (PGx) can optimise drug efficacy and minimise toxicity, but the extent of prescriber adherence to PGx recommendations remains unclear. We aimed to quantify clinician adherence to international genotype-guided prescribing recommendations in a cohort of paediatric oncology patients. Methods: We reviewed the files of children enrolled in the MARVEL-PIC (NCT05667766) randomised controlled trial who had PGx recommendations available. Patients were included if 12 weeks had passed since their PGx report was released to clinicians. Prescribing events were identified for actionable PGx recommendations and classified as "explicitly followed", "inadvertently followed", or "not followed". Adherence was assessed by patient, drug, and recommendation. Results: 2,063 PGx recommendations were available for 216 patients. 64 (3.1%) recommendations were actionable for 44 patients and 10 drugs within the 12-week study period. Recommendations were explicitly followed in 57/288 (19.8%) of prescribing events, inadvertently followed in 145 (50.3%), and not followed in 86 (29.9%). Mercaptopurine demonstrated the highest rate of explicit adherence (87.5%). No significant associations were observed between adherence and age group, cancer type, drug type, or strength of recommendation. Conclusion: Adherence to pharmacogenomic recommendations was very low, highlighting the need to understand barriers to PGx implementation and to consider clinical decision support to facilitate adherence.
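
The adherence percentages in this abstract follow directly from the stated counts. A minimal sketch that recomputes them, using only the figures given above (the function and variable names are illustrative and not taken from the study):

```python
# Recompute the adherence percentages from the counts stated in the abstract.
# All names are illustrative; only the numbers come from the abstract.

def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

prescribing_events = {
    "explicitly followed": 57,
    "inadvertently followed": 145,
    "not followed": 86,
}
total_events = sum(prescribing_events.values())  # the 288 events in the abstract

adherence = {label: percent(n, total_events) for label, n in prescribing_events.items()}

# Share of all recommendations that were actionable in the 12-week window.
actionable_share = percent(64, 2063)

for label, pct in adherence.items():
    print(f"{label}: {pct}%")
print(f"actionable recommendations: {actionable_share}%")
```

The three categories sum to the 288 prescribing events, so the reported shares are mutually consistent.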

8
Genetic analysis of female genital tract polyps implicates genome stability, estrogen signalling and shared susceptibility with proliferative gynaecological disorders

Ingold, N.; Frankcombe, S.; Bouttle, K.; Moro, E.; Canson, D.; Zoellner, S.; Patil, S.; Dzigurski, J.; Glubb, D. M.; Laisk, T.; O'Mara, T. A.

2026-04-16 genetic and genomic medicine 10.64898/2026.04.13.26350740 medRxiv
Top 9%
0.2%

Female genital tract (FGT) polyps are common benign growths affecting up to half of all women. However, they carry malignant potential, and their genetic architecture remains poorly defined. We conducted a genome-wide association study (GWAS) meta-analysis across four biobanks (48,400 cases, 477,134 controls), identifying 26 risk loci for FGT polyps, 12 of which were previously unreported. Integrative gene prioritisation highlighted 193 candidate genes, revealing a potential convergent biological mechanism in which germline variation in DNA replication and maintenance (e.g., PRIM1, TERT and HMGA1) compromises genomic stability in the context of hormone-driven proliferation (e.g., ESR1 and GREB1). This susceptibility is further modulated by metabolic drivers of estrogen biosynthesis, underscored by specific adiposity-related loci (e.g., RSPO3 and PLCE1) and the aromatase gene CYP19A1. Mendelian randomisation demonstrated bidirectional causal relationships with endometriosis and fibroids, and endometrial cancer. Leveraging the shared genetic architecture of FGT polyps and other gynaecological disorders via multi-trait analysis revealed an additional 26 loci, validating sub-threshold regions encompassing HMGA1 and GREB1. In total, 52 risk loci were identified (36 novel), 39 of which replicated in an independent cohort. These findings reframe polyps not merely as local gynaecological overgrowths but as manifestations of a systemic proliferative syndrome characterised by dysregulated genome stability and estrogen signalling, which may also impact malignant transformation.

9
Automated Detection of Dental Caries and Bone Loss on Periapical and Bitewing Radiographs using a YOLO Based Deep Learning Model

Alqaderi, H.; Kapadia, U.; Brahmbhatt, Y.; Papathanasiou, A.; Rodgers, D.; Arsenault, P.; Cardarelli, J.; Zavras, A.; Li, H.

2026-04-17 dentistry and oral medicine 10.64898/2026.04.12.26350726 medRxiv
Top 10%
0.1%

Background: Dental caries and periodontal disease represent the most prevalent global oral health conditions, collectively affecting several billion people. The diagnostic interpretation of dental radiographs, a cornerstone of modern dentistry, is associated with considerable inter-observer variability. In routine clinical practice, clinicians are required to evaluate a high volume of radiographic images daily, a cognitively demanding task in which diagnostic fatigue, time constraints, and the inherent complexity of overlapping anatomical structures can lead to the inadvertent oversight of early-stage pathologies. Artificial intelligence (AI) offers a transformative opportunity to augment clinical decision-making by providing rapid, objective, and consistent radiographic analysis, thereby serving as a tireless adjunct capable of flagging findings that may be missed during routine human inspection. Methods: This study developed and validated a deep learning system for the automated detection of dental caries and alveolar bone loss using a dataset of 1,063 periapical and bitewing radiographs. Two separate YOLOv8s object detection models were trained and evaluated using a rigorous 5-fold cross-validation methodology. To align with the clinical use-case of a screening tool where high sensitivity is paramount, a custom image-level evaluation criterion was employed: a true positive was recorded if any predicted bounding box had a Jaccard Index (IoU) > 0 with any ground truth annotation. Model performance was systematically evaluated at confidence thresholds of 0.10 and 0.05. Results: At a confidence threshold of 0.05, the caries detection model achieved a mean precision of 84.41% (±0.72%), recall of 85.97% (±4.72%), and an F1-score of 85.13% (±2.61%). The alveolar bone loss model demonstrated exceptionally high performance, with a mean precision of 95.47% (±0.94%), recall of 98.60% (±0.49%), and an F1-score of 97.00% (±0.46%). Conclusion: The YOLOv8-based models demonstrated high accuracy and high sensitivity for detecting dental caries and alveolar bone loss on periapical radiographs. The system shows significant potential as a reliable automated assistant for dental practitioners, helping to improve diagnostic consistency, reduce the risk of missed pathology, and ultimately enhance the standard of patient care.
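
The image-level criterion described in this abstract (a true positive whenever any predicted box overlaps any ground-truth box at all) can be sketched as follows. This is only an illustration of the stated rule, not the authors' code: the `(x_min, y_min, x_max, y_max)` box format and both function names are assumptions.

```python
# Sketch of the "any overlap" image-level criterion described in the abstract.
# Assumption: boxes are (x_min, y_min, x_max, y_max) in pixel coordinates.
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]

def iou(a: Box, b: Box) -> float:
    """Jaccard index (intersection over union) of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_image_level_true_positive(predicted: Sequence[Box], truth: Sequence[Box]) -> bool:
    """True if any predicted box has IoU > 0 with any ground-truth box."""
    return any(iou(p, t) > 0 for p in predicted for t in truth)

# Hypothetical boxes: the prediction overlaps the annotation only slightly,
# yet still counts as a hit under this lenient criterion.
print(is_image_level_true_positive([(0, 0, 2, 2)], [(1, 1, 3, 3)]))
```

Because any non-zero overlap counts, the rule favours sensitivity over localisation accuracy, which matches the screening use-case the abstract describes.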

10
Clinico-pathologic characteristics, patterns of treatment and outcome of newly diagnosed Waldenstroms Macroglobulinemia- a single center real world retrospective analysis

Gupta, V.; Podder, D.; Saha, S.; Shah, B.; Ghosh, S.; Kumar, J.; Jacoby, A. P.; Nag, A.; Chattopadhyay, D.; Javed, R.; Rath, A.; Chakraborty, S.; Demde, R.; Vinarkar, S.; Parihar, M.; Zameer, L.; Mishra, D.; Chandy, M.; Nair, R.

2026-04-14 hematology 10.64898/2026.04.10.26350611 medRxiv
Top 11%
0.1%

Waldenstrom macroglobulinemia (WM) is a rare indolent neoplasm characterized by the presence of more than 10% lymphoid cells in the bone marrow that exhibit plasmacytoid or plasma cell differentiation and secrete an IgM monoclonal protein. This retrospective analysis of 89 patients with WM describes their clinical and laboratory characteristics, treatment patterns and outcomes. The median age of the entire cohort was 66 years, with male predominance (67.4%). The most common presentations were symptoms pertaining to anemia (77.5%) and constitutional symptoms (33.7%). Median bone marrow lymphoplasmacytic cells were 41%. MYD88 and CXCR4 mutations were detected in 81.8% and 2.4% of cases, respectively. BR was the most common regimen used (52.8%). The overall response rate was 87.8%. Median overall survival, progression-free survival and time to next treatment were 8.49 years, 2.15 years and 3.88 years, respectively. The BR regimen was associated with the highest event-free survival.

11
Colibactin-associated mutations in the human colon appear to reflect anatomy and early exposure, not oncogenesis

Hiatt, L.; Peterson, E. V.; Happ, H. C.; Major-Mincer, J.; Avvaru, A.; Goclowski, C. L.; Garretson, A.; Sasani, T. A.; Hotaling, J. M.; Neklason, D. W.; Uchida, A. M.; Quinlan, A. R.

2026-04-15 genetic and genomic medicine 10.64898/2026.04.13.26350783 medRxiv
Top 13%
0.1%

Colorectal cancer (CRC) is the second leading cause of cancer death globally and the number one cause of cancer death in people under 50 years old. The reasons for the rise of early-onset CRC are unknown, and while anatomically distinct subtypes of CRC have substantial clinical and molecular associations, the etiology of region-specific disease, such as early-onset CRC's enrichment in the distal colon, remains unclear. Understanding regional mutagenesis may identify risk factors for this public health concern and CRC more broadly. To evaluate mutational dynamics across the premalignant colon, we performed whole-genome sequencing of 125 individual colon crypts taken from six standardized regions biopsied during colonoscopy, collected from 11 donors without polyps and 10 with polyps. We observed mutation spectra and accumulation rates consistent with previous whole-organ studies, with greater subclonal mutation capture enabled by experimental design. T>[A,C,G] mutations, which are associated with colibactin genotoxicity from pks+ Escherichia coli, were significantly enriched in the rectum of donors with and without polyps (adjusted p-values < 0.01). Moreover, when comparing findings to crypts from individuals with CRC and sequenced CRC tumors, we observed consistent enrichment of the colibactin-associated mutational signature "ID18" in the rectum in both normal colon crypts and CRC tumors, without significant difference in colibactin-specific single nucleotide variant or insertion-deletion burden in crypts across the three clinical groups (i.e., no polyp, polyp, and CRC). These findings argue against a causal or prognostic role for colibactin in CRC, instead indicating that the proposed association with early-onset disease reflects anatomic specificity rather than cancer-specific clinical relevance.

12
Deriving LD-adjusted GWAS summary statistics through linkage disequilibrium deconvolution

Nouira, A.; Favre Moiron, M.; Tournaire, M.; Verbanck, M.

2026-04-11 genetic and genomic medicine 10.64898/2026.04.10.26350574 medRxiv
Top 13%
0.1%

Genome-wide association studies (GWAS) have identified numerous genetic variants associated with complex traits. However, linkage disequilibrium (LD) confounds these associations, leading to false positives where non-causal variants appear associated because they are correlated with nearby causal variants. This is particularly the case in highly polygenic traits, where the genome can be saturated with causal variants. To address this issue, we propose LDeconv, a method based on truncated singular value decomposition (SVD) that adjusts GWAS summary statistics without requiring individual-level genotype data. This approach accounts for LD structure, isolates causal variants in high-LD regions, and improves the reliability of effect size estimates. We assess its performance through simulations across various LD scenarios, conduct extensive sensitivity analyses, and apply it to real GWAS data from the UK Biobank. Our results demonstrate that LDeconv effectively reduces false discoveries while preserving true associations, offering a robust framework for post-GWAS analysis.
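
The abstract does not give the LDeconv algorithm itself. As a rough illustration of the general idea only, the sketch below assumes the common approximation that marginal effect estimates equal the LD correlation matrix times the joint effects, and inverts that relation with a rank-k truncated SVD pseudoinverse. The function name, the toy LD matrix, the effect sizes, and the choice of k are all hypothetical.

```python
# Illustrative truncated-SVD deconvolution; NOT the authors' LDeconv implementation.
# Assumption: marginal ≈ ld @ joint, where ld is the LD correlation matrix.
import numpy as np

def truncated_svd_deconvolve(ld: np.ndarray, marginal: np.ndarray, k: int) -> np.ndarray:
    """Approximately solve ld @ joint = marginal using the top-k singular components."""
    u, s, vt = np.linalg.svd(ld)
    k = min(k, int(np.sum(s > 1e-12)))  # never invert numerically zero singular values
    return vt[:k].T @ ((u[:, :k].T @ marginal) / s[:k])

# Toy data: three variants, the first two in high LD; only variant 0 is causal.
ld = np.array([[1.0, 0.9, 0.1],
               [0.9, 1.0, 0.1],
               [0.1, 0.1, 1.0]])
joint_true = np.array([0.5, 0.0, 0.0])
marginal = ld @ joint_true  # variant 1 looks associated purely through LD

recovered = truncated_svd_deconvolve(ld, marginal, k=3)
print(np.round(recovered, 6))
```

With all components kept, the toy example recovers the single causal effect. In practice a smaller k trades some bias for stability when the LD matrix is ill-conditioned, and how to choose it is outside what the abstract specifies.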

13
APOE4 Allele Frequencies Show Dramatic Variation Across Indian Populations

Ramdas, S.; Kahali, B.

2026-04-13 genetic and genomic medicine 10.64898/2026.04.09.26350483 medRxiv
Top 15%
0.1%

The APOE ε4 allele is the strongest genetic risk factor for Alzheimer's disease. However, its distribution across Indian populations is poorly characterized. We analyze APOE allele frequencies in 9,524 individuals from 83 distinct populations in the GenomeIndia dataset. ε4 frequencies show large variation across populations within India, ranging from 2.7% to 36.1%, with a median of 11%. Tribal populations have higher ε4 frequencies compared to non-tribal groups, while Tibeto-Burman populations have significantly lower frequencies. One tribal population from the northern coastal highlands has an ε4 frequency of 0.36, with 59% of individuals being carriers. ε4 carrier status correlates significantly with lipid phenotypes including LDL, HDL, total cholesterol, and triglycerides. Collectively, these findings reveal exceptional genetic diversity in Alzheimer's disease risk across India and have important implications for population-specific screening strategies, genetic counseling, and precision medicine approaches to dementia prevention.
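
The reported pairing of an ε4 allele frequency of 0.36 with 59% carriers is what Hardy-Weinberg proportions would predict. The sketch below makes that arithmetic explicit. Hardy-Weinberg equilibrium is an assumption made here for illustration (the abstract does not state it), and the function name is hypothetical.

```python
# Expected carrier fraction from an allele frequency.
# Assumption: genotypes follow Hardy-Weinberg proportions (not stated in the abstract).

def expected_carrier_fraction(allele_freq: float) -> float:
    """Share of individuals with at least one copy of the allele: 1 - (1 - p)^2."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 1.0 - (1.0 - allele_freq) ** 2

carriers = expected_carrier_fraction(0.36)
print(f"{carriers:.1%}")  # prints 59.0%
```

The same formula applied to the study-wide median frequency of 11% gives roughly one carrier in five, which illustrates how steeply carrier rates rise with allele frequency.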

14
A Tale of Two Countries: Comparison of Rectal Cancer Characteristics Between Pakistani Americans and Native Pakistanis

Sherwani, M.; Azhar, M. K.; Khan, S.; Ali, D.; Husain, S.; Khan, A.

2026-04-11 surgery 10.64898/2026.04.07.26350364 medRxiv
Top 15%
0.0%

Introduction: Comparison of rectal cancer characteristics in Pakistani Americans and native Pakistanis remains poorly investigated, as migrant studies have predominantly concentrated on East and Southeast Asian groups. This research aims to compare clinicopathological characteristics between the two groups. We hypothesize that significant differences will exist between these cohorts, mediated by gene-environment interactions. Methods: This was a retrospective cohort study utilizing two multi-institutional databases to identify adult patients with rectal cancer: the National Cancer Database in the U.S. (2018-2022) and the Rectal Cancer Surgery and Epidemiology Study in Pakistan (2020-2021). Non-Hispanic Whites (NHWs) were included as a reference population for comparative analysis. Clinicopathological characteristics were compared using Wilcoxon rank-sum and chi-square tests. Results: A total of 523 Pakistani Americans and 608 native Pakistanis were included in the study. The median age at diagnosis was 57 years in Pakistani Americans (IQR 48-68), 42 years (IQR 33-54) in native Pakistanis and 63 years in NHWs (IQR 54-73) (p < 0.001). Native Pakistanis presented with early-stage disease less often than Pakistani Americans and NHWs (5.3%, 25.1%, and 20.5%, respectively; p < 0.001) and had markedly higher rates of signet cell carcinoma (20.1%, 0.6%, and 0.4%, respectively; p < 0.001) and poorly differentiated tumors (29.0%, 10.4%, and 11.4%, respectively; p < 0.001). Conclusions: This study found that Native Pakistanis with rectal cancer presented at a younger age and with more aggressive tumor characteristics compared to both Pakistani Americans and NHWs. Notably, Pakistani Americans displayed a distinct clinical profile, intermediate between both groups.

15
GPR143, a novel immunohistochemical marker for renal tumors with FLCN/TSC/MTOR-TFE alterations

Li, Q.; Singh, A.; Hu, R.; Huang, W.; Shapiro, D. D.; Abel, E. J.; Zong, Y.

2026-04-13 pathology 10.64898/2026.04.06.26350070 medRxiv
Top 16%
0.0%

Although several ancillary tests are available in a limited number of laboratories, diagnosis of microphthalmia (MiT)/TFE family translocation renal cell carcinoma (tRCC) can be challenging due to diverse and overlapping tumor morphology and the lack of reliable biomarkers. GPNMB has recently been identified as a diagnostic marker for various renal neoplasms with FLCN/TSC/mTOR-TFE alterations. However, the sensitivity and specificity of the GPNMB immunostain are suboptimal, and interpreting the result in ambiguous cases can be difficult. To search for additional biomarkers that could improve screening sensitivity and predict genetic aberrations in the FLCN/TSC/mTOR-TFE pathway in renal tumors, we performed bioinformatic analysis of publicly available cancer databases and found that GPR143, a transmembrane protein regulated by MiT transcription factors, was highly expressed in a subset of renal cell carcinomas (RCCs). In two kidney cancer cohorts from The Cancer Genome Atlas (TCGA), RCCs with high levels of GPR143 expression were enriched for renal neoplasms with FLCN/TSC/mTOR-TFE alterations. Similar to GPNMB labeling, the GPR143 immunostain was positive in the majority of tRCC cases and renal tumors with FLCN/TSC/mTOR alterations, suggesting that GPR143 could function as another surrogate marker for FLCN/TSC/mTOR-TFE alterations in certain renal tumors. Interestingly, despite the concordant GPR143 and GPNMB immunoreactivity in most renal neoplasms with FLCN/TSC/mTOR-TFE alterations, diffuse GPR143 immunostaining was observed in some cases with negative or focal GPNMB labeling. Taken together, our results indicate that GPR143 could serve as a useful adjunct marker to improve the sensitivity of screening renal tumors for FLCN/TSC/mTOR-TFE alterations.

16
Inherited genetic risk factors in young-onset lung cancer

Esai Selvan, M.; Gould Rothberg, B. E.; Patel, A. A.; Sang, J.; Horowitz, A.; Christiani, D. C.; Klein, R. J.; Gumus, Z. H.

2026-04-15 genetic and genomic medicine 10.64898/2026.04.14.26350822 medRxiv
Top 17%
0.0%

Introduction: Lung cancer is rare before age 45, and its inherited genetic basis remains poorly defined. Methods: We performed whole-genome sequencing in 171 predominantly young-onset lung cancer patients and integrated these data with whole-exome sequencing from six major lung cancer consortia, yielding 9,065 patients. After quality control, analyses focused on 6,545 individuals of European ancestry, the largest ancestral group. We compared the prevalence of rare pathogenic and likely pathogenic (P/LP) germline variants between 186 young-onset (age <45 years) and 6,359 older patients at gene and gene-set levels using Fisher's exact test, stratified by histology, sex, and smoking status. Polygenic risk scores (PRS) derived from common variants were also evaluated. Results: Young-onset patients carried a higher burden of rare germline P/LP variants in DNA damage response (DDR) genes (including BRIP1, ERCC6, MSH5), and in cilia-related genes, notably GPR161. At the pathway level, DDR genes were significantly enriched (OR=1.66, p=0.007), with the strongest signal in the Fanconi Anemia pathway and among females (OR=1.96, p=0.01). Enrichment was also observed in inborn errors of immunity pathways, with strongest signals in antibody deficiency and the complement system genes. Young-onset patients additionally exhibited higher lung cancer PRS. Conclusion: Young-onset lung cancer exhibits a distinct germline genetic architecture, characterized by enrichment of rare P/LP variants in DDR, cilia-related, and immune pathways, and an elevated lung cancer PRS. These findings support a greater role for inherited susceptibility in early-onset disease and have implications for risk stratification, earlier screening, and precision prevention.
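
The carrier-prevalence comparison in this abstract uses Fisher's exact test on 2x2 tables. A minimal standard-library sketch of a two-sided test with a sample odds ratio is given below. The function name is hypothetical, and the carrier counts in the usage example are invented for illustration; only the group sizes (186 and 6,359) come from the abstract.

```python
# Two-sided Fisher's exact test for a 2x2 table, standard library only.
# Table layout: [[a, b], [c, d]] = [[young carriers, young non-carriers],
#                                   [older carriers, older non-carriers]]
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int):
    """Return (sample odds ratio, two-sided p-value) for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers in the first row, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)

# HYPOTHETICAL carrier counts (20 and 430); group sizes are from the abstract.
odds_ratio, p_value = fisher_exact_two_sided(20, 186 - 20, 430, 6359 - 430)
print(f"OR={odds_ratio:.2f}, p={p_value:.3g}")
```

An exact test is a natural choice here because the young-onset group is small, so carrier counts per gene or pathway can be too low for a chi-square approximation to be reliable.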

17
Multi-task deep learning integrating pretreatment MRI and whole slide images predicts induction chemotherapy response and survival in locally advanced nasopharyngeal carcinoma

Hou, J.; Yi, X.; Li, C.; Li, J.; Cao, H.; Lu, Q.; Yu, X.

2026-04-11 radiology and imaging 10.64898/2026.04.07.26350350 medRxiv
Top 18%
0.0%

Predicting response to induction chemotherapy (IC) and overall survival (OS) is critical for optimizing treatment in patients with locally advanced nasopharyngeal carcinoma (LANPC). This study aimed to develop and validate a multi-task deep learning model integrating pretreatment MRI and whole slide images (WSIs) to predict IC response and OS in LANPC. Pretreatment MRI and WSIs from 404 patients with LANPC were retrospectively collected to construct a multi-task model (MoEMIL) for the simultaneous prediction of early IC response and OS. MoEMIL employed multi-instance learning to process WSIs, PyRadiomics and a convolutional neural network (ResNet50) to extract MRI features, and fused multimodal features through a multi-gate mixture-of-experts architecture. Clustering-constrained attention multiple instance learning and gradient-weighted class activation mapping were applied for visualization and interpretation. MoEMIL effectively stratified patients into good and poor IC response groups, achieving areas under the curve of 0.917, 0.869, and 0.801 in the training, validation, and test sets, respectively, and outperformed the deep learning radiomics model, the pathomics model, and TNM staging. The model also stratified patients into high- and low-risk OS groups (P < 0.05). MoEMIL shows promise as a decision-support tool for early IC response prediction and prognostication in LANPC. Author Summary: We have developed a deep learning model that integrates two types of medical images, magnetic resonance imaging (MRI) and digital pathology slides, to simultaneously predict response to induction chemotherapy and prognosis in patients with locally advanced nasopharyngeal carcinoma. Current treatment decisions primarily rely on traditional tumor staging (TNM), which often fails to comprehensively reflect the complexity of the disease. Our model, named MoEMIL, was trained and tested on data from 404 patients across two hospitals and consistently outperformed both single-model approaches and TNM staging. By identifying patients who respond poorly to induction chemotherapy or carry higher prognostic risk, our tool can assist clinicians in personalizing treatment, enabling intensified management for high-risk patients and avoiding unnecessary side effects for low-risk patients. Additionally, we visualize the model's reasoning process through heat maps that highlight the image regions exerting the greatest influence on prediction outcomes. This work represents a step toward more precise treatment for nasopharyngeal carcinoma; however, larger-scale prospective studies are required before the model can be integrated into routine clinical practice.

18
Algorithm-Based Model for Gastrointestinal and Liver Histopathological Analysis Using VGG16 and Specialized Stains: Statistical Validation of Thresholds in AI-Driven Digital Pathology

Adeluwoye, A. O.; Gbadegesin, M. O.; James, F. M.; Otegbade, P. S.; Alabetutu, A.

2026-04-11 pathology 10.64898/2026.04.08.26350456 medRxiv
Top 18%
0.0%

Digital pathology, coupled with advanced image recognition algorithms, represents a transformative frontier in histopathological diagnosis. This sub-Saharan African laboratory's exploratory study investigates the application of a Convolutional Neural Network (CNN) model, specifically leveraging the VGG16 architecture with transfer learning, for automated analysis and classification of selected gastrointestinal (GIT) and liver tissue samples, incorporating both routine and specialized staining protocols. The study utilized a dataset comprising 114 samples (18 liver, 96 GIT images) derived from archival formalin-fixed paraffin-embedded tissue blocks at University College Hospital, Ibadan, Nigeria. Specialized staining techniques included Alcian Yellow for GIT mucin visualization and Masson's Trichrome for liver fibrosis assessment, alongside conventional H&E staining. Model performance was evaluated using statistical methodologies including Wilson Score confidence intervals (CI), Bayesian probability assessment, and effect size analysis. Results reveal a striking dichotomy in model performance. The GIT tissue model achieved perfect classification accuracy (100% test accuracy) with exceptional statistical significance (Z=10.0, p<0.0001), Wilson CI [96.29%, 99.99%], Cohen's h=1.571, and Bayesian probability >99.99%. Conversely, the liver tissue model demonstrated diagnostic failure (42.86% test accuracy), with Z=-1.428, p=0.9236, Wilson CI [33.59%, 52.65%], Cohen's h=-0.144, and Bayesian probability of 7.64%. This performance divergence correlates with training data availability, as the liver dataset fell far below empirically established thresholds (>100-200 samples) for reliable classification. The liver model's failure reveals limitations in transfer learning with insufficient data. These findings have critical implications for AI-enhanced digital pathology: they indicate that the GIT model is a promising candidate for deployment and support tissue-specific model development.
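The Wilson score interval and Cohen's h quoted in this abstract can be sketched with the Python standard library alone. This is an illustration of the two standard formulas, not the authors' code. The function names are hypothetical, and the evaluation size `n = 100` is an assumption: the abstract does not state how many test predictions each accuracy was computed from. The reference proportion of 0.5 (chance level) is likewise assumed.

```python
from math import asin, sqrt
from statistics import NormalDist


def wilson_interval(p_hat, n, confidence=0.95):
    """Wilson score interval for an observed proportion p_hat over n trials."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


N_ASSUMED = 100  # assumption: evaluation size is not given in the abstract
CHANCE = 0.5     # assumption: accuracies are compared against chance level

for label, accuracy in [("GIT", 1.0), ("liver", 0.4286)]:
    low, high = wilson_interval(accuracy, N_ASSUMED)
    h = cohens_h(accuracy, CHANCE)
    print(f"{label}: Wilson 95% CI [{low:.2%}, {high:.2%}], Cohen's h = {h:.3f}")
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and keeps a non-zero width when the observed accuracy is exactly 100%, which is why it is a common choice for small test sets.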

19
Dissecting PON1 Genotype Combinations Modulating Paraoxonase Activity and Risk of Dysglycemia and Liver Fibrosis

Herrera, L.; Meneses, M. J.; Ribeiro, R. T.; Gardete-Correia, L.; Raposo, J. F.; Boavida, J. M.; Penha-Goncalves, C.; Macedo, M. P.

2026-04-13 endocrinology 10.64898/2026.04.09.26350501 medRxiv
Top 18%
0.0%

Background & Aims: Metabolic disorders such as dyslipidemia, metabolic dysfunction-associated steatotic liver disease (MASLD), and diabetes are promoted by chronic pro-inflammatory and pro-oxidative states. Paraoxonase 1 (PON1), a liver-derived HDL-associated enzyme, plays an important antioxidant role by hydrolyzing oxidized lipids and protecting against oxidative stress-induced damage. Genetic variation in PON1, particularly in promoter and coding regions, modulates enzyme expression and activity, thereby influencing susceptibility to metabolic and cardiovascular diseases. This study investigated the genetic determinants of serum paraoxonase (PONase) activity and their relationship with dysmetabolic phenotypes. Methods: A genome-wide association study was conducted in 922 Portuguese individuals from the PREVADIAB2 cohort. Genetic variants and haplotypes related to PONase activity were analyzed, and associations with dysglycemia and liver fibrosis were evaluated in individuals aged over 55 years. Results: We identified two key PON1 variants as determinants of PONase activity: rs2057681 (in strong linkage disequilibrium with the non-synonymous Q192R variant) and rs854572 (located in the promoter region). Analysis of rs854572-rs2057681 haplotypes revealed that specific combinations differentially modulate PONase activity and confer risk or protection for dysglycemia and liver fibrosis, depending on the rs2057681 genotype context. Notably, although PONase activity was strongly associated with PON1 variants, it did not directly correlate with dysmetabolic phenotypes, suggesting that genetic context and haplotype structure, rather than enzyme activity alone, shape disease susceptibility. Conclusions: These findings highlight the complex genetic architecture of PON1 and its role in metabolic disease risk, supporting the use of PON1 genetic information to uncover predisposition to dysmetabolic conditions. Our results provide insights into the interplay between PON1 genetics, enzyme function, and dysmetabolism, with implications for risk stratification in metabolic liver disease. Lay Summary: PON1 is a liver-derived gene that encodes an enzyme involved in protection against oxidative stress, a key contributor to metabolic liver disease and diabetes. In this study, we found that specific combinations of PON1 genetic variants are associated with abnormalities in blood glucose regulation and with markers of liver fibrosis. These associations were dependent on genetic configuration rather than enzyme activity alone, suggesting that PON1 genetic information may help identify individuals at higher risk of metabolic liver disease.

20
A multidomain intrinsic capacity score tracks longitudinal health trajectories in the UK Biobank

Zhai, T.; Babu, M.; Fuentealba, M.; Al Dajani, S.; Gladyshev, V. N.; Furman, D.; Snyder, M.

2026-04-13 epidemiology 10.64898/2026.04.10.26350621 medRxiv
Top 20%
0.0%

Quantitative measures for tracking functional health have generally been lacking. Intrinsic capacity (IC) has been proposed as an appropriate measure, but its metrics have been derived from small datasets with sparse longitudinal data. Using harmonized measures of cognition, locomotion, sensory function, vitality, and psychological well-being from 501,615 UK Biobank participants followed for a median of 15.5 years, we derived domain-specific and composite IC scores. We examined associations with incident disease, cause-specific mortality, multimorbidity, lifestyle and socioeconomic factors, and multi-omic profiles from Olink proteomics, NMR metabolomics, clinical biochemistry, and blood-cell traits. We found that composite IC declined non-linearly with age, and within-person decline was steeper than cross-sectional age differences suggested. Participants with greater baseline morbidity, those who subsequently developed incident disease, and those who died earlier in follow-up showed lower IC trajectories across adulthood. The IC domains were only modestly correlated with one another, supporting multidimensionality, yet higher overall IC was associated with lower risk of most diseases examined. The dominant IC domain varied by endpoint, with cognition informative for dementia, sensory function for hearing loss, psychological capacity for depression, locomotion for osteoarthritis, and vitality for cardiometabolic outcomes. IC was also associated cross-sectionally with physical activity, insomnia, smoking, medication burden, and socioeconomic disadvantage. More proteins were found to be predictive of vitality, and enrichment converged on immune/inflammatory and metabolic pathways. Blood-based surrogates recapitulated part of the phenotypic signal, particularly for vitality. Overall, this IC framework captures longitudinal health trajectories and broad disease vulnerability in a large middle- to older-aged cohort, supports IC as a clinically meaningful, multidomain phenotype of aging, and identifies blood-based correlates that may facilitate future at-scale monitoring of aging-related functional decline.