
Metabolites

MDPI AG

Preprints posted in the last 7 days, ranked by how well they match Metabolites's content profile, based on 50 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.

1
Protocol for constructing correlation-based molecular networks from large-scale untargeted metabolomics data

Lin, H.; Zhang, L.; Lotfi, A.; Jarmusch, A.; Lee, I.; Kim, A.; Morton, J.; Aksenov, A. A.

2026-04-21 bioinformatics 10.1101/2025.04.26.649581 medRxiv
Top 0.1%
15.0%

This protocol describes a computational approach for constructing correlation-based molecular networks from untargeted metabolomics data using MetVAE, a variational autoencoder-based framework. Complementing spectral similarity networks, it captures functional relationships reflected in cross-sample correlations. The workflow imports metabolomics features and sample metadata, adjusts for compositionality, missingness, confounding, and high-dimensionality, estimates sparse metabolite correlations, and exports GraphML files for network visualization. In a hepatocellular carcinoma mouse model, it links lipid classes in high-fat-diet animals, suggesting an endogenous "auto-brewery" route to lipotoxic metabolites.

2
Metabolomic Profiling of Dried Blood Spots for Breast Cancer Detection: A Multi-Classifier Validation Study in 2,734 Participants

Anctil, N.; Hauguel, P.; Noel, L.-P.

2026-04-27 oncology 10.64898/2026.04.24.26351695 medRxiv
Top 0.1%
6.6%

Background. Breast cancer (BC) remains the most diagnosed malignancy and leading cancer-related cause of mortality in women worldwide. Although blood-based untargeted metabolomics has emerged as a promising modality for detecting early-stage BC, the clinical translation of this approach has been bottlenecked by two unresolved issues: (i) the field has almost exclusively relied on serum or plasma, which require venipuncture and cold-chain logistics, and (ii) machine-learning models reported on such data are frequently validated with protocols that are blind to analytical batch structure, producing optimistically biased performance estimates. Methods. We present a breast cancer detection study based on dried blood spots (DBS), an analytical matrix that enables self-collection and ambient-temperature shipping. A cohort of 2,734 participants (114 biopsy-confirmed BC cases; 2,620 non-cancer controls) was profiled by untargeted LC-MS/MS on a Thermo Scientific Orbitrap IQ-X coupled to a Vanquish UHPLC. A 39-metabolite panel meeting MSI Level 1 identification criteria was pre-specified a priori from the published breast-cancer metabolomics literature, frozen prior to LC-MS acquisition, and applied to the present cohort without any feature selection on the data. Six standard supervised-learning architectures (LASSO, Elastic Net, Linear SVM, PLS-DA, OPLS-DA, XGBoost) were evaluated on this pre-specified panel; OPLS-DA is reported only in the sex-matched subgroup analysis where a single-seed 5-fold stratified protocol permits a directly comparable fit. Per-batch control-median normalization is applied upstream; kNN imputation, log transform, and robust scaling are fit within each training fold. 
The evaluation battery comprises batch-aware StratifiedGroupKFold CV at single-seed (seed=42) with inter-seed SD quantified across 10 independent seeds, batch-aware nested CV, a 100-seed held-out 20%-batch validation with disjoint-batch isotonic probability calibration (30% calibration partition), PPV/NPV reporting at multiple operating points and three deployment prevalences, subgroup analyses by TNM stage and tumor grade, pathway-ablation sensitivity analysis, and a 1,000-iteration permutation test. Results. Under batch-aware evaluation (StratifiedGroupKFold, single-seed=42), AUC ranged from 0.914 to 0.949 across classifiers, with LASSO achieving 0.928 and XGBoost 0.949; inter-seed SD across 10 seeds was 0.002-0.006. At 95% specificity, LASSO reached 75.4% sensitivity and XGBoost 81.6%. Held-out batch validation (100 seeds) yielded mean AUC 0.912 for Elastic Net and 0.935 for XGBoost, confirming robust generalization. All 39 panel features showed high coefficient stability, and permutation testing on representative classifiers (LASSO, Linear SVM, PLS-DA) yielded p <= 0.001. Subgroup analyses showed weaker detection of stage IIA tumors (AUC 0.87, n=40) compared with stage IIB/IIIA (AUC 0.95), consistent with stronger metabolic signatures in more advanced disease. Bootstrap coefficient consistency of the Elastic Net classifier confirmed that all 39 panel features received a non-zero multivariate weight in >=80% of 100 stratified bootstraps. Conclusions. On this cohort of diagnosed, pre-treatment breast-cancer cases, DBS LC-MS metabolomic profiling delivers classification performance (AUC 0.928 for LASSO and 0.949 for XGBoost under batch-aware GroupKFold CV at single-seed=42; held-out AUC 0.912-0.935) that is robust across classifier families and biological pathways. The DBS matrix is non-radiating, self-collectable by finger-prick, and mailable at ambient temperature. 
Performance is weaker on stage IIA than on more advanced disease, and prospective validation in an independent asymptomatic screening cohort is required before clinical positioning as a decentralized triage modality.
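
As a rough illustration of the batch-aware splitting described in the abstract above, the sketch below assigns whole analytical batches to folds so that no batch contributes samples to both the training and the test side of a split. It is not the authors' pipeline: the function name `group_kfold_indices` and the toy batch labels are hypothetical, and the sketch omits the class-label stratification that a stratified group k-fold adds on top.

```python
from collections import defaultdict


def group_kfold_indices(groups, n_splits):
    """Yield (train_idx, test_idx) pairs in which every group (batch)
    appears on only one side of each split."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda k: (-len(by_group[k]), str(k))):
        smallest = min(range(n_splits), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_group[g])

    for f in range(n_splits):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for j in range(n_splits) if j != f for i in folds[j])
        yield train_idx, test_idx


# Hypothetical batch labels, one per sample (illustration only).
batches = ["b1", "b1", "b2", "b2", "b2", "b3", "b4", "b4"]
splits = list(group_kfold_indices(batches, n_splits=2))
```

Any preprocessing that learns from the data (imputation, scaling) would then be fit on `train_idx` only, which is the point of fitting those steps within each training fold.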

3
A Translational LC-MS/MS Framework for Lipid Biomarker Identification and Quantification in Human Plasma

David, M.; Adam, K.-P.; Li, D.; Lim, X. Y.; Hurrell, J. G. R.; Preston, S.; Peake, D. A.; Batarseh, A.

2026-04-21 biochemistry 10.64898/2026.04.16.718601 medRxiv
Top 0.1%
5.0%

Lipid metabolism is increasingly recognized as a hallmark of cancer, yet translating lipidomic discoveries into clinically actionable biomarkers remains constrained by analytical variability and limited standardized validation frameworks. This challenge is further compounded by a chicken-or-egg problem, where expensive standards and labelled internal standards are required to identify and quantitate target lipids, but the diagnostic importance of these targets is uncertain until they can be reliably measured. Previous work had indicated the potential of 48 lipid biomarker species for the prediction of breast cancer from plasma samples using high resolution liquid chromatography mass spectrometry. This study aimed to identify each of these 48 species and develop a quantitative method to determine the absolute concentrations of these lipids in plasma to provide the basis for the development of a clinical assay for use in breast cancer detection. In doing so, we present a pragmatic workflow that bridges lipid discovery with lipid identification and robust quantitative analysis. A curated library of 48 lipid species was established using authentic standards to verify plasma lipids through retention-time matching and high-resolution spectral comparison. In plasma, 41 lipids were confidently identified based on co-elution with standards and diagnostic fragment ions. Method qualification, including assessment of accuracy, precision, recovery, and linearity, was performed across all 48 lipids in parallel with identification, and 46 lipids ultimately met all predefined qualification criteria. Notably, practical constraints, including time, cost, and availability of authentic standards, necessitated performing identification and targeted method development in parallel, highlighting challenges inherent to translating lipidomics into commercial or clinical assays. 
This workflow provides a reproducible framework for harmonizing lipid identification and quantification, enabling the reliable integration of lipidomic data into biomarker discovery and clinical applications.

4
CGM glycemic persistence reflects OGTT dysglycemia

Zhang, R.

2026-04-23 endocrinology 10.64898/2026.04.22.26351476 medRxiv
Top 0.2%
4.1%

Aims: The oral glucose tolerance test (OGTT) is effective for detecting post-load dysglycemia, but it is burdensome and therefore not routinely used. Continuous glucose monitoring (CGM) offers a convenient way to capture real-world glucose patterns, yet it remains unclear whether CGM-derived metrics reflect OGTT-defined dysglycemia. We therefore aimed to evaluate CGM-derived and clinical metrics for predicting OGTT 2-hour glucose, classifying OGTT-defined dysglycemia, and assessing day-to-day repeatability. Methods: We analyzed a cohort with paired free-living CGM and OGTT. Multiple CGM-derived metrics and clinical measures were compared for prediction of OGTT 2-hour glucose, classification of OGTT-defined dysglycemia, and day-to-day stability. Predictive performance was assessed primarily by leave-one-out (LOO) R^2, and day-to-day repeatability by intraclass correlation coefficients (ICC). Results: The glycemic persistence index (GPI), a metric integrating the magnitude and duration of glycemic elevation, was the strongest single predictor of OGTT 2-hour glucose (LOO R^2 = 0.439). GPI also showed strong day-to-day repeatability (ICC = 0.665) and ranked first on a combined prediction-stability score. For classification of OGTT-defined dysglycemia, HbA1c had a slightly higher AUC than GPI, but GPI plus HbA1c performed best overall, indicating complementary information. Conclusions: GPI was a strong predictor of OGTT 2-hour glucose and showed a favorable balance between predictive performance and day-to-day stability, supporting its potential utility as a CGM-derived marker of dysglycemia.

5
GC-MS Profiling of Compounds produced by endophytic fungi ex-situ and from their host plants, Azadirachta indica and Melia azedarach collected in Kenya, Africa

Dill, R.; Amakhobe, T.; Oballa, G.; Ojenge, G.; Adibe, F.; Peng, J.; Okoth, S.; Osano, A.

2026-04-21 plant biology 10.64898/2026.04.16.719096 medRxiv
Top 0.2%
3.7%

Endophytic fungi residing within medicinal plants are emerging as prolific sources of structurally diverse bioactive secondary metabolites with applications in drug discovery. Azadirachta indica (Neem) and Melia azedarach (Melia), members of the Meliaceae family, are renowned for their rich phytochemical composition; however, the contribution of their endophytic fungi communities to this chemical diversity remains largely unexplored. Herein, endophytic fungi were isolated from leaves and bark of Neem and Melia collected in Kenya and cultured under distinct physical conditions, solid (plates) and liquid (broth) media to assess how culture environment influences compound production. Compounds were extracted and analyzed using gas chromatography-mass spectrometry (GCMS) to profile the chemical diversity associated with each endophytic fungi, physical culturing state and host plant. GCMS analysis revealed that while the host plant identity influences the presence of specific compounds, the dominant determinant of chemical diversity was intrinsic biosynthetic capacity of the endophytic fungi themselves. Several compounds were unique to endophytic fungi cultures, highlighting their role as independent sources of bioactive compounds. Culture conditions moderately influence metabolite profiles, demonstrating the importance of optimizing growth environments in experimental design and natural product bioprospecting. From the Neem samples, we found 53 compounds uniquely present in the broth samples (consisting of Neem powder and endophytic fungi), 22 found exclusively with the endophytic fungi from the Neem, and 31 compounds shared between the broth and the endophytic fungi samples. In Melia samples, 109 compounds were uniquely present in broth samples from Melia plant (consisting of Melia powder and endophytic fungi), 22 compounds were found exclusively with the endophytic fungi from the Melia, and 55 were shared between the broth and the endophytic fungi samples. 
Our comparative analysis assessed the Neem and Melia endophytic fungi exclusive samples and reported 12 shared compounds. 10 compounds were unique to Neem and 10 unique to Melia; however, their identities varied between the two categories. While GCMS enabled the identification of volatile and semi-volatile metabolites, future studies employing complementary metabolomic approaches, such as liquid chromatography-mass spectrometry (LCMS), ultra-high-performance liquid chromatography MS/MS (UHPLC MS/MS), or nuclear magnetic resonance (NMR) spectroscopy, would expand coverage to non-volatile, polar, and high molecular weight compounds, providing a more comprehensive understanding of endophyte-derived chemical diversity. These findings provide insights into the interplay between medicinal plants and their endophytes and establish a foundation for leveraging endophytic fungi from Neem and Melia as scalable sources of structurally complex natural products for pharmaceutical and biotechnological applications while minimizing ecological impact.

6
Systematic mass-spectrometry-guided metabolic fingerprinting elucidates diversity of specialized metabolites across the Brassicaceae

Wolters, F. C.; Woldu Semere, T.; Schranz, M. E.; Medema, M. H.; Bouwmeester, K.; van der Hooft, J. J. J.

2026-04-21 plant biology 10.64898/2026.04.17.719190 medRxiv
Top 0.2%
3.6%

Plants produce diverse bouquets of specialized metabolites (SMs), yet only a fraction of the vast phytochemical space has been explored to date. Comparative analysis of SM profiles can reveal hotspots of biochemical novelty, while systematic profiling across taxonomic levels does not presently cover large plant families. To study core and accessory SM profiles in the Brassicaceae plant family, we fingerprinted 14 species by Liquid-Chromatography Mass-Spectrometry (LCMS/MS). We develop standardized experimental and computational workflows integrating in silico annotation tools to study consensus compound class and substructure distributions of SMs. Furthermore, we investigate the congruence of chemotaxonomy and species phylogeny across an extended panel of 17 species. Unique metabolite profiles were outstanding in Camelina sativa, Capsella rubella, and B. vulgaris, with the largest unique terpenoid profile annotated in C. sativa, accounting for 33.5% and 55.6% in positive and negative ionization mode, respectively. Substructure motifs were found to overlap with compound class predictions, highlighted for triterpenoids in Camelinodae. Furthermore, dual-tissue chemotaxonomic clustering resembled relationships of Brassica subgenomes across tissues. We anticipate that our systematic approach can serve as a blueprint for investigating biochemical diversity in other plant lineages and can boost the characterization of plant natural product pathways.

7
Role of Alanine Transaminase in Retinal Metabolic Homeostasis: Potential therapeutic target in retinal diseases

Chen, Q.; Zhang, T.; Zeng, J.; Yam, M.; Lee, S.; Zhou, F.; Zhu, M.; Zhang, M.; Lu, F.; Du, J.; Gillies, M.; Zhu, L.

2026-04-22 neuroscience 10.64898/2026.04.19.719493 medRxiv
Top 0.3%
2.8%

Purpose: Alanine transaminase (ALT), encoded by the GPT gene, catalyzes the reversible conversion of pyruvate and glutamate to alanine and alpha-ketoglutarate, thereby linking carbohydrate and amino acid metabolism. However, its role in the human neural retina remains unclear. This study aimed to explore the expression, localization, and metabolic function of ALT in the human neural retina and its potential involvement in retinal diseases. Methods: ALT1 and ALT2 expression and localization were examined in the retinas of healthy and diabetic retinopathy (DR) donors via immunoblotting and immunofluorescence. ALT function was assessed in ex vivo human retinal explants using pharmacological inhibition with beta-chloro-L-alanine (BCLA), followed by analyses of enzyme activity, tissue injury, and transcriptomic responses. Stable-isotope tracing with 13C- and 15N-labelled substrates combined with GC-MS was used to define ALT-dependent carbon and nitrogen fluxes in macular and peripheral retinas. Redox level (NADPH/NADP+) was also evaluated under tert-butyl hydroperoxide-induced oxidative stress. Results: ALT1 and ALT2 were both expressed in the human neural retina, with prominent localization in Muller glia and photoreceptor inner segments. ALT1 displayed a diffuse cytoplasmic distribution, whereas ALT2 demonstrated a punctate pattern consistent with mitochondrial localization. In DR retinas, ALT1 expression was spatially disorganized and heterogeneous, while ALT2 remained comparatively preserved. Inhibition of ALT with BCLA markedly reduced ALT activity without causing overt cytotoxicity or major transcriptional changes. Isotope tracing demonstrated that retinal ALT predominantly channels pyruvate-derived carbon into alanine, whereas alanine contributed minimally to pyruvate production under basal conditions. 
ALT inhibition suppressed alanine synthesis and release, redirected nitrogen flux towards glutamate, glutamine, and aspartate, and uncovered distinct metabolic adaptations in macular but not peripheral retinas. Under oxidative stress, ALT inhibition decreased the NADP+/NADPH ratio and LDH release, indicating improved redox balance and reduced tissue injury. Conclusions: ALT is a previously unrecognized regulator of carbon and nitrogen partitioning in the human neural retina, contributing to redox homeostasis under stress. The altered distribution of ALT1 in DR retina and the protective metabolic effects of ALT inhibition suggest ALT as a potential contributor to retinal metabolic vulnerability and a candidate therapeutic target in retinal diseases.

8
Metabolic fingerprinting of 17 Brassicaceae species across three tissues

Wolters, F. C.; Woldu Semere, T.; Schranz, M. E.; Medema, M. H.; Bouwmeester, K.; van der Hooft, J. J. J.

2026-04-21 plant biology 10.64898/2026.04.17.719198 medRxiv
Top 0.3%
2.5%

Plants produce the most diverse blends of specialized metabolites on earth. Natural products derived from plants are valuable resources for drug development, food chemistry, and crop resistance breeding. Phenotypes of specialized metabolite profiles can be captured by untargeted mass-spectrometry across species phylogeny, tissues, and genotypes. Here, we collected metabolic fingerprints of 17 Brassicaceae species across three tissues (paired leaf and root; flower) using liquid chromatography-tandem mass spectrometry (LC-MS/MS) in positive and negative ionization mode. Corresponding metadata has been refined for reuse according to ReDU guidelines, and for integration with public genomic and transcriptomic data. Standardization of in vitro growth conditions, and data processing workflows enables integration of acquired raw and processed data across platforms for single- and multi-omics analysis. Further, the inclusion of tissue-specific metabolic profiles across ploidy levels, as well as across crop species and wild relatives, makes this dataset a valuable resource for natural product discovery.

9
Non-invasive glucose monitoring vs iCGM: a systematic review and meta-analysis of accuracy and methodological challenges

Zhang, H.; Dromard, E.; Tsang, K. C. H.; Guemes, A.; Guo, Z.; Baldeweg, S. E.; Li, K.

2026-04-27 endocrinology 10.64898/2026.04.24.26351680 medRxiv
Top 0.5%
1.7%

Non-invasive glucose monitoring (NIGM) has been pursued for decades, yet no device has achieved regulatory approval despite numerous studies reporting high accuracy. This systematic review and meta-analysis of 32 studies (38 cohorts: 20 NIGM, 18 iCGM; N = 1,693) investigated methodological factors underlying this accuracy-regulatory gap. The pooled Mean Absolute Relative Difference (MARD) for NIGM (10.21%; 95% CI: 8.73-11.69%) showed no significant difference from iCGM (11.82%; 95% CI: 10.36-13.29%; p = 0.13), with extreme heterogeneity (I^2 = 95.2%). Meta-regression revealed that study duration was the strongest predictor of NIGM accuracy (β = 3.94, p < 0.001), with MARD degrading from 8.7% in short-term to 15.2% in long-term studies, while iCGM accuracy remained stable. Only 15% of NIGM cohorts validated in the hypoglycemia range, compared to 89% of iCGM studies (p < 0.001). These findings suggest that reported NIGM accuracy is substantially influenced by methodological asymmetries.
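
For readers unfamiliar with the accuracy metric pooled in this abstract, MARD is conventionally the mean of |device reading - reference| / reference over paired measurements, expressed as a percentage. The sketch below computes it for a handful of made-up paired readings; the function name `mard_percent` and the example values are illustrative assumptions and are not taken from the review.

```python
def mard_percent(device, reference):
    """Mean Absolute Relative Difference, in percent, for paired readings."""
    if len(device) != len(reference) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    if any(r <= 0 for r in reference):
        raise ValueError("reference glucose values must be positive")
    rel_errors = [abs(d - r) / r for d, r in zip(device, reference)]
    return 100.0 * sum(rel_errors) / len(rel_errors)


# Made-up paired glucose readings in mg/dL (illustration only).
reference = [100.0, 200.0, 50.0]
device = [110.0, 180.0, 55.0]
mard = mard_percent(device, reference)  # each pair is off by 10%, so MARD is about 10%
```

Because each error is divided by the reference value, the same absolute error weighs more heavily at low glucose, which is one reason coverage of the hypoglycemia range matters when MARD values are compared across studies.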

10
Estimating protein isoform abundances with PAQu

Testa, L.; Klei, L.; Rengle, A.; Yocum, A.; Lewis, D. A.; Devlin, B.; Roeder, K.; MacDonald, M. L.

2026-04-22 genomics 10.64898/2026.04.20.719668 medRxiv
Top 0.5%
1.7%

A single gene can encode multiple versions of a protein, dubbed isoforms, with varying functionality. Cellular control of isoform abundances is critical for multiple aspects of biology and is only partially regulated by transcript levels. While long-read sequencing facilitates transcript quantification, quantifying the resulting protein isoforms on a large scale is a major challenge, complicating biological interpretation of transcript alterations. Standard "bottom up" mass spectrometry can assess only short portions of isoforms called peptides, and these peptides often map onto more than one isoform. We introduce PAQu, a novel Bayesian method that leverages multiomic information from the peptidome and transcriptome to provide accurate estimates of isoform abundance even when peptide mapping is ambiguous. PAQu offers several advantages over existing methods in a unified framework. It provides uncertainty quantification, integrates multiomic information for improved accuracy, and provides a rigorous framework for hypothesis testing. Extensive simulations show that PAQu consistently outperforms competing methods in detecting differentially expressed protein isoforms and estimating their abundances. We use PAQu to investigate differences in isoform abundance levels between people with schizophrenia and control subjects, confirming a long held hypothesis that levels of the C4A isoform of Complement Component 4 are increased in schizophrenia while C4B is not. These results demonstrate that PAQu can identify significant variations in isoform abundance levels not previously possible.

11
A Cross-Cohort Validated Plasma Lipid Biomarker Assay for Early Breast Cancer Detection Using Machine Learning

Huang, T.; Koch, F. C.; Peake, D. A.; Adam, K.-P.; David, M.; Li, D.; Heffernan, K.; Lim, A.; Hurrell, J. G.; Preston, S.; Baterseh, A.; Vafaee, F.

2026-04-23 oncology 10.64898/2026.04.23.26351564 medRxiv
Top 0.5%
1.5%

Early detection of breast cancer remains essential for improving clinical outcomes, and complementary non-invasive approaches are needed to support existing screening methods, particularly for women with dense breast tissue. We have previously reported plasma lipid biomarker discovery using untargeted high-resolution liquid chromatography tandem mass spectrometry (LC-MS/MS). In this study, we performed biomarker confirmation and developed machine-learning models applied to targeted plasma lipid measurements for the non-invasive detection of early-stage breast cancer across international cohorts with independent external validation. Targeted LC-MS/MS was used to quantify candidate lipid panels in plasma samples from European discovery cohorts (n = 554) and an independent Australian cohort (n = 266) used for external validation. Data-driven feature selection identified a 15-lipid panel with strong performance in European cohorts (AUC >= 0.94). External validation prior to confidence stratification yielded 76% sensitivity, 64% specificity, and an AUC of 0.81 in the Australian validation cohort. Clinical assay development requires iterative panel and model testing to support translational feasibility and performance in the intended-use population. An analytically viable panel, excluding lipids requiring complex and costly synthesis, achieved comparable accuracy with improved assay robustness. Confidence-based analysis showed enhanced performance for predictions made with moderate to high confidence, with sensitivity up to 89% and AUC up to 0.85, suggesting that ongoing research should focus on strategies to enhance diagnostic model confidence. Importantly, model predictions were independent of breast density, tumour size, grade, subtype, and morphology, indicating biological specificity of the lipid signature. These results demonstrate that calibrated machine-learning models applied to plasma lipid biomarkers can support non-invasive breast cancer detection. 
Expanding training datasets to include greater diversity will further improve performance in the ongoing development of this lipid-based detection approach.

12
Methodological and Clinical Validation of TholdStormDX v0.0.1: An Advanced Stochastic Engine for the Optimization of Thresholds and Multimarker Panels Applied to Oncology

Reinosa, R.

2026-04-27 oncology 10.64898/2026.04.24.26351692 medRxiv
Top 0.9%
0.9%

Introduction: The translation of biomarkers into binary clinical decisions requires the determination of precise cut-off points. This study validates the TholdStormDX v0.0.1 tool, a mathematical engine that employs Dual Annealing, 2- and 4-parameter logistic fitting, and vectorized Monte Carlo simulations for panel optimization under Boolean OR logic. Methods: The tool was evaluated using datasets from four diagnostic domains (Pulmonary Nodules, Hepatocellular Carcinoma [HCC], Cervical Cancer, and Breast Cancer), along with a prognosis-oriented analytical context (Breast Cancer). Validation followed a strict workflow: characterization and selection of the best individual and combined thresholds in the Training (Train) and Validation (Val) sets, using the Test set in a completely independent manner, solely to assess the model's performance and generalizability. Results: The tool enabled precise derivation of cut-off points for both individual biomarkers and multivariable combinations. Evaluation on the Test set objectively demonstrated in which scenarios a single biomarker outperforms a complex panel, promoting clinical parsimony. For example, in Breast Cancer diagnosis, an individual predictor outperformed the optimized panel (Sensitivity: 0.953 / Specificity: 0.952 in Test); conversely, in Hepatocellular Carcinoma, the multivariable combination showed superior performance compared to the single marker (Sens: 0.707 / Spe: 0.718 in Test). Additionally, the self-auditing system effectively flagged metric degradation when noisy variables were included, preventing potential issues. Conclusion: TholdStormDX v0.0.1 proves to be a robust and transparent bioinformatics platform for deriving clinical thresholds. Its main contribution lies in mitigating local minima and promoting clinical parsimony, enabling researchers to objectively identify when a single biomarker is sufficient and when a panel provides real added value. 
Furthermore, it transforms the problem of biological noise into a safety feature: by systematically warning about algorithmic instability, it prevents overfitting and ensures the clinical viability of medical decisions. Availability: The software is free and distributed under the GNU GPLv3 license. TholdStormDX v0.0.1 is written in Python, and its source code is available at the following GitHub address: https://github.com/roberto117343/TholdStormDX.
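
To make the "Boolean OR logic" mentioned in this abstract concrete, the sketch below calls a sample positive when any marker in a panel reaches its cut-off, then reports sensitivity and specificity on a labelled set. It is not TholdStormDX code and does not perform any threshold optimization: the function names, marker names, cut-offs, and data are all hypothetical, and it assumes that higher marker values indicate disease.

```python
def or_panel_predict(sample, thresholds):
    """Positive if ANY marker meets or exceeds its cut-off (Boolean OR)."""
    return any(sample[marker] >= cut for marker, cut in thresholds.items())


def sensitivity_specificity(samples, labels, thresholds):
    """Return (sensitivity, specificity) of the OR panel on labelled data."""
    tp = fn = tn = fp = 0
    for sample, is_case in zip(samples, labels):
        predicted = or_panel_predict(sample, thresholds)
        if is_case:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cut-offs and toy data (illustration only).
thresholds = {"marker_a": 5.0, "marker_b": 2.0}
samples = [
    {"marker_a": 6.0, "marker_b": 1.0},  # case, flagged by marker_a
    {"marker_a": 3.0, "marker_b": 2.5},  # case, flagged by marker_b
    {"marker_a": 4.0, "marker_b": 1.0},  # case, missed by both markers
    {"marker_a": 2.0, "marker_b": 1.0},  # control, not flagged
    {"marker_a": 5.5, "marker_b": 0.5},  # control, falsely flagged by marker_a
    {"marker_a": 1.0, "marker_b": 1.5},  # control, not flagged
]
labels = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(samples, labels, thresholds)
```

Under an OR rule, adding a marker can only keep or raise sensitivity and keep or lower specificity, which is why a noisy extra marker can make a panel worse than a single well-chosen predictor.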

13
A systematic review and meta-analysis of the efficacy and safety of pharmacological treatments for obesity in adults: 2026 Update

Ciudin Mihai, A.; Baker, J. L.; Belancic, A.; Busetto, L.; Dicker, D.; Fabryova, L.; Fruhbeck, G.; Goossens, G. H.; Gordon, J.; Monami, M.; Sbraccia, P.; Martinez Tellez, B.; Yumuk, V.; McGowan, B.

2026-04-24 endocrinology 10.64898/2026.04.19.26351196 medRxiv
Top 0.9%
0.9%

This updated systematic review and network meta-analysis evaluated the efficacy and safety of obesity management medications (OMMs) in terms of reducing body weight and obesity related complications. Medline and Embase were searched up to 21 November 2025 for randomized controlled trials comparing OMMs versus placebo or active comparators in adults. The primary endpoint was percentage total body weight loss (TBWL%) at the end of the study. Secondary endpoints were TBWL% at 1, 2 and 3 years, anthropometric, metabolic, mental health and quality of life outcomes, cardiovascular morbidity and mortality, remission of obesity related complications, serious adverse events and all cause mortality. Sixty six RCTs (66 comparisons) were identified: orlistat (22), semaglutide (18), liraglutide (11), tirzepatide (8), naltrexone/bupropion (5) and phentermine/topiramate (2), enrolling 63,909 patients (34,861 and 29,048 with active compound and placebo, respectively). All OMMs showed significantly greater TBWL% versus placebo; tirzepatide and semaglutide exceeded 10% TBWL and showed the most favourable glycaemic effects. Semaglutide reduced major adverse cardiovascular events and all cause mortality. In dedicated complication specific trials, semaglutide and tirzepatide showed benefit on heart failure related outcomes; tirzepatide was associated with improved obstructive sleep apnoea syndrome and semaglutide with knee osteoarthritis pain remission. Tirzepatide and semaglutide were associated with improvements in metabolic dysfunction-associated steatohepatitis remission, and semaglutide with improvement in liver fibrosis. No OMMs were associated with an increased risk of serious adverse events. These updated results reinforce the need to individualize OMMs selection according to weight loss efficacy, complication profile and safety.

14
Onca: An Open 9B Language Model for Pancreatic Cancer Clinical Tasks

Shim, K. B.

2026-04-24 oncology 10.64898/2026.04.16.26351055 medRxiv
Top 1%
0.7%

Pancreatic ductal adenocarcinoma (PDAC) remains one of the deadliest solid tumors and continues to face low treatment-trial participation, fragmented evidence workflows, and labor-intensive abstraction of unstructured clinical text. Existing oncology-focused language models show promise, but many depend on private institutional corpora, limiting reproducibility and practical reuse across centers. We present Onca, an open 9B dense model designed for four PDAC-relevant tasks: trial eligibility screening, case-specific clinical reasoning, structured pathology report extraction, and molecular variant evidence reasoning. Onca is fine-tuned from Qwopus3.5-9B-v3 with a single Unsloth BF16 LoRA adapter on 37,364 training rows drawn from openly available sources. The evaluation spans 11 panels and compares Onca against Woollie-7B, CancerLLM-7B, OpenBioLLM-8B, and the unmodified Qwopus base. Onca achieves the strongest overall results on Trial Screening (81.6 F1), Clinical Reasoning (14.1 composite), Pathology Extraction (30.5 field exact-match), PubMedQA Cancer (68.3 macro-F1), and PubMedQA (66.5 macro-F1). The strongest gains appear in tasks closest to routine oncology workflow, especially trial review and pathology structuring. These findings suggest that clinically targeted pancreatic-cancer language models can be built from open data with competitive performance while remaining practical to train on a single workstation-scale GPU setup.

15
SIMO - Single Section Integrative Multi-Omics - spatial mapping of metabolites and lipids combined with region-specific proteomics in a single tissue slice

Hau, K.; Fecke, A.; Hormann, F.-L.; Groba, A.-C.; Melo, L. M. N.; Cansiz, F.; Allies, G.; Hentschel, A.; Chen, J.; Heiles, S.; Tasdogan, A.; Sickmann, A.; Smith, K. W.

2026-04-21 biochemistry 10.64898/2026.04.17.719206 medRxiv
Top 1%
0.7%

Technological advances in biomedical sciences have accelerated multi-omics research, enabling high-resolution spatial mapping of diverse molecular compound classes. However, integrating spatial omics often requires serial tissue sections, limiting the alignment correlation across modalities. We present a single-section integrative multi-omics (SIMO) workflow that combines metabolite and lipid imaging with histopathology and region-specific proteomics. Using MALDI-MSI, tissue staining, and laser microdissection (LMD), SIMO delivers comprehensive metabolic, lipidomic, and proteomic insight from the same sample. Using mouse cardiac tissue, we develop, control, and validate the methodology, resulting in ~60 imaged lipids and ~60 imaged metabolites at 20 µm pixel size and subsequently spatial proteomics by LMD, detecting over 5,000 proteins from the same tissue. To demonstrate the capabilities of the workflow in a preclinical context, we apply SIMO to a metastasizing melanoma PDX model, identifying over 100 spatially localized lipids and metabolites, and over 5,000 proteins across metastases and non-tumor tissues in liver. SIMO enables precise ROI selection, statistical comparison of protein regulation, and alignment of metabolic and lipidomics pathways across spatial omics and region-specific proteomics, demonstrating its value as a spatial multi-omics platform.

16
Systematic Benchmarking of Kinase Bioactivity Models Across Splitting Strategies and Protein Representations

Abbott, J. M.

2026-04-22 bioinformatics 10.64898/2026.04.20.719590 medRxiv
Top 1%
0.6%

Machine learning models for protein-ligand bioactivity prediction are increasingly used in computational drug discovery. However, reported benchmark performance is often sensitive to evaluation design. To further understand evaluation design strategies, we present a systematic evaluation of seven machine learning architectures for kinase inhibitor bioactivity prediction, spanning classical baselines (Random Forest, XGBoost, ElasticNet, multi-layer perceptron) and advanced neural approaches (Graph Isomorphism Network, ESM-2 protein embedding MLP, and a GNN-ESM fusion model). Using a curated ChEMBL-derived kinase activity dataset of 352,874 records across 507 human protein kinase targets, we evaluated all models under three splitting strategies of increasing stringency: random, scaffold-based (Bemis-Murcko), and target-held-out. We observed that Random Forest with Morgan fingerprints achieves near-equivalent or superior performance to all neural architectures under scaffold and target-based evaluation. On target-held-out splits frozen ESM-2 embeddings showed worse generalization, with ESM-FP MLP exhibiting the largest performance degradation. Learned graph representations (GIN) do not outperform fixed 2048-bit ECFP4 fingerprints at this data scale, and tree-based uncertainty methods outperform MC-Dropout implementations tested here on calibration and selective prediction metrics. A JAK kinase subfamily case study shows that protein-aware models achieved 79% top-1 selectivity accuracy versus 52% for pooled fingerprint models. However, stronger baselines using explicit target identity achieved 83-84%, indicating that ESM-2 embeddings in this study functioned primarily as an implicit target identifier. These results indicate that evaluation methodology and statistical rigor are major determinants of reported performance in bioactivity prediction. 
Figure 1. Benchmark design overview. A curated ChEMBL kinase bioactivity dataset (352,874 records, 507 targets) was evaluated under three splitting strategies of increasing stringency. Seven model architectures spanning baselines, protein-aware, and graph neural approaches were each trained under 5-seed replication (105 total runs), with results analyzed across three complementary branches: the main 507-target benchmark, ESM-2 embedding ablation studies on a clean 92-target subset, and a JAK-family selectivity case study with stronger target-conditioned baselines.

17
Daily feeding rhythms may play a role in the genetic variability of feed efficiency in growing pigs

Gilbert, H.; Foury, A.; Agboola, L.; Devailly, G.; Gondret, F.; Moisan, M.-P.

2026-04-21 zoology 10.64898/2026.04.17.719142 medRxiv
Top 2%
0.5%

Improving feed efficiency in pigs is essential for reducing production costs and environmental impacts. This study examines the influence of circadian feeding rhythms and genetic polymorphisms on feed efficiency variability using two pig lines divergently selected for Residual Feed Intake (RFI) over ten generations. Feeding behavior was monitored using automatic concentrate dispensers, recording 6,494,097 visits from 3,824 pigs to analyze meal frequency, duration, and diurnal patterns. LRFI pigs ate less frequently, with larger meals and longer durations. They exhibited two distinct feeding peaks, one around 8:00 AM and a higher one at 5:00 PM, and they consumed more feed during the diurnal period and less at night. HRFI pigs showed a smoother, less rhythmic feeding behavior with increased nocturnal intake. The differences between the two RFI lines became more pronounced as the number of generations of selection increased, suggesting a genetic basis. Feeding behaviors, including intake during the two main diurnal peaks, were found to be heritable (heritability estimates: 0.30-0.40), and genetic correlations were observed between feed intake and RFI, especially for intake between the two peaks. We then investigated the evolution of allele frequencies of single nucleotide polymorphisms (SNPs) in DNA sequences surrounding 10 core clock genes (ARNTL, CLOCK, CRY1, CRY2, NPAS2, NR1D1, PER1, PER2, PER3, RORA) across generations of selection. SNPs with significant frequency changes were mapped to regulatory regions and transposable elements, especially in the HRFI line, suggesting potential functional impacts on circadian regulation. These results underscore the role of feeding behavior and genetic variation in feed efficiency, offering insights for breeding programs aimed at improving metabolic efficiency and sustainability in pig production.

18
A bidirectional interaction between the SREBP pathway and the LINC complex component nesprin-4 controls lipid metabolism

Al-Sammak, B. F.; Mahmood, H. M.; Bengoechea-Alonso, M. T.; Horn, H. F.; Ericsson, J.

2026-04-21 cell biology 10.64898/2026.04.18.719359 medRxiv
Top 2%
0.5%

This report identifies a bidirectional signaling axis connecting lipid metabolism to nuclear mechanotransduction, with the potential to control fatty acid/triglyceride metabolism. The sterol regulatory element-binding protein (SREBP) family of transcription factors controls fatty acid, triglyceride and cholesterol synthesis and metabolism. The family consists of three members, SREBP1a, SREBP1c, and SREBP2, that are regulated by intracellular cholesterol levels and insulin signaling. The SREBP2-dependent control of the LDL receptor gene is a well-established target for cholesterol-lowering therapeutics and the activity of SREBP1c is an attractive target in metabolic disease. In the current report, we identify SYNE4 (nesprin-4), a component of the Linker of Nucleoskeleton and Cytoskeleton (LINC) complex, as a direct target of the SREBP family of transcription factors, and show that nesprin-4 in turn supports SREBP1c function. We identify functional SREBP binding sites in the human SYNE4 promoter and demonstrate that these are required for the sterol- and SREBP-dependent regulation of the promoter. Furthermore, we show that the endogenous SYNE4 gene is also regulated by SREBP1/2 and intracellular sterol levels. Interestingly, SREBP2 is responsible for the sterol regulation of the SYNE4 gene in HepG2 cells, while SREBP1 is the major regulator in MCF7 cells, demonstrating that different cell types use different SREBP paralogs to regulate the same promoter/gene. Importantly, we find that nesprin-4 is a positive regulator of SREBP1c expression and function in HepG2 cells and during the differentiation of human adipose-derived stem cells. In summary, the current report identifies a novel regulatory interaction between lipid metabolism and the LINC complex. Importantly, we demonstrate that this signaling axis is bidirectional, forming a closed loop that has the potential to control SREBP1c activity and thereby fatty acid and triglyceride synthesis/metabolism.
Based on our data, we propose that the nesprin-4-dependent regulation of SREBP1c could represent a novel therapeutic target in metabolic disease.

19
Simultaneous Inhibition of ACLY and OGDH Has a Synergistic Effect on Hepatocellular Carcinoma Cell Lines

Dehghan Manshadi, M.; Panchal, N. K.; Sun, L.-Z.; Setoodeh, P.; Zare, H.

2026-04-22 cancer biology 10.64898/2026.04.19.716936 medRxiv
Top 2%
0.4%

Hepatocellular carcinoma (HCC) remains a leading cause of cancer-related mortality worldwide. Current treatments offer limited efficacy and no definitive cure, underscoring the urgent need for more selective and effective therapeutic strategies. This study investigated the synthetic lethality caused by co-targeting two metabolic genes, ATP citrate lyase (ACLY) and oxoglutarate dehydrogenase (OGDH), in HCC cells. Using valproic acid (VPA) and bempedoic acid (BA) as pharmacological inhibitors of OGDH and ACLY, respectively, we observed a strong synergistic effect in inhibiting the proliferation of HCC cell lines (Hep3B and Huh7), compared to using these drugs individually. Importantly, this combination treatment exhibited little increased cytotoxicity in the non-cancerous liver cell line THLE-2, indicating a degree of selectivity. Our findings are consistent with previous reports implicating USP13 as a metabolic regulator of ACLY and OGDH in various cancers, suggesting that the inhibition of USP13 may prevent HCC cell proliferation primarily through its downstream effects on ACLY and OGDH. By directly co-targeting ACLY and OGDH, our approach may offer a more precise and safer alternative to USP13 inhibition. Additionally, while both VPA and BA have been individually associated with beneficial effects in liver disease, their combined application in the context of HCC has not been previously investigated. Limitations include the reliance on cell line models, highlighting the need for validation in more physiologically relevant systems such as human organoids and animal models. Overall, this study provides a compelling rationale for further investigation into ACLY and OGDH as a synthetic lethal pair and the therapeutic potential of the VPA-BA combination treatment in HCC.

20
A Context-Aware Target Engagement and Pharmacodynamic Biomarker Resource to Accelerate Drug Discovery and Development

Yang, Y.; Zhao, L.; Orouji, S.; Zhu, Y.; Johnson, R. L.; Maxwell, D. S.; Mica, I.; Russell, K. P.; Al-lazikani, B.

2026-04-22 bioinformatics 10.64898/2026.04.19.719411 medRxiv
Top 2%
0.4%

Confirming target engagement in tumor experimental models remains a major challenge in oncology drug development. Pharmacodynamic biomarkers can help address this, but few systematic resources link drug targets to candidate biomarkers. We developed TargetTrace, a comprehensive resource to identify and prioritize pharmacodynamic biomarkers across nine key target classes: transcription factors/cofactors, kinases, phosphatases, ubiquitin ligases, deubiquitinases, acetyltransferases, deacetylases, methyltransferases, and demethylases. Biomarker candidates were gathered from curated molecular interaction resources and refined using external annotations to improve accuracy. For enzyme targets with measurable substrate changes, we applied a two-agent large language model workflow, followed by manual review, to harmonize antibody information from the antibody resources and ensure that the selected biomarkers are measurable with existing laboratory tests. From more than 92,000 input interactions and over 2,300 targets, we compiled 71,323 target-biomarker relationships involving 2,270 potential drug targets, encompassing both transcription factor/cofactor-target gene and enzyme-substrate interactions. Commercial antibodies were available for over 1,400 biomarkers, supporting laboratory validation. TargetTrace provides a structured and reusable resource for systematic identification and prioritization of pharmacodynamic biomarkers in oncology.