eBioMedicine
○ Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match eBioMedicine's content profile, based on 130 papers previously published here. The average preprint has a 0.13% match score for this journal, so anything above that is already an above-average fit.
Oppong, A. E.; Louden, K.; HOLLOWAY, A.; ROSSI, L.; McDonnell, T. C. R.; Robinson, G. A.; ARULKUMARAN, N.; Manson, J. J.; Jury, E. C.
Haemophagocytic lymphohistiocytosis (HLH) is a rare, life-threatening hyperinflammatory syndrome characterised by uncontrolled immune activation. Reduced high- and low-density lipoprotein cholesterol and hypertriglyceridaemia are reported in HLH, suggesting lipid metabolism disturbances although in-depth serum metabolomic analysis is lacking in HLH. Here a lipid-focused NMR spectroscopy platform was used to define the serum metabolomic landscape of adults hospitalised with HLH compared to adults with sepsis (HLH-mimic) and rheumatic disease (potential HLH drivers/triggers), following surgical resection of solid organ cancer (non-infectious acute inflammation controls) and healthy controls (HCs). Serum metabolites distinguished HLH from HCs with high accuracy (>91.36%) using multiple machine learning models. The top classifying features included elevated apolipoprotein-B (ApoB)-containing low, intermediate, and very low-density lipoprotein particles; and lipoprotein remodelling towards triglyceride enrichment and cholesterol depletion. Differentially abundant metabolites in HLH compared to all control groups were enriched in pathways related to lipid metabolism including: 'Lipid particles composition', 'Plasma lipoprotein clearance', 'Plasma lipoprotein remodelling', 'Glucose homeostasis' and 'Amino acid metabolism'. Metabolomic results were validated using matched whole blood RNA-sequencing which identified differentially expressed genes enriched in metabolic modules associated with lipid, amino acid, and glucose metabolism, supporting a coordinated metabolic dysregulation in HLH from a transcriptomic to metabolomic level. Finally, twenty-seven metabolites including ApoB-containing, triglyceride-rich lipoproteins and saturated fatty acids distinguished HLH from all disease controls (AUC>0.70) either alone or combined as a metabolomic signature. 
Elevated ApoB and ApoB:ApoA1 ratio in HLH vs sepsis and HCs were validated by ELISA, supporting their utility as biomarkers to distinguish HLH from other hyperinflammatory syndromes.
Hauguel, P.; Anctil, N.; Noel, L.-P.
Background. Plasma and serum metabolomic studies of myalgic encephalomyelitis / chronic fatigue syndrome (ME/CFS) have repeatedly implicated hypometabolic, lipid, mitochondrial, redox and tryptophan-kynurenine pathways, but prior cohorts have been modest in size and have used heterogeneous case definitions. Whether similar pathway-level signals are detectable at scale in dried blood spots (DBS), across questionnaire-derived fatigue constructs and across orthogonal LC gradients in the same individuals remains unresolved. Methods. We profiled DBS extracts from 1,784 community-cohort adults by reverse-phase LC-MS using paired 5 min and 15 min gradients. Six questionnaire-derived endpoints captured a pragmatic self-reported PEM-like phenotype, a DSQ-derived PEM-like construct, high or review clinical status, temporal fatigue state, comorbid fatigue and self-reported chronic fatigue. The locked primary endpoint for Phase 1 was pragmatic_fatigue_pem with 226 cases and 914 controls after excluding major metabolic comorbidity. We tested a biology-first panel comprising 22 literature-curated metabolites represented by four participant-level descriptors each, and evaluated three discovery extensions: a targeted m/z search of additional literature candidates, a hypothesis-free univariate screen across 4,553 5 min and 5,625 15 min consensus features, and pairwise z-difference ratios. Endpoint-specific Ridge classifiers were evaluated by five-fold out-of-fold AUC with bootstrap stability filtering. Cross-gradient agreement was assessed by per-metabolite AUC concordance between paired 5 min and 15 min profiles. Severity was modelled as an ordinal grade derived from the number of fatigue criteria met and chronic-fatigue-form status. Results. The biology-first DBS panel achieved out-of-fold AUC 0.81 for the pragmatic self-reported PEM-like endpoint (226 cases / 914 controls). 
The DSQ-derived PEM-like construct reached AUC 0.60 (57 cases / 201 controls) on the un-filtered set and AUC 0.778 (SD 0.013, twenty seeds) in a post-hoc signature-decomposition follow-up restricted to participants without a self-declared major-metabolic-history tag (29 cases / 230 controls); both are treated as construct-validity anchors rather than as provoked or clinically adjudicated PEM. An optimised operationalisation of the same construct (panel-self normalisation, restriction to non-comorbid participants and demographic covariates) reached AUC 0.71 (95 % CI 0.55 to 0.76), and an exploratory age-stratified signature decomposition suggested age-dependent pathway composition that requires confirmation given small per-stratum case counts. Stable contributors mapped to carnitine-shuttle, TCA-cycle, redox-thiol and tryptophan-kynurenine pathways. Cross-gradient analysis of 22 matched metabolites yielded Pearson r = 0.62 for signed univariate effects (p = 0.002; 68 % directional agreement). The metabolomic score increased with severity grade (Spearman rho = 0.45, p = 4 x 10^-91; median scores 0.24, 0.51 and 0.75 across grades 0, 1 and 2). Sensitivity analyses on the covariate-complete subset (n = 565; 138 cases / 427 controls) showed that the DBS signal was robust to adjustment for age, sex, BMI and medication burden (DBS-only AUC 0.76, DBS plus covariates 0.78, covariates only 0.64), and produced a metabolomic-specific lift of approximately 0.13 AUC over the strongest anti-leak declarative cross-form questionnaire baseline (AUC 0.63). DBS-only AUC was stable across sex, age and BMI subgroups, and a 1:4 nearest-neighbour matched analysis on age, sex and BMI yielded AUC 0.72 (95 % CI 0.67 to 0.77). The observed pattern supported pathway-level convergence with prior ME/CFS metabolomics literature, including carnitine shuttle, fatty-acid beta-oxidation, TCA cycle, redox-thiol, urea cycle, glycerophospholipid and tryptophan-kynurenine axes. 
In contrast, the hypothesis-free 15 min screen produced high-AUC features that mapped predominantly to environmental or technical signals, including pesticide, industrial-amine and mobile-phase artifact annotations; only one of eight top leads, a truncated oxidised phospholipid, was biologically plausible, and none had tandem-MS support. Conclusions. In this large community cohort, a literature-curated DBS metabolomic panel captured pathway-level biology associated with a questionnaire-derived PEM-like fatigue phenotype, showed directional concordance across LC gradients, scaled with symptom severity and remained robust to key demographic, anthropometric and anti-leak questionnaire baselines. The findings converge with several metabolic axes previously reported in ME/CFS plasma and serum studies, including carnitine-shuttle, TCA-cycle, redox-thiol, urea-cycle, glycerophospholipid and tryptophan-kynurenine pathways. They should not be interpreted as clinical validation of a diagnostic test, screening tool or objective provoked-PEM biomarker. Rather, they support at-home-compatible DBS metabolomics as a biologically grounded platform for future clinically adjudicated validation, decision-support development and longitudinal monitoring in fatigue and PEM-like syndromes. Because DBS contains cellular and plasma-derived components, matrix effects must be considered when comparing individual metabolites with venous plasma or serum studies, and hypothesis-free screening at this scale can preferentially surface exposome or technical variance unless molecular identification is enforced before biological interpretation.
Zhong, H.; Gao, M.; Ma, S.; Zhang, W.; Chen, N.; Jiao, K.; Zhu, B.; Song, J.; Yan, C.; Yue, D.; Xi, J.; Zhu, W.; Zhao, C.; Luo, S.
Histopathological evaluation of skeletal muscle biopsies relies on subjective, semi-quantitative assessment with no standardized grading system. We developed a four-tissue deep learning segmentation pipeline using Cellpose-SAM for myofiber instance segmentation, a pixel classifier for fat infiltration, and watershed detection for nuclei. We applied this pipeline to 478 H&E whole-slide images from two independent cohorts: HuashanMuscle (n = 79; China; myotonic dystrophy type 1 [DM1], n = 28; limb-girdle muscular dystrophy type R1 [LGMDR1, calpainopathy], n = 12; type R2 [LGMDR2, dysferlinopathy], n = 22; controls, n = 17) and GTEx (n = 399; United States; three-level myopathy spectrum). Thirty-seven unique morphometric features were extracted per sample. Nuclear centralization index (NCI) and fiber size variability coefficient (fiber CV) discriminated myopathy from controls (p = 1.3E-05, rank-biserial r = 0.69; and p = 2.9E-04, r = 0.58, respectively). DM1 showed the highest NCI (median 0.121), consistent with its centronuclear pathology, and NCI correlated with CTG repeat count (Spearman rho = 0.46, p = 0.042, n = 20). In the GTEx cohort, both biomarkers exhibited significant dose-response trends across the myopathy spectrum (Jonckheere-Terpstra p < E-04). The MyoPath Score, a logistic regression composite of seven pathology indicators trained on GTEx, achieved AUC = 0.788 (LOO-CV 0.735) and transferred to the independent HuashanMuscle cohort with AUC = 0.873 without retraining. Segmentation achieved Dice coefficients of 0.92 (myofiber), 0.95 (fat), 0.87 (nucleus), and 0.88 (connective tissue), with intraclass correlation coefficients exceeding 0.88. NCI and fiber CV provide objective, reproducible quantitative biomarkers for skeletal muscle pathology severity assessment with potential as standardized grading criteria and clinical trial endpoints.
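The two kinds of numbers reported above (Dice overlap for segmentation quality, and a coefficient of variation for fiber size) can be illustrated with a minimal sketch. It assumes masks are given as sets of pixel coordinates and that fiber CV is the population standard deviation divided by the mean; the abstract does not state which SD convention the authors used, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two collections of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # two empty masks agree perfectly by convention
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def coefficient_of_variation(values):
    """SD / mean; population SD is an assumption, not taken from the abstract."""
    return pstdev(values) / mean(values)


# Toy masks sharing one of two pixels each
pred_mask = {(0, 0), (0, 1)}
true_mask = {(0, 1), (1, 1)}
overlap = dice_coefficient(pred_mask, true_mask)

# Toy fiber cross-sectional areas (arbitrary units)
fiber_cv = coefficient_of_variation([1.0, 3.0])
```

A higher fiber CV indicates more heterogeneous fiber sizes, which is the direction the abstract associates with myopathy.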
TANG, W.; ZHANG, Z.
Background. The discontinuation of Fasiglifam (TAK-875), a GPR40/FFAR1 full agonist, during Phase 3 clinical trials due to hepatotoxicity led to widespread abandonment of GPR40 as a viable therapeutic target for type 2 diabetes mellitus (T2DM). However, mechanistic evidence suggests that Fasiglifam's hepatotoxicity arises from mitochondrial liability driven by high lipophilicity (aLogP = 5.31), rather than from on-target GPR40 signaling. We hypothesized that target-level failure was incorrectly inferred from compound-level safety concerns, and that superior candidates exist within publicly available databases. Methods. We queried ChEMBL Release 36 (28 GB SQLite, 74 tables) for all compounds with documented GPR40/FFAR1 activity (UniProt: O14842). Compounds were filtered by EC50 ≤ 10 nM in nM units with standard relation "=". Drug-likeness was assessed using Lipinski's Rule of Five (Ro5), aLogP, molecular weight (MW), hydrogen bond donors/acceptors (HBD/HBA), and polar surface area (PSA). A parallel analysis of Therapeutic Target Database (TTD v10.1.01, 4,298 targets) provided clinical context. A real-world evidence (RWE) patient stratification framework was constructed using EMR data from tens of millions of patients with >10 years of longitudinal follow-up. Results. Of 2,637 GPR40-active compounds in ChEMBL 36, 526 (19.9%) demonstrated EC50 < 100 nM and 102 (3.9%) demonstrated EC50 < 10 nM. Eight compounds met stringent drug-likeness criteria (Ro5 violations = 0, aLogP < 5.0, EC50 ≤ 1 nM). The lead compound (CHEMBL4859651) exhibited EC50 = 0.04 nM (8.75-fold more potent than Fasiglifam), MW = 297 Da (43% lower), and aLogP = 4.30 (19% lower), with zero Ro5 violations. Mean MW of the eight candidates was 317 ± 28 Da versus 524 Da for Fasiglifam. A parallel GCK analysis identified a protein-protein interaction target (CHEMBL3885579, GCK-GKRP interface) harboring 40 exclusive compounds as an orthogonal strategy for partial GCK activation. 
Conclusions. Systematic cheminformatic analysis reveals that compounds with substantially superior activity and drug-likeness profiles relative to Fasiglifam exist within ChEMBL 36. Fasiglifam's hepatotoxicity is attributable to compound-specific physicochemical properties, not GPR40-mediated toxicity. RWE patient stratification may further mitigate hepatotoxicity risk for next-generation GPR40 agonists. These findings argue for systematic reappraisal of GPR40 as a viable therapeutic target for T2DM.
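The drug-likeness filter described in this abstract (zero Ro5 violations, aLogP < 5.0, EC50 ≤ 1 nM) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the Ro5 thresholds are the standard Lipinski cut-offs, the `Compound` class and function names are hypothetical, and the HBD/HBA counts in the example records are placeholders because the abstract does not report them. The Fasiglifam-like EC50 of 0.35 nM is only what the stated 8.75-fold ratio implies (0.04 nM × 8.75).

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mw: float        # molecular weight, Da
    alogp: float     # calculated lipophilicity
    hbd: int         # hydrogen bond donors
    hba: int         # hydrogen bond acceptors
    ec50_nm: float   # potency in nM


def ro5_violations(c: Compound) -> int:
    """Count violations of Lipinski's Rule of Five (standard thresholds)."""
    return sum([c.mw > 500, c.alogp > 5, c.hbd > 5, c.hba > 10])


def passes_filter(c: Compound, max_alogp: float = 5.0, max_ec50_nm: float = 1.0) -> bool:
    """Apply the three criteria quoted in the abstract."""
    return ro5_violations(c) == 0 and c.alogp < max_alogp and c.ec50_nm <= max_ec50_nm


# MW, aLogP and EC50 follow the abstract; hbd/hba are PLACEHOLDERS.
lead_like = Compound("lead-like", mw=297, alogp=4.30, hbd=1, hba=4, ec50_nm=0.04)
fasiglifam_like = Compound("fasiglifam-like", mw=524, alogp=5.31, hbd=1, hba=7, ec50_nm=0.35)

shortlist = [c.name for c in (lead_like, fasiglifam_like) if passes_filter(c)]
```

With these inputs the Fasiglifam-like record fails on two Ro5 criteria (MW and lipophilicity), which is the compound-level liability the abstract argues should not be read as a target-level failure.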
Fan, J.; Rouilly, V.; Musvosvi, M.; Robert, M.; Albert-Vega, C.; Bondet, V.; Jasper, A.; Yu, X.; Malherbe, S.; Borie, R.; Peiffer-Smadja, N.; Sacre, K.; TERRIER, B.; Walzl, G.; Barry, C. E.; Tameris, M.; Scriba, T.; Duffy, D.
Tuberculosis (TB) continues to pose a significant global public health challenge with substantial patient morbidity and mortality. Current TB patient biomarkers lack sufficient resolution to inform treatment response and patient stratification. This necessitates the development of sensitive and reliable host biomarkers. We previously demonstrated the efficacy of TruCulture whole blood stimulation for differentiating asymptomatic TB from active pulmonary TB disease patients in endemic regions. Our systems immunology study expands upon this previous work by evaluating the potential of TruCulture to monitor longitudinal responses to TB treatment in patients from the Predict-TB trial before, during, and after 6 months of antibiotic therapy. We stimulated whole blood from TB patients (n=40) using TruCulture under four conditions (Null, Mycobacterium tuberculosis-antigen, LPS, and IL-1β) at baseline (week 0), during treatment (weeks 16 and 24), and one-year follow-up post-treatment (week 72). 20/25 measured cytokines exhibited significant changes throughout treatment, with several continuing to evolve during post-therapy follow-up. Machine learning based analysis identified Mtb-Ag-induced IL-1RA (AUC = 0.90, 0.92, 0.95 at weeks 16, 24, 72) and LPS-induced NLRP3 (AUC = 0.94 at week 16) as the best protein and transcriptional biomarkers for distinguishing treated from untreated patients, strongly implicating the inflammasome response. Combining these results with the extent of lung disease assessed by FDG PET/CT scans, we showed direct disease relevance for these blood-based biomarkers. The identified biomarker profiles hold promise for improving TB patient care through early prediction of treatment responses, real-time therapy monitoring, and informed development of host-directed therapeutic strategies for clinical decision-making. 
Graphical abstract: Predict-TB clinical study overview and summary of TB-specific biomarkers identified from TruCulture whole blood stimulation system.
Zhang, R.
Disposition index (DI) is an informative measure of β-cell function adjusted for insulin resistance, but its assessment is procedurally demanding, requiring dynamic testing with timed sampling and insulin or C-peptide-based estimation of insulin sensitivity and secretion. A simple glucose-only metric derived from the oral glucose tolerance test (OGTT) could provide a practical approach to estimating DI. We developed the Recovery-Burden Index (RBI), a glucose-only geometric metric that quantifies post-peak glucose recovery relative to total glucose excursion during OGTT. Using densely sampled venous OGTT profiles with measured DI, RBI was evaluated for prediction of continuous DI by leave-one-out (LOO) cross-validated R² and for discrimination of DI-defined β-cell dysfunction by AUROC. Performance was compared with conventional glycemic metrics. RBI predicted continuous DI more accurately than conventional glycemic metrics, with LOO R² of 0.43, Pearson r = 0.70, and Spearman ρ = 0.75. RBI30-180 performed similarly, with cross-validated R² of 0.42, Pearson r = 0.72, and Spearman ρ = 0.75. RBI also discriminated DI-defined β-cell dysfunction, with AUROC values of 0.90 for RBI and 0.91 for RBI30-180. Reduced sampling schedules preserved much of the RBI signal, whereas truncation at 120 min attenuated continuous DI prediction, supporting the contribution of late recovery-phase information. RBI extracts β-cell-relevant information from the OGTT glucose profile using a single transparent glucose-only index. These findings highlight post-peak recovery as a key feature for estimating DI-associated β-cell compensation and support further validation of RBI in extended or CGM-augmented OGTT settings.
Fieggen, J.; Simond, G.; Segal, B. M.; Noori, A.; Thakurta, A.; Butler, C. C.; Clifton, D. A.; Clifton, L.
Background. Blood-based biomarkers are increasingly proposed for identifying high-risk individuals before clinical disease and for making prevention-oriented trials more efficient. Prognostic enrichment can increase event rates, but trial efficiency also depends on whether the intervention effect is preserved in the enriched population. Methods. Using the UK Biobank Pharma Proteomics Project, we trained disease-specific proteomic risk scores (ProRS) from 2,916 plasma proteins with elastic-net Cox models. We compared ProRS, polygenic risk scores (PRS), and combined PRS-ProRS scores across ten incident diseases. We estimated cumulative incidence and theoretical two-arm time-to-event trial sample sizes across risk strata. To evaluate effect preservation, we examined six intervention-analogue exposure-outcome pairs spanning genetic (PCSK9/coronary artery disease, APOE/Alzheimer's disease, PPARG/type 2 diabetes, IL23R/Crohn's disease), behavioural (physical activity/all-cause mortality), and pharmacological (RAAS inhibitors versus calcium channel blockers/coronary artery disease) examples. Results. ProRS outperformed PRS for 9 of 10 diseases (median C-index 0.75 versus 0.61). ProRS and PRS were weakly correlated (median Pearson |r| = 0.04), and joint PRS-ProRS stratification identified groups with higher observed incidence than either score alone for several endpoints. In the top risk quartile, combined-score enrichment reduced theoretical required sample sizes by 32-74% under a fixed 20% relative hazard reduction. These gains were not always preserved when stratum-specific intervention-analogue effects were used. Effects were broadly preserved for APOE/Alzheimer's disease and physical activity/mortality. The PPARG/type 2 diabetes effect attenuated toward the null under all three score types, showing that event-rate enrichment does not guarantee effect preservation. 
For IL23R/Crohn's disease and the antihypertensive comparison, point estimates differed across score types (preserved under polygenic but attenuated under proteomic enrichment), but confidence intervals were wide and overlapping. Conclusions. Proteomic risk scores can identify high-event-rate populations for prevention-oriented trials, but event-rate enrichment alone is insufficient for trial design. Biomarker-guided enrichment should evaluate mechanism-specific effect preservation and may be preferable as a stratification or adaptive-design variable rather than as a restrictive eligibility criterion.
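The link between event-rate enrichment and sample size can be made concrete with a small sketch. It assumes Schoenfeld's approximation for a 1:1 two-arm time-to-event trial, a two-sided alpha of 0.05 and 80% power; the abstract does not say which formula or design parameters the authors used, and the event probabilities below are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Schoenfeld approximation: events = 4 (z_{1-alpha/2} + z_{power})^2 / ln(HR)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2


def required_sample_size(hazard_ratio: float, event_prob: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Total participants = required events / overall probability of an event."""
    return ceil(required_events(hazard_ratio, alpha, power) / event_prob)


hr = 0.80  # the fixed 20% relative hazard reduction named in the abstract
n_unenriched = required_sample_size(hr, event_prob=0.05)  # hypothetical event rate
n_enriched = required_sample_size(hr, event_prob=0.10)    # hypothetical enriched stratum
reduction = 1 - n_enriched / n_unenriched
```

Under these assumptions the number of events needed depends only on the hazard ratio, so doubling the event probability roughly halves the required enrolment. If the hazard ratio moves toward 1 in the enriched stratum, `required_events` grows quickly, which is the effect-preservation caveat the abstract raises.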
Ghosh, N.; Sinha, K.
Background. Parkinson's disease (PD) gut metagenomic studies have repeatedly reported disease-associated shifts in microbial taxa, genes, and pathways. However, the field still lacks transparent trait-level indices that summarize biologically coherent microbial exposures. Curli fibres are extracellular bacterial amyloids produced by several Enterobacteriaceae and related taxa, and they provide a plausible microbiological bridge between gut microbial ecology, epithelial/immune interfaces, and alpha-synuclein-centered gut-brain-axis hypotheses. We introduced Curli Carrier Burden (CCB), a mathematically explicit, taxon-informed index that estimates the aggregate abundance of curated curli-carrier bacterial taxa in processed metagenomic profiles. Methods. A curated curli-carrier candidate panel was converted into an evidence-weighted taxon set. For sample s, CCB was defined as [Formula], where a_is is the processed relative abundance of matched curli-carrier taxon i and w_i is an evidence weight reflecting curli-carrier confidence. We evaluated CCB in five main PD gut metagenomic evidence streams: Wallen 2022, Integrated-US, Mao Central China, Romano non-Wallen, and DuruIC 2024. Results were interpreted cohort-wise rather than as a formal meta-analysis. Results. The CCB framework generated a reproducible sample-level microbial trait variable and enabled cohort-wise comparison of amyloidogenic bacterial burden. Wallen showed discovery-stage PD-associated elevation (724 samples; 31 matched curli taxa; Mann-Whitney p = 0.0020). Integrated-US provided supportive independent evidence (244 samples; 18 matched taxa; p = 0.0079). Mao Central China and DuruIC 2024 showed the same PD-greater-than-control direction by mean and median CCB, although their individual comparisons were not nominally significant. 
Romano non-Wallen provided a large multi-study analysis (600 samples; 29 matched curli-associated mOTUs taxa), with higher PD mean and median CCB in pooled analysis (p = 0.0036, Cliff's δ = 0.137) and cohort-sensitive behavior under study-stratified permutation (p = 0.1974). Additional processed-cohort checks indicated that CCB interpretability depends on taxonomic representation and matched curli-candidate coverage, reinforcing the value of explicit compatibility reporting. Conclusions. CCB is a novel, extensible, microbiology-informed index for quantifying amyloidogenic curli-carrier bacterial burden in processed gut metagenomic profiles. The current results support CCB as a useful exploratory trait-level variable for PD microbiome research and provide a principled route toward future raw-read, csg-operon, strain-resolved, and phenotype-aware studies of the curli-vagal PD axis.
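The exact CCB formula is elided in this listing, so the following is only a sketch of one reading consistent with the stated definitions: an evidence-weighted sum of the relative abundances of matched curli-carrier taxa. Whether the authors normalise or transform the sum is not known from the abstract. The taxon weights and abundances are illustrative, and the function names are hypothetical. Cliff's δ, the effect size quoted above, is included using its standard definition.

```python
def curli_carrier_burden(abundances: dict, weights: dict) -> float:
    """ASSUMED form: sum of w_i * a_is over taxa present in both the sample and the panel."""
    return sum(weights[taxon] * a for taxon, a in abundances.items() if taxon in weights)


def cliffs_delta(xs, ys) -> float:
    """Cliff's delta = (#pairs with x > y minus #pairs with x < y) / (n_x * n_y)."""
    greater = sum(x > y for x in xs for y in ys)
    less = sum(x < y for x in xs for y in ys)
    return (greater - less) / (len(xs) * len(ys))


# Illustrative evidence weights for a curated panel (not from the paper)
panel_weights = {"Escherichia coli": 1.0, "Salmonella enterica": 0.5}

# Illustrative processed relative abundances for one sample
sample = {"Escherichia coli": 0.02, "Bacteroides uniformis": 0.30, "Salmonella enterica": 0.01}

ccb = curli_carrier_burden(sample, panel_weights)  # unmatched taxa contribute nothing
delta = cliffs_delta([3, 4, 5], [1, 2, 3])         # toy PD vs control CCB values
```

Because only matched taxa contribute, a profile with poor coverage of the panel yields a low CCB regardless of biology, which is why the abstract stresses reporting matched-taxon counts per cohort.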
Liang, M.; Song, Y.; Yang, L.; Li, H.-t.; Liu, G.; Guo, Z.; Liu, S.; Lei, Z.; Yang, S.; Wang, J.
Background. Platinum-refractory paediatric germ cell tumours (GCTs) carry a poor prognosis, with five-year survival below 30% and no validated molecular stratification tool. The biological mechanisms underlying platinum resistance in this population remain poorly defined, limiting the development of targeted therapeutic strategies and early warning biomarkers. Methods. We performed integrated plasma multi-omics profiling in 105 paediatric GCT patients (54 refractory and 51 treatment-naive) using data-independent acquisition proteomics, untargeted metabolomics, and exploratory lipidomics. Candidate biomarkers were validated using ELISA and spatial multiplex immunofluorescence. Predictive models were constructed using logistic regression and evaluated by ROC analysis, calibration, and decision-curve analysis. Results. Multi-omics integration revealed coordinated dysregulation of sphingolipid metabolism, extracellular matrix remodelling, and immune checkpoint signalling in refractory disease. Lipidomic analysis demonstrated a significant depletion of sphingolipid-associated species, including lysophosphatidylserine, lysophosphatidylethanolamine, and phosphatidylserine. Proteomic profiling identified the upregulation of LAG3 and HTRA1, which was validated by ELISA. Multiplex immunofluorescence demonstrated the spatial enrichment of exhausted CD8+ LAG3+ T cells adjacent to CK-PAN tumour cells in refractory tumours. A plasma biomarker panel integrating LAG3, HTRA1, and AFP showed improved discrimination of refractory disease (AUC = 0.821) compared with AFP alone. Conclusions. Our study identified a sphingolipid-HTRA1-LAG3 immune evasion axis as a defining molecular feature of refractory paediatric germ cell tumours and proposed a clinically applicable plasma biomarker panel for early risk stratification.
Goodman, M. O.; Alex, R. M.; Sands, S. A.; Azarbarzin, A.; Batool-anwar, S.; Pavlova, M. K.; Epstein, L. J.; Redline, S.; Cade, B. E.
Obstructive sleep apnea (OSA) is associated with a wide range of comorbidities, but the extent to which these follow predictable, age-dependent patterns is not well understood. Identifying such patterns could provide insight into OSA heterogeneity and its links to physiological measures of OSA. We trained age-dependent topic models (ATM) on longitudinal electronic health records from 36,426 patients with OSA in the Mass General Brigham Biobank. ATM organizes incident diagnoses into distinct comorbidity "topics," whose age-specific disease loadings represent predictive patterns linking related diagnoses across the life course. We applied the trained model to compute individual-level topic scores in independent data: a cohort of 11,689 OSA cases and 22,695 matched controls, and a cohort of 6,220 patients with polysomnography (PSG)-derived physiological measures. We identified 19 distinct age-dependent comorbidity profiles, all significantly associated with OSA case status (FDR-adjusted p<0.05). Topics reflected recognizable clusters including metabolic, neuropsychiatric, and immune-mediated conditions, and several were distinguished by age-of-onset of key comorbidities, such as early- vs late-onset asthma. Seventeen of the 19 topics were significantly associated with at least one of 13 PSG-derived physiological measures, including associations between cardiometabolic topics and the apnea-hypopnea index, sleep apnea specific hypoxic burden, and respiratory event-specific heart rate burden. These findings indicate that age-dependent comorbidity patterns distinguish meaningful OSA subtypes with differing prognoses and endophenotype associations. ATM offers insight into complex OSA comorbidity and suggests that age-informed, topic-based stratification may improve individualized risk assessment, interpretation of PSG findings, and targeting of clinical interventions.
Nagori, A.; Singh, P.; Firdos, S.; Devadiga, A.; Vats, V.; Gupta, A.; Bandhey, H.; Ailavadi, P.; Awasthi, R.; Narotam, N.; Mishra, A.; Lodha, R.; Sethi, T.
High-frequency physiological monitoring in ICUs can identify impending deterioration hours before clinical recognition, yet extracting reliable early-warning signals from noisy vital-sign streams remains challenging. We present SIgnose, an interpretable prediction framework for early detection of abnormal shock index (SI), built from routinely monitored vital signs using physiologic variability and nonlinear time-series features. SIgnose was developed on the eICU Collaborative Research Database and externally validated on the MIMIC-III adult database and a pediatric SafeICU cohort (AIIMS New Delhi), with additional prospective validation in the pediatric ICU. We benchmarked three representation strategies: (i) engineered physiologic variability and nonlinear time-series features, (ii) deep learning, and (iii) Llama-3.1-8B embeddings with low-rank adaptation. Physiologic variability features consistently demonstrated superior cross-cohort generalization. The final model used 3,970 features from five vital signs to predict abnormal SI up to 8 hours ahead, achieving AUROC 0.861 (95% CI 0.859-0.863) and AUPRC 0.927 (95% CI 0.925-0.929) on eICU. External validation yielded AUROC 0.870 (95% CI 0.863-0.876) and AUPRC 0.935 (95% CI 0.930-0.940) on MIMIC-III, and AUROC 0.875 (95% CI 0.863-0.888) and AUPRC 0.915 (95% CI 0.898-0.930) on SafeICU; prospective pediatric validation (n = 88) achieved AUROC 0.885 (95% CI 0.868-0.902) and AUPRC 0.911 (95% CI 0.882-0.936). SHAP interpretability analysis identified heart rate variability, respiratory trend dynamics, and multi-scale blood pressure variability as key early-warning signatures. These findings establish SIgnose as a reproducible, low-compute, early-warning framework and demonstrate that physiologic variability features provide robust, generalizable representations for early deterioration detection across adult and pediatric critical care.
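For readers unfamiliar with the prediction target, shock index is conventionally heart rate divided by systolic blood pressure. The sketch below shows how a forward-looking "abnormal SI" label could be built from paired vital-sign series. The threshold of 0.9 and the horizon expressed in samples are assumptions for illustration only; the abstract does not give the cut-off used, and pediatric cut-offs are typically age-dependent. Function names are hypothetical.

```python
def shock_index(heart_rate: float, systolic_bp: float) -> float:
    """Shock index = heart rate (beats/min) / systolic blood pressure (mmHg)."""
    if systolic_bp <= 0:
        raise ValueError("systolic blood pressure must be positive")
    return heart_rate / systolic_bp


def future_abnormal_labels(hr_series, sbp_series, horizon: int, threshold: float = 0.9):
    """Label time t as 1 if SI at t + horizon exceeds the (ASSUMED) threshold.

    Only time points with a full horizon of follow-up receive a label.
    """
    si = [shock_index(hr, sbp) for hr, sbp in zip(hr_series, sbp_series)]
    return [int(si[t + horizon] > threshold) for t in range(len(si) - horizon)]


# Toy series sampled at equal intervals
hr = [80, 120, 85, 130]
sbp = [120, 100, 120, 100]
labels = future_abnormal_labels(hr, sbp, horizon=2)
```

A classifier would then be trained on features computed from the window ending at t to predict `labels[t]`, which is the general early-warning framing the abstract describes.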
De Los Reyes, F. V. A.; Hayashi, S.; Saito, Y.; Ogawa, M.; Oya, Y.; Noguchi, S.; Nishino, I.
Caveolinopathies caused by CAV3 mutations present with heterogeneous clinical phenotypes ranging from asymptomatic hyperCKemia to limb-girdle-type muscular dystrophy. Although prior imaging studies have described commonly affected muscles, structured modeling of muscle involvement patterns in caveolinopathy has not been established. We analyzed whole-body skeletal muscle computed tomography imaging in eight patients with pathogenic or likely pathogenic CAV3 variants, comprising 14 imaging study samples. Fat infiltration across 43 muscles was graded using modified Mercuri scores. Computational multivariate analysis, including principal component analysis, clustering, and pseudotime modeling, was applied to characterize severity staging and distribution patterns. A statistically supported, stage-dependent continuum of muscle involvement was identified. Most samples demonstrated a distributed limb-girdle-predominant pattern with coordinated progression across muscle clusters. In contrast, one patient (three samples in longitudinal series) exhibited a compartment-restricted thigh-dominant pattern characterized by early posterior and medial thigh involvement. Rectus femoris showed consistent stage-dependent progression, while greater medial gastrocnemius involvement was associated with advanced severity. None of the patients exhibited clinical evidence of rippling muscle disease. These findings suggest that integrating semi-quantitative imaging with computational modeling may provide an objective framework for characterizing muscle involvement patterns in CAV3-related myopathy.
Adegboyega, B. B.; Ekanem, P. C.; Awolaja, O. O.; Osarietin, E.; Okorie, B.
Objective. Diabetic complications collectively represent one of the most urgent unresolved problems in medicine, yet the field continues to study them in near-complete isolation from one another. No unified framework has systematically characterised the shared and divergent molecular signatures of ten clinically critical metabolic transporters across all five major complications, cardiomyopathy (DCM), nephropathy (DN), retinopathy (DR), peripheral neuropathy (DPN), and atherosclerosis and vasculopathy (DAD), through an integrated, multi-method computational pipeline. This study was designed to address that gap directly. Methods. Eleven GEO microarray datasets comprising 118 diabetic and 76 control samples were analysed through twelve sequential phases: differential expression analysis, pan-complication overlap, weighted gene co-expression network analysis (WGCNA), GO/KEGG functional enrichment with gene set enrichment analysis (GSEA), STRING protein-protein interaction (PPI) network construction, competing endogenous RNA (ceRNA) network mapping, transcription factor activity inference using a VIPER-style algorithm, immune cell infiltration estimation by single-sample GSEA, diagnostic biomarker modelling using LASSO logistic regression and Random Forest classification, CMap-style drug repurposing by connectivity scoring, and two-sample Mendelian randomisation (MR) employing four independent estimators (inverse-variance weighted [IVW], MR-Egger, weighted median, and weighted mode). Results. CD36 was the only transporter to achieve significant dysregulation across three independently sourced tissue types (DN, DR, DPN; logFC range 0.88 to 2.18), whilst TLR4 exhibited the highest fold-change in the study (logFC = 3.88, DPN) and the greatest WGCNA module membership (kME = 0.976, DPN). 
SERCA2 was significantly downregulated in three complications (DCM, DN, and DR) at formal significance thresholds and trended negatively in the remaining two (DPN and DAD), constituting the most consistently suppressed transporter in the study. Its universal downregulation was explicable through four convergent mechanisms spanning transcriptional, oxidative, ceRNA-mediated, and transcription factor-level regulation, and was confirmed as causally relevant to diabetic cardiomyopathy by eQTL Mendelian randomisation (beta = -0.085, p = 0.005). miR-21-5p was identified as the dominant ceRNA regulatory bridge (betweenness centrality = 0.428; 6.7-fold above the second-ranked miRNA), with MALAT1 as the sole lncRNA hub active in all five complications. PPARgamma and TP53 repression emerged as the leading transcription factor-level explanations for the simultaneous metabolic and inflammatory dysregulation characteristic of the diabetic transcriptome. Immune deconvolution revealed DCM as immunologically quiescent, DN as comprehensively infiltrated (ten enriched cell types), and DPN as mast-cell-dominated, identifying a cellular mechanism for TLR4-driven neuroinflammation that has not previously been systematically characterised. GLUT4 achieved perfect diagnostic discrimination for DPN (AUC = 1.000, p < 0.001; LASSO coefficient = -2.143), whilst SGLT2 was the leading DAD diagnostic marker (AUC = 1.000, p = 0.002). Epalrestat was the sole pan-complication drug repurposing candidate (significant connectivity reversal in four of five complications). Mendelian randomisation confirmed causal effects of T2DM genetic liability on all five complications (all p < 0.0001, all four estimators concordant), and eQTL-MR identified TLR4 (beta = +0.073, p = 0.006) and CD36 (beta = +0.070, p = 0.008) as causal risk factors for DN, SERCA2 reduced expression as a causal driver of DCM (beta = -0.085, p = 0.005), and SGLT2 expression as a causal protector against DN (beta = -0.070, p = 0.013). 
Conclusions: This twelve-phase investigation identifies a pan-complication CD36/TLR4 inflammatory dyad and a SERCA2 calcium-mitochondrial effector axis, both confirmed at seven independent analytical levels, including causal genomic inference. GLUT4 downregulation defines DPN at the diagnostic level with perfect accuracy and is explicable through a five-layer mechanistic chain from MODY transcription factor inactivation to ceRNA competitive pressure. Epalrestat warrants prospective evaluation beyond its established DPN indication. These findings collectively constitute the most comprehensive computational characterisation of metabolic transporter biology in diabetic complications to date. RESEARCH IN CONTEXT. What is already known about this subject? The five major diabetic complications (cardiomyopathy, nephropathy, retinopathy, peripheral neuropathy, and atherosclerosis) are individually well-characterised, and several key metabolic transporters, including SGLT2, CD36, TLR4, SERCA2, and GLUT4, have established roles in one or more of these conditions. Mendelian randomisation has confirmed that T2DM genetic liability causally increases the risk of each complication independently. However, no study has examined all ten major metabolic transporters across all five complications simultaneously, and the shared versus complication-specific regulatory architectures of these transporters remain entirely uncharacterised. What is the key question? Which metabolic transporters are consistently dysregulated across all five diabetic complications, which are complication-specific, and can their shared regulatory mechanisms, from RNA regulation through to causal genetic evidence, be used to identify diagnostic biomarkers and actionable therapeutic targets that transcend individual complication boundaries?
What are the key findings and their implications for the field? CD36 and TLR4 constitute a pan-complication inflammatory dyad confirmed at seven independent analytical levels, including Mendelian randomisation causal evidence (both p < 0.01 for diabetic nephropathy). SERCA2 is universally suppressed across all five complications and is a causal driver of diabetic cardiomyopathy by eQTL-MR (p = 0.005). GLUT4 is a perfect single-gene diagnostic for diabetic peripheral neuropathy (AUC = 1.000) and a causal renal protector. Mast cells are identified as the innate cellular effectors of TLR4-driven diabetic neuropathy. Epalrestat demonstrates pan-complication therapeutic potential beyond its licensed DPN indication. These findings provide a unified mechanistic framework and a translational roadmap grounded in causal genomic evidence, with implications for both complication-targeted and pan-complication therapeutic strategies.
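As an illustration of the inverse-variance weighted (IVW) estimator named among the four MR methods above, the following is a minimal sketch of the standard fixed-effect IVW formula computed from per-variant summary statistics. The function name, argument names, and the toy numbers are hypothetical and are not taken from the study; the authors' actual software and instrument selection are not described in the abstract.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    beta_exposure: variant-exposure effect sizes
    beta_outcome:  variant-outcome effect sizes
    se_outcome:    standard errors of the variant-outcome effects
    Returns (estimate, standard_error).
    """
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2          # inverse variance of the outcome effect
        numerator += weight * bx * by
        denominator += weight * bx * bx
    if denominator == 0.0:
        raise ValueError("all variant-exposure effects are zero")
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Toy, made-up summary statistics for two hypothetical variants.
estimate, std_err = ivw_estimate([0.1, 0.2], [0.01, 0.02], [0.01, 0.01])
print(estimate, std_err)
```

The sketch covers only IVW; MR-Egger, weighted median, and weighted mode use different weighting or intercept assumptions and are not shown.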
Panchumarthi, L. Y.; Kataria, S.; Wu, Y.; Hu, X.; Fedorov, A.; Kwak, H. G.
Background. Fairness-aware machine learning increasingly targets demographic performance disparities in clinical prediction, yet whether standard bias mitigation strategies genuinely improve equity in physiological signal analysis remains unclear. Age-based disparities in photoplethysmography (PPG)-based heart rate prediction present a particular challenge, as age-related performance differences may reflect context-dependent physiological structure rather than correctable artifacts. Methods. We evaluated three fairness interventions, inverse-frequency weighting (IF), Group Distributionally Robust Optimization (GroupDRO), and adversarial debiasing (ADV), applied via fine-tuning of a PPG foundation model across three clinical datasets spanning intensive care unit, laboratory, and consumer wearable contexts. Outcomes were assessed using a 2x2 framework classifying each intervention-dataset combination by the joint direction of change in mean absolute error (MAE) and fairness gap (FG) across age groups, yielding four outcome types: genuine improvement (G), leveling down (L), selective benefit (S), and both worse (W). Results. Across nine intra-domain conditions, no intervention simultaneously improved both MAE and FG (0/9 genuine improvement). The dominant pattern was leveling down (5/9): FG decreased but was accompanied by MAE degradation, indicating that apparent fairness gains were achieved at the cost of overall predictive performance. Age-group difficulty ordering varied across clinical contexts at baseline and was not preserved under intervention. In 18 cross-domain transfer conditions, genuine improvement was rare (4/18) and observed exclusively in non-MIMIC source configurations; models fine-tuned on MIMIC-sourced data yielded no genuine improvements (0/6). Embedding-level representation changes following fine-tuning did not reliably predict fairness outcomes. Conclusions. 
Age-based fairness interventions in PPG heart rate prediction indicate a leveling-down pattern rather than genuine equity improvement, suggesting that age-related performance gaps reflect context-dependent physiological structure not fully addressable through standard bias mitigation. Cross-domain transfer further amplifies this instability. These findings suggest that fairness evaluation frameworks for age-stratified physiological prediction should account for context-dependent performance structure rather than treating observed gaps as correctable bias.
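The 2x2 outcome framework described in this abstract can be sketched as a small classifier over the signs of the changes in MAE and fairness gap. This is an illustrative sketch only: the function names are hypothetical, the fairness gap is assumed here to be the spread (max minus min) of per-age-group MAE, "selective benefit" is assumed to mean improved MAE with a wider gap, and a zero change is treated as "not improved". None of these details are confirmed by the abstract.

```python
def fairness_gap(group_mae):
    """Assumed definition: worst-group MAE minus best-group MAE."""
    values = list(group_mae.values())
    return max(values) - min(values)

def classify_outcome(delta_mae, delta_fg):
    """Classify an intervention by change from baseline (negative = improved).

    G: genuine improvement (MAE and FG both decrease)
    L: leveling down (FG decreases, MAE does not)
    S: selective benefit (MAE decreases, FG does not) -- assumed meaning
    W: both worse
    """
    mae_better = delta_mae < 0
    fg_better = delta_fg < 0
    if mae_better and fg_better:
        return "G"
    if fg_better:
        return "L"
    if mae_better:
        return "S"
    return "W"

# Hypothetical per-age-group MAE (beats per minute) before and after fine-tuning.
baseline = {"young": 3.0, "middle": 4.0, "older": 6.0}
after = {"young": 5.0, "middle": 5.5, "older": 6.5}

delta_mae = sum(after.values()) / 3 - sum(baseline.values()) / 3
delta_fg = fairness_gap(after) - fairness_gap(baseline)
print(classify_outcome(delta_mae, delta_fg))
```

In the toy example the gap narrows only because every group gets worse, which is the leveling-down pattern the abstract reports as dominant.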
Chen, J.; Wang, J.; Du, S.; Chen, Y.; Li, K.; Song, J.; Liu, D.
Clinical pharmacokinetic (PK) modelling is constrained by sparse sampling, limited generalisability of single-drug models, and labour-intensive workflows, making it difficult to infer complete drug exposure from limited concentration observations. We present the Pharmacokinetic Foundation Model (PKFM), a grey-box Transformer framework pre-trained across 32 drugs that reconstructs concentration-time profiles from sparse concentration observations, dosing events, molecular descriptors, and physiological covariates while preserving output interpretability. In representative oral PK curves, three sparse input points recovered the principal absorption-elimination trajectory, achieving coefficient of determination (R²) = 0.992 for Midazolam oral and R² = 0.990 for Verapamil oral. Using reconstructed curves in NONMEM (nonlinear mixed-effects modelling) improved covariance stability and individual prediction accuracy. Contrastive-learning embeddings supported Top-10 physiologically based pharmacokinetic (PBPK) candidate retrieval, with 75.6% of observations within the 2-fold range. A pharmacometrics-informed AI Agent (PM Agent) outperformed general-purpose programming tools in stability and pairwise win rate on a standardised modelling benchmark, with each run requiring human pharmacometrician confirmation before downstream use. These results support cross-drug pre-trained PK models as an information-completion layer for sparse PK evidence and a structured scaffold for the modelling workflow; clinical or regulatory use requires prospective validation, broader external benchmarking, and independent expert assessment.
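The "within the 2-fold range" criterion quoted above is commonly computed as the share of predictions whose ratio to the observed value lies between 0.5 and 2. The sketch below assumes that conventional definition with inclusive bounds; the function name and the concentrations are hypothetical and are not the study's data.

```python
def fraction_within_fold(observed, predicted, fold=2.0):
    """Share of predictions with 1/fold <= predicted/observed <= fold.

    Assumes strictly positive observed concentrations and inclusive bounds.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for obs, pred in zip(observed, predicted):
        if obs <= 0:
            raise ValueError("observed concentrations must be positive")
        if 1.0 / fold <= pred / obs <= fold:
            hits += 1
    return hits / len(observed)

# Made-up concentrations (arbitrary units) for illustration.
observed = [1.0, 2.0, 4.0, 10.0]
predicted = [1.5, 5.0, 2.0, 9.0]
print(fraction_within_fold(observed, predicted))
```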
Zhu, Z.; Shan, S.
Background: Several lipid ratios have been linked to obstructive sleep apnea (OSA) risk in NHANES, yet two questions central to clinical translation remain unanswered: how much of the association is carried by central adiposity, and whether the dose-response curve contains an actionable threshold. We addressed both for the remnant cholesterol-to-HDL-C ratio (RC/HDL-C). Methods: We analysed 3,635 adults aged ≥20 years from NHANES 2015-2018. OSA risk was ascertained from the Sleep Disorders Questionnaire. Multivariable logistic regression estimated odds ratios across three nested models. Restricted cubic splines and segmented regression characterised the dose-response and located the inflection point. Mediation by body roundness index (BRI) was quantified by nonparametric percentile bootstrap (1,000 resamples). Discrimination was compared by ROC analysis, with stratified and trimmed-sample sensitivity analyses. Results: OSA risk was identified in 1,361 participants (37.4%). Each one-unit rise in RC/HDL-C carried 23% higher adjusted odds of OSA (OR 1.23, 95% CI 1.03-1.47); the highest quartile carried 49% higher odds than the lowest (P-trend < 0.001). The dose-response was nonlinear, with an inflection at RC/HDL-C = 0.232: below this point each 0.1-unit increase raised odds by 54% (OR 1.54, 95% CI 1.16-2.05); above it the curve plateaued. BRI mediated 82.7% of the total effect (ACME 0.039, P < 0.001), with the indirect pathway 2.8 times stronger in women. AUCs were 0.599 (BRI) and 0.564 (RC/HDL-C). Conclusions: RC/HDL-C showed a modest, threshold-shaped association with OSA risk in U.S. adults, with central adiposity (BRI) as the predominant mediating factor. These exploratory findings, based on questionnaire-defined OSA, warrant prospective validation in cohorts with polysomnography.
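For readers unfamiliar with the exposure, the sketch below computes RC/HDL-C and compares it with the reported inflection point of 0.232. It assumes the commonly used definition of remnant cholesterol (total cholesterol minus HDL-C minus LDL-C, all in the same units); the abstract does not state which formula the authors used, and the function names and lipid values are hypothetical.

```python
INFLECTION = 0.232  # inflection point reported in the abstract

def rc_hdl_ratio(total_chol, hdl_c, ldl_c):
    """RC/HDL-C, assuming RC = total cholesterol - HDL-C - LDL-C (same units)."""
    if hdl_c <= 0:
        raise ValueError("HDL-C must be positive")
    remnant = total_chol - hdl_c - ldl_c
    return remnant / hdl_c

def below_inflection(ratio, inflection=INFLECTION):
    """True when the ratio lies in the steep segment below the inflection."""
    return ratio < inflection

# Made-up lipid panel in mg/dL.
ratio = rc_hdl_ratio(total_chol=200.0, hdl_c=50.0, ldl_c=130.0)
print(ratio, below_inflection(ratio))
```

Because the ratio is unitless, mg/dL or mmol/L inputs give the same result as long as all three lipid values share one unit.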
Jovanova, M.; Bruegger, V.; Svirhrova, R.; Fuchs, M.; Jin, Q.; Wortmann, F.; Mitter, M.; Bechny, M.
One in four adults has insulin resistance (IR), a modifiable driver of type-2 diabetes that can precede diagnosis by a decade. However, IR assessment remains clinic- and laboratory-based, limiting repeated population screening. We tested whether free-living wearable data can detect IR in adults with normoglycemia or prediabetes. Machine-learning models using continuous glucose monitor (CGM)-based glucose dynamics and smartwatch-based heart rate/heart rate variability were developed in Study 1 (N = 97) and externally validated without retraining in Study 2 (N = 61, 31% IR prevalence). The best-performing CGM-based model achieved AU-ROC = 0.873 [0.756-0.967] and AU-PRC = 0.816 [0.640-0.934], outperforming an anthropometrics-only baseline (AU-ROC = 0.749, AU-PRC = 0.593). These findings are the first to detect IR from wearables without blood tests or structured glucose challenges, with performance comparable to the state of the art. By enabling continuous at-home screening, this approach can identify undetected at-risk individuals and trigger confirmatory blood tests to close detection gaps.
Ockenden, E. S.; Anguajibi, V.; Mpooya, S.; Ntegeka, B.; Mugume, T.; Nabatte, B.; Kabatereine, N. B.; Noble, A.; Chami, G. F.
Schistosomiasis causes a complex, difficult-to-diagnose form of liver fibrosis with high rates of life-threatening morbidity in resource-poor settings where there are often no trained sonographers. Protocols for diagnosis of schistosomiasis-related liver fibrosis have focused on difficult-to-acquire and subjective ultrasound images dependent on extensive expertise. Here we present SchistoTrackVideoNet, the first deep learning-based video model trained on easy-to-acquire standardised ultrasound video sweeps for classification of schistosomiasis-related liver fibrosis. This video-based classification model was trained and evaluated on video sweeps from 2140 participants aged 5-87 years from three districts in rural Uganda. We tested the model at a clinically relevant sensitivity threshold (≥90%) and achieved positive predictive values of 0.0968-0.5556 for diverse presentations of liver fibrosis. Our findings show potential for the use of easy-to-acquire video sweeps for diagnosis of schistosomiasis-related liver fibrosis and our model provides a proof-of-concept for deep learning applied to liver ultrasound video for diagnosis of schistosomiasis-related liver morbidity.
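Reporting positive predictive value at a fixed sensitivity threshold can be sketched as follows: pick the highest score cut-off that still reaches the target sensitivity, then compute PPV at that cut-off. This is a generic illustration, not the authors' evaluation code; the function name, scores, and labels are hypothetical.

```python
def ppv_at_sensitivity(scores, labels, min_sensitivity=0.90):
    """Return (threshold, ppv) at the highest threshold meeting min_sensitivity.

    scores: predicted probabilities; labels: 1 = fibrosis, 0 = no fibrosis.
    A case is called positive when its score is >= threshold.
    """
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one positive label is required")
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if tp / positives >= min_sensitivity:
            return threshold, tp / (tp + fp)
    raise ValueError("no threshold reaches the requested sensitivity")

# Made-up model scores for six hypothetical participants.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 0]
threshold, ppv = ppv_at_sensitivity(scores, labels)
print(threshold, ppv)
```

Fixing sensitivity high forces the cut-off low, so PPV falls as the condition becomes rarer, which is consistent with the wide PPV range reported across presentations.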
Mulley, J. F.
Aims CGM devices report glucose only within fixed limits (typically 40-400 mg/dL; 2.2-22.2 mmol/L), truncating extreme values to a boundary ("capping"). We characterised prevalence, duration, and consequences of capping in type 1 diabetes trial data. Materials and Methods We analysed 46,990,617 CGM readings from 948 participants across four publicly available clinical trial datasets (Dexcom G4 Platinum or G6 sensors). Capping prevalence, run duration, and associations with age, HbA1c and sex were characterised across all datasets. In the 77 participants of the Replace-BG trial CGM-plus-blood glucose monitor (BGM) arm, CGM-derived metrics were compared with contemporaneous BGM measurements across 1,162 non-overlapping 14-day windows. Results Between 93.5% and 100% of participants had at least one capped reading, and capped values comprised 0.47-0.98% of all readings. In the three datasets for which duration could be calculated, over 70% of upper-cap runs exceeded 15 minutes and over one third exceeded 60 minutes. Upper-limit capping was inversely associated with age (Spearman ρ -0.20 to -0.47, p≤0.002) in three of the datasets, and positively associated with baseline HbA1c (ρ 0.39-0.62, p<0.001) in all four datasets. A within-participant analysis showed that capping burden did not predict CGM-BGM divergence in any summary metric (all p>0.2), and a systematic CGM-BGM offset in mean glucose and time in range (TIR) reflected the physiological lag between blood and interstitial fluid rather than capping artefact. Conclusions Sensor limit capping is near-universal in type 1 diabetes, produces sustained periods of right-censored glucose data disproportionately affecting younger patients, and does not substantially distort standard summary metrics at the population level. Clinicians and trialists should be aware that CGM data can confirm extreme glucose events but cannot quantify their severity.
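The run-duration analysis described in this abstract can be illustrated with a short sketch that finds consecutive readings at the upper sensor limit and converts run length to minutes. It assumes an upper cap of 400 mg/dL (the typical limit quoted above), evenly spaced readings every 5 minutes, and no gaps in the trace; the function name and the sample trace are hypothetical and this is not the author's code.

```python
from itertools import groupby

def upper_cap_runs(readings, cap=400, interval_min=5):
    """Durations (minutes) of consecutive runs of readings at or above the cap.

    Assumes evenly spaced readings with no missing samples.
    """
    durations = []
    for is_capped, group in groupby(readings, key=lambda glucose: glucose >= cap):
        if is_capped:
            run_length = sum(1 for _ in group)
            durations.append(run_length * interval_min)
    return durations

# Made-up trace in mg/dL: one four-reading capped run and one single capped reading.
trace = [180, 400, 400, 400, 400, 250, 400, 120]
runs = upper_cap_runs(trace)
share_over_15_min = sum(1 for d in runs if d > 15) / len(runs)
print(runs, share_over_15_min)
```

Each capped run is right-censored: the sketch can report how long glucose stayed at the limit, but not how far above it the true value went.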
Berna, A. Z.; Panganiban, J.; Liu, Y.; Logan, J.; Russo, P.; Aryal, A.; Hafertepe, K.; Abu-Alreesh, S.; DeBosch, B.; Stoll, J.; John, A. R. O.
Background & Aims: Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD) is the leading cause of chronic liver disease in children. However, accurate, noninvasive diagnostic tools remain limited. Current screening methods are invasive or lack sensitivity. Breath-based volatile organic compound (VOC) analysis offers a simple approach with potential for point-of-care screening. This study aimed to identify and validate breath VOC signatures of pediatric MASLD. Approach & Results: We conducted a prospective, IRB-approved cohort study at the Children's Hospital of Philadelphia (CHOP). Children aged between 7 and 20 years with MASLD (n=22), as defined by hepatic steatosis either by liver biopsy or imaging and 1 cardiometabolic risk factor, and a control group without MASLD (n=20) were enrolled. Breath samples were collected using a standardized protocol and analyzed by untargeted comprehensive two-dimensional gas chromatography-mass spectrometry (GC×GC-MS). Machine learning and unsupervised clustering were applied to identify discriminatory VOCs and assess heterogeneity. Untargeted GC×GC-MS analysis identified a distinct breath VOC signature in children with MASLD compared with non-MASLD controls. A Random Forest model achieved a sensitivity of 73% and specificity of 65%, with AUC of 0.84. The VOC 2,4-dimethyl-1-heptene demonstrated strong diagnostic performance in the discovery cohort with a sensitivity of 85%, specificity of 77% and an AUC of 0.81. Unsupervised clustering revealed four MASLD subgroups with distinct volatile phenotypes associated with differences in liver enzymes and metabolic parameters. External validation in a second pediatric cohort confirmed reproducible reductions in o/p-xylene in subjects with MASLD. Conclusions: Pediatric MASLD is associated with a reproducible breath VOC signature identified by untargeted GC×GC-MS.
These findings support breath analysis as a scalable, noninvasive screening and stratification tool for pediatric MASLD and warrant validation in larger, longitudinal studies.