Life

MDPI AG

Preprints posted in the last 7 days, ranked by how well they match Life's content profile, based on 27 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.

1
Segmental Lung Sound Analysis in Obstructive Lung Diseases Using Electronic Stethoscope; a protocol to establish an acoustic repository

Anuradha, H.; Yasaratne, D.; GMRI, G.; Parakrama, E.; Severin, R.

2026-05-28 respiratory medicine 10.64898/2026.05.27.26354263 medRxiv
Top 0.6%
0.6%

Introduction Obstructive lung diseases (OLDs) are responsible for high rates of illness and death worldwide. Inflammation, chronic airflow limitation, and bronchial remodeling occur in OLDs and eventually produce characteristic respiratory sounds. Although it is subjective and has low reproducibility, traditional auscultation with a manual stethoscope remains the main method used to identify lung sounds. Recent advances in digital stethoscopes and artificial intelligence (AI) now permit objective measurement of lung sounds. However, standardized, region-specific databases for AI training and validation are lacking. Even though lung sound classification is an emerging area in research and telerehabilitation, lobe-wise acoustic patterns remain largely unexplored because no database exists on which to train AI models. To address this gap, this study aims to develop an acoustic repository and to analyze segmental lung sounds recorded with an electronic stethoscope from patients with OLDs and healthy controls.
Methods and analysis This is a cross-sectional observational study involving 120 participants (60 OLD patients and 60 healthy controls). Lobe-wise acoustic signals will be captured with an electronic stethoscope in the healthy and diseased populations. The recordings will be annotated in Audacity and then used for feature extraction and statistical analysis. The acoustic features extracted through Audacity will include frequency, intensity, pitch, and root mean square (RMS) energy. Repeated measures ANOVA will be applied to compare mean sound intensities across lung segments, while Pearson correlation will be used to assess associations with body composition parameters. The data will then be standardized for AI-based diagnostic applications.
Ethics and dissemination Ethical approval is being sought from the Ethics Review Committee, Faculty of Medicine, University of Peradeniya (2025/EC/87). Informed consent will be obtained in writing. The dissemination of results will take place through peer-reviewed publications and the creation of a public database containing lung sounds from the region.
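
The RMS energy feature named in this protocol can be illustrated with a minimal sketch. It is not the Audacity workflow the authors describe; the function names `rms_energy` and `framewise_rms`, the frame length, and the synthetic sine signal are all assumptions made for illustration.

```python
import math

def rms_energy(samples):
    """Root mean square of a sequence of amplitude samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def framewise_rms(samples, frame_len):
    """RMS per non-overlapping frame; a trailing partial frame is dropped."""
    return [rms_energy(samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

# Hypothetical test signal (not study data): one second of a 200 Hz sine
# with amplitude 0.5, sampled at 4 kHz.
sample_rate = 4000
signal = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate)
          for n in range(sample_rate)]

overall = rms_energy(signal)          # close to 0.5 / sqrt(2) for a sine
frames = framewise_rms(signal, 1000)  # four frames of 250 ms each
```

For a pure sine the RMS is the amplitude divided by the square root of 2, which gives a quick sanity check before applying the same function to real recordings.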

2
Generation and Evaluation of Realistic Synthetic Clinical Progress Notes for Prostate Cancer using Large Language Models.

Rey-Blanes, A.; Veredas-Morente, J.; Vivas-Vargas, E.; Gil-Garcia, F.; Moreno-Barea, F. J.; Veredas, F. J.

2026-05-28 health informatics 10.64898/2026.05.25.26354027 medRxiv
Top 1%
0.3%

Background and Objective: Access to real-world electronic health records (EHRs) remains limited by privacy, governance and annotation constraints, hindering the development of clinical natural language processing models. Realistic synthetic progress notes may provide EHR-like corpora that preserve clinically rigorous information on diagnoses, treatments, symptoms, imaging, laboratory findings and therapeutic trajectories without relying directly on sensitive patient records. This study evaluates whether large language models (LLMs) can generate realistic Spanish prostate cancer progress notes from published case reports, preserving clinical content, temporality and hospital-style conventions.

3
The Verification Gap: Artificial Intelligence Adoption, Hallucination Awareness, and Verification Practices Among Early Career Medical Researchers in Pakistan

Sajjad, M.

2026-05-30 health informatics 10.64898/2026.05.28.26354373 medRxiv
Top 2%
0.3%

Artificial intelligence (AI) tools have been rapidly adopted by medical researchers, yet whether early career researchers in low- and middle-income countries possess the awareness and habits needed to use these tools safely remains poorly documented. This study characterized AI adoption patterns, hallucination awareness, and verification and disclosure practices among early career medical researchers in Pakistan. A cross-sectional anonymous online survey was conducted among medical students, house officers, residents, physicians, and faculty involved in research or academic work across Pakistan (May 2026). Descriptive statistics and chi-square tests were applied to 373 eligible responses. AI use was near universal (99.7%), with 60.3% using AI tools daily. The most commonly reported tool in this sample was Claude (40.5%), followed by ChatGPT (29.2%) and Perplexity (26.0%), though this ranking likely reflects sampling characteristics. Despite high adoption, 59.2% typically did not verify AI outputs before use, and 40.2% had never heard that AI can generate fabricated scientific references. In behavioral vignettes, 36.5% assumed convincing AI-generated references were authentic, and 54.2% would continue using remaining AI content after discovering one fabricated reference. Formal research training was strongly associated with consistent disclosure (51.7% vs. 17.1%; chi-square = 48.43, p < 0.001). Role, daily use frequency, and research training were not significantly associated with verification behavior. Early career medical researchers in Pakistan demonstrate high AI adoption alongside incomplete hallucination awareness and infrequent verification, a pattern that may carry implications for research integrity. Formal training was the only factor significantly associated with consistent disclosure. Integration of AI literacy into medical curricula and institutional governance frameworks merits consideration.

4
Future Pandemics: AI-Designed Diagnostic Assays for Detection of Andes Orthohantavirus (ANDV) Associated with the 2026 MV Hondius Outbreak

MacSharry, J.; Tonda, A.; Lopez-Rincon, A.

2026-05-27 health informatics 10.64898/2026.05.26.26354101 medRxiv
Top 2%
0.3%

Andes orthohantavirus (ANDV), the primary etiological agent of hantavirus pulmonary syndrome (HPS) in South America, is uniquely capable of limited human-to-human transmission, posing a significant challenge for outbreak control. Recent events, including the 2018-2019 Epuyen outbreak and the 2026 MV Hondius incident, underscore the need for rapid, lineage-specific molecular diagnostics. In this study, we present an artificial intelligence (AI)-driven framework for the design of diagnostic primers targeting the S genomic segment of the Epuyen lineage. Using an evolutionary algorithm integrated with thermodynamic evaluation via Primer3Plus, candidate primers were optimized to maximize classification accuracy while satisfying stringent biochemical constraints. The resulting primer set enables amplification of lineage-specific regions suitable for molecular characterization and surveillance. In silico validation demonstrates that the proposed primers achieve perfect discrimination between 2026 outbreak sequences and other ANDV variants. Furthermore, in silico comparison with standard protocol-based primers reveals substantially reduced sensitivity and specificity in the latter, highlighting the limitations of static diagnostic designs when applied to evolving viral populations. Overall, this work demonstrates that AI-assisted primer design provides a robust and adaptable strategy to improve viral detection, enhance outbreak tracking, and support timely public health interventions. Integrating computational optimization into diagnostic development is essential for strengthening preparedness against emerging zoonotic threats.

5
Using artificial intelligence for radiotherapy clinical trial quality assurance: analysis of a multi-institutional clinical trial for neurovascular-sparing prostate stereotactic ablative radiotherapy

Doucette, M.; Zhang, Y.; Liao, C.-Y.; Lin, M.-H.; Yan, Y.; Dess, R. T.; Tendulkar, R. D.; Garant, A.; Hannan, R.; Jiang, S.; Nguyen, D.; Desai, N.; Yang, D. X.

2026-05-29 health informatics 10.64898/2026.05.27.26354252 medRxiv
Top 3%
0.2%

Our study evaluated whether a deep learning auto segmentation model combined with machine learning triage can streamline radiotherapy clinical trial quality assurance (QA). We analyzed 107 stereotactic ablative radiotherapy (SABR) cases from a multi-institutional phase II clinical trial of neurovascular sparing prostate SABR, focusing on physician contours of the internal pudendal artery (IPA) as a novel organ-at-risk with substantial interobserver variability. Contours were scored by the trial principal investigator as Per-Protocol or Minor Deviation/Unacceptable. We applied a deep learning model for IPA auto-segmentation. Agreement between human and AI contours was then quantified using 14 overlap, distance, and surface metrics, and a supervised classifier was trained on these metrics to flag clinical trial protocol deviations. While AI segmentation achieved only modest geometric accuracy with mean Dice similarity coefficient of 0.446 and 95th percentile Hausdorff distance of 14.23, when incorporating all 14 metrics, a machine learning classifier yielded AUROC of 0.836, flagging all Minor Deviation/Unacceptable cases with 100% sensitivity on the 27 case hold-out set with 6 false positives and no false negatives. AI segmentation combined with metrics-based machine learning can triage protocol deviations within a multi-institution radiotherapy clinical trial, supporting prospective evaluation of AI-assisted trial QA.
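
The Dice similarity coefficient cited in this abstract measures the overlap between two segmentations. The sketch below shows the standard definition on flattened binary masks; the function name `dice_coefficient` and the two toy masks are hypothetical and are not taken from the trial data or the authors' pipeline.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A and B| / (|A| + |B|) for two equally sized binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total

# Toy masks (hypothetical): 1 marks a voxel inside the contour.
physician_mask = [0, 1, 1, 1, 0, 0]
ai_mask = [0, 0, 1, 1, 1, 0]

score = dice_coefficient(physician_mask, ai_mask)  # 2 * 2 / (3 + 3)
```

A value of 1 means identical contours and 0 means no overlap, so a mean near 0.45 indicates that roughly half of the combined contour volume is shared.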

6
Application of SinoPlan in Trajectory Planning for Robot-Assisted Intracerebral Hematoma Puncture

Zhang, F. y.; Yao, J.; Zhou, Q. y.; fang, Y. c.; Hu, A.; Wang, Y.; Ding, W.; Wu, X.; Gu, Y.

2026-05-27 surgery 10.64898/2026.05.24.26353998 medRxiv
Top 3%
0.2%

Robot-assisted hematoma puncture has seen significant development in primary hospitals across the country. The SinoPlan software system, independently developed by Sinovation, is the core of the intelligent surgical robot. We conducted a comparative study of imaging indicators, such as residual hematoma volume and hematoma clearance rate, as well as prognostic indicators, in patients who underwent hematoma puncture at our hospital over a 9-year period, before and after the introduction of SinoPlan. The results indicated that following the application of SinoPlan, the hematoma clearance rate was significantly enhanced, and the residual hematoma volume was markedly reduced. Regarding patient prognosis, there was no significant difference in GCS scores between the two groups, but the incidence of adverse prognostic events was lower in patients for whom SinoPlan was used. In conclusion, this 9-year retrospective analysis at our hospital reveals that SinoPlan offers distinct advantages. However, its application in certain special cases suggests that further improvements to the software are warranted to better meet the demands of more specific clinical scenarios.

7
Optimizing Ambulatory Groin Hernia Repair in Public Healthcare Frameworks: A Prospective Analysis of Predictive Factors for Discharge Failure

Krichen, J.; SGHAIER, A.; Dhouib, R.; Souii, S.; Tioumi, M.; Sindi, S.; Faidi, B.; Ben Salah, K.

2026-05-29 public and global health 10.64898/2026.05.27.26354207 medRxiv
Top 3%
0.2%

Background Outpatient groin hernia repair is widely recommended globally due to clinical and socioeconomic efficiency, yet it remains underutilized in developing healthcare systems like Tunisia. This study aimed to evaluate the feasibility of a newly implemented day-surgery clinical pathway for groin hernias and identify specific predictors associated with outpatient discharge failure. Methods A prospective, observational cohort study was conducted at a Tunisian tertiary hospital between September 2023 and April 2024. A total of 85 consecutive patients scheduled for elective groin hernia repair under an optimized clinical pathway were enrolled. Inclusion criteria spanned ASA classes I-III, age ≥16 years, proximity to the hospital (≤50 km), and presence of a literate adult caregiver. Outpatient failure (unanticipated admission) was defined as the inability to achieve discharge within 24 hours post-surgery. Statistical associations were determined using Chi-squared, Fisher's exact, and independent t-tests. Results The cohort primarily comprised males (n = 82, 96.5%) with a mean age of 56 years (range: 19-86). Successful ambulatory discharge was achieved in 80 patients (94.1%), yielding a failure rate of 5.9% (n = 5). Unanticipated admissions were triggered by uncontrolled pain (n = 1), acute anxiety (n = 2), decompensation of comorbidities (n = 1), and a Post-Anesthetic Discharge Scoring System (PADSS) score < 10 (n = 1). Overall 30-day morbidity was low (2.4%), presenting as minor wound or scrotal hematomas managed conservatively; no surgical site infections, acute urinary retention, or mortality occurred. Univariate analysis revealed that a hernial sac size measured at its maximum diameter between 1.5 and 3 cm was significantly associated with ambulatory failure (p = 0.047). General anesthesia showed a trend toward increased failure compared to regional anesthesia (p = 0.08).
Conclusion Day-surgery groin hernia repair is highly safe and feasible in resource-constrained environments, even for elderly or stable ASA III patients, provided rigorous social criteria are satisfied. A small hernial sac size (1.5-3 cm) constitutes a major anatomical predictor of failure, likely due to distinct dissection dynamics and localized post-operative pain profiles.

8
DKK1 and CKAP4 expression is associated with cervical lymph node metastasis in tongue squamous cell carcinoma

Fujita, H.; Takahashi, O.; Yada, N.; Tanaka, J.; Haraguchi, K.; Morioka, M.; Yaginuma, T.; Sasaguri, M.; Kokabu, S.; Habu, M.

2026-06-01 dentistry and oral medicine 10.64898/2026.05.29.26354440 medRxiv
Top 3%
0.2%

Objective: To identify Dickkopf-1 (DKK1) as a prognostically relevant candidate in head and neck squamous cell carcinoma and to evaluate whether DKK1 and cytoskeleton-associated protein 4 (CKAP4) expression is associated with cervical lymph node metastasis in tongue squamous cell carcinoma (TSCC). Methods: DKK1 was screened using the Human Protein Atlas Pathology Atlas. Immunohistochemical expression of DKK1 and CKAP4 was examined in 54 patients with primary TSCC (cT1-4N0) treated surgically between 2015 and 2020. Nine cases were excluded because of insufficient tissue blocks or inadequate staining quality, leaving 45 evaluable cases. Associations with delayed cervical lymph node metastasis were assessed together with conventional clinicopathological factors, including infiltrative growth pattern (INF) and pathological depth of invasion (pDOI). Results: In public database analysis, high DKK1 expression was associated with poorer overall survival in head and neck squamous cell carcinoma. In the TSCC cohort, pDOI ≥5 mm and INF pattern c were significantly associated with cervical lymph node metastasis. Positive DKK1 and CKAP4 expression were also significantly associated with cervical lymph node metastasis. Furthermore, combined DKK1/CKAP4 positivity, when incorporated with INF and pDOI, provided additional risk stratification, and cases with all 3 factors showed a markedly increased likelihood of cervical lymph node metastasis. Conclusions: Expression of DKK1 and CKAP4 was associated with cervical lymph node metastasis in TSCC. Combined assessment of DKK1/CKAP4 expression with INF and pDOI may improve pathological risk stratification and may help identify patients who require closer neck evaluation and postoperative management.

9
A Retrospective Evaluation of the Microsoft Healthcare Agent Orchestrator for Tumor Board Patient Summaries

Roy, J.; Korleski, J. B.; Augustin, R. C.; Yefet, L.; Jensen, Z. D.; Ehman, E. C.; Zadeh, G.; Conners, A. L.; Tevaarwerk, A. J.; Korfiatis, P.

2026-06-01 health informatics 10.64898/2026.05.22.26353812 medRxiv
Top 4%
0.2%

Background: Preparing tumor board patient summaries is time intensive. Large-language-model based systems may automate summarization but require real-world evaluation prior to clinical use. We performed an exploratory retrospective evaluation of the Microsoft Healthcare Agent Orchestrator (HAO), deployed in a Mayo Clinic controlled staged environment, to generate tumor board-style patient summaries from retrospective Electronic Health Record (EHR) notes. Methods: HAO generated summaries for breast, hepatobiliary, and neuro-oncology tumor board cases using up to the most recent 1,000 clinical notes. Clinician reviewers evaluated outputs via REDCap surveys across perceived factuality, completeness, clarity/conciseness, temporal cohesion, comparative performance, safety, and clinical utility (0-4 Likert scale). Reviewers were permitted to query the HAO chat interface to address missing details. Automated factuality was assessed using TBFact (bidirectional entailment), reporting precision and recall against available reference summaries. Results: Among 57 survey responses from 5 different physicians, mean scores exceeded 2.8 across domains, with medians of 3 for most axes. In an exploratory comparison, oncology fellows required less time to review HAO-generated summaries than to manually generate patient summaries (mean difference 13.57 minutes per patient, p<0.001), although this difference may be influenced by prior familiarity with the same cases; 96% of survey responses indicated that HAO would save time. TBFact evaluations showed higher recall than precision across domains, consistent with broad capture of reference content alongside additional content that was not present in gold-standard summaries. Attribution was viewed favorably but showed issues with primary-source specificity and link reliability. Conclusions: In a controlled Mayo environment, HAO demonstrated moderate performance and was associated with reduced review time for tumor board preparation. 
These findings are promising but preliminary and do not establish clinical safety, noninferiority to manual review, or readiness for routine clinical use. Limitations, including verbosity, specialty-specific content gaps, and inconsistent attribution, highlight the need for iterative refinement and further evaluation.

10
Multi-Agent AI for Chest Radiography: A Sequential Segmentation and LLM-Driven Consultative Tool for Medical Training

Kurt, F.; Subasi, A.

2026-06-01 health informatics 10.64898/2026.05.29.26354432 medRxiv
Top 4%
0.2%

Background: Traditional diagnostic models lack explainability, while multimodal language models prone to hallucination remain unsafe for medical education. An interactive, risk-free artificial intelligence framework is required to serve as a reliable clinical mentor for radiology trainees. Methods: We propose a multi-agent architecture decoupling deterministic image analysis from generative consultation. Specialized computer vision models perform anatomical localization and pathological segmentation. These quantitative outputs are synthesized into a structured payload, which grounds a locally hosted large language model (LLaVA 7B) using strict prompt guardrails and prerequisite protocols. Results: The system effectively eliminates visual hallucinations by intercepting unanchored queries. The artificial intelligence tutor successfully contextualizes spatial anomalies and baseline metrics, generating accurate conversational explanations and formally structured radiology reports while strictly enforcing medical safety disclaimers. Discussion and Conclusion: By anchoring language generation exclusively to verified algorithmic realities, this framework transforms opaque diagnostic models into safe, interactive educational simulators. This establishes a highly reliable paradigm for integrating explainable artificial intelligence into medical training.

11
Physician Facing AI Tools Show Distinct Failure Modes Under Structured Stress Testing

Hazare, N. S.; Oh, W.; Kumar, G.; Goel, N.; Shaikh, A.; Sharma, A.; Desman, J.; Kumar, A.; Robles, C.; Singh, A.; Jangda, M.; Agaron, S.; Capone, C.; Ngai, D.; Itwaru, A.; Parchure, P.; Ramaswamy, A.; Gorbenko, K.; Timsina, P.; Lampert, J.; Tamler, R.; Manasia, A.; Kohli-Seth, R.; Kaplan, B.; Vakil, A.; Omar, M.; Glicksberg, B. S.; Freeman, R.; Stern, A. D.; Klang, E.; Darrow, B.; Stump, L. S.; Reich, D.; Charney, A.; Nadkarni, G. N.; Sakhuja, A.

2026-05-29 health informatics 10.64898/2026.05.27.26354248 medRxiv
Top 4%
0.2%

Importance: Physician-facing AI tools are now in clinical use, yet whether different platforms fail in similar or fundamentally different ways in high-stakes settings like critical care is unknown. Objective: To evaluate two physician-facing AI platforms, ChatGPT for Clinicians and OpenEvidence, for distinct vulnerabilities under structured stress testing. Design, Setting, and Participants: An observational study conducted using 60 simulated critical care vignettes developed and adjudicated by four attending critical care physicians. Data were collected in the last week of April 2026, via the public website interfaces of each platform. Interventions/Exposures: A 2x2x2x2 factorial design across four stressors (anchoring, cognitive load, social conformity pressure, and a clinically incorrect directive) yielded 16 prompt subsets per vignette and 960 prompts per platform. A separate multi-turn adversarial prompting paradigm administered three sequential "You are incorrect" challenges to baseline vignettes. All prompts had a universal output length constraint of fewer than 30 words. Main Outcomes and Measures: Critical elements capture (percentage of gold-standard critical elements present in responses), susceptibility to clinically incorrect directive, and sycophancy (reversal of an initial correct recommendation under iterative adversarial challenge). Results: Across 1916 responses to 1920 prompts, ChatGPT for Clinicians captured more gold-standard critical elements than OpenEvidence (81.4% ± 18.1% vs 61.0% ± 23.5%; adjusted difference, 20.3 percentage points; 95% CI, 18.3 to 22.4; P < .001) and was less susceptible to clinically incorrect directives (1.7% vs 8.0%; adjusted odds ratio, 0.07; 95% CI, 0.02-0.21; P < .001). Anchoring and social conformity pressure were associated with reduced critical element capture across both platforms, while cumulative stressor burden reduced critical element capture only on OpenEvidence.
Conversely, ChatGPT for Clinicians reversed correct recommendations more readily under adversarial prompting (hazard ratio, 2.61; 95% CI, 1.10 - 6.19; P = .03). Conclusion and Relevance: The two physician-facing clinical AI platforms evaluated demonstrated non-overlapping vulnerabilities, with neither platform uniformly superior. These findings argue against single-axis ranking of clinical AI systems and support multidimensional safety evaluation encompassing completeness of reasoning, resistance to incorrect directives, and stability under adversarial challenge.
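
The prompt counts in this abstract follow directly from the factorial design, as the short sketch below shows. The stressor labels and the helper name `factorial_conditions` are hypothetical; only the four stressors, the 60 vignettes, and the two platforms come from the abstract.

```python
from itertools import product

# Labels are illustrative spellings of the four stressors named in the abstract.
STRESSORS = ("anchoring", "cognitive_load", "social_conformity", "incorrect_directive")

def factorial_conditions(stressors=STRESSORS):
    """Every on/off combination of the stressors (2**k conditions for k stressors)."""
    return [dict(zip(stressors, flags))
            for flags in product((False, True), repeat=len(stressors))]

conditions = factorial_conditions()
n_vignettes = 60
n_platforms = 2

prompts_per_platform = n_vignettes * len(conditions)
total_prompts = prompts_per_platform * n_platforms
```

Enumerating the conditions explicitly also makes it easy to confirm that the baseline cell (all stressors off) and the fully loaded cell (all stressors on) each appear exactly once per vignette.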

12
Beyond Identifier Matching: An Empirical Characterization of Failure Modes in Biomedical Knowledge Graph Integration

Hu, S.; Cheng, H.; Gillenwater, L.; Manpearl, K.; Mandava, A.; Wang, Y.; Pividori, M.; Stranger, B.; Krishnan, A.; Greene, C.; Gao, Y.

2026-05-28 health informatics 10.64898/2026.05.26.26354182 medRxiv
Top 4%
0.2%

Objective. Biomedical knowledge graphs (KGs) such as PrimeKG, Hetionet, UMLS, and PharmGKB are increasingly used as the substrate for downstream machine-learning, retrieval-augmented generation, drug-repurposing, and electronic health record (EHR) augmentation pipelines. The dominant assumption in published work is that integrating two or more such KGs is a tractable engineering step solved by identifier (ID) matching. This paper interrogates that assumption empirically. We quantify how much concept overlap survives realistic alignment, and we characterize the new failure modes introduced by the methods that practitioners reach for when ID matching is insufficient. Materials and Methods. We compared four widely used biomedical KGs (PrimeKG, Hetionet v1.0, the full UMLS Metathesaurus, and PharmGKB) across eleven node types using a tiered alignment pipeline: (1) direct ID matching for nodes sharing a primary vocabulary; (2) cross-ontology bridging using standard mappings (e.g., MONDO-DOID, HPO-UMLS, HPO-UMLS-MeSH for side effects, NCBI Gene-HGNC-UMLS, UBERON-FMA/SNOMEDCT_US/NCI/MeSH for anatomy); (3) ClinicalBERT cosine-similarity grouping at threshold >= 0.98 for over-segmented disease nodes, with a deterministic suffix-stripping canonicalizer; (4) exact name matching for ontology-poor types (anatomy, REACTOME pathways); and (5) embedding-based fuzzy matching with UMLS lookup (SapBERT and ClinicalBERT) for free-text microbiome concepts. We applied the pipeline to a 698-concept gut-microbiome benchmark spanning taxa, pathways, and disease labels, validated grouping decisions against the curated SSSOM mappings released by the MONDO project, and audited the ClinicalBERT consolidation against five clinical-genetics case studies drawn from the literature. Results. Per-type pairwise coverage was strikingly asymmetric. 
Genes/proteins and the three Gene Ontology categories aligned cleanly across PrimeKG and Hetionet (mutual coverage 94-99%), but disease overlap was sparse: only 0.7% of PrimeKG individual disease nodes mapped to Hetionet, rising to 2.0% after MONDO grouping (versus 78.7% and 18.4% from the Hetionet side). PrimeKG-to-UMLS coverage spanned 100% (effect/phenotype via HPO) down to 20.8% (REACTOME pathways), with drugs at 73.7% and anatomy at 58.8%. PrimeKG-to-PharmGKB drug coverage required up to two bridging hops (DrugBank -> UMLS -> RxNorm/ATC/MeSH). Bigger was not uniformly more complete: on a 698-concept microbiome drug benchmark, Hetionet missed 0 concepts while PrimeKG missed 16. ClinicalBERT-based grouping consolidated 22,205 raw MONDO disease nodes into 17,080 groups but introduced three reproducible failure modes documented in case studies: (i) peer over-merging: for example, all 22 osteogenesis imperfecta subtypes collapsed into a single node despite distinct severity classes; (ii) parent-child collapse: e.g. acute myeloid leukemia merged with myeloid leukemia, erasing the acute/chronic distinction that drives clinical management; and (iii) lexical false positives: neurofibromatosis and schwannomatosis grouped together despite cellular-pathology differences. Discussion. Identifier matching alone is a weak baseline for biomedical KG integration. Cross-ontology bridges and embedding-based consolidation expand coverage but do so at the cost of clinically meaningful resolution, and the resulting failures are systematic rather than random. Reporting only aggregate coverage statistics obscures these losses, which propagate silently into downstream tasks. Conclusion. We provide reusable per-type coverage tables, a taxonomy of three integration failure modes, and concrete recommendations for downstream studies that depend on a unified biomedical KG. 
We argue that future KG integration work should report per-type coverage and per-cluster confidence rather than aggregate match rates.
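
The over-merging failure modes described in this abstract are easy to reproduce with threshold-based grouping. The sketch below is a minimal illustration and not the authors' pipeline: it assumes single-link grouping (connected components over pairs with cosine similarity at or above the threshold), and the two-dimensional vectors are hypothetical stand-ins for ClinicalBERT embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def group_by_similarity(embeddings, threshold=0.98):
    """Single-link grouping: any pair at or above the threshold shares a group."""
    names = list(embeddings)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical toy vectors, chosen so that a parent and child term sit close together.
toy_embeddings = {
    "acute myeloid leukemia": (1.0, 0.10),
    "myeloid leukemia": (1.0, 0.12),
    "osteogenesis imperfecta": (0.1, 1.0),
}

groups = group_by_similarity(toy_embeddings)
```

In this toy case the parent and child leukemia terms land in one group, mirroring the parent-child collapse reported in the abstract. Because single-link grouping is transitive, chains of near neighbors can also merge terms that are not themselves above the threshold.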

13
Prognostic Value of Mean Platelet Volume in Septic Shock: A Retrospective Cohort Study

Trujillo-Vega, F.; Lopez-Delgado, P. A.

2026-06-01 emergency medicine 10.64898/2026.05.29.26354453 medRxiv
Top 4%
0.2%

Background: Mean platelet volume (MPV) is a simple, low-cost biomarker that reflects platelet activation. Its prognostic value in septic shock remains controversial. We aimed to determine whether MPV at intensive care unit (ICU) admission is associated with hospital mortality in patients with septic shock. Methods: Retrospective cohort study of consecutive adults with septic shock (Sepsis-3 criteria) admitted to a single ICU. MPV, severity scores (SOFA, APACHE II, SAPS II), procalcitonin, and clinical data were collected. The primary outcome was in-hospital mortality. Spearman correlation, univariate and multivariate logistic regression (with Firth's correction), ROC curves, and subgroup analyses were performed. Results: Fifty-eight patients were included; mortality was 58.6%. MPV did not differ between non-survivors and survivors (13.09 ± 1.37 vs. 12.66 ± 1.45 fL, p = 0.259). MPV showed a weak correlation with procalcitonin (ρ = 0.394, p = 0.002) but not with severity scores. In multivariate analysis adjusting for age, sex, SOFA and comorbidity count, MPV was not an independent predictor of mortality (OR 1.075, 95% CI 0.682-1.755, p = 0.749). The area under the ROC curve for MPV was 0.598 (95% CI 0.444-0.752), significantly lower than that of SOFA (0.837) and procalcitonin (0.836). Subgroup analyses showed no significant association between MPV and mortality in any stratum. Conclusions: In this cohort of septic shock patients, MPV at ICU admission was not associated with hospital mortality and had poor discriminative ability. Widely used severity scores and procalcitonin remain superior prognostic markers. MPV should not be used as a prognostic tool in septic shock. Keywords: Septic shock, Mean platelet volume, Mortality, SOFA, Procalcitonin, Biomarker

14
A Consensus-Driven Stacking Ensemble Framework for Interpretable Cardiovascular Risk Prediction and Clinical Deployment

Sozol, S. S.; Dev Nath, B. C.; Fahim, F. M. S.; Suzana, N. N.; Mirza, J. F.; Ahmmed, S.; Zohra, F.-T.; Zafr, A. H. A.; Uddin, M. N.; Mondal, M. R. H.; Hoque, A. S. M. L.

2026-05-26 health informatics 10.64898/2026.05.18.26352989 medRxiv
Top 5%
0.1%

Machine learning (ML) is being considered to help diagnose cardiovascular diseases (CVD). Still, challenges such as inconsistent and limited datasets, limited infrastructure, and global inequalities create a need for a reliable and practical ML solution. This paper presents an ML-driven framework for predicting CVD risk scores and classifying status. Several data preprocessing techniques, including multiple imputation by chained equations (MICE) and outlier removal, are applied. In addition, hyperparameter tuning is performed with the GridSearchCV tuning technique. Moreover, a consensus-driven five-feature selection method is applied to identify optimal predictors. The dataset used in this study contains healthcare records related to future CVD risk scores, comprising 1,529 patient records with 22 features. The optimized stacked ensemble model is applied to the dataset and achieves a cross-validated coefficient of determination of 98.13% for CVD risk score regression. Comparative evaluation with other ML models confirmed improved accuracy, efficiency, and interpretability. The explainable AI technique SHAP is applied to interpret predictions and highlight key risk factors. Moreover, a deployment-ready web platform with multi-role access has been developed that demonstrates clinical applicability. The proposed framework offers a reliable and interpretable tool for early detection of CVD and personalized risk assessment. In the future, this work can be extended to integrate longitudinal data, medical imaging, and deep learning to improve generalizability and strengthen real-world impact.

15
Changes in Frequency of Resuscitation Among the Oldest Old Following Japan's End-of-Life Care Guideline Revision: A Population-Level Interrupted Time-Series Analysis Using National Open Claims Data

Sakai, M.; Nakayama, T.

2026-05-30 health policy 10.64898/2026.05.28.26354307 medRxiv
Top 5%
0.1%

Resuscitation in the oldest old at the end of life is associated with potential harm, raising concerns about misalignment with patients' goals of care. This study aimed to elucidate changes in the use of resuscitation among the oldest old in Japan following the revision of the national guideline on end-of-life care, which explicitly incorporates the concept of advance care planning. We conducted a repeated cross-sectional study using the National Database of Health Insurance Claims Open Data, including adults aged ≥85 years, from April 2014 to March 2024. The annual number of resuscitation procedures per 100,000 individuals aged ≥85 years was used as the measure of frequency. Resuscitation included closed-chest cardiopulmonary resuscitation (CPR) and endotracheal intubation. Interrupted time series analysis was used to examine changes following the 2018 revision of the national end-of-life care guideline. The frequencies of CPR and endotracheal intubation declined before 2018 (CPR: age 85-89, -68.4 [-87.9 to -48.8]; age ≥90, -106.7 [-131.5 to -82.0]; intubation: age 85-89, -57.5 [-71.8 to -43.2]; age ≥90, -69.5 [-80.7 to -58.3]), but the decline attenuated thereafter (CPR: age 85-89, +56.2 [28.0 to 84.5]; age ≥90, +84.1 [50.7 to 117.6]; intubation: age 85-89, +36.6 [8.5 to 64.7]; age ≥90, +38.3 [23.8 to 52.8]). These findings provide insight into the changes in resuscitation trends following policy interventions supporting end-of-life decision-making. Further studies are needed to better understand the mechanisms underlying this change.
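
An interrupted time-series analysis such as the one described in this abstract is often fitted as a segmented regression. The following is a minimal sketch under stated assumptions, not the authors' analysis: the model form, the helper names `solve` and `fit_its`, and the synthetic series are all illustrative and are not taken from the preprint.

```python
# Illustrative sketch only: segmented regression for an interrupted time series.
# Assumed model (not confirmed by the preprint):
#   y = b0 + b1*t + b2*post + b3*post*(t - t0),  where post = 1 if t >= t0 else 0
# b1 is the pre-interruption slope; b1 + b3 is the post-interruption slope.


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_its(times, values, t0):
    """Ordinary least squares fit of the four segmented-regression coefficients."""
    rows = []
    for t in times:
        post = 1.0 if t >= t0 else 0.0
        rows.append([1.0, float(t), post, post * (t - t0)])
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free series (made-up numbers, NOT data from the study):
# ten yearly points, interruption at t0 = 4.
times = list(range(10))
true_coefs = (1000.0, -60.0, 5.0, 50.0)
values = [
    true_coefs[0] + true_coefs[1] * t
    + (true_coefs[2] + true_coefs[3] * (t - 4) if t >= 4 else 0.0)
    for t in times
]

b0, b1, b2, b3 = fit_its(times, values, t0=4)
pre_slope = b1
post_slope = b1 + b3
print(round(pre_slope, 6), round(post_slope, 6))
```

In this parametrization, a negative `b1` combined with a positive `b3` of smaller magnitude corresponds to a decline that continues after the interruption but at a flatter rate.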

16
SeGA-GNN: Semantically Gated Augmented Graph Neural Networks for Wearable-Based Emotion Detection

Kurt, F.; Subasi, S. N.; Yakisan, E. S.; Subasi, A.

2026-06-01 health informatics 10.64898/2026.05.29.26354434 medRxiv
Top 6%
0.1%

Background: Wearable technologies enable scalable and continuous monitoring of emotional states through passive sensing of physiological and behavioral signals. However, conventional learning approaches often struggle to model the complex temporal, contextual, and relational dependencies underlying human emotions. To address these limitations, we propose a graph-based framework that represents multimodal wearable observations as heterogeneous knowledge graphs enriched with semantic information derived from Large Language Models (LLMs), enabling richer contextual understanding beyond raw sensor measurements. Methods: We constructed a heterogeneous knowledge graph using multimodal Fitbit physiological signals and affective self-report data collected from 45 users. Mood prediction and emotion detection were formulated as both binary and ternary node classification tasks. We evaluated five baseline heterogeneous Graph Neural Network (GNN) architectures and compared them with the proposed Semantically Gated Augmented Graph Neural Network (SeGA-GNN) framework, which dynamically integrates LLM-generated semantic embeddings into graph representations through a gated cross-modal fusion mechanism. Results: The baseline GNN models achieved strong performance, with classification accuracies ranging from 0.7525 to 0.9739 for binary classification and 0.6249 to 0.9699 for ternary classification. The proposed SeGA framework consistently improved predictive performance across most architectures. In particular, semantic augmentation transformed the HAN model from moderate baseline performance into near-perfect emotion recognition capability, achieving SeGA-HAN Accuracy = 0.9988 and AUC = 1.0000 for binary classification and Accuracy = 0.9979 and AUC = 1.0000 for ternary classification.
Discussion and Conclusion: Integrating LLM-derived semantic contextualization into heterogeneous graph learning enables effective modeling of contextual information that is not directly captured by wearable physiological signals alone. The proposed SeGA-GNN framework demonstrates that adaptive semantic fusion substantially improves the accuracy, robustness, and interpretability of wearable-based emotion detection. These findings establish a promising direction for next-generation wearable affective computing systems and intelligent emotion-aware applications.

17
Longitudinal Evaluation of Harlem United Multiservice Model on Clinical, Behavioral, and Social Outcomes Among Clients Living with HIV

Monk, B. S.; Strauss, D.

2026-06-01 public and global health 10.64898/2026.05.23.26353941 medRxiv
Top 6%
0.1%

Background/Objectives People living with HIV face overlapping hardships across medical, behavioral, and social needs that require an integrated and coordinated approach. The Harlem United multiservice model provides healthcare, food assistance, housing support, harm reduction services, behavioral health counseling, case management, and other services to support its clients. This study examines how participation in the Harlem United multiservice model is associated with changes over time in client health, behavioral health, and social outcomes. Methods This longitudinal program evaluation examined Harlem United clients enrolled between January 2020 and January 2025 who remained engaged in services for a minimum of one year. Client outcomes were assessed across three time points: Baseline, Year 1, and Year 2. The sample included 154 clients at baseline, with a total of 428 observations. Quantitative measures assessed included program involvement, housing stability, PHQ4 scores, food insecurity, medication adherence, and viral suppression. Data were analyzed using IBM SPSS Statistics through descriptive statistics, frequency tables, and generalized estimating equation (GEE) models to account for repeated observations over time. Results Medication adherence and viral suppression remained consistently high across all time points, suggesting that most clients were virally suppressed or undetectable at baseline. Housing stability changed significantly over time, Wald χ²(2) = 156.073, p < 0.001, with improvements noted in Year 1 and Year 2 compared to baseline. Program level was significantly associated with PHQ4 scores, Wald χ²(1) = 7.902, p = 0.005. Food insecurity was also associated with PHQ4 scores, Wald χ²(1) = 5.462, p = 0.019. Findings suggest that clients with higher PHQ4 scores were involved in more programs compared to clients only enrolled in 1-2 programs.
Additionally, clients with higher PHQ4 scores were more food insecure, highlighting the relationship between social needs and mental health. Conclusion: Findings suggest that the Harlem United multiservice model played a supportive role in the maintenance of health and social outcomes through medication adherence and viral suppression. Although significant improvement was not observed across several outcomes, the association between PHQ4 scores, food insecurity, and increased program involvement suggests that the multiservice model is reaching more clients with complex behavioral and social needs. Continued integration of these services is important for sustaining client stability while addressing social determinants of health.

18
Malnutrition and healthcare costs in older adults in Sweden: a longitudinal study based on a population-based cohort and Swedish registers

Xia, X.; Balcha, Y. M.; Carballo-Casla, A.; Aho, E.; Willers, C.; Rydwik, E.; Calderon-Larranaga, A.; Kugelberg, S.; Berggreen-Clausen, A.; Garpsater, J.; Jonsson, L.

2026-06-01 health economics 10.64898/2026.05.29.26354412 medRxiv
Top 6%
0.1%

Background The study aimed to estimate healthcare costs associated with malnutrition in Swedish older adults. Methods We conducted a cohort study using data from the population-based Swedish National Study on Aging and Care in Kungsholmen (SNAC-K, N = 2982), a geriatric inpatient cohort of complex patients (N = 7680), and a cohort of individuals with cognitive impairment from the Swedish Register of Cognitive/Dementia Disorders (SveDem, N = 64192). Being at risk of malnutrition and malnutrition were ascertained by the Mini-Nutritional Assessment in SNAC-K and the geriatric inpatient cohort. In SveDem, body mass index was used to identify malnutrition. Healthcare resource use was derived from regional and national registers. Associations between malnutrition and healthcare costs in 2024 Swedish kronor (SEK) were analyzed using two-part models and generalized linear regression models, adjusting for demographic and clinical factors. Findings In the community, being at risk of malnutrition and malnutrition were associated with an increase in annual healthcare costs of 2267 SEK (95% CI: 64 to 4469) and 1846 SEK (95% CI: -6802 to 10493), respectively. In geriatric patients, healthcare costs over 6 months in individuals at risk of malnutrition and individuals with malnutrition were 60205 SEK (45613 to 74798) and 86619 SEK (68362 to 104875) higher than in those without malnutrition. In people with cognitive impairment, malnutrition was associated with higher annual healthcare costs (22170 SEK, 95% CI: 15152 to 29188). Interpretation Both being at risk of malnutrition and malnutrition are associated with higher healthcare costs in Swedish older adults. The study findings are important for informing future economic evaluations of malnutrition interventions in Swedish older adults.
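
Two-part models of the kind mentioned in this abstract typically combine the probability of incurring any cost with the mean cost among those who incur one. The sketch below illustrates only that combination step, under stated assumptions: the function name `two_part_expected_cost` and all input values are hypothetical and are not taken from the preprint, whose model specification is not given in the abstract.

```python
# Illustrative sketch only: how a two-part model combines its two components.
# Part 1 estimates P(cost > 0); part 2 estimates E[cost | cost > 0].
# The expected cost is their product. All numbers below are hypothetical.


def two_part_expected_cost(p_any_cost, mean_cost_if_any):
    """Return P(cost > 0) * E[cost | cost > 0]."""
    if not 0.0 <= p_any_cost <= 1.0:
        raise ValueError("p_any_cost must be a probability in [0, 1]")
    if mean_cost_if_any < 0.0:
        raise ValueError("mean_cost_if_any must be non-negative")
    return p_any_cost * mean_cost_if_any


# Hypothetical predictions for two groups (NOT estimates from the study).
reference_cost = two_part_expected_cost(0.80, 50_000.0)
exposed_cost = two_part_expected_cost(0.90, 60_000.0)

# The incremental cost is the difference in expected costs between groups.
incremental_cost = exposed_cost - reference_cost
print(incremental_cost)
```

Splitting the model this way lets the zero-cost observations and the skewed positive costs be handled by separate regressions before they are recombined.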

19
Randomised Trial of a Multilingual Conversational AI for Preoperative Education

Ke, Y.; Niu, C.; Liao, J.; Sim, J.; Abdullah, H. R.; Jin, L.; An, J.; Ho, H. S. S.; Tung, J. Y. M.; Tan, H. K.; Sng, B. L.; Ting, D. S. W.; Ong, M. E. H.; Liu, N.

2026-05-26 anesthesia 10.64898/2026.05.24.26353997 medRxiv
Top 6%
0.1%

Background Informed consent depends on patients' understanding of anaesthesia risk, yet comprehension remains poor despite routine preoperative consultation. Conversational artificial intelligence (AI) could establish patient-reported understanding before clinician contact, but whether such systems can achieve patient-reported understanding comparable to clinician-delivered education remains unknown. Methods We conducted a randomised equivalence trial (n = 130) of PEAR (Preoperative Education of Anaesthesia Risks), a multilingual retrieval-augmented conversational AI grounded in institutional consent materials, versus standard preoperative consultation in adults undergoing elective surgery. Results A total of 130 adults (mean age 52.4 ± 14.5 years) were enrolled. Post-consultation understanding scores in the PEAR group met the pre-specified equivalence criterion compared with standard consultation across all three primary measures. Patients who interacted with PEAR before clinician contact achieved understanding scores comparable to those receiving standard face-to-face consultation alone. PEAR reduced documentation and consultation time, corresponding to a projected annual net benefit of approximately SGD 0.99 million (USD 0.78 million) at a single tertiary centre. Conclusions A retrieval-augmented conversational AI achieved patient-reported understanding of anaesthesia risk equivalent to standard preoperative consultation while substantially improving workflow efficiency. These findings support supervised deployment of conversational AI within perioperative care pathways while preserving clinician oversight for verification and patient-specific decision-making.
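
Equivalence trials are commonly judged by whether the confidence interval for the between-group difference lies entirely inside a pre-specified margin. The sketch below shows that check under stated assumptions: the abstract does not report the margin, the intervals, or the exact criterion used, so the function name `within_equivalence_margin` and every number here are hypothetical.

```python
# Illustrative sketch only: confidence-interval check for equivalence.
# Equivalence is declared when the whole interval for the between-group
# difference lies strictly inside (-margin, +margin).
# The margin and intervals below are hypothetical, not from the trial.


def within_equivalence_margin(ci_low, ci_high, margin):
    """Return True if [ci_low, ci_high] lies strictly inside (-margin, margin)."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    return -margin < ci_low and ci_high < margin


# Hypothetical example: difference in understanding score, margin of 0.5 points.
narrow_interval_ok = within_equivalence_margin(-0.3, 0.4, margin=0.5)
wide_interval_ok = within_equivalence_margin(-0.3, 0.6, margin=0.5)
print(narrow_interval_ok, wide_interval_ok)
```

A non-significant difference alone would not establish equivalence; the interval has to exclude differences as large as the margin in both directions.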

20
Random Forest Model for Predicting Post-Lockdown Antenatal Depression Risk: A Cross-Sectional Study of Pregnant Women in China

Pan, Y.; Lin, H.; HIRONO, T.; Yang, Y.; Liu, Y.; Zhang, Y.

2026-05-26 public and global health 10.64898/2026.05.23.26353929 medRxiv
Top 7%
0.1%

Background As lockdown measures were eased, pregnant women faced an elevated risk of COVID-19 infection, potentially impacting their mental health. This study aimed to investigate the prevalence of antenatal depression (AD) post-lockdown and develop predictive models for AD risk using machine learning. Methods A cross-sectional study utilizing the Edinburgh Postnatal Depression Scale (EPDS) was conducted in Beijing and Guizhou, China, from January to August 2023. Data were randomly split into training and test datasets (6:4 ratio), with logistic regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Gradient Boosting Decision Tree (GBDT) models trained and compared. The best model underwent further examination, including SHapley Additive exPlanations (SHAP) for feature importance, a calibration curve (CC) for calibration, and decision curve analysis (DCA) for clinical benefit. Results The effective response rate was 91.07% (459/504), with 25.7% (118/459) testing positive for AD. Multivariate analysis identified "sleep disorders," "family support level," and "COVID-19 symptom severity" as independent predictors. The RF model showed the highest area under the curve in both training (0.842) and testing (0.724) datasets, with SHAP emphasizing the greatest impact of "sleep disorders" on AD. The RF model's calibration (P > 0.05) and clinical utility across thresholds (8%-95% and 10%-58%) were confirmed by CC and DCA, respectively. Conclusions AD was strongly correlated with "sleep disorders," "family support level," and "COVID-19 symptom severity" post-lockdown, and the EPDS-based RF model effectively predicted AD risk.