Patterns
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Data scarcity and stylistic heterogeneity pose major challenges for emotion intensity classification. This paper presents a cross-dataset augmentation framework that leverages prompt-conditioned generative models alongside deterministic and heuristic transformations to synthesize target-style examples for improved transfer learning. We introduce a unified taxonomy of augmentation strategies: Heuristic Lexical Perturbation (HLA), Prompt-Conditioned Generative Augmentation (CGA), Sequential Hybrid...
As of early 2026, over 115 million US adults (more than 1 in 3) have prediabetes, a condition with an annual conversion rate of 5%-10% to type 2 diabetes. Total diabetes (diagnosed and undiagnosed) affects approximately 40.1 million Americans, or 12% of the population, with roughly 1.5 million new cases diagnosed annually. Continuous Glucose Monitoring (CGM) provides real-time, 24/7 insights into glycemic variability, detecting dangerous highs, lows, and trends that HbA1c (a 3-month average) mis...
Background: Tuberculosis (TB) remains a major public health challenge in Nepal, with incidence rates substantially higher than global estimates. Accurate forecasting of TB incidence is essential for early warning systems, resource allocation, and targeted interventions. This study aimed to develop and validate a hybrid Seasonal Autoregressive Integrated Moving Average (SARIMA) and Convolutional Neural Network Auto-Regressive (CNNAR) model for TB incidence forecasting in Nepal. Methods: Monthly TB i...
Ambient AI documentation tools generate draft notes that clinicians can review and edit before signing off in electronic health records. Scalable computational approaches to characterize how clinicians modify drafts remain limited, yet are essential for evaluating and improving AI effectiveness. We examined the feasibility of a few-shot prompted large language model (LLM) for categorizing sentence-level edits between AI drafts and final documentation. We developed five label-specific binary mode...
Over 54 million Americans are aged 65+, with depression affecting 25-49% and anxiety exceeding 30% of assisted living residents. AI systems employing agentic orchestration exhibit 0.5-2% failure rates, which is unacceptable where a single missed crisis can be fatal. We designed and bench-evaluated Lilo Engine, a 5-layer deterministic therapeutic pipeline replacing a prior multi-agent orchestrator. Safety is enforced through structural invariants: a Guardian layer with 4-gate OR crisis detection runs unco...
Acquiring insights from electronic health records (EHRs) is slowed by manual analytical workflows that limit scalability and reproducibility. We present LATCH (LLM-Assisted Testing of Clinical Hypotheses), an agentic framework that converts natural language clinical hypotheses into fully auditable analyses on structured EHR data. LATCH integrates LLM-assisted semantic layers with deterministic execution pipelines to automate cohort construction, statistical analysis, and result reporting, while ...
Cross-jurisdictional pharmaceutical compliance requires comparative analysis of regulatory requirements across jurisdictions such as the US FDA and China's NMPA. Although large language models (LLMs) are increasingly explored for healthcare-related applications, their performance in cross-jurisdictional regulatory comparison has not been systematically characterized using dedicated benchmarks. This study introduces Sino-US-DrugQA, a bilingual benchmark dataset designed to evaluate LLM performance...
Long-term adherence to prescribed therapies remains a persistent challenge in chronic and ultra-rare conditions where clinical outcomes depend on continuous medication use. Even brief gaps in therapy can compromise disease control, yet patients frequently encounter structural barriers including high out-of-pocket costs, prior-authorization (PA) delays, annual re-verification cycles, and refill logistics that disrupt persistence. This study evaluates a patient-centric Markov-chain framework for a...
Longitudinal healthcare surveys frequently contain inconsistencies in self-reported onset ages, where participants report different ages for the same condition between enrollment and follow-up surveys. We propose two methods to handle this challenge. First, we introduce a procedure that aggregates inconsistency patterns to construct participant-level reliability scores, enabling researchers to stratify participants and prioritize analysis on high-reliability cohorts. Seco...
Health behaviors such as physical activity and sleep affect mental health, but the effect of each health behavior varies substantially across individuals, limiting the usefulness of generic behavioral recommendations. We collected one year of continuous wearable and ecological momentary assessment data from 3,139 participants in the Intern Health Study (2018-2023), and examined individual-level associations between wearable-derived features and mood across the internship year. The behaviors asso...
Sodium (23Na) magnetic resonance imaging (MRI) provides valuable metabolic information, but it is limited by a low signal-to-noise ratio (SNR) and long acquisition times. To overcome these challenges, we present a Deep Image Prior (DIP)-based framework that combines anatomically guided proton (1H) MRI and metabolically guided 23Na MRI denoising via a fused proton-sodium prior within a directional total variation (dTV) regularization scheme. The DIP-Fusion approach minimizes a variational loss fu...
Objective: This study investigates whether incorporating physiological coupling concepts into neural network design can support stable and interpretable feature learning for histopathological image classification under limited data conditions. Methods: A physiologically inspired architecture, termed CardioPulmoNet, is introduced to model interacting feature streams analogous to pulmonary ventilation and cardiac perfusion. Local and global tissue features are integrated through bidirectional multi-h...
Cox proportional hazard regressions are frequently employed to develop prognostic models for time-to-event data, considering both patient-specific and disease-specific characteristics. In high-dimensional clinical modeling, these biological features can exhibit high collinearity due to inter-feature relationships, potentially causing instability and numerical issues during estimation without regularization. For rare diseases such as acute myeloid leukemia (AML), the sparsity and scarcity of data...
Background: As sleep data from wearable devices are increasingly available in health research, there are new opportunities to understand sleep regulation behaviors as modifiable risk factors for disease. At such a large scale (tens of thousands of people over millions of day-level observations), prioritizing and interpreting sleep behaviors is challenging while maintaining biological relevance and modifiability. In this work, we aim to address this challenge by proposing a framework to interpret F...
Thyroid carcinoma is one of the most prevalent endocrine malignancies worldwide, and accurate preoperative differentiation between benign and malignant thyroid nodules remains clinically challenging. Diagnostic methods that medical practitioners use at present depend on their personal judgment to evaluate both imaging results and separate clinical tests, which creates inconsistency that leads to incorrect medical evaluations. The combination of radiological imaging with clinical information syst...
Medicine historically separates abstract clinical reasoning from physical intervention. We bridge this divide with MedOS, a general-purpose embodied world model. Mimicking human cognition via a dual-system architecture, MedOS demonstrates superior reasoning on biomedical benchmarks and autonomously executes complex clinical research. To extend this intelligence physically, the system simulates medical procedures as a physics-aware model to foresee adverse events. Generating and validating on the...
Background: Large language models (LLMs) are increasingly deployed in medical contexts as patient-facing assistants, providing medication information, symptom triage, and health guidance. Understanding their robustness to adversarial inputs is critical for patient safety, as even a single safety failure can lead to adverse outcomes including severe harm or death. Objective: To systematically evaluate the safety guardrails of state-of-the-art LLMs through adversarial red-teaming specifically designe...
Wearable devices present transformative opportunities for personalized healthcare through continuous monitoring of digital biomarkers; however, individual variations in device wear time could mask or otherwise impact signal identification. Despite the widespread adoption of wearable devices in research, no comprehensive framework exists for understanding how wear time varies across populations or for addressing wear time-related biases in analysis. Using Fitbit data from 11,901 participants in t...
Objective: Adverse events (AEs) resulting from medical interventions are significant contributors to patient morbidity, mortality, and healthcare costs. Prediction of these events using electronic health records (EHRs) can facilitate timely clinical interventions. However, effective prediction remains challenging due to severe class imbalance, missing labels, and the complexity of EHR records. Classical machine learning approaches frequently underperform due to insufficient representation of minor...
Non-alcoholic fatty liver disease (NAFLD) is a globally prevalent hepatic condition caused by the buildup of fat in the liver. It is frequently associated with metabolic comorbidities such as hypertension, cardiovascular disease (CVD), and prediabetes. However, early detection remains challenging due to the asymptomatic progression, and existing primary diagnostic methods, such as imaging or liver biopsy, are often expensive and inaccessible in rural areas. This study proposes a two-stage, inter...