Med
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Background & Aims: Liver cancer primarily develops in patients with chronic liver disease (CLD), yet most cases are diagnosed at an advanced stage with poor prognosis. While clinical surveillance of patients with CLD generates extensive longitudinal data, its unstructured free-text nature hinders large-scale research. To unlock this real-world evidence, we developed a scalable framework using open-source Large Language Models (LLMs) to transform unstructured clinical text into structured data. Me...
The Updated Sydney System (USS) provides a standardized framework for grading gastritis and stratifying gastric cancer risk. However, subjective observer variability and labor-intensive workflows impede its routine clinical use. To address these challenges, we developed SydneyMTL, a multi-task deep learning framework that uses Multiple Instance Learning (MIL) with task-specific attention pooling to predict severity grades across all five USS attributes simultaneously. Trained on an unprecedented...
Introduction: Kidney biopsy reports contain rich information that is clinically actionable and useful for research. However, the narrative format hinders scalable reuse. We here investigated whether open-source large language models (LLMs) can extract relevant, standardized readouts from native kidney biopsy pathology reports. Methods: German free-text native kidney biopsy reports were parsed with three open-source LLMs (Llama3 70B, Llama3 8B, MedGemma) to generate structured JSON outputs covering ...
Background: Distinguishing benign proliferative nodules (PNs) from melanoma arising within congenital melanocytic nevi remains a major diagnostic challenge. Copy number alteration (CNA) analysis is widely used to support classification, but current criteria were developed using array comparative genomic hybridization (aCGH). The performance of alternative platforms such as shallow whole-genome sequencing (sWGS) and methylation arrays in this setting is poorly defined. Objectives: The objective of t...
Pathology faces persistent challenges including a global shortage of specialists, uneven access to expertise, increasing diagnostic complexity, and a growing need for second-opinion consultations. While digital and telepathology platforms address parts of this problem, existing solutions often trade accessibility for structured, workflow-aware clinical integration. At the same time, multimodal medical AI shows promise for diagnostic support but raises concerns regarding transparency, automation ...
Background: Dengue virus (DENV) appears to manipulate several cellular metabolic pathways to permit its replication and immune evasion in the host. Here, we employed high-resolution mass spectrometry (HR-MS) to investigate the serum metabolomic landscape of clinical DENV infection. Methods: Serum specimens from primary dengue (n=11), secondary dengue (n=9) samples, and healthy controls (n=10) were used for untargeted and targeted metabolomic quantification on a Waters Xevo G2-XS QTof Mass Spectrome...
Genomic surveillance of influenza viruses informs vaccine strain selection and evolutionary forecasting. Sequencing efforts vary widely across U.S. states, which raises concerns about spatial sampling bias. We evaluated how well 10,958 influenza virus genomes sampled by our group in Michigan captured the genetic diversity in 34,743 genomes circulating nationally from the 2021/22 through 2024/25 seasons. We defined seasonal hemagglutinin haplotypes and tracked their detection across states. A sma...
T-cell lymphomas are often histologically indistinguishable from benign T-cell infiltrates. Clonality testing is frequently required for diagnosis. It lacks the spatial context and is slow and expensive, relying on complex, multiplexed PCR reactions, interpreted by experienced scientists or pathologists. We previously published details of a pair of highly specific monoclonal antibodies against the two alternatively used, but very similar, T-cell receptor β constant regions, TCRβ1 and T...
Background: Retrieval-augmented generation (RAG) frameworks such as RAPID [1] have demonstrated that staged planning and retrieval grounding improve long-form text generation. However, most implementations remain similarity-driven and open-domain, lacking the epistemic safeguards required for biomedical synthesis, where mechanistic completeness, temporal governance, traceability, and explicit gap classification are essential. Objective: To develop and evaluate a topology-aware, graph-augmented retr...
Background: Hospital-acquired bacterial pneumonia (HABP) and ventilator-associated bacterial pneumonia (VABP), particularly those caused by multi-drug resistant organisms (MDROs), often require newer antibiotic treatment. The efficacy and safety of newer antibiotics compared to generic antibiotics in randomized controlled trials (RCTs) have not been evaluated before. Methods: In this systematic review, we searched RCTs in the United States National Library of Medicine (PubMed), Cochrane Central Reg...
Background: Metformin is the cornerstone therapy for type 2 diabetes, but gastrointestinal intolerance commonly limits dose escalation and long-term adherence. In the ProGasMet trial, multi-strain probiotic supplementation improved metformin tolerability. However, the underlying microbiome-metabolome mechanisms remain unclear. Methods and analysis: We performed an exploratory multi-omics analysis using Period 1 of a randomized, double-blind, placebo-controlled trial. Participants with metformin int...
Motivation: Fanconi anemia (FA) is a rare disease mainly caused by biallelic pathogenic variants, including structural variants such as large deletions and insertions in FA genes. Currently, variant detection is based on short-read sequencing and probe-based approaches. However, determining the exact genomic breakpoint or achieving allelic discrimination remains challenging. Nanopore-based long-read sequencing enables a comprehensive detection of FA variants, but a unified bioinformatic analysis p...
BK polyomavirus (BKPyV) is a major complication in kidney transplant recipients (KTR), for whom no specific antiviral therapy is available. Modulation of immunosuppressive therapy results in virus clearance in most KTR with BKPyV DNAemia (controllers), but a significant minority fail to clear the virus (non-controllers). Here, we adapt LIBRA-seq, which links antibody sequence data to antigen specificity, to intact viral capsids of the four BKPyV genotypes to study and compare BKPyV-specific B-ce...
Achieving timely diagnosis for rare diseases remains challenging due to, among other factors, phenotypic heterogeneity and incomplete clinical data. While the Solve-RD project developed a phenotype-based gene prioritisation method, this approach did not account for the clinical consistency among related diseases in Orphanet's hierarchical classifications. We present a phenotype-based computational pipeline that ranks candidate ORPHAcodes based on patient phenotypes. The pipeline computes patient-diseas...
Background: Preoperative biliary stenting alters biliary colonization and may reduce the effectiveness of perioperative antibiotic prophylaxis in pancreatoduodenectomy. Although broader-spectrum regimens have been associated with improved infectious outcomes, their microbiological adequacy in routine clinical practice remains poorly defined. We therefore evaluated the real-world adequacy of a prolonged ampicillin-sulbactam protocol, its association with infectious outcomes and survival, and the po...
Background: The role of the gut microbiome and specific enteric bacteria in influencing the development of colorectal cancer (CRC) remains incompletely understood. Recently, it was shown that human CRC-derived strains of Clostridioides difficile were capable of inducing colonic tumorigenesis in a susceptible mouse model. We hypothesized that C. difficile contributes to the pathogenesis of human CRC and would be enriched in CRC tumors compared to paired normal tissues from the same individual. Met...
Rare diseases affect over 300 million people worldwide, yet patients often endure years-long diagnostic delays that limit timely intervention and trial opportunities. Computational rare disease recognition (RDR) remains constrained by knowledge resources that are often incomplete, heterogeneous, and dependent on extensive multi-disciplinary expert curation that cannot scale. Large language models (LLMs) applied directly for end-to-end diagnosis or disease discrimination face similar knowledge bo...
Background: The 2024 blood culture bottle shortage brought diagnostic resource allocation to the forefront, reflecting persistent, foundational challenges with low-value testing and empiric treatment approaches under clinical uncertainty. Objective: To determine whether a machine learning approach using electronic medical record data can predict bacteremia more effectively than existing systems and practices to guide diagnostic testing and empiric treatment strategies. Methods: In a retrospective co...
An H3N2 variant, named subclade K, continues to circulate widely during the 2025-2026 influenza season. This virus possesses a hemagglutinin (HA) protein that has eleven substitutions relative to the HA of the Northern Hemisphere 2025-2026 H3N2 vaccine strain. Many of these substitutions are in epitopes in well-characterized HA antigenic sites. Despite this, interim vaccine effectiveness studies indicate that the 2025-2026 influenza vaccine provides moderate protection against H3N2 subclade K in...
Rationale: Obstructive sleep apnea (OSA) is linked to cardiovascular, metabolic, and cognitive morbidity. Although COVID-19 has been associated with long-term respiratory and neurological sequelae, its role in precipitating new-onset OSA remains unclear. Objectives: To evaluate whether SARS-CoV-2 infection increases risk of developing OSA up to 4.5 years post-infection and how risk varies by hospitalization status, demographics, comorbidities, and vaccination status. Methods: This retrospective coho...