Graph-Augmented Retrieval for Digital Evidence-Based Medical Synthesis: A Proof-of-Concept Study on Topology-Aware Mechanistic Narrative Generation
Buscemi, P.; Buscemi, F.
Background: Retrieval-augmented generation (RAG) frameworks such as RAPID [1] have demonstrated that staged planning and retrieval grounding improve long-form text generation. However, most implementations remain similarity-driven and open-domain, lacking the epistemic safeguards required for biomedical synthesis, where mechanistic completeness, temporal governance, traceability, and explicit gap classification are essential.
Objective: To develop and evaluate a topology-aware, graph-augmented retrieval framework for structured biomedical narrative synthesis, and to position it as a domain-constrained evolution of staged RAG aligned with structural principles of digital evidence-based medicine (dEBM).
Methods: We implemented a two-layer architecture operating on a closed, version-controlled corpus of 11,861 peer-reviewed text chunks on iron deficiency. A metadata-constrained vector retriever (RAG01) was extended with a Graph-RAG overlay (RAG02) constructed from chunk-level entity extraction and weighted co-occurrence networks (30 nodes; 118 directed edges). Topic planning was organized through predefined mechanistic axes functioning as structured hypothesis probes. Retrieval was performed under identical deterministic constraints (top-k = 5; cosine threshold = 0.50; publication year ≥ 2023), and graph diagnostics (local connectivity, induced subgraph density, modular overlap, and multi-hop stability) were used to distinguish retrieval insufficiency from corpus-level evidentiary scarcity.
Results: In a case study of obesity-associated iron deficiency, the entity network exhibited a centralized regulatory topology with hepcidin as a high-connectivity hub. Axis-based retrieval combined with graph auditing consistently reinforced an inflammation-mediated hepcidin pathway linking obesity to iron deficiency, while alternative mechanisms lacked stable multi-hop embedding.
Compared with vector-only retrieval, graph augmentation preserved semantic alignment and increased mean cosine similarity from 0.673 to 0.694 while reducing similarity dispersion (SD 0.056 to 0.035) under identical constraints. The graph activity ratio was 1.00 in the temporally filtered corpus.
Conclusions: By integrating mechanistic axis decomposition, topology-aware auditing, causal scaffolding, and expert-driven iterative refinement, the proposed framework implements selected structural constraints inspired by evidence-based medicine within a controlled digital synthesis environment. The approach advances retrieval-augmented generation beyond similarity-based summarization toward a reproducible model of topology-aware biomedical evidence interrogation, with implications for AI-assisted systematic reviews.
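As an illustration of the retrieval constraints and one of the graph diagnostics named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes chunks carry a publication year and a precomputed embedding, applies the stated constraints (top-k = 5, cosine threshold = 0.50, publication year ≥ 2023), and computes induced subgraph density for a directed graph as the number of edges inside the node set divided by n(n − 1). The `Chunk` class, the function names, the toy embeddings, and the toy edges (including the `ferritin` node) are hypothetical and exist only for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk record: an ID, a publication year, an embedding."""
    chunk_id: str
    year: int
    embedding: tuple[float, ...]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=5, threshold=0.50, min_year=2023):
    """Metadata-constrained retrieval: year filter, similarity floor, top-k."""
    scored = [
        (cosine(query, c.embedding), c.chunk_id)
        for c in chunks
        if c.year >= min_year  # temporal constraint
    ]
    kept = [(s, cid) for s, cid in scored if s >= threshold]
    # Sort by descending score, then by ID, so ties resolve deterministically.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(cid, s) for s, cid in kept[:top_k]]


def induced_density(nodes, edges):
    """Density of the directed subgraph induced by `nodes` (no self-loops)."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = {(u, v) for u, v in edges if u in nodes and v in nodes and u != v}
    return len(inside) / (n * (n - 1))


# Toy data, invented for illustration only.
chunks = [
    Chunk("a", 2024, (1.0, 0.0)),
    Chunk("b", 2022, (1.0, 0.0)),  # excluded: published before 2023
    Chunk("c", 2023, (0.0, 1.0)),  # excluded: cosine 0.0 is below 0.50
    Chunk("d", 2025, (1.0, 1.0)),
]
hits = retrieve((1.0, 0.0), chunks)

edges = [
    ("obesity", "inflammation"),
    ("inflammation", "hepcidin"),
    ("hepcidin", "ferritin"),  # leaves the node set below, so not counted
]
density = induced_density({"obesity", "inflammation", "hepcidin"}, edges)
```

In this toy run, only chunks `a` and `d` survive both filters, and two of the six possible directed edges fall inside the three-node set. A low induced density around retrieved entities is the kind of signal that could help separate a retrieval miss from genuine scarcity in the corpus, although the abstract does not specify how the authors operationalize that distinction.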
Matching journals
The top 9 journals account for 50% of the predicted probability mass.