AI-Driven Science Communication: Leveraging LLMs and Knowledge Graphs for Seamless Knowledge Exchange

Schor, J.; Scheibe, P.

2025-07-07 pharmacology and toxicology
10.1101/2025.07.04.663152 bioRxiv

Purpose: Scientific knowledge is increasingly captured in structured formats, such as knowledge graphs, yet it remains largely inaccessible to non-technical users. We present EcoToxFred, a prototype conversational AI agent that enables intuitive, natural language access to curated environmental toxicology data. Designed to support users without programming expertise, EcoToxFred facilitates the exploration of complex datasets, such as chemical exposures and species-specific hazard information in European surface waters.

Methods: EcoToxFred integrates a large language model (LLM) with a Neo4j graph database via a retrieval-augmented generation (RAG) architecture. The system employs a decision-making agent to interpret user queries, invoke appropriate tools, and translate natural language input into formal graph queries. Outputs are validated and returned in multiple formats, such as text, tables, and interactive maps, and are grounded in structured, curated monitoring and hazard data.

Results: The agent bridges the gap between human intent and formal data retrieval, enabling researchers, policy advisors, and stakeholders to pose complex, multi-step queries without prior training in query languages. By grounding LLM outputs in structured data, we demonstrate the system's ability to respond to diverse question types and deliver transparent, accurate, and context-aware results. EcoToxFred successfully answers both broad and highly specific queries.

Conclusion: EcoToxFred represents a scalable and transferable framework for human-AI interaction in domain-specific contexts, combining natural language interfaces with structured data. By lowering access barriers to scientific knowledge, the system supports evidence-based decision-making and fosters responsible, human-centered AI use in environmental science and beyond.
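The pipeline described in the Methods (interpret the question, pick a tool, translate to a graph query, validate the output) can be illustrated with a minimal sketch. This is not the authors' implementation: the graph schema (`Substance`, `Site`, `MEASURED_AT`), the function names, and the read-only check are all hypothetical, and a regular expression stands in for the LLM-based decision step so that the example runs without any external service.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical schema, invented for this sketch only:
# (:Substance {name})-[:MEASURED_AT {concentration}]->(:Site {name, country})


@dataclass
class GraphQuery:
    cypher: str
    params: dict


def concentrations_tool(substance: str, country: str) -> GraphQuery:
    """Tool: build a parameterised Cypher query for one kind of question."""
    cypher = (
        "MATCH (s:Substance {name: $substance})-[m:MEASURED_AT]->"
        "(site:Site {country: $country}) "
        "RETURN site.name AS site, m.concentration AS concentration "
        "ORDER BY concentration DESC LIMIT 25"
    )
    return GraphQuery(cypher, {"substance": substance, "country": country})


# Validation step (an assumption of this sketch): only read-only queries pass.
WRITE_KEYWORDS = re.compile(r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE)


def validate(query: GraphQuery) -> GraphQuery:
    if WRITE_KEYWORDS.search(query.cypher):
        raise ValueError("refusing to run a query that could modify the graph")
    return query


# Stand-in for the agent's decision step; a real system would ask an LLM
# which tool to call and with which arguments.
QUESTION_PATTERN = re.compile(
    r"where (?:was|is) (?P<substance>[\w-]+) (?:measured|detected|found) "
    r"in (?P<country>[\w ]+?)\??$",
    re.IGNORECASE,
)


def route(question: str) -> Optional[GraphQuery]:
    """Return a validated graph query, or None if no tool fits the question."""
    match = QUESTION_PATTERN.search(question.strip())
    if match is None:
        return None
    return validate(concentrations_tool(match["substance"], match["country"]))


if __name__ == "__main__":
    query = route("Where was diuron detected in Germany?")
    print(query.cypher)
    print(query.params)
```

Passing user-supplied values as query parameters rather than interpolating them into the Cypher string keeps the generated query easy to validate. With the official Neo4j Python driver, the result could then be executed with something like `session.run(query.cypher, query.params)`.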

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. PLOS ONE (4510 papers in training set, Top 5%): 23.3%
2. Environmental Health Perspectives (17 papers in training set, Top 0.1%): 8.7%
3. Bioinformatics (1061 papers in training set, Top 3%): 8.7%
4. Nature Communications (4913 papers in training set, Top 31%): 5.0%
5. Scientific Reports (3102 papers in training set, Top 26%): 4.5%
50% of probability mass above
6. Computational and Structural Biotechnology Journal (216 papers in training set, Top 1%): 3.7%
7. Environmental Science & Technology (64 papers in training set, Top 0.8%): 3.7%
8. Data in Brief (13 papers in training set, Top 0.1%): 3.7%
9. PLOS Computational Biology (1633 papers in training set, Top 11%): 3.2%
10. Nucleic Acids Research (1128 papers in training set, Top 7%): 2.7%
11. Environment International (42 papers in training set, Top 0.6%): 1.8%
12. Bioinformatics Advances (184 papers in training set, Top 3%): 1.8%
13. GigaScience (172 papers in training set, Top 1%): 1.7%
14. Methods in Ecology and Evolution (160 papers in training set, Top 1%): 1.5%
15. Patterns (70 papers in training set, Top 2%): 1.0%
16. BMC Bioinformatics (383 papers in training set, Top 6%): 0.9%
17. Proceedings of the National Academy of Sciences (2130 papers in training set, Top 42%): 0.8%
18. ACS Synthetic Biology (256 papers in training set, Top 3%): 0.8%
19. Nature Human Behaviour (85 papers in training set, Top 4%): 0.8%
20. PLOS Biology (408 papers in training set, Top 19%): 0.8%
21. npj Digital Medicine (97 papers in training set, Top 3%): 0.8%
22. Scientific Data (174 papers in training set, Top 2%): 0.8%
23. GeoHealth (10 papers in training set, Top 0.7%): 0.7%
24. Limnology and Oceanography: Methods (11 papers in training set, Top 0.4%): 0.7%
25. International Journal of Medical Informatics (25 papers in training set, Top 2%): 0.7%
26. Frontiers in Artificial Intelligence (18 papers in training set, Top 0.9%): 0.7%
27. Viruses (318 papers in training set, Top 6%): 0.7%
28. Metabolic Engineering (68 papers in training set, Top 0.8%): 0.7%
29. iScience (1063 papers in training set, Top 36%): 0.7%
30. Archives of Toxicology (14 papers in training set, Top 0.4%): 0.7%
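
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below does this for the ten highest-ranked journals; the function name `ranks_to_cover` is illustrative only, and the page does not say how the ranking service itself computes the cutoff.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, copied from the list above.
probabilities = [23.3, 8.7, 8.7, 5.0, 4.5, 3.7, 3.7, 3.7, 3.2, 2.7]


def ranks_to_cover(probs, threshold):
    """Return (rank, cumulative mass) at the first rank reaching the threshold."""
    for rank, cumulative in enumerate(accumulate(probs), start=1):
        if cumulative >= threshold:
            return rank, cumulative
    return None  # threshold never reached with the values supplied


rank, mass = ranks_to_cover(probabilities, 50.0)
print(rank, round(mass, 1))  # prints: 5 50.2
```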