MicrowellMicrofluidicsMiner (M3): Leverage Large Language Model Agents for Knowledge Mining of Microwell Microfluidics

Nguyen, D.-N.; Shakil, S.; Tong, R. K. Y.; Dinh, N.-D.

2026-02-17 bioengineering
10.64898/2026.02.14.705953 bioRxiv
Microwell microfluidics has emerged as a powerful platform for high-precision biological and chemical investigations, bridging microscale fluid handling with compartmentalized reaction environments. Achieving robust and reproducible performance in such studies requires substantial effort to optimize microwell array design. This burden could be markedly alleviated by a curated database of microwell array parameters. Such a resource would enable the application of machine-learning models for performance prediction and automated design, leveraging knowledge accumulated from prior microfluidics research. However, constructing such a database demands considerable time and extensive manual curation, as microwell performance is governed by numerous critical design parameters that are reported inconsistently across a broad and largely unstructured body of literature. In this study, we introduce MicrowellMicrofluidicsMiner (M3), a framework that employs large language model (LLM) agents for autonomous knowledge extraction in microwell microfluidics. To evaluate its performance, we curate a ground-truth database and establish an LLM-driven assessment approach. Our results demonstrate that M3 achieves a peak accuracy of approximately 78%, representing more than a twofold improvement over the lowest observed accuracy (32%) obtained using a standalone LLM (LLAMA 3.1). This study provides a foundational reference for researchers seeking to apply LLM agents to data-driven microfluidics research. The insights presented have the potential to substantially improve how scientists across microfluidics-related disciplines access, interpret, and leverage scientific information, thereby accelerating the development of innovative microfluidic devices and associated discoveries.

Matching journals

The top 2 journals together account for just over 50% of the predicted probability mass (45.4% + 7.4% = 52.8%).

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Lab on a Chip | 88 | Top 0.1% | 45.4%
2 | Analytical Chemistry | 205 | Top 0.4% | 7.4%
(50% of probability mass above)
3 | Advanced Science | 249 | Top 4% | 4.7%
4 | PLOS ONE | 4510 | Top 47% | 2.3%
5 | Scientific Reports | 3102 | Top 51% | 2.1%
6 | ACS Omega | 90 | Top 1% | 1.9%
7 | Biosensors and Bioelectronics | 52 | Top 0.7% | 1.8%
8 | APL Bioengineering | 18 | Top 0.1% | 1.8%
9 | Biofabrication | 32 | Top 0.5% | 1.5%
10 | Advanced Materials Technologies | 27 | Top 0.4% | 1.3%
11 | The Analyst | 15 | Top 0.3% | 1.3%
12 | Nature Communications | 4913 | Top 57% | 1.2%
13 | Biotechnology and Bioengineering | 49 | Top 0.6% | 1.0%
14 | Computational and Structural Biotechnology Journal | 216 | Top 7% | 1.0%
15 | iScience | 1063 | Top 24% | 1.0%
16 | Cell Systems | 167 | Top 11% | 0.9%
17 | Science Advances | 1098 | Top 27% | 0.8%
18 | ACS Synthetic Biology | 256 | Top 3% | 0.7%
19 | ACS Biomaterials Science & Engineering | 37 | Top 1% | 0.7%
20 | Bioengineering & Translational Medicine | 21 | Top 1.0% | 0.7%
21 | Frontiers in Digital Health | 20 | Top 2% | 0.5%
22 | Communications Biology | 886 | Top 31% | 0.5%
23 | Small Methods | 26 | Top 2% | 0.5%
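
The "50% of probability mass" cutoff in the listing above can be checked with a short cumulative sum. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, the probabilities are copied from the first five listed journals, and nothing here is claimed to be how the site itself computes the cutoff.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the five top-ranked journals,
# copied from the listing; the remaining journals are omitted for brevity.
probabilities = [
    ("Lab on a Chip", 45.4),
    ("Analytical Chemistry", 7.4),
    ("Advanced Science", 4.7),
    ("PLOS ONE", 2.3),
    ("Scientific Reports", 2.1),
]


def journals_to_reach(ranked, threshold):
    """Return (count, cumulative) for the smallest prefix of `ranked`
    whose summed probability reaches `threshold`, or None if it never does."""
    running = accumulate(p for _, p in ranked)
    for count, total in enumerate(running, start=1):
        if total >= threshold:
            return count, total
    return None  # threshold not reached within the listed entries


result = journals_to_reach(probabilities, 50.0)
print(result[0], round(result[1], 1))
```

With the listed values, the first journal alone (45.4%) falls short of 50%, and adding the second (7.4%) crosses it, which matches the position of the cutoff marker in the listing.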