Systematic evaluation of 24 extraction and library preparation combinations for metagenomic sequencing of SARS-CoV-2 in saliva

Qian, K.; Abhyankar, V.; Keo, D.; Zarceno, P.; Toy, T.; Eskin, E.; Arboleda, V. A.

2026-04-20 genomics
10.64898/2026.04.16.719115 bioRxiv

Sequencing the respiratory tract transcriptome has the potential to provide insights into infectious pathogens and the host's immune response. While DNA-based sequencing is more standard in clinical laboratories due to its stability, RNA assays offer unique advantages. RNA reflects dynamic physiological changes, and for RNA viruses, viral RNA particles directly represent copies of the viral genome, enabling greater diagnostic sensitivity. However, RNA's susceptibility to degradation remains a significant challenge, particularly in RNase-rich specimens like saliva. To address this, we conducted a systematic, combinatorial evaluation of 24 distinct mNGS workflows, crossing eight nucleic acid extraction methods with three RNA-Seq library preparation protocols. Remnant saliva samples (n = 6) were pooled and spiked with MS2 phage as a control. The SARS-CoV-2 virus was spiked into half of the samples, which were extracted using the eight different extraction methods (n = 3) and compared using RNA Integrity Number equivalent (RINe) scores and RNA concentration. The extracted RNA was then processed across the three library construction methods and subjected to short-read sequencing to assess all 24 combinations head-to-head. We compared methods based on viral read recovery and found that RINe and concentration did not correlate with viral detection. The Zymo Quick-RNA Magbead kit and the Tecan Revelo RNA-Seq High-Sensitivity RNA library kit were the extraction and library-preparation kits that yielded the most SARS-CoV-2 reads, respectively. Importantly, our combinatorial analysis revealed that any small variability attributable to different nucleic acid extraction methods was heavily overshadowed by differences in quality attributable to the RNA-Seq library preparation methods. These findings challenge the reliance on conventional RNA quality metrics for clinical metagenomics and underscore the need to redefine extraction quality standards for mNGS applications.

IMPORTANCE

mNGS is a powerful and unbiased approach towards pathogen detection that has mostly been applied to blood and cerebrospinal fluid samples. However, mNGS has recently been applied to more areas, including respiratory pathogen detection, with potential applications in both in-patient diagnostics and public health surveillance. Saliva samples are an ideal sample type for these use cases since they can be collected non-invasively. However, saliva is also a challenging sample type due to its high RNase activity and often yields low-quality nucleic acid. This study explores the feasibility of using saliva specimens in mNGS with contrived SARS-CoV-2 samples to optimize the combination of two factors: nucleic acid extraction and RNA-Seq library preparation. Exploration in this area could enhance the sensitivity of saliva-based mNGS assays, with the goal of future expansion of this specimen type in clinical diagnostics and public health surveillance.

Key Points

- The choice of RNA-Seq library preparation kit has a greater impact on pathogen detection than the nucleic acid extraction method.
- The combination of Zymo Quick-RNA Magbead extraction kit and TECAN Revelo RNA-Seq High Sensitivity RNA library kit recovered the highest percentage of total SARS-CoV-2 reads.
- RNA quantity and RINe score do not correlate with viral read capture, indicating a need for an alternative metric to assess RNA quality for downstream mNGS clinical diagnostics.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Microbiology Spectrum; 435 papers in training set; Top 0.1%; 14.5%
2. PLOS ONE; 4510 papers in training set; Top 22%; 8.5%
3. BMC Genomics; 328 papers in training set; Top 0.2%; 6.9%
4. Scientific Reports; 3102 papers in training set; Top 18%; 6.4%
5. mSystems; 361 papers in training set; Top 2%; 4.9%
6. Microbial Genomics; 204 papers in training set; Top 0.5%; 4.4%
7. Clinical Chemistry; 22 papers in training set; Top 0.1%; 3.6%
8. Frontiers in Cellular and Infection Microbiology; 98 papers in training set; Top 2%; 1.9%

50% of probability mass above

9. Journal of Clinical Microbiology; 120 papers in training set; Top 0.9%; 1.8%
10. PeerJ; 261 papers in training set; Top 6%; 1.8%
11. Viruses; 318 papers in training set; Top 3%; 1.7%
12. Frontiers in Microbiology; 375 papers in training set; Top 5%; 1.7%
13. Diagnostic Microbiology and Infectious Disease; 21 papers in training set; Top 0.1%; 1.7%
14. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 5%; 1.5%
15. Emerging Infectious Diseases; 103 papers in training set; Top 2%; 1.2%
16. BioTechniques; 24 papers in training set; Top 0.2%; 1.2%
17. Analytical and Bioanalytical Chemistry; 17 papers in training set; Top 0.3%; 1.0%
18. Journal of Clinical Virology; 62 papers in training set; Top 0.6%; 1.0%
19. Cell Reports Methods; 141 papers in training set; Top 4%; 0.9%
20. International Journal of Infectious Diseases; 126 papers in training set; Top 3%; 0.9%
21. Molecular Ecology; 304 papers in training set; Top 4%; 0.9%
22. mSphere; 281 papers in training set; Top 5%; 0.9%
23. iScience; 1063 papers in training set; Top 26%; 0.9%
24. Environmental Science & Technology; 64 papers in training set; Top 2%; 0.9%
25. GigaScience; 172 papers in training set; Top 3%; 0.8%
26. Applied and Environmental Microbiology; 301 papers in training set; Top 3%; 0.8%
27. Cancer Research Communications; 46 papers in training set; Top 1%; 0.8%
28. Heliyon; 146 papers in training set; Top 6%; 0.8%
29. Frontiers in Public Health; 140 papers in training set; Top 9%; 0.7%
30. Pathogens; 53 papers in training set; Top 2%; 0.7%
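
The "50% of probability mass" cutoff in the list above can be checked with a running sum. The sketch below is illustrative only: it assumes the final percentage in each entry is that journal's predicted probability, it uses the rounded values as displayed, and the function name `ranks_to_reach` is hypothetical rather than part of any tool behind this page.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-30, copied from the list above.
# Assumption: the last percentage in each entry is the predicted probability.
probs = [
    14.5, 8.5, 6.9, 6.4, 4.9, 4.4, 3.6, 1.9, 1.8, 1.8,
    1.7, 1.7, 1.7, 1.5, 1.2, 1.2, 1.0, 1.0, 0.9, 0.9,
    0.9, 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7,
]


def ranks_to_reach(probabilities, threshold):
    """Return the smallest rank whose cumulative probability meets the threshold.

    Returns None if the listed probabilities never reach the threshold.
    """
    for rank, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return rank
    return None


cutoff = ranks_to_reach(probs, 50.0)
print(f"Ranks needed to reach 50% of probability mass: {cutoff}")
```

Because the displayed values are rounded, the running sum is approximate, but it is enough to confirm where the cutoff line falls among the ranked journals.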