
Easy-to-use whole-genome sequencing workflows and standardized practices to uncover hidden genetic variation in Synechocystis PCC 6803 wild-type and knock-out strains

Theune, M.; Fritsche, R.; Kueppers, N.; Boehm, M.; Kolkhof, P.; Paul, F.; Popa, O.; Oldenburg, E.; Wiegard, A.; Axmann, I. M.; Gutekunst, K.

2026-04-08 microbiology
10.64898/2026.04.08.717167 bioRxiv

Knock-out mutants are often used to study gene function by disrupting a specific gene and comparing the mutant to a wild-type strain. Reliable interpretation, however, requires that the two strains differ only by the intended mutation and that the observed phenotype is caused specifically by the deleted gene. In the highly polyploid cyanobacterium Synechocystis sp. PCC 6803, this is particularly challenging because incomplete segregation can mask genetic heterogeneity or secondary suppressor mutations. Genetic variation among laboratory wild-type lines can further confound phenotypic analyses. We show that these challenges can be addressed by routine strain validation via whole-genome sequencing (WGS). To this end, we developed and tested user-friendly workflows for short-read (Illumina), long-read (Oxford Nanopore Technologies; ONT), and hybrid data, providing standardized quality control, variant calling, and structural variant detection. We benchmarked their performance in detecting single-nucleotide polymorphisms (SNPs), small indels, and structural variants using simulated datasets across different coverages and mixed populations. Applying the workflows to three Synechocystis sp. PCC 6803 wild-type lines revealed multiple sequence and structural differences relative to the reference genome, including previously undescribed genetic variants, underscoring the importance of documenting the strain background and the value of long-read sequencing. Characterization of two independent 6-phosphogluconate dehydrogenase (gnd) knock-out mutants and their complemented strains highlighted how a failed rescue can reveal a phenotype unrelated to the intended knock-out. An automated literature analysis revealed that only a minority of the investigated Synechocystis studies that used knock-out mutants included complementation as a control (39%), whereas this practice is more common in studies involving Escherichia coli (63%) and Saccharomyces cerevisiae (55%).
Based on these results, we propose a practical guide for standardizing knock-out phenotyping in Synechocystis PCC 6803. Combined with accessible workflows for routine whole-genome validation, this framework aims to support more robust and reproducible knock-out studies in the future.

Matching journals

The top 16 journals account for 50% of the predicted probability mass.

1. Scientific Reports; 3102 papers in training set; Top 16%; 6.5%
2. Nucleic Acids Research; 1128 papers in training set; Top 4%; 5.0%
3. Nature Communications; 4913 papers in training set; Top 34%; 4.4%
4. Algal Research; 20 papers in training set; Top 0.1%; 4.1%
5. ACS Synthetic Biology; 256 papers in training set; Top 0.9%; 3.8%
6. PLOS ONE; 4510 papers in training set; Top 38%; 3.7%
7. GigaScience; 172 papers in training set; Top 0.7%; 2.8%
8. Frontiers in Plant Science; 240 papers in training set; Top 3%; 2.7%
9. Plant Biotechnology Journal; 56 papers in training set; Top 0.4%; 2.7%
10. mSystems; 361 papers in training set; Top 4%; 2.7%
11. Frontiers in Microbiology; 375 papers in training set; Top 4%; 2.4%
12. Photosynthesis Research; 15 papers in training set; Top 0.1%; 2.1%
13. Genome Biology; 555 papers in training set; Top 3%; 2.1%
14. NAR Genomics and Bioinformatics; 214 papers in training set; Top 1%; 2.1%
15. Communications Biology; 886 papers in training set; Top 6%; 1.9%
16. Metabolic Engineering; 68 papers in training set; Top 0.3%; 1.9%

50% of probability mass above

17. BMC Genomics; 328 papers in training set; Top 2%; 1.8%
18. New Phytologist; 309 papers in training set; Top 3%; 1.7%
19. eLife; 5422 papers in training set; Top 41%; 1.7%
20. Scientific Data; 174 papers in training set; Top 1%; 1.7%
21. Microbiology Resource Announcements; 22 papers in training set; Top 0.4%; 1.5%
22. The Plant Journal; 197 papers in training set; Top 3%; 1.3%
23. Microbial Genomics; 204 papers in training set; Top 1%; 1.3%
24. Life Science Alliance; 263 papers in training set; Top 0.6%; 1.3%
25. The Plant Cell; 141 papers in training set; Top 2%; 1.1%
26. PLOS Computational Biology; 1633 papers in training set; Top 21%; 1.0%
27. ISME Communications; 103 papers in training set; Top 2%; 1.0%
28. Plant Direct; 81 papers in training set; Top 2%; 1.0%
29. Microbiology Spectrum; 435 papers in training set; Top 4%; 0.9%
30. Plant Physiology; 217 papers in training set; Top 3%; 0.9%
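
The 50% cutoff stated above can be checked against the listed values with a short calculation. The sketch below assumes that the final percentage in each entry is that journal's predicted probability; the function name `ranks_to_reach` is hypothetical and not part of the page. Because the listed values are rounded to one decimal place, the result is only approximate.

```python
# Predicted probabilities (percent) for ranks 1-30, copied from the list above.
# Assumption: the last percentage of each entry is the predicted probability.
probs = [6.5, 5.0, 4.4, 4.1, 3.8, 3.7, 2.8, 2.7, 2.7, 2.7,
         2.4, 2.1, 2.1, 2.1, 1.9, 1.9, 1.8, 1.7, 1.7, 1.7,
         1.5, 1.3, 1.3, 1.3, 1.1, 1.0, 1.0, 1.0, 0.9, 0.9]


def ranks_to_reach(probabilities, threshold):
    """Return (rank, cumulative sum) at the first rank whose running total
    reaches the threshold, or (None, total) if it is never reached."""
    total = 0.0
    for rank, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None, total


rank, cumulative = ranks_to_reach(probs, 50.0)
print(rank, round(cumulative, 1))
```

With the rounded values, the running total is about 49.0% after rank 15 and about 50.9% after rank 16, which agrees with the stated cutoff at 16 journals.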