
OPENPichia: building a free-to-operate Komagataella phaffii protein expression toolkit

Van Herpe, D.; Vanluchene, R.; Vandewalle, K.; Vanmarcke, S.; Wyseure, E.; Van Moer, B.; Eeckhaut, H.; Fijalkowska, D.; Grootaert, H.; Lonigro, C.; Meuris, L.; Michielsen, G.; Naessens, J.; Roels, C.; van Schie, L.; De Rycke, R.; De Bruyne, M.; Borghgraef, P.; Claes, K.; Callewaert, N.

2022-12-13 molecular biology
10.1101/2022.12.13.519130 bioRxiv

In the standard toolkit for recombinant protein expression, the yeast known in biotechnology as Pichia pastoris (formally: Komagataella phaffii) occupies the position between E. coli and HEK293 or CHO mammalian cells, and is used by thousands of laboratories in both academia and industry. The organism is eukaryotic yet microbial, and grows to extremely high cell densities while secreting proteins into its fully defined growth medium, using well-established strong inducible or constitutive promoters. Many products made in Pichia are in the clinic and in industrial markets. Pichia is also a favoured host for the rapidly emerging area of precision fermentation for the manufacturing of food proteins. However, the earliest steps in the development of the industrial strain (NRRL Y-11430/CBS 7435) that is used throughout the world were performed prior to 1985 in industry (Phillips Petroleum Company) and are not in the public domain. Moreover, although the associated patents expired long ago, the patent deposit NRRL Y-11430/CBS 7435, which is the parent of all commonly used industrial strains, is not, or is no longer, made freely available through the respective culture collections. This situation is far from ideal for a major chassis for synthetic biology, as it raises concern that novel applications of the system are still encumbered by licensing requirements on the most basic strains. In the spirit of open science and freedom to operate for a key component of biotechnology, we set out to resolve this by using genome sequencing of type strains, reverse engineering where necessary, and comparative protein expression and strain characterisation studies. We find that the industrial strains derive from the K. phaffii type strain lineage deposited as 54-11.239 in the UC Davis Phaff Yeast Strain collection by Herman Phaff in 1954. This type strain has valid equivalent deposits that are replicated or derived from it in other yeast strain collections, including NRRL YB-4290 (deposit also made by Herman Phaff) and NRRL Y-7556 in ARS-NRRL, as well as CBS 2612 and NCYC 2543. We furthermore discovered that NRRL Y-11430 and its derivatives carry an ORF-truncating mutation in the HOC1 cell wall synthesis gene, and that reverse engineering of a similar mutation in the NCYC 2543 type strain imparts the high transformability that is characteristic of the industrial strains. Uniquely, the NCYC 2543 type strain, which we propose to call OPENPichia henceforth, is freely available from the NCYC culture collection, including resale and commercial production licenses at nominal annual licensing fees. Furthermore, our not-for-profit research institute VIB has also acquired a resale/distribution license from NCYC, which we presently use to openly provide end-users with our genome-sequenced OPENPichia subclone strain and its derivatives, i.e., currently the highly transformable hoc1tr and the his4 auxotrophic mutants. To complement the OPENPichia platform, a fully synthetic modular gene expression vector building toolkit was developed, which is also openly distributed for any purpose. We invite other researchers to contribute to our open science resource-building effort to establish a new unencumbered standard chassis for Pichia synthetic biology.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Yeast: 15 papers in training set, Top 0.1%, 12.5%
2. ACS Synthetic Biology: 256 papers in training set, Top 0.4%, 10.5%
3. PLOS ONE: 4510 papers in training set, Top 22%, 8.4%
4. Metabolic Engineering Communications: 20 papers in training set, Top 0.1%, 6.8%
5. Genetics: 225 papers in training set, Top 1%, 3.6%
6. Current Biology: 596 papers in training set, Top 5%, 3.6%
7. G3 Genes|Genomes|Genetics: 351 papers in training set, Top 0.7%, 3.1%
8. eLife: 5422 papers in training set, Top 29%, 3.1%

50% of probability mass above

9. Frontiers in Bioengineering and Biotechnology: 88 papers in training set, Top 0.7%, 2.9%
10. Protein Science: 221 papers in training set, Top 0.5%, 2.9%
11. Open Biology: 95 papers in training set, Top 0.3%, 2.6%
12. Journal of Molecular Biology: 217 papers in training set, Top 1%, 1.9%
13. Nucleic Acids Research: 1128 papers in training set, Top 11%, 1.7%
14. Nature Communications: 4913 papers in training set, Top 52%, 1.7%
15. The Plant Journal: 197 papers in training set, Top 2%, 1.5%
16. Genome Biology and Evolution: 280 papers in training set, Top 1%, 1.3%
17. Scientific Reports: 3102 papers in training set, Top 66%, 1.2%
18. Cell Genomics: 162 papers in training set, Top 5%, 1.1%
19. PLOS Genetics: 756 papers in training set, Top 12%, 1.0%
20. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 41%, 0.9%
21. PLOS Biology: 408 papers in training set, Top 16%, 0.9%
22. FEBS Open Bio: 29 papers in training set, Top 0.5%, 0.8%
23. mBio: 750 papers in training set, Top 11%, 0.8%
24. Molecular Biology of the Cell: 272 papers in training set, Top 2%, 0.8%
25. G3: Genes, Genomes, Genetics: 222 papers in training set, Top 1.0%, 0.7%
26. Life Science Alliance: 263 papers in training set, Top 2%, 0.6%