An improved reference of the grapevine genome supports reasserting the origin of the PN40024 highly-homozygous genotype

Velt, A.; Frommer, B.; Blanc, S.; Holtgräwe, D.; Duchene, E.; Dumas, V.; Grimplet, J.; Hugueney, P.; Lahaye, M.; Kim, C.; Matus, J. T.; Navarro-Paya, D.; Orduna, L.; Tello-Ruiz, M. K.; Vitulo, N.; Ware, D.; Rustenholz, C.

2022-12-22 genomics
10.1101/2022.12.21.521434 bioRxiv
The genome sequence assembly of the diploid and highly homozygous V. vinifera genotype PN40024 serves as the reference for many grapevine studies. Despite several improvements of the PN40024 genome assembly, its current version PN12X.v2 is quite fragmented and only represents the haploid state of the genome with mixed haplotypes. In fact, although the PN40024 genome is nearly homozygous, it still contains various heterozygous regions. Taking advantage of the improvements that long-read sequencing technologies offer to fully discriminate haplotype sequences, and considering that several Vitis sp. genomes have recently been assembled with these approaches, we generated an improved version of the reference, called PN40024.v4. Incorporating long genomic sequencing reads into the assembly greatly increased the continuity of the 12X.v2 scaffolds. The number of scaffolds decreased from 2,059 to 640 and the number of N bases was reduced by 88%. Additionally, the full alternative haplotype sequence was built for the first time, the chromosome anchoring was improved and the number of unplaced scaffolds was reduced by half. To obtain a high-quality gene annotation that outperforms previous versions, a liftover approach was complemented with an optimized annotation workflow for Vitis. Integration of the gene reference catalogue and its manual curation also helped improve the annotation and yielded the most reliable estimate to date, 35,230 genes. Finally, we demonstrate that PN40024 resulted from selfings of cv. Helfensteiner (a cross of cv. Pinot noir and Schiava grossa) rather than from a single Pinot noir. These advances will help maintain the PN40024 genome as a gold-standard reference and contribute to the eventual elaboration of the grapevine pangenome.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. DNA Research | 23 papers in training set | Top 0.1% | 14.7%
2. Frontiers in Plant Science | 240 papers in training set | Top 0.5% | 14.3%
3. Scientific Data | 174 papers in training set | Top 0.1% | 10.1%
4. Horticulture Research | 43 papers in training set | Top 0.3% | 6.4%
5. BMC Genomics | 328 papers in training set | Top 0.4% | 4.8%
50% of probability mass above
6. Scientific Reports | 3102 papers in training set | Top 31% | 4.0%
7. PLOS ONE | 4510 papers in training set | Top 40% | 3.6%
8. Gigabyte | 60 papers in training set | Top 0.3% | 3.2%
9. The Plant Journal | 197 papers in training set | Top 2% | 2.7%
10. Peer Community Journal | 254 papers in training set | Top 1% | 2.1%
11. Frontiers in Genetics | 197 papers in training set | Top 4% | 1.9%
12. International Journal of Molecular Sciences | 453 papers in training set | Top 6% | 1.9%
13. GigaScience | 172 papers in training set | Top 1% | 1.7%
14. International Journal of Biological Macromolecules | 65 papers in training set | Top 2% | 1.7%
15. Genomics | 60 papers in training set | Top 1% | 1.7%
16. The Plant Genome | 53 papers in training set | Top 0.4% | 1.5%
17. Plant Biotechnology Journal | 56 papers in training set | Top 0.7% | 1.5%
18. Genes | 126 papers in training set | Top 2% | 1.2%
19. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 6% | 1.2%
20. Database | 51 papers in training set | Top 0.7% | 0.9%
21. BMC Plant Biology | 47 papers in training set | Top 0.7% | 0.9%
22. Plant Methods | 39 papers in training set | Top 0.7% | 0.8%
23. Frontiers in Microbiology | 375 papers in training set | Top 9% | 0.7%
24. Agronomy | 18 papers in training set | Top 0.8% | 0.7%