Homology-based perspective on pangenome graphs
Lisiecka, A.; Kowalewska, A.; Dojer, N.
Pangenome graphs conveniently represent genetic variation within a population. Several types of such graphs have been proposed, with varying properties and potential applications. Among them, variation graphs (VGs) seem best suited to replace reference genomes in sequencing data processing, while whole genome alignments (WGAs) are particularly practical for comparative genomics applications. For both models, no widely accepted optimization criteria for a graph representing a given set of genomes have been proposed. In this paper we introduce the concept of a homology relation induced by a pangenome graph on the characters of the represented genomic sequences, and we define such relations for both the VG and WGA models. We then use this concept to propose homology-based metrics for comparing different graphs that represent the same genome collection, and to formulate the desired properties of transformations between the VG and WGA models. Moreover, we propose several such transformations and examine their properties on pangenome graph data. Finally, we provide implementations of these transformations in the package WGAtools, available at https://github.com/anialisiecka/WGAtools.
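To make the idea of a homology relation more concrete, the following minimal Python sketch treats two genome characters as homologous when their paths pass through the same node at the same offset, and compares two graphs by the Jaccard index of their homologous pairs. Everything here is an illustrative assumption: the toy graph encoding, the function names `homology_classes`, `homologous_pairs` and `jaccard`, and the choice of the Jaccard index are hypothetical. They are not the definitions or metrics from the paper, and they are not part of the WGAtools interface.

```python
from itertools import combinations


def homology_classes(node_lengths, paths):
    """Group genome characters by the graph position they pass through.

    node_lengths: dict mapping node id -> length of the node label
    paths: dict mapping genome id -> list of node ids (forward strand only,
           a simplifying assumption of this sketch)
    Returns a dict mapping (node id, offset) -> set of (genome id, position).
    """
    classes = {}
    for genome, path in paths.items():
        pos = 0  # position of the current character within the genome
        for node in path:
            for offset in range(node_lengths[node]):
                classes.setdefault((node, offset), set()).add((genome, pos))
                pos += 1
    return classes


def homologous_pairs(classes):
    """Return all unordered pairs of characters sharing a homology class."""
    pairs = set()
    for members in classes.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs


def jaccard(pairs_a, pairs_b):
    """Jaccard index of two pair sets (1.0 when both are empty)."""
    union = pairs_a | pairs_b
    if not union:
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


# Toy data (hypothetical): genomes g1 = ACGT and g2 = ACT.
# Graph A shares the final T between both genomes.
lengths_a = {1: 2, 2: 1, 3: 1}
paths_a = {"g1": [1, 2, 3], "g2": [1, 3]}

# Graph B keeps a separate T node for g2, so the two T's are not homologous.
lengths_b = {1: 2, 2: 1, 3: 1, 4: 1}
paths_b = {"g1": [1, 2, 3], "g2": [1, 4]}

pairs_a = homologous_pairs(homology_classes(lengths_a, paths_a))
pairs_b = homologous_pairs(homology_classes(lengths_b, paths_b))
similarity = jaccard(pairs_a, pairs_b)
```

Both toy graphs spell the same two genomes, yet they induce different homology relations; a pair-based score such as `similarity` is one simple way such a difference could be quantified.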
Matching journals
The top 2 journals account for 50% of the predicted probability mass.