POTTR: Identifying Recurrent Trajectories in Evolutionary and Developmental Processes using Posets
Käufler, S. C.; Schmidt, H.; Jürgens, M.; Klau, G. W.; Sashittal, P.; Raphael, B.
Multiple biological processes, including cancer evolution and organismal development, are described as a sequence of events with a temporal ordering. While cancer evolves independently in each patient, DNA sequencing studies have shown that in some cancers different patients share specific orders of mutations and these correlate with distinct morphology, drug response, and treatment outcomes. Several methods have been developed to identify such recurrent trajectories of genetic events from phylogenetic trees, but this is complicated by high intra- and inter-tumor heterogeneity as well as uncertainty in the inferred tumor phylogenies including the ambiguous orders between some mutations. We formalize the problem of finding recurrent mutation trajectories using a novel framework of incomplete partially ordered sets (posets), which generalize representations used in previous works and explicitly account for the uncertainty in tumor phylogenies. We define the problem of identifying the largest recurrent trajectories shared in at least k input phylogenies as the maximum k-common induced incomplete subposet (MkCIIS) problem, which we show is NP-hard. We present a combinatorial algorithm, POsets for Temporal Trajectory Resolution (POTTR), to solve the MkCIIS problem using a conflict graph that models recurrent trajectories as independent sets. Thereby we identify maximum recurrent trajectories while resolving multiple sources of uncertainty, like mutation clusters, in the phylogenetic data. We apply POTTR to TRACERx non-small cell lung cancer bulk sequencing and acute myeloid leukemia single-cell sequencing data and through resolution of mutation clusters discover previously unreported trajectories of high statistical significance. On lineage tracing data of an in vitro embryoid model, POTTR identifies conserved differentiation routes across biological replicates and how these routes change in response to chemical perturbations.
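To make the idea of a trajectory shared by at least k phylogenies concrete, the following is a minimal brute-force sketch in Python. It is not the POTTR algorithm: it does not build a conflict graph, and it treats each input as an ordinary (complete) partial order rather than the incomplete posets defined in the paper. The function names, the input encoding (a set of events plus a set of "earlier, later" pairs per sample), and the toy events A to D are all assumptions made for illustration only.

```python
from itertools import combinations


def transitive_closure(pairs):
    """Return the transitive closure of a set of (earlier, later) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def induced(order, elements, subset):
    """Order relations restricted to `subset`, or None if an event is absent."""
    if not subset <= elements:
        return None
    return frozenset((a, b) for a, b in order if a in subset and b in subset)


def largest_k_common_induced_subposet(posets, k):
    """Exhaustive search (exponential time) for a largest event set whose
    induced order is identical in at least k of the input posets."""
    closed = [(frozenset(e), transitive_closure(r)) for e, r in posets]
    universe = sorted(set().union(*(e for e, _ in closed)))
    for size in range(len(universe), 0, -1):
        for subset in combinations(universe, size):
            s = frozenset(subset)
            counts = {}
            for elems, rel in closed:
                ind = induced(rel, elems, s)
                if ind is not None:
                    counts[ind] = counts.get(ind, 0) + 1
            for rel, n in counts.items():
                if n >= k:
                    return s, rel
    return frozenset(), frozenset()


# Hypothetical toy input: three samples over placeholder events A-D.
patients = [
    ({"A", "B", "C"}, {("A", "B"), ("B", "C")}),
    ({"A", "B", "C", "D"}, {("A", "B"), ("B", "C"), ("A", "D")}),
    ({"A", "B", "D"}, {("A", "B"), ("D", "A")}),
]

trajectory, relations = largest_k_common_induced_subposet(patients, k=2)
print(sorted(trajectory), sorted(relations))
```

Because the underlying problem is NP-hard, exhaustive enumeration of this kind is only workable for a handful of events; it is meant to illustrate the objective, not to suggest a practical method.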