
Exploring the Space of Tumor Phylogenies Consistent with Single-Cell Whole-Genome Sequencing Data

Khan, S. R.; Sashittal, P.

2026-01-23 bioinformatics
10.64898/2026.01.21.700922 bioRxiv

Tumors comprise subpopulations of cells that harbor distinct collections of somatic mutations, ranging from single-nucleotide variants (SNVs) to large-scale copy-number aberrations (CNAs). Single-cell whole-genome sequencing (scWGS) enables direct measurement of these mutations; however, inferring tumor phylogenies from scWGS data remains challenging due to ultra-low coverage (~0.05×). There may be multiple ways of imputing the missing information, leading to distinct tumor phylogenies that are equally well supported by the data. Existing methods produce a single phylogeny and overlook this uncertainty in reconstructing evolutionary histories from sparse scWGS data. We present SCOPE, a novel algorithmic framework that characterizes the space of tumor phylogenies consistent with scWGS data under a copy-number constrained version of the perfect phylogeny model. Our approach relies on estimating the cell fraction of each mutation, i.e., the proportion of cells within each copy-number cluster that carry the mutation. We derive the necessary and sufficient conditions these fractions must satisfy to admit a copy-number constrained perfect phylogeny. This yields a complete combinatorial description of all tumor phylogenies that are supported by the data under our model. We prove that identifying the largest subset of mutations whose cell fractions satisfy the model constraints, given noisy measurements of cell fractions, is NP-hard. On simulated data, SCOPE is more accurate than existing methods and runs faster, particularly on the larger simulations. On scWGS data from a patient-derived ovarian cancer cell line, SCOPE infers a more resolved phylogeny with stronger statistical support compared to existing methods. Using SCOPE to analyze a larger dataset of 4 triple-negative breast cancer (TNBC) and 8 high-grade serous ovarian cancer (HGSOC) samples, we show that several samples admit multiple phylogenies. We further find that the number of admissible phylogenies increases with lower sequencing coverage and is negatively correlated with the number of copy-number clusters and the number of distinct loss of heterozygosity (LOH) events in the clusters, highlighting how data quality and evolutionary constraints jointly shape uncertainty in tumor phylogeny reconstruction. By providing a principled framework for exploring and quantifying phylogenetic uncertainty, SCOPE establishes a new foundation for robust inference of tumor evolution from scWGS data.

Code availability: Software is available at https://github.com/sashittal-group/SCOPE
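
To make the abstract's definition of a cell fraction concrete (the proportion of cells within each copy-number cluster that carry a mutation), the following is a minimal illustrative sketch in Python. It is not SCOPE's estimator: the function name `cell_fractions`, the input encoding (1 = mutation observed, 0 = observed absent, `None` = no coverage), and the choice to ignore uncovered cells are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence


def cell_fractions(
    calls: Sequence[Optional[int]],
    clusters: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Naive per-cluster cell fraction of one mutation (hypothetical helper).

    calls[i] is 1 if cell i carries the mutation, 0 if it was observed
    without it, and None if the locus had no coverage in that cell.
    clusters[i] is the copy-number cluster label of cell i.
    """
    if len(calls) != len(clusters):
        raise ValueError("calls and clusters must have the same length")

    carrying = defaultdict(int)  # cells observed with the mutation
    observed = defaultdict(int)  # cells with any observation at the locus
    for call, cluster in zip(calls, clusters):
        if call is None:
            continue  # assumption: uncovered cells are simply skipped
        observed[cluster] += 1
        carrying[cluster] += call

    # Clusters with no covered cells are omitted rather than guessed.
    return {c: carrying[c] / observed[c] for c in observed}


# Toy data: eight cells in two copy-number clusters, two cells uncovered.
calls = [1, 1, None, 0, 0, None, 1, 0]
clusters = ["A", "A", "A", "A", "B", "B", "B", "B"]
fractions = cell_fractions(calls, clusters)
```

At coverage as low as that described in the abstract, most entries would be missing, so a naive ratio like this would be very noisy; the sketch only illustrates the quantity being estimated, not how the paper estimates it.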

Matching journals

The top 3 journals account for 50% of the predicted probability mass.