Submit a preprint


Cancer phylogenetic tree inference at scale from 1000s of single cell genomesuse asterix (*) to get italics
Sohrab Salehi, Fatemeh Dorri, Kevin Chern, Farhia Kabeer, Nicole Rusk, Tyler Funnell, Marc J Williams, Daniel Lai, Mirela Andronescu, Kieran R. Campbell, Andrew McPherson, Samuel Aparicio, Andrew Roth, Sohrab Shah, and Alexandre Bouchard-CôtéPlease use the format "First name initials family name" as in "Marie S. Curie, Niels H. D. Bohr, Albert Einstein, John R. R. Tolkien, Donna T. Strickland"
<p style="text-align: justify;">A new generation of scalable single cell whole genome sequencing (scWGS) methods allows unprecedented high resolution measurement of the evolutionary dynamics of cancer cell populations. Phylogenetic reconstruction is central to identifying sub-populations and distinguishing the mutational processes that gave rise to them.<br>Existing phylogenetic tree building models do not scale to the tens of thousands of high resolution genomes achievable with current scWGS methods.<br>We constructed a phylogenetic model and associated Bayesian inference procedure, Sitka, specifically for scWGS data.<br>The method is based on a novel phylogenetic encoding of copy number (CN) data, the Sitka transformation, that simplifies the site dependencies induced by rearrangements while still forming a sound foundation to phylogenetic inference. The Sitka transformation allows us to design novel scalable Markov chain Monte Carlo (MCMC) algorithms. Moreover, we introduce a novel point mutation calling method that incorporates the CN data and the underlying phylogenetic tree to overcome the low per-cell coverage of scWGS. We demonstrate our method on three single cell datasets, including a novel PDX series, and analyse the topological properties of the inferred trees. Sitka is freely available at \href{}{}.</p>
You should fill this box only if you chose 'All or part of the results presented in this preprint are based on data'. URL must start with http:// or https://
You should fill this box only if you chose 'Scripts were used to obtain or analyze the results'. URL must start with http:// or https:// should fill this box only if you chose 'Codes have been used in this study'. URL must start with http:// or https://
Phylogenetics, Evolutionary dynamics, Single cell genomics, Cancer evolution, scalable Bayesian inference
Evolutionary Biology, Genetics and population Genetics, Genomics and Transcriptomics, Machine learning, Probability and statistics
No need for them to be recommenders of PCI Math Comp Biol. Please do not suggest reviewers for whom there might be a conflict of interest. Reviewers are not allowed to review preprints written by close colleagues (with whom they have published in the last four years, with whom they have received joint funding in the last four years, or with whom they are currently writing a manuscript, or submitting a grant proposal), or by family members, friends, or anyone for whom bias might affect the nature of the review - see the code of conduct
e.g. John Doe []
2021-12-10 17:08:04
Amaury Lambert