Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data

Xian F. Mallory; Mohammadamin Edrisi; Nicholas Navin; Luay Nakhleh

doi:10.1371/journal.pcbi.1008012

Genetics

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Single-cell DNA sequencing technologies are enabling the study of mutations and their evolutionary trajectories in cancer. Somatic copy number aberrations (CNAs) have been implicated in the development and progression of various types of cancer. A wide array of methods for CNA detection has been either developed specifically for or adapted to single-cell DNA sequencing data. Understanding the strengths and limitations that are unique to each of these methods is very important for obtaining accurate copy number profiles from single-cell DNA sequencing data. We benchmarked three widely used methods (Ginkgo, HMMcopy, and CopyNumber) on simulated as well as real datasets. To facilitate this, we developed a novel simulator of single-cell genome evolution in the presence of CNAs. Furthermore, to assess performance on empirical data where the ground truth is unknown, we introduce a phylogeny-based measure for identifying potentially erroneous inferences. While single-cell DNA sequencing is very promising for elucidating and understanding CNAs, our findings show that even the best existing method does not exceed 80% accuracy. New methods that significantly improve upon the accuracy of these three methods are needed. Furthermore, with the large datasets being generated, the methods must be computationally efficient.

Original language	English (US)
Article number	e1008012
Journal	PLoS computational biology
Volume	16
Issue number	7
DOIs	https://doi.org/10.1371/journal.pcbi.1008012
State	Published - Jul 2020

ASJC Scopus subject areas

Ecology, Evolution, Behavior and Systematics
Modeling and Simulation
Ecology
Molecular Biology
Genetics
Cellular and Molecular Neuroscience
Computational Theory and Mathematics

Access to Document

10.1371/journal.pcbi.1008012

Cite this

@article{fb6298f71a6941f0a020f63975df7654,

title = "Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data",

abstract = "Single-cell DNA sequencing technologies are enabling the study of mutations and their evolutionary trajectories in cancer. Somatic copy number aberrations (CNAs) have been implicated in the development and progression of various types of cancer. A wide array of methods for CNA detection has been either developed specifically for or adapted to single-cell DNA sequencing data. Understanding the strengths and limitations that are unique to each of these methods is very important for obtaining accurate copy number profiles from single-cell DNA sequencing data. We benchmarked three widely used methods (Ginkgo, HMMcopy, and CopyNumber) on simulated as well as real datasets. To facilitate this, we developed a novel simulator of single-cell genome evolution in the presence of CNAs. Furthermore, to assess performance on empirical data where the ground truth is unknown, we introduce a phylogeny-based measure for identifying potentially erroneous inferences. While single-cell DNA sequencing is very promising for elucidating and understanding CNAs, our findings show that even the best existing method does not exceed 80% accuracy. New methods that significantly improve upon the accuracy of these three methods are needed. Furthermore, with the large datasets being generated, the methods must be computationally efficient.",

author = "Mallory, {Xian F.} and Mohammadamin Edrisi and Nicholas Navin and Luay Nakhleh",

note = "Funding Information: The study was supported by the National Science Foundation grant IIS-1812822 (L.N.). X.F.M. was supported in part by the Computational Cancer Biology Training Program (CPRIT Grant No. RP170593). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Publisher Copyright: {\textcopyright} 2020 Mallory et al.",

year = "2020",

month = jul,

doi = "10.1371/journal.pcbi.1008012",

language = "English (US)",

volume = "16",

journal = "PLoS computational biology",

issn = "1553-734X",

publisher = "Public Library of Science",

number = "7",

}

TY - JOUR

T1 - Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data

AU - Mallory, Xian F.

AU - Edrisi, Mohammadamin

AU - Navin, Nicholas

AU - Nakhleh, Luay

N1 - Funding Information: The study was supported by the National Science Foundation grant IIS-1812822 (L.N.). X.F.M. was supported in part by the Computational Cancer Biology Training Program (CPRIT Grant No. RP170593). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Publisher Copyright: © 2020 Mallory et al.

PY - 2020/7

Y1 - 2020/7

AB - Single-cell DNA sequencing technologies are enabling the study of mutations and their evolutionary trajectories in cancer. Somatic copy number aberrations (CNAs) have been implicated in the development and progression of various types of cancer. A wide array of methods for CNA detection has been either developed specifically for or adapted to single-cell DNA sequencing data. Understanding the strengths and limitations that are unique to each of these methods is very important for obtaining accurate copy number profiles from single-cell DNA sequencing data. We benchmarked three widely used methods (Ginkgo, HMMcopy, and CopyNumber) on simulated as well as real datasets. To facilitate this, we developed a novel simulator of single-cell genome evolution in the presence of CNAs. Furthermore, to assess performance on empirical data where the ground truth is unknown, we introduce a phylogeny-based measure for identifying potentially erroneous inferences. While single-cell DNA sequencing is very promising for elucidating and understanding CNAs, our findings show that even the best existing method does not exceed 80% accuracy. New methods that significantly improve upon the accuracy of these three methods are needed. Furthermore, with the large datasets being generated, the methods must be computationally efficient.

UR - http://www.scopus.com/inward/record.url?scp=85088609739&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85088609739&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1008012

DO - 10.1371/journal.pcbi.1008012

M3 - Article

C2 - 32658894

AN - SCOPUS:85088609739

SN - 1553-734X

VL - 16

JO - PLoS computational biology

JF - PLoS computational biology

IS - 7

M1 - e1008012

ER -
