TY - JOUR
T1 - The VAAST Variant Prioritizer (VVP)
T2 - Ultrafast, easy to use whole genome variant prioritization tool
AU - Flygare, Steven
AU - Hernandez, Edgar Javier
AU - Phan, Lon
AU - Moore, Barry
AU - Li, Man
AU - Fejes, Anthony
AU - Hu, Hao
AU - Eilbeck, Karen
AU - Huff, Chad
AU - Jorde, Lynn
AU - Reese, Martin G.
AU - Yandell, Mark
N1 - Publisher Copyright: © 2018 The Author(s).
PY - 2018/2/20
Y1 - 2018/2/20
N2 - Background: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. Results: We show that scoring the entire contents of dbSNP (>155 million variants) requires only 95 min using a machine with 4 CPUs and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate, and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. Conclusions: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences.
AB - Background: Prioritization of sequence variants for diagnosis and discovery of Mendelian diseases is challenging, especially in large collections of whole genome sequences (WGS). Fast, scalable solutions are needed for discovery research, for clinical applications, and for curation of massive public variant repositories such as dbSNP and gnomAD. In response, we have developed VVP, the VAAST Variant Prioritizer. VVP is ultrafast, scales to even the largest variant repositories and genome collections, and its outputs are designed to simplify clinical interpretation of variants of uncertain significance. Results: We show that scoring the entire contents of dbSNP (>155 million variants) requires only 95 min using a machine with 4 CPUs and 16 GB of RAM, and that a 60X WGS can be processed in less than 5 min. We also demonstrate that VVP can score variants anywhere in the genome, regardless of type, effect, or location. It does so by integrating sequence conservation, the type of sequence change, allele frequencies, variant burden, and zygosity. Finally, we also show that VVP scores are consistently accurate, and easily interpreted, traits not shared by many commonly used tools such as SIFT and CADD. Conclusions: VVP provides rapid and scalable means to prioritize any sequence variant, anywhere in the genome, and its scores are designed to facilitate variant interpretation using ACMG and NHS guidelines. These traits make it well suited for operation on very large collections of WGS sequences.
KW - Genomics
KW - Human genome
KW - Variant prioritization
KW - Variants of uncertain significance
UR - http://www.scopus.com/inward/record.url?scp=85042543566&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042543566&partnerID=8YFLogxK
U2 - 10.1186/s12859-018-2056-y
DO - 10.1186/s12859-018-2056-y
M3 - Article
C2 - 29463208
AN - SCOPUS:85042543566
SN - 1471-2105
VL - 19
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 57
ER -