TY - JOUR
T1 - Ultra-precise detection of mutations by droplet-based amplification of circularized DNA
AU - Wang, Kaile
AU - Ma, Qin
AU - Jiang, Lan
AU - Lai, Shujuan
AU - Lu, Xuemei
AU - Hou, Yali
AU - Wu, Chung I.
AU - Ruan, Jue
N1 - Publisher Copyright:
© 2016 Wang et al.
PY - 2016/3/9
Y1 - 2016/3/9
N2 - Background: Next-generation sequencing (NGS) has been widely used in studies of biological processes ranging from microbial evolution to cancer genomics. However, the error rate of NGS (0.1% to 1%) remains a major obstacle to comprehensive investigation of low-frequency variants, and current solutions suffer from severe amplification bias or low efficiency. Results: We developed Droplet-CirSeq for relatively efficient, low-bias and ultra-sensitive identification of variants by combining millions of uniform-sized picoliter droplets with Cir-seq. Droplet-CirSeq has an error rate of 3-5 × 10^-6. To systematically evaluate its amplification uniformity and mutation detection capability, we used mixtures of two E. coli strains to simulate mutations at different frequencies. Compared with Cir-seq, the coefficient of variation of read depth for Droplet-CirSeq was 10 times lower (p = 2.6 × 10^-3), and the identified allele frequencies were more tightly concentrated around the true mixture frequencies (p = 4.8 × 10^-3), indicating a significant reduction in amplification bias and improved accuracy in allele frequency determination. Additionally, with a 3 pg DNA input, Droplet-CirSeq detected 2.5 times as many genuine SNPs (p < 0.001) and achieved a 2.8 times lower false positive rate (p < 0.05) and a 1.5 times lower false negative rate (p < 0.001). Notably, the false positive sites were predominantly of two base substitution types (G>A and C>T). Our findings indicated that a 30 pg DNA input distributed across 5 to 10 million droplets yielded the maximal detection of true mutations compared with 3 pg (p = 1.2 × 10^-8) and 300 pg (p = 2.2 × 10^-3) inputs.
Conclusions: We developed Droplet-CirSeq, a method that significantly reduces amplification bias and shows clear advantages over currently prevalent methods for detecting ultra-low-frequency mutations. Droplet-CirSeq is promising for identifying low-frequency mutations from extremely low-input DNA, such as DNA from uncultured microorganisms, captured DNA of target regions, and circulating DNA in plasma, and its concept of rolling circle amplification in droplets could also be applied to other low-input DNA amplification settings.
AB - Background: Next-generation sequencing (NGS) has been widely used in studies of biological processes ranging from microbial evolution to cancer genomics. However, the error rate of NGS (0.1% to 1%) remains a major obstacle to comprehensive investigation of low-frequency variants, and current solutions suffer from severe amplification bias or low efficiency. Results: We developed Droplet-CirSeq for relatively efficient, low-bias and ultra-sensitive identification of variants by combining millions of uniform-sized picoliter droplets with Cir-seq. Droplet-CirSeq has an error rate of 3-5 × 10^-6. To systematically evaluate its amplification uniformity and mutation detection capability, we used mixtures of two E. coli strains to simulate mutations at different frequencies. Compared with Cir-seq, the coefficient of variation of read depth for Droplet-CirSeq was 10 times lower (p = 2.6 × 10^-3), and the identified allele frequencies were more tightly concentrated around the true mixture frequencies (p = 4.8 × 10^-3), indicating a significant reduction in amplification bias and improved accuracy in allele frequency determination. Additionally, with a 3 pg DNA input, Droplet-CirSeq detected 2.5 times as many genuine SNPs (p < 0.001) and achieved a 2.8 times lower false positive rate (p < 0.05) and a 1.5 times lower false negative rate (p < 0.001). Notably, the false positive sites were predominantly of two base substitution types (G>A and C>T). Our findings indicated that a 30 pg DNA input distributed across 5 to 10 million droplets yielded the maximal detection of true mutations compared with 3 pg (p = 1.2 × 10^-8) and 300 pg (p = 2.2 × 10^-3) inputs.
Conclusions: We developed Droplet-CirSeq, a method that significantly reduces amplification bias and shows clear advantages over currently prevalent methods for detecting ultra-low-frequency mutations. Droplet-CirSeq is promising for identifying low-frequency mutations from extremely low-input DNA, such as DNA from uncultured microorganisms, captured DNA of target regions, and circulating DNA in plasma, and its concept of rolling circle amplification in droplets could also be applied to other low-input DNA amplification settings.
KW - Accurate sequencing
KW - Droplet (water in oil) based amplification
KW - Low bias amplification
KW - Low frequency mutation detection
KW - Low input NGS library
UR - http://www.scopus.com/inward/record.url?scp=84960077662&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84960077662&partnerID=8YFLogxK
U2 - 10.1186/s12864-016-2480-1
DO - 10.1186/s12864-016-2480-1
M3 - Article
C2 - 26960407
AN - SCOPUS:84960077662
SN - 1471-2164
VL - 17
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 214
ER -