TY - JOUR
T1 - dsPIG
T2 - A tool to predict imprinted genes from the deep sequencing of whole transcriptomes
AU - Li, Hua
AU - Su, Xiao
AU - Gallegos, Juan
AU - Lu, Yue
AU - Ji, Yuan
AU - Molldrem, Jeffrey J.
AU - Liang, Shoudan
N1 - Funding Information:
We thank Kevin Coombes, Li Zhang, Pan Tong and Zhifeng Shao for thoughtful discussions and consistent help. This research was funded by a training fellowship from the Keck Center for Quantitative Biomedical Sciences of the Gulf Coast Consortia, on the Computational Cancer Biology Training Program from the Cancer Prevention & Research Institute of Texas (CPRIT No. RP101489).
PY - 2012/10/19
Y1 - 2012/10/19
N2 - Background: Dysregulation of imprinted genes, which are expressed in a parent-of-origin-specific manner, plays an important role in various human diseases, such as cancer and behavioral disorder. To date, however, fewer than 100 imprinted genes have been identified in the human genome. The recent availability of high-throughput technology makes it possible to have large-scale prediction of imprinted genes. Here we propose a Bayesian model (dsPIG) to predict imprinted genes on the basis of allelic expression observed in mRNA-Seq data of independent human tissues. Results: Our model (dsPIG) was capable of identifying imprinted genes with high sensitivity and specificity and a low false discovery rate when the number of sequenced tissue samples was fairly large, according to simulations. By applying dsPIG to the mRNA-Seq data, we predicted 94 imprinted genes in 20 cerebellum samples and 57 imprinted genes in 9 diverse tissue samples with expected low false discovery rates. We also assessed dsPIG using previously validated imprinted and non-imprinted genes. With simulations, we further analyzed how imbalanced allelic expression of non-imprinted genes or different minor allele frequencies affected the predictions of dsPIG. Interestingly, we found that, among biallelically expressed genes, at least 18 genes expressed significantly more transcripts from one allele than the other among different individuals and tissues. Conclusion: With the prevalence of the mRNA-Seq technology, dsPIG has become a useful tool for analysis of allelic expression and large-scale prediction of imprinted genes. For ease of use, we have set up a web service and also provided an R package for dsPIG at http://www.shoudanliang.com/dsPIG/.
AB - Background: Dysregulation of imprinted genes, which are expressed in a parent-of-origin-specific manner, plays an important role in various human diseases, such as cancer and behavioral disorder. To date, however, fewer than 100 imprinted genes have been identified in the human genome. The recent availability of high-throughput technology makes it possible to have large-scale prediction of imprinted genes. Here we propose a Bayesian model (dsPIG) to predict imprinted genes on the basis of allelic expression observed in mRNA-Seq data of independent human tissues. Results: Our model (dsPIG) was capable of identifying imprinted genes with high sensitivity and specificity and a low false discovery rate when the number of sequenced tissue samples was fairly large, according to simulations. By applying dsPIG to the mRNA-Seq data, we predicted 94 imprinted genes in 20 cerebellum samples and 57 imprinted genes in 9 diverse tissue samples with expected low false discovery rates. We also assessed dsPIG using previously validated imprinted and non-imprinted genes. With simulations, we further analyzed how imbalanced allelic expression of non-imprinted genes or different minor allele frequencies affected the predictions of dsPIG. Interestingly, we found that, among biallelically expressed genes, at least 18 genes expressed significantly more transcripts from one allele than the other among different individuals and tissues. Conclusion: With the prevalence of the mRNA-Seq technology, dsPIG has become a useful tool for analysis of allelic expression and large-scale prediction of imprinted genes. For ease of use, we have set up a web service and also provided an R package for dsPIG at http://www.shoudanliang.com/dsPIG/.
KW - Analysis of allelic expression
KW - Bayesian model
KW - Prediction of imprinted genes
KW - Transcriptome deep sequencing
KW - mRNA-Seq
UR - http://www.scopus.com/inward/record.url?scp=84867604230&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867604230&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-13-271
DO - 10.1186/1471-2105-13-271
M3 - Article
C2 - 23083219
AN - SCOPUS:84867604230
SN - 1471-2105
VL - 13
JO - BMC Bioinformatics
JF - BMC Bioinformatics
IS - 1
M1 - 271
ER -