TY - GEN
T1 - Impact of bias correction methods and increasing the biological samples in transcriptomic analysis
AU - González-Peña, Dianelys
AU - Nixon, Scott E.
AU - Southey, Bruce R.
AU - Lawson, Marcus A.
AU - McCusker, Robert H.
AU - Dantzer, Robert
AU - Kelley, Keith W.
AU - Rodriguez-Zas, Sandra L.
PY - 2014
Y1 - 2014
N2 - RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, identification of splice junctions, and discovery of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g. genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected at both sample sizes (4 and 6 samples per group). The average number of DATs at these sample sizes was 144.5 and 79, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.
AB - RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, identification of splice junctions, and discovery of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g. genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected at both sample sizes (4 and 6 samples per group). The average number of DATs at these sample sizes was 144.5 and 79, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.
KW - Bias correction
KW - RNA-Seq
KW - Sample size
KW - Transcriptome
UR - http://www.scopus.com/inward/record.url?scp=84905819573&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905819573&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84905819573
SN - 9781632665140
T3 - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
SP - 115
EP - 118
BT - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
PB - International Society for Computers and Their Applications
T2 - 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Y2 - 24 March 2014 through 26 March 2014
ER -