Impact of bias correction methods and increasing the biological samples in transcriptomic analysis

Dianelys González-Peña; Scott E. Nixon; Bruce R. Southey; Marcus A. Lawson; Robert H. McCusker; Robert Dantzer; Keith W. Kelley; Sandra L. Rodriguez-Zas

Symptom Research CAO

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, identification of splice junctions, and uncovering of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g. genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment, and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking, and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected with both 4 and 6 samples per group. The average number of DATs for the two sample sizes was 144.5 and 79, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.

Original language	English (US)
Title of host publication	Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Publisher	International Society for Computers and Their Applications
Pages	115-118
Number of pages	4
ISBN (Print)	9781632665140
State	Published - 2014
Event	6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 - Las Vegas, NV, United States
Duration	Mar 24 2014 → Mar 26 2014

Publication series

Name	Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014

Other

Other	6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Country/Territory	United States
City	Las Vegas, NV
Period	3/24/14 → 3/26/14

Keywords

Bias correction
RNA-Seq
Sample size
Transcriptome

ASJC Scopus subject areas

Information Systems
Health Informatics

Cite this

González-Peña, D., Nixon, S. E., Southey, B. R., Lawson, M. A., McCusker, R. H., Dantzer, R., Kelley, K. W., & Rodriguez-Zas, S. L. (2014). Impact of bias correction methods and increasing the biological samples in transcriptomic analysis. In Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 (pp. 115-118). (Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014). International Society for Computers and Their Applications.

@inproceedings{337e9d0904c64e159bf64d7fa16c948e,
title = "Impact of bias correction methods and increasing the biological samples in transcriptomic analysis",
abstract = "RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, identification of splice junctions, and uncovering of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g. genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment, and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking, and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected with both 4 and 6 samples per group. The average number of DATs for the two sample sizes was 144.5 and 79, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.",
keywords = "Bias correction, RNA-Seq, Sample size, Transcriptome",
author = "Dianelys Gonz{\'a}lez-Pe{\~n}a and Nixon, {Scott E.} and Southey, {Bruce R.} and Lawson, {Marcus A.} and McCusker, {Robert H.} and Robert Dantzer and Kelley, {Keith W.} and Rodriguez-Zas, {Sandra L.}",
year = "2014",
language = "English (US)",
isbn = "9781632665140",
series = "Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014",
publisher = "International Society for Computers and Their Applications",
pages = "115--118",
booktitle = "Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014",
note = "6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 ; Conference date: 24-03-2014 Through 26-03-2014",
}

TY  - GEN
T1  - Impact of bias correction methods and increasing the biological samples in transcriptomic analysis
AU  - González-Peña, Dianelys
AU  - Nixon, Scott E.
AU  - Southey, Bruce R.
AU  - Lawson, Marcus A.
AU  - McCusker, Robert H.
AU  - Dantzer, Robert
AU  - Kelley, Keith W.
AU  - Rodriguez-Zas, Sandra L.
PY  - 2014
Y1  - 2014
AB  - RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, identification of splice junctions, and uncovering of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g. genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment, and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking, and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected with both 4 and 6 samples per group. The average number of DATs for the two sample sizes was 144.5 and 79, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.
KW  - Bias correction
KW  - RNA-Seq
KW  - Sample size
KW  - Transcriptome
UR  - http://www.scopus.com/inward/record.url?scp=84905819573&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84905819573&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84905819573
SN  - 9781632665140
T3  - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
SP  - 115
EP  - 118
BT  - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
PB  - International Society for Computers and Their Applications
T2  - 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Y2  - 24 March 2014 through 26 March 2014
ER  -
