Differential expression in SAGE: Accounting for normal between-library variation

Keith A. Baggerly, Li Deng, Jeffrey S. Morris, C. Marcelo Aldaz

Research output: Contribution to journal › Article › peer-review

257 Scopus citations

Abstract

Motivation: In contrasting levels of gene expression between groups of SAGE libraries, the libraries within each group are often combined and the counts for the tag of interest summed, and inference is made on the basis of these larger 'pseudolibraries'. While this captures the sampling variability inherent in the procedure, it fails to allow for normal variation in levels of the gene between individuals within the same group, and can consequently overstate the significance of the results. The effect is not slight: between-library variation can be hundreds of times the within-library variation.

Results: We introduce a beta-binomial sampling model that correctly incorporates both sources of variation. We show how to fit the parameters of this model, and introduce a test statistic for differential expression similar to a two-sample t-test.
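To make the idea concrete, the following Python sketch contrasts two groups of libraries using a variance that reflects between-library variation rather than pooled counts alone. It is an illustration under stated assumptions, not the fitting procedure from the paper: the weights are simply taken proportional to library size, the variance of the weighted proportion is estimated by a moment estimator, and that estimate is floored at the binomial variance of the pooled 'pseudolibrary'. The function names group_estimate and t_like are hypothetical.

```python
from math import sqrt


def group_estimate(counts, sizes):
    """Return (weighted proportion, estimated variance of that proportion).

    counts[i] is the tag count in library i; sizes[i] is that library's
    total tag count. Assumption: weights are proportional to library size.
    """
    if len(counts) != len(sizes) or len(counts) < 2:
        raise ValueError("need at least two libraries per group")
    total = sum(sizes)
    weights = [n / total for n in sizes]
    props = [x / n for x, n in zip(counts, sizes)]

    # With size-proportional weights this equals the pooled proportion.
    p_hat = sum(w * p for w, p in zip(weights, props))

    # Moment estimator of Var(p_hat) that reflects between-library spread.
    sum_w2 = sum(w * w for w in weights)
    sum_w2p2 = sum(w * w * p * p for w, p in zip(weights, props))
    var_between = (sum_w2p2 - sum_w2 * p_hat ** 2) / (1.0 - sum_w2)

    # Sampling-only (binomial) variance of the pooled proportion.
    var_binomial = p_hat * (1.0 - p_hat) / total

    # Never report less variance than pure sampling would give.
    return p_hat, max(var_between, var_binomial)


def t_like(counts_a, sizes_a, counts_b, sizes_b):
    """Two-sample t-like statistic for a difference in tag proportions."""
    p_a, v_a = group_estimate(counts_a, sizes_a)
    p_b, v_b = group_estimate(counts_b, sizes_b)
    return (p_a - p_b) / sqrt(v_a + v_b)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(t_like([10, 20], [1000, 1000], [10, 10], [1000, 1000]))
```

When the libraries in a group disagree, the between-library estimate dominates and the statistic shrinks relative to a pooled-count comparison, which is the overstatement of significance that the abstract warns about.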

Original language: English (US)
Pages (from-to): 1477-1483
Number of pages: 7
Journal: Bioinformatics
Volume: 19
Issue number: 12
State: Published - Aug 12 2003

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
