Stratified Test Accurately Identifies Differentially Expressed Genes under Batch Effects in Single-Cell Data

Shaoheng Liang, Qingnan Liang, Rui Chen, Ken Chen

Research output: Contribution to journal, Article, peer-review

2 Scopus citations

Abstract

Analyzing single-cell sequencing data from large cohorts is challenging. Discrepancies across experiments and differences among participants often lead to omissions and false discoveries in differentially expressed genes. We find that the Van Elteren test, a stratified version of the widely used Wilcoxon rank-sum test, elegantly mitigates the problem. We also modified the common language effect size to supplement this test, further improving its utility. On both simulated and real patient data, we show the ability of the Van Elteren test to control for false positives and false negatives. A comprehensive assessment using receiver operating characteristic (ROC) curves shows that the Van Elteren test achieves higher sensitivity and specificity on simulated datasets than nine state-of-the-art differential expression analysis methods. The effect size also estimates the differences between cell types more accurately.
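
To make the idea of stratification concrete, the following is a minimal sketch of the textbook Van Elteren statistic for one gene, written with the Python standard library only. It is not the authors' implementation, and it does not include their modified effect size. The function names (`midranks`, `van_elteren`), the argument names, and the toy data are assumptions made for illustration. Each stratum (for example, a batch or a participant) is ranked separately, the per-stratum Wilcoxon rank sums are combined with the usual weights 1/(n_h + 1), and a two-sided p-value comes from the normal approximation with a tie correction.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Return (1-based ranks with ties averaged, sum of t^3 - t over tie groups)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    return ranks, tie_term


def van_elteren(values, in_group, strata):
    """Sketch of a two-sided Van Elteren test; returns (z, p_value).

    values   : expression of one gene in each cell
    in_group : True if the cell belongs to the group of interest
    strata   : stratum label of each cell (hypothetical: batch or donor)
    """
    by_stratum = defaultdict(list)
    for v, g, s in zip(values, in_group, strata):
        by_stratum[s].append((v, bool(g)))

    stat = mean = var = 0.0
    for cells in by_stratum.values():
        n = len(cells)
        m = sum(g for _, g in cells)  # cells in the group of interest
        k = n - m                     # cells in the comparison group
        if m == 0 or k == 0:
            continue  # a stratum with only one group carries no contrast
        ranks, tie_term = midranks([v for v, _ in cells])
        w = 1.0 / (n + 1)             # Van Elteren weight for this stratum
        rank_sum = sum(r for r, (_, g) in zip(ranks, cells) if g)
        stat += w * rank_sum
        mean += w * m * (n + 1) / 2
        # Null variance of the rank sum, corrected for ties.
        var += w**2 * m * k / 12 * ((n + 1) - tie_term / (n * (n - 1)))

    if var == 0:
        return 0.0, 1.0  # no informative stratum, or all values tied
    z = (stat - mean) / sqrt(var)
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


# Toy data, made up for illustration: stratum "b" is shifted by +10,
# mimicking a batch effect that affects both groups equally.
values = [1, 2, 3, 4, 11, 12, 13, 14]
in_group = [False, False, True, True] * 2
strata = ["a"] * 4 + ["b"] * 4
z, p = van_elteren(values, in_group, strata)
print(f"z = {z:.3f}, p = {p:.3f}")
```

Because ranks are computed within each stratum, adding a constant to every cell of one stratum leaves the statistic unchanged, which is the property that makes a stratified test attractive when batches differ systematically. With a single stratum the statistic reduces to the ordinary Wilcoxon rank-sum z-score without continuity correction.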

Original language: English (US)
Pages (from-to): 2072-2079
Number of pages: 8
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 18
Issue number: 6
State: Published - 2021

Keywords

  • Van Elteren test
  • Wilcoxon rank-sum test
  • batch effect
  • differential expression analysis
  • scRNA-seq analysis

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics

MD Anderson CCSG core facilities

  • Bioinformatics Shared Resource
