Bayesian variable selection for linear regression in high dimensional microarray data

Wellington Cabrera, Carlos Ordonez, David Sergio Matusevich, Veerabhadran Baladandayuthapani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Variable selection is a fundamental problem in Bayesian statistics whose solution requires exploring a combinatorial search space. We study the solution of variable selection with a well-known MCMC method, which requires thousands of iterations. We present several algorithmic optimizations to accelerate the MCMC method to make it work efficiently inside a database system. Our optimizations include sufficient statistics, variable preselection, hash tables and calling a linear algebra library. We present experiments with very high dimensional microarray data sets to predict cancer survival time. We discuss encouraging findings, identifying specific genes likely to predict the survival time for brain cancer patients. We also show our DBMS-based algorithm is orders of magnitude faster than the R statistical package. Our work shows a DBMS is a promising platform to analyze microarray data.
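The abstract names sufficient statistics and hash tables among its optimizations, but this record does not describe how they are used. The Python sketch below is an illustration only, not the authors' DBMS implementation. It assumes the sufficient statistics are the cross-product matrix Q = ZᵀZ over rows z = (1, x₁, …, x_d, y), computed in a single pass, and that a dictionary keyed by the selected variable subset stands in for the hash table, so a subset revisited by the sampler is not refit. The names `gram_matrix`, `solve`, and `fit_subset` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence, Tuple

Fit = Tuple[List[float], float]  # (coefficients incl. intercept, SSE)


def gram_matrix(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[List[float]]:
    """One pass over the data: Q = Z^T Z with rows z = (1, x_1..x_d, y)."""
    d = len(X[0])
    Q = [[0.0] * (d + 2) for _ in range(d + 2)]
    for row, target in zip(X, y):
        z = [1.0, *row, target]
        for i, zi in enumerate(z):
            for j, zj in enumerate(z):
                Q[i][j] += zi * zj
    return Q


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_subset(Q: List[List[float]], subset: Iterable[int],
               cache: Dict[FrozenSet[int], Fit]) -> Fit:
    """Least-squares fit for a variable subset using only Q, memoized in cache."""
    key = frozenset(subset)
    if key in cache:                      # hash-table hit: no recomputation
        return cache[key]
    d = len(Q) - 2
    idx = [0] + [j + 1 for j in sorted(key)]   # intercept + selected columns
    A = [[Q[i][j] for j in idx] for i in idx]  # X_s^T X_s
    b = [Q[i][d + 1] for i in idx]             # X_s^T y
    beta = solve(A, b)
    # SSE = y^T y - beta^T (X_s^T y), valid at the least-squares solution
    sse = Q[d + 1][d + 1] - sum(bi * ci for bi, ci in zip(beta, b))
    cache[key] = (beta, sse)
    return cache[key]
```

Because each candidate model is evaluated from a small submatrix of Q rather than from the raw data, the per-iteration cost in this sketch depends on the number of selected variables, not on the number of rows.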

Original language: English (US)
Title of host publication: DTMBIO 2013 - Proceedings of the 7th International Workshop on Data and Text Mining in Biomedical Informatics, Co-located with CIKM 2013
Pages: 17-18
Number of pages: 2
State: Published - 2013
Event: 7th ACM International Workshop on Data and Text Mining in Biomedical Informatics, DTMBIO 2013, in Conjunction with the 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 - San Francisco, CA, United States
Duration: Nov 1 2013 to Nov 1 2013

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 7th ACM International Workshop on Data and Text Mining in Biomedical Informatics, DTMBIO 2013, in Conjunction with the 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
Country/Territory: United States
City: San Francisco, CA
Period: 11/1/13 to 11/1/13

Keywords

  • Algorithms
  • DBMS
  • MCMC
  • Microarray
  • Variable selection

ASJC Scopus subject areas

  • General Decision Sciences
  • General Business, Management and Accounting
