Spectral feature selection and its application in high dimensional gene expression studies

Zixing Wang, Peng Qiu, Wenlong Xu, Yin Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Many variable selection techniques have been proposed for clustering analysis of gene expression data. Motivated by spectral learning, we propose a new filtering method that uses the correlation between features and the eigenspace of the sample similarity matrix as the variable selection criterion. Spectral clustering theory states that a sample similarity matrix with q strongly connected components tends to have q piecewise almost-constant eigenvectors, each representing a specific partition of the sample space. Using the distance correlation metric, our proposed method, spectral correlation (Scorrelation), measures each feature's correlation with the top q eigenvectors of the sample similarity matrix and from this infers the feature's ability to differentiate the underlying clusters of samples. We applied the method to large-scale gene expression datasets. Compared to other filtering methods, it is more effective and provides better clustering results in terms of clustering error rate and the reliability of the selected features. The framework can be easily extended to other types of datasets to address clustering and classification problems.
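The scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian similarity kernel, the median-distance bandwidth, the use of the maximum over the q eigenvectors as the per-feature score, and all function names are assumptions made for illustration. Only the general recipe (build a sample similarity matrix, take its top q eigenvectors, score each feature by distance correlation with them) comes from the abstract.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute-difference matrices.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom <= 0.0:
        return 0.0  # a constant input carries no information
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def gaussian_similarity(X, sigma=None):
    """Sample-by-sample Gaussian similarity (the kernel is an assumption)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if sigma is None:
        # Assumed bandwidth heuristic: median pairwise distance.
        sigma = np.sqrt(np.median(d2[np.triu_indices_from(d2, k=1)]))
    return np.exp(-d2 / (2.0 * sigma**2))


def spectral_correlation_scores(X, q):
    """Score each column of X (samples x features) against the top q eigenvectors."""
    W = gaussian_similarity(X)
    _, eigvecs = np.linalg.eigh(W)  # eigenvalues are returned in ascending order
    top = eigvecs[:, -q:]           # eigenvectors of the q largest eigenvalues
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # Assumed aggregation: best match over the q eigenvectors.
        scores[j] = max(distance_correlation(X[:, j], top[:, k]) for k in range(q))
    return scores


# Synthetic example: 40 samples, 30 features, the first five separate two groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 30))
X[20:, :5] += 6.0
scores = spectral_correlation_scores(X, q=2)
ranked = np.argsort(scores)[::-1]  # feature indices, highest score first
print(ranked[:5])
```

In this toy setting the highest-scoring features would then be kept for clustering. How many features to keep, and how q is chosen, are not specified in the abstract and are left open here.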

Original language: English (US)
Title of host publication: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery
Pages: 314-320
Number of pages: 7
ISBN (Electronic): 9781450328944
State: Published - Sep 20 2014
Event: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 - Newport Beach, United States
Duration: Sep 20 2014 → Sep 23 2014

Publication series

Name: ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Other

Other: 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014
Country/Territory: United States
City: Newport Beach
Period: 9/20/14 → 9/23/14

Keywords

  • Clustering
  • Distance correlation
  • Spectral feature selection
  • Unsupervised

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • Software
  • Biomedical Engineering
