Spectral feature selection and its application in high dimensional gene expression studies

Zixing Wang; Peng Qiu; Wenlong Xu; Yin Liu

doi:10.1145/2649387.2649396

Bioinformatics & Computational Biology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Many variable selection techniques have been proposed for the clustering analysis of gene expression data. Motivated by spectral learning, we propose a new filtering method that uses the correlation between features and the eigenspace of the sample similarity matrix as the variable selection criterion. Spectral algorithms rest on the observation that a sample similarity matrix with q strongly connected components tends to have q approximately piecewise-constant eigenvectors, which together represent a specific partition of the sample space. Using the distance correlation metric, our proposed method, spectral correlation (Scorrelation), measures each feature's correlation with the top q eigenvectors of the sample similarity matrix and from this infers the feature's ability to differentiate the underlying clusters of samples. We applied the method to large-scale gene expression datasets. Compared to other filtering methods, it is more effective and yields better clustering results in terms of clustering error rate and the reliability of the selected features. The framework can be readily extended to other types of datasets to address clustering and classification problems.
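The idea in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the similarity kernel, the normalization, or how correlations with the q eigenvectors are combined. The Gaussian kernel, the median-distance bandwidth, the symmetric normalization, and the use of a single multivariate distance correlation against all q eigenvectors are assumptions made here, and the function names `distance_correlation`, `top_eigenvectors`, and `scorrelation_scores` are hypothetical.

```python
import numpy as np


def _centered_dist(z):
    """Double-centered pairwise Euclidean distance matrix of the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    d = np.sqrt(((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1))
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation between x (n or n x p) and y (n or n x q)."""
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()                           # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())  # product of distance std devs
    if denom <= 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def top_eigenvectors(X, q, sigma=None):
    """Top-q eigenvectors of a normalized Gaussian similarity matrix.

    Assumption: Gaussian kernel with median-distance bandwidth and
    D^{-1/2} W D^{-1/2} normalization; the paper may use something else.
    Builds an n x n x p array, so it is meant for small demonstrations.
    """
    X = np.asarray(X, dtype=float)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    if sigma is None:
        sigma = np.sqrt(np.median(sq[sq > 0]))
    W = np.exp(-sq / (2.0 * sigma ** 2))
    d = W.sum(axis=1)
    L = W / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    return vecs[:, -q:]              # columns for the q largest eigenvalues


def scorrelation_scores(X, q, sigma=None):
    """Score each column (feature) of X by its distance correlation with
    the top-q eigenvectors of the sample similarity matrix."""
    X = np.asarray(X, dtype=float)
    U = top_eigenvectors(X, q, sigma)
    return np.array([distance_correlation(X[:, j], U)
                     for j in range(X.shape[1])])
```

Given a samples-by-genes matrix `X`, `scorrelation_scores(X, q)` returns one score per gene; a filter would then keep the highest-scoring genes before clustering. How many genes to keep, and how to choose q, are left open here because the abstract does not say.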

Original language	English (US)
Title of host publication	ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher	Association for Computing Machinery
Pages	314-320
Number of pages	7
ISBN (Electronic)	9781450328944
DOIs	https://doi.org/10.1145/2649387.2649396
State	Published - Sep 20 2014
Event	5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 - Newport Beach, United States; Duration: Sep 20 2014 → Sep 23 2014

Publication series

Name	ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Other

Other	5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014
Country/Territory	United States
City	Newport Beach
Period	9/20/14 → 9/23/14

Keywords

Clustering
Distance correlation
Spectral feature selection
Unsupervised

ASJC Scopus subject areas

Health Informatics
Computer Science Applications
Software
Biomedical Engineering

Cite this

Wang, Z., Qiu, P., Xu, W., & Liu, Y. (2014). Spectral feature selection and its application in high dimensional gene expression studies. In ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 314-320). (ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics). Association for Computing Machinery. https://doi.org/10.1145/2649387.2649396

@inproceedings{b479a1cb37c64f308bf44a11dd33e6ec,

title = "Spectral feature selection and its application in high dimensional gene expression studies",

abstract = "Many variable selection techniques have been proposed for clustering analysis of gene expression data. Motivated by spectral learning, we propose a new filtering method that uses the correlation between features and the eigenspace of sample similarity matrix as the variable selection criteria. Spectral algorithm states that a sample similarity matrix with q strongly connected components tends to have q piecewise almost constant eigenvectors representing a specific partition of the sample space. Using distance correlation metric, our proposed method, spectral correlation (Scorrelation) measures features' correlation with the top q eigenvectors of sample similarity matrix and then infers their ability in differentiating the underlying clusters of samples. Our method has been applied to large-scale gene expression datasets. Compared to other filtering methods, our method is more effective and provides better clustering results in terms of clustering error rate and the reliability of the selected features. Our framework can be easily extended to other types of datasets for addressing clustering and classification problems.",

keywords = "Clustering, Distance correlation, Spectral feature selection, Unsupervised",

author = "Zixing Wang and Peng Qiu and Wenlong Xu and Yin Liu",

note = "Publisher Copyright: Copyright {\textcopyright} 2014 ACM.; 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014 ; Conference date: 20-09-2014 Through 23-09-2014",

year = "2014",

month = sep,

day = "20",

doi = "10.1145/2649387.2649396",

language = "English (US)",

series = "ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics",

publisher = "Association for Computing Machinery",

pages = "314--320",

booktitle = "ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics",

}

TY - GEN

T1 - Spectral feature selection and its application in high dimensional gene expression studies

AU - Wang, Zixing

AU - Qiu, Peng

AU - Xu, Wenlong

AU - Liu, Yin

PY - 2014/9/20

Y1 - 2014/9/20

N2 - Many variable selection techniques have been proposed for clustering analysis of gene expression data. Motivated by spectral learning, we propose a new filtering method that uses the correlation between features and the eigenspace of sample similarity matrix as the variable selection criteria. Spectral algorithm states that a sample similarity matrix with q strongly connected components tends to have q piecewise almost constant eigenvectors representing a specific partition of the sample space. Using distance correlation metric, our proposed method, spectral correlation (Scorrelation) measures features' correlation with the top q eigenvectors of sample similarity matrix and then infers their ability in differentiating the underlying clusters of samples. Our method has been applied to large-scale gene expression datasets. Compared to other filtering methods, our method is more effective and provides better clustering results in terms of clustering error rate and the reliability of the selected features. Our framework can be easily extended to other types of datasets for addressing clustering and classification problems.

KW - Clustering

KW - Distance correlation

KW - Spectral feature selection

KW - Unsupervised

UR - http://www.scopus.com/inward/record.url?scp=84920729285&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84920729285&partnerID=8YFLogxK

U2 - 10.1145/2649387.2649396

DO - 10.1145/2649387.2649396

M3 - Conference contribution

AN - SCOPUS:84920729285

T3 - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

SP - 314

EP - 320

BT - ACM BCB 2014 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

PB - Association for Computing Machinery

T2 - 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM BCB 2014

Y2 - 20 September 2014 through 23 September 2014

ER -
