Spectral clustering via sparse graph structure learning with application to proteomic signaling networks in cancer

Sayantan Banerjee, Rehan Akbani, Veerabhadran Baladandayuthapani

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Clustering methods for multivariate data exploiting the underlying geometry of the graphical structure between variables are presented. As opposed to standard approaches for graph clustering that assume known graph structures, the edge structure of the unknown graph is first estimated using sparse-regression-based approaches for sparse graph structure learning. Subsequently, graph clustering on the lower-dimensional projections of the graph is performed based on Laplacian embeddings using a penalized k-means approach, motivated by Dirichlet process mixture models in Bayesian nonparametrics. In contrast to standard algorithmic approaches for known graphs, the proposed method allows estimation and inference for both graph structure learning and clustering. More importantly, the arguments for Laplacian embeddings as suitable projections for graph clustering are formalized by providing theoretical support for the consistency of the eigenspace of the estimated graph Laplacians. Fast computational algorithms are proposed to scale the method to a large number of nodes. Extensive simulations are presented to compare the clustering performance with standard methods. The methods are applied to a novel pan-cancer proteomic data set, and protein networks and clusters are evaluated across multiple cancer types.
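
The three stages described in the abstract (sparse graph estimation, Laplacian embedding, penalized k-means) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which sparse regression, which Laplacian, or which penalty is used, so the code below assumes node-wise lasso regressions combined with an OR rule, the unnormalized Laplacian L = D - A, and a DP-means-style penalized k-means that opens a new cluster whenever a point is farther than a fixed penalty (in squared distance) from every current center. All function names (`lasso_cd`, `estimate_adjacency`, `laplacian_embedding`, `dp_means`) and the parameters `alpha`, `dim`, and `penalty` are hypothetical.

```python
import numpy as np


def lasso_cd(X, y, alpha, n_sweeps=100):
    """Coordinate-descent lasso for (1/2n)*||y - X b||^2 + alpha*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.astype(float).copy()  # equals y - X @ beta while beta is zero
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # remove coordinate j's contribution
            rho = X[:, j] @ resid
            # soft-thresholding update for coordinate j
            beta[j] = np.sign(rho) * max(abs(rho) - n * alpha, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]
    return beta


def estimate_adjacency(X, alpha):
    """Assumed graph estimator: regress each variable on all others with the
    lasso and join j and k if either regression selects the other (OR rule)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    p = Xs.shape[1]
    A = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = np.delete(np.arange(p), j)
        beta = lasso_cd(Xs[:, others], Xs[:, j], alpha)
        A[j, others] = beta != 0.0
    return (A | A.T).astype(float)


def laplacian_embedding(A, dim):
    """Rows are node coordinates from the `dim` eigenvectors of the
    unnormalized Laplacian L = D - A with the smallest eigenvalues."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return vecs[:, :dim]


def dp_means(Y, penalty, max_iter=100):
    """Penalized k-means: a point whose squared distance to every center
    exceeds `penalty` starts a new cluster, so k is not fixed in advance."""
    n = Y.shape[0]
    centers = [Y.mean(axis=0)]
    labels = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        old = labels.copy()
        for i in range(n):
            d2 = np.array([np.sum((Y[i] - c) ** 2) for c in centers])
            k = int(np.argmin(d2))
            if d2[k] > penalty:
                centers.append(Y[i].copy())
                k = len(centers) - 1
            labels[i] = k
        used = np.unique(labels)  # drop clusters that ended up empty
        centers = [Y[labels == k].mean(axis=0) for k in used]
        labels = np.searchsorted(used, labels)  # relabel to 0..K-1
        if np.array_equal(labels, old):
            break
    return labels, np.array(centers)


def cluster_variables(X, alpha, dim, penalty):
    """Full sketch: estimate the graph, embed its nodes, cluster the embedding."""
    A_hat = estimate_adjacency(X, alpha)
    return dp_means(laplacian_embedding(A_hat, dim), penalty)
```

Given an n-by-p data matrix `X` with variables in columns, `cluster_variables(X, alpha, dim, penalty)` returns one cluster label per variable together with the cluster centers in the embedded space. In this sketch a larger `penalty` yields fewer clusters and a larger `alpha` yields a sparser estimated graph; how these tuning parameters are chosen in the paper is not stated in the abstract.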

Original language: English (US)
Pages (from-to): 46-69
Number of pages: 24
Journal: Computational Statistics and Data Analysis
Volume: 132
State: Published - Apr 2019

Keywords

  • Graph clustering
  • Graph structure learning
  • Proteomic data
  • Spectral clustering

ASJC Scopus subject areas

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics

MD Anderson CCSG core facilities

  • Bioinformatics Shared Resource
  • Functional Proteomics Reverse Phase Protein Array Core
