Explaining Gene Expression Using Twenty-One MicroRNAs

Amir Asiaee, Zachary B. Abrams, Samantha Nakayiza, Deepa Sampath, Kevin R Coombes

Research output: Contribution to journalArticlepeer-review

Abstract

The transcriptome of a tumor contains detailed information about the disease. Although advances in sequencing technologies have generated larger data sets, there are still many questions about exactly how the transcriptome is regulated. One class of regulatory elements consists of microRNAs (or miRs), many of which are known to be associated with cancer. To better understand the relationships between miRs and cancers, we analyzed ∼9000 samples from 32 cancer types studied in The Cancer Genome Atlas. Our feature reduction algorithm found evidence for 21 biologically interpretable clusters of miRs, many of which were statistically associated with a specific type of cancer. Moreover, the clusters contain sufficient information to distinguish between most types of cancer. We then used linear models to measure, genome-wide, how much variation in gene expression could be explained by the 21 average expression values ("scores") of the clusters. Based on the ∼20,000 per-gene R2 values, we found that (1) mean differences between tissues of origin explain about 36% of variation; (2) the 21 miR cluster scores explain about 30% of the variation; and (3) combining tissue type with the miR scores explained about 56% of the total genome-wide variation in gene expression. Our analysis of poorly explained genes shows that they are enriched for olfactory receptor processes, sensory perception, and nervous system processing, which are necessary to receive and interpret signals from outside the organism. Therefore, it is reasonable for those genes to be always active and not get downregulated by miRs. In contrast, highly explained genes are characterized by genes translating to proteins necessary for transport, plasma membrane, or metabolic processes that are heavily regulated processes inside the cell. Other genetic regulatory elements such as transcription factors and methylation might help explain some of the remaining variation in gene expression.

Original languageEnglish (US)
Pages (from-to)1157-1170
Number of pages14
JournalJournal of Computational Biology
Volume27
Issue number7
DOIs
StatePublished - Jul 2020

Keywords

  • feature extraction
  • gene expression prediction
  • gene regulation
  • microRNA
  • mRNA

ASJC Scopus subject areas

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics

Fingerprint Dive into the research topics of 'Explaining Gene Expression Using Twenty-One MicroRNAs'. Together they form a unique fingerprint.

Cite this