Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data

Sakrapee Paisitkriangkrai, Kelly Quek, Eva Nievergall, Anissa Jabbour, Andrew Zannettino, Chung Hoow Kok

Research output: Contribution to journalArticlepeer-review

Abstract

Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy to use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes, without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify 2 distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM) samples-enriched cluster and an acute myeloid leukemia (AML) samples-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM, when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using a 272 primary glioma sample RNA-seq dataset, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool to identify recurrent fusion genes. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets, and may lead to the discovery of new disease subgroups and potentially new driver genes, for which, targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse.

Original languageEnglish (US)
Pages (from-to)1217-1229
Number of pages13
JournalMolecular Genetics and Genomics
Volume293
Issue number5
DOIs
StatePublished - Oct 1 2018

Keywords

  • Cancer
  • Leukemia
  • Oncology
  • RNA-sequencing
  • Recurrent fusion genes

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics

Fingerprint

Dive into the research topics of 'Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data'. Together they form a unique fingerprint.

Cite this