Abstract
Biological sequence motifs are short nucleotide or amino acid sequences with biological significance. They attract scientific interest because they are usually highly conserved and have structural and regulatory implications. In this chapter, the authors show practical applications of motif data with examples, followed by a review of the relevant algorithms, techniques, and tools. They address the nature of motifs and describe several methods for de novo motif discovery, covering algorithms based on Gibbs sampling, expectation maximization, Bayesian inference, covariance models, and discriminative learning. The authors present the tools and their requirements in order to weigh their individual benefits and challenges. Because interpreting a large set of results can pose significant challenges, they also discuss several methods for handling data, ranging from visualization to integration into pipelines and curated databases.
Original language | English (US) |
---|---|
Title of host publication | Big Data Analytics in Bioinformatics and Healthcare |
Publisher | IGI Global |
Pages | 86-116 |
Number of pages | 31 |
ISBN (Electronic) | 9781466666122 |
ISBN (Print) | 1466666110, 9781466666115 |
State | Published - Oct 31 2014 |
Externally published | Yes |
ASJC Scopus subject areas
- General Computer Science
- General Medicine
- General Health Professions