System for Quality-Assured Data Analysis: Flexible, reproducible scientific workflows

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

The reproducibility of scientific processes is one of the paramount problems of bioinformatics, an engineering problem that must be addressed to perform good research. The System for Quality-Assured Data Analysis (SyQADA), described here, seeks to address reproducibility by managing many of the details of procedural bookkeeping in bioinformatics in as simple and transparent a manner as possible. SyQADA has been used by persons with backgrounds ranging from expert programmer to Unix novice to perform and repeat dozens of diverse bioinformatics workflows on tens of thousands of samples, consuming over 80 CPU-months of computing on over 300,000 individual tasks across scores of projects on laptops, computer servers, and computing clusters. SyQADA is especially well suited for the paired-sample analyses found in cancer tumor-normal studies. SyQADA executable source code, documentation, tutorial examples, and workflows used in our lab are available from http://scheet.org/software.html.

Original language: English (US)
Pages (from-to): 227-237
Number of pages: 11
Journal: Genetic epidemiology
Volume: 43
Issue number: 2
State: Published - Mar 2019

Keywords

  • bioinformatics
  • cancer genomics
  • computer software
  • reproducibility
  • workflow

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)
