Bioinformatics workflow management system

A bioinformatics workflow management system is a specialized form of workflow management system designed to compose and execute a series of computational or data manipulation steps, known as a workflow, in a bioinformatics application.

There are currently many different workflow systems. Some were developed more generally as scientific workflow systems for use by scientists from many different disciplines, such as astronomy and earth science. All such systems are based on an abstract representation of how a computation proceeds, in the form of a directed graph in which each node represents a task to be executed and each edge represents either data flow or an execution dependency between tasks. Each system typically provides a visual front end that allows the user to build and modify complex applications with little or no programming expertise.
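The directed-graph representation described above can be illustrated with a minimal sketch. It does not reproduce the internals of any particular workflow system. The task names (`trim_reads`, `align_reads`, and so on) and the `run_workflow` helper are hypothetical, and the tasks are placeholder functions that return strings rather than real analysis results. The sketch assumes Python 3.9 or later, which provides `graphlib.TopologicalSorter` in the standard library.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow graph: each key is a task (node) and each value is the
# set of tasks it depends on (incoming edges).
dependencies = {
    "trim_reads": set(),
    "align_reads": {"trim_reads"},
    "call_variants": {"align_reads"},
    "quality_report": {"trim_reads"},
}

# Placeholder task implementations; each receives the results of its
# dependencies as a dict and returns its own result.
tasks = {
    "trim_reads": lambda inputs: "trimmed",
    "align_reads": lambda inputs: f"aligned({inputs['trim_reads']})",
    "call_variants": lambda inputs: f"variants({inputs['align_reads']})",
    "quality_report": lambda inputs: f"report({inputs['trim_reads']})",
}


def run_workflow(dependencies, tasks):
    """Run every task after all of its dependencies, passing results along."""
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies[name]}
        results[name] = tasks[name](inputs)
    return results


results = run_workflow(dependencies, tasks)
print(results["call_variants"])  # variants(aligned(trimmed))
```

Here the edges carry both meanings mentioned above: they fix the execution order and they determine which upstream results are passed to each task. A production system would typically add features this sketch omits, such as parallel execution of independent tasks, caching, and error recovery.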

Examples
In alphabetical order, some examples of bioinformatics workflow management systems include:
 * Anduril: bioinformatics and image analysis
 * BioBIKE: a Web-based, programmable, integrated biological knowledge base
 * CLC bio: a bioinformatics analysis and workflow management platform from QIAGEN Digital Insights
 * Clone Manager, from Sci-Ed
 * Cuneiform: a functional workflow language for large-scale data analysis
 * Discovery Net: one of the earliest examples of a scientific workflow system, later commercialized as InforSense, which was then acquired by IDBS
 * Galaxy: initially targeted at genomics
 * GenePattern: a scientific workflow system that provides access to hundreds of genomic analysis tools
 * KNIME: the Konstanz Information Miner
 * OnlineHPC: an online workflow designer based on Taverna
 * Playbook Workflow Builder: a flexible workflow builder for bioinformatics applications based on API services, initially developed for the NIH CFDE Common Fund program
 * UGENE: provides a workflow management system that is installed on a local computer
 * VisTrails

Comparisons between workflow systems
With a large number of bioinformatics workflow systems to choose from, it is difficult to understand and compare their features. Little work has been done to evaluate and compare the systems from a bioinformatician's perspective, particularly with respect to the data types they can handle, the built-in functionality they provide to the user, and their performance or usability. Examples of existing comparisons include:
 * The paper "Scientific workflow systems - can one size fit all?" provides a high-level framework for comparing workflow systems based on their control flow and data flow properties. The systems compared include Discovery Net, Taverna, Triana, and Kepler, as well as YAWL and BPEL.
 * The paper "Meta-workflows: pattern-based interoperability between Galaxy and Taverna" provides a more user-oriented comparison between Taverna and Galaxy in the context of enabling interoperability between the two systems.
 * The infrastructure paper "Delivering ICT Infrastructure for Biomedical Research" compares two workflow systems, Anduril and Chipster, in terms of their infrastructure requirements in a cloud-delivery model.
 * The paper "A review of bioinformatic pipeline frameworks" attempts to classify workflow management systems along three dimensions: "using an implicit or explicit syntax, using a configuration, convention or class-based design paradigm and offering a command line or workbench interface".