Tool:NextflowWorkbench: An integrated development environment for workflows
8.6 years ago
fac2003 ▴ 170

We are developing the NextflowWorkbench, an integrated development environment for building workflows that you can run with Nextflow. Here's a snapshot of a very simple workflow built with this workbench:

Workflow illustration

The workflow refers to two Processes called splitSequence and reverse, which are defined as follows:

ProcessSplitSequence, ProcessReverse

To a first approximation, you can think of the NextflowWorkbench as an integrated development environment (IDE) for Nextflow. However, the workbench exposes a language that differs somewhat from Nextflow. In some instances, we aimed to simplify the language and make it more consistent and easier to learn (see the documentation: PDF, Tablet).

In other instances, we added features that we felt were important but are not easy to achieve with Nextflow. For instance, NextflowWorkbench makes it possible to reuse Process definitions in several workflows without having to copy, paste, and rename channels. Another extension is explicit data types, which we think help in developing and maintaining sound pipelines. Despite these simplifications and extensions, the workbench produces plain Nextflow scripts.
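To make the idea of reusable, typed Process definitions concrete, here is a minimal sketch in Python. It is not NextflowWorkbench syntax and not the Nextflow that the workbench generates. The `Process` class, `check_chain`, `run_workflow`, and the behavior given to `splitSequence` and `reverse` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Process:
    """A reusable step with declared input and output types (hypothetical model)."""
    name: str
    input_type: type
    output_type: type
    run: Callable


def check_chain(processes: List[Process]) -> None:
    """Raise TypeError if a step's output type differs from the next step's input type."""
    for upstream, downstream in zip(processes, processes[1:]):
        if upstream.output_type is not downstream.input_type:
            raise TypeError(
                f"{upstream.name} emits {upstream.output_type.__name__}, "
                f"but {downstream.name} expects {downstream.input_type.__name__}"
            )


def run_workflow(processes: List[Process], value):
    """Validate the chain, then feed the value through each process in order."""
    check_chain(processes)
    for process in processes:
        value = process.run(value)
    return value


# Assumed behavior, for illustration only: split text into lines, reverse each line.
split_sequence = Process("splitSequence", str, list, lambda text: text.splitlines())
reverse = Process("reverse", list, list, lambda items: [item[::-1] for item in items])

# Two workflows share the same definitions; nothing is copied or renamed.
workflow_a = [split_sequence, reverse]
workflow_b = [split_sequence]
```

In this sketch, wiring the steps in the wrong order (`[reverse, split_sequence]`) fails in `check_chain` before anything runs, which is the kind of early error that explicit data types are meant to provide.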

NextflowWorkbench supports closures and makes it possible to run workflows directly from within the IDE (see this post about new features in version 1.1). The tool is built with Language Workbench Technology, which makes it easy to extend the languages supported in the workbench for different applications (see [1-2] and references therein).

1. Simi M, Campagne F. Composable languages for bioinformatics: the NYoSh experiment. PeerJ. 2014;2:e241. Available from: https://peerj.com/articles/241/

2. Benson VM, Campagne F. Language workbench user interfaces for data analysis. PeerJ. 2015 Jan;3:e800. Available from: https://peerj.com/articles/800/

Workbench Workflow Nextflow LWT • 3.0k views

In response to Istvan's comment, I am adding a more meaningful example. We use this example for teaching NextflowWorkbench. We have now added full support for Docker and can therefore take advantage of the many tools for which a Docker image exists. In this example, we use an image with the SRA tools, one with fastqc, and one with Kallisto and an index of the human transcriptome. This pipeline runs directly on a user's laptop, as long as it has enough memory to run Kallisto (more than 4 GB recommended).

The tool is fully integrated with Git and Subversion. The bars on the left indicate changes relative to the last version of the pipeline committed to source control.

KallistoAnalysisPipeline


Preprint now available: NextflowWorkbench: Reproducible and Reusable Workflows for Beginners and Experts

Jason P Kurs, Manuele Simi, Fabien Campagne doi: http://dx.doi.org/10.1101/041236

http://biorxiv.org/content/early/2016/02/24/041236

8.6 years ago

The problem with your example workflow is that it is not useful at all, which makes it very difficult to connect to what you are trying to do. Why would people spend time figuring out your tool when your example is not useful to them? This is actually a very common problem with scientific software of this kind: the creators assume that just by reading a trivial example, the readers will themselves come up with all kinds of super helpful applications of it.

It is hard to figure out what your example is supposed to be doing. It lacks the most important aspect of all tools: documenting what the purpose of each step is and describing what the input and output should be. And in your second example, why is the name finaleres.txt written twice? That does not seem right, and the typo in it does not help either. It reminds us all of that time when we accidentally wrote finalres.txt in one place and finaleres.txt in the other and, as a result, the world as we know it almost ended.

I don't want to sound too negative, and I am sorry if it comes across that way. It is just way too common to see "hey look people, here is this cool tool" followed by an example so simple that, on its own, it provides no proof that the tool is actually useful.


I appreciate your point of view. We are in the process of building real analysis pipelines, but these need a lot of tools to be installed and are not easy to distribute at this time. The next release will solve this problem by making it possible to get resources from a Docker container and have them automatically installed and built (e.g., genome aligner indices). In general, this tool will be useful if you need to develop pipelines like those we have described for GobyWeb [1].

As for the typo, it appears identically in both places because the tool keeps only one reference to the name. Changing the name at the top to fix the typo fixes every place where that name is used in the script.
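The single-reference behavior can be sketched in a few lines of Python. This is only an illustration of the idea, not how the workbench is implemented: the `FileName` class, the `render` helper, the two shell commands, and the corrected spelling `finalres.txt` are all assumptions.

```python
class FileName:
    """A name declared once; every use site holds a reference to this object."""

    def __init__(self, name: str) -> None:
        self.name = name


def render(lines):
    """Turn (prefix, FileName) pairs into plain script text."""
    return [prefix + ref.name for prefix, ref in lines]


# The name is declared once, typo included.
final_result = FileName("finaleres.txt")

# Both script lines point at the same declaration rather than storing a copy.
# The commands themselves are hypothetical.
script = [
    ("cat parts_* > ", final_result),
    ("wc -l ", final_result),
]

before = render(script)

# Fixing the declaration fixes every use at once.
final_result.name = "finalres.txt"
after = render(script)
```

Because the script stores references and not strings, the two occurrences cannot drift apart the way hand-typed file names can.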

  1. Dorff KC, Chambwe N, Zeno Z, Simi M, Shaknovich R, Campagne F. GobyWeb: Simplified Management and Analysis of Gene Expression and DNA Methylation Sequencing Data. Provart NJ, editor. PLoS One [Internet]. Public Library of Science; 2013 Jan [cited 2013 Nov 19];8(7):e69666. Available from: http://dx.plos.org/10.1371/journal.pone.0069666