Chapter 2 Import and Configure Workflows

This chapter describes how to import publicly available WDL workflows and their associated configuration files. In addition to covering the growing Dockstore community — which implements and helps drive forward GA4GH standards — we cover the Broad Methods Repository, which provides several convenient features if this is your first time developing WDL Workflows in the Cloud (see Chapter 3 for more details).

Learning Objectives

  • Understand how to import workflows from Dockstore and the Broad Methods Repository
  • Understand the role that JSON files play in configuring imported workflows, and how to select a configuration for your workflow

2.1 Importing workflows into AnVIL-powered-by-Terra

Chapter 1’s review exercise covers how to run a workflow that was included in a cloned AnVIL-powered-by-Terra workspace. But what if you want to add a new workflow into your workspace, without writing it yourself?

There are three ways to import workflows into an AnVIL-powered-by-Terra workspace:

  • Dockstore
  • Broad Methods Repository
  • Other online collections

2.1.1 Dockstore

Dockstore is an app store for bioinformatics tools, including workflows. It provides an easy interface for finding and running workflows on AnVIL-powered-by-Terra.

Complete the following steps to practice importing a workflow from Dockstore:

Use the instructions in Step 1 of “How to import a workflow and its parameter file from Dockstore into Terra” to find the Optimus workflow (search for WARP/Optimus).

Then, navigate to the Versions tab and click on the version you want to export to AnVIL-powered-by-Terra:

Finally, follow the instructions in Step 2 to export the workflow to your cloned version of the WDL-puzzles workspace.

2.1.2 Broad Methods Repository

The Broad Methods Repository contains a collection of pre-written workflows that can be run on AnVIL-powered-by-Terra. Notably, the Repository provides a web-based WDL editor, making it easy to create and modify workflows in the Cloud. Follow these instructions to open the Broad Methods Repository, find an interesting workflow, and export it to your workspace.

2.1.3 Other online collections

WDL is a widely-used tool, and WDL workflows are publicly available through other sources beyond Dockstore and the Broad Methods Repository. These include WARP, ENCODE, the Genome Analysis Toolkit, and GitHub. Note that exporting workflows from these sources is a two-step process: first, copy the workflow’s script to a new method in the Broad Methods Repository (see Chapter 3) or a Dockstore repository. Then, export the workflow to an AnVIL-powered-by-Terra workspace.

2.2 Configuring workflows with JSON files

To run a workflow in AnVIL-powered-by-Terra, you need to specify its inputs and outputs. You can do this by hand, filling in the workflow’s configuration each time you run it. But if your workflow’s inputs are reusable across workflow submissions – for example, if you always expect to run the workflow on a column named “bam” in your data table – it’s faster and less error-prone to pre-configure your workflow.

To pre-configure a workflow, you can upload a file that specifies the default values for some set of your workflow’s inputs. JSON is the most common format for this configuration file. AnVIL-powered-by-Terra will automatically fill these values into the workflow configuration form. If necessary. you can always change these values when running the workflow.

2.2.1 Optional practice: pre-configure a workflow

To pre-configure a workflow that you have imported to your workspace, upload a JSON file.

An easy way to do this is to download a JSON file from an existing workflow, edit it, and upload the edited version to the workflow that you’re trying to configure.

To practice this, open the easy-puzzle-solved workflow in your clone of the WDL-puzzles workspace and download the JSON file:

Open the JSON file in a text editor. It should look something like this, if the last time you ran it your input was “Marie Curie”:

{"HelloInput.name":"Marie Curie"}

Edit the file to set the input name to your name, instead of Marie Curie. Save the file, go back to your workflow and upload the edited JSON file by clicking the “Drag or click to upload json” link:

The input for the “name” variable should now be set to your name. Be sure to click Save to ensure that this change persists the next time you open the workflow.

2.2.2 A few notes on syntax

If you’re configuring multiple inputs, specify them like this:

{"taskName.variableName1":"attribute1", "taskName.variableName2":"attribute2"}

If your inputs will come from a data table, specify them like this:

{"taskName.variableName":"${this.columnName}"}

2.2.3 Importing a pre-configured workflow

In some cases, the workflow’s authors have provided a configuration file, and you can export this file to AnVIL-powered-by-Terra along with the workflow, rather than uploading the JSON file yourself. See How to import a workflow and its parameter file from Dockstore into Terra and Create, edit, and share a new workflow (steps 1.6-1.7) for instructions on how to do this for a workflow in Dockstore or the Broad Methods Repository.

2.2.4 More information

To learn more about how to configure a workflow’s inputs through the UI, see How to configure workflow inputs. For more detail on how to use JSON files to pre-configure a workflow, read Getting workflows up and running faster with JSONs.

2.3 Further Reading

If you’re interested in a deeper dive into this chapter’s topics, check out these optional articles: