HTTP Inputs

Overview

For shared filesystem and Google Pipelines API (PAPI) version 2 backends, Cromwell can support workflow inputs specified by HTTP and HTTPS URLs. Note that this is not true "filesystem" support for HTTP URLs: even if the inputs to a workflow are specified by HTTP URLs, the outputs of steps will still appear at local or GCS paths, not at HTTP URLs.

Configuration

Cromwell's default configuration defines an instance of the HTTP filesystem named http. The HTTP filesystem itself requires no additional configuration, so adding HTTP filesystem support to a backend is as simple as adding a reference to this filesystem within the backend's filesystems stanza. For example, Cromwell's default Local shared filesystem backend is configured like this (a PAPI version 2 backend would be configured in a similar way):

backend {
  default = "Local"
  providers {
    Local {
      ...
      config {
        filesystems {
          local {
            ...
          }
          http { }
        }
      }
      ...
    }
    ...
  }
}
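For comparison, a PAPI version 2 backend might reference the http filesystem as sketched below. This is an illustration only: the provider name PAPIv2 is a hypothetical label, the gcs stanza is assumed to be the backend's existing Google Cloud Storage filesystem, and all elided settings (...) are left to the actual backend configuration.

```hocon
backend {
  providers {
    # "PAPIv2" is a hypothetical provider name; use the name of your own PAPI v2 backend.
    PAPIv2 {
      ...
      config {
        filesystems {
          # Assumed: the GCS filesystem the backend already uses.
          gcs {
            ...
          }
          # Reference to the http filesystem defined in Cromwell's default configuration.
          http { }
        }
      }
      ...
    }
  }
}
```

As with the Local backend, the empty http { } stanza is only a reference; the filesystem itself needs no further settings.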

To turn off the http filesystem in the default Local backend, set the following Java property:

-Dbackend.providers.Local.config.filesystems.http.enabled=false
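Java system properties override values at the same path in a HOCON configuration, so the same setting can presumably be made in a custom configuration file instead. The sketch below assumes a user-supplied file (hypothetically named my.conf) that is passed to Cromwell with -Dconfig.file; it has not been verified against a particular Cromwell version.

```hocon
# my.conf (hypothetical file name)
# Load Cromwell's default configuration first, then override a single key.
include required(classpath("application"))

# Same key path as the Java property above.
backend.providers.Local.config.filesystems.http.enabled = false
```

Keeping the override in a file may be convenient when several settings are changed together, while the Java property is handy for a one-off run.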

Caveats

Using HTTP inputs in Cromwell can produce some unexpected behavior:

- Files specified by HTTP URIs will be renamed locally, so programs that rely on file extensions or other file naming conventions may not function properly.
- Files located in the same remote HTTP-defined directory will not be colocated locally. This can cause problems if a program expects an index file (e.g. .fai) to appear in the same directory as the associated data file (e.g. .fa) and the index location is not specified explicitly.