Customize tasks
Runtime attributes can be specified in one of three ways:
- Within a task you can specify runtime attributes to customize the environment for the call.
- Default runtime attributes for all tasks can be specified in Workflow Options.
- In WDL 1.1 and later, runtime attributes can be overridden in the workflow input JSON file.
Task Example
task my_task {
command {
echo "Hello World!"
}
runtime {
docker: "ubuntu:latest"
memory: "4G"
cpu: "3"
zones: "us-central1-c us-central1-b"
disks: "/mnt/mnt1 3 SSD, /mnt/mnt2 500 HDD"
}
}
workflow my_wf {
call my_task
}
Workflow Input Override Example
{
"my_wf.my_task.runtime.memory": "12GB"
}
Recognized Runtime attributes and Backends
Cromwell recognizes certain runtime attributes and has the ability to format these for some Backends. See the table below for common attributes that apply to most backends.
| Runtime Attribute | Local | Google Cloud | TES | AWS Batch | HPC |
|---|---|---|---|---|---|
| cpu | | ✅ | | ✅ | cpu |
| memory | | ✅ | | ✅ | memory_mb / memory_gb |
| disks | | ✅ | ⚠️ Note 1 | ⚠️ Note 2 | ℹ️ Note 3 |
| disk | | | ✅ | | |
| docker | ✅ | ✅ | | ✅ | docker ℹ️ Note 3 |
| maxRetries | ✅ | ✅ | | ✅ | ℹ️ Note 3 |
| continueOnReturnCode | ✅ | ✅ | | ✅ | ℹ️ Note 3 |
| failOnStderr | ✅ | ✅ | | ✅ | ℹ️ Note 3 |
| gpu | ✅ | ✅ | ✅ | ✅ | ℹ️ Note 4 |
Note 1
Partial support. See TES documentation for details.
Note 2
Partial support. See disks for details.
Note 3
The HPC Shared Filesystem backend (SFS) is fully configurable and any number of attributes can be exposed. Cromwell recognizes some of these attributes (cpu, memory and docker) and parses them into the attribute listed in the table, which can be used within the HPC backend configuration.
Note 4
Supported starting in WDL 1.1
Google Cloud Specific Attributes
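There are a number of additional runtime attributes that apply to the Google Cloud Platform:
- zones
- predefinedMachineType
- preemptible
- bootDiskSizeGb
- noAddress
- gpuCount and gpuType
- cpuPlatform
AWS Specific Attributes
- awsBatchRetryAttempts
- ulimits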
Expression support
Runtime attribute values are interpreted as expressions. This means that you can express the value of a runtime attribute as a function of one of the task's inputs.
For example:
task runtime_test {
String ubuntu_tag
Int memory_gb
command {
./my_binary
}
runtime {
docker: "ubuntu:" + ubuntu_tag
memory: memory_gb + "GB"
}
}
HPC backends may define other configurable runtime attributes beyond the five listed. To find out more, visit the SunGridEngine tutorial.
Default Values
Default values for runtime attributes can be specified via Workflow Options.
For example, consider this WDL file:
task first {
command { ... }
}
task second {
command {...}
runtime {
docker: "my_docker_image"
}
}
workflow w {
call first
call second
}
And this set of workflow options:
{
"default_runtime_attributes": {
"docker": "ubuntu:latest",
"zones": "us-central1-c us-central1-b"
}
}
Then, these values for docker and zones will be used for any task that does not explicitly override them in the WDL file. As a result, the effective runtime for task first is:
{
"docker": "ubuntu:latest",
"zones": "us-central1-c us-central1-b"
}
And the effective runtime for task second is:
{
"docker": "my_docker_image",
"zones": "us-central1-c us-central1-b"
}
Note how for task second the WDL value for docker is used instead of the default provided in the workflow options.
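The precedence described above can be modelled as a simple dictionary overlay. The following Python sketch is only an illustration of that rule, not Cromwell's implementation; the function name effective_runtime is hypothetical.

```python
def effective_runtime(defaults: dict, task_runtime: dict) -> dict:
    """Overlay a task's own runtime attributes on the workflow-option defaults."""
    merged = dict(defaults)      # start from default_runtime_attributes
    merged.update(task_runtime)  # values set in the WDL task take precedence
    return merged


# The defaults from the workflow options example above.
defaults = {"docker": "ubuntu:latest", "zones": "us-central1-c us-central1-b"}

first = effective_runtime(defaults, {})                               # task first has no runtime section
second = effective_runtime(defaults, {"docker": "my_docker_image"})   # task second sets docker

print(first)
print(second)
```

Under this model, task first inherits both defaults, while task second keeps its own docker value and inherits only zones.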
Runtime Attribute Descriptions
cpu
Default: 1
The cpu runtime attribute represents the number of cores that a job requires; however, each backend may interpret this differently:
- In Google Cloud: this is interpreted as "the minimum number of cores to use."
- In HPCs (SFS): this is configurable, but usually a reservation and/or limit of number of cores.
Example
runtime {
cpu: 2
}
memory
Default: "2G"
Memory is the amount of RAM that should be allocated to a task; however, each backend may interpret this differently:
- Google Cloud: The minimum amount of RAM to use.
- SFS: Configurable, but usually a reservation and/or limit of memory.
The memory size is specified as an amount and units of memory, for example "4G":
runtime {
memory: "4G"
}
Within the SFS backend, you can additionally specify memory_mb or memory_gb as runtime attributes within the configuration. More information can be found here.
disks
This attribute specifies volumes that will be mounted to the VM for your job. These volumes are where you can read and write files that will be used by the commands within your workflow.
They are specified as a comma-separated list of disks. Each disk is a space-separated triplet (e.g. local-disk 10 SSD) consisting of:
- Mount point (absolute path), or local-disk to reference the mount point where Google Cloud will localize files and where the task's current working directory will be
- Disk size in GB (rounded to the next 375 GB for LOCAL)
- Disk type. One of: "LOCAL", "SSD", or "HDD" (documentation)
All tasks launched on Google Cloud must have a local-disk. If one is not specified in the runtime section of the task, then a default of local-disk 10 SSD will be used. The local-disk will be mounted to /cromwell_root.
For the AWS Batch backend, the disk volume is managed by AWS EBS with autoscaling capabilities. As such, the Disk size and disk type will be ignored. If provided, the mount point will be verified at runtime.
The Disk type must be one of "LOCAL", "SSD", or "HDD". When set to "LOCAL", the size of the drive is constrained to 375 GB intervals so intermediate values will be rounded up to the next 375 GB. All disks are set to auto-delete after the job completes.
Example 1: Changing the Localization Disk
runtime {
disks: "local-disk 100 SSD"
}
Example 2: Mounting an Additional Two Disks
runtime {
disks: "/mnt/my_mnt 3 SSD, /mnt/my_mnt2 500 HDD"
}
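To make the disks format concrete, here is a minimal Python sketch that parses a disks string according to the rules described in this section: comma-separated triplets, LOCAL sizes rounded up to the next 375 GB, and a default local-disk 10 SSD when none is given. The names Disk and parse_disks are hypothetical, and the position of the added default in the list is an arbitrary choice; this is not Cromwell's own parser.

```python
import math
from typing import List, NamedTuple

LOCAL_INCREMENT_GB = 375            # LOCAL disks come in 375 GB steps
DEFAULT_LOCAL_DISK = "local-disk 10 SSD"
DISK_TYPES = ("LOCAL", "SSD", "HDD")


class Disk(NamedTuple):
    mount_point: str
    size_gb: int
    disk_type: str


def parse_disks(spec: str) -> List[Disk]:
    """Parse a comma-separated list of 'mount size type' triplets."""
    disks = []
    for entry in spec.split(","):
        parts = entry.split()
        if len(parts) != 3:
            raise ValueError(f"expected 'mount size type', got {entry!r}")
        mount_point, size, disk_type = parts
        if mount_point != "local-disk" and not mount_point.startswith("/"):
            raise ValueError(f"mount point must be absolute or local-disk: {mount_point!r}")
        if disk_type not in DISK_TYPES:
            raise ValueError(f"unknown disk type: {disk_type!r}")
        size_gb = int(size)
        if disk_type == "LOCAL":
            # Round up to the next multiple of 375 GB.
            size_gb = math.ceil(size_gb / LOCAL_INCREMENT_GB) * LOCAL_INCREMENT_GB
        disks.append(Disk(mount_point, size_gb, disk_type))
    if not any(d.mount_point == "local-disk" for d in disks):
        # No local-disk given: add the documented default (appended here by assumption).
        disks.extend(parse_disks(DEFAULT_LOCAL_DISK))
    return disks


for disk in parse_disks("/mnt/my_mnt 3 SSD, /mnt/my_mnt2 500 HDD"):
    print(disk)
```

In this sketch, the string from Example 2 yields three disks, because the default local-disk is added alongside the two explicit mounts.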
disk
Specific to the TES backend, sets the disk_gb resource.
runtime {
disk: "25 GB"
}
docker
When specified, Cromwell will run your task within the specified Docker image.
runtime {
docker: "ubuntu:latest"
}
- Local: Cromwell will automatically run the docker container.
- SFS: When a docker container exists within a task, the submit-docker method is called. See the Getting started with containers guide for more information.
- GCP: This attribute is mandatory when submitting tasks to Google Cloud.
- AWS Batch: This attribute is mandatory when submitting tasks to AWS Batch.
maxRetries
Default: 0
This retry option provides a way to handle transient job failures. For example, if a task fails due to a timeout when accessing an external service, this option re-runs the failed task without re-running the entire workflow. It takes an Int that indicates the maximum number of times Cromwell should retry a failed task. The retry applies to jobs that fail while executing the task command. It is a simple retry of the same job and applies only to transient failures; for example, it cannot be used to increase memory in out-of-memory situations.
If using the Google backend, note that the maxRetries count is independent of the preemptible count. For example, the task below can be retried up to 6 times if it's preempted 3 times AND the command execution fails 3 times.
runtime {
preemptible: 3
maxRetries: 3
}
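The arithmetic behind "up to 6 times" can be sketched as two separate counters. This Python snippet is a simplified model of the independence described above, with hypothetical function names; it does not reflect how Cromwell tracks attempts internally.

```python
def max_total_retries(preemptible: int, max_retries: int) -> int:
    """Worst case: every preemption and every command failure is used up."""
    return preemptible + max_retries


def remaining_retries(preemptible: int, max_retries: int,
                      preemptions_seen: int, command_failures_seen: int) -> dict:
    """Retries left in each budget; a preemption never consumes a maxRetries slot."""
    return {
        "preemption": max(preemptible - preemptions_seen, 0),
        "command_failure": max(max_retries - command_failures_seen, 0),
    }


print(max_total_retries(3, 3))
# After three preemptions, the maxRetries budget is still untouched.
print(remaining_retries(3, 3, preemptions_seen=3, command_failures_seen=0))
```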
continueOnReturnCode
Default: 0
When each task finishes it returns a code. Normally, a non-zero return code indicates a failure. However, you can override this behavior by specifying the continueOnReturnCode attribute.
When set to false, any non-zero return code will be considered a failure. When set to true, all return codes will be considered successful.
runtime {
continueOnReturnCode: true
}
When set to an integer, or an array of integers, only those integers will be considered as successful return codes.
runtime {
continueOnReturnCode: 1
}
runtime {
continueOnReturnCode: [0, 1]
}
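The three accepted forms (boolean, integer, array of integers) can be summarised in one small decision function. The Python sketch below models the rules as stated in this section; is_successful is a hypothetical name and this is not Cromwell's code.

```python
def is_successful(return_code: int, continue_on_return_code=0) -> bool:
    """Decide whether a return code counts as success for a given setting."""
    setting = continue_on_return_code
    # Check bool first: in Python, bool is a subclass of int.
    if isinstance(setting, bool):
        # true: every code succeeds; false: only 0 succeeds.
        return True if setting else return_code == 0
    if isinstance(setting, int):
        # A single integer: only that code succeeds.
        return return_code == setting
    # An array of integers: only the listed codes succeed.
    return return_code in set(setting)


print(is_successful(1, True))      # every return code is accepted
print(is_successful(1, [0, 1]))    # 1 is in the accepted list
print(is_successful(2, [0, 1]))    # 2 is not in the accepted list
```

Note that under these rules a setting of 1 makes return code 0 a failure, since only the listed integers are treated as successful.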
failOnStderr
Default: false
Some programs write to the standard error stream when there is an error, but still return a zero exit code. Set failOnStderr to true for these tasks, and it will be considered a failure if anything is written to the standard error stream.
runtime {
failOnStderr: true
}
gpu
Default: "false"
If true, Cromwell will attempt to ensure that the task can run in an environment with GPU support. The task will be failed if Cromwell can't confirm a GPU is available. This attribute is NOT required to be true to run a task with GPUs; it merely adds a way to fast-fail tasks that are expected to run with GPUs but are not properly configured to do so.
- Google Cloud: Cromwell will attempt to examine other runtime attributes such as gpuCount, gpuType, predefinedMachineType to determine whether the task is configured to use a GPU, and fail the task if it is not.
- AWS Batch: Cromwell will attempt to examine other runtime attributes such as gpuCount to determine whether the task is configured to use a GPU, and fail the task if it is not.
- SFS: Cromwell is unable to confirm GPU availability, so tasks with gpu: true will always fail.
- TES: Cromwell is unable to confirm GPU availability, so tasks with gpu: true will always fail.
runtime {
gpu: true
}
zones
The ordered list of zone preference (see Region and Zones documentation for specifics).
The zones are specified as a space separated list, with no commas:
runtime {
zones: "us-central1-c us-central1-b"
}
Defaults to the configuration setting genomics.default-zones in the Google Cloud configuration block, which in turn defaults to using us-central1-b.
predefinedMachineType (alpha)
Default: none
This attribute is in experimental status. Please see limitations for details.
Select a specific GCP machine type, such as n2-standard-2 or a2-highgpu-1g.
Setting predefinedMachineType overrides cpu, memory, gpuCount, and gpuType.
predefinedMachineType is compatible with cpuPlatform so long as the platform is a valid option for the specified type.
runtime {
predefinedMachineType: "n2-standard-2"
}
Possible benefits:
- Access GPU machine types such as Ampere, Lovelace, and other newer models
- Avoid 5% surcharge on custom machine types (Cromwell default)
- Reduce preemption by using predefined types with better availability
- Run basic tasks at the lowest possible cost with shared-core machines like
e2-medium
Limitations:
- Cost estimation not yet supported
- GPU availability may be limited due to resource or quota exhaustion
- GCP types are non-portable and proprietary to Google Cloud Platform
- GCP Batch job details display incorrect "Cores", "Memory" values (cosmetic)
preemptible
Default: 0
Passed to Google Cloud: "If applicable, preemptible machines may be used for the run."
Takes an Int that indicates the maximum number of times Cromwell should request a preemptible machine for this task before falling back to a non-preemptible one.
For example, with a value of 1, Cromwell will request a preemptible VM; if the VM is preempted, the task will be retried with a non-preemptible VM.
runtime {
preemptible: 1
}
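The fallback rule can be expressed as a single comparison. This Python sketch is a simplified model of the behaviour described above, using a hypothetical function name; it is not taken from Cromwell.

```python
def uses_preemptible_vm(preemptible: int, preemptions_so_far: int) -> bool:
    """Request a preemptible VM until the preemptible budget is used up."""
    return preemptions_so_far < preemptible


# With preemptible: 1, the first attempt is preemptible and the retry is not.
print(uses_preemptible_vm(1, preemptions_so_far=0))
print(uses_preemptible_vm(1, preemptions_so_far=1))
```

With the default of 0, the comparison is never true, so under this model a preemptible VM is never requested.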
In GCP Batch, preempted jobs can be identified in job metadata (gcloud batch jobs describe) by a statusEvent with a description that looks like:
Job state is set from RUNNING to FAILED for job projects/abc/locations/us-central1/jobs/job-abc.Job
failed due to task failure. Specifically, task with index 0 failed due to the
following task event: "Task state is updated from RUNNING to FAILED on zones/us-central1-b/instances/8675309
due to Spot VM preemption with exit code 50001."
bootDiskSizeGb
In addition to working disks, Google Cloud allows specification of a boot disk size. This is the disk where the docker image itself is booted (not the working directory of your task on the VM). Its primary purpose is to ensure that larger docker images can fit on the boot disk.
runtime {
# Yikes, we have a big OS in this docker image! Allow 50GB to hold it:
bootDiskSizeGb: 50
}
Since no local-disk entry is specified, Cromwell will automatically add local-disk 10 SSD to this list.
noAddress
This runtime attribute adds support to disable assigning external IP addresses to VMs provisioned by the Google backend. If set to true, the VM will NOT be provided with a public IP address, and only contain an internal IP. If this option is enabled, the associated job can only load docker images from Google Container Registry, and the job executable cannot use external services other than Google APIs.
Note well! You must enable "Private Google Access" for this feature to work. See "How To Setup" below.
For example, the task below will succeed:
command {
echo "hello!"
}
runtime {
docker: "gcr.io/gcp-runtimes/ubuntu_16_0_4:latest"
noAddress: true
}
The task below will fail for two reasons: 1. The command is accessing an external service, in this case GitHub. 2. The docker image is available in DockerHub and not the Google Container Registry.
command {
git clone https://github.com/broadinstitute/cromwell.git
}
runtime {
docker: "docker.io/alpine/git:latest"
noAddress: true
}
How to Setup
Configure your Google network to use "Private Google Access". This will allow your VMs to access Google Services including Google Container Registry, as well as Dockerhub images.
- Using gcloud compute networks subnets list, identify the subnet and region you will be using with Cromwell. If multiple, run the next step for each region and subnet you wish to use.
- Run gcloud compute networks subnets update [SUBNET-NAME] --region [REGION] --enable-private-ip-google-access
That's it! You can now run with the noAddress runtime attribute and it will work as expected.
gpuCount and gpuType
Attach GPUs to the GCP Batch instance. Make sure to choose a zone in which the type of GPU you want is available.
The types of compute GPU supported are:
- nvidia-tesla-v100
- nvidia-tesla-p100
- nvidia-tesla-p4
- nvidia-tesla-t4
runtime {
gpuType: "nvidia-tesla-t4"
gpuCount: 2
zones: ["us-central1-c"]
}
nvidiaDriverVersion is deprecated and ignored; GCP Batch selects the correct driver version automatically.
cpuPlatform
This option is specific to the Google Cloud backend. Use it when a certain minimum CPU platform is desired.
A usage example:
runtime {
cpu: 2
cpuPlatform: "Intel Cascade Lake"
}
When this option is specified, make sure the requested CPU platform is available in the zones you selected.
The following CPU platforms are currently supported by the Google Cloud backend:
- Intel Ice Lake
- Intel Cascade Lake
- Intel Skylake
- Intel Broadwell
- Intel Haswell
- Intel Ivy Bridge
- Intel Sandy Bridge
- AMD Rome
awsBatchRetryAttempts
Default: 0
This runtime attribute adds support for AWS Batch Automated Job Retries, which makes it possible to handle transient job failures. For example, if a task fails due to a timeout when accessing an external service, this option re-runs the failed task without re-running the entire workflow. This option is also very useful when using SPOT instances.
It takes an Int, between 1 and 10, that indicates the maximum number of times AWS Batch should retry a failed task. If the value 0 is passed, the Retry Strategy will not be added to the job definition and the task will run just once.
runtime {
awsBatchRetryAttempts: integer
}
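The rule for 0 versus 1 to 10 can be sketched as follows. The Python function below is a hypothetical illustration of how the value might map to a retry strategy; the attempts key follows the AWS Batch RetryStrategy structure, but how Cromwell actually builds the job definition is an assumption here.

```python
from typing import Optional


def retry_strategy(aws_batch_retry_attempts: int = 0) -> Optional[dict]:
    """Return a retry strategy for the job definition, or None when retries are off."""
    if aws_batch_retry_attempts == 0:
        # No Retry Strategy is added; the task runs just once.
        return None
    if not 1 <= aws_batch_retry_attempts <= 10:
        raise ValueError("awsBatchRetryAttempts must be 0, or between 1 and 10")
    return {"attempts": aws_batch_retry_attempts}


print(retry_strategy(0))
print(retry_strategy(3))
```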
ulimits
Default: empty
This attribute is only supported for AWS. A list of ulimits values to set in the container. This parameter maps to Ulimits in the Create a container section of the Docker Remote API and the --ulimit option to docker run.
"ulimits": [
{
"name": string,
"softLimit": integer,
"hardLimit": integer
}
...
]
Parameter description:
- name
  - The type of the ulimit.
  - Type: String
  - Required: Yes, when ulimits is used.
- softLimit
  - The soft limit for the ulimit type.
  - Type: Integer
  - Required: Yes, when ulimits is used.
- hardLimit
  - The hard limit for the ulimit type.
  - Type: Integer
  - Required: Yes, when ulimits is used.
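As a quick way to check a ulimits value before submitting a workflow, the required keys and types listed above can be verified with a few lines of Python. This is a hypothetical helper for illustration only, not part of Cromwell or AWS Batch; the nofile entry is just an example ulimit name.

```python
# Required keys and their expected types, as listed in the parameter description.
REQUIRED_KEYS = {"name": str, "softLimit": int, "hardLimit": int}


def validate_ulimits(ulimits: list) -> list:
    """Return a list of human-readable problems; an empty list means the value looks valid."""
    problems = []
    for index, entry in enumerate(ulimits):
        for key, expected_type in REQUIRED_KEYS.items():
            if key not in entry:
                problems.append(f"ulimits[{index}]: missing required key '{key}'")
            elif not isinstance(entry[key], expected_type):
                problems.append(f"ulimits[{index}]: '{key}' must be {expected_type.__name__}")
    return problems


good = [{"name": "nofile", "softLimit": 1024, "hardLimit": 4096}]
bad = [{"name": "nofile", "softLimit": "1024"}]

print(validate_ulimits(good))
print(validate_ulimits(bad))
```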