CWL workflow compiled to include a subworkflow with an outputs applet fails to run successfully #455

@ThomasHickman

If you have a workflow as follows:

cwlVersion: v1.2
class: Workflow
inputs: []
requirements:
  SubworkflowFeatureRequirement: {}
steps:
    - id: one
      in: []
      out: [output]
      run: one.cwl
    - id: two
      in: []
      out: [output]
      run: two_wrapper.cwl
outputs:
    output1:
      type: File
      outputSource: one/output
    output2:
      type: File
      outputSource: two/output

one.cwl:

cwlVersion: v1.2
class: CommandLineTool
inputs: []
baseCommand:
    - bash
    - -c
    - "echo something > output"
outputs:
    - id: output
      type: File
      outputBinding:
        glob: output

two_wrapper.cwl:

cwlVersion: v1.2
class: Workflow
inputs: []
steps:
    - id: two
      in: []
      out: [output]
      run: two.cwl
outputs:
    output:
      type: File
      outputSource: two/output

two.cwl:

cwlVersion: v1.2
class: CommandLineTool
inputs: []
baseCommand:
    - bash
    - -c
    - "echo something > output"
outputs:
    - id: output
      type: File?
      outputBinding:
        glob: output

`cwlpack` workflow:

{
    "class": "Workflow",
    "cwlVersion": "v1.2",
    "id": "workflow.cwl",
    "inputs": [],
    "outputs": [
        {
            "id": "output1",
            "outputSource": "one/output",
            "type": "File"
        },
        {
            "id": "output2",
            "outputSource": "two/output",
            "type": "File"
        }
    ],
    "requirements": [
        {
            "class": "SubworkflowFeatureRequirement"
        }
    ],
    "steps": [
        {
            "id": "one",
            "in": [],
            "out": [
                "output"
            ],
            "run": {
                "baseCommand": [
                    "bash",
                    "-c",
                    "echo something > output"
                ],
                "class": "CommandLineTool",
                "cwlVersion": "v1.2",
                "id": "workflow.cwl@step_one@one.cwl",
                "inputs": [],
                "outputs": [
                    {
                        "id": "output",
                        "outputBinding": {
                            "glob": "output"
                        },
                        "type": "File"
                    }
                ],
                "requirements": []
            }
        },
        {
            "id": "two",
            "in": [],
            "out": [
                "output"
            ],
            "run": {
                "class": "Workflow",
                "cwlVersion": "v1.2",
                "id": "workflow.cwl@step_two@two_wrapper.cwl",
                "inputs": [],
                "outputs": [
                    {
                        "id": "output",
                        "outputSource": "two/output",
                        "type": "File"
                    }
                ],
                "requirements": [],
                "steps": [
                    {
                        "id": "two",
                        "in": [],
                        "out": [
                            "output"
                        ],
                        "run": {
                            "baseCommand": [
                                "bash",
                                "-c",
                                "echo something > output"
                            ],
                            "class": "CommandLineTool",
                            "cwlVersion": "v1.2",
                            "id": "two_wrapper.cwl@step_two@two.cwl",
                            "inputs": [],
                            "outputs": [
                                {
                                    "id": "output",
                                    "outputBinding": {
                                        "glob": "output"
                                    },
                                    "type": [
                                        "null",
                                        "File"
                                    ]
                                }
                            ],
                            "requirements": []
                        }
                    }
                ]
            }
        }
    ]
}

This generates the following applets:
[image: screenshot of the generated applets]

Note: there are two outputs applets. The inner one, workflow_cwl_step_two_two_wrapper_cwl_outputs, is created because two_wrapper.cwl narrows the output type from File? in two.cwl to File in two_wrapper.cwl.

When running this, two_wrapper's outputs step fails as seen below:
[image: screenshot of the failed outputs step]
with the error message:

Environment: Map(two/output -> (TOptional(TFile),VFile(dx://file-GYqV19QJKB2b8pXF24xJ1xBb::/output,Some(output),None,Some(sha1$50a4e988380c09d290acdab4bd53d24ee7b497df),Some(10),Vector(),None,None)))
Evaluating workflow outputs
Evaluating output parameters:
  (output1,WorkflowOutputParameter(Some(Identifier(Some(file:/null),workflow.cwl/output1)),None,None,CwlFile,Vector(),None,false,Vector(Identifier(Some(file:/null),one/output)),None,None),CwlFile)
  (output2,WorkflowOutputParameter(Some(Identifier(Some(file:/null),workflow.cwl/output2)),None,None,CwlFile,Vector(),None,false,Vector(Identifier(Some(file:/null),two/output)),None,None),CwlFile)
[error] failure executing Workflow action 'Outputs'
java.lang.Exception: cannot coerce VNull to TFile
	at dx.core.ir.Value$.coerceTo(Value.scala:311)
	at dx.executor.cwl.CwlWorkflowExecutor.$anonfun$evaluateOutputs$3(CwlWorkflowExecutor.scala:329)
	at scala.collection.immutable.Vector1.map(Vector.scala:1872)
	at scala.collection.immutable.Vector1.map(Vector.scala:375)
	at dx.executor.cwl.CwlWorkflowExecutor.evaluateOutputs(CwlWorkflowExecutor.scala:283)
	at dx.executor.WorkflowExecutor.evaluateOutputs(WorkflowExecutor.scala:156)
	at dx.executor.WorkflowExecutor.apply(WorkflowExecutor.scala:897)
	at dx.executor.BaseCli.dispatchCommand(BaseCli.scala:103)
	at dx.executor.BaseCli.main(BaseCli.scala:137)
	at dxExecutorCwl.MainApp$.delayedEndpoint$dxExecutorCwl$MainApp$1(Main.scala:27)
	at dxExecutorCwl.MainApp$delayedInit$body.apply(Main.scala:26)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:76)
	at scala.App.$anonfun$main$1$adapted(App.scala:76)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
	at scala.App.main(App.scala:76)
	at scala.App.main$(App.scala:74)
	at dxExecutorCwl.MainApp$.main(Main.scala:26)
	at dxExecutorCwl.MainApp.main(Main.scala)

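This implies that the applet is trying to collect the outputs of the entire workflow (output1 and output2), not those of the two_wrapper subworkflow. The environment only contains two/output, so presumably the source of output1 (one/output) resolves to null and cannot be coerced to File.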
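
To make the reading of the log concrete, here is a toy model of the evaluation step. It is not dxCompiler code: `evaluate_outputs`, `OUTER_OUTPUTS` and `INNER_OUTPUTS` are hypothetical names, and the rule that a missing output source evaluates to null is an assumption inferred from the `VNull` in the exception. The environment and the output parameter names are taken from the log above.

```python
from typing import Dict, List, Optional, Tuple

# Environment seen by two_wrapper's outputs applet (from the log):
# only the inner step's output is present.
env: Dict[str, Optional[str]] = {
    "two/output": "dx://file-GYqV19QJKB2b8pXF24xJ1xBb::/output",
}

# (output name, outputSource, is_optional)
OUTER_OUTPUTS: List[Tuple[str, str, bool]] = [
    ("output1", "one/output", False),
    ("output2", "two/output", False),
]
INNER_OUTPUTS: List[Tuple[str, str, bool]] = [
    ("output", "two/output", False),
]


def evaluate_outputs(
    outputs: List[Tuple[str, str, bool]],
    env: Dict[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Resolve each output from the environment and check its type."""
    results: Dict[str, Optional[str]] = {}
    for name, source, is_optional in outputs:
        # Assumption: a source that is absent from the environment is null.
        value = env.get(source)
        if value is None and not is_optional:
            raise ValueError(f"cannot coerce null to File for output '{name}'")
        results[name] = value
    return results


# The inner workflow's outputs evaluate cleanly against this environment.
print(evaluate_outputs(INNER_OUTPUTS, env))

# The outer workflow's outputs fail, because one/output is not available.
try:
    evaluate_outputs(OUTER_OUTPUTS, env)
except ValueError as err:
    print(err)
```

Under these assumptions, the same environment succeeds for the inner workflow's single output and fails on output1 for the outer workflow, which matches the output parameters listed in the log.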

I've dug into the dxCompiler source code to see what might be going on. My theory is:

  1. Before CwlWorkflowExecutor.scala:329 (the line named in the exception), the output parameters are taken from the workflow property of the CwlWorkflowExecutor class:
    https://github.com/dnanexus/dxCompiler/blob/develop/executorCwl/src/main/scala/dx/executor/cwl/CwlWorkflowExecutor.scala#L271
  2. That workflow appears to come from the CwlWorkflowExecutor.create function, where the following code looks up the appropriate workflow:
    https://github.com/dnanexus/dxCompiler/blob/develop/executorCwl/src/main/scala/dx/executor/cwl/CwlWorkflowExecutor.scala#L78-L82
  3. In this code, the OriginalName of the outputs applet is "${wfName}_outputs", as can be seen here:
    https://github.com/dnanexus/dxCompiler/blob/develop/compiler/src/main/scala/dx/translator/cwl/ProcessTranslator.scala#L748
  4. CwlWorkflowExecutor.create therefore follows this branch:
    https://github.com/dnanexus/dxCompiler/blob/develop/executorCwl/src/main/scala/dx/executor/cwl/CwlWorkflowExecutor.scala#L98-L103
    and selects the wrong workflow.
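
For reference, the packed document in this issue contains two Workflow processes, so whatever lookup the executor performs has to choose between them. The sketch below walks a trimmed copy of that packed document and lists the candidates. It is only an illustration: `iter_workflows` and `find_workflow` are hypothetical helpers, not functions from dxCompiler or cwlpack, and the dictionary keeps only the fields needed here.

```python
from typing import Any, Dict, Iterator, Optional

# Trimmed copy of the packed workflow from this issue.
packed: Dict[str, Any] = {
    "class": "Workflow",
    "id": "workflow.cwl",
    "outputs": [{"id": "output1"}, {"id": "output2"}],
    "steps": [
        {
            "id": "one",
            "run": {
                "class": "CommandLineTool",
                "id": "workflow.cwl@step_one@one.cwl",
            },
        },
        {
            "id": "two",
            "run": {
                "class": "Workflow",
                "id": "workflow.cwl@step_two@two_wrapper.cwl",
                "outputs": [{"id": "output"}],
                "steps": [
                    {
                        "id": "two",
                        "run": {
                            "class": "CommandLineTool",
                            "id": "two_wrapper.cwl@step_two@two.cwl",
                        },
                    }
                ],
            },
        },
    ],
}


def iter_workflows(process: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield the process itself (if it is a Workflow) and all nested Workflows."""
    if process.get("class") == "Workflow":
        yield process
    for step in process.get("steps", []):
        run = step.get("run")
        if isinstance(run, dict):
            yield from iter_workflows(run)


def find_workflow(
    process: Dict[str, Any], workflow_id: str
) -> Optional[Dict[str, Any]]:
    """Return the Workflow with exactly this id, or None if there is none."""
    for wf in iter_workflows(process):
        if wf["id"] == workflow_id:
            return wf
    return None


for wf in iter_workflows(packed):
    print(wf["id"], [out["id"] for out in wf["outputs"]])
```

Only the nested workflow (id workflow.cwl@step_two@two_wrapper.cwl) declares the single output named output. A lookup that ends up at the top-level workflow.cwl instead would evaluate output1 and output2, which is what the log shows.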
