task message: support file path patterns #6811

**Crude outline of an idea that's been brewing for a while, written up as a placeholder. Needs much more thought!**

Writing workflows in terms of data dependencies is clunky in Cylc. However, two simple changes could make this easier:

1. Allow task outputs to accept task message patterns (rather than requiring an exact match).
2. Provide triggering task outputs to tasks.

For context, see also https://github.com/cylc/cylc-flow/issues/2764

### Description

Cylc allows us to write fine-grained dependencies, e.g.:

```
foo:file1 & bar:file2 => baz
```

The upstream task satisfies these outputs by running a `cylc message` command.

This example shows how a Cylc workflow can be written to follow an inputs/outputs paradigm rather than an abstract control flow paradigm:

```cylc
#!Jinja2

{% set FILE1 = '$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/foo/file.dat' %}
{% set FILE2 = '$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/bar-output.csv' %}

[scheduling]
  [[graph]]
    R1 = foo:file1 & bar:file2 => baz

[runtime]
  [[foo]]
    # $FILE1 is already expanded in the job environment, so no eval is needed
    script = mkdir -p "$(dirname "$FILE1")"; echo 'some data' > "$FILE1"; cylc message -- file1
    [[[environment]]]
      FILE1 = {{ FILE1 }}
    [[[outputs]]]
      file1 = file1

  [[bar]]
    # $FILE2 is already expanded in the job environment, so no eval is needed
    script = mkdir -p "$(dirname "$FILE2")"; echo 'some,data' > "$FILE2"; cylc message -- file2
    [[[environment]]]
      FILE2 = {{ FILE2 }}
    [[[outputs]]]
      file2 = file2

  [[baz]]
    script = do-something-with "$FILE1" "$FILE2"
    [[[environment]]]
      FILE1 = {{ FILE1 }}
      FILE2 = {{ FILE2 }}
```

> [!NOTE]
>
> An alternative way to achieve the above is `cylc broadcast`, which can be used to configure paths in downstream tasks.

However, this approach is a bit clunky because the `file1` and `file2` outputs are really abstract dependencies in disguise: they serve as event-driven triggers, but they can't carry data.


If we supported patterns in task messages, this could be made much neater:

```cylc
[scheduling]
  [[graph]]
    R1 = foo:file1 & bar:file2 => baz

[runtime]
  [[foo]]
    script = echo 'some data' > "somewhere"; cylc message -- "file1:somewhere"
    [[[outputs]]]
      file1 = file1:(.*)

  [[bar]]
    script = echo 'some,data' > "somewhere"; cylc message -- "file2:somewhere"
    [[[outputs]]]
      file2 = file2:(.*)

  [[baz]]
    script = do-something-with "$CYLC_TASK_INPUT_file1" "$CYLC_TASK_INPUT_file2"
```
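To make the idea concrete, here is a minimal Python sketch of how pattern outputs might be matched and exposed downstream. It is not Cylc's implementation: `OUTPUT_PATTERNS`, `match_message` and `input_environment` are hypothetical names, and the `CYLC_TASK_INPUT_<output>` naming is just the convention proposed in the example above.

```python
import re

# Hypothetical output definitions mirroring the [[[outputs]]] sections above:
# output name -> message pattern.
OUTPUT_PATTERNS = {
    "file1": r"file1:(.*)",
    "file2": r"file2:(.*)",
}


def match_message(message, patterns=OUTPUT_PATTERNS):
    """Return (output_name, data) for the first pattern matching the whole
    message, or None if no pattern matches."""
    for name, pattern in patterns.items():
        match = re.fullmatch(pattern, message)
        if match:
            # A pattern without a capture group behaves like today's exact
            # match: it satisfies the output but carries no data.
            data = match.group(1) if match.re.groups else None
            return name, data
    return None


def input_environment(messages):
    """Build the (assumed) CYLC_TASK_INPUT_<output> variables for a
    downstream task from the messages that triggered it."""
    env = {}
    for message in messages:
        result = match_message(message)
        if result and result[1] is not None:
            name, data = result
            env[f"CYLC_TASK_INPUT_{name}"] = data
    return env
```

Using `re.fullmatch` rather than `re.search` keeps the behaviour close to the current exact-match semantics: a message only satisfies an output if the whole message fits the pattern.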

We could potentially go further than this by adding an explicit task `[inputs]` section for fully flexible mapping and many other things besides.


### Blue Sky Ideas (Speculative)

* Tasks declare both inputs and outputs.
* Cylc provides an easier interface for declaring an output as satisfied than using `cylc message` manually (e.g. `cylc output <output-name> <file-path/data>`, with the message template handled by Cylc internally).
* Cylc provides the ability to automatically sync outputs between install targets (when the output is a file path). See the `--rsync` option in the [cylc clean proposal](https://github.com/cylc/cylc-admin/blob/master/docs/proposal-cylc-clean.md).
* Cylc GUI links to "artefacts" (i.e. output files) in Jupyter Lab [via this cylc-ui feature](https://github.com/cylc/cylc-ui/pull/2092).

### Proposal (Imminent)

1. Allow task outputs to accept patterns (to carry data):
   - Note, we need to reject potentially duplicate outputs (patterns that could match the same message) - https://github.com/cylc/cylc-flow/issues/6056
   - In theory we could support multiple named regex groups within a message to carry multiple pieces of data. Good idea?
2. Provide triggering outputs to the task (to access the data):
   - Note, we once had a `CYLC_TASK_DEPENDENCIES` variable, but there was too much data to hold in the environment so it was scrapped - https://github.com/cylc/cylc-flow/issues/5764
   - We could investigate file-based solutions, e.g. JSON, CSV, YAML, etc.
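As a rough illustration of both points, the sketch below uses named regex groups to pull several fields out of one message, then round-trips the triggering outputs through a JSON file instead of the environment. Everything here is an assumption made for illustration: the message layout, the `path` and `checksum` fields, the `inputs.json` file name and the `"<task>:<output>"` keys are not existing Cylc behaviour.

```python
import json
import re
import tempfile
from pathlib import Path

# Hypothetical pattern: one message carrying two named pieces of data.
PATTERN = r"file1:(?P<path>[^:]+):(?P<checksum>[0-9a-f]+)"


def parse_output(message, pattern=PATTERN):
    """Return the named groups of a matching message as a dict, else None."""
    match = re.fullmatch(pattern, message)
    return match.groupdict() if match else None


def write_inputs_file(path, triggers):
    """Write the triggering outputs to a JSON file.

    triggers maps "<task>:<output>" to the data captured from its message.
    """
    Path(path).write_text(json.dumps(triggers, indent=2, sort_keys=True))


def read_inputs_file(path):
    """Load the triggering outputs from inside the downstream task."""
    return json.loads(Path(path).read_text())


if __name__ == "__main__":
    data = parse_output("file1:/share/1/foo/file.dat:d41d8c")
    with tempfile.TemporaryDirectory() as tmp:
        inputs_file = Path(tmp) / "inputs.json"  # hypothetical file name
        write_inputs_file(inputs_file, {"foo:file1": data})
        print(read_inputs_file(inputs_file)["foo:file1"]["path"])
```

A file sidesteps the environment-size problem that led to `CYLC_TASK_DEPENDENCIES` being removed, at the cost of the task needing a small helper (or a tool such as `jq`) to read its inputs.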
