
[Feature] Allow handling of empty job configs #175

@pdiu

Description


Describe the feature
The ability to handle empty job config files. This is particularly useful when job configs are split across multiple files, e.g. a default_jobs.yml and a custom_jobs.yml, where default_jobs.yml contains environment-specific default jobs (a CI job, a PR schema clean-up job, etc.). In some environments those defaults do not apply, so the file evaluates to an empty job config and raises an error. Instead, if there are no jobs under the jobs: key, the tool should simply do nothing rather than erroring out and exiting.
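The desired behavior can be sketched in isolation (the function and variable names below are hypothetical, not taken from the dbt-jobs-as-code codebase): an empty document or a bare jobs: key contributes nothing to the merged config instead of raising.

```python
import yaml

def merge_job_configs(raw_documents):
    """Merge the "jobs" mappings from several rendered YAML documents.

    Documents that are empty (yaml.safe_load returns None) or whose
    "jobs" key has no entries contribute nothing instead of raising.
    """
    combined_jobs = {}
    for raw in raw_documents:
        config = yaml.safe_load(raw)
        if not config:  # empty file loads as None -> skip it
            continue
        # "jobs" may be missing or present with no value (None)
        combined_jobs.update(config.get("jobs") or {})
    return combined_jobs

# A rendered default_jobs.yml with no matching environment becomes an
# empty document, while custom_jobs.yml still contributes its jobs.
print(merge_job_configs(["", "jobs:\n  nightly:\n    state: 1\n"]))
# -> {'nightly': {'state': 1}}
```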

Describe alternatives you've considered
I tried adding a job for each environment to the job configs, but this isn't clean.

Who will this benefit?
Teams that split their jobs across multiple config files, for example into default_jobs and custom_jobs.

default_jobs may look like this:

anchors:
  &default_settings
  project_id: "{{ project_id }}"
  account_id: "{{ account_id }}"
  dbt_version:
  environment_id: "{{ environment_id }}"
  name: ""  
  schedule:
    cron: "0 0 1 * *"
  settings:
    target_name: default
    threads: 4
  triggers:
    git_provider_webhook: false
    github_webhook: false
    schedule: false
  execution:
    timeout_seconds: 2400
  state: 1
  run_generate_sources: false
  generate_docs: true
  job_type: "other"

jobs:
  # INT environment specific jobs
  {% if environment_name == "Integration" %}
  "{{ environment_name }}-CI":
    <<: *default_settings
    deferring_environment_id: "{{ deferring_environment_id }}"
    execute_steps:
    - "dbt build --select state:modified+ --fail-fast"
    - "dbt build --select state:modified,config.materialized:incremental --fail-fast"
    execution:
      timeout_seconds: 7200
    job_type: ci
    run_lint: true
    errors_on_lint_failure: true
    generate_docs: false
    settings:
      target_name: ci
      threads: 6

  "{{ environment_name }}-drop_pr_schemas":
    <<: *default_settings
    execute_steps:
    - "dbt run-operation drop_pr_schemas --args '{dry_run: False}'"
    generate_docs: false
    schedule:
      cron: "0 15 * * *"
    triggers:
      schedule: true
  {% endif %}

In this case, when a dbt-jobs-as-code command is run against this file with a vars file whose environment_name is "SIT" (or anything other than "Integration"), it fails completely and exits. When run in a CI/CD pipeline, this causes the pipeline to fail as well, when it would be better if the tool just did nothing and continued.
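The failure can be reproduced without dbt-jobs-as-code at all: when the Jinja block above renders to nothing, the jobs: key is left with no value, which PyYAML loads as None, and a later dict.update() on it fails with exactly the error reported below (a minimal reproduction, not the project's actual code):

```python
import yaml

# What default_jobs.yml effectively looks like after templating for a
# non-Integration environment: the jobs key remains, with nothing under it.
rendered = "jobs:\n"

config = yaml.safe_load(rendered)
print(config)  # {'jobs': None}

combined_config = {"jobs": {}}
try:
    combined_config["jobs"].update(config["jobs"])
except TypeError as exc:
    print(exc)  # 'NoneType' object is not iterable
```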

Additional context
For context, we call dbt-jobs-as-code commands as part of the CI/CD process, passing a glob for the entire jobs/ directory, which contains the 2 job config files. The full command looks like:

dbt-jobs-as-code plan /jobs --vars-yml jobs/vars/sit.yml --limit-projects-envs-to-yml

This is the resulting error:

config = _load_yaml_with_template(config_files, var_files)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    combined_config["jobs"].update(config["jobs"])
TypeError: 'NoneType' object is not iterable

Are you interested in contributing this feature?
Yes.
I'm not certain, but it looks like if _load_yaml_with_template() is adjusted so that instead of

if config:

it does

if config.get("jobs", {}):

then it handles this case without erroring out, which is also closer to how the _load_yaml_no_template function behaves.

Metadata

Labels: triage (Issue needs to be triaged by the maintainer team.)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions