Proposal: First-class job dependency support #745

@klesgidis

Summary

Add the ability for a job to declare dependencies on other jobs, so it is not eligible for processing until all its parent jobs have completed. This enables workflow patterns (e.g. fan-in, sequential pipelines) directly in pg-boss without requiring application-level orchestration.

Motivation

A common pattern in job processing is "Job B should not start until Job A finishes." Today, the only way to achieve this in pg-boss is application-level chaining — completing Job A's handler then calling send() for Job B. This works for simple cases but has drawbacks:

  • Atomicity: If the process crashes after completing A but before sending B, the dependent job is lost.
  • Visibility: There's no way to inspect the dependency graph — you can't see that B is waiting on A.
  • Fan-in: When B depends on multiple parents (A1, A2, A3), the application must track partial completion state itself, which is error-prone.
  • Separation of concerns: The "what depends on what" knowledge leaks into every handler rather than being declared at job creation time.

Other job systems (Celery chords, Airflow DAGs, BullMQ's FlowProducer) offer first-class dependency primitives. This proposal adds an equivalent to pg-boss while staying true to its Postgres-native design.

Proposed API

Sending a job with dependencies

const jobA = await boss.send('process-data', { file: '1.csv' })
const jobB = await boss.send('process-data', { file: '2.csv' })

// jobC won't start until both jobA and jobB have completed
const jobC = await boss.send('aggregate-results', { output: 'report.csv' }, {
  dependsOn: [
    { name: 'process-data', id: jobA },
    { name: 'process-data', id: jobB }
  ]
})
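
In a fan-in scenario it may be worth validating the parent IDs before building the dependsOn list, since send() can resolve to null in some cases (for example when throttling or uniqueness options prevent a job from being created). A minimal sketch, assuming the { name, id } reference shape proposed above; toDependsOn is a hypothetical helper, not part of pg-boss:

```javascript
// Hypothetical helper: builds the proposed `dependsOn` array from parent job IDs.
// Assumes the { name, id } reference shape shown in this proposal.
function toDependsOn(name, ids) {
  return ids.map((id, index) => {
    if (typeof id !== 'string' || id.length === 0) {
      // send() may resolve to null when no job was actually created
      throw new Error(`Parent job #${index} on queue "${name}" was not created`)
    }
    return { name, id }
  })
}

// Usage with illustrative IDs standing in for the values returned by send()
const deps = toDependsOn('process-data', ['id-a', 'id-b'])
console.log(deps)
```

Failing fast here avoids inserting a child that references a parent that never existed and would therefore stay blocked indefinitely.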

Design

New blocked column on the job table

Rather than adding a new enum value to job_state (which would break the carefully ordered state < 'active' comparisons used throughout the codebase), a boolean blocked column is added:

  • Jobs with unmet dependencies are inserted with state = 'created' and blocked = true.
  • The fetch index (job_i5) is updated to include AND NOT blocked, so blocked jobs are excluded from fetchNextJob at the index level, and queues that don't use dependencies should see negligible added cost.
  • All existing state comparisons (state < 'active', state < 'completed', etc.) remain unchanged.

Unblocking on completion

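The completeJobs SQL is extended with CTEs chained onto the existing completion UPDATE. In Postgres, all CTEs in a single statement run against the same snapshot, so the unblocking step cannot see the state change made by the completion UPDATE; parents completed by this statement are therefore excluded explicitly through the completed CTE:

-- After completing parent jobs, unblock any children whose parents are now all completed
WITH completed AS (
  -- existing completion UPDATE ... RETURNING name, id
),
children_to_check AS (
  SELECT DISTINCT d.child_name, d.child_id
  FROM pgboss.job_dependency d
  JOIN completed c ON c.name = d.parent_name AND c.id = d.parent_id
),
unblocked AS (
  UPDATE pgboss.job j
  SET blocked = false
  FROM children_to_check ct
  WHERE j.name = ct.child_name
    AND j.id = ct.child_id
    AND j.blocked = true
    AND NOT EXISTS (
      SELECT 1
      FROM pgboss.job_dependency d2
      JOIN pgboss.job p ON p.name = d2.parent_name AND p.id = d2.parent_id
      WHERE d2.child_name = ct.child_name
        AND d2.child_id = ct.child_id
        AND p.state <> 'completed'
        -- Parents completed by this same statement still show their old state
        -- in this snapshot, so skip them via the RETURNING rows of "completed"
        AND NOT EXISTS (
          SELECT 1 FROM completed c2
          WHERE c2.name = p.name AND c2.id = p.id
        )
    )
)
SELECT count(*) FROM completed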

This is targeted: it examines only the children of the just-completed jobs, then verifies that every other parent of each child is also completed. For queues that don't use dependencies, the children_to_check CTE returns zero rows and the unblocking UPDATE is a no-op.
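
The unblocking rule can be modelled in memory to make the fan-in behavior concrete. This is only a sketch of the logic, not the proposed implementation: the Map-based job store, the "name:id" keys, and the function name are all assumptions made for illustration.

```javascript
// In-memory model of the unblocking step. All names here are illustrative.
// jobs: Map of "name:id" -> { state, blocked }
// deps: array of { parent, child } edges keyed by the same "name:id" strings
function completeJobs(jobs, deps, parentKeys) {
  // Step 1: mark the batch of parents as completed
  for (const key of parentKeys) {
    const job = jobs.get(key)
    if (job) job.state = 'completed'
  }

  // Step 2: collect the distinct children of the just-completed parents
  const completed = new Set(parentKeys)
  const children = new Set(
    deps.filter(d => completed.has(d.parent)).map(d => d.child)
  )

  // Step 3: unblock a child only if none of its parents is still incomplete
  const unblocked = []
  for (const childKey of children) {
    const child = jobs.get(childKey)
    if (!child || !child.blocked) continue
    const pending = deps.some(d => {
      if (d.child !== childKey) return false
      const parent = jobs.get(d.parent)
      // A missing parent row is not counted as pending, mirroring the inner JOIN
      return parent !== undefined && parent.state !== 'completed'
    })
    if (!pending) {
      child.blocked = false
      unblocked.push(childKey)
    }
  }
  return unblocked
}

const jobs = new Map([
  ['process-data:a', { state: 'active', blocked: false }],
  ['process-data:b', { state: 'active', blocked: false }],
  ['aggregate-results:c', { state: 'created', blocked: true }]
])
const deps = [
  { parent: 'process-data:a', child: 'aggregate-results:c' },
  { parent: 'process-data:b', child: 'aggregate-results:c' }
]

const first = completeJobs(jobs, deps, ['process-data:a'])  // b is still active
const second = completeJobs(jobs, deps, ['process-data:b']) // last parent done
console.log(first, second)
```

Completing the first parent leaves the child blocked; completing the second one releases it. Passing both keys in one call exercises the bulk-completion path.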

Interaction with startAfter

When a blocked job is unblocked, its original start_after timestamp is preserved. If it was set to a future time, the job becomes eligible only after both conditions are met: all dependencies completed AND start_after < now(). Combining dependsOn with startAfter therefore expresses "start Job B once all its parents have finished, but no earlier than a given time". Because start_after is fixed when the job is sent, it is not a delay measured from the moment the parents finish.
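
The combined eligibility rule can be written as a single predicate. A minimal sketch, assuming that state < 'active' corresponds to the created and retry states and that start_after is an absolute timestamp; the object shape and the isFetchable name are hypothetical, not pg-boss code:

```javascript
// Illustrative predicate: is a job eligible for fetch under the proposed rules?
// Field names mirror the columns discussed above.
function isFetchable(job, now) {
  // Assumed equivalent of `state < 'active'`
  const waiting = job.state === 'created' || job.state === 'retry'
  return waiting && !job.blocked && job.startAfter.getTime() < now.getTime()
}

const sentAt = new Date('2024-01-01T00:00:00Z')
const minutes = n => new Date(sentAt.getTime() + n * 60 * 1000)

// Sent with a start_after five minutes in the future and unmet dependencies
const job = { state: 'created', blocked: true, startAfter: minutes(5) }
const whileBlocked = isFetchable(job, minutes(6)) // time passed, still blocked

// Parents finish: the job is unblocked, but start_after still applies
job.blocked = false
const tooEarly = isFetchable(job, minutes(2))
const eligible = isFetchable(job, minutes(6))
console.log(whileBlocked, tooEarly, eligible)
```

Neither condition alone is sufficient: the job is skipped while blocked even after start_after has passed, and skipped before start_after even once it is unblocked.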

Edge cases

| Scenario | Behavior |
| --- | --- |
| Parent fails permanently | Child stays blocked. User can explicitly cancel or fail the child. A future enhancement could add a cascadeFail option. |
| Parent is cancelled | Child stays blocked (same as above). |
| Parent is retried then completes | Works naturally: unblocking triggers on state reaching completed. |
| Parent is deleted | Child stays blocked. Orphaned job_dependency rows are cleaned up by extending the existing deletion supervisor in boss.ts. |
| Circular dependencies | Could be detected at insert time with a recursive CTE check. Alternatively, left as a user responsibility (similar to how most DAG systems handle it). |
| Dependent job has startAfter | Both conditions must be met: unblocked AND start_after < now(). |
| Bulk completion of parents | The unblocking CTE handles multiple completed IDs in a single pass. |
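
For the circular-dependency case, the check amounts to asking whether the new child is already an ancestor of any of its proposed parents. The sketch below is an application-side analogue of the recursive CTE idea, not the proposed SQL; the edge shape, the opaque job keys, and the wouldCreateCycle name are assumptions for illustration.

```javascript
// Illustrative cycle check over existing dependency edges.
// edges: array of { parent, child } using opaque job keys (hypothetical shape)
function wouldCreateCycle(edges, child, parents) {
  // Build a child -> parents adjacency map
  const parentsOf = new Map()
  for (const e of edges) {
    if (!parentsOf.has(e.child)) parentsOf.set(e.child, [])
    parentsOf.get(e.child).push(e.parent)
  }

  // Walk upward from each proposed parent; reaching `child` means a cycle
  const stack = [...parents]
  const seen = new Set()
  while (stack.length > 0) {
    const current = stack.pop()
    if (current === child) return true
    if (seen.has(current)) continue
    seen.add(current)
    for (const p of parentsOf.get(current) || []) stack.push(p)
  }
  return false
}

const edges = [
  { parent: 'a', child: 'b' },
  { parent: 'b', child: 'c' }
]
const ok = wouldCreateCycle(edges, 'd', ['c'])  // d depends on c: no cycle
const bad = wouldCreateCycle(edges, 'a', ['c']) // c already depends on a via b
console.log(ok, bad)
```

A recursive CTE over job_dependency would perform the same upward walk inside the database, which keeps the check in the same transaction as the insert.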

Scope

This proposal covers the core dependency primitive. Future enhancements could build on it:

  • cascadeFail / cascadeCancel options to propagate failure to dependents
  • Cycle detection at insert time

@timgit what do you think about this? If you are OK with this approach, I would be happy to implement it.
