feat: durable background execution + HITL suspend/resume schema #13570

Closed
Cristhianzl wants to merge 13 commits into release-1.11.0 from cz/human-in-the-loop

@Cristhianzl

@Cristhianzl Cristhianzl commented Jun 9, 2026

Member

Objective

Open the human-in-the-loop groundwork (LE-1437) on release-1.11.0: a durable
background-execution substrate whose jobs can later suspend for human input and
resume without losing state.

ogabrielluiz and others added 10 commits June 1, 2026 15:39
Rebased onto release-1.10.0. The base independently rebuilt the v2
workflows backend (RBAC, body globals, share-aware fetch); keep our
forward design and conform its auth to that work:

1. Auth: keep get_current_user_for_workflow (session-or-API-key authN
   that does not hold a DB connection during the inline run, avoiding
   the SQLite lock contention api_key_security would cause) and enforce
   the base's RBAC on top: ensure_flow_permission(EXECUTE) before run,
   (READ) before status reconstruct, with widen_for_shares fetch.
2. Port the base's request-body globals onto the v2 WorkflowRunRequest.
   The X-LANGFLOW-GLOBAL-VAR-* headers stay supported (the Responses API
   passes globals that way); body globals win on conflict. Converters
   echo the effective globals via effective_globals.
3. Public endpoint keeps the v1 build_public_tmp posture
   (access_type==PUBLIC, run-as-owner); RBAC applies to the
   authenticated endpoint only.
4. Preserve the base's post-build KB-cache invalidation in the AG-UI
   build path.

The endpoint, AG-UI bridge, pluggable stream adapters, public endpoint,
and re-attach are unchanged.
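
The precedence rule in item 2 (header globals stay supported, body globals win on conflict) can be sketched as follows. This is a minimal illustration, not the PR's code: `merge_globals` is a hypothetical name, and both the upper-casing of the variable name and the plain-dict body shape are assumptions.

```python
from __future__ import annotations

# Header prefix named in the commit message, lower-cased for matching.
HEADER_PREFIX = "x-langflow-global-var-"


def merge_globals(headers: dict[str, str], body_globals: dict[str, str] | None) -> dict[str, str]:
    """Combine header-supplied and body-supplied globals; body wins on conflict."""
    effective: dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered.startswith(HEADER_PREFIX):
            # Assumption: the variable name is the header suffix, upper-cased.
            effective[lowered[len(HEADER_PREFIX):].upper()] = value
    # Body globals are applied last, so they override header values.
    effective.update(body_globals or {})
    return effective
```

The returned dict plays the role of the "effective globals" that the converters echo back; unrelated headers such as `Content-Type` are ignored.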

The synchronous /api/v2/workflows response keyed every result under its
component id, so reading the answer meant knowing an id you can't predict.
Surface two additive fields:

- output_text: the flow's single text answer (ChatOutput/TextOutput). None
  when the flow has zero or multiple text outputs, so callers read outputs
  rather than the shortcut guessing which channel is the answer.
- session_id: echoes the resolved session so chat/memory callers can
  continue the same thread (v1 /run returned this; v2 had dropped it).

outputs is unchanged, so this is non-breaking.

…ponse

Pin the sync-response shortcuts on the v2 workflows endpoint:
- output_text surfaces the lone ChatOutput/TextOutput text and stays None for
  non-output message nodes, data-only flows, and multi-text flows
- session_id echoes the resolved session; the error response exposes neither
- each outputs entry exposes only {type, status, content, metadata}, with the
  component id carried by the dict key

Also drop the component_id kwarg the converter passed to ComponentOutput, which
has no such field and silently dropped it.

Replace the flat output_text shortcut with an `output` object carrying the
resolved text answer plus a `reason` that explains why it resolved that way
(single/multiple/none/non_string/failed), so a null answer is always
diagnosable instead of silently None. `reason` follows the LLM-domain
finish_reason/stop_reason convention, distinct from the lifecycle status.

Also add `display_name` to each ComponentOutput (the stable component id
stays the dict key) and a computed `has_errors` flag derived from errors.
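
A rough sketch of how an `output` object with a `reason` might be resolved. The five reason values come from the commit message; the `Candidate` and `ResolvedOutput` types, the function name, and the exact rule mapping each situation to a reason are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Candidate:
    """Hypothetical stand-in for one terminal text output (ChatOutput/TextOutput)."""
    component_id: str
    content: Any
    failed: bool = False


@dataclass
class ResolvedOutput:
    text: str | None
    reason: str  # one of: single, multiple, none, non_string, failed


def resolve_output(candidates: list[Candidate]) -> ResolvedOutput:
    """Resolve the answer text and always say why it resolved that way."""
    if not candidates:
        return ResolvedOutput(None, "none")
    if len(candidates) > 1:
        return ResolvedOutput(None, "multiple")
    only = candidates[0]
    if only.failed:
        return ResolvedOutput(None, "failed")
    if not isinstance(only.content, str):
        return ResolvedOutput(None, "non_string")
    return ResolvedOutput(only.content, "single")
```

Under these assumptions a null `text` never arrives without a `reason`, which is the diagnosability property the commit describes.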

Let a sync caller name the output(s) they want via output_ids so
output.text resolves deterministically (reason=single) on multi-output
flows instead of going null. Selection is steer-only: it picks the
answer among the named outputs without filtering the outputs map.

Invalid ids are rejected with 422 before the flow runs (and before any
job row is created), so a typo costs no compute. Resolution considers
selected outputs that actually fired, so branching flows resolve to
whichever candidate ran.
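
The two behaviours described above, up-front validation and resolution over the outputs that actually fired, might look roughly like this. All names here are hypothetical, and the mapping of the exception to HTTP 422 is left to whatever web layer is in use.

```python
from __future__ import annotations


class InvalidOutputIds(ValueError):
    """Hypothetical error that an API layer could translate into a 422 response."""


def validate_output_ids(requested: list[str], known_ids: set[str]) -> None:
    """Reject unknown ids before any flow run or job row is created."""
    unknown = [i for i in requested if i not in known_ids]
    if unknown:
        raise InvalidOutputIds(f"unknown output ids: {unknown}")


def pick_answer(requested: list[str], fired: dict[str, str]) -> tuple[str | None, str]:
    """Return (text, reason) among the requested outputs that actually fired.

    `fired` maps component id to text for outputs that ran. The full outputs
    map is untouched, since selection only steers the answer.
    """
    hits = [fired[i] for i in requested if i in fired]
    if len(hits) == 1:
        return hits[0], "single"
    return None, ("multiple" if hits else "none")
```

On a branching flow, naming both branch outputs still yields `reason="single"` as long as only one branch ran.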

Give v2-workflows sync and the langflow stream protocol one parser. The
stream now emits a normalized "output" event per terminal output carrying
an OutputEvent (the ComponentOutput shape sync returns in outputs[id], plus
component_id). A shared build_component_output() backs both the sync
converter and the adapter, and the build loop ships authoritative vertex
metadata as an additive output_meta key on end_vertex (existing consumers
read build_data and ignore it).

This is access-pattern parity (one parser, same fields, same terminal set),
not byte-identical content: the stream reuses the v1 build path whose
display serialization differs from sync's run_graph output.
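
The "one parser" idea can be illustrated with a shared builder that both paths call. The field set {type, status, content, metadata} and the name `build_component_output` are taken from the commit messages; the signature, the dict-based representation, and the two helper functions are assumptions for the sake of a short sketch.

```python
from __future__ import annotations

from typing import Any


def build_component_output(kind: str, status: str, content: Any,
                           metadata: dict[str, Any] | None = None) -> dict[str, Any]:
    """Single place that shapes one terminal output (assumed signature)."""
    return {"type": kind, "status": status, "content": content, "metadata": metadata or {}}


def sync_outputs(results: dict[str, dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Sync path: the component id is carried by the dict key."""
    return {cid: build_component_output(**fields) for cid, fields in results.items()}


def output_event(component_id: str, fields: dict[str, Any]) -> dict[str, Any]:
    """Stream path: same shape, plus component_id inside the payload."""
    return {**build_component_output(**fields), "component_id": component_id}
```

Because both helpers go through the same builder, a client can read a stream event and a sync `outputs[id]` entry with the same field names, which is the access-pattern parity the commit claims (not identical content).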

…ackend)

Turns v2 mode:background into a durable, in-API background execution service behind a BackgroundExecutionService facade. Adds the store layer (result/error columns, job_events durable milestone log, execution_signals control, heartbeat/lease, 3 migrations), the default backend (bounded executor, runner, in-memory live bus, liveness-aware single-flight orphan sweep), the v2 endpoint rewiring, and the real-instance test harness. Needs no new infra; works on the SQLite single-process install. The redis-scaled worker backend is stacked on top in a follow-up PR.
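
To make the heartbeat/lease and single-flight orphan sweep more concrete, here is a small in-memory sketch. It is not the PR's implementation: the `Job` type, the status strings, and the use of a process-local lock are assumptions, and a plain list stands in for the job store.

```python
from __future__ import annotations

import threading
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    status: str            # hypothetical values: "running", "completed", "failed"
    last_heartbeat: float  # seconds, on the same clock as `now`


# Single-flight guard: at most one sweep runs at a time in this process.
_sweep_lock = threading.Lock()


def sweep_orphans(jobs: list[Job], now: float, lease_seconds: float) -> list[str]:
    """Fail running jobs whose heartbeat lease has expired; return their ids."""
    if not _sweep_lock.acquire(blocking=False):
        return []  # another sweep is already in flight, so skip this one
    try:
        orphaned: list[str] = []
        for job in jobs:
            # Liveness check: a running job must have heartbeated within the lease.
            if job.status == "running" and now - job.last_heartbeat > lease_seconds:
                job.status = "failed"
                orphaned.append(job.job_id)
        return orphaned
    finally:
        _sweep_lock.release()
```

A job that keeps heartbeating is never swept, so a slow but live run is not mistaken for an orphan left behind by a crashed process.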

The hard_proof marker name was a vibe word that said nothing about what the
tests need. Rename it to real_services everywhere: the pytest marker
registration, the *_hard_proof.py test files, the Makefile target
(real_services_tests), the -m selector in migration-validation.yml, and the
CI job. real_services says what these tests require: real Postgres + Redis +
worker subprocesses. (integration was already taken for the external-API
suite under tests/integration.)

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 724282d0-2e32-49a4-8220-a4f5f5d99ce0

@github-actions github-actions Bot added the enhancement New feature or request label Jun 9, 2026
@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Migration Validation Passed

All migrations follow the Expand-Contract pattern correctly.

@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Jun 9, 2026
@Cristhianzl Cristhianzl changed the base branch from release-1.10.0 to release-1.11.0 June 10, 2026 00:02
@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Jun 10, 2026
@Cristhianzl Cristhianzl reopened this Jun 10, 2026
@Cristhianzl Cristhianzl changed the title feat: human in the loop feat: durable background execution + HITL suspend/resume schema Jun 10, 2026
@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Jun 10, 2026
@github-actions

github-actions Bot commented Jun 10, 2026

Contributor

Build successful! ✅
Deploying docs draft.
Deploy successful! View draft

Labels

enhancement New feature or request

2 participants