Skip to content

fix(daemon): surface OpenCode error frames + treat empty-output runs as failed#700

Merged
mrcfps merged 2 commits intonexu-io:mainfrom
Sid-Qin:fix/opencode-error-frames-surface-691
May 6, 2026
Merged

fix(daemon): surface OpenCode error frames + treat empty-output runs as failed#700
mrcfps merged 2 commits intonexu-io:mainfrom
Sid-Qin:fix/opencode-error-frames-surface-691

Conversation

@Sid-Qin
Copy link
Copy Markdown
Contributor

@Sid-Qin Sid-Qin commented May 6, 2026

Summary

Closes #691. OpenCode runs would silently complete in ~3 seconds without producing any visible chat output and still be rendered as a successful turn — three independent bugs along the structured-stream path conspired to produce this silent-failure shape. Fix lands all three at the right layer with full live-traffic validation.

PR description == post-mortem for whoever audits the next stream adapter that hits the same shape.

cc @lefarcen — your diagnosis on #691 cited the exact file:line for two of the three bugs; this implements that plus the empty-output guard you flagged in the third bullet.

Root cause (three independent bugs)

Bug 1 — `apps/daemon/src/json-event-stream.ts:85-91`

OpenCode emits structured error frames on stdout (provider auth failures, network errors, schema mismatches) and still exits 0. The parser was downgrading these to `{type: 'raw', line: ...}`, which the chat UI does not render as an assistant message. The error string was effectively discarded.

Fix: emit a proper `{type: 'error', message, raw}` event matching the qoder-stream contract that the daemon's existing error-handling path already recognises (server.ts:4156-4168).

Bug 2 — `apps/daemon/src/server.ts:4199-4205`

Even with Bug 1 fixed, the json-event-stream branch wired the parser to a bare `(ev) => send('agent', ev)` lambda — bypassing the `sendAgentEvent` wrapper that interprets `type:'error'` and sets the `agentStreamError` flag the close handler reads to flip the run to `failed`. So an emitted `error` event would just be forwarded as a no-op `agent` SSE event with no lifecycle effect.

Fix: route json-event-stream through `sendAgentEvent`, mirroring the qoder-stream-json wiring at line 4175.

Bug 3 — `apps/daemon/src/server.ts:4220-4234` (the empty-output case)

Even with Bugs 1 and 2 fixed, there's still a class of runs where the upstream emits no error frame, no substantive event, and exits 0. Pre-fix this was marked `succeeded` and the user saw a blank chat with no diagnostic.

Fix: track `agentProducedOutput` inside `sendAgentEvent` (set on `text_delta`, `thinking_delta`, `tool_use`, `tool_result`, `artifact` — deliberately NOT on `status` / `usage`, since a model can emit token-usage numbers for an empty completion). When the close handler sees `code === 0 && trackingSubstantiveOutput && !agentProducedOutput`, the run is marked `failed` with an explicit `AGENT_EXECUTION_FAILED` SSE error so the chat shows a clear reason instead of an empty success.

The check is gated by `trackingSubstantiveOutput` so it only fires on streams that actually contribute to the output flag — currently `qoder-stream-json` and `json-event-stream`. ACP sessions and plain stdout streams keep their existing success/failure determination.

Live verification

Verified end-to-end via a stub `opencode` binary against `pnpm tools-dev run web` for all three failure shapes:

Scenario Pre-fix Post-fix
Stub emits `{"type":"error",...}` + exit 0 `succeeded`, empty chat `failed`, OpenCode error message surfaced as SSE `error` event
Stub emits nothing + exit 0 `succeeded`, empty chat `failed`, "Agent completed without producing any output…" diagnostic
Stub emits normal `step_start` / `text` / `step_finish` + exit 0 `succeeded` `succeeded` (regression check, no behaviour change)

The exit code is propagated correctly in all three cases (the actual code from the child process, not a placeholder).

Test plan

  • Unit tests: 3 new cases in `apps/daemon/tests/json-event-stream.test.ts` pin the OpenCode error event shape — full repro (`error.data.message`), `error.name` fallback, and the generic-fallback shape when `error` is empty.
  • `pnpm --filter @open-design/daemon test` — 60 files / 851 tests pass.
  • `pnpm --filter @open-design/web test` — 42 files / 309 tests pass.
  • `pnpm typecheck` (full repo) clean after `pnpm --filter @open-design/daemon build` + `pnpm --filter @open-design/desktop build` to seed the workspace dist outputs.
  • Live e2e as documented above.

Out of scope (called out for the next person who audits this surface)

  • claude-stream-json / copilot-stream-json still wire to a bare `(ev) => send('agent', ev)` and don't currently parse `type:'error'` frames. If their CLIs ever start emitting structured error events the same pattern (route through `sendAgentEvent` + emit proper `type:'error'`) applies. Not in scope here because we have no evidence those CLIs do this today, and changing the wiring without a confirmed failure mode risks regressing currently-working flows.
  • ACP sessions (`pi-rpc`, `acp-json-rpc`) own their own success/failure determination via `acpSession?.hasFatalError()` and the empty-output guard explicitly skips them via `trackingSubstantiveOutput`.
  • Plain stdout streams have no event-level tracking, so the empty-output guard skips them too. Diagnosing a no-output plain-stream agent is a separate problem that needs different signals (probably stderr inspection or stdout-byte-count tracking).

…as failed

Closes nexu-io#691. OpenCode runs would silently complete in ~3 seconds without
producing any visible chat output and still be rendered as a successful
turn — three independent bugs along the structured-stream path conspired
to produce this silent-failure shape.

## Bug 1 — `apps/daemon/src/json-event-stream.ts:85-91`

OpenCode emits structured error frames on stdout (e.g. provider auth
failures, network errors, schema mismatches) and still exits 0. The
parser was downgrading these to `{type: 'raw', line: ...}`, which the
chat UI does not render as an assistant message. The error string was
discarded as "no-op output."

Fix: emit a proper `{type: 'error', message, raw}` event matching the
qoder-stream contract that the daemon's existing error-handling path
already recognises.

## Bug 2 — `apps/daemon/src/server.ts:4199-4205`

Even after Bug 1 was fixed, the json-event-stream branch wired the
parser to a bare `(ev) => send('agent', ev)` lambda — bypassing the
`sendAgentEvent` wrapper that interprets `type:'error'` events and
sets the `agentStreamError` flag the close handler reads to flip the
run to `failed`. So an emitted `error` event would just be forwarded
as a no-op `agent` SSE event with no lifecycle effect.

Fix: route json-event-stream through `sendAgentEvent`, mirroring the
qoder-stream-json wiring at line 4175.

## Bug 3 — `apps/daemon/src/server.ts:4220-4234`

Even after Bugs 1 and 2 are fixed, there's still a class of runs where
OpenCode never emits any error frame, never emits any substantive
event, and exits 0. Pre-fix this was marked `succeeded` and the user
saw a blank chat with no diagnostic.

Fix: track `agentProducedOutput` inside `sendAgentEvent` (set on
`text_delta`, `thinking_delta`, `tool_use`, `tool_result`, `artifact`
— deliberately NOT on `status` / `usage`, since a model can emit
token-usage numbers for an empty completion). When the close handler
sees `code === 0 && trackingSubstantiveOutput && !agentProducedOutput`
the run is marked `failed` with an explicit AGENT_EXECUTION_FAILED
SSE error so the chat shows a clear reason instead of a silent
empty turn.

The check is gated by `trackingSubstantiveOutput` so it only fires
on streams that actually contribute to the output flag (currently
qoder-stream-json and json-event-stream). ACP sessions and plain
stdout streams keep their existing success/failure determination.

## Tests

- 3 new unit tests in `apps/daemon/tests/json-event-stream.test.ts`
  pin the OpenCode error event shape: full repro
  (`error.data.message`), `error.name` fallback, and the
  generic-fallback shape when `error` is empty.
- All 60 daemon test files (851 tests) pass on `pnpm --filter
  @open-design/daemon test`. All 42 web test files (309 tests) pass
  on `pnpm --filter @open-design/web test`.
- Full repo `pnpm typecheck` clean.

## Live verification

Verified end-to-end via a stub `opencode` binary that mimics each of
the failure shapes against `pnpm tools-dev run web`:

1. Stub emits `{"type":"error",...}` then `exit 0` — run now ends as
   `failed` with the OpenCode error message surfaced as an SSE
   `error` event. Pre-fix this was `succeeded` with an empty chat.
2. Stub emits nothing then `exit 0` — run now ends as `failed` with
   "Agent completed without producing any output…" diagnostic.
   Pre-fix this was `succeeded` with an empty chat.
3. Stub emits a normal `step_start` / `text` / `step_finish` sequence
   then `exit 0` — run still succeeds. (Regression check.)

## Out of scope (mentioned for the next person)

- `claude-stream-json` and `copilot-stream-json` still wire to a bare
  `(ev) => send('agent', ev)` and don't currently parse `type:'error'`
  frames. If their CLIs ever start emitting structured error events
  the same pattern (route through `sendAgentEvent` + emit proper
  `type:'error'`) applies. Not in scope here because we have no
  evidence those CLIs do this today, and changing the wiring without
  a confirmed failure mode risks regressing currently-working flows.
- ACP sessions (`pi-rpc`, `acp-json-rpc`) own their own success /
  failure determination via `acpSession?.hasFatalError()` and the
  empty-output guard explicitly skips them via
  `trackingSubstantiveOutput`.
- Plain stdout streams have no event-level tracking, so the empty-
  output guard skips them too. Diagnosing a no-output plain-stream
  agent is a separate problem that needs different signals.
@lefarcen lefarcen added the bug-fix Fixes an existing bug label May 6, 2026
@lefarcen lefarcen self-requested a review May 6, 2026 14:56
@Sid-Qin
Copy link
Copy Markdown
Contributor Author

Sid-Qin commented May 6, 2026

CI failure is pre-existing on `main` HEAD, not from this PR.

The failing test is `e2e/tests/localized-content.test.ts` complaining that `de` / `ru` / `fr` locale content covers 67 skills but the canonical list has 68:

```
AssertionError: skills display copy: expected [ 'audio-jingle', 'blog-post', …(67) ]
to deeply equal [ 'audio-jingle', 'blog-post', …(68) ]
```

I rebased onto upstream/main HEAD (`9af2886`) and ran the same test locally — it fails identically on a clean main checkout, no diff applied. The 68th skill is `social-media-dashboard` from PR #678 (`42e4d08 feat(skills): add social-media-dashboard skill + Totality Festival design system`), which added the skill to the canonical list but didn't backfill the three localized content files.

Confirmed scope of this PR is unaffected:

  • This PR's changes are entirely in `apps/daemon/src/json-event-stream.ts`, `apps/daemon/src/server.ts`, and `apps/daemon/tests/json-event-stream.test.ts`.
  • `pnpm --filter @open-design/daemon test` — 60 files / 851 tests pass on this branch (incl. the 3 new ones for Fix OpenCode runs completing in seconds without producing any output #691).
  • `pnpm --filter @open-design/web test` — 42 files / 309 tests pass.
  • `pnpm typecheck` (full repo) clean.

The localized-content gap should be backfilled in a separate PR (or rolled into #678's follow-up). Happy to spin one up if helpful — just say the word.

Copy link
Copy Markdown
Contributor

@lefarcen lefarcen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Sid-Qin! 👋

This is a really solid fix — three independent bugs along the stream path, each one analyzed at the right layer with clear post-mortem documentation. The empty-output guard is particularly clever: gating it with trackingSubstantiveOutput keeps ACP/plain-stdout streams from false-positiving, while the SUBSTANTIVE_AGENT_EVENT_TYPES set correctly distinguishes "model emitted nothing" from "model emitted lifecycle noise."

What I like:

  • The three-level error fallback (error.data.messageerror.name → generic) is thorough
  • Routing json-event-stream through sendAgentEvent mirrors the qoder-stream pattern — consistent architecture
  • Test coverage pins the error shape regressions
  • The PR body doubles as a post-mortem for whoever audits the next stream adapter

One observation (non-blocking):
The empty-output guard path in server.ts:4256-4280 has no direct unit test — it's an integration scenario that needs a full daemon + child process setup to exercise. That's probably fine given the live e2e verification table in the PR body, but if you ever add daemon integration tests, this would be a good candidate.

No security / correctness issues found. The raw: stringifyContent(obj) in the error event is just the original JSON string OpenCode already emitted, and the Set lookup on SUBSTANTIVE_AGENT_EVENT_TYPES is negligible for stream volumes.

Looks good to me; deferring final approval to a maintainer. 🎉

@lefarcen
Copy link
Copy Markdown
Contributor

lefarcen commented May 6, 2026

Confirmed — your diagnosis is spot-on. The locale gap from PR #678 is independent of this fix, and your scope (daemon stream handling) is unaffected.

No need to block this PR on the locale backfill — that can land separately. If you want to spin up a quick fix for the // locale files, feel free, but it's not a prerequisite here.

The daemon fix itself is clean and ready for maintainer review once CI alignment is sorted (either via the locale backfill or temporarily skipping the failing test).

Copy link
Copy Markdown
Contributor

@mrcfps mrcfps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sid-Qin I reviewed the daemon stream handling changes for OpenCode error frames, the json-event-stream wiring through sendAgentEvent, the empty-output failure guard, and the new parser regression coverage. The implementation cleanly surfaces structured error frames and avoids the silent-success path while keeping ACP/plain stdout streams gated out; the changed ranges look safe and maintainable. Thanks for the careful post-mortem and live verification work here — this should make the failure mode much clearer for users. 🙂

Generated by Looper 0.6.1 · runner=reviewer · agent=opencode

@Sid-Qin
Copy link
Copy Markdown
Contributor Author

Sid-Qin commented May 6, 2026

@lefarcen — opened the backfill PR you suggested as #702. 6 lines across 3 files, all 6 localized-content.test.ts cases pass locally. Once it merges, this PR's CI should auto-go-green on rebase.

@Sid-Qin
Copy link
Copy Markdown
Contributor Author

Sid-Qin commented May 6, 2026

Thanks @mrcfps 🙏 — appreciate the close read on the gating logic. The `trackingSubstantiveOutput` flag was the part I went back and forth on the most; landing it as opt-in (qoder-stream-json + json-event-stream) felt safer than fanning out to every stream when only OpenCode has a confirmed need today, and the comment in the close handler should make it cheap for the next person to flip on for claude/copilot when those CLIs justify it.

#702 is the locale backfill that unblocks CI on rebase here. Either order works for landing — admin-merge this one now or wait for #702 → green CI on rebase, the daemon-side outcome is identical.

@mrcfps mrcfps merged commit 1bd1f3a into nexu-io:main May 6, 2026
1 check passed
@lefarcen
Copy link
Copy Markdown
Contributor

lefarcen commented May 6, 2026

Merged! 🎉 Thanks @Sid-Qin for the careful post-mortem and the three-layer fix — this closes a subtle failure mode that was hard to diagnose. The structured error surfacing + empty-output guard should make OpenCode runs much clearer for users going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix Fixes an existing bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix OpenCode runs completing in seconds without producing any output

3 participants