Summary
TOC currently sets cache_control in only two places: the system prompt and a single conversation breakpoint (the second-to-last message). Adding more breakpoints would increase cache hit rates and reduce cost.
What OpenCode does
OpenCode applies cache_control: {type: "ephemeral"} to the first 2 system messages AND the last 2 non-system messages in the conversation. This creates multiple cache anchors that survive as the conversation grows.
Proposed implementation
In applyCacheBreakpoint() (native_runner.go):
- Keep the existing system prompt cache control
- Add cache control to the last 2 conversation messages (not just second-to-last)
- Clear stale breakpoints from older messages to avoid cache fragmentation
- Ensure this works correctly with OpenRouter's cache_control passthrough to Anthropic
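A minimal sketch of the steps above, not the actual code in native_runner.go. The Message and CacheControl types, the plural function name applyCacheBreakpoints, and the tail parameter are all hypothetical stand-ins for illustration. The sketch also attaches cache_control directly to the message for brevity; in the real request format the marker may live on a content block instead, so the OpenRouter passthrough still needs to be verified separately.

```go
package main

import "fmt"

// CacheControl mirrors the cache_control: {type: "ephemeral"} marker.
type CacheControl struct {
	Type string `json:"type"`
}

// Message is a hypothetical stand-in for the runner's message type.
type Message struct {
	Role         string        `json:"role"`
	Content      string        `json:"content"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

// applyCacheBreakpoints marks the last `tail` non-system messages as
// cache breakpoints and clears breakpoints on older non-system messages.
// System messages are left untouched, so an existing system prompt
// breakpoint is preserved.
func applyCacheBreakpoints(msgs []Message, tail int) {
	remaining := tail
	// Walk backwards so the newest messages claim the breakpoints first.
	for i := len(msgs) - 1; i >= 0; i-- {
		m := &msgs[i]
		switch {
		case m.Role == "system":
			// Keep whatever cache control the system prompt already has.
		case remaining > 0:
			m.CacheControl = &CacheControl{Type: "ephemeral"}
			remaining--
		default:
			// Stale breakpoint from an earlier turn: clear it.
			m.CacheControl = nil
		}
	}
}

func main() {
	msgs := []Message{
		{Role: "system", Content: "You are a coding agent.", CacheControl: &CacheControl{Type: "ephemeral"}},
		{Role: "user", Content: "List the files.", CacheControl: &CacheControl{Type: "ephemeral"}}, // stale breakpoint
		{Role: "assistant", Content: "main.go, go.mod"},
		{Role: "user", Content: "Open main.go."},
	}
	applyCacheBreakpoints(msgs, 2)
	for _, m := range msgs {
		fmt.Printf("%-9s cached=%t\n", m.Role, m.CacheControl != nil)
	}
}
```

With tail set to 2 and one system prompt breakpoint, a request carries three breakpoints in total, which stays within Anthropic's limit of four per request.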
Expected impact
More cache hits → fewer input tokens billed at full price → lower per-session cost, especially for long sessions with large system prompts.
Priority
P1 — direct cost reduction with minimal implementation effort
🤖 Generated with Claude Code