Add multi-position prompt caching #134

@louismorgner

Summary

TOC currently sets cache_control in only two places: the system prompt and a single breakpoint on the second-to-last message. Adding more cache positions should increase cache hit rates and reduce cost.

What OpenCode does

OpenCode applies cache_control: {type: "ephemeral"} to the first 2 system messages AND the last 2 non-system messages in the conversation. This creates multiple cache anchors that survive as the conversation grows.
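As a rough illustration of the selection rule described above (not OpenCode's actual code), the sketch below marks the first 2 system messages and the last 2 non-system messages. The `Message` and `CacheControl` types and the `markOpenCodeStyle` name are hypothetical, simplified stand-ins.

```go
package main

// CacheControl mirrors the {type: "ephemeral"} marker.
type CacheControl struct {
	Type string `json:"type"`
}

// Message is a hypothetical, simplified chat message; a real client
// would carry more fields (content parts, tool calls, and so on).
type Message struct {
	Role         string
	Content      string
	CacheControl *CacheControl
}

// markOpenCodeStyle sets an ephemeral marker on the first two system
// messages and the last two non-system messages. It mutates msgs in
// place and returns the indices it marked.
func markOpenCodeStyle(msgs []Message) []int {
	var marked []int

	// Head anchors: the first two system messages, in order.
	systemSeen := 0
	for i := range msgs {
		if msgs[i].Role == "system" && systemSeen < 2 {
			msgs[i].CacheControl = &CacheControl{Type: "ephemeral"}
			marked = append(marked, i)
			systemSeen++
		}
	}

	// Tail anchors: the last two non-system messages, walking backwards.
	tailSeen := 0
	for i := len(msgs) - 1; i >= 0 && tailSeen < 2; i-- {
		if msgs[i].Role != "system" {
			msgs[i].CacheControl = &CacheControl{Type: "ephemeral"}
			marked = append(marked, i)
			tailSeen++
		}
	}
	return marked
}
```

With up to two head anchors and two tail anchors, this scheme uses at most four markers per request.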

Proposed implementation

In applyCacheBreakpoint() (native_runner.go):

  • Keep the existing system prompt cache control
  • Add cache control to the last 2 conversation messages (not just second-to-last)
  • Clear stale breakpoints from older messages to avoid cache fragmentation
  • Ensure this works correctly with OpenRouter's cache_control passthrough to Anthropic
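A minimal sketch of the first three steps above, assuming a simplified message shape. The real types and signature in native_runner.go are not shown in this issue, so `Message`, `CacheControl`, and `applyCacheBreakpoints` here are hypothetical stand-ins. The OpenRouter passthrough is not covered and would still need to be verified against a live request.

```go
package main

// CacheControl mirrors the {type: "ephemeral"} marker.
type CacheControl struct {
	Type string `json:"type"`
}

// Message is an assumed, simplified message type for illustration only.
type Message struct {
	Role         string
	Content      string
	CacheControl *CacheControl
}

// applyCacheBreakpoints is a hypothetical variant of applyCacheBreakpoint().
// It assumes the system prompt marker has already been set elsewhere.
func applyCacheBreakpoints(msgs []Message) {
	// Clear breakpoints left over from earlier turns. System messages
	// are skipped so the existing system prompt marker is kept as-is.
	for i := range msgs {
		if msgs[i].Role != "system" {
			msgs[i].CacheControl = nil
		}
	}

	// Walk backwards and mark the last two conversation messages.
	remaining := 2
	for i := len(msgs) - 1; i >= 0 && remaining > 0; i-- {
		if msgs[i].Role == "system" {
			continue
		}
		msgs[i].CacheControl = &CacheControl{Type: "ephemeral"}
		remaining--
	}
}
```

Clearing before marking keeps the total at three breakpoints (one system, two conversation) when there is a single system message, which stays within Anthropic's limit of four cache breakpoints per request.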

Expected impact

More cache hits → fewer input tokens billed at full price → lower per-session cost, especially for long sessions with large system prompts.

Priority

P1 — direct cost reduction with minimal implementation effort

🤖 Generated with Claude Code
