Steering

Steering allows injecting messages into an already-running agent loop, interrupting it between tool calls without waiting for the entire cycle to complete.

How it works

When the agent is executing a sequence of tool calls (e.g. the model requested 3 tools in a single turn), steering checks the queue after each tool completes. If it finds queued messages:

  1. The remaining tools are skipped and receive "Skipped due to queued user message." as their result
  2. The steering messages are injected into the conversation context
  3. The model is called again with the updated context, including the user's steering message

User ──► Steer("change approach")
                │
Agent Loop      ▼
  ├─ tool[0] ✔  (executed)
  ├─ [polling] → steering found!
  ├─ tool[1] ✘  (skipped)
  ├─ tool[2] ✘  (skipped)
  └─ new LLM turn with steering message
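The interruption steps above can be sketched as a small loop. This is a minimal illustration rather than the project's implementation: runBatch, the tool signature, and the poll callback are hypothetical names chosen for the example. Only the skipped-result text is taken from this document.

```go
package main

import "fmt"

// skippedResult is the text this document says skipped tools receive.
const skippedResult = "Skipped due to queued user message."

// runBatch runs tools in order and polls for steering after each one.
// The tool and poll signatures are hypothetical, chosen for illustration.
func runBatch(tools []func() string, poll func() []string) (results, steering []string) {
	for i, tool := range tools {
		results = append(results, tool())
		if steering = poll(); len(steering) > 0 {
			// Record a skipped result for every remaining tool instead of running it.
			for range tools[i+1:] {
				results = append(results, skippedResult)
			}
			return results, steering
		}
	}
	return results, nil
}

func main() {
	calls := 0
	tool := func() string { calls++; return fmt.Sprintf("result %d", calls) }
	queue := []string{"change approach"}
	poll := func() []string { // hands over the whole queue on the first poll
		out := queue
		queue = nil
		return out
	}
	results, steering := runBatch([]func() string{tool, tool, tool}, poll)
	fmt.Println("tools executed:", calls)
	fmt.Println("results:", results)
	fmt.Println("steering:", steering)
}
```

In this sketch only the first of the three tools runs. The other two get the skipped result, and the caller would start a new LLM turn with the returned steering messages appended to the context.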

Scoped queues

Steering queues are isolated per resolved session scope rather than stored in a single global queue.

  • The active turn writes and reads from its own scope key (usually the routed session key such as agent:<agent_id>:...)
  • Steer() still works outside an active turn through a legacy fallback queue
  • Continue() first dequeues messages for the requested session scope, then falls back to the legacy queue for backwards compatibility

This prevents a message arriving from another chat, DM peer, or routed agent session from being injected into the wrong conversation.
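The lookup order described above (requested scope first, then the legacy queue) might look like the following sketch. The scopedQueues type, the empty-string legacy key, and the example session keys are assumptions made for illustration. The sketch also omits the locking a real queue would need.

```go
package main

import "fmt"

// legacyScope is a hypothetical key for the fallback queue.
const legacyScope = ""

// scopedQueues keeps one FIFO of message texts per session scope.
// It is not safe for concurrent use; locking is left out for brevity.
type scopedQueues map[string][]string

func (q scopedQueues) enqueue(scope, msg string) {
	q[scope] = append(q[scope], msg)
}

// dequeue pops the oldest message for scope and falls back to the
// legacy queue when that scope has nothing queued.
func (q scopedQueues) dequeue(scope string) (string, bool) {
	for _, key := range []string{scope, legacyScope} {
		if msgs := q[key]; len(msgs) > 0 {
			q[key] = msgs[1:]
			return msgs[0], true
		}
	}
	return "", false
}

func main() {
	q := scopedQueues{}
	q.enqueue("agent:a:chat1", "for chat1")
	q.enqueue(legacyScope, "from an older integration")

	// chat2 has no scoped messages, so it receives the legacy one
	// and never sees the message queued for chat1.
	fmt.Println(q.dequeue("agent:b:chat2"))
	fmt.Println(q.dequeue("agent:a:chat1"))
}
```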

Configuration

In config.json, under agents.defaults:

{
  "agents": {
    "defaults": {
      "steering_mode": "one-at-a-time"
    }
  }
}

Modes

| Value | Behavior |
|---|---|
| "one-at-a-time" (default) | Dequeues only one message per polling cycle. If there are 3 messages in the queue, they are processed one at a time across 3 successive iterations. |
| "all" | Drains the entire queue in a single poll. All pending messages are injected into the context together. |

The same setting can also be supplied through the environment variable PICOCLAW_AGENTS_DEFAULTS_STEERING_MODE.
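To make the difference between the two modes concrete, here is a small sketch of a single polling step. The poll function and its signature are hypothetical. Only the mode strings come from the configuration above, and the sketch treats any unrecognized mode like the default.

```go
package main

import "fmt"

// Mode strings match the config values; the function below is illustrative.
const (
	modeOneAtATime = "one-at-a-time"
	modeAll        = "all"
)

// poll returns the messages to inject in this cycle and the messages
// left in the queue. Any mode other than "all" behaves like the default.
func poll(queue []string, mode string) (injected, remaining []string) {
	if len(queue) == 0 {
		return nil, nil
	}
	if mode == modeAll {
		return queue, nil
	}
	return queue[:1], queue[1:]
}

func main() {
	queue := []string{"m1", "m2", "m3"}
	for cycle := 1; len(queue) > 0; cycle++ {
		var injected []string
		injected, queue = poll(queue, modeOneAtATime)
		fmt.Println("one-at-a-time, cycle", cycle, "injects", injected)
	}

	injected, _ := poll([]string{"m1", "m2", "m3"}, modeAll)
	fmt.Println("all, single cycle injects", injected)
}
```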

Go API

Steer — Send a steering message

err := agentLoop.Steer(providers.Message{
    Role:    "user",
    Content: "change direction, focus on X instead",
})
if err != nil {
    // Queue is full (MaxQueueSize=10) or not initialized
}

The message is enqueued in a thread-safe manner and is picked up at the next polling point (after the current tool finishes). Steer returns an error if the queue is full or not initialized.
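A bounded, mutex-guarded queue is one common way to get this behavior. The sketch below is an assumption about the general shape, not the actual code behind Steer. boundedQueue, push, and errQueueFull are hypothetical names, and only the capacity of 10 comes from this document.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// maxQueueSize mirrors the documented capacity (MaxQueueSize=10).
const maxQueueSize = 10

var errQueueFull = errors.New("steering queue is full")

// boundedQueue is a hypothetical stand-in for the real steering queue.
type boundedQueue struct {
	mu   sync.Mutex
	msgs []string
}

// push appends msg under the lock and rejects it once the queue is at capacity.
func (q *boundedQueue) push(msg string) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.msgs) >= maxQueueSize {
		return errQueueFull
	}
	q.msgs = append(q.msgs, msg)
	return nil
}

func main() {
	q := &boundedQueue{}
	var wg sync.WaitGroup
	rejected := make(chan error, 12)
	for i := 0; i < 12; i++ { // 12 concurrent senders compete for 10 slots
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			if err := q.push(fmt.Sprintf("msg %d", n)); err != nil {
				rejected <- err
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("queued:", len(q.msgs), "rejected:", len(rejected))
}
```

Which two senders are rejected depends on goroutine scheduling, but the queue never holds more than 10 messages.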

SteeringMode / SetSteeringMode

// Read the current mode
mode := agentLoop.SteeringMode() // SteeringOneAtATime | SteeringAll

// Change it at runtime
agentLoop.SetSteeringMode(agent.SteeringAll)

Continue — Resume an idle agent

When the agent is idle (it has finished processing and its last message was from the assistant), Continue checks if there are steering messages in the queue and uses them to start a new cycle:

response, err := agentLoop.Continue(ctx, sessionKey, channel, chatID)
if err != nil {
    // Error (e.g. "no default agent available")
}
if response == "" {
    // No steering messages in queue, the agent stays idle
}

Continue internally uses SkipInitialSteeringPoll: true to avoid double-dequeuing the same messages (since it already extracted them and passes them directly as input).

Continue also resolves the target agent from the provided session key, so agent-scoped sessions continue on the correct agent instead of always using the default one.
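As a rough illustration of resolving an agent from a session key, the sketch below parses keys of the form agent:<agent_id>:... mentioned under Scoped queues. The function name, the fallback argument, and the example keys are hypothetical. The real resolution logic is not shown in this document and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// agentFromSessionKey extracts <agent_id> from keys shaped like
// "agent:<agent_id>:...". It returns fallback for any other key.
func agentFromSessionKey(key, fallback string) string {
	parts := strings.SplitN(key, ":", 3)
	if len(parts) >= 2 && parts[0] == "agent" && parts[1] != "" {
		return parts[1]
	}
	return fallback
}

func main() {
	// Agent-scoped key: continues on the named agent.
	fmt.Println(agentFromSessionKey("agent:research:telegram:42", "default"))
	// Key without the agent prefix: uses the fallback agent.
	fmt.Println(agentFromSessionKey("telegram:42", "default"))
}
```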

Polling points in the loop

Steering is checked at the following points in the agent cycle:

  1. At loop start — before the first LLM call, to catch messages enqueued during setup
  2. After every tool completes — including the first and the last. If steering is found and there are remaining tools, they are all skipped immediately
  3. After a direct LLM response — if a new steering message arrived while the model was generating a non-tool response, the loop continues instead of returning a stale answer
  4. Right before the turn is finalized — if steering arrived at the very end of the turn, the agent immediately starts a continuation turn instead of leaving the message orphaned in the queue

Why remaining tools are skipped

When a steering message is detected, all remaining tools in the batch are skipped rather than executed. The alternative — let all tools finish and inject the steering message afterwards — was considered and rejected. Here is why.

Preventing unwanted side effects

Tools can have irreversible side effects. If the user says "no, wait" while the agent is mid-batch, executing the remaining tools means those side effects happen anyway:

| Tool batch | Steering message | With skip | Without skip |
|---|---|---|---|
| [web_search, send_email] | "don't send it" | Email not sent | Email sent, damage done |
| [query_db, write_file, spawn_agent] | "use another database" | Only the query runs | File written + subagent spawned, all wasted |
| [search₁, search₂, search₃, write_file] | user changes topic entirely | 1 search | 3 searches + file write, all irrelevant |

Avoiding wasted time

Tools that take seconds (web fetches, API calls, database queries) would all run to completion before the agent sees the user's correction. In a batch of 3 tools each taking 3-4 seconds, the agent would spend roughly 9-12 seconds before reacting, much of it on work that will be discarded.

With skipping, the agent reacts as soon as the current tool finishes — typically within a few seconds instead of waiting for the entire batch.

The LLM gets full context

Skipped tools receive an explicit error result ("Skipped due to queued user message."), so the model knows exactly which actions were not performed. It can then decide whether to re-execute them with the new context, or take a different path entirely.

Trade-off: sequential execution

Skipping requires tools to run sequentially (the previous implementation ran them in parallel). This introduces latency when the LLM requests multiple independent tools in a single turn. In practice, most batches contain 1-2 tools, so the impact is minimal compared to the benefit of being able to stop unwanted actions.

Skipped tool result format

When steering interrupts a batch, each tool that was not executed receives a tool result with:

Content: "Skipped due to queued user message."

This is saved to the session via AddFullMessage and sent to the model, so it is aware that some requested actions were not performed.

Full flow example

1. User: "search for info on X, write a file, and send me a message"

2. LLM responds with 3 tool calls: [web_search, write_file, message]

3. web_search is executed → result saved

4. [polling] → User called Steer("no, search for Y instead")

5. write_file is skipped → "Skipped due to queued user message."
   message is skipped    → "Skipped due to queued user message."

6. Message "search for Y instead" injected into context

7. LLM receives the full updated context and responds accordingly

Automatic bus drain

When the agent loop (Run()) starts processing a message, it spawns a background goroutine that keeps consuming new inbound messages from the bus. These messages are automatically redirected into the steering queue via Steer(). This means:

  • Users on any channel (Telegram, Discord, etc.) don't need to do anything special — their messages are automatically captured as steering when the agent is busy
  • Audio messages are transcribed before being steered, so the agent receives text. If transcription fails, the original (non-transcribed) message is steered as-is
  • Only messages that resolve to the same steering scope as the active turn are redirected. Messages for other chats/sessions are requeued onto the inbound bus so they can be processed normally
  • system inbound messages are not treated as steering input
  • When processMessage finishes, the drain goroutine is canceled and normal message consumption resumes
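The routing rules in this list can be sketched with a channel and a cancelable context. Everything here is illustrative: the inbound type, drain, demo, and the two callbacks are hypothetical names. Requeueing system messages is an assumption, since the document only says they are not steered. The demo runs drain synchronously for a deterministic result, whereas the description above has it running in a background goroutine.

```go
package main

import (
	"context"
	"fmt"
)

// inbound is a hypothetical bus message; scope is its resolved steering scope.
type inbound struct {
	scope, text string
	system      bool
}

// drain consumes bus messages until ctx is canceled or the bus is closed.
// Non-system messages in the active scope go to steer; all others go to
// requeue (sending system messages there is an assumption).
func drain(ctx context.Context, bus <-chan inbound, active string, steer, requeue func(inbound)) {
	for {
		select {
		case <-ctx.Done():
			return
		case msg, ok := <-bus:
			if !ok {
				return
			}
			if msg.scope == active && !msg.system {
				steer(msg)
			} else {
				requeue(msg)
			}
		}
	}
}

// demo feeds msgs through drain and reports where each message text ended up.
func demo(msgs []inbound, active string) (steered, requeued []string) {
	bus := make(chan inbound, len(msgs))
	for _, m := range msgs {
		bus <- m
	}
	close(bus)
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // stands in for the cancel issued when processMessage finishes
	drain(ctx, bus, active,
		func(m inbound) { steered = append(steered, m.text) },
		func(m inbound) { requeued = append(requeued, m.text) })
	return steered, requeued
}

func main() {
	steered, requeued := demo([]inbound{
		{scope: "chat1", text: "stop"},
		{scope: "chat2", text: "hello"},
		{scope: "chat1", text: "tick", system: true},
	}, "chat1")
	fmt.Println("steered:", steered)
	fmt.Println("requeued:", requeued)
}
```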

Steering with media

Steering messages can include Media refs, just like normal inbound user messages.

  • The original media:// refs are preserved in session history via AddFullMessage
  • Before the next provider call, steering messages go through the normal media resolution pipeline
  • Image refs are converted to data URLs for multimodal providers; non-image refs are resolved the same way as standard inbound media

This applies both to in-turn steering and to idle-session continuation through Continue().
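For readers unfamiliar with data URLs, the image conversion step amounts to base64-encoding the bytes behind a MIME-type prefix. The helper below is a generic sketch using Go's standard library. toDataURL is a hypothetical name, and how the project loads the bytes behind a media:// ref is not covered here.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// toDataURL encodes raw bytes as a base64 data URL (RFC 2397)
// with the given MIME type.
func toDataURL(mimeType string, data []byte) string {
	return "data:" + mimeType + ";base64," + base64.StdEncoding.EncodeToString(data)
}

func main() {
	// Placeholder bytes; a real caller would pass the resolved image content.
	fmt.Println(toDataURL("image/png", []byte("hi")))
	// prints: data:image/png;base64,aGk=
}
```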

Notes

  • Steering does not interrupt a tool that is currently executing. It waits for the current tool to finish, then checks the queue.
  • With one-at-a-time mode, if multiple messages are enqueued rapidly, they will be processed one per iteration. This gives the model the opportunity to react to each message individually.
  • With all mode, all pending messages are combined into a single injection. Useful when you want the agent to receive all the context at once.
  • The steering queue has a maximum capacity of 10 messages (MaxQueueSize). Steer() returns an error when the queue is full. In the bus drain path, the error is logged as a warning and the message is effectively dropped.
  • Manual Steer() calls made outside an active turn still go to the legacy fallback queue, so older integrations keep working.