
feat(turn-detector): LLM-based turn completion detector#5431

Draft
piyush-gambhir wants to merge 14 commits into livekit:main from piyush-gambhir:feat/llm-turn-detector

Conversation

@piyush-gambhir
Contributor

@piyush-gambhir piyush-gambhir commented Apr 13, 2026

Summary

Adds LLMTurnDetector to livekit-plugins-turn-detector as an additional turn-detection option alongside the existing ONNX EOU models. It implements the existing _TurnDetector Protocol and plugs into the same turn_detection= slot — zero changes to voice core.

Inspired by Pipecat's Filter Incomplete Turns feature. Single-token 1 / 0 classifier today; richer multi-state prediction can follow if there's interest.

Why

Today livekit-plugins-turn-detector ships the ONNX EOU model (English + multilingual). Some users want to drive turn classification through their own LLM instead: they are already paying for LLM inference in the same loop, want semantic reasoning over the full conversation context, or want language coverage that matches their LLM rather than a fixed threshold table. This PR adds that as another option in the family, not a replacement.

Usage

from livekit.agents import AgentSession
from livekit.plugins import openai
from livekit.plugins.turn_detector import LLMTurnDetector

session = AgentSession(
    ...
    turn_detection=LLMTurnDetector(llm=openai.LLM(model="gpt-4o-mini")),
)

Design highlights

  • Implements the existing _TurnDetector Protocol (livekit-agents/livekit/agents/voice/turn.py) — no core changes.
  • Single-token 1 / 0 classifier prompt; parser only reads the first non-whitespace char so provider-specific formatting differences are tolerated.
  • 1.5s default timeout via asyncio.wait_for; on timeout returns a neutral 0.5 probability so endpointing falls back to default delay.
  • Never raises — timeouts, LLM exceptions, and malformed responses all return 0.5. Empty chat context short-circuits to 1.0.
  • Custom instructions kwarg lets users replace the default prompt (e.g. for non-English or domain-specific deployments).
  • No new runtime dependencies — livekit.agents.llm is already a peer dep of the plugin.
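The parsing and failure-handling behavior above can be sketched as a standalone coroutine. This is an illustrative sketch, not the plugin's actual internals (`parse_completion`, `predict_end_of_turn`, and `ask_llm` are hypothetical names): read only the first non-whitespace character, map `1`/`0` to high/low probabilities, and fall back to a neutral 0.5 on timeout, exception, or garbage.

```python
import asyncio

NEUTRAL = 0.5  # fallback: endpointing falls back to its default delay


def parse_completion(raw: str) -> float:
    """Map the classifier's raw reply to a probability, tolerating formatting noise."""
    stripped = raw.strip()
    if not stripped:
        return NEUTRAL  # empty response
    first = stripped[0]  # only the first non-whitespace char is inspected
    if first == "1":
        return 0.95  # turn looks complete
    if first == "0":
        return 0.05  # turn looks incomplete
    return NEUTRAL  # malformed response


async def predict_end_of_turn(ask_llm, timeout: float = 1.5) -> float:
    """ask_llm: any coroutine function returning the classifier's raw text."""
    try:
        raw = await asyncio.wait_for(ask_llm(), timeout)
    except Exception:
        return NEUTRAL  # never raises: timeouts and LLM errors degrade gracefully
    return parse_completion(raw)
```

The neutral 0.5 fallback means a slow or failing LLM never blocks the pipeline; the session simply behaves as if no turn detector were attached for that utterance.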

Configuration

| Argument | Default | Purpose |
| --- | --- | --- |
| `llm` | required | Any `livekit.agents.llm.LLM` instance |
| `instructions` | `None` | Override the default classifier prompt |
| `unlikely_threshold` | `0.5` | Probability below which endpointing treats the turn as likely incomplete |
| `timeout` | `1.5` (seconds) | Hard cap on the classifier call |
| `max_history_turns` | `6` | How many trailing chat messages are included in the prompt |
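Put together, a fully specified construction might look like the following. This is a sketch using the argument names and defaults from the table above; the model choice is just an example, and the surrounding `AgentSession` setup is elided as in the usage snippet.

```python
from livekit.agents import AgentSession
from livekit.plugins import openai
from livekit.plugins.turn_detector import LLMTurnDetector

detector = LLMTurnDetector(
    llm=openai.LLM(model="gpt-4o-mini"),  # any livekit.agents.llm.LLM instance
    instructions=None,                    # keep the default classifier prompt
    unlikely_threshold=0.5,               # below this, the turn is treated as likely incomplete
    timeout=1.5,                          # seconds; hard cap on the classifier call
    max_history_turns=6,                  # trailing chat messages included in the prompt
)

session = AgentSession(
    ...
    turn_detection=detector,
)
```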

Test plan

  • 14 unit tests in livekit-plugins/livekit-plugins-turn-detector/tests/test_llm_turn_detector.py — all passing:
    • Property and language-helper methods.
    • Happy-path classification ("1" / "0" -> 0.95 / 0.05).
    • Whitespace tolerance, garbage input, empty response — all return 0.5.
    • Timeout returns 0.5 within the configured cap.
    • LLM exceptions swallowed.
    • Empty / assistant-only chat context short-circuits to 1.0.
    • max_history_turns slicing.
    • Custom instructions override.
  • make lint, make type-check, make format all clean.
  • Manual smoke test with examples/voice_agents/llm_turn_detector.py.

Non-goals (future work, separate PR)

  • Multi-state prediction (extending _TurnDetector Protocol with short/long incomplete signals).
  • Re-engagement prompts (agent says something like "take your time" after an incomplete turn).

Happy to iterate on defaults (timeout, threshold, prompt wording) or placement if maintainers have a different preference.
