Listener: speaker filter, wake debounce, TTS lead-in, skip follow-up + Bugbot fixes by adityasingh2400 · Pull Request #1 · adityasingh2400/ziri

adityasingh2400 · 2026-03-19T19:04:17Z

Summary

Always-on listener: speaker-aware STT, fewer false wakes, cleaner playback, Spotify skip/follow-up UX, and Bugbot follow-ups. Orchestrator again indexes turns to Elasticsearch when configured.

Listener & STT

Pre-wake rolling buffer + ElevenLabs diarization → keep wake-word speaker only; strip wake phrases from routing.
Buffer sizing fix: deque maxlen matches ~30 ms downsampled mic chunks (not 80 ms wake-word frames) so the full pre-wake window is retained.
Local Whisper fallback: strip wake phrase when the same anchor + command clip was used for diarization (use_speaker_filter).
False wake reduction: higher default WAKE_WORD_THRESHOLD, consecutive-frame confirmation, cooldown (settings + listener).
TTS clarity: ~50 ms silent output lead-in before playback (audio_player) to avoid CoreAudio stream-start ramp clipping the first syllable.
Follow-up listening after MUSIC_SKIP_NO_NEXT (no second wake word); route hints for bare playlist names; optional follow-up without diarization (settings).
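The consecutive-frame confirmation and cooldown described above can be sketched roughly as follows. This is a minimal illustration, not the listener's actual code: the class name `WakeDebouncer`, its fields, and all default values are assumptions (the PR only says the default threshold was raised).

```python
from dataclasses import dataclass


@dataclass
class WakeDebouncer:
    """Fire a wake only after several consecutive high-score frames, then cool down."""

    threshold: float = 0.7        # assumed value; hypothetical default
    required_frames: int = 2      # assumed consecutive-frame count
    cooldown_secs: float = 2.0    # assumed cooldown length
    _streak: int = 0
    _last_wake: float = float("-inf")

    def update(self, score: float, now: float) -> bool:
        """Feed one wake-word score with its timestamp; return True when a wake fires."""
        if now - self._last_wake < self.cooldown_secs:
            self._streak = 0      # ignore detections while cooling down
            return False
        if score >= self.threshold:
            self._streak += 1
        else:
            self._streak = 0      # a single low frame resets the run
        if self._streak >= self.required_frames:
            self._streak = 0
            self._last_wake = now
            return True
        return False
```

A single spurious high-score frame never fires on its own here, and a confirmed wake suppresses further wakes until the cooldown has elapsed.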

Spotify & tools

Duck to 45% during interaction; skip uses track-id success / fast pause; unduck ramp.
Volume down bugfix: 40% floor applies only when current volume is already ≥ 40% (no “quieter” raising 35 → 40).
Hub: pass es_store + hybrid_searcher into ZiriOrchestrator so ES turn indexing runs again.
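One way to picture the unduck ramp is as a short list of intermediate volume targets between the ducked level and the original level. The helper below is a hypothetical sketch; `ramp_steps` and the step count are assumptions, and the PR does not say how the real ramp is computed.

```python
from typing import List


def ramp_steps(start: int, end: int, steps: int = 5) -> List[int]:
    """Return evenly spaced volume targets from just after `start` up to `end`."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [round(start + (end - start) * i / steps) for i in range(1, steps + 1)]


# Example: unduck from the 45% duck target back to an assumed original volume of 80%.
targets = ramp_steps(45, 80, steps=5)
```

Each target would then be sent to the player with a short delay between calls, so playback volume rises gradually instead of jumping.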

Routing & brain

Deterministic fixes: playlist vs play, shuffle phrasing, follow-up hints (brain, tests + routing_eval.jsonl).

Repo hygiene

scripts/macos/ LaunchAgent plist + README; sample audio under tests/fixtures/audio/; README project tree refresh.

Testing

pytest (or make test): routing, orchestrator, Spotify skip tests.
Manual: make start — wake stability, first syllable of TTS, skip/no-next follow-up, ELASTICSEARCH_URL set → verify indexing if desired.

- Rolling pre-wake-word buffer + ElevenLabs Scribe diarization; keep first speaker only to reduce cross-talk from concurrent speech
- Strip wake phrases from diarized/fallback transcripts (avoid Hey Jarvis as intent)
- Spotify duck target volume 45% for usable playback floor
- README: voice pipeline + how speaker filtering works; .env.example toggles
- Orchestrator/refactor, tests, and related app updates (local workspace)

cursor

Cursor Bugbot has reviewed your changes and found 4 potential issues.

cursor · 2026-03-19T19:10:04Z

+        pre_ww_samples = int(PRE_WAKEWORD_BUFFER_SECS * TARGET_RATE)
+        self._pre_wakeword_buffer: collections.deque[np.ndarray] = collections.deque(
+            maxlen=pre_ww_samples // WAKEWORD_CHUNK + 10
+        )


Pre-wake buffer retains too little audio

High Severity

_pre_wakeword_buffer capacity is sized with WAKEWORD_CHUNK, but appended chunks are ~30ms callback chunks. This under-allocates the rolling buffer, so configured pre-wake audio seconds cannot be retained and diarization loses wake-word anchoring.
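The arithmetic behind this finding can be checked with a small sketch. All constants below are assumed for illustration (16 kHz target rate, a 2 s pre-wake window, 480-sample callback chunks, 1280-sample wake-word frames), and `deque_maxlen` is a hypothetical helper, not a function from the repository.

```python
import math

# Assumed values, chosen to match the ~30 ms and 80 ms figures in the PR.
TARGET_RATE = 16_000             # Hz after downsampling
PRE_WAKEWORD_BUFFER_SECS = 2.0   # seconds of audio to retain before the wake word
MIC_CHUNK_SAMPLES = 480          # ~30 ms at 16 kHz: what the callback appends
WAKEWORD_CHUNK = 1280            # 80 ms at 16 kHz: what the detector consumes


def deque_maxlen(buffer_secs: float, rate: int, chunk_samples: int, slack: int = 10) -> int:
    """Number of chunks needed to hold `buffer_secs` of audio, plus some slack."""
    return math.ceil(buffer_secs * rate / chunk_samples) + slack


def retained_secs(maxlen: int, chunk_samples: int, rate: int) -> float:
    """Seconds of audio a full deque holds, given the size of the appended chunks."""
    return maxlen * chunk_samples / rate


# Sizing by wake-word frames while appending mic chunks under-allocates the deque.
buggy = int(PRE_WAKEWORD_BUFFER_SECS * TARGET_RATE) // WAKEWORD_CHUNK + 10
fixed = deque_maxlen(PRE_WAKEWORD_BUFFER_SECS, TARGET_RATE, MIC_CHUNK_SAMPLES)
```

Under these assumptions the original sizing holds 35 chunks (about 1.05 s of 30 ms audio), while sizing by the appended chunk length holds 77 chunks (about 2.31 s), which covers the configured 2 s window.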

cursor · 2026-03-19T19:10:04Z

            if self._whisper_model == "elevenlabs":
                logger.info("[DEBUG] ElevenLabs SDK not connected, trying REST API")
-                text = self._transcribe_elevenlabs(audio)
+                text = self._transcribe_elevenlabs(audio, use_diarization=use_speaker_filter)


Local fallback keeps wake phrase in command

Medium Severity

When speaker filtering prepends pre-wake audio, local Whisper fallback transcribes that combined clip but never calls _strip_wake_phrase. If ElevenLabs REST fails, wake phrases can be sent to intent routing.

Additional Locations (1)

app/core/listener.py#L862-L867
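A minimal sketch of the fix direction, assuming a single wake phrase and plain callables for the two transcription paths. `strip_wake_phrase` and `transcribe_with_fallback` here are illustrative stand-ins; the real `_strip_wake_phrase` in `listener.py` may behave differently.

```python
import re
from typing import Callable, Sequence

WAKE_PHRASES = ("hey jarvis",)  # assumed list; the PR mentions "Hey Jarvis" as an example


def strip_wake_phrase(text: str, phrases: Sequence[str] = WAKE_PHRASES) -> str:
    """Remove one leading wake phrase, plus adjacent punctuation, from a transcript."""
    for phrase in phrases:
        # Allow whitespace or commas between the words of the phrase.
        words = r"[\s,]+".join(re.escape(w) for w in phrase.split())
        pattern = r"^\s*" + words + r"[\s,.!?:;-]*"
        stripped, count = re.subn(pattern, "", text, count=1, flags=re.IGNORECASE)
        if count:
            return stripped.strip()
    return text.strip()


def transcribe_with_fallback(
    audio: bytes,
    primary: Callable[[bytes], str],
    fallback: Callable[[bytes], str],
    use_speaker_filter: bool,
) -> str:
    """Try the primary STT path, fall back on failure, and strip the wake phrase
    whenever the clip included pre-wake audio (use_speaker_filter=True)."""
    try:
        text = primary(audio)
    except Exception:
        text = fallback(audio)
    return strip_wake_phrase(text) if use_speaker_filter else text
```

The point of the second function is that stripping happens after either path returns, so a failure in the primary path cannot leak the wake phrase into intent routing.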

cursor · 2026-03-19T19:10:04Z

-            settings=settings,
-            memory=self.memory_store,
-            bedrock_client=self.brain._bedrock,
-        )


Orchestrator created without Elasticsearch dependencies

Medium Severity

ZiriOrchestrator is now instantiated without es_store/hybrid_searcher, so both stay None and Elasticsearch turn indexing paths never run.

Additional Locations (2)

app/core/orchestrator.py#L134-L143

app/core/orchestrator.py#L302-L311
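The failure mode is the usual one for optional dependencies: when the store is never passed in, the indexing branch is silently skipped. The sketch below uses hypothetical names (`TurnStore`, `InMemoryStore`, `Orchestrator`) and does not reflect the real `ZiriOrchestrator` signature.

```python
from typing import List, Optional, Protocol


class TurnStore(Protocol):
    """Anything that can index a conversation turn (hypothetical interface)."""

    def index_turn(self, turn: dict) -> None: ...


class InMemoryStore:
    """Stand-in for an Elasticsearch-backed store, for illustration only."""

    def __init__(self) -> None:
        self.turns: List[dict] = []

    def index_turn(self, turn: dict) -> None:
        self.turns.append(turn)


class Orchestrator:
    def __init__(self, es_store: Optional[TurnStore] = None) -> None:
        self.es_store = es_store

    def handle_turn(self, turn: dict) -> bool:
        """Index the turn if a store was injected; return whether indexing ran."""
        if self.es_store is None:
            return False  # dependency omitted at construction: indexing never runs
        self.es_store.index_turn(turn)
        return True
```

Constructing `Orchestrator()` without the store reproduces the reported behavior, while `Orchestrator(es_store=InMemoryStore())` restores indexing.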

cursor · 2026-03-19T19:10:04Z

+        new_vol = current + delta
+        if delta < 0:
+            new_vol = max(_SPOTIFY_VOLUME_FLOOR, new_vol)
+        else:


Volume-down may raise Spotify volume

Medium Severity

Negative volume adjustments clamp to a 40% floor unconditionally. If current Spotify volume is already below 40 (for example after spotify.set_volume to 35), saying quieter can increase volume up to 40%.

Additional Locations (1)

app/core/brain.py#L385-L391
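A sketch of the conditional floor, assuming volumes are integer percentages in 0..100. The function name `adjust_volume` and the final 0..100 clamp are assumptions added for illustration; only the 40% floor and the `_SPOTIFY_VOLUME_FLOOR` name come from the diff.

```python
_SPOTIFY_VOLUME_FLOOR = 40  # percent


def adjust_volume(current: int, delta: int, floor: int = _SPOTIFY_VOLUME_FLOOR) -> int:
    """Apply a relative volume change; the floor only limits decreases that
    start at or above it, so "quieter" can never raise the volume."""
    new_vol = current + delta
    if delta < 0 and current >= floor:
        new_vol = max(floor, new_vol)
    return max(0, min(100, new_vol))  # assumed hard limits
```

With this rule a request to go quieter from 35% yields 25%, whereas the unconditional clamp would have returned 40%.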

- Raise wake_word_threshold, consecutive frames + cooldown (settings + listener)
- TTS: ~50ms output lead-in for CoreAudio stream startup (audio_player)
- Spotify skip: track-id success, fast pause path, MUSIC_SKIP_NO_NEXT UX
- Follow-up listening + route hint; speaker filter tweaks; brain routing fixes
- Docs/env examples; routing tests + spotify skip test
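The ~50 ms TTS lead-in amounts to prepending silence to the PCM data before it reaches the output stream. The helper below is a sketch under assumptions: signed 16-bit PCM by default, and a hypothetical function name `with_lead_in`. The real `audio_player` may write the silence to the stream separately.

```python
def with_lead_in(
    pcm: bytes,
    sample_rate: int,
    channels: int = 1,
    sample_width: int = 2,   # bytes per sample; 2 = signed 16-bit PCM (assumed)
    lead_in_ms: int = 50,
) -> bytes:
    """Prepend `lead_in_ms` of digital silence so a stream-start ramp lands on
    silence rather than on the first syllable."""
    frames = sample_rate * lead_in_ms // 1000
    silence = b"\x00" * (frames * channels * sample_width)
    return silence + pcm
```

For signed PCM, zero-valued bytes are silence, so the lead-in adds latency without any audible content.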

- scripts/macos/: LaunchAgent plist + short README
- tests/fixtures/audio/sample.aiff (was test.aiff at root)
- README: root summary table, accurate app/tests layout
- .dockerignore: plist path under scripts/macos

…volume floor
- Size pre-wake deque by ~30ms downsampled mic chunks (not 80ms WW frames)
- Strip wake phrase after local Whisper / combined-clip paths when diarization path used
- Pass es_store + hybrid_searcher into ZiriOrchestrator (turn indexing restored)
- Volume down: apply 40% floor only when current >= floor (no boost from 35%)

cursor Bot reviewed Mar 19, 2026

adityasingh2400 added 2 commits March 19, 2026 12:46

chore: tidy repo root — move plist + sample audio, refresh README tree

09564a0

adityasingh2400 changed the title ~~Listener: ElevenLabs diarization speaker filter + Spotify duck 45% + docs~~ Listener: speaker filter, wake debounce, TTS lead-in, skip follow-up + Bugbot fixes Mar 19, 2026

adityasingh2400 merged commit c95f247 into master Mar 19, 2026
2 of 3 checks passed

adityasingh2400 merged 4 commits into master from feature/listener-speaker-filter-diarization
