-
Notifications
You must be signed in to change notification settings - Fork 6
feat: implement dual-phase contextual transcription for dictation (GH-256) #259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…-256) Implements a two-phase transcription process for Universal Dictation mode to significantly improve accuracy while maintaining low-latency feedback. ## Phase 1 - Live Streaming (unchanged) Short audio segments transcribed in real-time for immediate text updates, maintaining the natural "live typing" experience. ## Phase 2 - Contextual Refinement (new) When dictation completes (via endpoint detection or field blur), all buffered audio is re-processed as a single unified segment. This provides full context to the transcription model, resulting in better accuracy, punctuation, and phrasing. ### Key Implementation Details **Audio Buffering (per-target)** - Added audioSegmentsByTarget to DictationContext for multi-field support - Segments stored with blob, frames, duration, sequence number, timestamp - Per-target tracking prevents race conditions with rapid form navigation - Memory-safe: buffers cleared after refinement or timeout (~5s max) **Endpoint Detection** - Reuses proven ConversationMachine logic (calculateDelay from TimerModule) - Uses pFinishedSpeaking + tempo from /transcribe API responses - 5 second max delay (vs 7s for conversation, accounts for p50 ~2s + network) - Triggers refinement automatically when user finishes speaking **Refinement Triggers** 1. Endpoint timer (after pFinishedSpeaking confidence threshold) 2. Field blur (user tabs/clicks away from field) 3. Manual stop (hangup/cancel button) 4. 10s timeout fallback (same as conversation mode) **Refinement Process** - Concatenates all audio segments for target element - Uploads unified blob with empty precedingTranscripts (context in audio) - No API changes needed - works with existing /transcribe endpoint - Detects response via sequence number tracking in activeRefinementSequences - Replaces all Phase 1 interim transcriptions with refined result **State Machine Changes** - New DictationContext fields: - audioSegmentsByTarget: Record<string, AudioSegment[]> - refinementPendingForTargets: Set<string> - activeRefinementSequences: Map<string, number> - timeLastTranscriptionReceived: number - New state: listening.converting.refining - New guard: refinementConditionsMet - New delay: refinementDelay (uses calculateDelay with pFinished + tempo) - New action: performContextualRefinement - Enhanced handleTranscriptionResponse to detect and process refinements **UI Integration** - Blur handler in UniversalDictationModule triggers refinement - Visual state (refining) available for future UI indicators - Graceful degradation: Phase 1 text preserved if Phase 2 fails ### Benefits - Significantly improved transcription accuracy and consistency - Seamless correction with minimal visual disruption - Maintains hands-free flow across multiple fields - Aligns Say Pi dictation quality with dedicated transcription tools - Zero breaking changes - purely additive enhancement ### Privacy & Memory - Audio buffers exist only in-memory during dictation session - Automatically cleared after refinement completion - No persistent storage or reuse of voice data - Max ~5 seconds of audio buffered per field Related: #256 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Fixes TypeScript linter errors for implicit 'any' types on EventBus
event data parameters in UniversalDictationModule.
- dictation:contentUpdated: { targetElement: HTMLElement }
- dictation:terminatedByManualEdit: { targetElement: HTMLElement; reason: string }
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Fixes additional TypeScript linter errors for implicit 'any' types: - Events with detail data (USER_STOPPED_SPEAKING, etc.): any type (Union type too complex; using any with runtime validation) - saypi:transcription:completed: Explicit type with all fields (text, sequenceNumber, pFinishedSpeaking, tempo, merged) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Eliminates duplicate type definition by: - Exporting DictationTranscribedEvent from DictationMachine - Importing and using Omit<DictationTranscribedEvent, 'type'> in UniversalDictationModule for transcription:completed listener This follows DRY principle and ensures type consistency across the codebase. The 'type' field is omitted since EventBus events don't include it in the payload (it's added when sending to the machine). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Critical fix: i18n-translate-keys.py now aborts when it encounters a locale
file with JSON syntax errors, instead of silently replacing it with an empty
structure.
Previous behavior (DANGEROUS):
- If messages.json had a syntax error, load_json_file() returned None
- Code converted None to {} and continued
- Script would overwrite the file with only the newly translated keys
- All existing translations in that locale would be permanently lost
New behavior (SAFE):
- If messages.json fails to parse, show clear error message
- Abort immediately with exit code 1
- User can fix JSON errors and retry safely
- No data loss
Reported by: Codex during code review
Changed MAX_AUDIO_BUFFER_DURATION_MS from implicit ~5s to explicit 120s to provide maximum context for Phase 2 contextual refinement. The previous short buffer undermined the value proposition - we need full dictation session audio (not just recent segments) to achieve significantly better accuracy. The 120s cap prevents unbounded memory growth if users leave hot-mic on. Implementation: - Added MAX_AUDIO_BUFFER_DURATION_MS = 120000 (2 minutes) - Automatic trimming of oldest segments when limit exceeded - Enhanced logging shows total buffered audio duration - Sliding window keeps most recent 120s of audio per field This aligns with the core goal: long-form context = better accuracy. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Fixes P1 code review issue where refinement transcriptions were being discarded due to missing sequence→element mapping. When uploading combined audio for Phase 2 refinement, we now properly map the returned sequence number to the target element via `context.transcriptionTargets[sequenceNum]`. This allows the refinement response handler to correctly identify and update the target field. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
feea380 to
59500eb
Compare
|
Fixed in commit 59500eb. Analysis: Changes:
This reuses the existing routing infrastructure ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a dual-phase transcription system for dictation to improve accuracy. Phase 1 provides real-time streaming transcription for immediate feedback, while Phase 2 performs contextual refinement by re-processing all buffered audio as a single segment when dictation completes. This approach maintains low-latency user experience while significantly improving final transcription quality.
Key Changes:
- Added audio buffering system that stores segments per target element with automatic cleanup
- Implemented endpoint detection using existing
calculateDelaylogic to trigger refinement automatically - Added refinement triggers for field blur, manual stop, and endpoint detection with state machine enhancements
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/state-machines/DictationMachine.ts | Core implementation including audio buffering infrastructure, refinement state/actions, endpoint detection delay calculation, and cleanup logic |
| src/UniversalDictationModule.ts | Integration of field blur handler to trigger refinement and type annotations for EventBus listeners |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Re: Copilot's performance suggestion (cleanup loop) I considered maintaining a separate Set/Array for sequence numbers to avoid
Happy to revisit if profiling shows this is a performance issue in practice. |
- Extract ellipsis normalization into shared normalizeTranscriptionText() function (DRY) - Improve comment explaining 5s vs 7s endpoint delay difference with rationale - Replace 'any' type with proper union type for event details (type safety) - Export event types from DictationMachine for reuse in UniversalDictationModule Performance nitpick (cleanup loop) evaluated but deferred - not a bottleneck given infrequent operation and small dataset size. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
✅ All Copilot code review feedback addressedChanges in commit ae3e7bd:
All substantive feedback addressed. Build passing ✅ |
PR-259 Review Comments (cheetah)Overall ReviewStatus: ❌ Request Changes The implementation of dual-phase contextual transcription is architecturally sound and follows existing patterns well. However, critical gaps exist in test coverage that violate the project's TDD requirements outlined in AGENTS.md. Critical Issues
Improvements Needed
Inline CommentsComment 1: Line 24 - Type ExportFile: Good change exporting this type for reuse. Consider adding JSDoc explaining that this is shared across modules for type consistency. Comment 2: Line 401 - Missing Test Coverage (CRITICAL)File: Excellent implementation of audio buffer management with automatic trimming. The 120s limit is well-chosen to prevent unbounded memory growth. Critical gap: No tests exist for this function or the refinement feature. According to project TDD requirements, all new functionality must have tests before merge. Action Required: Add tests for:
Comment 3: Line 520 - Silent Failure CaseFile: Good conditional buffering - only buffers when frames are available. However, this means refinement is silently skipped if Comment 4: Line 1446 - Type Safety IssueFile: Type safety issue: const targetElement = context.transcriptionTargets[sequenceNumber] as HTMLElement | undefined;Or add a dedicated refinement mapping structure that's type-safe. Comment 5: Line 1482 - Inconsistent Spacing LogicFile: Inconsistent spacing logic. This always adds a space between initial text and transcription, but doesn't check if either already has whitespace. Should use const finalText = smartJoinTwoTexts(initialText, transcription);Comment 6: Line 1982 - Potential Memory LeakFile: Potential memory leak: If refinement fails after this mapping (see catch block at line 1991), Fix: Add cleanup in catch block: .catch((error) => {
console.error("[DictationMachine] Refinement transcription failed:", error);
clearAudioForTarget(context, targetId);
delete context.transcriptionTargets[sequenceNum]; // Add this
});Comment 7: Line 2018 - Edge Case MissingFile: Good guard condition - clear and concise. However, missing edge case: What if refinement is triggered but no segments exist? Currently silently returns without state transition (line 1928). Should either:
Comment 8: Line 2031 - Magic NumberFile: Magic number makes testing difficult. Extract to constant: const REFINEMENT_MAX_DELAY_MS = 5000;
const DICTATION_P50_TRANSCRIPTION_TIME_MS = 2000;This improves testability and makes configuration easier. Comment 9: Test File - No Tests Exist (CRITICAL)File: CRITICAL: No tests exist for dual-phase refinement feature. Search confirms zero coverage: grep -i "refinement|refine|phase" test/state-machines/DictationMachine.spec.ts
# Returns: No matches foundRequired test coverage:
This violates project TDD requirements and is blocking merge. Test Strategy RecommendationsUnit Tests Neededdescribe('Dual-Phase Refinement', () => {
it('should buffer audio segments when frames are provided')
it('should trigger refinement after endpoint delay')
it('should concatenate audio segments correctly')
it('should replace Phase 1 transcriptions with refined result')
it('should handle refinement failure gracefully')
it('should respect audio buffer duration limit (120s)')
it('should clear audio buffers on target switch')
it('should calculate refinement delay based on pFinishedSpeaking and tempo')
it('should handle concurrent refinements for multiple targets')
it('should cleanup orphaned refinement mappings on error')
})Mock Strategy
Positive Observations
Checklist Before MergeMust Fix (Blocking)
Should Fix (Non-blocking)
Nice to Have
Estimated EffortTo address blocking issues: 2-3 hours Breakdown:
Final RecommendationThe implementation demonstrates solid architectural choices and integrates well with existing code. However, the complete absence of test coverage for such a critical feature represents unacceptable risk. Action: Add test coverage before merging. Once tests are added and all existing tests still pass, this PR will be ready for approval. |
**Code Quality Improvements:** - Use smartJoinTwoTexts for consistent whitespace handling in refinement - Extract REFINEMENT_MAX_DELAY_MS constant for better testability - Export event types (DictationSpeechStoppedEvent, etc.) for type safety **Test Coverage (CRITICAL):** Added comprehensive test suite (DictationMachine-Refinement.spec.ts) with 15 tests: - ✅ Audio segment buffering and 120s limit enforcement - ✅ Refinement delay calculation with pFinishedSpeaking/tempo - ✅ Audio concatenation for multi-segment refinement - ✅ Phase 1 → Phase 2 transcription replacement - ✅ State transitions (accumulating → refining → accumulating) - ✅ Buffer cleanup on refinement completion - ✅ Error scenarios (upload failure, missing target) - ✅ Concurrent refinements for multiple targets **Cheetah Review Analysis:** - Memory leak (Comment 6): FALSE POSITIVE - .then() never runs on upload failure - Type safety (Comment 4): Already handled with null check at line 1460 - No segments case (Comment 7): Already handled at lines 1934-1939 - Missing frames warning (Comment 3): REJECTED - creates unnecessary log noise All 15 tests passing ✅ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
✅ All Cheetah code review feedback addressedCommits: ae3e7bd (Copilot fixes) + a4e1d63 (Cheetah fixes + tests) Critical Issues - Resolved1. ✅ Test Coverage (CRITICAL - was blocking merge)Created: Test coverage includes:
Result: All 15 new tests pass ✅ 2. ❌ "Memory Leak" in Refinement Mapping (Comment 6)Cheetah's claim: If refinement fails, Critical analysis: FALSE POSITIVE Action: None required - no memory leak exists 3. ❌ Type Safety Issue (Comment 4)Cheetah's claim: Critical analysis: Already handled correctly Action: None required - type safety already ensured Improvements Made4. ✅ Inconsistent Spacing Logic (Comment 5)Fixed: Now uses 5. ✅ Magic Numbers (Comment 8)Fixed: Extracted Rejected Suggestions6. ❌ Warning Log When Frames Missing (Comment 3)Cheetah's claim: Should log warning when frames parameter missing Critical analysis: Rejected - creates unnecessary noise Action: Not implemented 7. ❌ Handle No Segments Case (Comment 7)Cheetah's claim: Missing edge case handling when refinement triggered with no segments Critical analysis: Already handled Action: None required - already handled correctly SummaryBlocking issues resolved:
Valid improvements implemented:
False positives identified:
Noise reduction:
All tests passing: 607/607 ✅ Ready for final review and merge. |
Summary of ChangesCommits on this PR
What Changed (Detailed Breakdown)Core Feature Implementation (24a9d5f)
Type Safety & Code Quality (e45fd7e, 6168c19, 7fa17a2)
Bug Fixes (33069fc, 59500eb)
Copilot Review Improvements (ae3e7bd)
Cheetah Review Improvements (a4e1d63)
What Didn't Change (And Why)Cheetah's "memory leak" concern (Comment 6): The suggested cleanup in the catch block is unnecessary because the mapping at line 1987 is inside the Cheetah's "type safety" concern (Comment 4): The null check at lines 1459-1466 already handles this correctly with early return and cleanup. Cheetah's "no segments" concern (Comment 7): Already handled at lines 1934-1939 with appropriate debug logging and cleanup. Cheetah's "warning log" suggestion (Comment 3): Rejected because frames are optionally by design. Logging a warning for expected behavior would create log pollution. Test Results
Ready for final review and merge. |
|
Code Review comments by Codex (gpt-5-high, local)
|
**Critical Bug Fixes:**
1. **Target switch bug in performContextualRefinement** (Codex Comment 1)
- **Problem**: When user switches targets mid-dictation, endpoint delay fires
but refinement checks `context.targetElement` (now the new target) instead
of the original target with pending audio. Result: original field never
refined, machine stuck in refining state.
- **Fix**: Iterate over `refinementPendingForTargets` to refine ALL pending
targets, not just current one. Handles rapid multi-field navigation.
- **Test**: Added "should refine all pending targets even after target switch"
2. **Manual edit cleanup bug** (Codex Comment 2)
- **Problem**: On manual edit, dictation terminates but leaves up to 120s of
buffered audio in `audioSegmentsByTarget`, `refinementPendingForTargets`,
and `activeRefinementSequences`. Next dictation on that element would try
to refine stale speech.
- **Fix**: Call `clearAudioForTarget()` in `handleManualEdit` to clear all
refinement state and audio buffers.
- **Test**: Added "should clear audio buffers and refinement state on manual edit"
**Test Coverage:**
- ✅ 17 refinement tests (was 15, now 17)
- ✅ 609 total tests passing
Both bugs would have caused user-visible issues in production. Codex review
was 100% correct on both counts.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
✅ Codex Code Review - Both Issues FixedCommit: 2a56558 Critical Bug #1: Target Switch During RefinementCodex's Analysis: ✅ CORRECT When user switches focus between speech segments, The Fix:
Code Changes: // Before: Used current target (wrong!)
let targetElement = context.targetElement;
// After: Process ALL pending targets
for (const targetId of context.refinementPendingForTargets) {
const targetElement = Object.values(context.transcriptionTargets).find(
el => getTargetElementId(el) === targetId
);
// ... refine this target
}Test Added: "should refine all pending targets even after target switch (Codex bug)"
Critical Bug #2: Manual Edit CleanupCodex's Analysis: ✅ CORRECT On manual edit, dictation terminated but left up to 120s of buffered audio and refinement state intact. Next dictation on that element would try to refine stale speech from previous session. The Fix:
Code Changes: // Added in handleManualEdit after clearing transcriptions:
// Clear audio buffers and refinement state for this target
// This prevents stale audio (up to 120s) from being refined later
clearAudioForTarget(context, targetId);Test Added: "should clear audio buffers and refinement state on manual edit (Codex bug)"
Impact AssessmentBoth bugs would have caused production issues:
Codex Review Quality: 🏆 Excellent
Test Results
Both critical bugs fixed with regression tests. Ready for final review. |
**New Features:** - Added `onSequenceNumber` callback to `uploadAudioWithRetry` for better sequence management during audio uploads. - Introduced `activeRefinementSequenceLookup` and `refinementQueue` in `DictationContext` to track active refinements and manage in-flight audio segments. - Updated logic in `clearAudioForTarget` and `uploadAudioSegment` to maintain accurate mapping of sequence numbers to target elements. **Test Coverage:** - Updated tests to validate new callback functionality and ensure proper handling of refinement sequences. This update improves the robustness of the dictation machine by ensuring that sequence numbers are correctly managed and associated with their respective targets, enhancing the overall user experience during dictation sessions.
…dio upload **New Features:** - Documented `inputType`, `inputLabel`, and `onSequenceNumber` parameters to `uploadAudioWithRetry` in the ConversationMachine to improve readability, even though they are not currently utilized in conversation mode.
**New Features:** - Replaced manual audio segment persistence logic with a dedicated `persistAudioSegment` function in both `AudioInputMachine` and `DictationMachine`. - Enhanced the ability to optionally persist audio segments during user interactions, improving overall audio management. This update streamlines the audio persistence process, making the code cleaner and more maintainable.
**New Features:** - Introduced `uploadAudioForRefinement` function for handling audio uploads specifically for the refinement phase, bypassing sequence number tracking. - Enhanced `DictationMachine` to manage refinement requests using UUIDs, allowing for better tracking and handling of audio segments. - Updated event emissions for refinement completion and failure, improving telemetry and debugging capabilities. - Refined state management in `DictationContext` to replace old sequence-based tracking with a more robust UUID-based approach. **Test Coverage:** - Expanded test suite to cover new refinement logic, ensuring proper handling of audio uploads and state transitions. This update significantly improves the refinement process, enabling more accurate and context-aware transcriptions while maintaining a clean and maintainable codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Skip refinement if only 1 segment (already fully transcribed in Phase 1) | ||
| if (segments.length === 1) { | ||
| console.debug(`[DictationMachine] Skipping refinement for target ${targetId} until more segments arrive`); | ||
| continue; // Keep buffering, don't clear | ||
| } |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The logic skips refinement for single segments, but the comment 'already fully transcribed in Phase 1' may be misleading. The real reason is that a single segment has no additional context to gain from refinement. Consider clarifying: 'Skip refinement if only 1 segment (no additional context available for improvement)'.
src/TranscriptionModule.ts
Outdated
| /** | ||
| * Upload audio for refinement pass (Phase 2 dual-phase transcription). | ||
| * This is a bare-bones transcription request that bypasses sequence number tracking. | ||
| * | ||
| * Key differences from uploadAudioWithRetry(): | ||
| * - No sequenceNumber parameter or tracking | ||
| * - No precedingTranscripts/acceptsMerge (full audio → full transcription) | ||
| * - No increment of global sequence counter | ||
| * - Uses requestId (UUID) for response correlation instead | ||
| * - Emits separate event: saypi:refinement:completed |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation mentions emitting 'saypi:refinement:completed' but should also mention 'saypi:refinement:failed' and 'saypi:refinement:started' events for completeness.
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
All changes are documentation/code quality improvements with no functional changes:
1. DictationMachine.ts:
- Reordered imports alphabetically for better maintainability
- Improved delay comment to avoid coupling to ConversationMachine's value
- Clarified refinement response handling comment
- Improved single-segment skip comment to explain "no additional context"
2. TranscriptionModule.ts:
- Documented all refinement events (started, completed, failed)
- Moved refinement:started event emission to outer function to avoid
multiple emissions on retry attempts
3. AudioSegmentPersistence.ts:
- Simplified boolean coercion for better clarity
All tests passing (609 tests). Build successful.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
…tion modules - Added documentation for dual-phase dictation transcription in CLAUDE.md. - Simplified comments in TranscriptionModule.ts regarding audio upload for refinement. - Clarified comments in AudioSegmentPersistence.ts and DictationMachine.ts for better understanding of audio segment handling and refinement processes. These changes enhance documentation clarity and improve code maintainability.
Summary
Implements a two-phase transcription process for Universal Dictation mode to significantly improve accuracy while maintaining low-latency feedback.
Phase 1 - Live Streaming (unchanged)
Short audio segments transcribed in real-time for immediate text updates, maintaining the natural "live typing" experience.
Phase 2 - Contextual Refinement (new)
When dictation completes (via endpoint detection or field blur), all buffered audio is re-processed as a single unified segment. This provides full context to the transcription model, resulting in better accuracy, punctuation, and phrasing.
Key Features
Audio Buffering System
audioSegmentsByTarget) prevents race conditionsEndpoint Detection
calculateDelaylogic from ConversationMachinepFinishedSpeakingandtempofrom API responsesRefinement Triggers
State Machine Enhancements
refiningstate inlistening.convertingperformContextualRefinementactionhandleTranscriptionResponseto detect/process refinementsfinalizeDictationandresetDictationStateNo API Changes Required
/transcribeendpointprecedingTranscripts(context is in the audio itself)Privacy & Memory Management
Code Changes
Modified Files:
src/state-machines/DictationMachine.ts(+381 lines)DictationTranscribedEventtype for reusesrc/UniversalDictationModule.ts(+16 lines)DictationTranscribedEventtypeBuild Status: ✅ Passes (55.74 MB output, no errors)
Benefits
Testing
Future Enhancements (out of scope)
/transcribepreferparameterCloses #256
🤖 Generated with Claude Code
Co-Authored-By: Claude [email protected]