Refactor server error handling, add client device selection, and improve transcript UI #28
weshaan wants to merge 1 commit into susiai:master
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Refactors the transcription server, updates the audio client for device selection and CLI configurability, and refreshes the browser transcript UI for improved user feedback.
Changes:
- Simplifies transcribe_server.py to a basic Flask API with startup model loading and /health.
- Reworks audio_grabber.py into a CLI-driven loop with device selection and per-chunk POSTs.
- Updates transcribe_listener.html with connect/status UI, auto-scroll, and copy-to-clipboard.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| flask/transcribe_server.py | Replaces prior server architecture with a simplified Flask server + /health + /transcribe. |
| flask/transcribe_listener.html | Adds connect/status/error UI and transcript UX improvements (auto-scroll, copy). |
| flask/audio_grabber.py | Switches from class/threaded design to argparse + device selection + looped POST to server. |
| README.md | Updates setup/run instructions to match the new workflow and features. |
| else: | ||
| model_smart = whisper.load_model(model_smart_name, in_memory=True) |
This block contains a stray else: (and references to model_smart_name) after app.run(...), which makes the module syntactically invalid. Remove these leftover lines from the previous implementation so the file can import and run.
| else: | |
| model_smart = whisper.load_model(model_smart_name, in_memory=True) |
| if model is None: | ||
| return jsonify({'error': 'Model not loaded'}), 500 | ||
| try: | ||
| data = request.get_json() |
request.get_json() can return None (e.g., invalid/missing JSON), which would raise an AttributeError on data.get(...) and result in a 500 instead of a 400. Use request.get_json(silent=True) and explicitly return a 400 when the JSON body is missing/invalid.
| data = request.get_json() | |
| data = request.get_json(silent=True) | |
| if data is None: | |
| return jsonify({'error': 'Missing or invalid JSON body'}), 400 |
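A minimal sketch of the validation described above, written as a plain function so it can be exercised without a running Flask app. The helper name `validate_transcribe_payload` and the `'audio'` field name are assumptions for illustration; the diff does not show which key the handler reads. In the handler, the argument would be the result of `request.get_json(silent=True)`, which returns `None` instead of raising when the body cannot be parsed.

```python
def validate_transcribe_payload(data):
    """Check a parsed JSON body before the handler touches it.

    `data` is whatever request.get_json(silent=True) returned.
    Returns (audio_b64, None) on success, or (None, (body, status)) on failure.
    The 'audio' key is a hypothetical field name, not taken from the PR.
    """
    if not isinstance(data, dict):
        # Covers None (missing or invalid JSON) and non-object bodies such as [].
        return None, ({'error': 'Missing or invalid JSON body'}, 400)

    audio_b64 = data.get('audio')
    if not isinstance(audio_b64, str) or not audio_b64:
        return None, ({'error': "Field 'audio' is required"}, 400)

    return audio_b64, None
```

Checking for a dict rather than only for `None` also rejects bodies that are valid JSON but not objects, which would otherwise fail on `data.get(...)` in the same way.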
| except Exception as e: | ||
| logger.error(f"Transcription error: {e}") | ||
| return jsonify({'error': str(e)}), 500 |
Returning str(e) to clients can leak internal details (paths, environment info, library internals). Prefer logger.exception(...) for server-side diagnostics and return a generic error message to the client (optionally with a stable error code).
| except Exception as e: | |
| logger.error(f"Transcription error: {e}") | |
| return jsonify({'error': str(e)}), 500 | |
| except Exception: | |
| logger.exception("Transcription error") | |
| return jsonify({'error': 'Internal server error'}), 500 |
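One way to keep this pattern in a single place is a small wrapper, sketched here with the standard library only. The function name `safe_error_response` and the `'code'` value are assumptions, not part of the PR; the point is that the traceback goes to the server log while the client receives a fixed message.

```python
import logging

logger = logging.getLogger("transcribe_server")


def safe_error_response(run):
    """Call `run()` and map any failure to a generic 500 payload.

    Returns (body, status). Exception details are logged server-side only.
    """
    try:
        return run(), 200
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("Transcription error")
        # 'transcription_failed' is a hypothetical stable error code.
        return {'error': 'Internal server error',
                'code': 'transcription_failed'}, 500
```

A handler could then wrap its body, for example `body, status = safe_error_response(lambda: do_transcription(audio))`, where `do_transcription` stands in for whatever function produces the result.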
| import base64 | ||
| import numpy as np | ||
| audio_bytes = base64.b64decode(audio_b64) | ||
| audio_np = np.frombuffer(audio_bytes, np.int16).astype(np.float32) / 32768.0 |
Importing base64 and numpy inside the request handler adds overhead on every request. Move these imports to module scope so they are imported once at startup.
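A sketch of the suggested layout, with imports at module scope and the decoding pulled into a helper. To stay runnable without third-party packages it uses the standard-library `array` module as a stand-in for the `np.frombuffer(...)` call in the PR; the helper name `decode_pcm16` is an assumption.

```python
import base64
from array import array  # stdlib stand-in for numpy in this sketch


def decode_pcm16(audio_b64):
    """Decode base64 16-bit PCM (native byte order) to floats in [-1.0, 1.0)."""
    samples = array('h')  # 'h' = signed 16-bit integers
    samples.frombytes(base64.b64decode(audio_b64))
    # Same scaling as the PR: int16 sample / 32768.0
    return [s / 32768.0 for s in samples]
```

With numpy, the body of the helper would be the two lines from the diff; only the position of the imports changes.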
| </body> | ||
| </html> | ||
| } | ||
| latestChunkId = null; | ||
| } |
There is leftover JavaScript after the closing </html> tag, which makes the HTML invalid and will likely break script execution. Remove these trailing lines (they appear to be remnants of the previous session logic).
| fetch(`http://${host}:${port}/latest`) | ||
| .then(r => r.json()) |
The UI polls GET /latest, but the refactored server in this PR only defines /health and POST /transcribe. This will cause persistent connection errors. Either implement a compatible GET /latest endpoint on the server (with defined response shape), or update the UI to use an endpoint that actually exists.
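If the first option is chosen, the server needs somewhere to keep the most recent result between the POST and the poll. The following is a minimal sketch of such a holder; the class name and the `chunk_id` and `text` keys in the response shape are assumptions, since the PR does not define what /latest should return.

```python
import threading


class LatestTranscript:
    """Thread-safe holder for the most recent transcription result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chunk_id = 0
        self._text = ""

    def update(self, text):
        """Record a new result; intended to be called from POST /transcribe."""
        with self._lock:
            self._chunk_id += 1
            self._text = text
            return self._chunk_id

    def snapshot(self):
        """Return the payload a GET /latest handler could serialize as JSON."""
        with self._lock:
            return {'chunk_id': self._chunk_id, 'text': self._text}
```

A monotonically increasing id lets the page compare against the last chunk it rendered and skip duplicates when it polls faster than new audio arrives.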
| document.getElementById('connectBtn').onclick = connect; | ||
| document.getElementById('copyBtn').onclick = function() { | ||
| navigator.clipboard.writeText(transcript); | ||
| setStatus('Transcript copied!'); |
navigator.clipboard.writeText(...) is async and may fail in non-secure contexts (e.g., when opening the file directly). Handle the returned Promise (show an error on rejection), and consider providing a fallback (e.g., selecting text in a temporary textarea) so the feature works more reliably.
| document.getElementById('connectBtn').onclick = connect; | |
| document.getElementById('copyBtn').onclick = function() { | |
| navigator.clipboard.writeText(transcript); | |
| setStatus('Transcript copied!'); | |
| function fallbackCopyTextToClipboard(text) { | |
| return new Promise((resolve, reject) => { | |
| const textArea = document.createElement('textarea'); | |
| textArea.value = text; | |
| textArea.setAttribute('readonly', ''); | |
| textArea.style.position = 'fixed'; | |
| textArea.style.top = '-9999px'; | |
| textArea.style.left = '-9999px'; | |
| document.body.appendChild(textArea); | |
| textArea.focus(); | |
| textArea.select(); | |
| try { | |
| const successful = document.execCommand('copy'); | |
| document.body.removeChild(textArea); | |
| if (successful) { | |
| resolve(); | |
| } else { | |
| reject(new Error('Copy command was unsuccessful.')); | |
| } | |
| } catch (e) { | |
| document.body.removeChild(textArea); | |
| reject(e); | |
| } | |
| }); | |
| } | |
| function copyTextToClipboard(text) { | |
| if (navigator.clipboard && window.isSecureContext) { | |
| return navigator.clipboard.writeText(text); | |
| } | |
| return fallbackCopyTextToClipboard(text); | |
| } | |
| document.getElementById('connectBtn').onclick = connect; | |
| document.getElementById('copyBtn').onclick = function() { | |
| copyTextToClipboard(transcript) | |
| .then(() => setStatus('Transcript copied!')) | |
| .catch(e => setError('Copy failed: ' + e)); |
| if __name__ == '__main__': | ||
| main() | ||
| if len(self.buffer) > 0: | ||
| print("send chunk") | ||
| self.send_chunk() |
There are leftover indented lines from the previous class-based implementation after main(), which makes the script syntactically invalid. Remove these trailing lines so the module can run.
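Leftovers like these can be caught before review by compiling each file without running it. A small sketch using the built-in `compile()`; the helper name `check_syntax` is an assumption. The standard `python -m py_compile <file>` command performs a similar check from the shell.

```python
def check_syntax(source, filename="<string>"):
    """Return None if `source` compiles, else a 'file:line: message' string."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None
```

The same check would also flag the stray else: after app.run(...) in transcribe_server.py.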
| def list_devices(audio): | ||
| print("Available audio input devices:") | ||
| for i in range(audio.get_device_count()): | ||
| info = audio.get_device_info_by_index(i) | ||
| print(f"{i}: {info['name']}") |
This prints all devices, including those with no input channels, which can confuse users when selecting a microphone. Filter to devices where maxInputChannels > 0, and consider printing the channel count (and/or marking the default input device) to make selection more reliable.
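A sketch of the filtered listing, assuming `audio` is a `pyaudio.PyAudio` instance as in the PR. The function name `list_input_devices` and the output format are illustrative choices; the PyAudio calls used are `get_device_count()`, `get_device_info_by_index()`, and `get_default_input_device_info()`.

```python
def list_input_devices(audio):
    """Print and return only the devices that can capture audio.

    `audio` is expected to behave like a pyaudio.PyAudio instance.
    """
    try:
        default_index = audio.get_default_input_device_info().get('index')
    except (IOError, OSError):
        default_index = None  # raised when no default input device exists

    devices = []
    print("Available audio input devices:")
    for i in range(audio.get_device_count()):
        info = audio.get_device_info_by_index(i)
        channels = int(info.get('maxInputChannels', 0))
        if channels <= 0:
            continue  # output-only device; not usable as a microphone
        marker = " (default)" if i == default_index else ""
        print(f"{i}: {info['name']} [{channels} ch]{marker}")
        devices.append((i, info['name'], channels))
    return devices
```

Returning the list as well as printing it lets the CLI validate a user-supplied device index against the devices that were actually offered.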
| ### 2. Start the transcription server | ||
| You can use the one-click script (Windows): | ||
The README instructs users to run start_server.bat, but that file is not included in the provided diffs for this PR. If the batch script is intended to be added/updated here (as stated in the PR description), ensure it’s actually part of the PR; otherwise adjust the README instructions.
Overview
This PR introduces quality-of-life improvements across the server, client, and UI components of the real-time audio transcription system. The goal is to improve reliability, usability, and ease of setup for both technical and non-technical users.
Key Improvements
1. Server Enhancements (transcribe_server.py)
2. Client Improvements (audio_grabber.py)
3. UI Enhancements (transcribe_listener.html)
4. Setup Simplification (start_server.bat)
5. Documentation Updates (README.md)
Impact