Conversation
- Add 57 test cases covering all core functionality: - is_valid() transcript validation (16 tests) - clean_old_transcripts() cleanup logic (4 tests) - add_to_audio_stack()/get_transcripts() storage (5 tests) - Serializer validation (7 tests) - REST API integration for all 9 endpoints (25 tests) - Make whisper/torch imports conditional via HAS_WHISPER guard so the app can load and tests can run without heavy ML packages - All tests pass in ~1.4 seconds
Add comprehensive test suite for transcribe_app (57 tests)
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR adds a broad Django test suite for transcribe_app and updates transcribe_utils to conditionally import heavy ML dependencies so the app/tests can run without whisper/torch.
Changes:
- Added extensive unit + integration tests covering validators, transcript storage utilities, serializers, and REST endpoints.
- Made
whisper/torchoptional viatry/exceptand gated local-model initialization behindHAS_WHISPER. - Added a warning path when ML deps are absent and no local model can be loaded.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| django/transcribe_app/transcribe_utils.py | Adds conditional imports and guards local Whisper model loading when ML deps are not installed. |
| django/transcribe_app/tests.py | Introduces a large unit/integration test suite for core utilities, serializers, and API endpoints. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| import numpy as np | ||
| from unittest.mock import patch, MagicMock | ||
|
|
||
| from django.test import TestCase, override_settings |
There was a problem hiding this comment.
Several imports added here appear unused in the current file content (e.g., patch/MagicMock/override_settings and TranscriptResponseSerializer/ListTranscriptsResponseSerializer). If they aren’t needed, remove them to avoid linter failures and keep the test module focused; otherwise, add the missing tests/usages that require them.
| from django.test import TestCase, override_settings | |
| from django.test import TestCase |
| from unittest.mock import patch, MagicMock | ||
|
|
||
| from django.test import TestCase, override_settings | ||
| from rest_framework.test import APIClient | ||
| from rest_framework import status | ||
|
|
||
| from transcribe_app.serializers import ( | ||
| TranscribeInputSerializer, | ||
| TranscribeResponseSerializer, | ||
| TranscriptResponseSerializer, | ||
| ListTranscriptsResponseSerializer, |
There was a problem hiding this comment.
Several imports added here appear unused in the current file content (e.g., patch/MagicMock/override_settings and TranscriptResponseSerializer/ListTranscriptsResponseSerializer). If they aren’t needed, remove them to avoid linter failures and keep the test module focused; otherwise, add the missing tests/usages that require them.
| from unittest.mock import patch, MagicMock | |
| from django.test import TestCase, override_settings | |
| from rest_framework.test import APIClient | |
| from rest_framework import status | |
| from transcribe_app.serializers import ( | |
| TranscribeInputSerializer, | |
| TranscribeResponseSerializer, | |
| TranscriptResponseSerializer, | |
| ListTranscriptsResponseSerializer, | |
| from django.test import TestCase | |
| from rest_framework.test import APIClient | |
| from rest_framework import status | |
| from transcribe_app.serializers import ( | |
| TranscribeInputSerializer, | |
| TranscribeResponseSerializer, |
| import time | ||
| import json | ||
| import base64 | ||
| import numpy as np |
There was a problem hiding this comment.
These tests introduce a runtime dependency on NumPy just to generate silent audio bytes. If NumPy isn’t already a required dependency for the Django test environment, this can cause CI failures or unnecessarily bloat the test stack. Consider generating the same silent PCM bytes using the standard library (e.g., b'\\x00\\x00' * 16000 for int16 silence) to avoid requiring NumPy.
| import numpy as np |
| # Generate minimal valid audio (1 second of silence as int16) | ||
| audio_data = np.zeros(16000, dtype=np.int16).tobytes() |
There was a problem hiding this comment.
These tests introduce a runtime dependency on NumPy just to generate silent audio bytes. If NumPy isn’t already a required dependency for the Django test environment, this can cause CI failures or unnecessarily bloat the test stack. Consider generating the same silent PCM bytes using the standard library (e.g., b'\\x00\\x00' * 16000 for int16 silence) to avoid requiring NumPy.
| response = self.client.post( | ||
| '/api/transcribe', | ||
| data=json.dumps(payload), | ||
| content_type='application/json', | ||
| ) |
There was a problem hiding this comment.
The integration tests hard-code URL paths as strings. This makes the suite brittle if routing changes (e.g., trailing slash settings, URL namespace changes, or DRF router adjustments). Prefer resolving URLs via Django’s reverse() (or DRF’s route names) so the tests track the URLconf instead of duplicating it.
| chunk_late: {'transcript': 'Keep me.', 'translated': False}, | ||
| } | ||
|
|
||
| response = self.client.get('/api/pop_first_transcript', {'tenant_id': tenant}) |
There was a problem hiding this comment.
The integration tests hard-code URL paths as strings. This makes the suite brittle if routing changes (e.g., trailing slash settings, URL namespace changes, or DRF router adjustments). Prefer resolving URLs via Django’s reverse() (or DRF’s route names) so the tests track the URLconf instead of duplicating it.
| chunk_late: {'transcript': 'Pop me.', 'translated': False}, | ||
| } | ||
|
|
||
| response = self.client.get('/api/pop_latest_transcript', {'tenant_id': tenant}) |
There was a problem hiding this comment.
The integration tests hard-code URL paths as strings. This makes the suite brittle if routing changes (e.g., trailing slash settings, URL namespace changes, or DRF router adjustments). Prefer resolving URLs via Django’s reverse() (or DRF’s route names) so the tests track the URLconf instead of duplicating it.
|
|
||
| def test_non_ascii_only_is_invalid(self): | ||
| """A transcript with only non-ASCII or space characters should be invalid.""" | ||
| self.assertFalse(is_valid(" ")) # only spaces (code > 32 check fails) |
There was a problem hiding this comment.
The docstring says 'only non-ASCII or space characters', but the test case only checks spaces (which are ASCII). Update the docstring to match what’s being tested, or add an additional assertion covering the non-ASCII-only scenario described.
| self.assertFalse(is_valid(" ")) # only spaces (code > 32 check fails) | |
| self.assertFalse(is_valid(" ")) # only spaces (code > 32 check fails) | |
| self.assertFalse(is_valid("你好")) # only non-ASCII characters |
| else: | ||
| logger.warning("Whisper/Torch not installed. Transcription will not work without a whisper server.") |
There was a problem hiding this comment.
This warning executes at module import time, which can add noise to logs during test runs or in deployments where local transcription is intentionally disabled. Consider deferring the warning until the code path that actually requires local Whisper is exercised (or gating it behind an explicit configuration flag), so importing the module remains quiet unless transcription is requested.
| try: | ||
| import whisper | ||
| import torch | ||
| HAS_WHISPER = True | ||
| except ImportError: | ||
| HAS_WHISPER = False |
There was a problem hiding this comment.
The new conditional-import behavior (specifically the ImportError path where HAS_WHISPER becomes False) isn’t covered by tests in this PR. Add a test that verifies the module can be imported/reloaded when whisper/torch are unavailable and that the code selects the non-local-transcription path (e.g., via importlib.reload with patched imports).
What does this PR do?
This PR addresses the zero test coverage in transcribe_app/tests.py by adding 57 comprehensive test cases across all core modules. It also makes
whisper/torchimports conditional so the app can load in server-mode and tests can run without the heavy ML dependencies (~2GB).Changes
1. django/transcribe_app/tests.py — Comprehensive Test Suite
thank you,bye!,click click, etc.), forbidden exact strings (eh.,bye.), Korean hallucination chars, long words (>40 chars), case-insensitivityPOST /api/transcribe,GET /api/get_transcript,GET /api/get_first_transcript,GET /api/pop_first_transcript,GET /api/get_latest_transcript,GET /api/pop_latest_transcript,GET /api/delete_transcript,GET /api/list_transcripts,GET /api/transcripts_size2. django/transcribe_app/transcribe_utils.py — Conditional Imports