The Chatterbox TTS Audiobook Edition now includes automatic pause insertion based on line breaks (returns) in your text input. For every line break (\n or \r\n) detected in your text, the system will automatically add a 0.1-second pause to the generated audio.
- Detection: The system counts all line breaks in your input text
- Duration: Each line break adds exactly 0.1 seconds of silence
- Accumulation: Multiple line breaks accumulate (10 returns = 1 second pause)
- Debug Output: Terminal shows pause information when audio is generated
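The counting rule above can be sketched in a few lines. This is a minimal illustration, not the project's code: the helper names `count_line_breaks` and `total_pause_seconds` are hypothetical, and treating `\r\n` as a single break is an assumption.

```python
import re

PAUSE_PER_BREAK = 0.1  # seconds of silence per line break, as documented above


def count_line_breaks(text: str) -> int:
    # Assumption: a Windows-style CRLF pair counts as one break, not two.
    return len(re.findall(r"\r\n|\n", text))


def total_pause_seconds(text: str, pause_per_break: float = PAUSE_PER_BREAK) -> float:
    # Rounding avoids float noise such as 0.30000000000000004.
    return round(count_line_breaks(text) * pause_per_break, 3)


sample = (
    "Hello, this is the first line.\n"
    "This is the second line.\n"
    "\n"
    "This line comes after an empty line.\n"
    "Final line."
)
print(count_line_breaks(sample), total_pause_seconds(sample))
```

An empty line contributes two breaks (one ending the previous line, one for the empty line itself), which is why paragraph gaps produce a longer pause.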
Input Text:
Hello, this is the first line.
This is the second line.

This line comes after an empty line.
Final line.
Result:
- 4 line breaks detected
- 0.4 seconds of total pause time added
- Debug output:
🔇 Detected 4 line breaks → 0.4s total pause time
- Single text input with returns
- Pauses added to the end of generated speech
- Debug output in terminal
- Text processing before chunking
- Pauses distributed throughout the audiobook
- Project metadata includes pause information
- Character dialogue with natural pauses
- Pause processing applied before voice assignment
- Debug output shows total pause time added
- Automatic pause processing for all files in batch
- Individual pause calculations per file
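One possible reading of "pauses distributed throughout the audiobook" is that silence is placed where the line breaks occur rather than only at the end. The sketch below illustrates that idea under stated assumptions: `split_with_break_counts`, `assemble`, and `fake_tts` are hypothetical names, and the project's own `insert_pauses_between_chunks()` may work differently.

```python
import re

import numpy as np

SAMPLE_RATE = 24000      # Hz, the documented default
PAUSE_PER_BREAK = 0.1    # seconds per line break


def split_with_break_counts(text):
    """Return (segment, line_breaks_following_segment) pairs."""
    # The capture group keeps each run of line breaks in the result list.
    parts = re.split(r"((?:\r?\n)+)", text)
    pairs = []
    for i in range(0, len(parts), 2):
        segment = parts[i].strip()
        breaks = len(re.findall(r"\r?\n", parts[i + 1])) if i + 1 < len(parts) else 0
        if segment:
            pairs.append((segment, breaks))
        elif pairs:
            # Whitespace-only line: credit its breaks to the previous segment.
            pairs[-1] = (pairs[-1][0], pairs[-1][1] + breaks)
    return pairs


def assemble(pairs, synthesize):
    """Concatenate synthesized segments with silence between them."""
    pieces = []
    for segment, breaks in pairs:
        pieces.append(synthesize(segment))
        if breaks:
            n = int(round(breaks * PAUSE_PER_BREAK * SAMPLE_RATE))
            pieces.append(np.zeros(n, dtype=np.float32))
    return np.concatenate(pieces) if pieces else np.zeros(0, dtype=np.float32)


# Placeholder for the TTS model: 100 samples of constant signal per character.
fake_tts = lambda s: np.full(len(s) * 100, 0.1, dtype=np.float32)

pairs = split_with_break_counts("a\nb\n\nc")
audio = assemble(pairs, fake_tts)
print(pairs, audio.shape[0])
```

Leading line breaks are ignored in this sketch because there is no preceding segment to attach them to.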
- Input Text Analysis:
  processed_text, return_count, total_pause_duration = process_text_for_pauses(text, 0.1)
- Silence Generation:
  pause_audio = create_silence_audio(total_pause_duration, sample_rate)
- Audio Combination:
  final_audio = np.concatenate([speech_audio, pause_audio])
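The three steps can be tried end to end with stand-in helpers. The functions below are hypothetical sketches (note the `_sketch` suffix), not the implementations in `src/audiobook/processing.py`. In particular, replacing line breaks with spaces in the processed text is an assumption.

```python
import numpy as np


def process_text_for_pauses_sketch(text, pause_per_return=0.1):
    # Stand-in for the real function; returns the same three values.
    normalized = text.replace("\r\n", "\n")
    return_count = normalized.count("\n")
    # Assumption: line breaks are replaced by spaces before synthesis.
    processed = " ".join(p.strip() for p in normalized.split("\n") if p.strip())
    return processed, return_count, round(return_count * pause_per_return, 3)


def create_silence_audio_sketch(duration_s, sample_rate=24000):
    # Silence is an array of zeros with duration_s * sample_rate samples.
    return np.zeros(int(round(duration_s * sample_rate)), dtype=np.float32)


text = "First line.\nSecond line.\n\nThird line."
processed_text, return_count, total_pause = process_text_for_pauses_sketch(text)

# Placeholder for one second of model output at 24,000 Hz.
speech_audio = np.full(24000, 0.1, dtype=np.float32)
pause_audio = create_silence_audio_sketch(total_pause)
final_audio = np.concatenate([speech_audio, pause_audio])

print(return_count, total_pause, final_audio.shape[0])
```

Both arrays need the same dtype and sample rate before concatenation. Otherwise NumPy will upcast silently, or the pause will play back at the wrong length.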
Speech-to-Text:
🔇 Detected 3 line breaks → 0.3s total pause time
🔇 Added 0.3s pause to speech (3 returns)
Audiobook Creation:
🔇 Detected 15 line breaks → 1.5s total pause time
🔇 Adding 1.5s pause (15 returns × 0.1s each)
- Natural Breaks: Use line breaks where you want natural pauses in speech
- Paragraph Separation: Double line breaks create longer pauses
- Dialogue: Separate character lines for better multi-voice audiobooks
- Punctuation: Combine with punctuation for maximum effect
Good for Natural Speech:
Welcome to our story.
Let me tell you about a magical place.
In this place, anything is possible.
The adventure begins now.
Good for Multi-Voice Audiobooks:
[Narrator] The sun was setting over the hills.
[Character1] "We need to find shelter soon."
[Character2] "I see a cave up ahead.
Let's hurry before it gets dark."
[Narrator] They rushed toward the cave.
- Current Setting: 0.1 seconds per return
- Location: Hardcoded in processing functions
- Customization: Can be modified in src/audiobook/processing.py
- Default: 24,000 Hz
- Compatibility: Automatically matches model output
- Quality: High enough for natural-sounding pauses
Run the included test script to verify functionality:
python test_pause_functionality.py
- Create text with line breaks
- Generate speech or audiobook
- Check terminal for debug output
- Listen for pauses in generated audio
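Besides listening, the silence at the end of a generated clip can be measured. The helper below is a hypothetical sketch that assumes the audio is already loaded as a mono NumPy float array; the threshold value is an arbitrary choice.

```python
import numpy as np


def trailing_silence_seconds(audio, sample_rate=24000, threshold=1e-4):
    """Length of the near-silent tail of a mono float array, in seconds."""
    nonsilent = np.flatnonzero(np.abs(audio) > threshold)
    if nonsilent.size == 0:
        return audio.shape[0] / sample_rate  # the whole clip is silent
    return (audio.shape[0] - 1 - nonsilent[-1]) / sample_rate


# Synthetic check: 1 s of signal followed by 0.3 s of zeros.
demo = np.concatenate([
    np.full(24000, 0.1, dtype=np.float32),
    np.zeros(7200, dtype=np.float32),
])
print(trailing_silence_seconds(demo))
```

Real speech usually ends with some natural silence of its own, so the measured tail may be somewhat longer than the pause reported in the debug output.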
No Pauses Heard:
- Check if text actually contains line breaks (\n)
- Verify debug output appears in terminal
- Ensure audio player supports the full generated file
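To check what a pasted text really contains, printing its `repr` and tallying the break characters is often enough. `describe_line_breaks` is a hypothetical helper for illustration only. Since this document mentions only `\n` and `\r\n`, a lone `\r` is presumably not counted as a break.

```python
def describe_line_breaks(text):
    # Count each line-ending style separately.
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf        # bare \n not preceded by \r
    lone_cr = text.count("\r") - crlf   # bare \r not followed by \n
    return {"crlf": crlf, "lf": lf, "lone_cr": lone_cr}


pasted = "Line one.\r\nLine two.\nLine three."
print(repr(pasted))                  # makes the escape sequences visible
print(describe_line_breaks(pasted))
```

If every count is zero, the text reached the application as a single line, and no pause will be added.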
Pauses Too Long/Short:
- Current setting is 0.1s per return (not configurable via UI)
- Multiple consecutive returns will create longer pauses
- This is intended behavior for paragraph breaks
Debug Output Missing:
- Check terminal/console where the application is running
- Ensure you're using the updated functions
- Verify pause processing is enabled
- User-configurable pause duration
- Different pause types (comma, period, paragraph)
- Visual indicators in the UI
- Pause preview before generation
- Advanced pause distribution algorithms
- Export settings in voice profiles
- Project-level pause configuration
- Advanced text markup for pause control
- Audio timeline with pause indicators
- src/audiobook/processing.py: Core pause processing functions
- gradio_tts_app_audiobook.py: Main TTS integration
- test_pause_functionality.py: Test and verification script
- process_text_for_pauses(): Text analysis and preprocessing
- create_silence_audio(): Silence generation
- insert_pauses_between_chunks(): Audio combination with pauses
- process_text_with_distributed_pauses(): Advanced chunk processing
- ✅ Windows, macOS, Linux
- ✅ CPU and GPU processing modes
- ✅ All supported audio formats
- ✅ Existing voice profiles and projects
- ✅ Batch processing workflows
Note: This feature is automatically enabled and requires no configuration. Simply use line breaks in your text where you want pauses, and the system will handle the rest!