Pause Feature Documentation

Overview

The Chatterbox TTS Audiobook Edition automatically inserts pauses based on line breaks (returns) in your text input. Each line break (\n or \r\n) detected in your text adds a 0.1-second pause to the generated audio.

How It Works

Return-Based Pause System

Detection: The system counts all line breaks in your input text
Duration: Each line break adds exactly 0.1 seconds of silence
Accumulation: Multiple line breaks accumulate (10 returns = 1 second pause)
Debug Output: Terminal shows pause information when audio is generated
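
The counting rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name and return order follow the call shown under Technical Implementation, but the body is an assumption, including the choice to normalize \r\n first and to replace line breaks with spaces in the processed text.

```python
def process_text_for_pauses(text, pause_per_return=0.1):
    """Count line breaks and compute the total pause they imply (illustrative sketch)."""
    # Normalize Windows (\r\n) and bare \r endings so each break is counted once.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return_count = normalized.count("\n")
    total_pause_duration = return_count * pause_per_return
    # Assumption: line breaks are turned into spaces before synthesis.
    processed_text = " ".join(
        part.strip() for part in normalized.split("\n") if part.strip()
    )
    return processed_text, return_count, total_pause_duration


sample = "First line.\nSecond line.\n\nAfter an empty line.\nFinal line."
processed_text, return_count, total_pause_duration = process_text_for_pauses(sample)
print(return_count, round(total_pause_duration, 1))  # 4 0.4
```

An empty line counts as two line breaks here, because it is bounded by two \n characters.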

Example

Input Text:

Hello, this is the first line.
This is the second line.

This line comes after an empty line.
Final line.

Result:

4 line breaks detected
0.4 seconds of total pause time added
Debug output: 🔇 Detected 4 line breaks → 0.4s total pause time

Features Supported

✅ Text-to-Speech Generation

Single text input with returns
Pauses added to the end of generated speech
Debug output in terminal

✅ Single-Voice Audiobook Creation

Text processing before chunking
Pauses distributed throughout the audiobook
Project metadata includes pause information
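
One way to picture pauses being distributed through an audiobook, rather than added at the end, is to split the text at line breaks and pair each segment with the pause that follows it. The sketch below is an assumption about the approach, not the project's implementation; split_text_with_pauses is a hypothetical helper name.

```python
import re


def split_text_with_pauses(text, pause_per_return=0.1):
    """Split text at line breaks; pair each segment with the pause that follows it.

    Hypothetical helper for illustration only.
    """
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # The capturing group keeps each run of line breaks so its length can be measured.
    parts = re.split(r"(\n+)", normalized)
    segments = []  # each entry is [segment_text, number_of_breaks_after_it]
    for i in range(0, len(parts), 2):
        segment = parts[i].strip()
        breaks = len(parts[i + 1]) if i + 1 < len(parts) else 0
        if segment:
            segments.append([segment, breaks])
        elif segments:
            # Whitespace-only segment: credit its breaks to the previous segment.
            segments[-1][1] += breaks
        # Breaks before the first real segment are dropped in this sketch.
    return [(seg, count * pause_per_return) for seg, count in segments]


for segment, pause in split_text_with_pauses("One.\nTwo.\n\nThree."):
    print(f"{pause:.1f}s after: {segment}")
```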

✅ Multi-Voice Audiobook Creation

Character dialogue with natural pauses
Pause processing applied before voice assignment
Debug output shows total pause time added

✅ Batch Audiobook Processing

Automatic pause processing for all files in batch
Individual pause calculations per file

Technical Implementation

Text Processing Pipeline

Input Text Analysis

processed_text, return_count, total_pause_duration = process_text_for_pauses(text, 0.1)

Silence Generation

pause_audio = create_silence_audio(total_pause_duration, sample_rate)

Audio Combination

final_audio = np.concatenate([speech_audio, pause_audio])
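
The silence and concatenation steps can be tried in isolation with NumPy. The sketch below assumes mono float32 audio at 24,000 Hz and uses a synthetic tone in place of real model output; the body of create_silence_audio is a guess at what such a function might do, not the project's code.

```python
import numpy as np


def create_silence_audio(duration_seconds, sample_rate=24000):
    """Return a float32 array of zeros lasting duration_seconds (illustrative sketch)."""
    num_samples = int(round(duration_seconds * sample_rate))
    return np.zeros(num_samples, dtype=np.float32)


# Stand-in for model output: one second of a quiet 220 Hz tone.
sample_rate = 24000
t = np.arange(sample_rate) / sample_rate
speech_audio = (0.1 * np.sin(2 * np.pi * 220 * t)).astype(np.float32)

# Four line breaks at 0.1 s each give 0.4 s of silence appended after the speech.
pause_audio = create_silence_audio(0.4, sample_rate)
final_audio = np.concatenate([speech_audio, pause_audio])
print(len(final_audio) / sample_rate)  # 1.4
```

Matching the dtype of the silence to the speech array avoids an implicit upcast during concatenation.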

Debug Output Examples

Text-to-Speech:

🔇 Detected 3 line breaks → 0.3s total pause time
🔇 Added 0.3s pause to speech (3 returns)

Audiobook Creation:

🔇 Detected 15 line breaks → 1.5s total pause time
🔇 Adding 1.5s pause (15 returns × 0.1s each)

Usage Guidelines

Best Practices

Natural Breaks: Use line breaks where you want natural pauses in speech
Paragraph Separation: A blank line between paragraphs counts as two line breaks and adds 0.2 seconds
Dialogue: Separate character lines for better multi-voice audiobooks
Punctuation: Combine line breaks with sentence-ending punctuation so pauses fall at natural boundaries

Example Text Formatting

Good for Natural Speech:

Welcome to our story.
Let me tell you about a magical place.

In this place, anything is possible.
The adventure begins now.

Good for Multi-Voice Audiobooks:

[Narrator] The sun was setting over the hills.

[Character1] "We need to find shelter soon."

[Character2] "I see a cave up ahead.
Let's hurry before it gets dark."

[Narrator] They rushed toward the cave.

Configuration

Pause Duration

Current Setting: 0.1 seconds per return
Location: Hardcoded in processing functions
Customization: Can be modified in src/audiobook/processing.py

Sample Rate

Default: 24,000 Hz
Compatibility: Automatically matches model output
Quality: High enough for natural-sounding pauses

Testing

Test Script

Run the included test script to verify functionality:

python test_pause_functionality.py

Manual Testing

Create text with line breaks
Generate speech or audiobook
Check terminal for debug output
Listen for pauses in generated audio
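
Listening can be backed up with a quick numeric check. The helper below is a hypothetical sketch that is not part of the project: it measures how much near-silence sits at the end of a sequence of samples, assuming mono audio with values in roughly the range -1 to 1.

```python
def trailing_silence_seconds(samples, sample_rate=24000, threshold=1e-4):
    """Return the duration of near-silent samples at the end of the audio.

    Hypothetical helper for illustration; `threshold` is an arbitrary cutoff.
    """
    count = 0
    for value in reversed(samples):
        if abs(value) > threshold:
            break
        count += 1
    return count / sample_rate


# One second of constant signal followed by 0.3 s of silence (3 returns at 0.1 s).
audio = [0.5] * 24000 + [0.0] * 7200
print(round(trailing_silence_seconds(audio), 2))  # 0.3
```

If the measured value is lower than the debug output reports, the pause was likely lost somewhere between generation and export.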

Troubleshooting

Common Issues

No Pauses Heard:

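Check that the text actually contains line breaks (\n or \r\n) and has not been reflowed into a single line
Verify debug output appears in terminal
Make sure your audio player plays the generated file through to the end, since pauses may be appended after the speech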
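
To confirm that a piece of text really contains line breaks, it can help to print its repr and count each kind of line ending. This is a small standalone sketch; describe_line_breaks is a hypothetical helper, not a function from the project.

```python
def describe_line_breaks(text):
    """Count each kind of line ending in text (hypothetical helper)."""
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf  # bare \n, excluding those that belong to \r\n
    cr = text.count("\r") - crlf  # bare \r, excluding those that belong to \r\n
    return {"crlf": crlf, "lf": lf, "cr": cr, "total": crlf + lf + cr}


sample = "First line.\r\nSecond line.\n\nThird line."
print(repr(sample))  # repr makes the otherwise invisible \r and \n visible
print(describe_line_breaks(sample))  # {'crlf': 1, 'lf': 2, 'cr': 0, 'total': 3}
```

A total of 0 means the text reached the application as a single line, so no pause would be expected.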

Pauses Too Long/Short:

Current setting is 0.1s per return (not configurable via UI)
Multiple consecutive returns will create longer pauses
This is intended behavior for paragraph breaks

Debug Output Missing:

Check terminal/console where the application is running
Ensure you're using the updated functions
Verify pause processing is enabled

Future Enhancements

Potential Improvements

User-configurable pause duration
Different pause types (comma, period, paragraph)
Visual indicators in the UI
Pause preview before generation
Advanced pause distribution algorithms

Integration Ideas

Export settings in voice profiles
Project-level pause configuration
Advanced text markup for pause control
Audio timeline with pause indicators

Technical Details

Files Modified

src/audiobook/processing.py - Core pause processing functions
gradio_tts_app_audiobook.py - Main TTS integration
test_pause_functionality.py - Test and verification script

Functions Added

process_text_for_pauses() - Text analysis and preprocessing
create_silence_audio() - Silence generation
insert_pauses_between_chunks() - Audio combination with pauses
process_text_with_distributed_pauses() - Advanced chunk processing
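
As a rough illustration of combining chunks with pauses, the sketch below appends a stretch of silence after each audio chunk. The function name matches the list above, but the signature and body are assumptions made for this example and may differ from the code in src/audiobook/processing.py.

```python
import numpy as np


def insert_pauses_between_chunks(chunks, pause_durations, sample_rate=24000):
    """Concatenate chunks, adding pause_durations[i] seconds of silence after chunks[i].

    Assumed signature for illustration only.
    """
    if len(chunks) != len(pause_durations):
        raise ValueError("need exactly one pause duration per chunk")
    pieces = []
    for chunk, pause in zip(chunks, pause_durations):
        pieces.append(np.asarray(chunk, dtype=np.float32))
        if pause > 0:
            silence_samples = int(round(pause * sample_rate))
            pieces.append(np.zeros(silence_samples, dtype=np.float32))
    if not pieces:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(pieces)


# Two dummy chunks at a 1,000 Hz sample rate to keep the numbers easy to follow.
chunks = [np.ones(100), np.ones(50)]
combined = insert_pauses_between_chunks(chunks, [0.1, 0.2], sample_rate=1000)
print(len(combined))  # 450
```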

Compatibility

✅ Windows, macOS, Linux
✅ CPU and GPU processing modes
✅ All supported audio formats
✅ Existing voice profiles and projects
✅ Batch processing workflows

Note: This feature is automatically enabled and requires no configuration. Simply use line breaks in your text where you want pauses, and the system will handle the rest!
