AUDIO_PIPELINE

Audio Pipeline Architecture

This document describes the audio processing pipeline of the Multi-Room Audio Controller.

Overview

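The audio pipeline streams audio from Music Assistant to USB DACs and sound cards via PulseAudio. Samples pass through the application without resampling: the player only converts Float32 samples to the configured integer format, and PulseAudio natively handles sample rate and device format conversion.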

Key features:

- Direct audio passthrough to PulseAudio
- Synchronized multi-room playback via clock synchronization
- Dynamic playback rate adjustment for drift correction via the SDK
- Sample rate and format conversion to devices handled by PulseAudio

Signal Flow Diagram

+-------------------------------------------------------------------------+
|                         Music Assistant                                  |
|                                                                          |
|  - Audio library management                                              |
|  - Streaming source selection                                            |
|  - Player group coordination                                             |
+------------------------------------+------------------------------------+
                                     |
                                     | Sendspin Protocol
                                     | (WebSocket + mDNS discovery)
                                     |
                                     v
+-------------------------------------------------------------------------+
|                         SendSpin.SDK                                     |
|                                                                          |
|  +-------------------+    +-------------------+    +---------------------+
|  | ClockSync         |    | TimedAudioBuffer  |    | SendspinClient      |
|  |                   |    |                   |    |                     |
|  | - NTP-like sync   |<-->| - Audio buffer    |<---| - Protocol handler  |
|  | - Drift measure   |    | - Sync timing     |    | - Connection mgmt   |
|  | - Rate target     |    | - Rate correction |    | - Volume control    |
|  +-------------------+    +---------+---------+    +---------------------+
|                                     |                                     |
|                                     | Sync-adjusted PCM Float32 samples   |
+-------------------------------------+-------------------------------------+
                                      |
                                      v
+-------------------------------------------------------------------------+
|                    BufferedAudioSampleSource                             |
|                                                                          |
|  - Bridges TimedAudioBuffer to IAudioSampleSource                        |
|  - Provides current time for timed sample release                        |
|  - Direct passthrough (no resampling)                                    |
+-------------------------------------+------------------------------------+
                                      |
                                      | PCM Float32 samples (source rate)
                                      v
+-------------------------------------------------------------------------+
|                         PulseAudioPlayer                                 |
|                                                                          |
|  +-------------------+    +-------------------+    +---------------------+
|  | Format Convert    |    | Volume Control    |    | Device Output       |
|  |                   |    |                   |    |                     |
|  | Float32 -> S32    |--->| Software gain     |--->| PulseAudio          |
|  | Float32 -> S24    |    | (0-100%)          |    | pa_simple API       |
|  | Float32 -> S16    |    |                   |    |                     |
|  +-------------------+    +-------------------+    +---------------------+
|                                                                          |
+-------------------------------------+------------------------------------+
                                      |
                                      v
+-------------------------------------------------------------------------+
|                         PulseAudio Server                                |
|                                                                          |
|  - Handles sample rate conversion to device native rate                  |
|  - Manages audio routing and mixing                                      |
|  - Device format negotiation                                             |
+-------------------------------------+------------------------------------+
                                      |
                                      v
                               USB DAC / Sound Card

Component Details

1. SendSpin.SDK (External Package)

The SDK handles all network communication and synchronization:

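| Component               | Purpose                                                            |
|-------------------------|--------------------------------------------------------------------|
| `SendspinClientService` | WebSocket connection to Music Assistant                            |
| `ClockSynchronization`  | NTP-like clock sync between client and server                      |
| `TimedAudioBuffer`      | Buffer with timestamp-aware sample management and sync correction  |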

Sync Correction: The SDK's TimedAudioBuffer handles playback rate adjustment internally via the TargetPlaybackRateChanged event, using tiered correction:

- Tier 1: Deadband (no correction for small errors)
- Tier 2: Rate adjustment (+/-4% speed correction)
- Tier 3: Frame drop/insert (disabled in our config)
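
The tiers above can be illustrated with a minimal sketch. This is not the SDK's implementation: the class name, the deadband width, the proportional gain, and the sign convention (positive error means playback is behind) are all assumptions made for the example. Only the +/-4% limit comes from this document.

```csharp
using System;

// Hypothetical illustration of tiered sync correction; not SendSpin.SDK code.
public static class SyncCorrectionSketch
{
    public const double DeadbandMs = 2.0;         // assumed Tier 1 threshold
    public const double FullCorrectionMs = 100.0; // assumed error at which the limit is reached
    public const double MaxRateDeviation = 0.04;  // Tier 2 limit: +/-4% (from this document)

    // Assumed convention: syncErrorMs > 0 means playback lags, so speed up.
    public static double TargetPlaybackRate(double syncErrorMs)
    {
        // Tier 1: inside the deadband, play at nominal speed.
        if (Math.Abs(syncErrorMs) <= DeadbandMs)
            return 1.0;

        // Tier 2: proportional adjustment, clamped to +/-4%.
        // Tier 3 (frame drop/insert) is disabled, so large errors stay
        // pinned at 0.96x or 1.04x instead of being corrected in one step.
        double deviation = syncErrorMs / FullCorrectionMs * MaxRateDeviation;
        deviation = Math.Clamp(deviation, -MaxRateDeviation, MaxRateDeviation);
        return 1.0 + deviation;
    }
}
```

With these assumed constants, an error of 50 ms yields a rate of about 1.02x, and any error beyond 100 ms in either direction saturates at the 4% limit.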

2. BufferedAudioSampleSource

A simple bridge that connects the SDK's ITimedAudioBuffer to the audio player's IAudioSampleSource interface.

public sealed class BufferedAudioSampleSource : IAudioSampleSource
{
    // Provides current time to the timed buffer for synchronized sample release
    // Direct passthrough - no processing or resampling
}

3. PulseAudioPlayer

Handles the final audio output using PulseAudio's simple API:

| Feature           | Description                                                     |
|-------------------|-----------------------------------------------------------------|
| Format Conversion | Float32 -> S32_LE/S24_LE/S16_LE based on configured bit depth   |
| Volume Control    | Software gain applied before output                             |
| Reconnection      | Automatic reconnection with exponential backoff                 |
| Backend Selection | PulseAudio for both HAOS and Docker environments                |
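
As a rough illustration of the format conversion and volume rows, the sketch below scales Float32 samples by a gain and packs them as signed 16-bit little-endian PCM. The class and method names are hypothetical, the linear mapping from 0-100% to gain is an assumption, and applying gain before conversion is a simplification for the example. The actual PulseAudioPlayer may differ on all three points.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical illustration; not the PulseAudioPlayer implementation.
public static class SampleFormatSketch
{
    // Applies a linear software gain (volumePercent 0-100, assumed mapping) and
    // converts Float32 samples in [-1.0, 1.0] to S16_LE bytes.
    public static byte[] ToS16Le(ReadOnlySpan<float> samples, int volumePercent)
    {
        float gain = Math.Clamp(volumePercent, 0, 100) / 100f;
        var output = new byte[samples.Length * sizeof(short)];

        for (int i = 0; i < samples.Length; i++)
        {
            // Clamp after gain so out-of-range input cannot wrap around.
            float scaled = Math.Clamp(samples[i] * gain, -1f, 1f);
            short value = (short)MathF.Round(scaled * short.MaxValue);
            BinaryPrimitives.WriteInt16LittleEndian(output.AsSpan(i * 2, 2), value);
        }

        return output;
    }
}
```

The S24_LE and S32_LE paths would follow the same pattern with a wider integer range and more bytes per sample.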

4. PulseAudio Server

PulseAudio handles all format conversion to the target device:

- Sample Rate Conversion: Converts to device native rate (e.g., 48 kHz -> 192 kHz)
- Format Negotiation: Adapts to device capabilities automatically
- Mixing: Allows multiple audio streams if needed

Configuration Options

Output Format

Set via the player's output format configuration:

var outputFormat = new AudioOutputFormat
{
    SampleRate = 192000,  // 48000, 96000, 192000 (sent to PulseAudio)
    BitDepth = 32,        // 16, 24, 32
    Channels = 2
};

PulseAudio converts the stream to this format, then converts it again to the device's native format if the two differ.
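
To make the cost of a given output format concrete, the following sketch maps the configured bit depth to a PulseAudio sample format name and computes the resulting PCM data rate. The helper class is hypothetical and not part of the project. It also assumes that 24-bit output uses PulseAudio's packed `s24le` format rather than `s24-32le`.

```csharp
using System;

// Hypothetical helper for reasoning about output formats.
public static class OutputFormatSketch
{
    // Maps a configured bit depth to the PulseAudio sample format name.
    public static string PulseFormatName(int bitDepth) => bitDepth switch
    {
        16 => "s16le",
        24 => "s24le", // packed 3-byte samples (assumption)
        32 => "s32le",
        _ => throw new ArgumentOutOfRangeException(nameof(bitDepth), "Expected 16, 24, or 32."),
    };

    // Bytes of PCM per second for the given format.
    public static int BytesPerSecond(int sampleRate, int bitDepth, int channels)
        => sampleRate * (bitDepth / 8) * channels;
}
```

For the configuration shown above (192000 Hz, 32-bit, 2 channels) this gives 1,536,000 bytes per second, compared with 192,000 bytes per second for 48000 Hz, 16-bit stereo.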

Troubleshooting

Clicks or Pops During Playback

Symptom: Intermittent clicks or pops in audio

Possible Causes:

- Buffer underrun: increase the buffer size
- PulseAudio configuration issues
- USB DAC issues: try a different USB port

Solution: Check the logs for underrun messages and verify that PulseAudio is running correctly.

Sync Drift Not Correcting

Symptom: Player drifts out of sync over time

Check:

- Verify that the player status shows isClockSynced: true
- Check the logs for sync correction messages
- Ensure network latency is stable

Constant Sync Error with Growing Buffer

Symptom: Stats for Nerds shows:

- Constant sync error (e.g., -199ms or +250ms)
- Buffer growing well beyond target (e.g., 4000ms vs 250ms target)
- Playback rate stuck at minimum or maximum (0.96x or 1.04x)

Cause: Output latency mismatch. The actual device latency differs from the reported latency, so the sync error stays constant however the playback rate is adjusted.

Solution:

- Check PulseAudio latency via pa_simple_get_latency()
- Verify the device is correctly configured in PulseAudio

Diagnostic: Open Stats for Nerds and check "Output Latency" in Clock Sync section.
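
One way to perform the latency check from managed code is a P/Invoke call to `pa_simple_get_latency`, sketched below. The native signature matches the libpulse-simple C API. The library file name, the wrapper class, and the 20 ms tolerance are assumptions for illustration, and the stream handle must come from an already opened `pa_simple` connection.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical diagnostic helper; not part of the project.
public static class LatencyCheckSketch
{
    // C signature: pa_usec_t pa_simple_get_latency(pa_simple *s, int *error);
    // The library name is an assumption about the target system.
    [DllImport("libpulse-simple.so.0")]
    private static extern ulong pa_simple_get_latency(IntPtr stream, out int error);

    // Returns the playback latency PulseAudio reports for the stream, in milliseconds.
    public static double MeasuredLatencyMs(IntPtr stream)
    {
        ulong usec = pa_simple_get_latency(stream, out int error);
        if (usec == ulong.MaxValue) // (pa_usec_t) -1 signals failure
            throw new InvalidOperationException($"pa_simple_get_latency failed with error {error}");
        return usec / 1000.0;
    }

    // Flags a mismatch between the latency used for sync and the measured value.
    public static bool IsMismatch(double reportedMs, double measuredMs, double toleranceMs = 20.0)
        => Math.Abs(reportedMs - measuredMs) > toleranceMs;
}
```

A large gap between the "Output Latency" value in Stats for Nerds and the value returned by `MeasuredLatencyMs` would be consistent with the mismatch described above.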

Performance Considerations

CPU Usage

With PulseAudio handling resampling, CPU usage is minimal:

| Component      | Typical CPU (per player) |
|----------------|--------------------------|
| SDK Processing | ~1-2%                    |
| PulseAudio     | Handled by system        |

Memory Usage

| Component          | Memory  |
|--------------------|---------|
| Audio Buffer       | ~32 KB  |
| PulseAudio Buffers | ~8 KB   |
| Total per player   | ~40 KB  |
