Deepgram Flux Composite Agent

A simple terminal-based Python application that demonstrates the Deepgram Flux API integrated with OpenAI and Deepgram TTS. This demo processes a static audio file and creates a basic agent experience entirely in the terminal.

Features:

  • 🎤 Flux Transcription: Real-time transcription using Deepgram's Flux API
  • 🤖 OpenAI LLM: Generates agent-style responses from the Flux transcript
  • 🔊 Deepgram TTS: Natural voice synthesis for agent responses
  • 🖥️ Terminal-only: No UI needed - everything runs in your terminal
  • 📁 Static Audio: Processes your converted linear16 audio files

What It Does

The application follows this simple pipeline:

  1. 📁 Audio Loading: Reads your converted linear16 audio file
  2. 🎤 Flux Transcription: Sends audio to Deepgram Flux API for real-time transcription
  3. 🤖 OpenAI Response: Generates intelligent responses using GPT-4o-mini
  4. 🔊 TTS Generation: Converts the response to speech using Deepgram TTS
  5. 💾 Audio Output: Saves the agent's speech as audio/responses/agent_response.wav
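
The sketch below shows one way this flow could be wired together. It is illustrative rather than the actual main.py: transcribe_with_flux and synthesize_speech are hypothetical placeholders for the real Deepgram Flux and TTS calls, and only the OpenAI call uses the official openai Python package (which the sketch assumes is installed).

    # Hypothetical orchestration sketch -- not the code in main.py.
    # transcribe_with_flux() and synthesize_speech() are placeholders for the
    # real Deepgram Flux and TTS calls; the OpenAI call uses the official SDK.
    import os

    from openai import OpenAI


    def transcribe_with_flux(audio_path: str) -> str:
        """Placeholder: stream the linear16 file to Deepgram Flux, return the transcript."""
        raise NotImplementedError


    def synthesize_speech(text: str, out_path: str) -> None:
        """Placeholder: send the agent's reply to Deepgram TTS and save it as a WAV file."""
        raise NotImplementedError


    def run_pipeline(audio_path: str) -> None:
        transcript = transcribe_with_flux(audio_path)          # steps 1-2: load + transcribe
        client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # step 3: generate a reply
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are a helpful voice agent."},
                {"role": "user", "content": transcript},
            ],
        ).choices[0].message.content
        synthesize_speech(reply, "audio/responses/agent_response.wav")  # steps 4-5: TTS + save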

Getting an API Key

🔑 To access the Deepgram API you will need a free Deepgram API Key.

🔑 To access the OpenAI API you will need an OpenAI API Key.

Installation

  1. Install the required packages:
    pip install deepgram-sdk python-dotenv

Running the Demo

  1. Create a .env file with your API keys (a loading sketch follows these steps):

    DEEPGRAM_API_KEY=your_deepgram_api_key_here
    OPENAI_API_KEY=your_openai_api_key_here
  2. Use the default audio file provided, or add your own audio file to the audio/ directory and convert it to linear16 format:

    # Convert your audio file to the required format for Flux
    ffmpeg -i audio/your_file.wav -ar 16000 -ac 1 -c:a pcm_s16le audio/your_file_linear16.wav

    Then update the AUDIO_FILE path in main.py to point to your converted file.

  3. Run the demo:

    python main.py
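
For reference, here is a minimal sketch of how the keys from step 1 can be loaded with python-dotenv. The environment variable names match the .env file above; everything else is illustrative rather than the exact contents of main.py.

    # Minimal sketch: load API keys from .env with python-dotenv.
    # The variable names match the .env file above; the rest is illustrative.
    import os

    from dotenv import load_dotenv

    load_dotenv()  # reads .env from the current working directory

    DEEPGRAM_API_KEY = os.getenv("DEEPGRAM_API_KEY")
    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

    if not DEEPGRAM_API_KEY or not OPENAI_API_KEY:
        raise SystemExit("Set DEEPGRAM_API_KEY and OPENAI_API_KEY in your .env file")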

Example Output

(Screenshot of example terminal output.)

Customization

You can modify the following in main.py:

  • TTS model: Change "aura-2-phoebe-en" to other voices (aura-2-apollo-en, etc.)
  • Audio filename: Change AUDIO_FILE to use a different filename
  • Response logic: Modify the response generation to create different agent behaviors
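
Purely as an illustration (the actual layout of main.py may differ), these settings might look like module-level constants near the top of the file:

    # Illustrative only -- the exact constant names and layout in main.py may differ.
    AUDIO_FILE = "audio/your_file_linear16.wav"          # path to the converted linear16 input
    TTS_MODEL = "aura-2-phoebe-en"                       # try e.g. "aura-2-apollo-en"
    OPENAI_MODEL = "gpt-4o-mini"                         # model used for agent responses
    OUTPUT_FILE = "audio/responses/agent_response.wav"   # where the agent's speech is saved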

Troubleshooting

  1. Missing API Key: Set DEEPGRAM_API_KEY and OPENAI_API_KEY in your .env file or environment (see the preflight sketch after this list)
  2. Missing Audio File: Add an audio file to the audio/ directory and convert it with FFmpeg
  3. Wrong Audio Format: Flux requires linear16 - use the FFmpeg command above
  4. SDK Import Error: Reinstall the SDK with pip install deepgram-sdk python-dotenv
  5. Connection Issues: Check your internet connection and confirm your API keys are valid
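
If it helps, here is a hypothetical preflight check (not part of the repo) that covers the first three items using only the standard library:

    # Hypothetical preflight check -- not part of the repo.
    # Verifies API keys, the audio file, and the linear16 format before running main.py.
    import os
    import wave

    AUDIO_FILE = "audio/your_file_linear16.wav"  # adjust to your converted file

    missing = [k for k in ("DEEPGRAM_API_KEY", "OPENAI_API_KEY") if not os.getenv(k)]
    if missing:
        raise SystemExit(f"Missing API key(s): {', '.join(missing)}")

    if not os.path.exists(AUDIO_FILE):
        raise SystemExit(f"Audio file not found: {AUDIO_FILE}")

    with wave.open(AUDIO_FILE) as wav:
        ok = (wav.getnchannels() == 1 and wav.getsampwidth() == 2
              and wav.getframerate() == 16000)
    if not ok:
        raise SystemExit("Audio is not 16 kHz mono linear16 -- re-run the FFmpeg command above")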

Documentation

You can learn more about Deepgram APIs at developers.deepgram.com.

Getting Help

We love to hear from you! If you have questions, please reach out.
