A simple terminal-based Python application that demonstrates the Deepgram Flux API integrated with OpenAI and Deepgram TTS. This demo processes a static audio file and creates a basic agent experience entirely in the terminal.
Features:
- 🎤 Flux Transcription: Real-time transcription using Deepgram's Flux API
- 🤖 OpenAI LLM: Generates agent-style responses from the transcribed text
- 🔊 Deepgram TTS: Natural voice synthesis for agent responses
- 🖥️ Terminal-only: No UI needed - everything runs in your terminal
- 📁 Static Audio: Processes your converted linear16 audio files
The application follows this simple pipeline:
- 📁 Audio Loading: Reads your converted linear16 audio file
- 🎤 Flux Transcription: Sends audio to Deepgram Flux API for real-time transcription
- 🤖 OpenAI Response: Generates intelligent responses using GPT-4o-mini
- 🔊 TTS Generation: Converts the response to speech using Deepgram TTS
- 💾 Audio Output: Saves the agent's speech as audio/responses/agent_response.wav
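
The pipeline above can be sketched as a chain of three pluggable stages. This is only an illustration of the data flow, not the code in main.py: `run_pipeline` and its `transcribe`, `respond`, and `synthesize` parameters are hypothetical names, and the real Deepgram and OpenAI calls are deliberately left out and replaced by callables you supply.

```python
from pathlib import Path
from typing import Callable


def run_pipeline(
    audio_path: Path,
    transcribe: Callable[[bytes], str],   # placeholder for the Flux transcription step
    respond: Callable[[str], str],        # placeholder for the LLM response step
    synthesize: Callable[[str], bytes],   # placeholder for the TTS step
    output_path: Path = Path("audio/responses/agent_response.wav"),
) -> Path:
    """Run one pass: audio file -> transcript -> reply -> speech -> saved file."""
    audio_bytes = audio_path.read_bytes()      # 1. load the raw file bytes
    transcript = transcribe(audio_bytes)       # 2. speech to text
    reply = respond(transcript)                # 3. text to agent reply
    speech = synthesize(reply)                 # 4. reply to audio bytes
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_bytes(speech)            # 5. save the agent's speech
    return output_path
```

Passing the stages in as functions keeps each service call replaceable, so the flow can be exercised with stand-in functions before any API key is configured.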
🔑 To access the Deepgram API you will need a free Deepgram API Key.
🔑 To access the OpenAI API you will need a free OpenAI API Key.
- Install the required packages:

      pip install deepgram-sdk python-dotenv

- Create a .env file with your API keys:

      DEEPGRAM_API_KEY=your_deepgram_api_key_here
      OPENAI_API_KEY=your_openai_api_key_here

- Use the default audio file provided OR add your audio file to the audio/ directory and convert it to linear16 format:

      # Convert your audio file to the required format for Flux
      ffmpeg -i audio/your_file.wav -ar 16000 -ac 1 -c:a pcm_s16le audio/your_file_linear16.wav

  Then update the AUDIO_FILE path in main.py to point to your converted file.

- Run the demo:

      python main.py
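
Before running the demo, it can help to confirm that both keys from the .env file are actually visible to Python. The following is a minimal sketch using only the standard library; `missing_api_keys` is a hypothetical helper, not something main.py is known to contain.

```python
import os

# The two variable names come from the .env example above.
REQUIRED_KEYS = ("DEEPGRAM_API_KEY", "OPENAI_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

If you load the .env file with python-dotenv, call `load_dotenv()` first and then `missing_api_keys()`; an empty list means both keys were found.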
You can modify the following in main.py:
- TTS model: Change "aura-2-phoebe-en" to other voices (aura-2-apollo-en, etc.)
- Audio filename: Change AUDIO_FILE to use a different filename
- Response logic: Modify the response generation to create different agent behaviors
- Missing API Key: Set DEEPGRAM_API_KEY and OPENAI_API_KEY environment variables
- Missing Audio File: Add audio file to audio/ directory and convert with FFMPEG
- Wrong Audio Format: Flux requires linear16 - use FFMPEG command above
- SDK Import Error: Reinstall the Deepgram SDK with pip install deepgram-sdk, in the same Python environment you use to run main.py
- Connection Issues: Check internet connection and API key validity
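
To tell a wrong audio format apart from the other issues in this list, you can inspect the WAV header locally before sending anything to the API. This sketch uses Python's built-in `wave` module and assumes the target format is what the FFMPEG command produces: 16 kHz, mono, 16-bit PCM. `linear16_problems` is a hypothetical helper name.

```python
import wave


def linear16_problems(path, expected_rate=16000):
    """Return a list of reasons the file is not 16-bit mono PCM at expected_rate."""
    try:
        wav = wave.open(str(path), "rb")
    except (wave.Error, EOFError) as exc:
        # Raised for files that are not readable PCM WAV data.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    with wav:
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, expected 16-bit")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
        if wav.getframerate() != expected_rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {expected_rate} Hz")
    return problems
```

An empty list means the header matches the expected format. A missing file still raises `FileNotFoundError`, which keeps that case distinct from a format mismatch.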
You can learn more about Deepgram APIs at developers.deepgram.com.
We love to hear from you! If you have questions:
