andreimatveyeu/awesome-python-audio

Awesome Python Audio and Music 🎵

A curated list of Python tools, libraries, and resources for audio and music processing, analysis, synthesis, and playback.

Audio Processing & I/O

  • audioread: Cross-library audio decoding (GStreamer + Core Audio + MAD + FFmpeg)
  • audiomentations: Audio data augmentation library for machine learning
  • babycat: Audio manipulation library for Rust, Python, WebAssembly, and C
  • matchering: Open source audio matching and mastering
  • Matchering-cli: Command line application for Matchering 2.0
  • noisereduce: Noise reduction using spectral gating
  • numpy & scipy.io.wavfile: Read/write and manipulate WAV files
  • pedalboard: Spotify's library for audio effects and processing
  • PyDub: Manipulate audio with a simple, high-level interface
  • pyAudioProcessing: Audio feature extraction, classification, and segmentation
  • SoundDevice: Play and record audio with Python
  • soundfile: Read and write sound files using libsndfile
  • torch-audiomentations: Fast GPU audio data augmentation for PyTorch
  • torchaudio: Audio data manipulation and transformation powered by PyTorch
  • wave: Read and write WAV files (Python standard library)
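
For a quick start without third-party dependencies, the standard-library `wave` module listed above is enough to write and inspect a WAV file. The following is a minimal sketch, not a recommended production approach: the helper names `write_sine_wav` and `read_wav_info` and the default tone parameters are assumptions made for illustration.

```python
import math
import struct
import wave


def write_sine_wav(path, freq_hz=440.0, duration_s=0.5,
                   sample_rate=44100, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to `path`; return the frame count.

    Hypothetical helper for illustration; only `wave`, `struct`, and `math`
    from the standard library are used.
    """
    n_frames = int(duration_s * sample_rate)
    frames = bytearray()
    for n in range(n_frames):
        sample = amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
    return n_frames


def read_wav_info(path):
    """Return (channels, sample width in bytes, sample rate, frame count)."""
    with wave.open(path, "rb") as wf:
        return (wf.getnchannels(), wf.getsampwidth(),
                wf.getframerate(), wf.getnframes())
```

Calling `write_sine_wav("tone.wav")` and then `read_wav_info("tone.wav")` reports the header fields of the file just written. For anything beyond uncompressed PCM, libraries such as soundfile or PyDub from the list are likely a better fit.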

Analysis & Feature Extraction

  • aubio: Library for audio and music analysis including pitch and beat detection
  • audioFlux: Library for audio and music analysis and feature extraction
  • Essentia: C++ library with Python bindings for audio analysis and MIR
  • librosa: Python package for music and audio analysis
  • Madmom: Audio signal processing library focused on MIR tasks
  • mir_eval: Evaluation functions for MIR and audio signal processing algorithms
  • mirdata: Python library for working with MIR datasets
  • nnAudio: GPU audio processing using PyTorch neural networks
  • pyAudioAnalysis: Audio feature extraction, classification, segmentation, and visualization
  • Pyo: Python DSP module with synthesis and analysis capabilities
  • scipy.signal: Signal processing routines for SciPy
  • timeside: Framework for audio analysis, imaging, transcoding, and streaming

Audio Embeddings & Representations

  • CLAP (LAION): Contrastive Language-Audio Pretraining for zero-shot audio classification
  • CLAP (Microsoft): Learning audio concepts from natural language supervision
  • OpenL3: Open-source deep audio and image embeddings
  • panns-inference: Pretrained audio neural networks for audio tagging and sound event detection
  • wav2vec2: Self-supervised speech representations from Facebook AI

Speech Processing

Speech-to-Text

  • Whisper: OpenAI's robust multilingual speech recognition model
  • faster-whisper: CTranslate2 reimplementation of Whisper, up to 4x faster
  • WhisperX: Whisper with word-level timestamps and speaker diarization
  • SpeechRecognition: Library for performing speech recognition with multiple backends
  • Vosk: Offline speech recognition API supporting 20+ languages
  • SpeechBrain: PyTorch toolkit for speech processing and conversational AI
  • pyannote-audio: Neural speaker diarization and voice activity detection

Text-to-Speech

  • Coqui TTS: Deep learning toolkit for Text-to-Speech
  • Bark: Transformer-based text-to-audio model with emotions and non-speech sounds
  • pyttsx3: Offline text-to-speech conversion library

Source Separation

  • Demucs: State-of-the-art music source separation from Meta
  • audio-separator: Easy stem separation using MDX-Net, VR Arch, and Demucs models
  • Asteroid: PyTorch-based audio source separation toolkit for researchers
  • pydsm: Google's toolkit for sound separation using deep learning
  • Spleeter: Deezer source separation library (note: Demucs now preferred)

Music Transcription & Pitch

  • basic-pitch: Spotify's lightweight neural network for polyphonic pitch detection
  • CREPE: Monophonic pitch tracker using a deep convolutional neural network
  • torchcrepe: PyTorch implementation of CREPE pitch tracker
  • MT3: Multi-instrument automatic music transcription from Google Magenta
  • piano_transcription_inference: High-resolution piano transcription with pedal detection

Music Generation & AI

  • AudioCraft: Meta's library for the MusicGen, AudioGen, EnCodec, and MAGNeT models
  • Stable Audio Tools: Generative models for conditional audio generation from Stability AI
  • Riffusion: Real-time music generation using stable diffusion on spectrograms
  • Magenta: Google's machine learning for music and art generation
  • musicautobot: Music generation with transformers using fastai
  • NSynth: Neural audio synthesis model from Magenta

Synthesis & Sound Design

  • ctcsound: Python bindings for Csound using ctypes
  • Mido: MIDI objects for Python
  • Pippi: Computer music composition library
  • pyfluidsynth: Python bindings for FluidSynth software synthesizer
  • Python-audio: Jupyter notebooks about audio signal processing with Python
  • Renardo: Maintained fork of FoxDot for Python live coding music
  • sc3nb: SuperCollider integration for Python and Jupyter notebooks

Music Theory & Composition

  • Abjad: Python API for building LilyPond music notation files
  • AthenaCL: Algorithmic composition tool (Python 3 fork)
  • maelzel: Framework for computer music in Python
  • MIDIUtil: Pure Python library for creating multi-track MIDI files
  • mingus: Advanced music theory and notation package
  • music21: Toolkit for computer-aided musical analysis
  • MusPy: Toolkit for symbolic music generation
  • pretty-midi: MIDI data handling and manipulation library
  • pychord: Handle and transform musical chords
  • scamp: Suite for Computer-Assisted Music in Python

Playback & Services

  • audiostream: Audio API for streaming raw data to speakers
  • beets: Music library manager and MusicBrainz tagger
  • discord.py: Python wrapper for Discord API with music streaming
  • freesound-python: Freesound API wrapper for audio retrieval and analysis
  • miniaudio: Python bindings for miniaudio audio playback library
  • Mopidy: Extensible music server written in Python
  • Mopidy-YouTube: Mopidy extension for playing music from YouTube
  • mpv: Python interface to MPV media player
  • MusicBot: Discord music bot written in Python
  • pyAV: Pythonic bindings for FFmpeg libraries
  • pygame.mixer: Pygame module for sound loading and playback
  • pyglet: Cross-platform windowing and multimedia library
  • pyradio: Command line internet radio player
  • Spotipy: Python client for the Spotify Web API

Datasets

Audio

  • AudioSet: Large-scale dataset of manually annotated audio events
  • Birdsong: Dataset of annotated bird songs and calls
  • Common Voice: Mozilla's open source multilingual speech dataset
  • ESC-50: Environmental sound classification dataset
  • Free Spoken Digit Dataset: Dataset of spoken digits in English
  • Freesound Dataset: Collaborative dataset of audio samples from Freesound
  • LibriSpeech: Large corpus of read English speech for ASR research
  • RAVDESS: Audio-visual dataset of emotional speech and song
  • Speech Commands: Dataset for speech command recognition
  • TIDIGITS: Spoken digit dataset for speech recognition
  • UrbanSound8K: 8,732 labeled urban sound excerpts in 10 classes
  • VCTK: Multispeaker speech dataset for voice technologies
  • VoxCeleb: Large-scale speaker identification dataset

Music

  • Beatport EDM Key: Electronic dance music tracks with musical key labels
  • DALI: Dataset of lyrics and audio with time alignments
  • DEAM: MediaEval dataset for music emotion recognition
  • FMA: Free Music Archive dataset for music analysis
  • GiantMIDI-Piano: Large-scale MIDI dataset of classical piano music
  • hsmusic: Huge symbolic music dataset
  • IRMAS: Instrument recognition in musical audio signals
  • Jamendo Audio Tagging: Multi-label audio tagging dataset
  • LAION-Audio-630K: Large collection of audio-text pairs for CLAP training
  • MAESTRO: MIDI and audio dataset for music transcription and generation
  • MagnaTagATune: Dataset for music annotation and audio tagging
  • MedleyDB: Dataset for multi-track mixing research
  • Musdb18: Dataset for music source separation
  • MusicCaps: Dataset of music clips with rich text descriptions
  • MusicNet: Dataset of classical music with instrument labels
  • NSynth: Large-scale dataset of annotated musical notes
  • Open MIC: Open Music Instrument Classification dataset
  • RWC Music Database: Musical instrument sound, genre, and rhythm databases
  • symbolic-music-datasets: Collection of symbolic music datasets
  • The Million Song Dataset: Massive collection of audio features and metadata

Tutorials
