padmanabhan-r/GrooveForge

GrooveForge

Search by vibe. Generate by blueprint.

Badges: Live App · API · ElevenLabs · Turbopuffer · Gemini · ElevenHacks · React · FastAPI

Every musician has a tune in mind.

What if you could search and create music by feel, generate songs by vibe, play a song to create music Shazam-style, fuse genres into entirely new sounds, and transform lyrics into fully composed songs?

Meet GrooveForge — THE ULTIMATE AI TOOLKIT FOR ORIGINAL MUSIC CREATION.

GrooveForge — Search by vibe. Generate by blueprint.


What is GrooveForge?

GrooveForge is a retrieval-augmented music creation system.

Instead of describing music in the abstract, you search by the actual structural properties that make music sound the way it does — key, tempo, mood, instrumentation, lyrical themes. GrooveForge gives you four powerful ways to create:

  • Vibe Graph — Click genre, mood, tempo, key, and theme nodes to compose a vibe
  • Sound Match — Play a song. GrooveForge extracts its sonic fingerprint and generates something completely original in the same feel — Shazam, but for creation
  • Text-to-Music — Describe what you want to create using natural language
  • Lyrics-to-Music — Transform written lyrics into a fully composed song

At its core, GrooveForge indexes millions of audio blueprints enriched with features that define a song's DNA: genre, mood, key, tempo, energy, danceability, acousticness, valence, instrumentalness, and vocal characteristics. By retrieving and analyzing the closest matches, it generates original compositions grounded in real musical structure — ensuring precision, originality, and creative control.

Every generated track comes with a visible reasoning trail — the exact blueprint cards that shaped it — so you can see why it sounds the way it does. No black boxes. No hallucinated characteristics.


How It Works

1. Describe your vibe — Select nodes in the graph, type a natural-language description, paste original lyrics, or just play a song you love and let GrooveForge extract the vibe.

2. Retrieve blueprints — Your input is searched across millions of indexed tracks to find the closest musical matches by feel, genre, mood, key, tempo, and instrumentation. The top 5–10 blueprints are surfaced and ranked.

3. Aggregate traits — The retrieved blueprints are collapsed into a generation profile: average BPM, dominant key and mode, most common genre and mood, merged instrumentation.

4. Generate your track — Gemini synthesizes a music prompt strictly from the retrieved blueprint traits and sends it to ElevenLabs Music API to produce an original composition. In Advanced mode, lyrics are placed section by section — never mixed into style guidance.

Every track includes a visible reasoning trail — the exact blueprint cards and aggregated profile that drove the generation.
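The aggregation step (3) can be sketched in Python. Field names such as `bpm`, `genre`, and `instrumentation` are assumptions based on the blueprint attributes listed above, not the actual backend code:

```python
from collections import Counter
from statistics import mean

def aggregate_traits(blueprints: list[dict]) -> dict:
    """Collapse retrieved blueprint cards into one generation profile:
    average BPM, dominant key/mode, most common genre and mood,
    merged instrumentation. Illustrative sketch only."""
    genres = Counter(b["genre"] for b in blueprints)
    moods = Counter(b["mood"] for b in blueprints)
    keys = Counter((b["key"], b["mode"]) for b in blueprints)
    instruments = sorted({i for b in blueprints for i in b.get("instrumentation", [])})
    key, mode = keys.most_common(1)[0][0]
    return {
        "bpm": round(mean(b["bpm"] for b in blueprints)),
        "key": key,
        "mode": mode,
        "genre": genres.most_common(1)[0][0],
        "mood": moods.most_common(1)[0][0],
        "instrumentation": instruments,
    }
```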


Input Modes

Vibe Graph

Click genre, mood, tempo, key, mode, instrumentation, and theme nodes to compose a vibe. Every node selection tightens the search query. The system maps your picks to a hybrid retrieval query plus metadata filters, surfaces the closest blueprints, and generates.

Vibe Graph — interactive node selection
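A minimal sketch of how node picks might map to a text query plus metadata filters. The function name and the filter encoding are illustrative, not the real backend API:

```python
def build_query(selections: dict) -> tuple[str, dict]:
    """Map Vibe Graph node picks to a retrieval text query plus
    metadata filters. Hypothetical sketch: the node categories follow
    the README; the filter tuple syntax is invented for illustration."""
    text_parts = []
    filters = {}
    # Descriptive nodes tighten the hybrid text query.
    for category in ("genre", "mood", "theme"):
        text_parts.extend(selections.get(category, []))
    # Structural nodes become exact or range metadata filters.
    if "key" in selections:
        filters["key"] = selections["key"]
    if "mode" in selections:
        filters["mode"] = selections["mode"]
    if "tempo" in selections:  # e.g. (100, 120) BPM range
        lo, hi = selections["tempo"]
        filters["bpm"] = ("range", lo, hi)
    return " ".join(text_parts), filters
```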

Text-to-Music

Type anything: "moody synthwave, 110 BPM, instrumental" or "upbeat pop, female vocals, summer road trip". Your description is embedded and searched across millions of indexed tracks to find the closest blueprint matches.

Free-Text Search — natural language vibe search

Lyrics-to-Music

Paste original lyrics. Gemini analyzes emotional tone, themes, energy level, and rhythmic structure. The derived traits drive blueprint retrieval — your lyrics never contaminate the style guidance. In Advanced mode, lyrics are placed in ElevenLabs lines fields per section; style guidance comes from the blueprints only.

Lyrics-to-Music — paste lyrics, get a track

Sound Match

Just hit play on any song you love. Gemini extracts the sonic fingerprint: BPM, key, mode, mood, texture, instrumentation. Those traits drive blueprint retrieval across millions of tracks, and GrooveForge generates something completely original in the same vibe. The artist name and song title never reach ElevenLabs — only the derived feel does.

Sound Match — play any song, get an original track in the same vibe

Generated History

Every track you generate is saved locally (localStorage). Replay any track, rename it, or download the MP3. History persists across sessions.

Generated History — replay, rename, download


Architecture

See ARCHITECTURE.md for the full system diagram and endpoint reference.


Datasets

GrooveForge's blueprint index is built on four open datasets. Only structured metadata and derived features are used — no audio files are stored or processed.

| Dataset | Size | What it contributes |
| --- | --- | --- |
| Million Song Dataset (MSD) | ~1M tracks | The backbone. Provides BPM, key, mode, loudness, artist familiarity, and release year for a million songs. |
| LP-MusicCaps-MSD | ~513K tracks | MSD tracks enriched with captions from the LP-MusicCaps annotation project, providing rich natural-language descriptions of mood, texture, instrumentation, and genre — the primary retrieval anchor for each blueprint's text_description. |
| Free Music Archive (FMA) | ~106K tracks | Creative Commons licensed tracks with genre labels, Echonest audio features (valence, energy, danceability, instrumentalness, acousticness), and track-level metadata. Covers a wide range of independent and niche genres. |
| MusicCaps | ~5.5K tracks | A high-quality, human-annotated evaluation set from Google DeepMind. Used to validate caption quality and tag vocabulary, and instrumental in shaping the genre/mood classification vocabulary. |

Together these datasets cover mainstream, indie, electronic, classical, world music, and everything in between — giving retrieval broad coverage across moods, genres, keys, and tempos.


Data Pipeline

The blueprint index was built in three offline stages. All scripts live in backend/scripts/.

Stage 1 — Raw datasets → Blueprint Parquets

ingest_blueprints.py
  LP-MusicCaps-MSD (513,977 tracks)  → data/blueprints_lp_msd.parquet
  FMA              (106,574 tracks)  → data/blueprints_fma.parquet

For LP-MusicCaps-MSD:

  • Tags parsed and classified into genre / mood / themes via vocabulary sets
  • Vocal type inferred from tag strings (female vocal, male vocal, instrumental)
  • Energy derived from loudness: (loudness + 25) / 25, clamped to [0, 1]
  • text field assembled from caption_summary + caption_writing + tags + key/mode/BPM

For FMA:

  • Genre from genre_top; mood derived from echonest valence threshold (>0.6 → upbeat, <0.3 → melancholic)
  • Vocal type from instrumentalness threshold (>0.8 → instrumental)
  • text field assembled from title + genre + descriptors + BPM + mood
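The derivation rules above translate directly into code. This sketch assumes the fallback labels (`neutral`, `vocal`), since the README only specifies the threshold cases:

```python
def derive_energy(loudness_db: float) -> float:
    """MSD loudness (dB, roughly -60..0) -> energy in [0, 1],
    using the (loudness + 25) / 25 formula with clamping."""
    return min(1.0, max(0.0, (loudness_db + 25) / 25))

def derive_fma_mood(valence: float) -> str:
    """FMA mood from the Echonest valence thresholds above.
    'neutral' is an assumed fallback for the middle band."""
    if valence > 0.6:
        return "upbeat"
    if valence < 0.3:
        return "melancholic"
    return "neutral"

def derive_fma_vocal_type(instrumentalness: float) -> str:
    """'vocal' is an assumed label for the non-instrumental case."""
    return "instrumental" if instrumentalness > 0.8 else "vocal"
```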

Stage 2 — Parquets → Turbopuffer

embed_blueprints.py
  blueprints_lp_msd.parquet  → Turbopuffer namespace lp_msd_minilm  (513,977 records)
  blueprints_fma.parquet     → Turbopuffer namespace fma_minilm     (106,574 records)
  • Embedding model: sentence-transformers/all-MiniLM-L6-v2 (384-dim, L2-normalized)
    • During data pipeline (local): The model was run locally via the sentence-transformers Python package — no GPU required. all-MiniLM-L6-v2 is small enough to encode comfortably on CPU, producing ~1,000 vectors/sec and making it practical to embed all 620K+ blueprint records in a single offline run. Running it locally meant zero API cost for the bulk embedding pass and no rate-limit concerns.
    • At inference (OpenRouter): For query embedding at request time, we switched to the same model served via the OpenRouter API. This avoids bundling the model weights in the Railway server container, keeps the deployment lightweight, and centralizes access with a single API key. The vectors are dimensionally identical (384-dim, L2-normalized), so ANN queries work seamlessly against the locally-built index.
    • Batch encode: 256 rows per encode call; upsert 500 rows per Turbopuffer write call
  • Schema: text (full-text search enabled), plus filterable string attributes (source, genre, mood, vocal_type, key, mode, mode_key) and numeric attributes (bpm, year, energy, acousticness, valence, danceability, instrumentalness, artist_familiarity)
  • Checkpointed: progress saved to data/.embed_checkpoints/ so a killed run resumes from the last successful batch
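The checkpointed encode/upsert loop might look like this sketch, with `encode` and `upsert` standing in for the MiniLM encoder and the Turbopuffer write call (both hypothetical parameters, not the real script's API):

```python
import json
from pathlib import Path

def embed_with_checkpoints(rows, encode, upsert, ckpt_path,
                           encode_batch=256, upsert_batch=500):
    """Embed rows in encode_batch chunks, write in upsert_batch chunks,
    and record progress after every write so a killed run resumes
    from the last successful batch."""
    ckpt = Path(ckpt_path)
    start = json.loads(ckpt.read_text())["done"] if ckpt.exists() else 0
    for i in range(start, len(rows), upsert_batch):
        chunk = rows[i : i + upsert_batch]
        vectors = []
        for j in range(0, len(chunk), encode_batch):
            vectors.extend(encode([r["text"] for r in chunk[j : j + encode_batch]]))
        upsert(list(zip(chunk, vectors)))
        ckpt.write_text(json.dumps({"done": i + len(chunk)}))
```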

Both namespaces are queried concurrently at runtime via asyncio.gather. Results are merged with Reciprocal Rank Fusion (RRF, k=60).
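RRF itself is compact. A sketch of the merge with the stated k=60, assuming each namespace returns IDs ranked best-first:

```python
def rrf_merge(ranked_lists, k=60, top_n=10):
    """Reciprocal Rank Fusion: each document scores 1 / (k + rank)
    per list it appears in; sum across lists and re-sort."""
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```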


Generation Modes

| Mode | Description |
| --- | --- |
| Simple (Prompt) | Fast iteration — one text prompt derived from aggregated blueprint traits sent to ElevenLabs |
| Advanced (Composition Plan) | Structured songs — section-level control (intro/verse/chorus/bridge/outro), lyric placement per section, local style guides |

Both modes support Review Before Generate — a dry-run that synthesizes and shows you the exact prompt or composition plan before committing to an ElevenLabs call. Approve or cancel.

Composition plan structure:

  • positive_global_styles — genre, mood, tempo, key from aggregated blueprints
  • positive_local_styles — per-section style directions
  • lines — user lyrics only, placed per section (never mixed with style guidance)
  • negative_global_styles — traits to suppress
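An illustrative composition-plan payload following the four fields above. The section names and values are invented for the example, and the real ElevenLabs schema may shape sections differently:

```python
# Hypothetical composition-plan payload; field names follow the README's
# bullet list, everything else is example data.
plan = {
    "positive_global_styles": ["synthwave", "moody", "110 BPM", "A minor"],
    "positive_local_styles": {
        "intro": ["ambient pads, slow build"],
        "chorus": ["wide synth leads, driving drums"],
    },
    # User lyrics only — kept strictly separate from style guidance.
    "lines": {
        "verse": ["Neon rain on empty streets", "Headlights fading out of reach"],
    },
    "negative_global_styles": ["acoustic", "country"],
}
```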

Tech Stack

| Category | Technology |
| --- | --- |
| Frontend | React 18, TypeScript, Vite, TailwindCSS, Framer Motion, react-flow, Radix UI |
| Backend | FastAPI, Python 3.13+, uv, Pydantic, asyncio |
| AI & Music | ElevenLabs Music API (prompt + composition-plan), Google Gemini 2.5-flash |
| Retrieval | Turbopuffer — ANN + BM25 hybrid, metadata filters, RRF merge across 2 namespaces |
| Embeddings | all-MiniLM-L6-v2 via OpenRouter API (384-dim, no local GPU needed) |
| Data Sources | LP-MusicCaps-MSD (513K), Free Music Archive (106K), MSD full (1M), MusicCaps (5.5K) |
| Deployment | Railway (backend, Hobby tier) + Vercel (frontend, SPA rewrite) |

Screenshots

Blueprint Cards

Blueprint cards — BPM chip, key badge, energy bar, genre

Generated Track

Generated track hero — audio player, reasoning trail

Review Composition Plan

Review composition plan before sending to ElevenLabs

Generation Overlay

Animated two-phase loading overlay


Running Locally

Prerequisites: Python 3.11+, Node 18+, uv

```bash
# Backend
cd backend
uv sync
cp .env.example .env   # fill in API keys
uv run uvicorn app.main:app --reload --port 8000
```

```bash
# Frontend (separate terminal)
cd frontend
npm install
npm run dev
# Opens at http://localhost:8080
```

Environment variables (backend/.env):

```bash
ELEVENLABS_API_KEY=...
TURBOPUFFER_API_KEY=...
OPENROUTER_API_KEY=...
GEMINI_API_KEY=...
```

Data pipeline (one-time setup — only needed to rebuild the Turbopuffer index):

```bash
cd backend
uv run python scripts/ingest_blueprints.py   # dataset → blueprint records
uv run python scripts/embed_blueprints.py    # embed + upsert into Turbopuffer
```

The Turbopuffer namespaces (lp_msd_minilm, fma_minilm) are already populated in production. You only need to run the data pipeline if you're rebuilding the index from scratch.


Copyright-Safe by Design

No copyrighted material ever reaches ElevenLabs — by design, not accident.

  • Metadata only — blueprints are structured features (BPM, key, genre, energy) and human-written captions. No audio is stored or processed.
  • Artist & title firewall — names and titles are stripped before anything reaches Gemini or ElevenLabs. Only derived traits flow into generation.
  • text_description excluded from LLM context — free-text fields stay retrieval-only (BM25); never passed to Gemini as they may embed real artist references.
  • Original output — ElevenLabs generates a brand-new composition. Blueprint retrieval shapes the style; it reproduces nothing.
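The artist-and-title firewall reduces to dropping a few fields before traits leave the retrieval layer. A sketch, with field names assumed for illustration:

```python
def strip_identifying_fields(blueprint: dict) -> dict:
    """Remove artist, title, and free-text caption before blueprint
    traits reach Gemini or ElevenLabs. Field names are assumptions
    for illustration, not the actual backend schema."""
    blocked = {"artist_name", "title", "text_description"}
    return {k: v for k, v in blueprint.items() if k not in blocked}
```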

License

This project is licensed under the MIT License.
