From c1ad0dc8e1170939edd0459d8a71e9f18b537ef4 Mon Sep 17 00:00:00 2001 From: Daniel Joaquin Trujillo <54636507+danieljtrujillo@users.noreply.github.com> Date: Fri, 12 Jun 2026 00:34:47 -0700 Subject: [PATCH 1/2] docs: overhaul guide opener, add screenshots throughout, document suggester/play count/automix Rewrites the opening overview around the full scope of the app, leads with a galaxy hero and spreads supporting screenshots through the sections (sized down and borderless) so section ends are not blank, and adds the Suggest-a-Playlist section, a play-count control with a Plays sort, and an Automix note. README Library and DJ summaries updated to match. --- README.md | 4 +- backend/modules/library/router.py | 5 +- docs/USER_GUIDE.md | 60 ++++++++++++++++++--- docs/reports/feature-doc-coverage-report.md | 8 +-- docs/reports/feature-doc-coverage.json | 9 ++-- docs/screenshots/manifest.json | 2 +- frontend/public/USER_GUIDE.md | 60 ++++++++++++++++++--- 7 files changed, 120 insertions(+), 28 deletions(-) diff --git a/README.md b/README.md index ca81c2b..920b479 100644 --- a/README.md +++ b/README.md @@ -142,7 +142,7 @@ A chain of 24 FFmpeg effects covers a mastering chain, compression, highpass and

-The library lives on the backend, with audio on disk, metadata in `data/library.db`, and access over `/api/library/*`. Every render saves automatically with its prompt, model, duration, steps, CFG, seed, MIME type, and timestamp. List and grid views, full-text search, a favorites filter, and sorting by newest, duration, or title organize the collection, and each row plays inline with download, delete, favorite, and send-to-editor actions plus a details panel. The Catalogue view adds a cross-provider gallery with provider badges, an inspector with on-demand spectrograms, and a lineage panel, and it runs Suno cover and mashup from any entry.
+The library lives on the backend, with audio on disk, metadata in `data/library.db`, and access over `/api/library/*`. Every render saves automatically with its prompt, model, duration, steps, CFG, seed, MIME type, and timestamp. List and grid views, full-text search, a favorites filter, and sorting by newest, duration, title, or play count organize the collection, and each row plays inline with download, delete, favorite, and send-to-editor actions plus a details panel. A per-entry play count persists to the library database and surfaces as a badge and a sort. SUGGEST builds a continuous playlist from the analyzed library, ordered by Camelot-wheel harmony and a chosen BPM flow across a time budget, then plays it through the footer queue or sends it to the DJ tab as an automix set. The Catalogue view adds a cross-provider gallery with provider badges, an inspector with on-demand spectrograms, and a lineage panel, and it runs Suno cover and mashup from any entry.
### Bottom panel tools
diff --git a/backend/modules/library/router.py b/backend/modules/library/router.py
index f0b46aa..0d46c01 100644
--- a/backend/modules/library/router.py
+++ b/backend/modules/library/router.py
@@ -109,7 +109,7 @@ async def stream_audio(entry_id: str) -> Response:
resp.raise_for_status()
audio_bytes = resp.content
# Cache to disk so future requests skip CDN.
- local_name = meta.get("audio_filename") or f"{entry_id}.mp3"
+ local_name = (meta or {}).get("audio_filename") or f"{entry_id}.mp3"
local_path = entry_dir / local_name
try:
local_path.write_bytes(audio_bytes)
@@ -176,7 +176,8 @@ def register_play(entry_id: str) -> dict[str, Any]:
if record is None:
raise HTTPException(404, f"Entry {entry_id!r} not found")
entry_dir = store._dir_for(entry_id) # noqa: SLF001
- store._sync_record_to_db(record, _read_metadata(entry_dir) or {}) # noqa: SLF001
+ meta = _read_metadata(entry_dir) if entry_dir is not None else None
+ store._sync_record_to_db(record, meta or {}) # noqa: SLF001
new_count = store.db.increment_play_count(entry_id) or 1
return {"id": entry_id, "play_count": new_count}
diff --git a/docs/USER_GUIDE.md b/docs/USER_GUIDE.md
index 94ca4b3..bf5c0e8 100644
--- a/docs/USER_GUIDE.md
+++ b/docs/USER_GUIDE.md
@@ -2,15 +2,21 @@
_by GANTASMO_
-theDAW is a complete digital audio workstation built on Stable Audio 3, the state-of-the-art open audio model from Stability AI. The model turns a text prompt into high-fidelity 44.1 kHz stereo audio, and theDAW surrounds that core with a full production environment. A React front end and a FastAPI backend run together from one launcher, so the whole studio starts with a single double-click.
+theDAW packs most of a music career into one program. It writes original audio from a text prompt, arranges it on a multitrack timeline, masters it through a deep effects chain, splits it into stems, performs it across two beatmatched DJ decks, runs a reactive visual show, transcribes it to sheet music and tablature, and files everything in a searchable library that records how each piece descended from the last. It trains custom models on a collection and answers questions about any of it from a built-in assistant. A separate DAW, audio generator, stem separator, DJ rig, VJ engine, notation editor, sample manager, and model trainer collapse into a single window.
-Audio generation happens in the MAKE workspace. A prompt becomes finished audio through Stable Audio 3, and the same model menu also reaches Magenta RealTime 2 for streaming text-to-music and Suno for cloud generation. Chimera fusion can blend several clips into one downbeat-aligned piece, while init signals, inpainting, and LoRA adapters give finer control over the result. When the model selection moves between Stable Audio and the Magenta sidecar, the GPU swap runs on its own.
+
-Editing and mastering come next. The EDIT workspace opens a piece in a waveform editor that supports region inpainting. Effects and loudness work happens in MIX, which hosts the Edit Tool Stack alongside Quick Master macros. For live use, the DJ workspace provides a two-deck engine with beatmatch sync, key-lock, live stems, an FX rack, hot cues, and hands-free automix. WebGL visuals in the VJ workspace respond to the audio and accept MIDI, microphone, and mobile input.
+Generation runs on Stable Audio 3, the open generative audio model from Stability AI. A prompt becomes finished stereo, and the same model menu reaches Magenta RealTime 2 for streaming text-to-music and Suno for cloud renders. Chimera fusion folds several clips into one tempo-aligned track, while init signals, inpainting, and LoRA adapters steer a render toward a target.
-Everything generated is kept. The Library stores each piece on disk together with its analysis, stems, and MIDI, and LEARN renders the relationships between pieces as a navigable 3D genealogy. A symbolic-music pipeline produces sheet music, tablature, and multi-instrument arrangements, and it can read a score back into a usable text prompt. LoRA training and autoencoder round-trips have a home in the TRAIN workspace. An in-app assistant answers questions from this guide, and a paste-a-URL importer pulls audio from YouTube, SoundCloud, and Bandcamp.
+
-The in-app **Docs** button renders this guide as an interactive modal with a filterable table of contents, raw Markdown download, and print-to-PDF. Each workspace and the backend are documented in full below.
+A track then moves through EDIT for waveform surgery and region inpainting, and MIX for the Edit Tool Stack and Quick Master macros. The DJ workspace runs two decks with beatmatch sync, key-lock, live stems, an FX rack, hot cues, and hands-free automix, and the VJ workspace drives WebGL visuals from the audio, a microphone, MIDI, or a phone on the same network.
+
+
+
+Nothing is discarded. The Library keeps every piece on disk with its analysis, stems, and MIDI, and LEARN draws the lineage between pieces as a navigable 3D genealogy. A symbolic-music pipeline turns a track into sheet music, tablature, and multi-instrument arrangements, and reads a finished score back into a prompt. TRAIN fits LoRA adapters and runs autoencoder round-trips, an assistant answers from this guide, and a paste-a-URL importer pulls audio from YouTube, SoundCloud, and Bandcamp.
+
+The **Docs** button in the top bar opens this guide as a modal with a filterable table of contents, a Markdown download, and a print-to-PDF in three styles. Every workspace and the full backend API follow below.
---
@@ -310,6 +316,8 @@ A sticky bar fixed at the bottom of the MAKE workspace submits the generation jo
---
+
+
## 7. EDIT Tab
### Purpose
@@ -395,6 +403,8 @@ During the render, the COMMIT EDIT button shows an animated spinner and is disab
---
+
+
## 8. MIX Tab
### Purpose
@@ -465,6 +475,8 @@ The last processing invocations are retained in the store. Any history item can
---
+
+
## 9. DJ Tab
### Purpose
@@ -501,7 +513,7 @@ DJ MIDI-learn binds a hardware controller to deck, mixer, and hotcue actions. It
### 9.7 Automix, Sampler, and Side List
-- **Automix** sequences a setlist hands-free, beatmatching each transition.
+- **Automix** sequences a setlist hands-free, beatmatching each transition. The Library's **Suggest a Playlist** can populate this set and start it in one step through its **Send to DJ** action (see §13.9).
- **Sampler bank**: drag a clip onto a pad to load a one-shot, then trigger pads during a set.
- **Side List**: a play-next staging lane above the browser. Stage upcoming tracks, reorder them, and pull them onto a deck when ready.
@@ -515,6 +527,8 @@ A floating **Edit Layout** control turns on Design Mode. In Design Mode, panels
---
+
+
## 10. VJ Tab
### Purpose
@@ -597,6 +611,8 @@ Long-running training jobs are tracked through `GET /api/jobs/{id}`, polled at 1
---
+
+
## 12. LEARN Tab
### Purpose
@@ -631,6 +647,8 @@ theDAW carries several other rich visualizations, each documented in its own sec
---
+
+
## 13. Library
### Purpose
@@ -675,13 +693,14 @@ Toggle between a dense **List** view (one row per entry) and a **Grid** view (ti
- **Search** filters across title, prompt, model, tags, and notes at the same time.
- **FAVS** toggle restricts the view to favorited entries.
-- **Sort** by Newest (timestamp descending), Duration (longest first), or Title (alphabetical).
+- **Sort** by Newest (timestamp descending), Duration (longest first), Title (alphabetical), or Plays (most-played first).
### 13.4 Per-entry Controls
| Control | Description |
|---|---|
| **Play / Pause** | Loads and plays the entry through `playerStore`. Pausing one entry while another is active stops playback globally. |
+| **Play count** | A badge on each played entry shows how many times it has played. The first play after a load increments a per-entry tally persisted in the library database (`POST /api/library/entries/{id}/play`), so the count survives restarts and metadata edits. |
| **Favorite star** | Toggles the favorite flag; persisted to the backend immediately. |
| **Download** | Triggers a browser file download of the audio. |
| **Delete** | Removes the entry from the backend store and the in-memory store. |
@@ -733,7 +752,18 @@ Important endpoints:
A stats footer shows the total entry count, the favorites count, cumulative storage size, and cumulative playback duration.
-### 13.9 Empty State
+### 13.9 Suggest a Playlist
+
+The **SUGGEST** button opens a playlist builder that sequences analyzed tracks into a continuous set. The criteria are a target length, an optional BPM range, a flow shape (Steady, Build up, Wind down, or Wave), a harmonic toggle, and an optional genre or text filter. The backend engine (`POST /api/library/suggest-playlist`) reads each track's analysis and orders the set by harmonic key on the Camelot wheel, the chosen BPM flow, and small nudges toward popular and stylistically varied picks, filling the time budget. Each result row shows its BPM, Camelot code, and the reason it was chosen.
+
+Two actions run the result:
+
+- **Play All** loads the sequence into the footer player and auto-advances track to track, restoring the loop preference when it finishes.
+- **Send to DJ** loads the order as the active automix set, switches to the DJ tab, and starts the beatmatch-crossfade automix (see §9.7).
+
+Suggestions are strongest when the library is analyzed. Unanalyzed tracks still fill the budget but cannot be harmonically sequenced; the analyzer and the analyze-on-add toggle cover this over time.
+
+### 13.10 Empty State
Shown until the first generation. It contains a **Go generate something** button that switches the active workspace to MAKE.
@@ -795,6 +825,8 @@ These exports use the same PPQ timing constants as live playback (`PPQ = 480`, o
---
+
+
## 15. Piano Roll
### Purpose
@@ -839,6 +871,8 @@ Clips in the waveform editor whose `sourceKind` is `'piano-roll'` display an **E
---
+
+
## 16. Bottom Panel Tabs
The bottom panel is collapsible and vertically resizable (drag the grip handle above it), and a maximize toggle expands any tab to fill the window. Six tabs are available.
@@ -867,6 +901,8 @@ Text in the overlay uses `textShadow` for legibility against any visualization b
**Canvas scaling:** the canvas is sized to its container in physical pixels (device pixel ratio capped at 2×) through a `ResizeObserver`. The `style` dimensions are set in CSS pixels, so the canvas stays crisp on high-DPI displays.
+
+
### 16.2 Piano
Full piano roll interface embedded in the bottom panel. See [§15](#15-piano-roll).
@@ -913,6 +949,8 @@ A session-scoped file holding area for arbitrary audio files. Contents are clear
The SLIDE tab is the glass-capsule control surface that mirrors the VJ engine's control manifest as faders. Moving a SLIDE fader updates the matching control in the VJ, and moving a control inside the VJ updates SLIDE. A content toggle switches which control set is shown, a detach button pops SLIDE out into its own window for a second monitor, and the shared maximize toggle expands it to fill the panel.
+
+
### 16.7 Score
The Score tab renders a track's symbolic music — sheet music, guitar and bass tabs, and arrangements — from its MIDI artifacts, and exports to MusicXML, ABC, PDF, and SVG. It is documented in full in §33.
@@ -1729,6 +1767,8 @@ The result runs on Windows through WSL2 with NVIDIA, on native Linux with NVIDIA
---
+
+
## 28. Edit Tool Stack
Beyond the 24-effect MIX chain (§8), theDAW mounts the **Edit Tool Stack**, six backend module families under `/api/edit/*`. Each family provides a focused set of audio processors built on FFmpeg, NumPy, and librosa DSP. The browser GUIs come from `frontend/public/edit-modules/` and iframe into the MIX effect stage.
@@ -1761,6 +1801,8 @@ The **Catalogue** view (`CatalogueView`, lazy-loaded in the shell) is a cross-pr
---
+
+
## 30. YouTube Import
The `ytimport` module (`/api/ytimport`) imports audio from a URL into the Library.
@@ -1820,6 +1862,8 @@ theDAW turns audio into symbolic music and back: audio → MIDI → sheet music,
A track needs a MIDI first. Convert one from the Library (right-click → Convert to MIDI, §13.7); once a MIDI artifact exists, the Score buttons activate.
+
+
### 33.1 Notation artifacts
Every symbolic file a track produces is a notation artifact with a kind: `midi`, `musicxml`, `abc`, `alphatex` (tabs), `pdf`, or `svg`. The Score panel's left rail lists them; selecting one previews it (MusicXML as sheet music through OpenSheetMusicDisplay, alphaTex as tablature through alphaTab), and DOWNLOAD saves it. Artifacts are stored under `data/generations/