Reported by @ducknoodledance in #63 (with a working community
workaround script).
Problem
Every entity uses soft-delete (deleted_at) and the codebase intentionally
"never hard-deletes in application code" (CLAUDE.md). But there is no
scheduled job that ever reclaims the storage, so:
- Soft-deleted rows pile up in Postgres indefinitely.
- The underlying S3 / MinIO objects (s3_key_raw, s3_key_processed,
  s3_key_thumbnail) are never removed, even after the parent asset,
  version, folder, or project is soft-deleted.
- Orphans created by interrupted uploads / failed transcodes are not
swept.
- ShareLink.expires_at is defined but never enforced or cleaned up.
For a long-lived self-hosted install this means the bucket and DB
grow unbounded.
Evidence
- All delete endpoints only flip deleted_at, e.g.
  apps/api/routers/assets.py:197, apps/api/routers/projects.py,
  apps/api/routers/folders.py, apps/api/routers/share.py:440.
- The only place that actually calls s3.delete_object is comment
  attachments in apps/api/routers/comments.py.
- The Celery beat schedule in apps/api/tasks/celery_app.py:59-64 only
  runs send_due_date_reminders; no cleanup task is registered.
- MediaFile has no deleted_at; it's only reachable through the
  parent version, so once a version is soft-deleted those S3 keys
  become invisible to the app but stay in the bucket.
Proposed fix
- Add a Celery beat task (cleanup_soft_deleted, e.g. daily) that:
  - hard-deletes rows soft-deleted more than N days ago (configurable
    retention window, default e.g. 30 days)
  - deletes their S3 objects (raw, processed/HLS prefix, thumbnail)
  - cascades through versions, media files, comments, annotations,
    approvals, share links
- Add an orphan sweeper that lists S3 prefixes and removes keys with
no matching DB row.
- Honor ShareLink.expires_at (reject in the API and clean up
  expired rows).
- Document the retention window + a manual "purge now" admin endpoint.
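
The selection rules behind the items above can be sketched in plain Python, independent of Celery, the ORM, or the S3 client. This is only an illustration under assumptions: MediaRow is a hypothetical flattened view of a MediaFile joined to its parent version, and the function names (is_past_retention, keys_to_purge, orphan_keys, share_link_expired) are made up for this sketch and do not exist in the codebase. Only the s3_key_* fields, deleted_at, and expires_at come from the issue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class MediaRow:
    """Hypothetical view: a MediaFile plus its parent version's deleted_at."""
    id: int
    parent_deleted_at: Optional[datetime]
    s3_key_raw: Optional[str]
    s3_key_processed: Optional[str]  # may be a prefix (e.g. an HLS directory)
    s3_key_thumbnail: Optional[str]


def is_past_retention(deleted_at: Optional[datetime], now: datetime,
                      retention_days: int = 30) -> bool:
    """True when a row was soft-deleted more than retention_days ago."""
    if deleted_at is None:
        return False  # still live, never purge
    return deleted_at < now - timedelta(days=retention_days)


def keys_to_purge(rows: Iterable[MediaRow], now: datetime,
                  retention_days: int = 30) -> Set[str]:
    """Collect the S3 keys of rows whose parent is past the retention window."""
    keys: Set[str] = set()
    for row in rows:
        if is_past_retention(row.parent_deleted_at, now, retention_days):
            for key in (row.s3_key_raw, row.s3_key_processed, row.s3_key_thumbnail):
                if key:
                    keys.add(key)
    return keys


def orphan_keys(bucket_keys: Iterable[str], referenced_keys: Iterable[str],
                referenced_prefixes: Iterable[str] = ()) -> Set[str]:
    """Bucket keys that no DB row references, directly or via a known prefix."""
    referenced = set(referenced_keys)
    prefixes = tuple(referenced_prefixes)
    return {k for k in bucket_keys
            if k not in referenced and not k.startswith(prefixes)}


def share_link_expired(expires_at: Optional[datetime], now: datetime) -> bool:
    """A link without expires_at never expires; otherwise compare against now."""
    return expires_at is not None and expires_at <= now


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    rows = [
        MediaRow(1, now - timedelta(days=45), "raw/1.mov", "hls/1/", "thumb/1.jpg"),
        MediaRow(2, now - timedelta(days=5), "raw/2.mov", None, None),
    ]
    print(sorted(keys_to_purge(rows, now)))
```

In a real task, the keys returned by keys_to_purge would be passed to the S3 client before the rows are hard-deleted, so a failed object delete can be retried on the next run. An orphan sweeper would probably also want a minimum object age so that uploads still in progress are not removed; that threshold is left out here.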