fix(authz): schedule periodic audit-log retention cleanup#13546
fix(authz): schedule periodic audit-log retention cleanup#13546erichare wants to merge 1 commit into
Conversation
clean_authz_audit_log() was only invoked once, at startup inside initialize_services(). A long-running instance never pruned authz_audit_log again after boot, so the table grew unbounded between restarts even though the retention helper was already implemented and unit-tested. Wire the same helper to a recurring background worker: - New AuditLogCleanupWorker (services/task/audit_cleanup.py), modelled on the sibling temp_flow_cleanup.CleanupWorker: a stop-event-driven asyncio task that prunes on a fixed interval in its own session_scope(). Best-effort -- the helper logs-and-swallows DB errors and an outer guard keeps the loop alive, so it never blocks the event loop or request path. - Gated: the worker only schedules when AUTHZ_AUDIT_ENABLED is True and AUTHZ_AUDIT_RETENTION_DAYS > 0; otherwise start() is a no-op. Retention=0 stays a no-op end to end. - Sleep-first loop: the unconditional startup sweep still prunes at boot, so the first scheduled pass waits one interval to avoid a redundant immediate delete. The startup sweep is left unchanged. - New AUTHZ_AUDIT_CLEANUP_INTERVAL setting (default 86400 = daily, floor 300s). - Started/stopped from the application lifespan, both best-effort. Tests (test_audit_cleanup_worker.py) show cleanup runs repeatedly on the recurring schedule (not just at startup), is a no-op when retention or auditing is disabled, survives sweep failures, and -- end to end against a real SQLite engine -- prunes a row inserted after startup while leaving in-window rows. Closes the audit-retention-scheduling gap (C) for OSS RBAC on release-1.10.0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughThis PR adds a periodic background worker that automatically prunes retention-expired rows from the ChangesAudit Log Retention Cleanup Worker
Sequence DiagramsequenceDiagram
participant Startup
participant AuditLogCleanupWorker
participant Database
participant clean_authz_audit_log as clean_authz_audit_log service
Startup->>AuditLogCleanupWorker: start()
activate AuditLogCleanupWorker
Note over AuditLogCleanupWorker: Verify audit enabled & retention > 0
AuditLogCleanupWorker->>AuditLogCleanupWorker: Create async task for _run()
Note over AuditLogCleanupWorker: Loop: sleep_or_stop(interval)
AuditLogCleanupWorker->>Database: session_scope()
activate Database
Database->>clean_authz_audit_log: clean_authz_audit_log()
activate clean_authz_audit_log
clean_authz_audit_log->>Database: DELETE stale rows
clean_authz_audit_log-->>Database: return pruned count
deactivate clean_authz_audit_log
Database-->>AuditLogCleanupWorker: session committed/rolled back
deactivate Database
Note over AuditLogCleanupWorker: Continue loop or break on stop event
Startup->>AuditLogCleanupWorker: stop()
AuditLogCleanupWorker->>AuditLogCleanupWorker: Signal stop event, await task
deactivate AuditLogCleanupWorker
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 8 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (8 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Test Coverage AdvisorNo source changes detected without accompanying tests. Thanks for keeping coverage up! 🎉
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release-1.10.0 #13546 +/- ##
==================================================
+ Coverage 58.33% 58.49% +0.16%
==================================================
Files 2290 2290
Lines 219855 219177 -678
Branches 32361 31136 -1225
==================================================
- Hits 128245 128206 -39
+ Misses 90151 89513 -638
+ Partials 1459 1458 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Problem
clean_authz_audit_log()(services/utils.py) is implemented and unit-tested, and bulk-deletesauthz_audit_logrows older thanAUTHZ_AUDIT_RETENTION_DAYS(default 90;0disables). But it was invoked exactly once, at startup insideinitialize_services(). A long-running instance never pruned again after boot, so the table grew unbounded between restarts — even though the retention field's own docstring promised rows are "deleted on startup and by the periodic cleanup task." The periodic task did not exist; this PR adds it. (Closes gap C of the OSS RBAC rollout onrelease-1.10.0.)Change
Wire the existing helper to a recurring background worker, modelled on the sibling
temp_flow_cleanup.CleanupWorker:AuditLogCleanupWorker— new moduleservices/task/audit_cleanup.py. A stop-event-drivenasynciotask that prunes on a fixed interval, each sweep opening its ownsession_scope(). Best-effort: the helper already logs-and-swallows DB errors, and an outer guard keeps the loop alive across transient outages, so it never blocks the event loop or the request path.AUTHZ_AUDIT_ENABLEDisTrueandAUTHZ_AUDIT_RETENTION_DAYS > 0. Otherwisestart()is a quiet no-op (no task created).retention=0therefore stays a no-op end to end.AUTHZ_AUDIT_CLEANUP_INTERVAL(AuthSettings), default86400(daily), floor300s.Why a worker (not a setting-only change)
The retention logic already existed and was correct — the only missing piece was recurrence. Reusing the helper verbatim (rather than reimplementing the delete) keeps the bounded-table guarantee in one place, and the
CleanupWorkershape is already the established pattern in this package for periodic DB cleanup.Tests
New
test_audit_cleanup_worker.py(7 tests):test_worker_runs_cleanup_repeatedly_on_schedule— the acceptance test: the helper is invoked ≥2× on a tiny interval, proving recurrence (not just a startup call).test_worker_is_noop_when_retention_disabled—AUTHZ_AUDIT_RETENTION_DAYS=0⇒ no task scheduled, helper never called.test_worker_is_noop_when_audit_disabled—AUTHZ_AUDIT_ENABLED=False⇒ no-op.test_worker_prunes_old_rows_on_schedule— end-to-end against a real SQLite engine: a row inserted after startup is pruned by the scheduled worker while an in-window row survives.Backend
ruff check/ruff formatclean; all pre-commit hooks (incl. detect-secrets) pass.Notes / non-goals
temp_flow_cleanup, each uvicorn worker runs its own sweep. The delete is a single idempotentWHERE timestamp < cutoff, so concurrent sweeps after the first are no-ops; a distributed lock is intentionally out of scope (EE territory).start()(the env-configured common case). Runtime-togglingAUTHZ_AUDIT_ENABLEDtoFalseleaves a harmless pruner running until restart; the helper's internalretention<=0check is the per-tick defense for retention.Summary by CodeRabbit