fix(authz): schedule periodic audit-log retention cleanup by erichare · Pull Request #13546 · langflow-ai/langflow

erichare · 2026-06-08T20:55:14Z

Problem

clean_authz_audit_log() (services/utils.py) is implemented and unit-tested, and bulk-deletes authz_audit_log rows older than AUTHZ_AUDIT_RETENTION_DAYS (default 90; 0 disables). But it was invoked exactly once, at startup inside initialize_services(). A long-running instance never pruned again after boot, so the table grew unbounded between restarts — even though the retention field's own docstring promised rows are "deleted on startup and by the periodic cleanup task." The periodic task did not exist; this PR adds it. (Closes gap C of the OSS RBAC rollout on release-1.10.0.)

Change

Wire the existing helper to a recurring background worker, modelled on the sibling temp_flow_cleanup.CleanupWorker:

- AuditLogCleanupWorker: new module services/task/audit_cleanup.py. A stop-event-driven asyncio task that prunes on a fixed interval, each sweep opening its own session_scope(). Best-effort: the helper already logs-and-swallows DB errors, and an outer guard keeps the loop alive across transient outages, so it never blocks the event loop or the request path.
- Gating: the worker only schedules when AUTHZ_AUDIT_ENABLED is True and AUTHZ_AUDIT_RETENTION_DAYS > 0. Otherwise start() is a quiet no-op (no task created). retention=0 therefore stays a no-op end to end.
- Sleep-first loop: the unconditional startup sweep still prunes at boot, so the first scheduled pass waits one interval to avoid a redundant immediate delete. The startup sweep is left unchanged.
- New setting: AUTHZ_AUDIT_CLEANUP_INTERVAL (AuthSettings), default 86400 (daily), floor 300s.
- Lifespan wiring: started right after the telemetry writer and stopped in the shutdown "Cancelling Background Tasks" step, both wrapped best-effort so cleanup scheduling can never block startup/shutdown.
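
For readers who want the shape of the worker at a glance, the gating, sleep-first loop, and outer guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in services/task/audit_cleanup.py: the class name PeriodicCleanupWorker, its constructor parameters, and the injected sweep callable are hypothetical stand-ins for the real worker, which calls clean_authz_audit_log() inside session_scope() and reads its gates from settings.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable
from typing import Optional

logger = logging.getLogger(__name__)


class PeriodicCleanupWorker:
    """Hypothetical stand-in for AuditLogCleanupWorker (names are assumptions)."""

    def __init__(
        self,
        sweep: Callable[[], Awaitable[int]],
        interval: float,
        *,
        audit_enabled: bool,
        retention_days: int,
    ) -> None:
        self._sweep = sweep
        self._interval = interval
        self._audit_enabled = audit_enabled
        self._retention_days = retention_days
        self._stop_event = asyncio.Event()
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # Gating: a quiet no-op unless auditing is on and retention is positive.
        if not self._audit_enabled or self._retention_days <= 0:
            return
        if self._task is not None:
            return  # already running, so start() is idempotent
        self._stop_event.clear()
        self._task = asyncio.create_task(self._run())

    async def stop(self) -> None:
        if self._task is None:
            return  # never started, or already stopped
        self._stop_event.set()
        await self._task
        self._task = None

    async def _run(self) -> None:
        while True:
            # Sleep first: the startup sweep has already pruned at boot.
            try:
                await asyncio.wait_for(self._stop_event.wait(), timeout=self._interval)
            except asyncio.TimeoutError:
                pass  # interval elapsed without a stop request
            else:
                return  # stop requested while waiting
            # Outer guard: a failing sweep must not end the loop.
            try:
                await self._sweep()
            except Exception:
                logger.warning("audit cleanup sweep failed; retrying next interval", exc_info=True)
```

Waiting on the stop event with a timeout, rather than calling asyncio.sleep(), lets stop() return promptly instead of waiting out a full interval.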

Why a worker (not a setting-only change)

The retention logic already existed and was correct — the only missing piece was recurrence. Reusing the helper verbatim (rather than reimplementing the delete) keeps the bounded-table guarantee in one place, and the CleanupWorker shape is already the established pattern in this package for periodic DB cleanup.

Tests

New test_audit_cleanup_worker.py (7 tests):

- test_worker_runs_cleanup_repeatedly_on_schedule: the acceptance test. The helper is invoked ≥2× on a tiny interval, proving recurrence (not just a startup call).
- test_worker_is_noop_when_retention_disabled: AUTHZ_AUDIT_RETENTION_DAYS=0 ⇒ no task scheduled, helper never called.
- test_worker_is_noop_when_audit_disabled: AUTHZ_AUDIT_ENABLED=False ⇒ no-op.
- test_worker_prunes_old_rows_on_schedule: end-to-end against a real SQLite engine. A row inserted after startup is pruned by the scheduled worker while an in-window row survives.
- Plus start/stop idempotency, interval resolution (override vs setting vs default), and resilience to a failing sweep.
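
The interval resolution order mentioned in the last item (override, then setting, then default) might look something like the sketch below. The function name resolve_interval and both constants are hypothetical. Two behaviours here are assumptions rather than facts from the PR: that an explicit override bypasses the 300s floor (which would explain how the tests run on a tiny interval), and that a too-small setting is clamped rather than rejected at validation time.

```python
from typing import Optional

DEFAULT_INTERVAL_SECONDS = 86400  # daily, per the PR description
MIN_INTERVAL_SECONDS = 300        # floor, per the PR description


def resolve_interval(override: Optional[float], setting: Optional[float]) -> float:
    """Pick the sweep interval: explicit override, else setting, else default."""
    if override is not None:
        # Assumption: an explicit override (e.g. from a test) is used as-is.
        return override
    if setting is not None:
        # Assumption: clamp to the floor; the real setting may reject instead.
        return max(setting, MIN_INTERVAL_SECONDS)
    return DEFAULT_INTERVAL_SECONDS
```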

26 passed   # new file + existing audit/retention/authorization-service suites

Backend ruff check/ruff format clean; all pre-commit hooks (incl. detect-secrets) pass.

Notes / non-goals

- Like the existing startup sweep and temp_flow_cleanup, each uvicorn worker runs its own sweep. The delete is a single idempotent WHERE timestamp < cutoff, so concurrent sweeps after the first are no-ops; a distributed lock is intentionally out of scope (EE territory).
- Gates are read at start() (the env-configured common case). Runtime-toggling AUTHZ_AUDIT_ENABLED to False leaves a harmless pruner running until restart; the helper's internal retention<=0 check is the per-tick defense for retention.
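
To illustrate why repeated sweeps are harmless, here is a small self-contained sketch using the standard-library sqlite3 module. It is not clean_authz_audit_log(): the function name prune, the two-column schema, and the choice to store timestamps as epoch seconds are assumptions made only for this example. Only the table name, the `timestamp < cutoff` predicate, and the retention<=0 short-circuit come from the description above.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def prune(conn: sqlite3.Connection, retention_days: int, now: datetime) -> int:
    """Delete rows older than the retention window; return how many were removed."""
    if retention_days <= 0:
        return 0  # per-call defense: retention disabled means nothing is deleted
    cutoff = (now - timedelta(days=retention_days)).timestamp()
    cur = conn.execute("DELETE FROM authz_audit_log WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def make_demo_db(now: datetime) -> sqlite3.Connection:
    """Build an in-memory table with one expired row and one in-window row."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the real table has more columns.
    conn.execute("CREATE TABLE authz_audit_log (id INTEGER PRIMARY KEY, timestamp REAL)")
    conn.executemany(
        "INSERT INTO authz_audit_log (timestamp) VALUES (?)",
        [
            ((now - timedelta(days=100)).timestamp(),),  # expired under 90-day retention
            ((now - timedelta(days=1)).timestamp(),),    # still inside the window
        ],
    )
    conn.commit()
    return conn
```

Calling prune() twice in a row with the same arguments removes the expired row on the first call and nothing on the second, which is the property that makes one sweep per uvicorn worker safe without a lock.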

Summary by CodeRabbit

New Features
- Automatic periodic cleanup of authorization audit logs to prevent indefinite database accumulation.
- Background worker runs daily by default to remove expired audit entries.
- Cleanup respects audit logging configuration and only activates when auditing and retention are enabled.

clean_authz_audit_log() was only invoked once, at startup inside initialize_services(). A long-running instance never pruned authz_audit_log again after boot, so the table grew unbounded between restarts even though the retention helper was already implemented and unit-tested.

Wire the same helper to a recurring background worker:

- New AuditLogCleanupWorker (services/task/audit_cleanup.py), modelled on the sibling temp_flow_cleanup.CleanupWorker: a stop-event-driven asyncio task that prunes on a fixed interval in its own session_scope(). Best-effort -- the helper logs-and-swallows DB errors and an outer guard keeps the loop alive, so it never blocks the event loop or request path.
- Gated: the worker only schedules when AUTHZ_AUDIT_ENABLED is True and AUTHZ_AUDIT_RETENTION_DAYS > 0; otherwise start() is a no-op. Retention=0 stays a no-op end to end.
- Sleep-first loop: the unconditional startup sweep still prunes at boot, so the first scheduled pass waits one interval to avoid a redundant immediate delete. The startup sweep is left unchanged.
- New AUTHZ_AUDIT_CLEANUP_INTERVAL setting (default 86400 = daily, floor 300s).
- Started/stopped from the application lifespan, both best-effort.

Tests (test_audit_cleanup_worker.py) show cleanup runs repeatedly on the recurring schedule (not just at startup), is a no-op when retention or auditing is disabled, survives sweep failures, and -- end to end against a real SQLite engine -- prunes a row inserted after startup while leaving in-window rows.

Closes the audit-retention-scheduling gap (C) for OSS RBAC on release-1.10.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-08T20:55:38Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8d04f2c6-74f2-4a2a-817e-76c4ae1ad1bd

📥 Commits

Reviewing files that changed from the base of the PR and between 6610091 and 95bc647.

📒 Files selected for processing (4)

src/backend/base/langflow/main.py
src/backend/base/langflow/services/task/audit_cleanup.py
src/backend/tests/unit/services/authorization/test_audit_cleanup_worker.py
src/lfx/src/lfx/services/settings/auth.py

Walkthrough

This PR adds a periodic background worker that automatically prunes retention-expired rows from the authz_audit_log table. The worker is configurable via settings, conditionally activated based on audit logging configuration, integrated into the FastAPI application lifespan, and covered by comprehensive unit and integration tests.

Changes

Audit Log Retention Cleanup Worker

Layer / File(s)	Summary
Configuration Settings `src/lfx/src/lfx/services/settings/auth.py`	Adds `AUTHZ_AUDIT_CLEANUP_INTERVAL` setting with a default of 86400 seconds and a minimum of 300 seconds, governing the frequency of retention cleanup sweeps.
Worker Implementation `src/backend/base/langflow/services/task/audit_cleanup.py`	Implements `AuditLogCleanupWorker` class with async lifecycle (start/stop), periodic loop that sleeps between intervals or waits for stop signal, session-scoped sweep execution via `clean_authz_audit_log`, and exception tolerance to keep the worker alive across failures.
Application Lifespan Wiring `src/backend/base/langflow/main.py`	Integrates the worker into FastAPI lifespan: starts the worker after telemetry initialization and stops it after MCP manager shutdown, both with best-effort error handling and appropriate logging.
Test Coverage `src/backend/tests/unit/services/authorization/test_audit_cleanup_worker.py`	Comprehensive test suite covering recurring scheduling semantics, conditional activation based on audit settings, lifecycle idempotency, exception resilience, interval resolution precedence, and end-to-end integration test with in-memory SQLite database confirming pruning behavior.
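
The best-effort start/stop wiring summarized in the table can be sketched with a plain async context manager. This is only an approximation of the idea: the name best_effort_worker is hypothetical, the worker is assumed to expose a synchronous start() and an awaitable stop(), and the actual change lives inside the existing FastAPI lifespan in main.py rather than in a separate helper.

```python
import asyncio
import logging
from contextlib import asynccontextmanager

logger = logging.getLogger(__name__)


@asynccontextmanager
async def best_effort_worker(worker):
    """Start a background worker on entry and stop it on exit, never raising."""
    try:
        worker.start()
    except Exception:
        # Scheduling cleanup is optional; startup continues without it.
        logger.warning("could not start audit cleanup worker", exc_info=True)
    try:
        yield
    finally:
        try:
            await worker.stop()
        except Exception:
            # Shutdown must not be blocked by a misbehaving worker.
            logger.warning("could not stop audit cleanup worker", exc_info=True)
```

Used as `async with best_effort_worker(worker): ...`, the body still runs and the block still exits cleanly even if both start() and stop() raise.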

Sequence Diagram

sequenceDiagram
  participant Startup
  participant AuditLogCleanupWorker
  participant Database
  participant clean_authz_audit_log as clean_authz_audit_log service
  Startup->>AuditLogCleanupWorker: start()
  activate AuditLogCleanupWorker
  Note over AuditLogCleanupWorker: Verify audit enabled & retention > 0
  AuditLogCleanupWorker->>AuditLogCleanupWorker: Create async task for _run()
  Note over AuditLogCleanupWorker: Loop: sleep_or_stop(interval)
  AuditLogCleanupWorker->>Database: session_scope()
  activate Database
  Database->>clean_authz_audit_log: clean_authz_audit_log()
  activate clean_authz_audit_log
  clean_authz_audit_log->>Database: DELETE stale rows
  clean_authz_audit_log-->>Database: return pruned count
  deactivate clean_authz_audit_log
  Database-->>AuditLogCleanupWorker: session committed/rolled back
  deactivate Database
  Note over AuditLogCleanupWorker: Continue loop or break on stop event
  Startup->>AuditLogCleanupWorker: stop()
  AuditLogCleanupWorker->>AuditLogCleanupWorker: Signal stop event, await task
  deactivate AuditLogCleanupWorker

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

langflow-ai/langflow#13126: Introduces telemetry writer background service integration pattern into FastAPI lifespan, establishing the precedent for best-effort worker startup/shutdown used by this audit cleanup worker PR.

Suggested labels

fix-index

Suggested reviewers

dkaushik94
jordanrfrazier

🚥 Pre-merge checks | ✅ 8 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 69.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (8 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely describes the main change: scheduling periodic audit-log retention cleanup via a background worker, which directly addresses the core problem solved by this PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Test Coverage For New Implementations	✅ Passed	Test file test_audit_cleanup_worker.py (241 lines, 7 tests) covers new AuditLogCleanupWorker with unit, integration, and edge cases following test_*.py convention with meaningful assertions.
Test Quality And Coverage	✅ Passed	7 comprehensive tests cover recurring cleanup execution, async patterns, gating conditions, lifecycle management, error resilience, and E2E database behavior following pytest backend conventions.
Test File Naming And Structure	✅ Passed	Correct test_*.py naming, pytest async structure, 7 test functions with docstrings, fixtures, helpers, clear organization, positive/negative scenarios, edge cases, and integration test.
Excessive Mock Usage Warning	✅ Passed	Mock usage is appropriate: minimal mocks (0.9 per test) target external dependencies only. Core logic tested unmocked with real asyncio objects. Includes integration test with real database.

github-actions · 2026-06-08T20:56:32Z

✅ Test Coverage Advisor

No source changes detected without accompanying tests. Thanks for keeping coverage up! 🎉

Advisory check only — never blocks merge.

codecov · 2026-06-08T21:04:44Z

Codecov Report

❌ Patch coverage is 90.12346% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.49%. Comparing base (6610091) to head (95bc647).
⚠️ Report is 3 commits behind head on release-1.10.0.

Files with missing lines	Patch %	Lines
src/backend/base/langflow/main.py	60.00%	4 Missing ⚠️
...ckend/base/langflow/services/task/audit_cleanup.py	94.28%	4 Missing ⚠️

Additional details and impacted files

@@                Coverage Diff                 @@
##           release-1.10.0   #13546      +/-   ##
==================================================
+ Coverage           58.33%   58.49%   +0.16%     
==================================================
  Files                2290     2290              
  Lines              219855   219177     -678     
  Branches            32361    31136    -1225     
==================================================
- Hits               128245   128206      -39     
+ Misses              90151    89513     -638     
+ Partials             1459     1458       -1

Flag	Coverage Δ
backend	`65.29% <90.00%> (+0.16%)`	⬆️
frontend	`57.81% <ø> (+0.19%)`	⬆️
lfx	`54.27% <100.00%> (+<0.01%)`	⬆️

Files with missing lines	Coverage Δ
src/lfx/src/lfx/services/settings/auth.py	`60.54% <100.00%> (+0.27%)`	⬆️
src/backend/base/langflow/main.py	`64.09% <60.00%> (+2.08%)`	⬆️
...ckend/base/langflow/services/task/audit_cleanup.py	`94.28% <94.28%> (ø)`

... and 245 files with indirect coverage changes

github-actions · 2026-06-08T21:08:26Z

Frontend Unit Test Coverage Report

Coverage Summary

Lines	Statements	Branches	Functions
	43.28% (57621/133123)	69.22% (7829/11310)	41.49% (1291/3111)

Unit Test Results

Tests	Skipped	Failures	Errors	Time
4940	0 💤	0 ❌	0 🔥	11m 52s ⏱️

github-actions Bot added the bug Something isn't working label Jun 8, 2026

github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Jun 8, 2026

erichare wants to merge 1 commit into release-1.10.0 from fix/schedule-authz-audit-retention-cleanup
