Ce/audit generation#1156

Open
calellowitz wants to merge 4 commits into main from
ce/audit-generation

Conversation

@calellowitz
Collaborator

@calellowitz calellowitz commented Apr 30, 2026

Technical Summary

This adds the Celery task for creating the weekly audit report, building on the base models and calculation classes from #1147. It covers only the backend; the UI changes will come in the next two PRs.

Safety Assurance

Safety story

Backend only, and the feature flag is turned off.

Automated test coverage

Tests are included for the relevant pieces

Labels & Review

  • The set of people pinged as reviewers is appropriate for the level of risk of the change

calellowitz and others added 3 commits April 30, 2026 13:04
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from ce/audit-backend to main May 5, 2026 11:39
@coderabbitai
Contributor

coderabbitai Bot commented May 5, 2026

Walkthrough

This pull request introduces a complete audit reporting system as a new Django app (commcare_connect.audit). It comprises database models (AuditReport and AuditReportEntry) to persist audit data, an extensible calculation framework with registration for defining audit metrics, service functions for generating periodic reports, and a Celery task scheduled to run weekly. The feature integrates with feature flags to control which opportunities receive audit reports and includes comprehensive test coverage via factories and integration tests.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.81% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Ce/audit generation' is vague and uses generic formatting with a branch prefix rather than clearly summarizing the main change. | Revise the title to clearly describe the main change, such as 'Add Celery task for generating weekly audit reports' or 'Implement weekly audit report generation backend'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description clearly relates to the changeset, explaining the backend implementation of a Celery task for weekly audit reports with references to related PRs, testing approach, and safety considerations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ce/audit-generation


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
commcare_connect/audit/tests/test_services.py (1)

11-17: ⚡ Quick win

isolated_registry fixture duplicates manual _REGISTRY manipulation in test_calculations.py — move to conftest.py

This fixture implements the same save/clear/restore pattern that test_calculations.py:test_register_calculation_appends_instance also does manually (lines 65–81 of that file). As per coding guidelines: "Prefer fixtures over factories to avoid duplication. Check conftest.py files (global and per-app) for existing fixtures." Moving isolated_registry to commcare_connect/audit/tests/conftest.py makes it available to both test modules and removes the duplication.

(See the companion comment on test_calculations.py for the proposed refactor.)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/tests/test_services.py` around lines 11 - 17, Move the
isolated_registry pytest fixture out of
commcare_connect/audit/tests/test_services.py into
commcare_connect/audit/tests/conftest.py so it can be reused; specifically, keep
the same save/clear/restore logic that manipulates calculations._REGISTRY and
remove the duplicate manual registry handling in test_calculations.py (the code
in test_register_calculation_appends_instance) to use the shared
isolated_registry fixture instead. Ensure the fixture name remains
isolated_registry and that tests import nothing extra so pytest auto-discovers
it from conftest.py.
commcare_connect/audit/migrations/0002_periodic_task.py (1)

30-42: 💤 Low value

Ruff RUF012 on dependencies/operations — Django migration false positive; suppress if needed

The dependencies and operations class-level lists always trigger RUF012 ("Mutable default value for class attribute") in Django migrations because they're not annotated with ClassVar. This is a well-known Ruff false positive for the standard migrations.Migration pattern. If your CI enforces this rule, add # noqa: RUF012 to each list or configure a per-path ignore for migration files.

💡 One-line suppression option
-    dependencies = [
+    dependencies: ClassVar = [  # or add  # noqa: RUF012
         ("audit", "0001_initial"),
         ("django_celery_beat", "0001_initial"),
     ]

-    operations = [
+    operations: ClassVar = [  # or add  # noqa: RUF012
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/migrations/0002_periodic_task.py` around lines 30 -
42, Ruff flags the class-level mutable lists in the Migration class
(dependencies and operations) as RUF012; suppress it by adding a per-line noqa
comment to each list (e.g., append "# noqa: RUF012" to the dependencies = [...]
and operations = [...] declarations) or alternatively add a per-path/filename
ignore for migration files in your Ruff config; target the Migration class and
its dependencies and operations symbols when applying the suppression.
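
For reference, here is a minimal sketch of the `ClassVar` annotation that the one-line diff above hints at. It uses a plain stand-in class instead of a real `migrations.Migration` so it runs without Django; the stand-in class is an assumption made only for illustration. Note that the annotation needs `from typing import ClassVar`, which the diff does not show.

```python
from typing import ClassVar


class Migration:
    """Stand-in for django.db.migrations.Migration (assumption: no Django here)."""

    # ClassVar marks these lists as intentionally shared class-level attributes,
    # which is the annotation RUF012 asks for on mutable class attributes.
    dependencies: ClassVar[list[tuple[str, str]]] = [
        ("audit", "0001_initial"),
        ("django_celery_beat", "0001_initial"),
    ]
    operations: ClassVar[list] = []
```

In a real migration the `# noqa: RUF012` comment or a per-path ignore is likely less intrusive, since it leaves the generated migration's shape untouched.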
commcare_connect/audit/tests/test_calculations.py (1)

64-81: ⚡ Quick win

Manual _REGISTRY save/clear/restore duplicates the isolated_registry fixture in test_services.py

test_register_calculation_appends_instance replicates the same pattern as isolated_registry in test_services.py (save original → clear → manipulate → restore in finally). Per project coding guidelines, prefer fixtures over ad-hoc setup. Moving isolated_registry to the app-level conftest.py would let this test use it directly and eliminate the duplication.

♻️ Proposed refactor

In commcare_connect/audit/tests/conftest.py (create or add to existing):

import pytest
from commcare_connect.audit import calculations

@pytest.fixture
def isolated_registry():
    original = list(calculations._REGISTRY)
    calculations._REGISTRY.clear()
    yield
    calculations._REGISTRY[:] = original

Then in test_calculations.py:

-def test_register_calculation_appends_instance():
-    original = list(calculations._REGISTRY)
-    try:
-        calculations._REGISTRY.clear()
-
-        @register_calculation
-        class Dummy(AuditCalculation):
-            ...
-
-        registered = calculations.get_registered_calculations()
-        assert len(registered) == 1
-        assert isinstance(registered[0], Dummy)
-    finally:
-        calculations._REGISTRY[:] = original
+def test_register_calculation_appends_instance(isolated_registry):
+    @register_calculation
+    class Dummy(AuditCalculation):
+        ...
+
+    registered = calculations.get_registered_calculations()
+    assert len(registered) == 1
+    assert isinstance(registered[0], Dummy)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/tests/test_calculations.py` around lines 64 - 81, The
test test_register_calculation_appends_instance duplicates manual
save/clear/restore of calculations._REGISTRY; create a reusable pytest fixture
named isolated_registry that saves original = list(calculations._REGISTRY),
clears calculations._REGISTRY, yields, and restores it after the test, then
change test_register_calculation_appends_instance to accept isolated_registry as
a parameter and remove the try/finally block so the fixture handles
setup/teardown for calculations._REGISTRY.
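
To make the save/clear/restore pattern concrete without pytest, the following framework-free sketch shows a registry and an isolation helper. The real `calculations` module is not shown in this review, so the bodies of `register_calculation` and `get_registered_calculations` are assumptions inferred from the test above, and `AlwaysRegistered` is a hypothetical class used only to show that pre-existing registrations survive.

```python
from contextlib import contextmanager

_REGISTRY = []  # module-level list of calculation instances (assumed shape)


class AuditCalculation:
    """Minimal stand-in for the real base class."""


def register_calculation(cls):
    """Class decorator: instantiate the class and append it to the registry."""
    _REGISTRY.append(cls())
    return cls


def get_registered_calculations():
    """Return a copy so callers cannot mutate the registry by accident."""
    return list(_REGISTRY)


@contextmanager
def isolated_registry():
    """Save, clear, and later restore the registry, like the proposed fixture."""
    original = list(_REGISTRY)
    _REGISTRY.clear()
    try:
        yield
    finally:
        # Slice assignment restores in place, so other references stay valid.
        _REGISTRY[:] = original


@register_calculation
class AlwaysRegistered(AuditCalculation):  # hypothetical pre-existing entry
    pass


with isolated_registry():

    @register_calculation
    class Dummy(AuditCalculation):
        pass

    inside = get_registered_calculations()

outside = get_registered_calculations()
```

Inside the block only `Dummy` is registered; afterwards the registry again holds only the `AlwaysRegistered` instance.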
commcare_connect/audit/tests/test_tasks.py (1)

9-9: ⚡ Quick win

AuditReportFactory side-effect import should live in conftest.py, not a test module

The # noqa: F401 (keeps factory importable) comment implies this import exists purely for its side effect (registering the factory with pytest or the factory-boy framework). Per coding guidelines, check conftest.py files for existing fixtures and prefer that location for registration-style imports so they apply across all tests in the app, not just this module.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/tests/test_tasks.py` at line 9, Move the side-effect
import of AuditReportFactory out of the test module and into the package-level
conftest.py so the factory registration applies globally; remove the line "from
commcare_connect.audit.tests.factories import AuditReportFactory  # noqa: F401"
from commcare_connect/audit/tests/test_tasks.py and add an equivalent import
statement in the app's conftest.py (or create one) to perform the import for
side-effect registration instead.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5ee5109a-58ce-40eb-a7b8-af67b454c430

📥 Commits

Reviewing files that changed from the base of the PR and between 15c5034 and c753875.

📒 Files selected for processing (19)
  • commcare_connect/audit/__init__.py
  • commcare_connect/audit/admin.py
  • commcare_connect/audit/apps.py
  • commcare_connect/audit/calculations.py
  • commcare_connect/audit/migrations/0001_initial.py
  • commcare_connect/audit/migrations/0002_periodic_task.py
  • commcare_connect/audit/migrations/__init__.py
  • commcare_connect/audit/models.py
  • commcare_connect/audit/services.py
  • commcare_connect/audit/tasks.py
  • commcare_connect/audit/tests/__init__.py
  • commcare_connect/audit/tests/factories.py
  • commcare_connect/audit/tests/test_calculations.py
  • commcare_connect/audit/tests/test_models.py
  • commcare_connect/audit/tests/test_services.py
  • commcare_connect/audit/tests/test_tasks.py
  • commcare_connect/flags/flag_names.py
  • commcare_connect/opportunity/migrations/0026_create_send_inactive_notification_periodic_task.py
  • config/settings/base.py

Comment on lines +57 to +67
    def run(self, opportunity_access, period_start, period_end) -> CalculationResult:
        value, sample_size = self.compute(opportunity_access, period_start, period_end)
        has_sufficient_data = sample_size >= self.min_sample_size
        if not has_sufficient_data:
            return CalculationResult(
                name=self.name,
                label=self.label,
                value=None,
                has_sufficient_data=False,
                in_range=True,
            )
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

_in_range called with None value when compute returns (None, n≥min_sample_size)

run() only short-circuits when sample_size < min_sample_size. If a compute() implementation legitimately returns (None, sample_size) where sample_size >= min_sample_size, the code proceeds to _in_range(None), which raises TypeError on either bound comparison.

The docstring says "value may be None when sample_size == 0", but nothing enforces it. If a future subclass's compute can return None for value with sample_size > 0, run() will raise an unhandled TypeError in production.

Consider an explicit guard:

🛡️ Proposed guard
     def run(self, opportunity_access, period_start, period_end) -> CalculationResult:
         value, sample_size = self.compute(opportunity_access, period_start, period_end)
         has_sufficient_data = sample_size >= self.min_sample_size
         if not has_sufficient_data:
             return CalculationResult(
                 name=self.name,
                 label=self.label,
                 value=None,
                 has_sufficient_data=False,
                 in_range=True,
             )
+        if value is None:
+            raise ValueError(
+                f"{type(self).__name__}.compute() returned value=None with sample_size={sample_size} "
+                f">= min_sample_size={self.min_sample_size}; value must not be None when data is sufficient."
+            )
         return CalculationResult(
             name=self.name,
             label=self.label,
             value=value,
             has_sufficient_data=True,
             in_range=self._in_range(value),
         )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/calculations.py` around lines 57 - 67, The run()
method may call _in_range(None) if compute() returns (None, sample_size) with
sample_size >= min_sample_size, causing TypeError; update run (around the
compute/opportunity_access logic) to explicitly guard for value is None after
compute() and before calling _in_range: when value is None, return a
CalculationResult (same name/label) that sets value=None,
has_sufficient_data=False (or another explicit flag per existing semantics) and
in_range=True/False as appropriate so _in_range is never invoked with None;
reference the run, compute, _in_range, min_sample_size, and CalculationResult
symbols when making the change.
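
The prompt above describes a second option besides raising `ValueError`: treat a `None` value as insufficient data. Below is a minimal, self-contained sketch of that variant. `SketchCalculation`, its bounds, and the constructor that injects the `compute()` result are hypothetical stand-ins; only the `CalculationResult` field names and the `run`/`compute`/`_in_range` method names come from the code under review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CalculationResult:
    name: str
    label: str
    value: Optional[float]
    has_sufficient_data: bool
    in_range: bool


class SketchCalculation:
    name = "sketch"
    label = "Sketch"
    min_sample_size = 5
    lower_bound = 0.0  # hypothetical bounds for illustration
    upper_bound = 1.0

    def __init__(self, value, sample_size):
        # Canned compute() output, so the example needs no database.
        self._value, self._sample_size = value, sample_size

    def compute(self, opportunity_access, period_start, period_end):
        return self._value, self._sample_size

    def _in_range(self, value):
        # Comparing a float bound with None raises TypeError in Python 3.
        return self.lower_bound <= value <= self.upper_bound

    def run(self, opportunity_access, period_start, period_end):
        value, sample_size = self.compute(opportunity_access, period_start, period_end)
        # Guard both conditions so _in_range never sees None.
        if sample_size < self.min_sample_size or value is None:
            return CalculationResult(self.name, self.label, None, False, True)
        return CalculationResult(self.name, self.label, value, True, self._in_range(value))


guarded = SketchCalculation(None, 10).run(None, None, None)
normal = SketchCalculation(0.5, 10).run(None, None, None)

try:
    SketchCalculation(None, 10)._in_range(None)
    unguarded_raises = False
except TypeError:
    unguarded_raises = True
```

Whether to return a result or raise is a design choice: returning keeps the weekly task running, while raising surfaces a buggy `compute()` sooner.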

Comment on lines +32 to +36
    report = AuditReport.objects.create(
        opportunity=opportunity,
        period_start=period_start,
        period_end=period_end,
    )
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether AuditReport has a unique constraint on (opportunity, period_start, period_end)
fd -e py '0001_initial' --search-path commcare_connect/audit/migrations | xargs cat

Repository: dimagi/commcare-connect

Length of output: 4729


🏁 Script executed:

#!/bin/bash
# Check for any migrations after 0001_initial that might modify AuditReport constraints
fd -e py 'migrations/.*\.py$' --search-path commcare_connect/audit/migrations | sort | head -20

Repository: dimagi/commcare-connect

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Examine the actual services.py file at lines 32-36 and surrounding context
sed -n '25,50p' commcare_connect/audit/services.py

Repository: dimagi/commcare-connect

Length of output: 737


🏁 Script executed:

#!/bin/bash
# List all migrations in the audit directory
ls -la commcare_connect/audit/migrations/ | grep -E '\.py$'

Repository: dimagi/commcare-connect

Length of output: 264


🏁 Script executed:

#!/bin/bash
# Check the second migration to see if it modifies AuditReport
cat commcare_connect/audit/migrations/0002_periodic_task.py

Repository: dimagi/commcare-connect

Length of output: 1256


AuditReport.objects.create() lacks idempotency guard — duplicate reports will silently accumulate if task retries or fires multiple times

The AuditReport model has no UniqueConstraint or unique_together on (opportunity, period_start, period_end). The unconditional .create() call means that if the task fires more than once for the same period (manual retrigger, a second beat instance, or worker restart with acks_late), fresh AuditReport rows are created for each opportunity without any deduplication.

Robust periodic Celery tasks must be idempotent. Use get_or_create() instead:

Proposed fix
-    report = AuditReport.objects.create(
-        opportunity=opportunity,
-        period_start=period_start,
-        period_end=period_end,
-    )
+    report, created = AuditReport.objects.get_or_create(
+        opportunity=opportunity,
+        period_start=period_start,
+        period_end=period_end,
+    )
+    if not created:
+        # Already generated for this period; skip to avoid duplicates.
+        return report
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/services.py` around lines 32 - 36, Replace the
unconditional creation of AuditReport with an idempotent lookup/create: use
AuditReport.objects.get_or_create(...) keyed on opportunity, period_start, and
period_end instead of AuditReport.objects.create(...). Update the surrounding
logic that expects a newly created instance (the variable report) to handle the
(instance, created) tuple returned by get_or_create and avoid duplicating
reports when the task retries or runs multiple times.
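
As a framework-free illustration of the idempotency argument, the sketch below uses the standard-library `sqlite3` module instead of the Django ORM. The table name, column names, and the `get_or_create_report` helper are assumptions for illustration only. It also adds a `UNIQUE` constraint on the three key columns, which the review notes the real model lacks; Django's documentation says `get_or_create()` is only atomic when the database enforces uniqueness, so adding such a constraint may be worth considering alongside the code change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE audit_report (
        id INTEGER PRIMARY KEY,
        opportunity_id INTEGER NOT NULL,
        period_start TEXT NOT NULL,
        period_end TEXT NOT NULL,
        UNIQUE (opportunity_id, period_start, period_end)
    )
    """
)


def get_or_create_report(conn, opportunity_id, period_start, period_end):
    """Return (report_id, created); repeated calls never add a second row."""
    key = (opportunity_id, period_start, period_end)
    cur = conn.execute(
        "INSERT OR IGNORE INTO audit_report "
        "(opportunity_id, period_start, period_end) VALUES (?, ?, ?)",
        key,
    )
    created = cur.rowcount == 1  # 0 when the UNIQUE constraint ignored the insert
    row = conn.execute(
        "SELECT id FROM audit_report "
        "WHERE opportunity_id = ? AND period_start = ? AND period_end = ?",
        key,
    ).fetchone()
    return row[0], created


# Simulate the task firing twice for the same opportunity and period.
first_id, first_created = get_or_create_report(conn, 1, "2026-04-27", "2026-05-03")
second_id, second_created = get_or_create_report(conn, 1, "2026-04-27", "2026-05-03")
row_count = conn.execute("SELECT COUNT(*) FROM audit_report").fetchone()[0]
```

The second call finds the existing row and reports `created` as false, which is the signal the proposed fix uses to return early.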


    audit_report = factory.SubFactory(AuditReportFactory)
    opportunity_access = factory.SubFactory(OpportunityAccessFactory)
    results = {}
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify mutable dict/list class attributes in factory files.
# Expected: this should report the current Line 25 occurrence at minimum.
rg -nP --type=py '^\s*[A-Za-z_][A-Za-z0-9_]*\s*=\s*(\{\}|\[\])\s*$' --iglob '**/tests/factories.py'

Repository: dimagi/commcare-connect

Length of output: 129


🏁 Script executed:

cat -n commcare_connect/audit/tests/factories.py | head -50

Repository: dimagi/commcare-connect

Length of output: 1010


🏁 Script executed:

# Verify there are no other similar issues in this file
wc -l commcare_connect/audit/tests/factories.py

Repository: dimagi/commcare-connect

Length of output: 112


🏁 Script executed:

# Check if there are other factory files in the tests directory to see if LazyFunction pattern is already used
rg -n 'LazyFunction' --type=py --iglob '**/tests/factories.py'

Repository: dimagi/commcare-connect

Length of output: 976


Use factory.LazyFunction(dict) instead of mutable class-scope default.

Line 25 assigns results = {} at class scope in AuditReportEntryFactory. This mutable object is shared across factory instances and can leak state between tests. Use factory.LazyFunction(dict) to create a new dict for each instance.

Suggested fix
 class AuditReportEntryFactory(DjangoModelFactory):
     class Meta:
         model = AuditReportEntry
 
     audit_report = factory.SubFactory(AuditReportFactory)
     opportunity_access = factory.SubFactory(OpportunityAccessFactory)
-    results = {}
+    results = factory.LazyFunction(dict)
     flagged = False
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 25-25: Mutable default value for class attribute

(RUF012)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commcare_connect/audit/tests/factories.py` at line 25,
AuditReportEntryFactory defines a mutable class-scope default results = {} which
is shared across instances; change the factory attribute to create a new dict
per instance by using factory.LazyFunction(dict) for the results field (update
the AuditReportEntryFactory definition to replace the class-scope results = {}
with results = factory.LazyFunction(dict) so each generated instance gets its
own dict).
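
The state leak is easy to reproduce in plain Python. The sketch below does not use factory_boy; `SharedResultsStub` and `FreshResultsStub` are hypothetical stand-ins that mimic a class-scope `results = {}` and a per-instance callable default (the role `factory.LazyFunction(dict)` plays in the suggested fix).

```python
class SharedResultsStub:
    # One dict object, created once at class definition and seen by every instance.
    results = {}


class FreshResultsStub:
    def __init__(self, results_factory=dict):
        # Calling the factory builds a new dict for each instance.
        self.results = results_factory()


shared_a, shared_b = SharedResultsStub(), SharedResultsStub()
shared_a.results["flagged_metric"] = 0.9  # mutation made in "test A"
leaked = shared_b.results                 # visible from "test B"

fresh_a, fresh_b = FreshResultsStub(), FreshResultsStub()
fresh_a.results["flagged_metric"] = 0.9
isolated = fresh_b.results
```

With the shared default the second object sees the first one's mutation; with the callable default each object starts from an empty dict.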

assert AuditReportEntry.objects.filter(audit_report=report).count() == 0


def test_period_for_monday_returns_previous_week():
Contributor

Nit: This and the following two tests look like they can all be parametrized as they all share a very similar structure.
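
A rough sketch of what the parametrized form could look like. The function under test is not shown in this excerpt, so `previous_week_period` is a hypothetical helper that assumes a Monday-to-Sunday period for the week before the given date; the real function's name, signature, and semantics may differ.

```python
from datetime import date, timedelta

import pytest


def previous_week_period(today):
    """Hypothetical helper: (Monday, Sunday) of the week before `today`."""
    this_monday = today - timedelta(days=today.weekday())
    start = this_monday - timedelta(days=7)
    return start, start + timedelta(days=6)


@pytest.mark.parametrize(
    "today",
    [
        date(2026, 5, 4),   # Monday
        date(2026, 5, 6),   # Wednesday
        date(2026, 5, 10),  # Sunday
    ],
)
def test_period_returns_previous_week(today):
    # Every day of the same week maps to the same previous-week period.
    assert previous_week_period(today) == (date(2026, 4, 27), date(2026, 5, 3))
```

Each case shows up as its own test ID in the pytest report, so a failure for one weekday does not hide the others.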


3 participants