Experiment: Clone DB to isolate tests by PeterNerlich · Pull Request #3872 · digitalfabrik/integreat-cms

PeterNerlich · 2025-09-09T08:49:09Z

Short description

This is an ongoing experiment to rein in the growing chaos in our tests

Proposed changes

Implement a function snapshot_db() that clones the current database, switches Django's connection over to the clone, and reverts the connection afterwards
Provide a session-scoped fixture that snapshots an empty database (all other fixtures depend on it one way or another, so we can be sure the snapshot really is empty)
Provide a session-scoped fixture that makes a snapshot and fills it with the test data
Provide a function-scoped fixture that just makes a snapshot. Every test needing the db should use this one; it replaces the load_test_data fixture for tests
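
The layering described above can be illustrated outside Django with the standard library alone. This is only a sketch, not the PR's snapshot_db(): it assumes a plain SQLite database, and clone_db() and the region table are hypothetical names made up for the example. It shows that writes to a per-test clone leave the shared base snapshot untouched.

```python
import sqlite3


def clone_db(source: sqlite3.Connection) -> sqlite3.Connection:
    """Hypothetical helper: copy `source` into a fresh in-memory database."""
    target = sqlite3.connect(":memory:")
    source.backup(target)  # copies schema and committed rows
    return target


# "Session-scoped" state: schema plus shared test data, built once.
base = sqlite3.connect(":memory:")
base.execute("CREATE TABLE region (name TEXT)")
base.execute("INSERT INTO region VALUES ('Augsburg')")
base.commit()

# "Function-scoped" state: each test works on its own throwaway clone.
test_db = clone_db(base)
test_db.execute("INSERT INTO region VALUES ('Testumgebung')")
test_db.commit()

rows_in_clone = test_db.execute("SELECT count(*) FROM region").fetchone()[0]
rows_in_base = base.execute("SELECT count(*) FROM region").fetchone()[0]
test_db.close()
```

The clone sees both rows while the base still holds only the row it was created with, which is the isolation property the function-scoped fixture is after.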

This experiment is still Work In Progress.

Side effects

This should mitigate side effects between tests

Faithfulness to issue description and design

There are no intended deviations from the issue and design.

Resolved issues

Fixes: #3777

dkehne · 2026-01-26T21:40:22Z

Analysis & Suggestions for Database Cloning Approach

This is a great approach! Database cloning for test isolation is being discussed upstream in Django as a faster alternative to serialized_rollback. Here's my analysis:

Issues Found

1. Missing Imports

The snapshot_db() function uses sqlite3 and urllib.parse but they're not imported:

import sqlite3
import urllib.parse

2. Settings Dict Reference Issue

prev_settings[db_name] = conn.settings_dict  # This is a reference, not a copy!

When conn.settings_dict is modified later (e.g., conn.settings_dict["NAME"] = sandbox_uri), prev_settings will point to the modified dict. Use:

prev_settings[db_name] = conn.settings_dict.copy()
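
A small standalone demonstration of the difference, using a made-up settings dict (the keys and values here are illustrative, not taken from the PR). Note that .copy() is shallow: if nested values such as an OPTIONS dict were ever mutated in place, copy.deepcopy() would be the safer choice. Whether the PR mutates nested values is an assumption.

```python
import copy

# Illustrative stand-in for conn.settings_dict
settings_dict = {"NAME": "test_db", "OPTIONS": {"uri": False}}

alias = settings_dict                # same object, not a backup
shallow = settings_dict.copy()       # new top-level dict, nested values shared
deep = copy.deepcopy(settings_dict)  # fully independent

# Later modifications, as they might happen during the snapshot setup
settings_dict["NAME"] = "file:sandbox?mode=memory&cache=shared"
settings_dict["OPTIONS"]["uri"] = True

print(alias["NAME"])              # follows the modification
print(shallow["NAME"])            # keeps the old name
print(shallow["OPTIONS"]["uri"])  # nested dict is still shared
print(deep["OPTIONS"]["uri"])     # untouched
```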

3. Potential UnboundLocalError

test_database_name is assigned inside the loop but used in cleanup. If databases is empty, it is never reassigned and the cleanup would receive None:

test_database_name = None  # Already there, good

# But in cleanup:
conn.creation.destroy_test_db(old_database_name=test_database_name)  # Could be None

Consider storing per-database:

prev_db_names = {}
# ...
prev_db_names[db_name] = conn.settings_dict["NAME"]
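
To make the difference concrete, here is a toy version of the two bookkeeping styles. The aliases and database names are hypothetical, and current_names merely stands in for reading conn.settings_dict["NAME"].

```python
databases = ("default", "replica")  # hypothetical aliases
# Stand-in for conn.settings_dict["NAME"] of each connection
current_names = {"default": "test_cms", "replica": "test_cms_replica"}

# Single variable: overwritten on every iteration, stays None for an empty tuple.
test_database_name = None
for db_name in databases:
    test_database_name = current_names[db_name]

# Per-alias mapping: each database keeps its own previous name.
prev_db_names = {}
for db_name in databases:
    prev_db_names[db_name] = current_names[db_name]
```

In this toy setup the single variable ends up holding only the name assigned last, while the mapping can restore every database individually.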

4. SQLite In-Memory Reconnection Order

conn.close()
conn.connect()  # reconnect before closing so we don't lose the db
target.close()

The comment says "reconnect before closing" but conn.close() is called first. This might work but the logic is confusing. Consider:

# Keep target connection open until Django reconnects
conn.settings_dict["NAME"] = sandbox_uri
conn.close()
conn.connect()  # Now connected to the clone
# Safe to close the backup connections
source.close()
target.close()
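
The reason the order matters can be reproduced with the standard library. A named shared-cache in-memory SQLite database only lives as long as at least one connection to it is open. The sketch below uses a made-up URI and table, and django_like is a plain sqlite3 connection standing in for Django's reconnect; it is not the PR's code.

```python
import sqlite3

# Hypothetical URI of a named, shared-cache in-memory database
SANDBOX_URI = "file:sandbox_demo?mode=memory&cache=shared"

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE page (title TEXT)")
source.execute("INSERT INTO page VALUES ('Welcome')")
source.commit()

# Clone into the shared in-memory database.
target = sqlite3.connect(SANDBOX_URI, uri=True)
source.backup(target)

# A second connection opens before the backup connections are closed,
# so the cloned database stays alive.
django_like = sqlite3.connect(SANDBOX_URI, uri=True)
source.close()
target.close()
titles = django_like.execute("SELECT title FROM page").fetchall()

# Once the last connection is closed, the in-memory database is discarded.
django_like.close()
reopened = sqlite3.connect(SANDBOX_URI, uri=True)
tables_after = reopened.execute("SELECT name FROM sqlite_master").fetchall()
reopened.close()
```

As long as some connection to the URI stays open across the handover, the data survives; closing the last one and reconnecting yields an empty database.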

Architectural Suggestions

1. Consider Using `@contextmanager` Decorator

Instead of manually calling contextmanager(snapshot_db), define it as one:

from contextlib import contextmanager

@contextmanager
def snapshot_db(django_db_blocker, suffix="snap", databases=(DEFAULT_DB_ALIAS,)):
    # ... setup ...
    try:
        yield
    finally:
        # ... cleanup ...
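
A runnable toy version of this pattern, with a hypothetical snapshot() that only records what it does instead of touching a database. It shows that the finally branch runs even when the body of the with block raises.

```python
from contextlib import contextmanager

events = []


@contextmanager
def snapshot(name):
    """Toy stand-in for snapshot_db(): records setup and teardown."""
    events.append(f"clone {name}")
    try:
        yield name
    finally:
        # Runs even if the body of the `with` block raises.
        events.append(f"restore {name}")


try:
    with snapshot("snap"):
        events.append("test body")
        raise RuntimeError("test failed")
except RuntimeError:
    pass
```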

2. Fixture Dependency Simplification

The current pattern requires tests to use both test_data_db_snapshot AND db_snapshot:

def test_foo(test_data_db_snapshot: None, db_snapshot: None):

Consider a single fixture that handles the hierarchy:

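@pytest.fixture(scope="function")
def isolated_test_db(test_data_db_snapshot, django_db_blocker):
    """Provides an isolated database with test data for each test."""
    # Assumes snapshot_db is decorated with @contextmanager (suggestion 1).
    # If it stays a plain generator, use `yield from snapshot_db(...)` instead.
    with snapshot_db(django_db_blocker, suffix="test"):
        yield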

Then tests just need:

def test_foo(isolated_test_db):

3. Add Error Handling

Wrap cleanup in try/finally to ensure database is restored even if test crashes:

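try:
    yield
finally:
    with django_db_blocker.unblock():
        for db_name in databases:
            # Look up the connection first so it is always bound in `finally`
            conn = connections[db_name]
            try:
                conn.creation.destroy_test_db(old_database_name=prev_db_names[db_name])
            except Exception:
                # Assumes a module-level `logger = logging.getLogger(__name__)`
                logger.warning("Failed to clean up test db %s", db_name, exc_info=True)
            finally:
                # Always point Django back at the previous database
                conn.close()
                conn.settings_dict = prev_settings[db_name]
                conn.connect()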
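
The control flow can be tried without Django. In the sketch below, FakeConnection, destroy_clone() and restore_all() are hypothetical stand-ins invented for the example; the point is only that a failure for one database does not stop the cleanup of the others.

```python
import logging

logger = logging.getLogger(__name__)


class FakeConnection:
    """Hypothetical stand-in for a Django connection wrapper."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.restored = False

    def destroy_clone(self):
        if self.fail:
            raise RuntimeError(f"cannot drop clone of {self.name}")


def restore_all(connections):
    """Try to clean up every database; never stop at the first failure."""
    failed = []
    for alias, conn in connections.items():
        try:
            conn.destroy_clone()
        except Exception:
            logger.warning("Failed to clean up test db %s", alias, exc_info=True)
            failed.append(alias)
        finally:
            # Stands in for restoring the settings and reconnecting.
            conn.restored = True
    return failed


conns = {
    "default": FakeConnection("default", fail=True),
    "other": FakeConnection("other"),
}
failed = restore_all(conns)
```

Returning the list of failed aliases is a design choice of this sketch; it lets the caller decide whether a leftover clone should fail the test session or only be reported.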

Performance Consideration

For PostgreSQL, clone_test_db() uses CREATE DATABASE ... TEMPLATE which is fast. For large test suites, consider:

Parallel test workers: Each worker can have its own clone suffix
Caching: If test data fixtures are expensive, the session-scoped test_data_db_snapshot is the right approach
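
For the parallel worker case, one possible way to derive a per-worker suffix is sketched below. It assumes the suite runs under pytest-xdist (not stated in this PR), which sets the PYTEST_XDIST_WORKER environment variable in worker processes; clone_suffix() is a hypothetical helper.

```python
import os


def clone_suffix(base="snap", environ=None):
    """Hypothetical helper: build a clone suffix that is unique per xdist worker."""
    environ = os.environ if environ is None else environ
    # pytest-xdist sets PYTEST_XDIST_WORKER to "gw0", "gw1", ... in worker
    # processes; the variable is absent in a non-parallel run.
    worker = environ.get("PYTEST_XDIST_WORKER")
    return f"{base}_{worker}" if worker else base


serial = clone_suffix(environ={})
parallel = clone_suffix("test", {"PYTEST_XDIST_WORKER": "gw1"})
```

The result could then be passed as the suffix argument of snapshot_db() so that clones created by different workers do not collide.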

Note: This analysis was done with AI assistance (Claude Code).

MizukiTemma · 2026-03-24T16:33:03Z

This PR has not been worked on for a long time but is useful as inspiration for test improvement: see the conversation here

PeterNerlich added 2 commits June 15, 2025 17:51

WIP test db snapshot fixture

87e54c4

WIP use new fixture everywhere

a50edcf

PeterNerlich assigned hannaseithe and MizukiTemma Sep 9, 2025

PeterNerlich changed the title ~~Experiment:~~ Experiment: Clone DB to isolate tests Sep 9, 2025

hannaseithe mentioned this pull request Oct 16, 2025

Investigate the Flakiness of tests on the CircleCI pipeline #3949

Open

PeterNerlich mentioned this pull request Oct 22, 2025

Refactor dummy_region fixture as function #3950

Merged

MizukiTemma added the "stale" label ("This PR has been stale for a while (~3 months). This is a first warning.") Jan 26, 2026

hannaseithe removed their assignment Feb 2, 2026

osmers added this to the Next milestone Mar 17, 2026

jarlhengstmengel removed this from the Next milestone Mar 17, 2026

PeterNerlich wants to merge 2 commits into develop from tests/clonedb

6 participants
