Lock names include attributes values #6888
Conversation
Walkthrough: The changes add a centralized constraint-validation-and-save helper (`run_constraints_and_save`).
CodSpeed Performance Report: merging #6888 will not alter performance.
Force-pushed from 6206bb8 to 2628683 (compare)
Actionable comments posted: 1
🧹 Nitpick comments (6)
backend/infrahub/core/node/save.py (1)
10-26: Add documentation and consider error handling. The implementation looks good, but consider these improvements:
- Add a docstring explaining the function's purpose, parameters, and behavior
- Consider what happens if constraint checks fail - will the locks be properly released?
Add documentation:
```diff
 async def run_constraints_and_save(
     node: Node,
     node_constraint_runner: NodeConstraintRunner,
     fields_to_validate: list[str],
     fields_to_save: list[str],
     db: InfrahubDatabase,
     branch: Branch,
     skip_uniqueness_check: bool = False,
 ) -> None:
+    """Run constraint checks and save node within a locked context.
+
+    This function acquires locks based on the node's schema and uniqueness constraints,
+    runs constraint validation, and saves the node to the database.
+
+    Args:
+        node: The node to validate and save
+        node_constraint_runner: Runner for constraint checks
+        fields_to_validate: Fields to validate during constraint checking
+        fields_to_save: Fields to save (empty list saves all fields)
+        db: Database connection
+        branch: Branch context
+        skip_uniqueness_check: Whether to skip uniqueness constraint checks
+
+    Raises:
+        Any exceptions from constraint checks or save operations
+    """
     schema_branch = db.schema.get_schema_branch(name=branch.name)
```

backend/infrahub/graphql/mutations/ipam.py (1)
122-131: Consider reducing code duplication in reconciliation calls. The if/else pattern for conditional locking appears multiple times with nearly identical reconciliation calls. Consider extracting this pattern to reduce duplication.
Consider creating a helper method to encapsulate the conditional locking pattern:
```diff
+    @classmethod
+    async def _reconcile_with_optional_lock(
+        cls,
+        reconciler: IpamReconciler,
+        ip_value: IPv4Interface | ipaddress.IPv6Interface | ipaddress.IPv4Network | ipaddress.IPv6Network,
+        namespace: str,
+        node_uuid: str,
+        lock_name: str | None = None,
+        is_delete: bool = False,
+    ) -> Node:
+        """Run reconciliation with optional locking."""
+        reconcile_kwargs = {
+            "ip_value": ip_value,
+            "namespace": namespace,
+            "node_uuid": node_uuid,
+        }
+        if hasattr(reconciler.reconcile, 'is_delete'):
+            reconcile_kwargs["is_delete"] = is_delete
+
+        if lock_name:
+            async with InfrahubMultiLock(lock_registry=lock.registry, locks=[lock_name]):
+                return await reconciler.reconcile(**reconcile_kwargs)
+        else:
+            return await reconciler.reconcile(**reconcile_kwargs)

 # Then use it in the methods:
-        if lock_name := cls._get_lock_name(namespace_id, branch):
-            async with InfrahubMultiLock(lock_registry=lock.registry, locks=[lock_name]):
-                reconciled_address = await reconciler.reconcile(
-                    ip_value=ip_address, namespace=namespace_id, node_uuid=address.get_id()
-                )
-        else:
-            reconciled_address = await reconciler.reconcile(
-                ip_value=ip_address, namespace=namespace_id, node_uuid=address.get_id()
-            )
+        lock_name = cls._get_lock_name(namespace_id, branch)
+        reconciled_address = await cls._reconcile_with_optional_lock(
+            reconciler=reconciler,
+            ip_value=ip_address,
+            namespace=namespace_id,
+            node_uuid=address.get_id(),
+            lock_name=lock_name,
+        )
```

Also applies to: 168-177
backend/infrahub/lock_utils.py (4)
47-47: Fix docstring to reference the correct constant name. The docstring references `KINDS_TO_LOCK_ON_ANY_BRANCH` but the actual constant is `KINDS_CONCURRENT_MUTATIONS_NOT_ALLOWED`.
```diff
-    Check whether kind or any kind generic is in KINDS_TO_LOCK_ON_ANY_BRANCH.
+    Check whether kind or any kind generic is in KINDS_CONCURRENT_MUTATIONS_NOT_ALLOWED.
```
16-16: Clarify the docstring about constraint override behavior. The docstring is misleading. It states "we only need to lock on the generic" but the implementation removes the node schema kind from the list while keeping both the node kind and generic kinds when appropriate.
```diff
-    it means node schema overrided this constraint, in which case we only need to lock on the generic.
+    it means node schema overrode this constraint, in which case we exclude the node schema kind from locking.
```
23-41: Potential issue with list manipulation logic. The use of `pop(0)` on line 40 assumes that `kinds[0]` is always `node_schema.kind`, but this is only guaranteed if `node_schema.uniqueness_constraints` is truthy (lines 23-25). If the logic changes in the future, this could lead to removing the wrong item. Consider using explicit removal instead:
```diff
-    kinds.pop(0)
+    if node_schema.kind in kinds:
+        kinds.remove(node_schema.kind)
```
66-66: Fix grammatical error in docstring.
```diff
-    concurrent mutations are only allowed on non-main branch as objects validations will be performed at least when merging in main branch.
+    concurrent mutations are only allowed on non-main branch as object validations will be performed at least when merging in main branch.
```
📒 Files selected for processing (6)
- backend/infrahub/core/node/create.py (2 hunks)
- backend/infrahub/core/node/save.py (1 hunks)
- backend/infrahub/graphql/mutations/ipam.py (8 hunks)
- backend/infrahub/graphql/mutations/main.py (4 hunks)
- backend/infrahub/lock_utils.py (1 hunks)
- backend/tests/unit/core/test_get_kinds_lock.py (2 hunks)
🔇 Additional comments (6)
backend/infrahub/core/node/create.py (1)
156-163: Good consolidation of constraint checking and saving logic. The refactoring to use `run_constraints_and_save` improves code maintainability by encapsulating locking, constraint validation, and saving into a single atomic operation. This ensures better data integrity during node creation.

backend/tests/unit/core/test_get_kinds_lock.py (1)
39-83: Excellent test coverage for the new locking behavior. The new tests comprehensively cover:
- Branch-specific locking behavior (`test_lock_other_branch`)
- Cross-branch locking requirements (`test_lock_groups_on_other_branches`)
- Attribute-based uniqueness constraint lock names (`test_lock_names_only_attributes`)

The test structure is clear and covers important edge cases for the refactored locking logic.
backend/infrahub/graphql/mutations/ipam.py (2)
104-109: Good consolidation of lock name logic. The `_get_lock_name` method properly encapsulates the logic for determining when locking is needed, returning `None` for non-default branches to avoid unnecessary locking.
259-264: Clear documentation of IP prefix locking requirements. The comment explains why IP prefix mutations need locking on all branches due to cardinality-one relationships, which is helpful for understanding the different locking strategy compared to IP addresses.
backend/infrahub/graphql/mutations/main.py (2)
268-276: Excellent refactoring to use centralized constraint and save logic. The replacement of separate constraint checking and saving with `run_constraints_and_save` improves code consistency and ensures that locking is properly handled at the core layer rather than in the GraphQL mutation layer.
146-151: Good removal of locking logic from wrapper methods. The simplified wrapper methods now focus solely on their primary responsibility without concerning themselves with locking details. This improves separation of concerns and makes the code more maintainable.
Also applies to: 207-223
Could you please add a PR description that explains why the changes in `_get_kind_lock_names_on_object_mutation` will help with the locking slowness?
Usually `utils` means that a file contains a bunch of things that don't really belong together, and I don't think that is the case here. I think this file can be `locks.py` or even `lock_getter.py`.
backend/infrahub/lock_utils.py (outdated)
```python
    return False


def _get_lock_names_on_object_mutation(node: Node, branch: Branch, schema_branch: SchemaBranch) -> list[str]:
```
at least one of these should be a public method and I think it's this one
Right
Quick remark: we should be very careful with user inputs; in this case you are lucky that Redis keys are very permissive regarding character set. I'm not sure that would be the case with all implementations. Instead of using the attribute value directly, we can use some kind of hex hash of it. Edit: not saying we should do it right now, I don't even know the performance implications of such a change.
@fatih-acar good point, I'll update to use the hash. I don't expect performance issues there; hashing a 1-million-character string takes less than 1 ms.
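That claim is easy to sanity-check with a small standalone timing sketch (illustrative only; exact numbers depend on hardware):

```python
import hashlib
import time

# Hash a ~1 million character string once and report the elapsed time.
payload = ("x" * 1_000_000).encode()
start = time.perf_counter()
digest = hashlib.sha256(payload).hexdigest()
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"sha256 over 1M chars: {elapsed_ms:.2f} ms -> {digest[:16]}...")
```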
Actionable comments posted: 1
🔭 Outside diff range comments (1)
backend/infrahub/lock_getter.py (1)
63-109: ⚠️ Use a stable hash function instead of Python's built-in `hash()`. The implementation correctly uses hashing as discussed in the PR comments. However, Python's built-in `hash()` function is not stable across Python processes due to hash randomization. This could lead to different lock names for the same data across different service instances or restarts. Consider using a cryptographic hash function like SHA-256 or a non-cryptographic but stable hash like xxhash.
```diff
+import hashlib
+
 def get_lock_names_on_object_mutation(node: Node, branch: Branch, schema_branch: SchemaBranch) -> list[str]:
     """
     Return lock names for object on which we want to avoid concurrent mutation (create/update).
     Except for some specific kinds, concurrent mutations are only allowed on non-main branch
     as objects validations will be performed at least when merging in main branch.
     Lock names include kind, some generic kinds, and values of attributes of corresponding
     uniqueness constraints.
     """
     if not branch.is_default and not _should_kind_be_locked_on_any_branch(node.get_kind(), schema_branch):
         return []

     lock_kinds = _get_kinds_to_lock_on_object_mutation(node.get_kind(), schema_branch)
     lock_names = []
     for kind in lock_kinds:
         schema = schema_branch.get(name=kind)
         ucs = schema.uniqueness_constraints
         if ucs is None:
             continue
         ucs_lock_names = []
         uc_attributes_names = set()
         for uc in ucs:
             uc_attributes_values = []
             # Keep only attributes constraints
             for field_path in uc:
                 # Some attributes may exist in different uniqueness constraints, we de-duplicate them
                 if field_path in uc_attributes_names:
                     continue
                 # Exclude relationships uniqueness constraints
                 schema_path = schema.parse_schema_path(path=field_path, schema=schema_branch)
                 if schema_path.related_schema is not None or schema_path.attribute_schema is None:
                     continue
                 uc_attributes_names.add(field_path)
-                value_hashed = str(hash(str(getattr(node, schema_path.attribute_schema.name).value)))
+                value_str = str(getattr(node, schema_path.attribute_schema.name).value)
+                value_hashed = hashlib.sha256(value_str.encode()).hexdigest()[:16]  # Use first 16 chars for brevity
                 uc_attributes_values.append(value_hashed)
             if uc_attributes_values:
                 uc_lock_name = ".".join(uc_attributes_values)
                 ucs_lock_names.append(uc_lock_name)
         partial_lock_name = kind + "." + ".".join(ucs_lock_names)
         lock_names.append(build_object_lock_name(partial_lock_name))
     return lock_names
```
🧹 Nitpick comments (2)
backend/tests/unit/core/test_get_kinds_lock.py (1)
64-83: Verify the exclusion of the owner relationship from the lock name. The uniqueness constraint includes three paths: `["name__value", "color__value", "owner__name"]`, but the expected lock name only includes hashes for "mercedes" and "blue", excluding the owner relationship. Based on the PR objectives stating "this approach is not applied to relationships", this appears intentional. However, consider adding a comment in the test to clarify this behavior for future maintainers.
```diff
     car_person_schema_unregistered.nodes[0].uniqueness_constraints = [
         ["name__value", "color__value", "owner__name"]
     ]
     registry.schema.register_schema(schema=car_person_schema_unregistered, branch=default_branch.name)
     schema_branch = registry.schema.get_schema_branch(name=default_branch.name)

     person = await create_and_save(db=db, schema="TestPerson", name="John")
     car = await create_and_save(db=db, schema="TestCar", name="mercedes", color="blue", owner=person)

+    # Note: owner__name is excluded from lock names as relationships are not included in lock calculations
     assert get_lock_names_on_object_mutation(car, branch=default_branch, schema_branch=schema_branch) == [
         "global.object.TestCar." + str(hash("mercedes")) + "." + str(hash("blue"))
     ]
```
backend/infrahub/lock_getter.py (1)
11-43: Consider refactoring the list modification logic for better clarity. The function logic is correct, but using `pop(0)` while building the list could be clearer. Consider building the final list without modification:
```diff
 def _get_kinds_to_lock_on_object_mutation(kind: str, schema_branch: SchemaBranch) -> list[str]:
     """
     Return kinds for which we want to lock during creating / updating an object of a given schema node.
     Lock should be performed on schema kind and its generics having a uniqueness_constraint defined.
     If a generic uniqueness constraint is the same as the node schema one,
     it means node schema overrided this constraint, in which case we only need to lock on the generic.
     """

     node_schema = schema_branch.get(name=kind)

     schema_uc = None
-    kinds = []
+    include_node_kind = False
     if node_schema.uniqueness_constraints:
-        kinds.append(node_schema.kind)
+        include_node_kind = True
         schema_uc = node_schema.uniqueness_constraints

     if isinstance(node_schema, GenericSchema):
-        return kinds
+        return [node_schema.kind] if include_node_kind else []

     generics_kinds = node_schema.inherit_from

-    node_schema_kind_removed = False
+    kinds = []
     for generic_kind in generics_kinds:
         generic_uc = schema_branch.get(name=generic_kind).uniqueness_constraints
         if generic_uc:
             kinds.append(generic_kind)
-            if not node_schema_kind_removed and generic_uc == schema_uc:
+            if include_node_kind and generic_uc == schema_uc:
                 # Check whether we should remove original schema kind as it simply overrides uniqueness_constraint
                 # of a generic
-                kinds.pop(0)
-                node_schema_kind_removed = True
+                include_node_kind = False
+
+    if include_node_kind:
+        kinds.insert(0, node_schema.kind)
+
     return kinds
```
📒 Files selected for processing (3)
- backend/infrahub/core/node/save.py (1 hunks)
- backend/infrahub/lock_getter.py (1 hunks)
- backend/tests/unit/core/test_get_kinds_lock.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/infrahub/core/node/save.py
🔇 Additional comments (6)
backend/tests/unit/core/test_get_kinds_lock.py (4)
1-12: Imports look good! The new imports are appropriate for testing the refactored lock functionality.
16-38: Test coverage for the kind locking logic is comprehensive! The test properly covers different scenarios: inherited uniqueness constraints, generic relationships, and cases with no constraints.
39-51: Good test for branch-specific locking behavior! This test correctly verifies that objects on non-default branches return empty lock names, which is the expected behavior.
52-63: No changes required: CoreGraphQLQueryGroup is correctly locked via inheritance. The `CoreGraphQLQueryGroup` schema definition inherits from `InfrahubKind.GENERICGROUP`, so `_should_kind_be_locked_on_any_branch("CoreGraphQLQueryGroup", …)` rightly returns `True`. No update needed here.

backend/infrahub/lock_getter.py (2)
1-9: Imports and constant definition look good! The constant clearly identifies kinds that require special locking behavior across all branches.
45-61: Clean implementation for the branch-agnostic locking check! The function efficiently checks both the kind and its inherited kinds.
Actionable comments posted: 1
🧹 Nitpick comments (2)
backend/infrahub/lock_getter.py (2)
37-41: Consider using a more explicit approach for removing the node schema kind. The use of `kinds.pop(0)` assumes that the node schema kind is always at index 0. While this is currently true due to the order of operations, it makes the code fragile and dependent on ordering. Consider storing the node schema kind separately and removing it explicitly:
```diff
     node_schema_kind_removed = False
     for generic_kind in generics_kinds:
         generic_uc = schema_branch.get(name=generic_kind).uniqueness_constraints
         if generic_uc:
             kinds.append(generic_kind)
             if not node_schema_kind_removed and generic_uc == schema_uc:
                 # Check whether we should remove original schema kind as it simply overrides uniqueness_constraint
                 # of a generic
-                kinds.pop(0)
+                if node_schema.kind in kinds:
+                    kinds.remove(node_schema.kind)
                 node_schema_kind_removed = True
```
112-113: Handle the edge case when all uniqueness constraints are relationship-based. When all uniqueness constraints are relationship-based (and thus skipped), `ucs_lock_names` will be empty, resulting in lock names with consecutive dots like `"kind.."`. Consider handling this edge case:
```diff
-        partial_lock_name = kind + "." + ".".join(ucs_lock_names)
+        if ucs_lock_names:
+            partial_lock_name = kind + "." + ".".join(ucs_lock_names)
+        else:
+            partial_lock_name = kind
```
This ensures consistent lock name formatting.
📒 Files selected for processing (2)
- backend/infrahub/lock_getter.py (1 hunks)
- backend/tests/unit/core/test_get_kinds_lock.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/tests/unit/core/test_get_kinds_lock.py
🔇 Additional comments (2)
backend/infrahub/lock_getter.py (2)
1-9: LGTM! Imports and constant definition are appropriate for the module's functionality.
45-61: LGTM! The function correctly checks if a kind or its generics require locking on any branch.
Force-pushed from e973da4 to 5f677ea (compare)
I think there's an issue with objects depending on a resource manager such as NumberPool... I end up with duplicate numbers allocated. I think we still need to lock exclusively on kind when the object uses a resource manager... Edit: this issue is NOT related to #6866
#7025 is not enough to fix the issue because the …
Force-pushed from 80f493f to 40b213b (compare)
Force-pushed from b8a059b to 1c1a372 (compare)
Fixes #7254
We can probably re-enable locking within branches now that they are more granular?
This reverts commit fca14f2.
We should also lock regardless of the branch when allocating from a resource manager. Signed-off-by: Fatih Acar <[email protected]>
BROKEN: this does not handle double lock/deadlock issues when we acquire a resource pool lock in run_constraints_and_save() and then in get_resource(). We probably require a mechanism to check if the lock is actually held by the current worker, if so it can proceed without acquiring the lock. This reverts commit 31d7940.
Make locks re-entrant within the same asyncio context. Signed-off-by: Fatih Acar <[email protected]>
Signed-off-by: Fatih Acar <[email protected]>
WIP: still need to fix the uniqueness constraint lock on update. Signed-off-by: Fatih Acar <[email protected]>
Signed-off-by: Fatih Acar <[email protected]>
Fixes "Lock names include attributes values" Signed-off-by: Fatih Acar <[email protected]>
Force-pushed from 5a949a9 to 746a0d1 (compare)
Actionable comments posted: 5
📒 Files selected for processing (9)
- backend/infrahub/core/attribute.py (2 hunks)
- backend/infrahub/core/node/__init__.py (9 hunks)
- backend/infrahub/core/node/create.py (4 hunks)
- backend/infrahub/core/node/save.py (1 hunks)
- backend/infrahub/graphql/mutations/ipam.py (10 hunks)
- backend/infrahub/graphql/mutations/main.py (5 hunks)
- backend/infrahub/lock.py (3 hunks)
- backend/infrahub/lock_getter.py (1 hunks)
- backend/tests/unit/core/test_get_kinds_lock.py (2 hunks)
🔇 Additional comments (36)
backend/infrahub/lock.py (3)
135-136: LGTM! Re-entrant lock state initialization. The new instance variables `_owner_task` and `_recursion_depth` are initialized correctly to support per-task ownership tracking and recursive locking semantics.
159-177: Verify metrics behavior on re-entrant acquisition. The re-entrant locking logic correctly increments `_recursion_depth` and returns early when the current task already owns the lock. However, `acquire_time` is not reset, and no acquisition metric is recorded for re-entrant calls. Confirm that this is the intended behavior, as re-entrant acquisitions bypass the lock acquisition metric instrumentation and do not update `acquire_time`.
178-202: LGTM! Recursive release logic with ownership enforcement. The release logic correctly enforces ownership, supports recursive release by decrementing `_recursion_depth`, and only performs final cleanup (metrics, underlying lock release, ownership clear, event signal) on the final release when recursion depth reaches zero.

backend/infrahub/core/node/__init__.py (4)
435-440: LGTM! Propagating the `process_pools` parameter. The `_process_fields` method now accepts `process_pools: bool = True` and correctly propagates it to `_process_fields_attributes` at line 494.
531-536: LGTM! `process_pools` parameter accepted and used. The `_process_fields_attributes` method now requires `process_pools: bool` as a parameter (no default value), which is correctly provided by the caller `_process_fields` at line 494.
966-994: LGTM! `from_graphql` updated with the `process_pools` parameter. The `from_graphql` method signature now includes `process_pools: bool = True`, and the docstring (lines 972-981) correctly documents the new parameter. The parameter is correctly propagated to `attribute.from_graphql` at line 988.
561-569: Remove gating concern: validation always occurs later. The initial gate around the first `attribute.validate` is safe because a second, unconditional call to `attribute.validate` at line 644 ensures all attributes are validated regardless of `process_pools` or `from_pool`.
Likely an incorrect or invalid review comment.
backend/infrahub/core/attribute.py (1)
598-602: Capture and propagate errors from `handle_pool`. In backend/infrahub/core/attribute.py line 601, you pass an ephemeral `errors=[]` to `await self.node.handle_pool(...)`, causing any allocation or validation errors to be lost. Either collect these errors into a local variable and handle or re-raise them, or switch to using the shared errors list pattern from the other call site.

backend/infrahub/core/node/create.py (4)
3-3: LGTM! Imports align with the new locking strategy. The new imports support the centralized locking and constraint validation flow.
Also applies to: 5-5, 11-11, 14-15
198-220: LGTM! Multi-lock context properly guards persistence. The nested `_persist` function and `InfrahubMultiLock` context manager ensure that:
- Locks are computed before acquisition (via a preview object)
- Lock acquisition is atomic across all required locks
- Persistence happens within the locked region
- Transaction handling is correctly nested inside the lock context

This addresses the concurrency concerns raised in the PR objectives.
147-176: Approve empty `fields_to_save` for new nodes. The `save` method calls `_create` whenever `self._existing` is False—ignoring the `fields` argument—so passing an empty list does not prevent the node from being persisted.
193-196: Preview object with `process_pools=False` is implemented correctly. `Node.new`'s `process_pools` parameter is defined and honored, so lock-name computation avoids premature resource allocation as intended.

backend/infrahub/lock_getter.py (6)
1-15: LGTM! Imports and constants are appropriate. The constant `KINDS_CONCURRENT_MUTATIONS_NOT_ALLOWED` clearly documents which kinds require locking on any branch (not just the default branch).
17-48: LGTM! The optimization correctly avoids redundant locking. The function correctly identifies when a node schema overrides a generic's uniqueness constraint and avoids locking both. The `kinds.pop(0)` operation is safe because the node kind is always added first (line 30) before iterating generics.
51-66: LGTM! The branch locking policy is correctly enforced. The function properly checks both the node kind and its generics against the disallowed list. The early return for `GenericSchema` is correct since generics don't inherit from other types.
69-72: LGTM! SHA-256 ensures deterministic lock names across processes. The comment clearly explains why the built-in `hash()` is avoided (randomization), and SHA-256 provides the necessary consistency for distributed locking.
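For illustration, a minimal standalone snippet (not from the PR) showing the difference:

```python
import hashlib

# str hashes are salted per interpreter process (hash randomization), so two
# workers can compute different lock names for the same attribute value.
print(hash("mercedes"))  # typically changes between interpreter runs

# sha256 is deterministic everywhere, so every worker derives the same name.
print(hashlib.sha256(b"mercedes").hexdigest())
```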
84-100: LGTM! Resource pool locking prevents duplicate allocations. This code directly addresses the issue raised in PR comments about objects depending on resource managers ending up with duplicate allocations. By acquiring locks on resource pool IDs during mutations, the code ensures exclusive access during allocation. The deduplication check (`if lock_name not in lock_names`) prevents redundant locks when multiple attributes or relationships use the same pool.
102-103: LGTM! The branch-based locking optimization is sound. This early return implements the PR's key optimization: allowing concurrent mutations on non-default branches (except for specific kinds) since validation will occur during merge to main. This reduces lock contention while maintaining safety.
backend/infrahub/core/node/save.py (4)
1-10: LGTM! Imports are clean and complete. All necessary imports are present for the centralized constraint validation and persistence flow.
37-38: LGTM! Validation prevents misuse of the API. The check ensures that callers who want to manage locks externally (`manage_lock=False`) must provide the lock names. The error message is clear.
40-45: LGTM! Lock resolution is correct. The function correctly uses provided locks when available, or computes them via `get_lock_names_on_object_mutation`. Converting to a list ensures mutability for the lock registry.
47-57: LGTM! Dual-mode locking design is well-implemented. The nested `_persist` function cleanly encapsulates the critical section, and the conditional lock acquisition supports both:
- Standalone usage where this function manages locks
- Nested usage where the caller has already acquired locks (e.g., when calling from `_do_create_node`)

This design enables the re-entrant locking mentioned in the PR objectives.

backend/tests/unit/core/test_get_kinds_lock.py (5)
1-13: LGTM! The imports align with the new centralized locking strategy and test helper usage. The `deepcopy` import is appropriately used to avoid mutating shared fixtures.
40-51: LGTM! The test correctly validates that objects on non-default branches do not acquire locks (unless the kind specifically requires locking on any branch), which aligns with the PR's goal of enabling more granular locking.
53-63: LGTM! The test correctly validates that specific kinds (CoreGraphQLQueryGroup) require locking on all branches, ensuring critical objects remain protected from concurrent mutations regardless of branch context.
65-83: LGTM! The test correctly validates attribute-based lock name generation. Note that the uniqueness constraint includes a relationship field (`owner__name`), but the lock name correctly excludes it—this aligns with the implementation in `lock_getter.py`, which skips relationship-based constraints.
85-100: LGTM! The test correctly validates handling of optional attributes in lock names. When an attribute is part of a uniqueness constraint but has no value, it correctly contributes an empty hash (`_hash("")`) to the lock name.

backend/infrahub/graphql/mutations/main.py (4)
1-46: LGTM! The import changes correctly introduce the centralized locking utilities (`InfrahubMultiLock`, `get_lock_names_on_object_mutation`) and the unified constraint-and-save helper (`run_constraints_and_save`), aligning with the PR's refactoring goals.
213-233: Excellent use of the preview pattern to avoid side effects during lock computation. The code correctly:
- Fetches a clean copy of the object to preview the mutation
- Calls `from_graphql` with `process_pools=False` to prevent resource pool allocations during lock name computation
- Computes lock names before acquiring locks or starting transactions

This prevents the duplicate allocation issue mentioned in the PR objectives.
235-258: LGTM! Correct lock and transaction nesting. The implementation properly:
- Acquires locks at the outermost scope via `InfrahubMultiLock`
- Passes `manage_lock=False` to the inner mutation to avoid re-acquiring locks (preventing deadlock)
- Handles both transactional and non-transactional database contexts
- Maintains consistent lock ordering across the mutation lifecycle
281-356: LGTM! Well-structured mutation orchestration. The refactored `mutate_update_object` correctly:
- Expands the signature to support the unified constraint-and-save flow
- Includes a comprehensive docstring following Google style (as per coding guidelines)
- Uses conditional `apply_data` to support both fresh mutations and preview-based flows
- Delegates constraint validation and persistence to `run_constraints_and_save`
- Properly handles lock computation and profile refresh

The complexity is justified by the need to support multiple mutation paths (direct, preview-based, locked, unlocked) with a single implementation.

backend/infrahub/graphql/mutations/ipam.py (5)
backend/infrahub/graphql/mutations/ipam.py (5)
1-29
: LGTM!The import changes correctly add profile tracking (
get_profile_ids
) and centralized lock computation (get_lock_names_on_object_mutation
) utilities, aligning with the broader mutation refactoring.
228-242: LGTM! Consistent nested lock ordering. The implementation correctly uses nested `InfrahubMultiLock` contexts with consistent ordering:
- Namespace-level lock (outer)
- Object-level lock (inner)
- Database transaction (innermost)

This pattern, combined with re-entrant locking (per PR objectives), prevents deadlocks while ensuring proper isolation for IPAM reconciliation.
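As a toy illustration of that ordering invariant with plain `asyncio` locks (not the actual `InfrahubMultiLock` API; the innermost transaction is simulated):

```python
import asyncio

namespace_lock = asyncio.Lock()
object_lock = asyncio.Lock()

async def mutate(worker: int) -> None:
    # Every writer acquires in the same global order: namespace, then object.
    # With one fixed order, two tasks can never each hold one lock while
    # waiting on the other, which is the classic deadlock shape.
    async with namespace_lock:
        async with object_lock:
            await asyncio.sleep(0)  # stand-in for the innermost transaction
            print(f"worker {worker}: reconciled")

async def main() -> None:
    await asyncio.gather(*(mutate(i) for i in range(3)))

asyncio.run(main())
```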
392-402: Verify resource pool handling to prevent duplicate allocations. Similar to IPAddress, this code applies GraphQL data directly to the `prefix` object (line 393) with the default `process_pools=True` before computing lock names. If `prefix` attributes depend on resource managers, this could trigger allocation before the lock is acquired. The same mitigation as suggested for IPAddress would apply here: use the preview pattern or verify that IPPrefix doesn't use resource pools. The verification script from the previous comment will cover both IPAddress and IPPrefix schemas.
479-487: LGTM! The delete operation correctly uses locking. The delete flow properly acquires the namespace-level lock before starting the transaction and performing IPAM reconciliation, ensuring consistency during concurrent operations.
215-225: No resource pools on IPAddress—no changes needed. IPAddress objects don't use any pool attributes (only IPAddressPool nodes are backed by pools), and `from_graphql` doesn't support a `process_pools` flag. The mutation as written won't trigger duplicate allocations.
```python
async def from_graphql(
    self,
    data: dict,
    db: InfrahubDatabase,
    process_pools: bool = True,
) -> bool:
    """Update attr from GraphQL payload"""
```
🛠️ Refactor suggestion (🟠 Major)
Update docstring to document parameters.
The `from_graphql` method signature now includes `process_pools: bool = True`, but the docstring (line 587) is brief and does not document any parameters. Please update the docstring to include an `Args` section documenting the `data`, `db`, and `process_pools` parameters, as well as a `Returns` section.
As per coding guidelines, apply this diff to update the docstring:
```diff
 async def from_graphql(
     self,
     data: dict,
     db: InfrahubDatabase,
     process_pools: bool = True,
 ) -> bool:
-    """Update attr from GraphQL payload"""
+    """Update attribute from a GraphQL payload.
+
+    Args:
+        data: GraphQL payload to apply.
+        db: Database connection used for related lookups.
+        process_pools: Whether resource pool allocations should be performed; when False, pool allocation is skipped.
+
+    Returns:
+        True if the attribute changed, otherwise False.
+    """
```
```python
async def handle_pool(
    self,
    db: InfrahubDatabase,
    attribute: BaseAttribute,
    errors: list,
    allocate_resources: bool = True,
) -> None:
```
🛠️ Refactor suggestion (🟠 Major)
Update the docstring to document the `allocate_resources` parameter.
The method signature now includes `allocate_resources: bool = True`, but the docstring (lines 267-271) does not document this new parameter. Please update the docstring to include an `Args` section with the `allocate_resources` parameter.
As per coding guidelines, apply this diff to update the docstring:
```diff
 async def handle_pool(
     self,
     db: InfrahubDatabase,
     attribute: BaseAttribute,
     errors: list,
     allocate_resources: bool = True,
 ) -> None:
-    """Evaluate if a resource has been requested from a pool and apply the resource
+    """Evaluate if a resource has been requested from a pool and apply the resource.
+
+    Args:
+        db: Database connection.
+        attribute: Attribute to evaluate for pool allocation.
+        errors: List to append validation errors.
+        allocate_resources: Whether to perform pool allocation; when False, pool allocation is skipped.

     This method only works on number pools, currently Integer is the only type that has the from_pool
     within the create code.
     """
```
```python
async def new(
    self,
    db: InfrahubDatabase,
    id: str | None = None,
    process_pools: bool = True,
    **kwargs: Any,
) -> Self:
```
🛠️ Refactor suggestion (🟠 Major)
Add a docstring documenting the `process_pools` parameter.
The `new` method signature now includes `process_pools: bool = True`, but there is no docstring present for this method. Please add a docstring that includes an `Args` section documenting the `process_pools` parameter.
As per coding guidelines, apply this diff to add a docstring:
```diff
 async def new(
     self,
     db: InfrahubDatabase,
     id: str | None = None,
     process_pools: bool = True,
     **kwargs: Any,
 ) -> Self:
+    """Create a new node instance with the provided data.
+
+    Args:
+        db: Database connection.
+        id: Optional UUID for the node; generated if not provided.
+        process_pools: Whether to perform resource pool allocations; when False, pool allocation is skipped.
+        **kwargs: Field values for the node.
+
+    Returns:
+        The initialized node instance.
+    """
     if id and not is_valid_uuid(id):
```
```python
async def run_constraints_and_save(
    node: Node,
    node_constraint_runner: NodeConstraintRunner,
    fields_to_validate: list[str],
    fields_to_save: list[str],
    db: InfrahubDatabase,
    branch: Branch,
    skip_uniqueness_check: bool = False,
    lock_names: Sequence[str] | None = None,
    manage_lock: bool = True,
) -> None:
    """Validate a node and persist it, optionally reusing an existing lock context.

    Args:
        node: The node instance to validate and persist.
        node_constraint_runner: Runner executing node-level constraints.
        fields_to_validate: Field names that must be validated.
        fields_to_save: Field names that must be persisted.
        db: Database connection or transaction to use for persistence.
        branch: Branch associated with the mutation.
        skip_uniqueness_check: Whether to skip uniqueness constraints.
        lock_names: Precomputed lock identifiers to reuse when ``manage_lock`` is False.
        manage_lock: Whether this helper should acquire and release locks itself.
    """
```
🛠️ Refactor suggestion (🟠 Major)
Add a `Raises` section to the docstring.
The docstring is missing a `Raises` section to document the `ValueError` raised on line 38 when `manage_lock` is False but `lock_names` is None. Apply this diff:
```diff
     Args:
         node: The node instance to validate and persist.
         node_constraint_runner: Runner executing node-level constraints.
         fields_to_validate: Field names that must be validated.
         fields_to_save: Field names that must be persisted.
         db: Database connection or transaction to use for persistence.
         branch: Branch associated with the mutation.
         skip_uniqueness_check: Whether to skip uniqueness constraints.
         lock_names: Precomputed lock identifiers to reuse when ``manage_lock`` is False.
         manage_lock: Whether this helper should acquire and release locks itself.
+
+    Raises:
+        ValueError: When manage_lock is False but lock_names is not provided.
     """
```
Previous implementation was wrong due to asyncio context switching. Signed-off-by: Fatih Acar <[email protected]>
Actionable comments posted: 2
📒 Files selected for processing (3)
- backend/infrahub/graphql/mutations/ipam.py (10 hunks)
- backend/infrahub/lock.py (3 hunks)
- backend/infrahub/lock_getter.py (1 hunks)
🪛 GitHub Actions: CI
backend/infrahub/lock_getter.py: [error] 54-54: Ruff: ARG001 Unused function argument: `branch`.
🔇 Additional comments (11)
backend/infrahub/lock.py (4)
8-8: LGTM! The `ContextVar` import is necessary for implementing re-entrant locking semantics.
136-136: LGTM! The `ContextVar` initialization correctly provides per-context recursion tracking with a unique name to avoid collisions between different lock instances.
159-172: LGTM! The re-entrant acquire logic is correct: it increments the recursion depth if the lock is already held in the current context and returns early without performing a duplicate acquisition. The outermost acquire initializes the depth to 1 and records metrics appropriately.
174-194: LGTM! The re-entrant release logic correctly handles:
- The error case: raises `RuntimeError` if releasing without ownership
- Nested releases: decrements depth without releasing the underlying lock
- The final release: records metrics, releases the underlying lock, and clears the recursion context

The implementation ensures that metrics are only recorded for the outermost acquisition/release pair.
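A condensed sketch of those semantics, assuming `ContextVar`-based depth tracking as described above (illustrative names, not the actual `InfrahubLock` code):

```python
import asyncio
from contextvars import ContextVar

class ReentrantAsyncLock:
    """Re-entrant asyncio lock: nested acquires in one context never deadlock."""

    def __init__(self, name: str) -> None:
        self._lock = asyncio.Lock()
        # Depth lives in a ContextVar, so it is tracked per asyncio context:
        # nested acquires within the same request recurse, other tasks block.
        self._depth: ContextVar[int] = ContextVar(f"lock_depth_{name}", default=0)

    async def acquire(self) -> None:
        depth = self._depth.get()
        if depth > 0:  # already held in this context: go one level deeper
            self._depth.set(depth + 1)
            return
        await self._lock.acquire()
        self._depth.set(1)

    def release(self) -> None:
        depth = self._depth.get()
        if depth == 0:
            raise RuntimeError("release() called without holding the lock")
        self._depth.set(depth - 1)
        if depth == 1:  # outermost release frees the underlying lock
            self._lock.release()
```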
backend/infrahub/lock_getter.py (3)
14-45: LGTM! The function correctly identifies kinds requiring locks by examining uniqueness constraints on both the node schema and its generics. The logic to remove the node schema kind when it simply overrides a generic's constraint (lines 40-44) prevents duplicate locking and is well-documented.
48-51: LGTM! The use of SHA-256 ensures consistent lock names across different processes, which is essential for distributed locking. The comment clearly explains why Python's built-in `hash()` is unsuitable.
81-125: LGTM! The uniqueness constraint processing correctly:
- Deduplicates attributes across multiple constraints (lines 96-97)
- Excludes relationship constraints as intended per PR objectives (lines 100-102)
- Handles optional/None attributes gracefully (lines 106-109)
- Skips kinds with no qualifying attributes to prevent malformed lock names (lines 119-120)
- Builds hierarchical lock names from hashed attribute values (lines 122-123)

The implementation addresses previous review feedback and handles edge cases appropriately.
backend/infrahub/graphql/mutations/ipam.py (4)
131-152: LGTM, pending signature fix. The refactored `mutate_create` correctly:
- Applies the `@retry_db_transaction` decorator for resilience
- Acquires locks before opening the transaction (lines 145-146)
- Performs reconciliation within the transaction scope (lines 147-149)

The structure is sound once the signature inconsistency on line 144 is resolved.
292-332: LGTM! The `IPPrefix` mutation correctly:
- Documents the cardinality-one relationship locking requirement (lines 293-295)
- Acquires the namespace-level lock before the transaction (line 325)
- Performs reconciliation within the transaction scope (lines 327-329)
- Applies the `@retry_db_transaction` decorator for resilience

The implementation is consistent and well-structured.
366-484: LGTM! Both `mutate_update` and `mutate_delete` correctly implement the centralized locking pattern:
- `mutate_update`: nested locks (namespace → object) wrap the transaction and reconciliation (lines 402-417)
- `mutate_delete`: namespace lock wraps the transaction and reconciliation (lines 477-481)

The lock acquisition and transaction boundaries are properly aligned, ensuring data consistency during concurrent operations.
189-242: Ensure the `_get_lock_names(namespace_id, branch)` signature matches its call. Lock ordering across IPAM update mutations is consistent (namespace → object → transaction).
Force-pushed from 763f5cc to 7b6f018 (compare)
We can enable locking for consistency on branches now that they are more granular and avoid performance penalty. Signed-off-by: Fatih Acar <[email protected]>
Signed-off-by: Fatih Acar <[email protected]>
Force-pushed from 7b6f018 to 3aa9ba8 (compare)
IFC-1874
This PR makes the current locks more granular: instead of locking only on the object kind, we also lock on the values of the attributes contained within the uniqueness constraint. This way, we allow concurrent creation/updates of objects that have different values for these key attributes. We cannot do this for relationships, as it would require fetching relationship peers before locking.
EDIT 2025-09-30: this PR also contains fixes for the resource managers to prevent duplicate allocations. Locking is now done during the mutation but also during the allocation of the resource, so making the locks "re-entrant" was required to avoid deadlocks within a single request.
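To illustrate the effect, a hedged sketch of the lock-name shape seen in the tests above (`global.object.<Kind>.<hash>.<hash>`); the helper names here are stand-ins for the ones in `lock_getter.py`:

```python
import hashlib

def _hash(value: str) -> str:
    # Stand-in for the stable hash helper discussed in the review.
    return hashlib.sha256(value.encode()).hexdigest()

def object_lock_name(kind: str, *attr_values: str) -> str:
    # Illustrative reconstruction of the lock names built for mutations.
    return "global.object." + kind + "." + ".".join(_hash(v) for v in attr_values)

# Different uniqueness-constraint values produce different lock names, so the
# two creations below no longer contend for the same lock:
print(object_lock_name("TestCar", "mercedes", "blue"))
print(object_lock_name("TestCar", "bmw", "red"))
```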