Skip to content

fix(backend): create node with many rels faster #6883

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: stable
Choose a base branch
from

Conversation

fatih-acar
Copy link
Contributor

@fatih-acar fatih-acar commented Jul 23, 2025

Avoid fetching each peer during node creation by fetching them all at once prior to calling get_create_data().

Summary by CodeRabbit

  • Bug Fixes
    • Improved handling of relationship peers during node creation to ensure peers are explicitly set before processing relationships.
  • Refactor
    • Converted several relationship-related methods from asynchronous to synchronous for simplified processing.
    • Updated test fixtures to delegate schema registration externally, simplifying setup.
    • Adjusted test code to reflect synchronous behavior of relationship methods.
  • Chores
    • Changed default test container usage flag to disabled unless explicitly enabled.

@fatih-acar fatih-acar requested a review from a team as a code owner July 23, 2025 08:48
Copy link

coderabbitai bot commented Jul 23, 2025

Walkthrough

The asynchronous method NodeCreateAllQuery.query_init was updated to explicitly fetch peers for each relationship manager and set the peer for each relationship before proceeding. Additionally, several methods in the Relationship class and related calls were converted from asynchronous to synchronous, removing unnecessary async/await usage. Corresponding changes were made in NodeManager.query_peers and GraphQL mutations to reflect this sync conversion. Test fixtures for migrations were simplified by delegating schema registration to a new fixture. The default environment variable for test containers was changed from true to false. No public interfaces or signatures were changed except for method async-to-sync updates.

Changes

File Change Summary
backend/infrahub/core/query/node.py Refactored NodeCreateAllQuery.query_init to explicitly await peer retrieval and set peers on relationships before calling get_create_data.
backend/infrahub/core/relationship/model.py Converted several Relationship class methods (_process_data, new, load, set_peer, resolve) and related calls from async to sync.
backend/infrahub/core/manager.py Changed NodeManager.query_peers to call Relationship.load() and result.set_peer() synchronously instead of awaiting them.
backend/infrahub/graphql/mutations/relationship.py Removed await from Relationship.load() call in RelationshipRemove mutation.
backend/tests/helpers/constants.py Changed default for INFRAHUB_USE_TEST_CONTAINERS environment variable from "true" to "false".
backend/tests/unit/core/migrations/graph/test_012.py Simplified migration_012_data fixture by removing manual schema loading; added dependency on register_core_models_schema fixture.
backend/tests/unit/core/migrations/graph/test_013.py Removed manual schema loading in migration_013_data and migration_013_schema fixtures; added register_core_models_schema fixture dependency; removed unused imports.
backend/tests/unit/core/test_relationship.py Removed unnecessary await from calls to rel.load() and rel.set_peer() in test cases.
backend/tests/unit/core/test_relationship_query.py Removed await from Relationship.load() calls in test cases.

Estimated code review effort

3 (~45 minutes)


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ad348f and 337a468.

📒 Files selected for processing (9)
  • backend/infrahub/core/manager.py (1 hunks)
  • backend/infrahub/core/query/node.py (1 hunks)
  • backend/infrahub/core/relationship/model.py (10 hunks)
  • backend/infrahub/graphql/mutations/relationship.py (1 hunks)
  • backend/tests/helpers/constants.py (1 hunks)
  • backend/tests/unit/core/migrations/graph/test_012.py (1 hunks)
  • backend/tests/unit/core/migrations/graph/test_013.py (3 hunks)
  • backend/tests/unit/core/test_relationship.py (4 hunks)
  • backend/tests/unit/core/test_relationship_query.py (2 hunks)
📓 Path-based instructions (2)
backend/tests/**/*

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

Place backend tests in backend/tests/

Files:

  • backend/tests/helpers/constants.py
  • backend/tests/unit/core/test_relationship.py
  • backend/tests/unit/core/migrations/graph/test_012.py
  • backend/tests/unit/core/migrations/graph/test_013.py
backend/**/*

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/tests/helpers/constants.py
  • backend/infrahub/core/manager.py
  • backend/tests/unit/core/test_relationship.py
  • backend/infrahub/graphql/mutations/relationship.py
  • backend/tests/unit/core/migrations/graph/test_012.py
  • backend/infrahub/core/relationship/model.py
  • backend/tests/unit/core/migrations/graph/test_013.py
🧬 Code Graph Analysis (4)
backend/tests/helpers/constants.py (1)
tasks/utils.py (1)
  • str_to_bool (72-101)
backend/tests/unit/core/migrations/graph/test_012.py (1)
backend/tests/unit/conftest.py (1)
  • delete_all_nodes_in_db (2352-2353)
backend/infrahub/core/relationship/model.py (4)
backend/infrahub/core/query/relationship.py (1)
  • RelationshipPeerData (70-113)
backend/infrahub/core/branch/models.py (1)
  • isinstance (170-171)
backend/infrahub/core/node/__init__.py (3)
  • new (675-688)
  • load (695-714)
  • Node (72-1020)
backend/infrahub/core/protocols_base.py (3)
  • new (84-84)
  • InfrahubDatabase (35-65)
  • load (87-94)
backend/tests/unit/core/migrations/graph/test_013.py (1)
backend/tests/unit/conftest.py (1)
  • delete_all_nodes_in_db (2352-2353)
✅ Files skipped from review due to trivial changes (1)
  • backend/tests/unit/core/test_relationship_query.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/infrahub/core/query/node.py
🧰 Additional context used
📓 Path-based instructions (2)
backend/tests/**/*

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

Place backend tests in backend/tests/

Files:

  • backend/tests/helpers/constants.py
  • backend/tests/unit/core/test_relationship.py
  • backend/tests/unit/core/migrations/graph/test_012.py
  • backend/tests/unit/core/migrations/graph/test_013.py
backend/**/*

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/tests/helpers/constants.py
  • backend/infrahub/core/manager.py
  • backend/tests/unit/core/test_relationship.py
  • backend/infrahub/graphql/mutations/relationship.py
  • backend/tests/unit/core/migrations/graph/test_012.py
  • backend/infrahub/core/relationship/model.py
  • backend/tests/unit/core/migrations/graph/test_013.py
🧬 Code Graph Analysis (4)
backend/tests/helpers/constants.py (1)
tasks/utils.py (1)
  • str_to_bool (72-101)
backend/tests/unit/core/migrations/graph/test_012.py (1)
backend/tests/unit/conftest.py (1)
  • delete_all_nodes_in_db (2352-2353)
backend/infrahub/core/relationship/model.py (4)
backend/infrahub/core/query/relationship.py (1)
  • RelationshipPeerData (70-113)
backend/infrahub/core/branch/models.py (1)
  • isinstance (170-171)
backend/infrahub/core/node/__init__.py (3)
  • new (675-688)
  • load (695-714)
  • Node (72-1020)
backend/infrahub/core/protocols_base.py (3)
  • new (84-84)
  • InfrahubDatabase (35-65)
  • load (87-94)
backend/tests/unit/core/migrations/graph/test_013.py (1)
backend/tests/unit/conftest.py (1)
  • delete_all_nodes_in_db (2352-2353)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: javascript-lint
  • GitHub Check: prepare-environment / prepare
  • GitHub Check: Detect which file has changed
  • GitHub Check: triage
  • GitHub Check: scan
🔇 Additional comments (16)
backend/tests/helpers/constants.py (1)

5-5: Configuration change: Test containers now disabled by default.

This change will improve development performance by avoiding container overhead unless explicitly enabled. Developers who need test containers should set INFRAHUB_USE_TEST_CONTAINERS=true in their environment.

backend/tests/unit/core/test_relationship.py (1)

104-104: LGTM: Correctly updated test calls to match synchronous method signatures.

The removal of await keywords from rel.load() and rel.set_peer() calls correctly reflects the conversion of these methods from asynchronous to synchronous in the core relationship model.

Also applies to: 118-118, 134-134, 150-150, 155-155, 158-158, 165-165, 168-168, 171-171

backend/infrahub/graphql/mutations/relationship.py (1)

236-236: LGTM: Correctly updated to synchronous method call.

The removal of await from rel.load() correctly reflects the method's conversion to synchronous in the core relationship model.

backend/infrahub/core/manager.py (2)

403-409: LGTM: Correctly updated to synchronous method chain.

The removal of await from the Relationship(...).load(...) method chain correctly reflects the conversion to synchronous methods in the relationship model.


411-411: LGTM: Correctly updated to synchronous method call.

The removal of await from result.set_peer() correctly reflects the method's conversion to synchronous in the relationship model.

backend/tests/unit/core/migrations/graph/test_012.py (1)

137-139: LGTM: Improved test organization with dedicated schema fixture.

The update to use register_core_models_schema fixture is a good refactoring that promotes reusability and consistency by delegating schema registration to a dedicated fixture instead of inline schema loading.

backend/tests/unit/core/migrations/graph/test_013.py (3)

14-14: LGTM: Clean import removal

The removal of unused schema imports aligns with the refactoring to delegate schema registration to the register_core_models_schema fixture, reducing dependencies and simplifying the test setup.


143-145: LGTM: Proper fixture dependency injection

The addition of register_core_models_schema parameter effectively delegates schema registration to a centralized fixture, reducing code duplication and simplifying test setup while maintaining the same functionality.


166-168: LGTM: Consistent fixture refactoring

The consistent addition of register_core_models_schema parameter across both fixtures maintains a clean and unified approach to schema registration in migration tests.

backend/infrahub/core/relationship/model.py (7)

169-169: LGTM: Appropriate async-to-sync conversion

Converting _process_data to a synchronous method is logical since it only performs data manipulation operations that don't require async behavior, improving performance by removing unnecessary async overhead.


173-173: LGTM: Consistent synchronous method calls

The removal of await from set_peer calls is consistent with the method's conversion to synchronous operation, maintaining code coherence throughout the refactoring.

Also applies to: 190-190, 201-201


209-209: LGTM: Correct synchronous method call

The removal of await from the _process_data call is appropriate since the method is now synchronous, while maintaining the new method's async interface for potential other async operations.


213-213: LGTM: Complete async-to-sync method conversion

Converting the load method to synchronous operation is appropriate since it only performs data assignments and calls other synchronous methods, eliminating unnecessary async overhead.

Also applies to: 226-226


255-260: LGTM: Optimal sync conversion for simple assignment

Converting set_peer to synchronous operation is ideal since it only performs simple conditional assignments without any async operations, directly supporting the PR's performance optimization goals.


436-436: LGTM: Consistent sync method calls in async context

The removal of await from set_peer calls within the async resolve method is correct and consistent with the method's conversion to synchronous operation.

Also applies to: 453-453, 476-476


965-971: LGTM: Completing the synchronous refactoring pattern

The removal of await from Relationship.load() calls in RelationshipManager correctly completes the async-to-sync conversion, maintaining consistency across the codebase and supporting the PR's performance optimization objectives.

Also applies to: 1053-1053

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary or Summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the group/backend Issue related to the backend (API Server, Git Agent) label Jul 23, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/infrahub/core/query/node.py (1)

162-162: Approve the optimization with a suggestion for defensive programming.

The peer assignment logic is correct and maintains the same functionality while improving performance. However, consider adding error handling for the case where rel.peer_id might not exist in the peers dictionary.

Consider this defensive approach:

-                await rel.set_peer(value=peers[rel.peer_id])
+                if rel.peer_id in peers:
+                    await rel.set_peer(value=peers[rel.peer_id])
+                else:
+                    # Log warning or handle missing peer case
+                    raise ValueError(f"Peer ID {rel.peer_id} not found in fetched peers")

Alternatively, if you're confident that rel.peer_id will always exist in peers, the current implementation is fine.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99495f7 and 82f7e2f.

📒 Files selected for processing (1)
  • backend/infrahub/core/query/node.py (1 hunks)
📓 Path-based instructions (1)
backend/**/*

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/infrahub/core/query/node.py
🧰 Additional context used
📓 Path-based instructions (1)
backend/**/*

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

Run backend tests with pytest or via invoke tasks

Files:

  • backend/infrahub/core/query/node.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: E2E-testing-version-upgrade / From 1.1.0
  • GitHub Check: backend-benchmark
  • GitHub Check: E2E-testing-invoke-demo-start
  • GitHub Check: backend-docker-integration
  • GitHub Check: E2E-testing-playwright
  • GitHub Check: validate-generated-documentation
  • GitHub Check: backend-tests-unit
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (1)
backend/infrahub/core/query/node.py (1)

160-160: Excellent optimization approach!

Fetching all peers at once before the relationship processing loop is a smart performance improvement that follows the batch operation pattern to reduce database calls.

@fatih-acar fatih-acar force-pushed the fac-faster-many-rels-create branch from 82f7e2f to 92acc57 Compare July 23, 2025 08:51
Copy link

codspeed-hq bot commented Jul 23, 2025

CodSpeed Performance Report

Merging #6883 will degrade performances by 51.75%

Comparing fac-faster-many-rels-create (337a468) with stable (99495f7)

Summary

❌ 1 regressions
✅ 9 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
test_load_node_to_db_node_schema 62.5 ms 129.6 ms -51.75%

Avoid fetching each peer during node creation by fetching them all at
once prior to calling get_create_data().

Signed-off-by: Fatih Acar <[email protected]>
@fatih-acar fatih-acar force-pushed the fac-faster-many-rels-create branch from 92acc57 to 4ad348f Compare July 23, 2025 09:15
Comment on lines 160 to 163
peers = await rel_manager.get_peers(db=db, branch_agnostic=self.branch_agnostic)
for rel in rel_manager._relationships:
if rel.get_peer_id() in peers:
await rel.set_peer(value=peers[rel.get_peer_id()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
peers = await rel_manager.get_peers(db=db, branch_agnostic=self.branch_agnostic)
for rel in rel_manager._relationships:
if rel.get_peer_id() in peers:
await rel.set_peer(value=peers[rel.get_peer_id()])
for rel in rel_manager.get_relationships(db=db, branch_agnostic=self.branch_agnostic):

This should fetch all relationships at once, and it avoids accessing private members _relationships

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already tried that and the thing is that get_relationships() only sets the peer ids (and does not call a get_peers equivalent to fetch the full peer node).

I wonder if we really need the peer attributes for the NodeCreateAll query to work, maybe we can just avoid the peer.resolve() call within the rel.get_create_data function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already tried that and the thing is that get_relationships() only sets the peer ids (and does not call a get_peers equivalent to fetch the full peer node).

Right

It seems we need to fetch to get the branch peer

@LucasG0 LucasG0 marked this pull request as draft July 24, 2025 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
group/backend Issue related to the backend (API Server, Git Agent)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants