⚡️ Speed up function get_input_data_lineage_excluding_auto_batch_casting by 13% in PR #1504 (feature/try-to-beat-the-limitation-of-ee-in-terms-of-singular-elements-pushed-into-batch-inputs)
#1506
⚡️ This pull request contains optimizations for PR #1504
If you approve this dependent PR, these changes will be merged into the original PR branch `feature/try-to-beat-the-limitation-of-ee-in-terms-of-singular-elements-pushed-into-batch-inputs`.
📄 13% (0.13x) speedup for `get_input_data_lineage_excluding_auto_batch_casting` in `inference/core/workflows/execution_engine/v1/compiler/graph_constructor.py`
⏱️ Runtime: 1.46 milliseconds → 1.29 milliseconds (best of 18 runs)
📝 Explanation and details
The optimization achieves a 13% speedup (1.46 ms → 1.29 ms) by applying two key changes:
1. Function Call Inlining (Primary Optimization)
The main performance gain comes from inlining the `get_lineage_for_input_property` function logic directly into the main loop of `get_input_data_lineage_excluding_auto_batch_casting`. This eliminates ~2,342 function calls (as shown in the profiler), reducing the overhead from 79.6% to 31.6% of total time spent in the `identify_lineage` call.
The inlined logic checks `input_definition.is_compound_input()` directly in the loop and handles both compound and simple inputs inline, avoiding the function call overhead entirely for the common case of simple batch-oriented inputs.
2. Dictionary Implementation Change
In `verify_lineages`, `defaultdict(list)` was replaced with a plain dictionary that uses explicit key existence checks. This reduces the overhead of `defaultdict`'s factory function calls and provides more predictable performance characteristics, which is especially beneficial when processing large numbers of lineages.
Performance Impact by Test Type:
The optimization maintains identical behavior and error handling while significantly reducing the computational overhead in the hot path where most properties are processed.
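The two changes described above can be illustrated with a minimal, self-contained sketch. This is not the code from `graph_constructor.py`: the `InputDefinition` class, its `lineages` field, and all function names below are hypothetical stand-ins, chosen only to show the shape of the helper-call versus inlined loop and of the `defaultdict(list)` versus plain-dict grouping.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

Lineage = Tuple[str, ...]


@dataclass
class InputDefinition:
    """Hypothetical stand-in for an input definition (not the real class)."""
    name: str
    lineages: List[Lineage]
    compound: bool = False

    def is_compound_input(self) -> bool:
        return self.compound


def _lineages_for_input(definition: InputDefinition) -> List[Lineage]:
    """Helper that the un-inlined variant calls once per input."""
    if definition.is_compound_input():
        return [lineage for lineage in definition.lineages if lineage]
    if definition.lineages and definition.lineages[0]:
        return [definition.lineages[0]]
    return []


def collect_with_helper(definitions: List[InputDefinition]) -> List[Lineage]:
    """Before: one extra function call per input definition."""
    result: List[Lineage] = []
    for definition in definitions:
        result.extend(_lineages_for_input(definition))
    return result


def collect_inlined(definitions: List[InputDefinition]) -> List[Lineage]:
    """After: the same branching written directly in the loop body."""
    result: List[Lineage] = []
    for definition in definitions:
        if definition.is_compound_input():
            for lineage in definition.lineages:
                if lineage:
                    result.append(lineage)
        elif definition.lineages and definition.lineages[0]:
            result.append(definition.lineages[0])
    return result


def group_with_defaultdict(lineages: List[Lineage]) -> Dict[int, List[Lineage]]:
    """Before: grouping by lineage length through defaultdict(list)."""
    grouped = defaultdict(list)
    for lineage in lineages:
        grouped[len(lineage)].append(lineage)
    return dict(grouped)


def group_with_plain_dict(lineages: List[Lineage]) -> Dict[int, List[Lineage]]:
    """After: a plain dict with an explicit key existence check."""
    grouped: Dict[int, List[Lineage]] = {}
    for lineage in lineages:
        key = len(lineage)
        if key in grouped:
            grouped[key].append(lineage)
        else:
            grouped[key] = [lineage]
    return grouped
```

Both pairs of functions return the same results for the same input, so a change of this kind can be checked by asserting equality of the old and new variants on shared inputs before comparing their timings.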
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-pr1504-2025-08-22T09.05.08` and push.