-
Notifications
You must be signed in to change notification settings - Fork 25.4k
[ML] Make AssignmentPlan to consider only assigned allocations #132170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @valeriy42, I've created a changelog YAML for you. |
# Conflicts: # x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/TrainedModelAssignmentRebalancer.java # x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/planning/AssignmentPlan.java # x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/planning/ZoneAwareAssignmentPlanner.java
Map<AssignmentPlan.Node, Integer> sourceNodeAssignments = source.assignments(deployment).orElse(Map.of()); | ||
for (Map.Entry<AssignmentPlan.Node, Integer> sourceAssignment : sourceNodeAssignments.entrySet()) { | ||
AssignmentPlan.Node node = originalNodeById.get(sourceAssignment.getKey().id()); | ||
dest.assignModelToNode(deployment, node, sourceAssignment.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, calling assignedModelToNode
is enough to correctly account for allocated memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a unit test for the copyAssignments
method in TrainedModelAssignmentRebalancerTests
); | ||
planBuilder.accountMemory(m, originalNode, requiredMemory); | ||
} | ||
finalPlanBuilder.assignModelToNode(originalDeployment, originalNode, assignment.getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, calling assignedModelToNode
is enough to correctly account for allocated memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make AssignmentPlan::accountMemory()
a private method so it is not accidentally called by one of the planners
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the AssignmentPlan to consider only assigned allocations (not current allocations) in memory requirement calculations. The main change eliminates the need for manual memory accounting of current allocations since AssignmentPlan.Builder now properly calculates memory requirements based only on newly assigned allocations.
Key changes:
- Memory calculation logic updated to exclude current allocations from requirements
- Simplified code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer by removing manual memory accounting
- Improved code readability with better documentation and variable naming
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
AssignmentPlanTests.java | Updated test expectations to reflect corrected memory calculations |
ZoneAwareAssignmentPlanner.java | Removed manual memory accounting for current allocations and improved documentation |
AssignmentPlanner.java | Enhanced documentation and improved code flow with clearer variable names |
AssignmentPlan.java | Added comprehensive documentation and simplified allocation tracking methods |
TrainedModelAssignmentRebalancer.java | Removed manual memory accounting logic and improved variable naming |
...rc/main/java/org/elasticsearch/xpack/ml/inference/assignment/planning/AssignmentPlanner.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/planning/AssignmentPlan.java
Outdated
Show resolved
Hide resolved
Pinging @elastic/ml-core (Team:ML) |
…rence/assignment/planning/AssignmentPlan.java Co-authored-by: Copilot <[email protected]>
…rence/assignment/planning/AssignmentPlanner.java Co-authored-by: Copilot <[email protected]>
@@ -131,7 +132,7 @@ public void testAssignModelToNode_GivenNewPlanSatisfiesCurrentAssignment() { | |||
builder.assignModelToNode(m, n, 1); | |||
|
|||
assertThat(builder.getRemainingCores(n), equalTo(2)); | |||
assertThat(builder.getRemainingMemory(n), equalTo(ByteSizeValue.ofMb(350).getBytes())); | |||
assertThat(builder.getRemainingMemory(n), equalTo(ByteSizeValue.ofMb(50).getBytes())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 50?
If I understand correctly, the node has 350MB and you've assigned a 30MB deployment to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it consists of:
- 240MB of memory overhead
- 2 x 30MB (=model size)
These tests could use some explanation of the numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, see the one comment.
(Disclaimer: this code is pretty hard to read, so not 100% sure. Another pair of eyes might be worth it...)
💔 Backport failed
You can use sqren/backport to manually backport by running |
…ic#132170) A follow-up to elastic#131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in elastic#131990.
…ic#132170) A follow-up to elastic#131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in elastic#131990.
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…ic#132170) A follow-up to elastic#131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in elastic#131990. (cherry picked from commit 80c47f3)
…ic#132170) A follow-up to elastic#131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in elastic#131990. (cherry picked from commit 80c47f3)
…) (#132271) A follow-up to #131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in #131990.
…) (#132270) A follow-up to #131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in #131990.
…ic#132170) A follow-up to elastic#131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in elastic#131990.
…#132170) (#132275) * [ML] Make AssignmentPlan to consider only assigned allocations (#132170) A follow-up to #131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in AssignmentPlan. This change led to the simplification of the code in ZoneAwareAssignmentPlanner and TrainedModelRebalancer. This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward. Marking is a non-issue since the bug was already documented in #131990. (cherry picked from commit 80c47f3) * Fix backport errors * Fix unit test
A follow-up to #131990. This PR ensures that only assigned allocations and not current allocations are used in the memory requirements calculation in
AssignmentPlan
.This change led to the simplification of the code in
ZoneAwareAssignmentPlanner
andTrainedModelRebalancer
.This PR also improves readability by adding comments, code documentation, renaming variables, and making the flow of if statements more straightforward.
Marking is a
non-issue
since the bug was already documented in #131990.