[Test] Put shutdown marker on the last upgraded node only #132157
Merged
Conversation
This is the same failure as observed in elastic#129644, for which the original fix elastic#129680 did not really work. It did not work because of the ordering of checks: the shutdown marker is removed only after the cluster passes the ready check, so that new shards can be allocated, but the cluster cannot pass the ready check before the shards are allocated. Hence the circular dependency. In hindsight, there is no need to put a shutdown record on all nodes. It is only needed on the node that upgrades last, to prevent the snapshot from completing during the upgrade process. This PR does that, which ensures there are always 2 nodes available for hosting new shards.
Resolves: elastic#132135
Resolves: elastic#132136
Resolves: elastic#132137
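For context, the shutdown marker corresponds to the node shutdown API (`PUT _nodes/<node_id>/shutdown`). Below is a minimal, hypothetical sketch of the idea, not the PR's actual test code: a rolling-upgrade test places a `restart`-type shutdown record only on the node that upgrades last, and removes it once the upgrade completes. The class and helper names, and the client wiring, are illustrative assumptions.

```java
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

import java.io.IOException;

// Illustrative helper; not the actual test code changed by this PR.
public class LastNodeShutdownMarker {

    /**
     * Registers a "restart"-type shutdown record for a single node. Per the PR
     * description, while the record exists the snapshot cannot complete during
     * the upgrade; placing it only on the node that upgrades last leaves the
     * other two nodes free to host newly allocated shards.
     */
    static void putShutdownMarker(RestClient client, String nodeId) throws IOException {
        Request request = new Request("PUT", "/_nodes/" + nodeId + "/shutdown");
        request.setJsonEntity("""
            {
              "type": "restart",
              "reason": "rolling upgrade test"
            }
            """);
        client.performRequest(request);
    }

    /**
     * Removes the shutdown record after the last node has been upgraded, so
     * the cluster can pass the ready check and allocate new shards.
     */
    static void deleteShutdownMarker(RestClient client, String nodeId) throws IOException {
        client.performRequest(new Request("DELETE", "/_nodes/" + nodeId + "/shutdown"));
    }
}
```

The ordering matters: because the marker is created before only the final node's upgrade and deleted right after it, there is no window in which the ready check waits on shard allocation while the marker simultaneously blocks that allocation, which was the circular dependency in the original fix.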
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
nicktindall approved these changes Jul 31, 2025
LGTM
@elasticmachine update branch
💔 Backport failed
You can use sqren/backport to manually backport by running …
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Jul 31, 2025
…2157) (cherry picked from commit f39ccb5) # Conflicts: # muted-tests.yml
elasticsearchmachine pushed a commit that referenced this pull request Jul 31, 2025
…132233) (cherry picked from commit f39ccb5) # Conflicts: # muted-tests.yml
afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Jul 31, 2025
…2157)
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Jul 31, 2025
…2157)
Labels
auto-backport - Automatically create backport pull requests when merged
auto-merge-without-approval - Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!)
backport pending
:Distributed Coordination/Snapshot/Restore - Anything directly related to the `_snapshot/*` APIs
Team:Distributed Coordination - Meta label for Distributed Coordination team
>test - Issues or PRs that are addressing/adding tests
v9.1.1
v9.2.0