Ensure DRBD sync before resizing volumes on thick provisioned linstor SR #105
base: 3.2.12-8.3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,4 @@ | ||
| coverage | ||
| coverage<7.11 | ||
| astroid | ||
| pylint | ||
| bitarray | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -841,31 +841,33 @@ def resize_volume(self, volume_uuid, new_size): | |
| self.ensure_volume_is_not_locked(volume_uuid) | ||
| new_size = self.round_up_volume_size(new_size) // 1024 | ||
|
|
||
| retry_count = 30 | ||
| while True: | ||
| result = self._linstor.volume_dfn_modify( | ||
| rsc_name=volume_name, | ||
| volume_nr=0, | ||
| size=new_size | ||
| # We can't resize anything until DRBD is up to date. | ||
| # We wait here for 5min max and raise an easy to understand error for the user. | ||
| # 5min is an arbitrary time, it's impossible to get a fit all situation value | ||
| # and it's currently impossible to know how much time we have to wait | ||
| # This is mostly an issue for thick provisioning, thin isn't affected. | ||
| start_time = time.monotonic() | ||
| try: | ||
| self._linstor.resource_dfn_wait_synced(volume_name, wait_interval=1.0, timeout=60*5) | ||
| except linstor.LinstorTimeoutError: | ||
| raise LinstorVolumeManagerError( | ||
| f"Volume `{volume_uuid}` from SR `{self._group_name}` is busy and can't be resized right now. " + | ||
| "Please retry later." | ||
| ) | ||
| util.SMlog(f"DRBD is up to date, syncing took {time.monotonic() - start_time}s") | ||
|
|
||
| self._mark_resource_cache_as_dirty() | ||
|
|
||
| error_str = self._get_error_str(result) | ||
| if not error_str: | ||
| break | ||
| result = self._linstor.volume_dfn_modify( | ||
| rsc_name=volume_name, | ||
| volume_nr=0, | ||
| size=new_size | ||
| ) | ||
|
Comment on lines +859 to +863
Member
I think we need to keep the initial call before waiting for the sync to complete. The idea would then be to display a message saying that we need to wait, which would create a sort of block/symmetry with your message suggestion "DRBD is up to date, syncing took...".
Author
Experience proves that we do need to wait, especially for early steps. That's why we had the previous retry mechanism. Either we keep the previous process, which is a bit brute-force, or we just wait beforehand, but we can't keep both. I think we should discuss this :)
Author
I think we need to discuss this; I'm not sure this would be beneficial. |
||
|
|
||
| # After volume creation, DRBD volume can be unusable during many seconds. | ||
| # So we must retry the definition change if the device is not up to date. | ||
| # Often the case for thick provisioning. | ||
| if retry_count and error_str.find('non-UpToDate DRBD device') >= 0: | ||
| time.sleep(2) | ||
| retry_count -= 1 | ||
| continue | ||
| self._mark_resource_cache_as_dirty() | ||
|
|
||
| error_str = self._get_error_str(result) | ||
| if error_str: | ||
| raise LinstorVolumeManagerError( | ||
| 'Could not resize volume `{}` from SR `{}`: {}' | ||
| .format(volume_uuid, self._group_name, error_str) | ||
| f"Could not resize volume `{volume_uuid}` from SR `{self._group_name}`: {error_str}" | ||
| ) | ||
|
|
||
| def get_volume_name(self, volume_uuid): | ||
|
|
||
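To make the option raised in the review thread concrete, here is a minimal sketch of "try the resize first, and only wait for the DRBD sync if it fails". It is not the code from this PR: `FakeLinstor`, `modify_size`, `wait_synced`, `SyncTimeout` and `ResizeError` are hypothetical stand-ins for the LINSTOR client, `volume_dfn_modify`, `resource_dfn_wait_synced`, `linstor.LinstorTimeoutError` and `LinstorVolumeManagerError`, and the timings are shortened so that it runs quickly.

```python
import time


class SyncTimeout(Exception):
    """Hypothetical stand-in for linstor.LinstorTimeoutError."""


class ResizeError(Exception):
    """Hypothetical stand-in for LinstorVolumeManagerError."""


class FakeLinstor:
    """Hypothetical in-memory client; this is not the real LINSTOR API."""

    def __init__(self, polls_until_synced):
        self.polls_until_synced = polls_until_synced
        self.size = None

    def is_synced(self):
        return self.polls_until_synced <= 0

    def modify_size(self, size):
        # Returns an error string, or '' on success.
        if not self.is_synced():
            return 'non-UpToDate DRBD device'
        self.size = size
        return ''

    def wait_synced(self, wait_interval, timeout):
        # Poll until the fake device is synced or the deadline passes.
        deadline = time.monotonic() + timeout
        while not self.is_synced():
            if time.monotonic() >= deadline:
                raise SyncTimeout()
            time.sleep(wait_interval)
            self.polls_until_synced -= 1


def resize(client, size, log=print, wait_interval=0.01, timeout=1.0):
    # First attempt: succeeds immediately if DRBD is already up to date.
    error = client.modify_size(size)
    if 'non-UpToDate DRBD device' in error:
        log('DRBD is not up to date, waiting for sync before retrying resize')
        start = time.monotonic()
        try:
            client.wait_synced(wait_interval, timeout)
        except SyncTimeout:
            raise ResizeError('volume is busy, please retry later') from None
        log(f'DRBD is up to date, syncing took {time.monotonic() - start:.2f}s')
        # Single retry once the sync has completed.
        error = client.modify_size(size)
    if error:
        raise ResizeError(f'could not resize volume: {error}')


if __name__ == '__main__':
    client = FakeLinstor(polls_until_synced=3)
    resize(client, 2048)
    print('new size:', client.size)
```

In this variant the two log messages bracket the wait, which is the symmetry the reviewer mentions. Whether the extra initial call is worth it compared with always waiting up front is the open question in the thread.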