-
Notifications
You must be signed in to change notification settings - Fork 917
Run Service retirement subtasks with workflows #23660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Run Service retirement subtasks with workflows #23660
Conversation
1cf83c4 to
86bf45c
Compare
356d879 to
a80dfbf
Compare
| fail!("#{self.class.model_being_retired} already in the process of being retired") | ||
| end | ||
|
|
||
| Notification.create!(:type => :vm_retiring, :subject => source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE moved these common states up to the base MiqRetireTask class so they will apply to Orchestration stacks as well.
It is odd, but automate's Orchestration Retirement also uses :vm_retiring notification type:
https://github.com/ManageIQ/manageiq-content/blob/master/content/automate/ManageIQ/Cloud/Orchestration/Retirement/StateMachines/Methods.class/__methods__/start_retirement.rb#L48
But :orchestration_stack_retired
https://github.com/ManageIQ/manageiq-content/blob/master/content/automate/ManageIQ/Cloud/Orchestration/Retirement/StateMachines/Methods.class/__methods__/finish_retirement.rb#L19
That is clearly a mistake, will fix separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dff20ee to
404adc3
Compare
| def remove_from_provider | ||
| case options[:removal_type] | ||
| when "remove_from_disk" | ||
| if vm.miq_provision || vm.is_tagged_with?("retire_full", :ns => "/managed/lifecycle") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fryguy here's another question, do we want to keep the "did we provision it or is it tagged" or just have a "RemoveFromProvider": true option e.g. https://github.com/ManageIQ/manageiq-content/pull/775/files#diff-098855c3aa217fca0cc56c9749bc1b52ef8aa0440ae9f5a6d039f285d59839fcR14
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tagged based thing is interesting - I think we should keep it, but maybe it makes more sense as part of the workflow itself? We definitely have to make sure we document it well.
| return fail!("#{self.class.model_being_retired} already retired") | ||
| elsif source.retiring? | ||
| return fail!("#{self.class.model_being_retired} already in the process of being retired") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be i18n if they are user facing.
| return fail!("#{self.class.model_being_retired} already retired") | |
| elsif source.retiring? | |
| return fail!("#{self.class.model_being_retired} already in the process of being retired") | |
| return fail!(_"%{model_being_retired} already retired" % self.class.model_being_retired) | |
| elsif source.retiring? | |
| return fail!(_"%{model_being_retired} already in the process of being retired" % self.class.model_being_retired) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is no user context here would that have to be N_() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting - I guess we can't do it then (even if we did N_() - there's nothing to do the actual translation and interpolation later)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is setting the message attribute of the miq_request_task record. I don't know if the UI is translating the message here, it is shown in the Services / Requests list:
If N_() would add these for translation later then it sounds like it would be worth it but I don't know if the UI is actually translating these
| if vm.ext_management_system | ||
| _log.info("#{log_prefix} not yet removed from provider...") | ||
| vm.queue_refresh | ||
| requeue_phase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need some caps here so this doesn't run forever. Not sure if time-based, count-based or both. Perhaps we have sensible defaults here for both, and then expose options to retire_execute allowing the author to override.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a TimeoutSeconds ASL state property that we could use, an internal state retry counter would have to be method specific but we could do that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok yeah - TimeoutSeconds makes sense, but that's a top-level thing (it's around the entire retire_execute). Similarly there's the whole Retries thing built-in to ASL.
Even so, I agree that we can document TimeoutSeconds as the way to prevent it from running forever (maybe even make it required?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be some disagreement between the ASL spec and AWS Step Functions on the default value for TimeoutSeconds:
ASL:
If not provided, the default value of "TimeoutSeconds" is 60.
Step Functions:
The default value is 99999999.
60 is way too short for provision_execute or retire_execute, we should at a minimum add our own default value for those builtin states, maybe 3600? I think automate retries once a minute 100 times or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree 60 is way too short - we can pick something reasonable, but high, as a default like 3600 globally, perhaps.
I also agree on builtins having their own separate default value - 3600 for this is a good choice (even if it matches the global value, I think having is explicitly defined as a separate default is better, in case we change the global later)
| end | ||
|
|
||
| def create_retired_notification! | ||
| notification_type = "#{self.class.model_being_retired.name.underscore}_retired" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to get here trying to retire something that doesn't have a notification type? e.g. if you try to retire a generic service? maybe we have an "other_retired" or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the subclasses of MiqRetireTask
vmdb(dev)> MiqRetireTask.subclasses.map(&:name)
=> ["VmRetireTask", "ServiceRetireTask", "OrchestrationStackRetireTask"]
Each of these has a retiring and retired (except orchestration_stack_retiring which is being fixed separately)
vmdb(dev)> NotificationType.where('name LIKE ? OR name LIKE ?', "%_retired", "%_retiring").pluck(:name)
=> ["service_retired", "service_retiring", "vm_retired", "vm_retiring", "orchestration_stack_retired"]
404adc3 to
3f9576c
Compare
3f9576c to
e6b790f
Compare
e6b790f to
3a40a31
Compare
Allow any service resource retirement task to be executed with embedded workflows
Retirement was only ever implemented in automate unlike
MiqProvision, https://github.com/ManageIQ/manageiq-content/tree/master/content/automate/ManageIQ/Infrastructure/VM/Retirement/StateMachines/Methods.class/__methods__This implements a simple
VmRetireTask::StateMachinethat can be called from workflowsTODO:
Related:
Depends on: