Conversation

@kimwnasptd
Member

Closes #154

Note: This should be merged/reviewed only after we've merged the PRs for #154 (comment)

/cc @juliusvonkohout @thesuperzapper @andyatmiami

kimwnasptd and others added 8 commits October 28, 2025 08:33
Co-authored-by: Julius von Kohout <[email protected]>
Signed-off-by: Kimonas Sotirchos <[email protected]>
@kimwnasptd
Member Author

Thanks for the review @juliusvonkohout! Does the script functionality also look good to you, or do you have any suggestions?

@juliusvonkohout
Member

> Thanks for the review @juliusvonkohout! Does the script functionality also look good to you, or do you have any suggestions?

For me it is fine and optional so
/lgtm

@kimwnasptd
Member Author

/ok-to-test

Contributor

@andyatmiami left a comment

Really great work getting this spec'd out @kimwnasptd

I have some general questions/comments/observations - but nothing imho critically impacting ...

Will wait for your response/follow up on some of the outstanding questions before I engage in attempting to run this bad boy 💯

echo -e "\nWill remove namespaced resources with labels: $label"
for resource in $namespace_resources; do
    echo "Removing all $resource objects..."
    kubectl delete -n kubeflow -l $label $resource
Contributor

Would we want the --wait=true flag specified on this to ensure the resource(s) are really really gone before proceeding?

could also then specify --timeout=?s to something reasonable (whatever that would be!)

Contributor

Not totally sold myself on this - but I wonder if we'd also want to throw a --ignore-not-found=true flag on there as well..

in the event this script bailed for a wild variety of reasons outside our control... ignore-not-found would make subsequent runs less problematic

Member Author

Yes, both are good suggestions as the assumptions of users would be that previous resources would have been completely removed.

I'll update accordingly

Member Author

Regarding --ignore-not-found, I think we shouldn't need this one.

kubectl delete does not return a non-zero exit code when we delete objects by label and nothing matches. It does, though, when we try to delete an object by name and that object does not exist. That's why this line has the ||
https://github.com/kubeflow/dashboard/pull/162/files#diff-afd68f06993c6a5670eb569a6764c40acaafa880d46db67999c93e5adadb1a14R48
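
The difference between the two deletion styles discussed in this thread can be sketched as follows. This is a minimal illustration and not the script from the PR: the function names `delete_by_label` and `delete_by_name` and the `60s` timeout are assumptions made for the example, while `--wait`, `--timeout` and `--ignore-not-found` are existing `kubectl delete` flags.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Delete every object of one resource type that carries a label.
# With a label selector, kubectl exits 0 even when nothing matches,
# so repeated runs need no extra guard.
# (Function name and timeout value are hypothetical.)
delete_by_label() {
  local namespace="$1" label="$2" resource="$3"
  kubectl delete -n "$namespace" -l "$label" "$resource" \
    --wait=true --timeout=60s
}

# Delete a single object by name. Without --ignore-not-found, kubectl
# returns a non-zero exit code when the object is already gone.
# (Function name and timeout value are hypothetical.)
delete_by_name() {
  local namespace="$1" resource="$2" name="$3"
  kubectl delete -n "$namespace" "$resource" "$name" \
    --ignore-not-found=true --wait=true --timeout=60s
}
```

A call such as `delete_by_label kubeflow "$label" deployments` would then block until the matching objects are gone or the timeout expires. Whether 60 seconds is a reasonable timeout depends on the cluster.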

for resource in $cluster_resources; do
    echo "Removing all $resource objects..."
    kubectl delete -l $label $resource
Contributor

Would we want the --wait=true flag specified on this to ensure the resource(s) are really really gone before proceeding?

could also then specify --timeout=?s to something reasonable (whatever that would be!)

Contributor

Not totally sold myself on this - but I wonder if we'd also want to throw a --ignore-not-found=true flag on there as well..

in the event this script bailed for a wild variety of reasons outside our control... ignore-not-found would make subsequent runs less problematic

Member Author

Added be7d1cf to address this one

* The script will not remove any CR (Custom Resource) or CRD (Custom Resource Definition), to ensure no data loss
* Only Kubernetes resources relevant to the Deployments will be removed (Deployment, ServiceAccount, Service etc)
* The [`NetworkPolicy`](https://github.com/kubeflow/manifests/blob/v1.10-branch/common/networkpolicies/base/centraldashboard.yaml) from the `kubeflow/manifests` repo, of the CentralDashboard, will be removed
2. Install the manifests from this repository for the Dashboard, Profiles Controller and PodDefaults webhook
Contributor

Asking b/c I have no idea - but is it "safe" to assume the overlays this script is using are the right ones when deploying to any potential client (?)

Member Author

I'd say yes, for:

  1. PodDefaults have only the cert-manager overlay, which is currently the only supported way for creating the Webhook certificate
  2. Profiles use the kubeflow overlay, which is what we want to install in this case (platform)

The only debatable one is the dashboard: the script uses the kserve overlay (it had a bug and was using istio), which in turn expects KServe to be deployed.

Adding a small note in the README. LMKWDYT

echo "Installing the updated Dashboard V2 components."
echo "-----------------------------------------------"

echo -e "\nApplying Dashboard component..."
Contributor

Minor nitpick - feel free to disagree.. but it seems to me we should deploy Dashboard last as it's unusable until profile-controller is running...

Member Author

SGTM! Addressed in be7d1cf
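
The ordering suggested here could look roughly like the sketch below. It is an illustration only, not the code from be7d1cf: the helper names, the `*_DIR` and `*_DEPLOYMENT` variables, the `120s` timeout and the use of `kubectl apply -k` are all assumptions, and the relative order of the first two components is arbitrary. The one point taken from the thread is that the Dashboard goes last.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Apply one kustomize directory, then block until the named Deployment
# has finished rolling out. (Helper name and timeout are hypothetical.)
apply_and_wait() {
  local kustomize_dir="$1" deployment="$2" namespace="${3:-kubeflow}"
  kubectl apply -k "$kustomize_dir"
  kubectl rollout status -n "$namespace" "deployment/$deployment" --timeout=120s
}

# Install dependencies first and the Dashboard last, since the Dashboard
# is unusable until the profile controller is running.
# The *_DIR and *_DEPLOYMENT variables are hypothetical placeholders
# that the caller must set; they are not paths from the repository.
install_all() {
  apply_and_wait "$PODDEFAULTS_DIR" "$PODDEFAULTS_DEPLOYMENT"
  apply_and_wait "$PROFILES_DIR" "$PROFILES_DEPLOYMENT"
  apply_and_wait "$DASHBOARD_DIR" "$DASHBOARD_DEPLOYMENT"
}
```

Waiting on each rollout before moving on means a failure in an earlier component stops the script before the Dashboard is applied.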

Co-authored-by: Andy Stoneberg <[email protected]>
Signed-off-by: Kimonas Sotirchos <[email protected]>
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from juliusvonkohout. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kimwnasptd and others added 3 commits October 30, 2025 09:55
Co-authored-by: Andy Stoneberg <[email protected]>
Signed-off-by: Kimonas Sotirchos <[email protected]>
@kimwnasptd kimwnasptd force-pushed the feat-migration-script branch from 939ac4b to 5cf0e01 Compare October 30, 2025 09:47

Development

Successfully merging this pull request may close these issues.

Script for automating the upgrade to the dashboard v2 components from kubeflow/kubeflow

3 participants