Skip to content

feat(ws): add ws counts to backend wsk model #368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: notebooks-v2
Choose a base branch
from

Conversation

roee1313
Copy link

This PR solves issue #352

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kimwnasptd for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek
Copy link
Member

/ok-to-test

@google-oss-prow google-oss-prow bot added size/L and removed size/S labels May 27, 2025
@roee1313 roee1313 force-pushed the RHOAIENG-25098 branch 2 times, most recently from b47474c to 15e589e Compare May 28, 2025 11:55
Copy link
Contributor

@andyatmiami andyatmiami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to (briefly) talk with Mathew during the WG call today (Thursday, May 29th) and he stated he was comfortable embedding this "counts" information more naturally inline with the response schema (and not worrying about adhering to a 1:1 structural mapping w.r.t how the information is presented from k8s)

The below image represents my proposal on how this "embedding" could work. A couple notes:

  • The same schema should be applied to the "list" as well as the "get-by-id" endpoints
    • For sake of brevity - I am only showing the "get-by-id" payload below
  • I opted to "co-locate" the counts information closest to relevant data within the response. This is certainly most clean for the values arrays - but then leads to a root workspacesCount attribute at the root of the WorkspaceKind element as there isn't a more suitable place for it.
  • For sake of consistency/readability - I chose to re-use the same attribute name throughout (workspacesCount) - with the "pattern" being <child resource name (plural)>Count... in the event this paradigm crops up elsewhere in the API - we could reuse this design philosophy
  • Certainly open to other's opinions on if this is the most appropriate design. I had considered a counts: {....} object as well... but chose not to advocate for that option as its too redundant/nested (having to repeat context like imageConfig as well as the IDs in order for a client to then have to perform the mapping to relevant object to display inline as we do in our table view today)
image

FYI: @thesuperzapper @ederign @paulovmr

Copy link

@andyatmiami: changing LGTM is restricted to collaborators

In response to this:

I managed to (briefly) talk with Mathew during the WG call today (Thursday, May 29th) and he stated he was comfortable embedding this "counts" information more naturally inline with the response schema (and not worrying about adhering to a 1:1 structural mapping w.r.t how the information is presented from k8s)

The below image represents my proposal on how this "embedding" could work. A couple notes:

  • The same schema should be applied to the "list" as well as the "get-by-id" endpoints
    • For sake of brevity - I am only showing the "get-by-id" payload below
  • I opted to "co-locate" the counts information closest to relevant data within the response. This is certainly most clean for the values arrays - but then leads to a root workspaces_count attribute at the root of the WorkspaceKind element as there isn't a more suitable place for it.
  • For sake of consistency/readability - I chose to re-use the same attribute name throughout (workspaces_count) - with the "pattern" being <child resource name>_counts... in the event this paradigm crops up elsewhere in the API - we could reuse this design philosophy
  • Certainly open to other's opinions on if this is the most appropriate design. I had considered a counts: {....} object as well... but chose not to advocate for that option as its too redundant/nested (having to repeat context like imageConfig as well as the IDs in order for a client to then have to perform the mapping to relevant object to display inline as we do in our table view today)
image

FYI: @thesuperzapper @ederign @paulovmr

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@roee1313
Copy link
Author

roee1313 commented Jun 7, 2025

Hey @andyatmiami
I added the required fixes, and they are ready for to be checked,
Thanks!

@@ -153,3 +156,12 @@ func buildOptionRedirect(redirect *kubefloworgv1beta1.OptionRedirect) *OptionRed
Message: message,
}
}

func FindConfigWorkspacesById(configs []kubefloworgv1beta1.OptionMetric, targetId string) int32 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For optimal efficiency - I worry about this pattern of iterating through an array while we are already iterating over an array that gives us O(n^2)

We can improve this by iterating over the respective []kubefloworgv1beta1.OptionMetric once each (for imageConfig and podConfig) and building up a map object that we can then pass into the build...ConfigValues

This ensures while we are iterating over the PodConfig/ImageConfig .Values - we can get O(1) lookups for the respective counts.

Although being done in a slightly different context - you can see examples of "map initialization" in theNewWorkspaceKindModelFromWorkspaceKind function:

So, feels like this new use case also "fits"...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @andyatmiami
That is a good point actually :) thanks for that!
Just fixed that please review when available
Thanks!

@roee1313 roee1313 force-pushed the RHOAIENG-25098 branch 4 times, most recently from 0e43df9 to f8c3234 Compare June 10, 2025 13:06
@andyatmiami
Copy link
Contributor

/lgtm

Re-testing code after latest changes:

  • verified GET api/v1/workspacekinds includes workspacesCount in response for all elements
  • verified GET api/v1/workspacekinds/{name} includes workspacesCount in response

also confirmed Swagger UI properly includes these new attributes in schema:
image

@andyatmiami
Copy link
Contributor

@thesuperzapper - PR ready / looking for your review/approval now - thanks!

Copy link
Contributor

@andyatmiami andyatmiami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roee1313

I had a chance to speak with @thesuperzapper about this PR - and here is the (hopefully!) last batch of feedback we have to give in order to get this PR merged 🤞

In the last round of edits - we made changes to expose a workspacesCount attribute "inline" / co-located amongst the relevant JSON object.

We want to tweak this structure slightly.. instead of workspacesCount: <int> - we want the following structure:

clusterMetrics: {
    workspacesCount: <int>
}

This change will make more sense as I open up a couple other related issues based on recent discussions with Mathew - but the high level motivations are as such:

  • clusterMetrics becomes a "bucket" to hold other future metrics in a well-understood structure
  • We will eventually add the concept of namespaceMetrics as well - to better delineate metrics that span the cluster (relevant to Joel) and metrics specific to a given namespace (relevant to Bella)
    • One of the driving forces behind this change is the "dual nature" of use cases the GET /workspacekinds API find itself in:
      • For Bella, when listing WorkspaceKinds as part of the Create Workspace flow, she may not have permissions/ability to view ALL workspaces... so if we only had a SINGLE count exposed, this could be confusing as she may not see the count aligned with the number of workspaces accessible to her
      • For Joel, when listing WorkspaceKinds as part his workflow in the Workspace Kinds screen, he should have permissions/ability to view ALL workspaces... so presenting this clusterMetrics count is more natural / aligned with what he would find if clicking around in the UI

A few other implementation notes:

  • while not possible to observe today, we would want omitempty defined on the struct for clusterMetrics (however, as part of this PR, workspacesCount will always be defined
  • Please add a //TODO somewhere relevant in the code to highlight the need to extend your solution to apply to namespaceMetrics (in the future, after I write that issue up and get it prioritized).. the need to calculate namespaceMetrics would be triggered via a request to GET /workspacekinds having a ?namespaceFilter query parameter defined on it that would in some way be passed down into NewWorkspaceKindModelFromWorkspaceKind.. you don't need to implement these changes - just sharing for transparency/understanding)

@roee1313 roee1313 force-pushed the RHOAIENG-25098 branch 3 times, most recently from ceec598 to 88d0e77 Compare June 25, 2025 07:55
Labels []OptionLabel `json:"labels"`
Hidden bool `json:"hidden"`
Redirect *OptionRedirect `json:"redirect,omitempty"`
ClusterMetrics clusterMetrics `json:"clusterMetrics"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ClusterMetrics clusterMetrics `json:"clusterMetrics"`
ClusterMetrics clusterMetrics `json:"clusterMetrics,omitempty"`

Labels []OptionLabel `json:"labels"`
Hidden bool `json:"hidden"`
Redirect *OptionRedirect `json:"redirect,omitempty"`
ClusterMetrics clusterMetrics `json:"clusterMetrics"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ClusterMetrics clusterMetrics `json:"clusterMetrics"`
ClusterMetrics clusterMetrics `json:"clusterMetrics,omitempty"`

@@ -60,6 +62,10 @@ func NewWorkspaceKindModelFromWorkspaceKind(wsk *kubefloworgv1beta1.WorkspaceKin
Hidden: ptr.Deref(wsk.Spec.Spawner.Hidden, false),
Icon: iconRef,
Logo: logoRef,
// TODO: add namespaceMetrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: add namespaceMetrics
// TODO: in the future will need to support including exactly one of clusterMetrics or namespaceMetrics based on request context

Comment on lines +91 to +97
func buildOptionMetricsMap(metrics []kubefloworgv1beta1.OptionMetric) map[string]int32 {
resultMap := make(map[string]int32)
for _, metric := range metrics {
resultMap[metric.Id] = metric.Workspaces
}
return resultMap
}
Copy link
Contributor

@andyatmiami andyatmiami Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am legitimately torn here on a subtle and potentially in-this-context-unimportant implementation decision 😇

for _, metric := range metrics will perform a copy on the elements defined in metrics

To avoid the copy - you can declare the for loop as: for i := range metrics

However, I am then unsure how the body of the loop should be constructed

avoid copy (readable): metric := &metrics[i]

  • as this is a pointer - if in the future someone modified this implementation for some reason to change this value - it would also change the actual element in the slice

avoid copy (less readable): resultMap[metrics[i].Id] = metrics[i].Workspaces

  • less likely for someone in future to "accidentally" re-assign a value - but also not as visually pleasing to read

explicit copy: metric := metrics[i]

  • intent more clear as compared to for _, metric := range metrics
  • as these structs are small - copy overhead not a concern

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given these objects we are handling are inherently small (and bounded to be reasonably sure they will always be small) - the code here as written is sufficient.

@google-oss-prow google-oss-prow bot added area/backend area - related to backend components area/v2 area - version - kubeflow notebooks v2 labels Jul 6, 2025
@thesuperzapper thesuperzapper changed the title feat(ws): Notebooks 2.0 // Backend // add count to workspacekind sche… feat(ws): add ws counts to backend wsk model Jul 10, 2025
@google-oss-prow google-oss-prow bot added the lgtm label Jul 11, 2025
@andyatmiami
Copy link
Contributor

/lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/backend area - related to backend components area/v2 area - version - kubeflow notebooks v2 lgtm ok-to-test size/L
Projects
Status: Needs Triage
Development

Successfully merging this pull request may close these issues.

3 participants