feat: Scaffold OptimizationJob CRD v1alpha1 types and clients#2624

Open
aniket2405 wants to merge 4 commits into kubeflow:master from aniket2405:feat/optimizationjob-crd
Conversation

@aniket2405

What this PR does / why we need it:

This PR lays the foundational Go API types for the new OptimizationJob CRD, which represents the backend controller implementation of KEP-46 (Hyperparameter Optimization APIs).

Specific changes:

  1. Added OptimizationJob, OptimizationJobSpec, Objective, TrialConfig, and Algorithm structs in pkg/apis/optimizer/v1alpha1.
  2. Updated hack/update-codegen.sh to include the new optimizer API group.
  3. Generated zz_generated.deepcopy.go along with the Kubernetes client, lister, and informer code for the v1alpha1 package.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Updates #2605

Checklist:

  • Docs included if any changes are user facing

@github-actions

🎉 Welcome to the Kubeflow Katib repo! 🎉

Thanks for opening your first PR! We're excited to have you onboard 🚀

Next steps:

Feel free to ask questions in the comments. Thanks again for contributing! 🙏

@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign andreyvelich for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Krishna-kg732

Hi @aniket2405,

Great foundational work on the OptimizationJob CRD! I was working on this as well, so I'd love to collaborate and help with reviews, feedback, or any other support you might need as this progresses.

Looking forward to working together!

Also, could you please sign your commits?
Thanks

}

// OptimizationJobSpec defines the desired state of OptimizationJob.
type OptimizationJobSpec struct {

Should the spec have an initializer config at the top level? We want to have a common initializer for all jobs/trials.

Author

Agreed, added an Initializer config.


// OptimizationJobStatus defines the observed state of OptimizationJob.
type OptimizationJobStatus struct {
// Add status fields here (e.g., Conditions, BestTrial, etc.) as the controller matures.

This should also integrate with the TrainJob status tracking feature.

Author

Ack, added it.

@akshaychitneni

Thanks @aniket2405. Could you start a KEP for this effort to finalize the features and approach?

@aniket2405
Author

> Hi @aniket2405,
>
> Great foundational work on the OptimizationJob CRD! I’d love to collaborate and help with reviews, feedback, or any other support you might need as this progresses as I was working on this as well

Thanks @Krishna-kg732
Certainly, collaboration and help are always welcome!

@Krishna-kg732

> Thanks @aniket2405. Could you start a KEP for this effort to finalize the features and approach

Thanks @aniket2405 — would you be open to starting a Slack thread so we can discuss potential features and improvements together before drafting the KEP? I think it would help us align on scope and approach. Also, if you already have a Google Doc prepared, could you please share it here as well?

Thanks!

@andreyvelich
Member

Thanks for this work @aniket2405!
As we discussed on the call yesterday, please start a Google Doc where we can identify the list of features we want to support in OptimizationJob, along with the API design: https://youtu.be/gdZIHK335Qg?t=3156

@aniket2405
Author

Certainly.
I'll share the doc with some initial pointers tomorrow, and we can set up a Slack channel to discuss it as well.

Signed-off-by: aniket2405 <aniketshaha2001@gmail.com>
@aniket2405
Author

I have a draft document ready. I will fine-tune some points and share it by tomorrow.

@aniket2405
Author

Hi @andreyvelich @akshaychitneni
I've started a Google Doc to discuss and finalize the features for this effort:
http://bit.ly/4aGAWsu

Some parts still need to be updated, but I feel it's a good starting point.

Please go through it and suggest improvements.

@aniket2405 force-pushed the feat/optimizationjob-crd branch from 41511af to 68b7f47 on February 28, 2026 19:52

@Krishna-kg732

Hi @aniket2405, great work on the scaffolding! I've been going through the types and the design doc, and had a couple of observations:

1. SearchSpace typing

Currently SearchSpace is defined as map[string]string. However, the doc (Section 4.1) and the example YAML (Section 5) show a strongly typed structure with continuous, categorical, and discrete parameter types, each with its own fields (min/max/distribution, choices, values). The SDK's Search.loguniform() and Search.choice() APIs also expect this structure.

Should we introduce typed structs here? Something like:

type ParameterSpec struct {
    Name        string                `json:"name"`
    Continuous  *ContinuousParameter  `json:"continuous,omitempty"`
    Categorical *CategoricalParameter `json:"categorical,omitempty"`
    Discrete    *DiscreteParameter    `json:"discrete,omitempty"`
}

type ContinuousParameter struct {
    Min          float64 `json:"min"`
    Max          float64 `json:"max"`
    Distribution string  `json:"distribution,omitempty"` // uniform, logUniform
}

type CategoricalParameter struct {
    Choices []string `json:"choices"`
}

type DiscreteParameter struct {
    Values []float64 `json:"values"`
}

And then in the spec: SearchSpace []ParameterSpec instead of map[string]string. This would align with Discussion 2 in the design doc, and also be consistent with how Katib's existing ParameterSpec + FeasibleSpace works (but cleaner).
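
A minimal sketch of how the proposed typed search space might be decoded and checked, assuming the structs above. The `Validate` method is a hypothetical helper that is not part of this PR, and the parameter names in the sample JSON are illustrative only. It enforces the "exactly one kind per parameter" rule that the pointer-per-kind layout implies but does not guarantee by itself.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// The four types below mirror the structs proposed in the comment above.
type ContinuousParameter struct {
	Min          float64 `json:"min"`
	Max          float64 `json:"max"`
	Distribution string  `json:"distribution,omitempty"` // uniform, logUniform
}

type CategoricalParameter struct {
	Choices []string `json:"choices"`
}

type DiscreteParameter struct {
	Values []float64 `json:"values"`
}

type ParameterSpec struct {
	Name        string                `json:"name"`
	Continuous  *ContinuousParameter  `json:"continuous,omitempty"`
	Categorical *CategoricalParameter `json:"categorical,omitempty"`
	Discrete    *DiscreteParameter    `json:"discrete,omitempty"`
}

// Validate is a hypothetical helper (an assumption, not code from the PR).
// It checks that the name is set, that exactly one parameter kind is
// populated, and that a continuous range has min < max.
func (p ParameterSpec) Validate() error {
	if p.Name == "" {
		return errors.New("parameter name must not be empty")
	}
	set := 0
	if p.Continuous != nil {
		set++
	}
	if p.Categorical != nil {
		set++
	}
	if p.Discrete != nil {
		set++
	}
	if set != 1 {
		return fmt.Errorf("parameter %q: exactly one of continuous, categorical, discrete must be set, got %d", p.Name, set)
	}
	if p.Continuous != nil && p.Continuous.Min >= p.Continuous.Max {
		return fmt.Errorf("parameter %q: min must be less than max", p.Name)
	}
	return nil
}

func main() {
	// Illustrative search space; the names and values are made up.
	raw := []byte(`[
	  {"name": "lr", "continuous": {"min": 0.0001, "max": 0.1, "distribution": "logUniform"}},
	  {"name": "optimizer", "categorical": {"choices": ["adam", "sgd"]}},
	  {"name": "batch_size", "discrete": {"values": [16, 32, 64]}}
	]`)
	var space []ParameterSpec
	if err := json.Unmarshal(raw, &space); err != nil {
		panic(err)
	}
	for _, p := range space {
		fmt.Println(p.Name, "valid:", p.Validate() == nil)
	}
}
```

In a real CRD the same rule could instead be expressed in the generated schema or in an admission webhook; the Go method here only illustrates the constraint.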

2. Direction as string vs. enum

Objective.Direction is currently a bare string. It might be worth defining a named string type (similar to Katib's ObjectiveType) so the allowed values (minimize/maximize) are explicit:

type ObjectiveDirection string

const (
    ObjectiveDirectionMinimize ObjectiveDirection = "minimize"
    ObjectiveDirectionMaximize ObjectiveDirection = "maximize"
)

Happy to help with any of these if you'd like to collaborate! Looking forward to working together on this.

cc: @akshaychitneni

@aniket2405
Author

Yes, @Krishna-kg732
We definitely plan to make it SearchSpace []ParameterSpec, with a ParameterSpec type.

I kept it open for discussion (Discussion 2 in the KEP doc) to get input from @akshaychitneni or @andreyvelich on whether there's a better way to do it. Otherwise, Katib's existing ParameterSpec implementation is the way to go!

Agreed with Objective.Direction to be an enum, rather than a bare string. I'd also like to add a kubebuilder validation marker on the enum so that the Kubernetes API server can reject any invalid YAML immediately before it even hits our controller.

@aniket2405
Author

aniket2405 commented Mar 10, 2026

@akshaychitneni Did you get some time to go over the initial draft of the KEP, or would you like to discuss this once we've come up with in-depth CRD transition details? (I'm going over the existing Katib API and will be adding a few points over the next day or two.) Thanks for all the help!

@aniket2405
Author

Hi @akshaychitneni
I made a few changes to the doc after your initial thoughts and addressed your comments: http://bit.ly/4aGAWsu
There are a few things that still need to be fine-tuned (especially push_metrics, now that the corresponding CR for the progress API is merged).

Please go through it and point out any shortcomings. Since it's a pretty extensive doc, there might be some inconsistencies. I'm actively rectifying them.
Hopefully, we can close on the requirements soon.
Please let me know what you think.

cc: @andreyvelich

@llhimanshull

Hi @aniket2405, great work on this scaffold! I'm also interested in Project 2 for GSoC. I'd love to collaborate — happy to help with the initializer config feedback @akshaychitneni mentioned, or with TrainJob status integration. Let me know how I can contribute.

5 participants