v1.0 InferencePool API Review #1173

Open
wants to merge 7 commits into
base: release-0.5
from

Conversation

capri-xiyue
Contributor

@capri-xiyue capri-xiyue commented Jul 16, 2025

What type of PR is this?
/kind api-change

What this PR does / why we need it:
This PR is a diff of /apis from alpha (main branch) to v1.0 (release-1.0 branch).

Note: This PR is purely to facilitate review; it is not intended to be merged.

Changes:

  1. Change the API group from "inference.networking.x-k8s.io" to "inference.networking.k8s.io".
  2. Change InferencePool.Selector from map[string]string to a struct for future flexibility in v1. See "Upgrade the inferencePool selector to a struct from a map" (#1330).
  3. Simplify EndpointPickerConfig (remove the whole inline struct).
  4. Change TargetPortNumber int32 to TargetPorts []Port. See "feat: TargetPortNumber int32 to become TargetPorts []Port" (#1354).
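
As a rough illustration of change 2, the sketch below compares the JSON shape of a bare label map with that of a wrapper struct. The type names and the matchLabels field are assumptions made for this example; they are not taken from the PR diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// alphaSpec models the alpha shape: the selector is a bare label map.
type alphaSpec struct {
	Selector map[string]string `json:"selector"`
}

// labelSelector is a hypothetical wrapper struct. New optional fields could
// be added next to MatchLabels later without changing the type of "selector".
type labelSelector struct {
	MatchLabels map[string]string `json:"matchLabels,omitempty"`
}

// v1Spec models the v1 shape: the selector is a struct.
type v1Spec struct {
	Selector labelSelector `json:"selector"`
}

// encode marshals any value to a JSON string.
func encode(v any) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	labels := map[string]string{"app": "vllm"}
	specs := []any{
		alphaSpec{Selector: labels},
		v1Spec{Selector: labelSelector{MatchLabels: labels}},
	}
	for _, spec := range specs {
		out, err := encode(spec)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

With a map, every key under selector is a label, so there is no room for a future sibling field; the struct form leaves that room.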

/assign @robscott

@k8s-ci-robot k8s-ci-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 16, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 16, 2025
@k8s-ci-robot
Contributor

Hi @capri-xiyue. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 16, 2025
@capri-xiyue
Contributor Author

/hold
Don't merge as it is just for api review.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 16, 2025

netlify bot commented Jul 16, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 6c02159
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/687811ca8fb1b50008233e1f
😎 Deploy Preview https://deploy-preview-1173--gateway-api-inference-extension.netlify.app

@capri-xiyue capri-xiyue force-pushed the capri-xiyue/capri-xiyue-v1-api-review branch from 5317094 to 4ffb5f6 Compare July 16, 2025 20:36
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 16, 2025
@capri-xiyue capri-xiyue reopened this Jul 16, 2025
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 16, 2025
@capri-xiyue
Contributor Author

/hold, this should not get merged

@capri-xiyue capri-xiyue force-pushed the capri-xiyue/capri-xiyue-v1-api-review branch from 77371fe to 4ffb5f6 Compare July 16, 2025 20:53
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 16, 2025
@capri-xiyue capri-xiyue reopened this Jul 16, 2025
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 16, 2025
@robscott
Member

Note: This PR is NOT intended to merge, it is entirely for the purpose of API review.

/cc @aojea @danwinship @thockin

@danehans danehans added this to the v1.0 (InferencePool GA) milestone Aug 14, 2025
// associated with the InferencePool, and the status of the InferencePool with respect to
// each parent.
//
// A maximum of 32 Gateways will be represented in this list. When the list contains

I do not understand the comment about "kind: Status, name: default" -- where are those fields? They are not in PoolStatus.

I also don't understand why you need a canary entry to signal "nothing to see" -- why is an empty list not sufficient? I find the imposition on the controllers to be very awkward.

Member

This was copied from Gateway API. Generally k8s users don't seem to treat the absence of status as a bad thing, so they may not realize their InferencePool is not being used/referenced.

This is particularly useful in Gateways where you made a typo on gatewayClass and nothing picks up/implements your Gateway. Having a baseline signal that a Gateway is waiting for a controller to pick it up is useful.

A similar pattern is useful for Routes if you make a typo referring to a Gateway - there's no controller empowered to warn you of this since each Gateway controller only populates status on Routes that are attached to their Gateway. If there's a reference to a nonexistent Gateway, there's nothing to warn you about that other than some kind of default status.

With that said, this doesn't seem quite as useful on InferencePool. If someone makes a typo when pointing a route to an InferencePool, they'll get a warning in the status of that Route. This status can be useful in indicating that the InferencePool has not actually been implemented yet, but an empty list may suffice here.

cc @danehans

Contributor

I do like the explicitness of the current approach, but I don't have a strong opinion here.

Contributor Author

I'll mark this as resolved for now, since the majority seems OK with the current approach. Feel free to re-open it if further discussion is needed.

I find this very brittle and awkward. Doesn't this argue instead for something like an MAP which detects an empty list and inserts the canary?

Contributor

@danehans danehans Aug 20, 2025

xref kubernetes-sigs/gateway-api#3738 for adding a similar default status condition to HTTPRoute.

Member

As a quick update from Slack - we discussed this and decided to remove the defaulting for now. In the corresponding Gateway issue we've been looking into MAP as a potential solution as well. Although it's not clear that will work as we'd like, I think it makes sense to hold off on this until we have more time to look into a MAP-based solution.

Contributor Author

Can I resolve this with the agreement to remove the defaulting for now?

Contributor

#1427 is removing the defaulting.

@capri-xiyue
Contributor Author

/reopen

@kfswain kfswain reopened this Aug 19, 2025
@k8s-ci-robot
Contributor

@capri-xiyue: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@thockin thockin left a comment

I am LGTM except for a few nits

// +listType=map
// +listMapKey=type
// +kubebuilder:validation:MaxItems=8
// +kubebuilder:default={{type: "Accepted", status: "Unknown", reason:"Pending", message:"Waiting for controller", lastTransitionTime: "1970-01-01T00:00:00Z"}}

Is this right? I don't understand it

Contributor Author

I will further update it once #1427 gets merged.

Contributor

#1427 removes the status condition defaulting.

@JoelSpeed

I lost the comment thread, but I believe I saw something along the lines of:

A decision has been made to stick to pointers for optional structs, but pointers only when required for scalars

If that's the case, you can set KALs config for optionalfields.omitzero.policy: Forbid and it will behave as you intend.

Omitzero support is only for structs in KAL right now, and with it set to forbid, it will force the omitempty route with a pointer for all optional structs

@danehans
Contributor

FYI #1427 is a PR under review that implements several API changes based on feedback from API reviewers here and discussions among the community, maintainers, etc.

@danehans
Contributor

danehans commented Aug 22, 2025

Given it is consistent, I am satisfied. Implementations MUST use RV preconditions. If there is a v2, I would suggest to inline the Ref here and use those fields as the map key.

@thockin the challenge I see with inlining is that no field by itself can be used as the map key:

type ParentReference struct {
	// Group is the group of the referent API object. When unspecified, the referent is assumed
	// to be in the "gateway.networking.k8s.io" API group.
	//
	// +optional
	// +kubebuilder:default="gateway.networking.k8s.io"
	Group *Group `json:"group,omitempty"`

	// Kind is the kind of the referent API object. When unspecified, the referent is assumed
	// to be a "Gateway" kind.
	//
	// +optional
	// +kubebuilder:default=Gateway
	Kind *Kind `json:"kind,omitempty"`

	// Name is the name of the referent API object.
	//
	// +required
	Name ObjectName `json:"name,omitempty"`

	// Namespace is the namespace of the referenced object. When unspecified, the local
	// namespace is inferred.
	...
	// +optional
	Namespace *Namespace `json:"namespace,omitempty"`
}

All fields are needed to ensure the uniqueness of a parent.
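
To make the uniqueness point concrete, here is a minimal sketch that resolves the documented defaults and compares two references by their full identity. The parentRef and parentKey types and the keyFor helper are hypothetical stand-ins written for this example (plain strings replace the Group, Kind, ObjectName and Namespace types); none of them is part of the API.

```go
package main

import "fmt"

// parentRef mirrors the fields of the ParentReference quoted above, with
// plain strings standing in for the named API types.
type parentRef struct {
	Group     *string
	Kind      *string
	Name      string
	Namespace *string
}

// parentKey is a hypothetical comparable key holding the full identity.
type parentKey struct {
	Group, Kind, Namespace, Name string
}

// keyFor applies the defaults described in the field comments and returns
// the resolved identity. localNamespace is the namespace of the
// InferencePool that carries the reference.
func keyFor(ref parentRef, localNamespace string) parentKey {
	k := parentKey{
		Group:     "gateway.networking.k8s.io",
		Kind:      "Gateway",
		Namespace: localNamespace,
		Name:      ref.Name,
	}
	if ref.Group != nil {
		k.Group = *ref.Group
	}
	if ref.Kind != nil {
		k.Kind = *ref.Kind
	}
	if ref.Namespace != nil {
		k.Namespace = *ref.Namespace
	}
	return k
}

func main() {
	infra := "infra"
	a := parentRef{Name: "gw"}
	b := parentRef{Name: "gw", Namespace: &infra}

	// Same name, different namespace: Name alone cannot serve as the key.
	fmt.Println(a.Name == b.Name)                             // true
	fmt.Println(keyFor(a, "default") == keyFor(b, "default")) // false
}
```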

@capri-xiyue
Contributor Author

I lost the comment thread, but I believe I saw something along the lines of:

A decision has been made to stick to pointers for optional structs, but pointers only when required for scalars

If that's the case, you can set KALs config for optionalfields.omitzero.policy: Forbid and it will behave as you intend.

Omitzero support is only for structs in KAL right now, and with it set to forbid, it will force the omitempty route with a pointer for all optional structs

I tried it with omitzero set to Forbid: https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1438/files#diff-8d7db842a7673f66c7db001de9fe07ce67b67e21d8cbce12394fde46eea1e5b7R36. I still need to ignore kubeapilinter, otherwise it also reports optionalfields: field Kind does not allow the zero value. The field does not need to be a pointer. (kubeapilinter) on Kind *Kind `json:"kind,omitempty"` for an optional pointer struct where the zero value is not allowed.

@k8s-ci-robot
Contributor

@capri-xiyue: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-gateway-api-inference-extension-test-unit-main 33c8e89 link true /test pull-gateway-api-inference-extension-test-unit-main

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@robscott
Member

@thockin @aojea I believe we've addressed all your feedback, thanks! PTAL

@JoelSpeed

I still need to ignore kubeapilinter, otherwise it also reports optionalfields: field Kind does not allow the zero value. The field does not need to be a pointer. (kubeapilinter) on Kind *Kind `json:"kind,omitempty"` for an optional pointer struct where the zero value is not allowed.

Kind is not a struct, it is a string. You have a non-zero minimum length marker on Kind, so why does it need to be a pointer? An empty string tells the Go client that the field was not present when the object was admitted.

// Status defines the observed state of the InferencePool.
//
// +optional
//nolint:kubeapilinter // status should not be a pointer.

Why not? It's the same as any other struct

Contributor Author

We followed the existing k8s convention. See https://github.com/kubernetes/api/blob/master/batch/v1/types.go#L84

//
// +optional
// +kubebuilder:default=Service
//nolint:kubeapilinter // ignore kubeapilinter here as we want to use pointer for optional struct.

It's a string, not a struct, this should not be ignored

Contributor Author

// unspecified (defaults to "Service").
//
// +optional
//nolint:kubeapilinter // ignore kubeapilinter here as we want to use pointer for optional struct.

Not a struct, it's an integer. I know we wanted to not allow 0, but this isn't the right way to exempt this. Add an exception to the golangci-lint config with the explicit error message.

Right now you're ignoring all possible KAL issues and that might mask other problems with the field

Contributor Author

Zero means all ports by convention, and we don't want to use zero to indicate "not set", so we want to add an exception here. I will change the comment, as it is not accurate.
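
A minimal sketch of why a pointer helps when zero carries its own meaning: with *int32, a decoder can tell an absent field from an explicit 0. The portRef type and its number field are hypothetical names used only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// portRef is a hypothetical type for illustration only. The pointer lets a
// client distinguish "field absent" from "field explicitly set to 0".
type portRef struct {
	Number *int32 `json:"number,omitempty"`
}

// describe decodes raw JSON and reports whether the port was set.
func describe(raw string) (string, error) {
	var p portRef
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return "", err
	}
	if p.Number == nil {
		return "unset", nil
	}
	return fmt.Sprintf("set to %d", *p.Number), nil
}

func main() {
	for _, raw := range []string{`{}`, `{"number":0}`, `{"number":8000}`} {
		state, err := describe(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(raw, "->", state)
	}
}
```

With a plain int32 and omitempty, the first two inputs would decode to the same value, which is the ambiguity the exception is meant to avoid.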

//
// +optional
// +kubebuilder:default="FailClose"
//nolint:kubeapilinter // ignore kubeapilinter here as we want to use pointer for optional struct.

Not a struct, it's a string. Enum validation prevents the empty string, this doesn't need to be a pointer.

Go clients can check if empty to assert whether the string was provided or not
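
The empty-string check can be sketched as follows, assuming validation on the server rejects an empty value so that "" can only mean "not provided". The failureMode and pickerRef types and the failureMode JSON key are hypothetical names for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// failureMode is a hypothetical named string type standing in for an
// enum-validated field.
type failureMode string

// pickerRef is a hypothetical struct with a non-pointer optional string.
type pickerRef struct {
	FailureMode failureMode `json:"failureMode,omitempty"`
}

// provided reports whether the field was present in the decoded object.
// This relies on the empty string never being an admitted value.
func provided(raw string) (bool, error) {
	var r pickerRef
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return false, err
	}
	return r.FailureMode != "", nil
}

func main() {
	for _, raw := range []string{`{}`, `{"failureMode":"FailOpen"}`} {
		ok, err := provided(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(raw, "provided:", ok)
	}

	// The zero value is dropped on encode because of omitempty.
	out, err := json.Marshal(pickerRef{})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {}
}
```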

Contributor Author

@capri-xiyue capri-xiyue Aug 23, 2025

Do we consider a type such as type EndpointPickerFailureMode string to be a built-in string or a struct? (In reality it is neither a struct nor a built-in type, but it is probably closer to a string when it comes to serialization/deserialization.) cc @robscott @kfswain. As we are about to cut the RC, I can send a PR for the fix if we decide to make a last-minute change.

Contributor Author

#1444 is ready to merge if we want to change it to non-pointer for such named type.

EndpointPickerFailureMode

This is a string, it's definitely not a struct (that would be type Foo struct)
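
One way to check this is with reflection: a named type declared as type X string has the string kind, and encoding/json serializes it as a plain JSON string. In the sketch below only EndpointPickerFailureMode comes from the discussion; the wrapped struct and the kindOf helper are hypothetical, added for contrast.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// EndpointPickerFailureMode is declared the way the thread describes it.
type EndpointPickerFailureMode string

// wrapped is a hypothetical struct, shown only for contrast.
type wrapped struct {
	Value string `json:"value"`
}

// kindOf returns the name of the underlying kind of v.
func kindOf(v any) string {
	return reflect.TypeOf(v).Kind().String()
}

func main() {
	mode := EndpointPickerFailureMode("FailClose")

	fmt.Println(kindOf(mode))      // string
	fmt.Println(kindOf(wrapped{})) // struct

	// A named string type encodes as a bare JSON string, not an object.
	out, err := json.Marshal(mode)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // "FailClose"
}
```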

//
// +optional
// +kubebuilder:default=Gateway
//nolint:kubeapilinter // ignore kubeapilinter here as we want to use pointer for optional struct.

Is a string, not a struct. Empty string isn't valid, should still be a non-pointer

Contributor Author

// documentation for details: https://gateway-api.sigs.k8s.io/api-types/referencegrant/
//
// +optional
//nolint:kubeapilinter // ignore kubeapilinter here as we want to use pointer for optional struct.

Is a string, not a struct, empty namespace is not a valid choice, does not need to be a pointer

Contributor Author

@JoelSpeed

Further to my comment above, did we decide all optional fields should be pointers, or, only optional structs? Cc @robscott

I thought the latter, but the exceptions I'm seeing imply the former

Either decision can be configured in KAL so we shouldn't need any exceptions apart from the port number where a specific decision was made to ignore the guidance about the zero value

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.