-
Notifications
You must be signed in to change notification settings - Fork 145
v1.0 InferencePool API Review #1173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-0.5
Are you sure you want to change the base?
Changes from all commits
7cacbde
7ae0765
f4ccc68
0050467
80dee7b
9981a82
33c8e89
5033512
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,304 @@ | ||
/* | ||
Copyright 2025 The Kubernetes Authors. | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
*/ | ||
|
||
package v1 | ||
|
||
import ( | ||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
) | ||
|
||
// InferencePool is the Schema for the InferencePools API. | ||
// | ||
// +kubebuilder:object:root=true | ||
// TODO: change the annotation once it gets officially approved | ||
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1173" | ||
// +kubebuilder:resource:shortName=infpool | ||
// +kubebuilder:subresource:status | ||
// +kubebuilder:storageversion | ||
// +genclient | ||
type InferencePool struct { | ||
metav1.TypeMeta `json:",inline"` | ||
|
||
// +optional | ||
metav1.ObjectMeta `json:"metadata,omitempty"` | ||
|
||
// Spec defines the desired state of the InferencePool. | ||
// | ||
// +required | ||
Spec InferencePoolSpec `json:"spec,omitzero"` | ||
|
||
// Status defines the observed state of the InferencePool. | ||
// | ||
// +optional | ||
//nolint:kubeapilinter // status should not be a pointer. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not? It's the same as any other struct There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We followed the existing k8s convention. See https://github.com/kubernetes/api/blob/master/batch/v1/types.go#L84 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @JoelSpeed I agree that Status should be a pointer type, but we could not find other K8s APIs that follow this convention. We prefer to follow established patterns. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In theory every optional struct-field should be a pointer. In practice, we do not do that (in part because we have had no linter). I am OK with this because it is "not worse than" everything else in the system. We should refine the rules (the doc which emerged from this discussion) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Joel, are you OK to resolve this one? |
||
Status InferencePoolStatus `json:"status,omitempty"` | ||
} | ||
|
||
// InferencePoolList contains a list of InferencePools. | ||
// | ||
// +kubebuilder:object:root=true | ||
type InferencePoolList struct { | ||
metav1.TypeMeta `json:",inline"` | ||
metav1.ListMeta `json:"metadata,omitempty"` | ||
Items []InferencePool `json:"items"` | ||
} | ||
|
||
// InferencePoolSpec defines the desired state of the InferencePool. | ||
type InferencePoolSpec struct { | ||
// Selector determines which Pods are members of this inference pool. | ||
// It matches Pods by their labels only within the same namespace; cross-namespace | ||
// selection is not supported. | ||
// | ||
// The structure of this LabelSelector is intentionally simple to be compatible | ||
// with Kubernetes Service selectors, as some implementations may translate | ||
// this configuration into a Service resource. | ||
// | ||
// +required | ||
Selector LabelSelector `json:"selector,omitzero"` | ||
|
||
// TargetPorts defines a list of ports that are exposed by this InferencePool. | ||
// Currently, the list may only include a single port definition. | ||
// | ||
// +kubebuilder:validation:MinItems=1 | ||
// +kubebuilder:validation:MaxItems=1 | ||
// +listType=atomic | ||
// +required | ||
TargetPorts []Port `json:"targetPorts,omitempty"` | ||
|
||
// EndpointPickerRef is a reference to the Endpoint Picker extension and its | ||
// associated configuration. | ||
// | ||
// +required | ||
EndpointPickerRef EndpointPickerRef `json:"endpointPickerRef,omitzero"` | ||
} | ||
|
||
// Port defines the network port that will be exposed by this InferencePool. | ||
type Port struct { | ||
// Number defines the port number to access the selected model server Pods. | ||
// The number must be in the range 1 to 65535. | ||
// | ||
// +required | ||
Number PortNumber `json:"number,omitempty"` | ||
capri-xiyue marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
|
||
// EndpointPickerRef specifies a reference to an Endpoint Picker extension and its | ||
// associated configuration. | ||
type EndpointPickerRef struct { | ||
// Group is the group of the referent API object. When unspecified, the default value | ||
// is "", representing the Core API group. | ||
// | ||
// +optional | ||
// +kubebuilder:default="" | ||
Group *Group `json:"group,omitempty"` | ||
capri-xiyue marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
// Kind is the Kubernetes resource kind of the referent. | ||
// | ||
// Required if the referent is ambiguous, e.g. service with multiple ports. | ||
// | ||
// Defaults to "Service" when not specified. | ||
// | ||
// ExternalName services can refer to CNAME DNS records that may live | ||
// outside of the cluster and as such are difficult to reason about in | ||
// terms of conformance. They also may not be safe to forward to (see | ||
// CVE-2021-25740 for more information). Implementations MUST NOT | ||
// support ExternalName Services. | ||
capri-xiyue marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// | ||
// +optional | ||
// +kubebuilder:default=Service | ||
Kind Kind `json:"kind,omitempty"` | ||
capri-xiyue marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
// Name is the name of the referent API object. | ||
// | ||
// +required | ||
Name ObjectName `json:"name,omitempty"` | ||
|
||
// PortNumber is the port number of the Endpoint Picker extension service. When unspecified, | ||
// implementations SHOULD infer a default value of 9002 when the kind field is "Service" or | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rob, we discussed this, but I forget what we came to. Do we have good precedent for "should infer" as opposed to just setting a default? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To clarify, the default value of 9002 only applies when the kind filed is a service or unspecified (defaults to "Service"). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My point is that telling clients to infer a value is generally not what we do. We either give it an explicit default or we require the user to set it. You can bundle an MAP which looks at Group and Kind, and assigns 9002, but the validation for it would be "required". Or you can assign a static default of 9002 and leave it optional. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. cc @robscott for the context of inferring here. |
||
// unspecified (defaults to "Service"). | ||
// | ||
// +optional | ||
//nolint:kubeapilinter // ignore kubeapilinter here as we want to use pointer as zero means all ports in convention, we don't make to use 0 to indicate not set. | ||
PortNumber *PortNumber `json:"portNumber,omitempty"` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why "PortNumber" vs "Port" ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will we need protocol here eventually? Did we discuss this one? Why not use the same Port struct? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @robscott Do you have any context about this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point. I think the ideal would be to mirror I'm a bit split on whether or not we should make the change from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I’m fine with this change. portNumber was introduced only in v1 (different field exists in v1alpha2) and as long as the api has not been finalized, that’s ok to change. |
||
|
||
// FailureMode configures how the parent handles the case when the Endpoint Picker extension | ||
// is non-responsive. When unspecified, defaults to "FailClose". | ||
// | ||
// +optional | ||
// +kubebuilder:default="FailClose" | ||
FailureMode EndpointPickerFailureMode `json:"failureMode,omitempty"` | ||
} | ||
|
||
// EndpointPickerFailureMode defines the options for how the parent handles the case when the | ||
// Endpoint Picker extension is non-responsive. | ||
// | ||
// +kubebuilder:validation:Enum=FailOpen;FailClose | ||
type EndpointPickerFailureMode string | ||
|
||
const ( | ||
// EndpointPickerFailOpen specifies that the parent should forward the request to an endpoint | ||
// of its picking when the Endpoint Picker extension fails. | ||
EndpointPickerFailOpen EndpointPickerFailureMode = "FailOpen" | ||
// EndpointPickerFailClose specifies that the parent should drop the request when the Endpoint | ||
// Picker extension fails. | ||
EndpointPickerFailClose EndpointPickerFailureMode = "FailClose" | ||
) | ||
|
||
// InferencePoolStatus defines the observed state of the InferencePool. | ||
type InferencePoolStatus struct { | ||
// Parents is a list of parent resources, typically Gateways, that are associated with | ||
// the InferencePool, and the status of the InferencePool with respect to each parent. | ||
// | ||
// A controller that manages the InferencePool, must add an entry for each parent it manages | ||
// and remove the parent entry when the controller no longer considers the InferencePool to | ||
// be associated with that parent. | ||
// | ||
// A maximum of 32 parents will be represented in this list. When the list is empty, | ||
// it indicates that the InferencePool is not associated with any parents. | ||
// | ||
// +kubebuilder:validation:MaxItems=32 | ||
// +optional | ||
// +listType=atomic | ||
capri-xiyue marked this conversation as resolved.
Show resolved
Hide resolved
|
||
Parents []ParentStatus `json:"parents,omitempty"` | ||
} | ||
|
||
// ParentStatus defines the observed state of InferencePool from a Parent, i.e. Gateway. | ||
type ParentStatus struct { | ||
// Conditions is a list of status conditions that provide information about the observed | ||
// state of the InferencePool. This field is required to be set by the controller that | ||
// manages the InferencePool. | ||
// | ||
// Supported condition types are: | ||
// | ||
// * "Accepted" | ||
// * "ResolvedRefs" | ||
// | ||
// +optional | ||
// +listType=map | ||
// +listMapKey=type | ||
// +kubebuilder:validation:MaxItems=8 | ||
Conditions []metav1.Condition `json:"conditions,omitempty"` | ||
|
||
// ParentRef is used to identify the parent resource that this status | ||
// is associated with. It is used to match the InferencePool with the parent | ||
// resource, such as a Gateway. | ||
// | ||
// +required | ||
ParentRef ParentReference `json:"parentRef,omitzero"` | ||
} | ||
|
||
// InferencePoolConditionType is a type of status condition for the InferencePool. | ||
type InferencePoolConditionType string | ||
|
||
// InferencePoolReason is the reason for a type of InferencePool status condition. | ||
type InferencePoolReason string | ||
|
||
const ( | ||
// InferencePoolConditionAccepted is a type of condition that indicates whether | ||
// the InferencePool has been accepted or rejected by a Parent, and why. | ||
// | ||
// Possible reasons for this condition to be True are: | ||
// | ||
// * "SupportedByParent" | ||
// | ||
// Possible reasons for this condition to be False are: | ||
// | ||
// * "Accepted" | ||
// * "HTTPRouteNotAccepted" | ||
// | ||
// Possible reasons for this condition to be Unknown are: | ||
// | ||
// * "Pending" | ||
// | ||
// Controllers MAY raise this condition with other reasons, but should | ||
// prefer to use the reasons listed above to improve interoperability. | ||
InferencePoolConditionAccepted InferencePoolConditionType = "Accepted" | ||
|
||
// InferencePoolReasonAccepted is a reason used with the "Accepted" condition | ||
// when the InferencePool is accepted by a Parent because the Parent supports | ||
// InferencePool as a backend. | ||
InferencePoolReasonAccepted InferencePoolReason = "Accepted" | ||
|
||
// InferencePoolReasonNotSupportedByParent is a reason used with the "Accepted" | ||
// condition when the InferencePool has not been accepted by a Parent because | ||
// the Parent does not support InferencePool as a backend. | ||
InferencePoolReasonNotSupportedByParent InferencePoolReason = "NotSupportedByParent" | ||
|
||
// InferencePoolReasonHTTPRouteNotAccepted is an optional reason used with the | ||
// "Accepted" condition when the InferencePool is referenced by an HTTPRoute that | ||
// has been rejected by the Parent. The user should inspect the status of the | ||
// referring HTTPRoute for the specific reason. | ||
InferencePoolReasonHTTPRouteNotAccepted InferencePoolReason = "HTTPRouteNotAccepted" | ||
) | ||
|
||
const ( | ||
// InferencePoolConditionResolvedRefs is a type of condition that indicates whether | ||
// the controller was able to resolve all the object references for the InferencePool. | ||
// | ||
// Possible reasons for this condition to be True are: | ||
// | ||
// * "ResolvedRefs" | ||
// | ||
// Possible reasons for this condition to be False are: | ||
// | ||
// * "InvalidExtensionRef" | ||
// | ||
// Controllers MAY raise this condition with other reasons, but should | ||
// prefer to use the reasons listed above to improve interoperability. | ||
InferencePoolConditionResolvedRefs InferencePoolConditionType = "ResolvedRefs" | ||
|
||
// InferencePoolReasonResolvedRefs is a reason used with the "ResolvedRefs" | ||
// condition when the condition is true. | ||
InferencePoolReasonResolvedRefs InferencePoolReason = "ResolvedRefs" | ||
|
||
// InferencePoolReasonInvalidExtensionRef is a reason used with the "ResolvedRefs" | ||
// condition when the Extension is invalid in some way. This can include an | ||
// unsupported kind or API group, or a reference to a resource that cannot be found. | ||
InferencePoolReasonInvalidExtensionRef InferencePoolReason = "InvalidExtensionRef" | ||
) | ||
|
||
// ParentReference identifies an API object. It is used to associate the InferencePool with a | ||
// parent resource, such as a Gateway. | ||
type ParentReference struct { | ||
// Group is the group of the referent API object. When unspecified, the referent is assumed | ||
// to be in the "gateway.networking.k8s.io" API group. | ||
// | ||
// +optional | ||
// +kubebuilder:default="gateway.networking.k8s.io" | ||
Group *Group `json:"group,omitempty"` | ||
|
||
// Kind is the kind of the referent API object. When unspecified, the referent is assumed | ||
// to be a "Gateway" kind. | ||
// | ||
// +optional | ||
// +kubebuilder:default=Gateway | ||
Kind Kind `json:"kind,omitempty"` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We talked about Kind/Resource WRT endpoint picker, buit not here. Is this totally open-ended? Should it be resources instead of Kind? @robscott There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is only in status, so would not be open-ended. It requires the controller that is implementing the resource pointing at InferencePool (some kind of Route) to populate this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I am a controller and I am looking for entries in this list that "belong to me" - is group+kind sufficient? A group CAN have multiple resources with the same kind, but maybe this becomes a GIGO situation? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Practically speaking Group+Kind will be sufficient for 100% of cases today. To add some additional certainty, we added Similar to above, I'm a bit split on whether or not we should make this change at this late stage as it would be a breaking change if we added this as a required field. Will defer to @danehans, @kfswain, and @nirrozenbaum. |
||
|
||
// Name is the name of the referent API object. | ||
// | ||
// +required | ||
Name ObjectName `json:"name,omitempty"` | ||
|
||
// Namespace is the namespace of the referenced object. When unspecified, the local | ||
// namespace is inferred. | ||
// | ||
// Note that when a namespace different than the local namespace is specified, | ||
// a ReferenceGrant object is required in the referent namespace to allow that | ||
// namespace's owner to accept the reference. See the ReferenceGrant | ||
// documentation for details: https://gateway-api.sigs.k8s.io/api-types/referencegrant/ | ||
// | ||
// +optional | ||
Namespace Namespace `json:"namespace,omitempty"` | ||
} |
Uh oh!
There was an error while loading. Please reload this page.