Merged
8 changes: 8 additions & 0 deletions Makefile
@@ -292,6 +292,14 @@ live-docs:
docker build -t gaie/mkdocs hack/mkdocs/image
docker run --rm -it -p 3000:3000 -v ${PWD}:/docs gaie/mkdocs

+ .PHONY: apix-ref-docs
+ apix-ref-docs:
+ crd-ref-docs \
+ --source-path=${PWD}/apix/v1alpha2 \
+ --config=crd-ref-docs.yaml \
+ --renderer=markdown \
+ --output-path=${PWD}/site-src/reference/x-spec.md
+
.PHONY: api-ref-docs
api-ref-docs:
crd-ref-docs \
4 changes: 2 additions & 2 deletions config/charts/inferencepool/templates/gke.yaml
@@ -9,7 +9,7 @@ metadata:
{{- include "gateway-api-inference-extension.labels" . | nindent 4 }}
spec:
targetRef:
- group: "inference.networking.x-k8s.io"
+ group: "inference.networking.k8s.io"
kind: InferencePool
name: {{ .Release.Name }}
default:
@@ -28,7 +28,7 @@ metadata:
{{- include "gateway-api-inference-extension.labels" . | nindent 4 }}
spec:
targetRef:
- group: "inference.networking.x-k8s.io"
+ group: "inference.networking.k8s.io"
kind: InferencePool
name: {{ .Release.Name }}
default:
2 changes: 1 addition & 1 deletion config/charts/inferencepool/templates/inferencepool.yaml
@@ -1,5 +1,5 @@
{{ include "gateway-api-inference-extension.validations.inferencepool.common" $ }}
- apiVersion: inference.networking.x-k8s.io/v1alpha2
+ apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
name: {{ .Release.Name }}
3 changes: 3 additions & 0 deletions config/charts/inferencepool/templates/rbac.yaml
@@ -8,6 +8,9 @@ rules:
- apiGroups: ["inference.networking.x-k8s.io"]
resources: ["inferencemodels", "inferencepools"]
verbs: ["get", "watch", "list"]
+ - apiGroups: ["inference.networking.k8s.io"]
+ resources: ["inferencepools"]
+ verbs: ["get", "watch", "list"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "watch", "list"]
2 changes: 1 addition & 1 deletion config/manifests/gateway/gke/gcp-backend-policy.yaml
@@ -4,7 +4,7 @@ metadata:
name: inferencepool-backend-policy
spec:
targetRef:
- group: "inference.networking.x-k8s.io"
+ group: "inference.networking.k8s.io"
kind: InferencePool
name: vllm-llama3-8b-instruct
default:
2 changes: 1 addition & 1 deletion config/manifests/inferencepool-resources.yaml
@@ -3,7 +3,7 @@
# - ./conformance/resources/manifests/manifests.yaml
# - ./site-src/guides/inferencepool-rollout.md
---
- apiVersion: inference.networking.x-k8s.io/v1alpha2
+ apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
name: vllm-llama3-8b-instruct
2 changes: 1 addition & 1 deletion site-src/api-types/inferencepool.md
@@ -28,7 +28,7 @@ In summary, the InferencePoolSpec consists of 3 major parts:
Here is an example InferencePool configuration:

```
- apiVersion: inference.networking.x-k8s.io/v1alpha2
+ apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
name: vllm-llama3-8b-instruct
4 changes: 2 additions & 2 deletions site-src/guides/implementers.md
@@ -22,7 +22,7 @@ spec:
name: inference-gateway
rules:
- backendRefs:
- - group: inference.networking.x-k8s.io
+ - group: inference.networking.k8s.io
kind: InferencePool
name: base-model
matches:
@@ -42,7 +42,7 @@ The general idea of implementing a Gateway controller supporting the InferencePo
### Endpoint Tracking
Consider a simple inference pool like this:
```
- apiVersion: inference.networking.x-k8s.io/v1alpha2
+ apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
name: vllm-llama3-8b-instruct
8 changes: 4 additions & 4 deletions site-src/guides/inferencepool-rollout.md
@@ -204,7 +204,7 @@ data:
- id: food-review-1
source: Kawon/llama3.1-food-finetune_v14_r8
---
- apiVersion: inference.networking.x-k8s.io/v1alpha2
+ apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
name: vllm-llama3-8b-instruct-new
@@ -400,11 +400,11 @@ spec:
name: inference-gateway
rules:
- backendRefs:
- - group: inference.networking.x-k8s.io
+ - group: inference.networking.k8s.io
kind: InferencePool
name: vllm-llama3-8b-instruct
weight: 90
- - group: inference.networking.x-k8s.io
+ - group: inference.networking.k8s.io
kind: InferencePool
name: vllm-llama3-8b-instruct-new
weight: 10
@@ -448,7 +448,7 @@ spec:
name: inference-gateway
rules:
- backendRefs:
- - group: inference.networking.x-k8s.io
+ - group: inference.networking.k8s.io
kind: InferencePool
name: vllm-llama3-8b-instruct-new
weight: 100
205 changes: 62 additions & 143 deletions site-src/reference/spec.md

Large diffs are not rendered by default.
