Commit 6889c31

recommended clarifications (#121)
* recommended clarifications
  Signed-off-by: Michael Kalantar <[email protected]>
* spelling
  Signed-off-by: Michael Kalantar <[email protected]>
* add kubernetes gateway api tutorials
  Signed-off-by: Michael Kalantar <[email protected]>
* update roadmap
  Signed-off-by: Michael Kalantar <[email protected]>

---------
Signed-off-by: Michael Kalantar <[email protected]>
1 parent 0ab253b commit 6889c31

File tree

19 files changed

+719
-61
lines changed

docs/getting-started/first-abn.md

Lines changed: 34 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -26,10 +26,10 @@ This tutorial describes how to do A/B testing of a backend component using the [
2626

2727
A simple sample two-tier application using the Iter8 SDK is provided. Note that only the frontend component uses the Iter8 SDK. Deploy both the frontend and backend components of this application as described in each tab:
2828

29-
=== "frontend"
29+
=== "Frontend"
3030
Install the frontend component using an implementation in the language of your choice:
3131

32-
=== "node"
32+
=== "Node"
3333
```shell
3434
kubectl create deployment frontend --image=iter8/abn-sample-frontend-node:0.17.3
3535
kubectl expose deployment frontend --name=frontend --port=8090
@@ -43,7 +43,7 @@ A simple sample two-tier application using the Iter8 SDK is provided. Note that
4343

4444
The frontend component is implemented to call `Lookup()` before each call to the backend component. The frontend component uses the returned version number to route the request to the recommended version of the backend component.
4545

46-
=== "backend"
46+
=== "Backend"
4747
Release an initial version of the backend named `backend`:
4848

4949
```shell
@@ -66,9 +66,15 @@ In one shell, port-forward requests to the frontend component:
6666
```
6767
In another shell, run a script to generate load from multiple users:
6868
```shell
69-
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/abn-sample/generate_load.sh | sh -s --
69+
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.3/samples/abn-sample/generate_load.sh | sh -s --
7070
```
7171

72+
The load generator and sample frontend application output the backend that handled each recommendation. With just one version deployed, all requests are handled by `backend-0`. In the output you will see something like:
73+
74+
```
75+
Recommendation: {"Id":19,"Name":"sample","Source":"backend-74ff88c76d-nb87j"}
76+
```
77+
7278
## Deploy candidate
7379

7480
A candidate version of the *backend* component can be deployed simply by adding a second version to the list of versions:
@@ -91,6 +97,12 @@ EOF
9197
While the candidate version is deploying, `Lookup()` will return only the version index number `0`; that is, the first, or primary, version of the model.
9298
Once the candidate version is ready, `Lookup()` will return both `0` and `1`, the indices of both versions, so that requests can be distributed across both versions.
9399

100+
Once both backend versions are responding to requests, the output of the load generator will include recommendations from the candidate version. In this example, you should see something like:
101+
102+
```
103+
Recommendation: {"Id":19,"Name":"sample","Source":"backend-candidate-1-56cb7cd5cf-bkrjv"}
104+
```
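
To see at a glance how recommendations are spread across backend versions, the load generator output can be tallied. The following is a minimal sketch, not part of the tutorial: `tally_sources` is a hypothetical helper, and it assumes that each line has the `Recommendation: {... "Source":"<pod name>"}` shape shown above and that pod names follow the usual Deployment pattern `<deployment>-<replicaset hash>-<pod suffix>`.

```shell
#!/bin/sh
# tally_sources (hypothetical helper): read load generator output on stdin
# and print one "<backend> <count>" line per backend version.
tally_sources() {
  # Pull the pod name out of the "Source" field of each line.
  sed -n 's/.*"Source":"\([^"]*\)".*/\1/p' |
    # Assumption: drop the trailing "-<replicaset hash>-<pod suffix>".
    sed 's/-[^-]*-[^-]*$//' |
    # Count occurrences of each remaining deployment name.
    awk '{ count[$0]++ } END { for (name in count) print name, count[name] }' |
    LC_ALL=C sort
}

# Demo using the two sample lines from this tutorial.
tally_sources <<'EOF'
Recommendation: {"Id":19,"Name":"sample","Source":"backend-74ff88c76d-nb87j"}
Recommendation: {"Id":19,"Name":"sample","Source":"backend-candidate-1-56cb7cd5cf-bkrjv"}
EOF
```

In practice, the output of the load generator script could be piped into `tally_sources` instead of the here-document.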
105+
94106
## Compare versions using Grafana
95107

96108
Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-forward requests as follows:
@@ -132,6 +144,12 @@ EOF
132144

133145
Calls to `Lookup()` will now recommend that all traffic be sent to the new primary version `backend` (currently serving the promoted version of the code).
134146

147+
The output of the load generator will again show just `backend-0`:
148+
149+
```
150+
Recommendation: {"Id":19,"Name":"sample","Source":"backend-74ff88c76d-nb87j"}
151+
```
152+
135153
## Cleanup
136154

137155
Delete the sample application:
@@ -144,3 +162,15 @@ helm delete backend
144162
Uninstall the Iter8 controller:
145163

146164
--8<-- "docs/getting-started/uninstall.md"
165+
166+
If you installed Grafana, you can delete it as follows:
167+
168+
```shell
169+
kubectl delete svc/grafana deploy/grafana
170+
```
171+
172+
***
173+
174+
Congratulations! :tada: You completed your first A/B test with Iter8.
175+
176+
***

docs/getting-started/first-performance.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,7 @@ The Iter8 dashboard will look like the following:
8080
![`http` Iter8 dashboard](../user-guide/tasks/images/httpdashboard.png)
8181

8282
## View logs
83-
Logs are useful for debugging.
83+
Logs are useful for debugging. To see the test logs:
8484

8585
```shell
8686
kubectl logs -l iter8.tools/test=httpbin-test
@@ -102,6 +102,12 @@ kubectl delete deploy/httpbin
102102

103103
--8<-- "docs/getting-started/uninstall.md"
104104

105+
If you installed Grafana, you can delete it as follows:
106+
107+
```shell
108+
kubectl delete svc/grafana deploy/grafana
109+
```
110+
105111
***
106112

107113
Congratulations! :tada: You completed your first performance test with Iter8.

docs/getting-started/first-release.md

Lines changed: 29 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -63,7 +63,7 @@ You can also send requests from a pod within the cluster:
6363

6464
1. Create a `sleep` pod in the cluster from which requests can be made:
6565
```shell
66-
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/kserve-serving/sleep.sh | sh -
66+
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.4/samples/kserve-serving/sleep.sh | sh -
6767
```
6868

6969
2. Exec into the sleep pod:
@@ -76,7 +76,7 @@ kubectl exec --stdin --tty "$(kubectl get pod --sort-by={metadata.creationTimest
7676
curl httpbin.default -s -D - | grep -e '^HTTP' -e app-version
7777
```
7878

79-
The output includes the success of the request (the HTTP return code) and the version of the application that responded (the `app-version` response header). For example:
79+
The output includes the success of the request (the HTTP return code) and the version of the application that responded (in the `app-version` response header). In this example:
8080

8181
```
8282
HTTP/1.1 200 OK
@@ -123,7 +123,15 @@ When the second version is deployed and ready, the Iter8 controller automaticall
123123

124124
### Verify routing
125125

126-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will now be handled equally by both versions.
126+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will now be handled equally by both versions. Output will be something like:
127+
128+
```
129+
HTTP/1.1 200 OK
130+
app-version: httpbin-0
131+
...
132+
HTTP/1.1 200 OK
133+
app-version: httpbin-1
134+
```
127135

128136
## Modify weights (optional)
129137

@@ -177,7 +185,12 @@ Once the (reconfigured) primary version ready, the Iter8 controller will automat
177185

178186
### Verify routing
179187

180-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version.
188+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version. Output will be something like:
189+
190+
```
191+
HTTP/1.1 200 OK
192+
app-version: httpbin-0
193+
```
181194

182195
## Cleanup
183196

@@ -187,6 +200,18 @@ Delete the application and its routing configuration:
187200
helm delete httpbin
188201
```
189202

203+
If you used the `sleep` pod to generate load, remove it:
204+
205+
```shell
206+
kubectl delete deploy sleep
207+
```
208+
190209
Uninstall Iter8 controller:
191210

192211
--8<-- "docs/getting-started/uninstall.md"
212+
213+
***
214+
215+
Congratulations! :tada: You completed your first blue-green rollout with Iter8.
216+
217+
***

docs/roadmap.md

Lines changed: 5 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -9,11 +9,8 @@ hide:
99

1010
1. Stabilizing Iter8 APIs for CNCF sandboxing
1111
2. Autoscaling the metrics service
12-
3. Install infrastructure components such as Istio
13-
4. Install ML components such as KServe and KServe ModelMesh
14-
5. Extend routing templates to include application management
15-
6. Support multi-cluster installs
16-
7. Open Data Hub tier 1 project
17-
8. Metrics & evaluation for foundation model/LLM-based apps
18-
9. Hyperparameter tuning for foundation model/LLM-based inference pipelines
19-
10. Data/concept drift detection for ML models
12+
3. Support multi-cluster installs
13+
4. Open Data Hub tier 1 project
14+
5. Metrics & evaluation for foundation model/LLM-based apps
15+
6. Hyperparameter tuning for foundation model/LLM-based inference pipelines
16+
7. Data/concept drift detection for ML models

docs/tutorials/integrations/kserve-mm/abn.md

Lines changed: 32 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,12 @@ application:
6161
EOF
6262
```
6363

64+
Wait for the backend model to be ready:
65+
66+
```shell
67+
kubectl wait --for condition=ready isvc/backend-0 --timeout=600s
68+
```
69+
6470
## Generate load
6571

6672
In one shell, port-forward requests to the frontend component:
@@ -70,9 +76,15 @@ In one shell, port-forward requests to the frontend component:
7076

7177
In another shell, run a script to generate load from multiple users:
7278
```shell
73-
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/abn-sample/generate_load.sh | sh -s --
79+
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.3/samples/abn-sample/generate_load.sh | sh -s --
7480
```
7581

82+
The load generator and sample frontend application output the backend that handled each recommendation. With just one version deployed, all requests are handled by `backend-0`. In the output you will see something like:
83+
84+
```
85+
Recommendation: backend-0__isvc-3642375d03
86+
```
87+
7688
## Deploy candidate
7789

7890
A candidate version of the model can be deployed simply by adding a second version to the list of versions:
@@ -105,6 +117,12 @@ EOF
105117
Until the candidate version is ready, calls to `Lookup()` will return only the version index number `0`; that is, the first, or primary, version of the model.
106118
Once the candidate version is ready, `Lookup()` will return both `0` and `1`, the indices of both versions, so that requests can be distributed across both versions.
107119

120+
Once both backend versions are responding to requests, the output of the load generator will include recommendations from the candidate version. In this example, you should see something like:
121+
122+
```
123+
Recommendation: backend-1__isvc-3642375d03
124+
```
125+
108126
## Compare versions using Grafana
109127

110128
Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-forward requests as follows:
@@ -155,6 +173,12 @@ EOF
155173

156174
Calls to `Lookup()` will now recommend that all traffic be sent to the new primary version `backend-0` (currently serving the promoted version of the code).
157175

176+
The output of the load generator will again show just `backend-0`:
177+
178+
```
179+
Recommendation: backend-0__isvc-3642375d03
180+
```
181+
158182
## Cleanup
159183

160184
Delete the backend:
@@ -171,4 +195,10 @@ kubectl delete deploy/frontend svc/frontend
171195

172196
Uninstall Iter8 controller:
173197

174-
--8<-- "docs/getting-started/uninstall.md"
198+
--8<-- "docs/getting-started/uninstall.md"
199+
200+
If you installed Grafana, you can delete it as follows:
201+
202+
```shell
203+
kubectl delete svc/grafana deploy/grafana
204+
```

docs/tutorials/integrations/kserve-mm/blue-green.md

Lines changed: 26 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,12 @@ application:
5050
EOF
5151
```
5252

53+
Wait for the backend model to be ready:
54+
55+
```shell
56+
kubectl wait --for condition=ready isvc/wisdom-0 --timeout=600s
57+
```
58+
5359
??? note "What happens?"
5460
- Because `environment` is set to `kserve-modelmesh-istio`, an `InferenceService` object is created.
5561
- The namespace `default` is inherited from the Helm release namespace since it is not specified in the version or in `application.metadata`.
@@ -90,33 +96,12 @@ cat grpc_input.json \
9096
| grep -e app-version
9197
```
9298

93-
The output includes the version of the application that responded (the `app-version` response header). For example:
99+
The output includes the version of the application that responded (in the `app-version` response header). In this example:
94100

95101
```
96102
app-version: wisdom-0
97103
```
98104

99-
??? note "To send requests from outside the cluster"
100-
To configure the release for traffic from outside the cluster, a suitable Istio `Gateway` is required. For example, this [sample gateway](https://raw.githubusercontent.com/kalantar/docs/release/samples/iter8-sample-gateway.yaml). When using the Iter8 `release` chart, set the `gateway` field to the name of your `Gateway`. Finally, to send traffic:
101-
102-
(a) In a separate terminal, port-forward the ingress gateway:
103-
```shell
104-
kubectl -n istio-system port-forward svc/istio-ingressgateway 8080:80
105-
```
106-
(b) Download the proto file and sample input:
107-
```shell
108-
curl -sO https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/modelmesh-serving/kserve.proto
109-
curl -sO https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/modelmesh-serving/grpc_input.json
110-
```
111-
\(c) Send requests using the `Host` header:
112-
```shell
113-
cat grpc_input.json | \
114-
grpcurl -vv -plaintext -proto kserve.proto -d @ \
115-
-authority wisdom.modelmesh-serving \
116-
localhost:8080 inference.GRPCInferenceService.ModelInfer \
117-
| grep -e app-version
118-
```
119-
120105
## Deploy candidate
121106

122107
A candidate version of the model can be deployed simply by adding a second version to the list of versions comprising the application:
@@ -151,7 +136,13 @@ When the candidate version is ready, the Iter8 controller will Iter8 will automa
151136

152137
### Verify Routing
153138

154-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will be handled equally by both versions.
139+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will be handled equally by both versions. Output will be something like:
140+
141+
```
142+
app-version: wisdom-0
143+
...
144+
app-version: wisdom-1
145+
```
155146

156147
## Modify weights (optional)
157148

@@ -186,7 +177,7 @@ Iter8 automatically reconfigures the routing to distribute traffic between the v
186177

187178
### Verify Routing
188179

189-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. 70 percent of requests will now be handled by the candidate version; the remaining 30 percent by the primary version.
180+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. 70 percent of requests will now be handled by the candidate version (`wisdom-1`); the remaining 30 percent by the primary version (`wisdom-0`).
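
One way to check the split empirically is to collect the `app-version` headers from a batch of requests and compute each version's share. The sketch below is illustrative only: `version_share` is a hypothetical helper, and it assumes the input contains lines of the form `app-version: <name>` as in the sample output earlier in this tutorial. With a small number of requests, the observed percentages will only approximate the configured weights.

```shell
#!/bin/sh
# version_share (hypothetical helper): read response headers on stdin and
# print the percentage of responses handled by each app-version.
version_share() {
  # Header lines may end in a carriage return; strip it first.
  tr -d '\r' |
    awk 'tolower($1) == "app-version:" { count[$2]++; total++ }
         END {
           for (v in count)
             printf "%s %d%%\n", v, (100 * count[v]) / total
         }' |
    LC_ALL=C sort
}

# Demo with ten made-up header lines: seven from wisdom-1, three from wisdom-0.
{
  i=0
  while [ "$i" -lt 7 ]; do echo "app-version: wisdom-1"; i=$((i + 1)); done
  i=0
  while [ "$i" -lt 3 ]; do echo "app-version: wisdom-0"; i=$((i + 1)); done
} | version_share
```

In practice, the input would come from repeating the request command described above rather than from the made-up lines in the demo.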
190181

191182
## Promote candidate
192183

@@ -216,7 +207,11 @@ Once the (reconfigured) primary `InferenceService` ready, the Iter8 controller w
216207

217208
### Verify Routing
218209

219-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version.
210+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version. Output will be something like:
211+
212+
```
213+
app-version: wisdom-0
214+
```
220215

221216
## Cleanup
222217

@@ -226,6 +221,12 @@ Delete the models are their routing:
226221
helm delete wisdom
227222
```
228223

224+
If you used the `sleep` pod to generate load, remove it:
225+
226+
```shell
227+
kubectl delete deploy sleep
228+
```
229+
229230
Uninstall Iter8 controller:
230231

231232
--8<-- "docs/getting-started/uninstall.md"
