docs/getting-started/first-abn.md
34 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -26,10 +26,10 @@ This tutorial describes how to do A/B testing of a backend component using the [
26
26
27
27
A simple sample two-tier application using the Iter8 SDK is provided. Note that only the frontend component uses the Iter8 SDK. Deploy both the frontend and backend components of this application as described in each tab:
28
28
29
-
=== "frontend"
29
+
=== "Frontend"
30
30
Install the frontend component using an implementation in the language of your choice:
@@ -43,7 +43,7 @@ A simple sample two-tier application using the Iter8 SDK is provided. Note that
43
43
44
44
The frontend component is implemented to call `Lookup()` before each call to the backend component. The frontend component uses the returned version number to route the request to the recommended version of the backend component.
45
45
46
-
=== "backend"
46
+
=== "Backend"
47
47
Release an initial version of the backend named `backend`:
48
48
49
49
```shell
@@ -66,9 +66,15 @@ In one shell, port-forward requests to the frontend component:
66
66
```
67
67
In another shell, run a script to generate load from multiple users:
68
68
```shell
69
-
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/abn-sample/generate_load.sh | sh -s --
69
+
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.3/samples/abn-sample/generate_load.sh | sh -s --
70
70
```
71
71
72
+
The load generator and sample frontend application output the backend that handled each recommendation. With just one version deployed, all requests are handled by `backend-0`. In the output you will see something like:
A candidate version of the *backend* component can be deployed simply by adding a second version to the list of versions:
@@ -91,6 +97,12 @@ EOF
91
97
While the candidate version is deploying, `Lookup()` will return only the version index number `0`; that is, the first, or primary, version of the model.
92
98
Once the candidate version is ready, `Lookup()` will return both `0` and `1`, the indices of both versions, so that requests can be distributed across both versions.
93
99
100
+
Once both backend versions are responding to requests, the output of the load generator will include recommendations from the candidate version. In this example, you should see something like:
Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-forward requests as follows:
@@ -132,6 +144,12 @@ EOF
132
144
133
145
Calls to `Lookup()` will now recommend that all traffic be sent to the new primary version `backend` (currently serving the promoted version of the code).
134
146
147
+
The output of the load generator will again show just `backend-0`:
The output includes the success of the request (the HTTP return code) and the version of the application that responded (the `app-version` response header). For example:
79
+
The output includes the success of the request (the HTTP return code) and the version of the application that responded (in the `app-version` response header). In this example:
80
80
81
81
```
82
82
HTTP/1.1 200 OK
@@ -123,7 +123,15 @@ When the second version is deployed and ready, the Iter8 controller automaticall
123
123
124
124
### Verify routing
125
125
126
-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will now be handled equally by both versions.
126
+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will now be handled equally by both versions. Output will be something like:
127
+
128
+
```
129
+
HTTP/1.1 200 OK
130
+
app-version: httpbin-0
131
+
...
132
+
HTTP/1.1 200 OK
133
+
app-version: httpbin-1
134
+
```
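To check the split over many requests rather than by eye, the `app-version` headers can be tallied. The following is a minimal sketch, assuming the response headers from repeated requests are piped to standard input; `count_versions` is a hypothetical helper name and is not part of Iter8.

```shell
# Hypothetical helper (not part of Iter8): tally responses per app-version.
# Reads HTTP response headers on stdin and prints "<version> <count>",
# one line per version, sorted by version name.
count_versions() {
  # Strip carriage returns, since HTTP header lines end in CRLF.
  tr -d '\r' |
    awk 'tolower($1) == "app-version:" { n[$2]++ }
         END { for (v in n) print v, n[v] }' |
    sort
}
```

With routing split equally, the two counts should be roughly the same once enough requests have been sent.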
127
135
128
136
## Modify weights (optional)
129
137
@@ -177,7 +185,12 @@ Once the (reconfigured) primary version ready, the Iter8 controller will automat
177
185
178
186
### Verify routing
179
187
180
-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version.
188
+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version. Output will be something like:
189
+
190
+
```
191
+
HTTP/1.1 200 OK
192
+
app-version: httpbin-0
193
+
```
181
194
182
195
## Cleanup
183
196
@@ -187,6 +200,18 @@ Delete the application and its routing configuration:
187
200
helm delete httpbin
188
201
```
189
202
203
+
If you used the `sleep` pod to generate load, remove it:
204
+
205
+
```shell
206
+
kubectl delete deploy sleep
207
+
```
208
+
190
209
Uninstall Iter8 controller:
191
210
192
211
--8<-- "docs/getting-started/uninstall.md"
212
+
213
+
***
214
+
215
+
Congratulations! :tada: You completed your first blue-green rollout with Iter8.
In one shell, port-forward requests to the frontend component:
@@ -70,9 +76,15 @@ In one shell, port-forward requests to the frontend component:
70
76
71
77
In another shell, run a script to generate load from multiple users:
72
78
```shell
73
-
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.17.3/samples/abn-sample/generate_load.sh | sh -s --
79
+
curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.18.3/samples/abn-sample/generate_load.sh | sh -s --
74
80
```
75
81
82
+
The load generator and sample frontend application output the backend that handled each recommendation. With just one version deployed, all requests are handled by `backend-0`. In the output you will see something like:
83
+
84
+
```
85
+
Recommendation: backend-0__isvc-3642375d03
86
+
```
87
+
76
88
## Deploy candidate
77
89
78
90
A candidate version of the model can be deployed simply by adding a second version to the list of versions:
@@ -105,6 +117,12 @@ EOF
105
117
Until the candidate version is ready, calls to `Lookup()` will return only the version index number `0`; that is, the first, or primary, version of the model.
106
118
Once the candidate version is ready, `Lookup()` will return both `0` and `1`, the indices of both versions, so that requests can be distributed across both versions.
107
119
120
+
Once both backend versions are responding to requests, the output of the load generator will include recommendations from the candidate version. In this example, you should see something like:
121
+
122
+
```
123
+
Recommendation: backend-1__isvc-3642375d03
124
+
```
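To confirm that both versions are receiving traffic, the load generator output can be summarized per backend version. This is a rough sketch, assuming every line of interest has the `Recommendation: <version>__<suffix>` form shown above; `count_recommendations` is a hypothetical helper name and is not part of Iter8 or the sample application.

```shell
# Hypothetical helper (not part of Iter8): count recommendations per backend version.
# Reads load generator output on stdin; assumes lines of the form
#   Recommendation: backend-1__isvc-3642375d03
# and prints "<version> <count>", sorted by version name.
count_recommendations() {
  awk '$1 == "Recommendation:" {
         split($2, parts, "__")   # keep only the part before "__"
         n[parts[1]]++
       }
       END { for (v in n) print v, n[v] }' |
    sort
}
```

Lines that do not start with `Recommendation:` are ignored, so other output from the load generator does not affect the counts.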
125
+
108
126
## Compare versions using Grafana
109
127
110
128
Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-forward requests as follows:
@@ -155,6 +173,12 @@ EOF
155
173
156
174
Calls to `Lookup()` will now recommend that all traffic be sent to the new primary version `backend-0` (currently serving the promoted version of the code).
157
175
176
+
The output of the load generator will again show just `backend-0`:
- Because `environment` is set to `kserve-modelmesh-istio`, an `InferenceService` object is created.
55
61
- The namespace `default` is inherited from the Helm release namespace since it is not specified in the version or in `application.metadata`.
@@ -90,33 +96,12 @@ cat grpc_input.json \
90
96
| grep -e app-version
91
97
```
92
98
93
-
The output includes the version of the application that responded (the `app-version` response header). For example:
99
+
The output includes the version of the application that responded (in the `app-version` response header). In this example:
94
100
95
101
```
96
102
app-version: wisdom-0
97
103
```
98
104
99
-
??? note "To send requests from outside the cluster"
100
-
To configure the release for traffic from outside the cluster, a suitable Istio `Gateway` is required. For example, this [sample gateway](https://raw.githubusercontent.com/kalantar/docs/release/samples/iter8-sample-gateway.yaml). When using the Iter8 `release` chart, set the `gateway` field to the name of your `Gateway`. Finally, to send traffic:
101
-
102
-
(a) In a separate terminal, port-forward the ingress gateway:
A candidate version of the model can be deployed simply by adding a second version to the list of versions comprising the application:
@@ -151,7 +136,13 @@ When the candidate version is ready, the Iter8 controller will Iter8 will automa
151
136
152
137
### Verify Routing
153
138
154
-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will be handled equally by both versions.
139
+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. Requests will be handled equally by both versions. Output will be something like:
140
+
141
+
```
142
+
app-version: wisdom-0
143
+
...
144
+
app-version: wisdom-1
145
+
```
155
146
156
147
## Modify weights (optional)
157
148
@@ -186,7 +177,7 @@ Iter8 automatically reconfigures the routing to distribute traffic between the v
186
177
187
178
### Verify Routing
188
179
189
-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. 70 percent of requests will now be handled by the candidate version; the remaining 30 percent by the primary version.
180
+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. 70 percent of requests will now be handled by the candidate version (`wisdom-1`); the remaining 30 percent by the primary version (`wisdom-0`).
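One way to compare the observed split against the configured weights is to compute each version's share of the responses. This is a minimal sketch, assuming the `app-version` lines from repeated requests are piped to standard input; `version_share` is a hypothetical helper name and is not part of Iter8.

```shell
# Hypothetical helper (not part of Iter8): percentage of responses per app-version.
# Reads lines such as "app-version: wisdom-1" on stdin and prints
# "<version> <percent>%", rounded to a whole percent and sorted by version name.
version_share() {
  tr -d '\r' |
    awk 'tolower($1) == "app-version:" { n[$2]++; total++ }
         END {
           for (v in n) printf "%s %.0f%%\n", v, n[v] * 100 / total
         }' |
    sort
}
```

Because routing is probabilistic, a small sample will rarely show exactly 70 and 30 percent; the shares should approach the configured weights as the number of requests grows.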
190
181
191
182
## Promote candidate
192
183
@@ -216,7 +207,11 @@ Once the (reconfigured) primary `InferenceService` ready, the Iter8 controller w
216
207
217
208
### Verify Routing
218
209
219
-
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version.
210
+
You can verify the routing configuration by inspecting the `VirtualService` and/or by sending requests as described above. They will all be handled by the primary version. Output will be something like:
211
+
212
+
```
213
+
app-version: wisdom-0
214
+
```
220
215
221
216
## Cleanup
222
217
@@ -226,6 +221,12 @@ Delete the models are their routing:
226
221
helm delete wisdom
227
222
```
228
223
224
+
If you used the `sleep` pod to generate load, remove it: