This walkthrough demonstrates how to load test AppMesh on EKS. It can be used as a tool for further load testing in different mesh configurations.

We use [Fortio](https://github.com/fortio/fortio) to generate the load. This load test targets AppMesh on EKS, therefore we need [aws-app-mesh-controller-for-k8s](https://github.com/aws/aws-app-mesh-controller-for-k8s) to run it. Note that the load test runs as part of the controller integration tests, hence we need the controller repo in this walkthrough. Following are the key components of this load test:

* Configuration JSON: This specifies the details of the mesh, such as the Virtual Nodes and their backends, a list of parameters for the load generator (e.g., queries per second (QPS) and the duration of each experiment), and a list of metrics (with their corresponding logic) that need to be captured in the load test. The details of `config.json` can be found in [Step 3: Configuring the Load Test](#step-3-configuring-the-load-test).
* Driver script: This bash script (`scripts/driver.sh`) sets up port forwarding for Prometheus and starts the load test as part of the AppMesh K8s Controller integration tests.
* AppMesh K8s Controller: The K8s Controller for AppMesh [integration testing code](https://github.com/aws/aws-app-mesh-controller-for-k8s/tree/master/test/e2e/fishapp/load) is the entry point of our load test. It handles creation of a meshified app with Virtual Nodes, Virtual Services, backends, etc. It also cleans up resources and spins down the mesh after finishing the test. The list of unique values in the adjacency list `backends_map` in `config.json` determines the number of Virtual Nodes that need to be created, and the map values provide the backend/edge connections of each node's virtual service. The services corresponding to these backend connections are configured as environment variables at the time of creation of the deployment. The *Custom Service* looks for this environment variable when re-routing incoming HTTP requests.
* Custom service: The custom service script (`scripts/request_handler.py`) runs on each pod; it receives incoming requests and makes calls to its "backend" services according to the `backends_map` in `config.json`. This is a simple HTTP server that handles incoming requests and in turn routes them to its backend services. The backend info is initialized as an environment variable at the time of pod creation. The custom service script is mounted onto the deployment using *ConfigMaps* (see [createConfigMap](https://github.com/aws/aws-app-mesh-controller-for-k8s/blob/420a437f68e850a32f395f9ecd4917d62845d25a/test/e2e/fishapp/load/dynamic_stack_load_test.go) for more details) to reduce development time (it avoids creating Docker containers, pushing them to the registry, etc.). If the responses from all of its backends are SUCCESS/200 OK, it returns a 200. If any one of the responses is a failure, it returns a 500 HTTP error code. If it has no backends, it automatically returns a 200 OK.
* Fortio: The [Fortio](https://github.com/fortio/fortio) load generator hits an endpoint in the mesh to simulate traffic, making HTTP requests to the given endpoint at the requested QPS for the requested duration. The default endpoint in the mesh is defined in `URL_DEFAULT` under `scripts/constants.py`. Since Fortio needs to access an endpoint within the mesh, we install Fortio inside the mesh with its own Virtual Node, K8s service and deployment. See the `fortio.yaml` file for more details. The K8s service is then port-forwarded to the local machine so that REST API calls can be sent from there.
* AppMesh-Prometheus: Prometheus scrapes the required Envoy metrics from each pod at a specified interval during the load test. It has its own query language, [*PromQL*](https://prometheus.io/docs/prometheus/latest/querying/operators/), which is helpful for aggregating metrics at different granularities before exporting them (a sketch of the corresponding port-forward follows this list).
* Load Driver: The load driver script (`scripts/load_driver.py`) reads the list of tests from `config.json`, triggers the load, fetches the metrics from the Prometheus server using its APIs, and writes them to persistent storage such as S3. This way, we have access to historical data even if the Prometheus server spins down for some reason. The API endpoints support PromQL queries, so aggregate metrics can be fetched directly instead of collecting raw metrics and writing separate code to aggregate them. The start and end timestamps are noted for each test and the metrics are queried using this time range.
* S3 storage for metrics: Experiments are uniquely identified by their `test_name` defined in `config.json`. Multiple runs of the same experiment are identified by their run *timestamps* (in YYYYMMDDHHMMSS format). Hence, there is a 1:1 mapping between the `test_name` and the set of config parameters in the JSON. Metrics are stored inside these per-run subfolders along with a metadata file specifying the parameter values used in the experiment. The list of metrics can be found under `metrics` in `config.json`.
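
For reference, the Prometheus port-forward that the driver script sets up (see the Driver script bullet above) amounts to something like this; the service name and namespace assume the defaults of the appmesh-prometheus Helm chart:

```sh
# Forward the in-cluster Prometheus service to localhost:9090 so PromQL
# queries can be issued from the local machine (chart-default names assumed)
kubectl -n appmesh-system port-forward svc/appmesh-prometheus 9090:9090
```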
## Prerequisites

1. [Walkthrough: App Mesh with EKS](../eks/). Make sure you have:
   1. Cloned the [AWS AppMesh controller repo](https://github.com/aws/aws-app-mesh-controller-for-k8s). We will need this controller repo path (`CONTROLLER_PATH`) in [Step 2](#step-2-set-environment-variables).
   2. Created an EKS cluster and set up kubeconfig.
   3. Installed "appmesh-prometheus". You may follow the [App Mesh Prometheus](https://github.com/aws/eks-charts/tree/master/stable/appmesh-prometheus) chart for installation support.
   4. This load test uses [Ginkgo](https://github.com/onsi/ginkgo/tree/v1.16.4). Make sure you have Ginkgo installed by running `ginkgo version`. If it's not, you may need to install it:
      1. Install [Go](https://go.dev/doc/install), if you haven't already.
      2. `go install github.com/onsi/ginkgo/ginkgo@v1.16.4` for Go version 1.17+
   5. (Optional) You can follow [Getting started with AWS App Mesh and Kubernetes](https://docs.aws.amazon.com/app-mesh/latest/userguide/getting-started-kubernetes.html) to install the appmesh-controller and an EKS cluster using `eksctl`.
2. Clone this repository and navigate to the `walkthroughs/howto-k8s-appmesh-load-test` folder. All the commands henceforth are assumed to be run from the same directory as this `README`.
3. Make sure you have the latest version of [AWS CLI v2](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) or [AWS CLI v1](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv1.html) installed.
4. This test requires Python 3 (tested with Python 3.9.6), so make sure you have [Python 3](https://www.python.org/downloads/) installed.
5. Load test results will be stored in an S3 bucket, so give your `S3_BUCKET` a unique name in `scripts/constants.py`.
6. In case you get an `AccessDeniedException` (or any other AWS resource access-denied exception) while creating AppMesh resources (e.g., a VirtualNode), don't forget to authenticate with your AWS account (a quick check is shown below).
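
   A quick way to confirm the CLI identity, assuming your credentials are already configured:

   ```sh
   # Prints the account ID and caller ARN if credentials are valid
   aws sts get-caller-identity
   ```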
[//]: # (The following commands can be used to create an EC2 instance to run this load test. Make sure you have already created the security group, subnet, VPC and elastic IP if you need them.)
[//]: # (Follow https://docs.aws.amazon.com/cli/latest/userguide/cli-services-ec2-instances.html#launching-instances for more details.)
- Make sure you have the latest version of [AWS CLI v2](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) or [AWS CLI v1](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv1.html) installed (at least version `1.18.82`).
- Make sure you have `kubectl` [installed](https://kubernetes.io/docs/tasks/tools/install-kubectl/), at least version `1.13` or above.
- Make sure you have `jq` [installed](https://stedolan.github.io/jq/download/).
- Make sure you have `helm` [installed](https://helm.sh/docs/intro/install/).
- Install [eksctl](https://eksctl.io/). Please make sure you have version `0.21.0` or above installed:
  ```sh
  curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
  sudo mv -v /tmp/eksctl /usr/local/bin
  ```
  ```sh
  eksctl version
  0.127.0
  ```
- Make sure you have [Python 3.9+](https://www.python.org/downloads/) installed. This walkthrough is tested with Python 3.9.6.
  ```shell
  python3 --version
  Python 3.9.6
  ```
- Make sure [pip3](https://pip.pypa.io/en/stable/installation/) is installed.
  ```shell
  pip3 --version
  pip 21.2.4
  ```
- Make sure [Go](https://go.dev/doc/install) is installed. This walkthrough is tested with Go 1.18.
- Make sure [Ginkgo](https://onsi.github.io/ginkgo/) v1.16.5 or later is installed.
If you need to update the `kubeconfig` file, you can follow this [guide](https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html) and run the following:
```shell
aws eks update-kubeconfig --region $AWS_REGION --name $CLUSTER_NAME  # in this example, $AWS_REGION is us-west-2 and $CLUSTER_NAME is appmeshtest
```
4. Run the following set of commands to install the App Mesh controller.
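
   A typical installation uses the [eks-charts](https://github.com/aws/eks-charts) Helm repository; the sketch below assumes chart defaults, and flags such as service-account/IRSA settings may differ in your setup:

   ```sh
   # Add the eks-charts repo, install the App Mesh CRDs, then the controller
   helm repo add eks https://aws.github.io/eks-charts
   helm repo update
   kubectl apply -k "https://github.com/aws/eks-charts/stable/appmesh-controller/crds?ref=master"
   helm upgrade -i appmesh-controller eks/appmesh-controller \
     --namespace appmesh-system --create-namespace \
     --set region=$AWS_REGION
   ```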
Clone this repository and navigate to the `walkthroughs/howto-k8s-appmesh-load-test` folder. All the commands henceforth are assumed to be run from the same directory as this `README`.
1. Run the following command to install all Python dependencies required for this test:
   ```shell
   pip3 install -r requirements.txt
   ```
2. Install "appmesh-prometheus". You may follow the [App Mesh Prometheus](https://github.com/aws/eks-charts/tree/master/stable/appmesh-prometheus) chart for installation support; a minimal installation is sketched below.
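
   A minimal chart installation, assuming the eks-charts repo is already added as `eks` (see the controller installation above), is:

   ```sh
   # Install Prometheus preconfigured for App Mesh into the appmesh-system namespace
   helm upgrade -i appmesh-prometheus eks/appmesh-prometheus --namespace appmesh-system
   ```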
3. Load test results will be stored in an S3 bucket, so give your `S3_BUCKET` a unique name in `scripts/constants.py`.
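
   If the bucket doesn't exist yet, one way to create it with the AWS CLI (the bucket name below is a placeholder):

   ```sh
   # Create the results bucket; S3 bucket names must be globally unique
   aws s3 mb s3://<your-unique-load-test-bucket> --region $AWS_REGION
   ```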
## Step 2: Set Environment Variables
We need to set a few environment variables before starting the load tests.
```sh
export KUBECONFIG=<If eksctl is used to create the cluster, the KUBECONFIG will be set up automatically>
export AWS_REGION=us-west-2
export VPC_ID=<VPC ID of the cluster, can be found using: aws eks describe-cluster --name $CLUSTER_NAME | grep 'vpcId'>
```
You can set these `env` variables in the `vars.env` file and then apply them using `source ./vars.env`.
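
For example, a `vars.env` along these lines keeps the setup reproducible; all values below are illustrative:

```sh
# vars.env -- illustrative values, adjust for your own cluster
export CLUSTER_NAME=appmeshtest
export AWS_REGION=us-west-2
export KUBECONFIG=~/.kube/config
export VPC_ID=vpc-0123456789abcdef0
export CONTROLLER_PATH=~/workspace/aws-app-mesh-controller-for-k8s
```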
## Step 3: Configuring the Load Test

Each entry in the `backends_map` of `config.json` creates a VirtualNode, Deployment, Service and VirtualService (with its VirtualNode as its provider), e.g.,

```json
  ...
  "2": ["4"]
},
```
where the virtual node names are `"0"`, `"1"`, `"2"`, `"3"` and `"4"`.
* `load_tests` -: Array of the different test configurations that need to be run on the mesh.
  * `test_name`: Name of the experiment. This name will be used to store the experiment results in S3.
  * `url`: The service endpoint that Fortio (the load generator) will hit. The `url` format is: `http://service-<virtual-node-name>.tls-e2e.svc.cluster.local:9080/`. For example, based on the above `backends_map`, if we want to send the load traffic to the first virtual node `"0"`, the `url` will look like: `http://service-0.tls-e2e.svc.cluster.local:9080/`.
  * `qps`: Total queries per second that Fortio sends to the endpoint.
* `metrics` -: Map of `metric_name` to the corresponding metric's [PromQL logic](https://prometheus.io/docs/prometheus/latest/querying/operators/).
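
For orientation only, here is an illustrative `config.json` skeleton consistent with the fields described above; the exact schema, field names and values are assumptions, so treat the `config.json` shipped with this walkthrough as the source of truth:

```json
{
  "backends_map": {
    "0": ["1", "2"],
    "1": ["3"],
    "2": ["4"]
  },
  "load_tests": [
    {
      "test_name": "baseline_100qps",
      "url": "http://service-0.tls-e2e.svc.cluster.local:9080/",
      "qps": 100,
      "duration": "300s"
    }
  ],
  "metrics": {
    "max_memory": "max_over_time(container_memory_working_set_bytes{namespace='tls-e2e'}[5m])"
  }
}
```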
### Description of other files
- `load_driver.py` -: Script which reads `config.json` and triggers the load tests, reads metrics from Prometheus and writes them to S3. Called from within Ginkgo.
- `fortio.yaml` -: Spec of the Fortio components which are created at runtime.
- `request_handler.py` and `request_handler_driver.sh` -: The custom service that runs in each of the pods to handle and route incoming requests according to the mapping in `backends_map`.
- `configmap.yaml` -: ConfigMap spec to mount the above `request_handler*` files into the cluster instead of creating Docker containers. Don't forget to use the absolute path of `request_handler_driver.sh`.
- `cluster.yaml` -: An optional, example EKS cluster config file. It can be used to create an EKS cluster by running `eksctl create cluster -f cluster.yaml`.
## Step 4: Running the Load Test
Run the driver script using the command below:

```sh
/bin/bash scripts/driver.sh
```
The driver script will perform the following:

1. Check that the necessary environment variables required to run this load test are set.
2. Port-forward the Prometheus service to the local machine.
3. Run the Ginkgo test, which is the entry point for our load test.
4. Kill the Prometheus port-forwarding after the load test is done.
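
Under the hood, the driver's flow is roughly equivalent to the following sketch; the Ginkgo arguments and test path are illustrative, and `scripts/driver.sh` remains the source of truth:

```sh
# Port-forward Prometheus in the background, run the Ginkgo load test
# from the controller repo, then tear the port-forward down
kubectl -n appmesh-system port-forward svc/appmesh-prometheus 9090:9090 &
PF_PID=$!
(cd "$CONTROLLER_PATH" && ginkgo -v ./test/e2e/fishapp/load)
kill $PF_PID
```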
## Step 5: Analyze the Results
All the test results are saved into the `S3_BUCKET` which was specified in `scripts/constants.py`. Optionally, you can run `scripts/analyze_load_test_data.py` to visualize the results.

The `analyze_load_test_data.py` script will:
* First download all the load test results from the `S3_BUCKET` into the `scripts/data` directory, then
* Plot a graph of the actual QPS (queries per second) Fortio sends to the first VirtualNode vs. the max memory consumed by that VirtualNode's container.
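
Assuming the script needs no arguments beyond what `scripts/constants.py` provides, it can be run with:

```shell
python3 scripts/analyze_load_test_data.py
```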
## Step 6: Clean-up
After the load test is finished, the mesh (including its dependent resources such as virtual nodes, services, etc.) and the corresponding Kubernetes namespace (currently this load test uses the `tls-e2e` namespace) will be cleaned up automatically. However, if the test is stopped, perhaps because of a manual intervention like pressing Ctrl+C, the automatic cleanup may not finish. In that case we have to manually clean up the mesh and the namespace.
- Delete the namespace:
  ```sh
  kubectl delete ns tls-e2e
  ```
- The mesh created in our load test is named `$CLUSTER_NAME` plus a 6-character alphanumeric random string, so search for the exact mesh name by running:
  ```sh
  kubectl get mesh --all-namespaces
  ```
  Then delete the mesh:
  ```shell
  kubectl delete mesh <mesh-name>   # <mesh-name> is $CLUSTER_NAME plus the 6-character random suffix
  ```
- Delete the controller and Prometheus:
  ```shell
  helm delete appmesh-controller -n appmesh-system
  helm delete appmesh-prometheus -n appmesh-system
  kubectl delete ns appmesh-system
  ```
- Finally, delete the EKS cluster to free all compute, networking, and storage resources, using:
  ```sh
  eksctl delete cluster --name $CLUSTER_NAME   # in our case $CLUSTER_NAME is appmeshtest
  ```