
Commit dd19db9

Author: Mazhar Islam

Updated readme. Cleaned up code and added Python dependencies as requirements.txt.

1 parent 6efb964, commit dd19db9

File tree: 6 files changed, +247 -111 lines changed
Lines changed: 194 additions & 36 deletions
@@ -1,23 +1,146 @@
# AppMesh K8s Load Test

This walkthrough demonstrates how to load test AppMesh on EKS. It can be used as a tool for further load testing in different mesh configurations.

We use [Fortio](https://github.com/fortio/fortio) to generate the load. This load test is for AppMesh on EKS; therefore, we need [aws-app-mesh-controller-for-k8s](https://github.com/aws/aws-app-mesh-controller-for-k8s) to run it. Note that the load test runs as part of the controller integration test, hence we need the controller repo in this walkthrough. The following are the key components of this load test:

* Configuration JSON: This specifies the details of the mesh, such as the Virtual Nodes and their backends, a list of parameters for the load generator (e.g., queries per second (QPS) and the duration of each experiment), and a list of metrics (and their corresponding logic) that need to be captured in the load test. The details of the `config.json` can be found in [Step 3: Configuring the Load Test](#step-3:-configuring-the-load-test).
* Driver script: This bash script (`scripts/driver.sh`) sets up port-forwarding of Prometheus and starts the load test as part of the AppMesh K8s Controller integration tests.
* AppMesh K8s Controller: The K8s Controller for AppMesh [integration testing code](https://github.com/aws/aws-app-mesh-controller-for-k8s/tree/master/test/e2e/fishapp/load) is the entry point of our load test. It handles creation of a meshified app with Virtual Nodes, Virtual Services, backends, etc. It also cleans up resources and spins down the mesh after finishing the test. The list of unique values under the adjacency list `backends_map` in `config.json` provides the number of Virtual Nodes that need to be created, and the map values provide the backend/edge connections of each node's virtual service. The services corresponding to the backend connections are configured as environment variables at the time the deployment is created. The *Custom Service* looks for this environment variable when re-routing incoming HTTP requests.
* Custom service: The custom service (`scripts/request_handler.py`) is a simple HTTP server that runs on each pod; it receives incoming requests and routes them to its "backend" services according to the `backends_map` in `config.json`. The backends info is initialized as an environment variable at the time of pod creation (a minimal sketch is shown after this list). The custom service script is mounted onto the deployment using *ConfigMaps* (see [createConfigMap](https://github.com/aws/aws-app-mesh-controller-for-k8s/blob/420a437f68e850a32f395f9ecd4917d62845d25a/test/e2e/fishapp/load/dynamic_stack_load_test.go) for more details) to reduce development time (it avoids creating Docker containers, pushing them to the registry, etc.). If the responses from all of its backends are SUCCESS/200 OK, it returns a 200. If any one of the responses is a failure, it returns a 500 HTTP error code. If it does not have any backends, it immediately returns a 200 OK.
* Fortio: The [Fortio](https://github.com/fortio/fortio) load generator simulates traffic by making HTTP requests to a given endpoint in the mesh at the requested QPS for the requested duration. The default endpoint in the mesh is defined in `URL_DEFAULT` under `scripts/constants.py`. Since Fortio needs to access an endpoint within the mesh, we install Fortio inside the mesh with its own Virtual Node, K8s service and deployment. See the `fortio.yaml` file for more details. The K8s service is then port-forwarded to the local machine so that REST API calls can be sent from local.
* AppMesh-Prometheus: Prometheus scrapes the required Envoy metrics from each pod at a specified interval during the load test. It has its own query language, [*PromQL*](https://prometheus.io/docs/prometheus/latest/querying/operators/), which is helpful for aggregating metrics at different granularities before exporting them.
* Load Driver: The load driver (`scripts/load_driver.py`) reads the list of tests from `config.json`, triggers the load, fetches the metrics from the Prometheus server using its APIs, and writes them to persistent storage such as S3. This way, we have access to historical data even if the Prometheus server spins down for some reason. The API endpoints support PromQL queries, so aggregate metrics can be fetched directly instead of collecting raw metrics and writing separate code to aggregate them. The start and end timestamps of each test are recorded, and the metrics are queried using this time range.
* S3 storage for metrics: Experiments are uniquely identified by their `test_name` defined in `config.json`. Multiple runs of the same experiment are identified by their run *timestamps* (in YYYYMMDDHHMMSS format). Hence, there is a 1:1 mapping between the `test_name` and the set of config parameters in the JSON. Metrics are stored inside these subfolders along with a metadata file specifying the parameter values used in the experiment. The list of metrics can be found under `metrics` in `config.json`.

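To make the routing behavior concrete, here is a minimal sketch of a request handler like the one described above. It is illustrative only: the actual `scripts/request_handler.py`, the name of the backends environment variable, and the port are assumptions and may differ.

```python
# Illustrative sketch only; the real scripts/request_handler.py may differ.
# Assumes backend service URLs arrive in a comma-separated BACKENDS env var
# and that the server listens on port 9080.
import os
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKENDS = [b for b in os.environ.get("BACKENDS", "").split(",") if b]

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = 200  # no backends: return 200 OK immediately
        for backend in BACKENDS:
            try:
                with urllib.request.urlopen(backend, timeout=5) as resp:
                    if resp.status != 200:
                        status = 500  # any non-200 backend response fails the request
            except Exception:
                status = 500  # an unreachable backend also counts as a failure
        self.send_response(status)
        self.end_headers()
        self.wfile.write(b"OK" if status == 200 else b"ERROR")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 9080), RequestHandler).serve_forever()
```
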
Following is a flow diagram of the load test:
![Flow Diagram](./load_test_flow_dg.png "Flow Diagram")
## Step 1: Prerequisites

[//]: # (The following commands can be used to create an EC2 instance to run this load test. Make sure you have already created the security group, subnet, VPC and Elastic IP if you need them.)
[//]: # (Follow this https://docs.aws.amazon.com/cli/latest/userguide/cli-services-ec2-instances.html#launching-instances for more details.)
[//]: # (```shell)
[//]: # (aws ec2 run-instances --image-id ami-0534f435d9dd0ece4 --count 1 --instance-type t2.xlarge --key-name color-app-2 --security-group-ids sg-09581640015241144 --subnet-id subnet-056542d0b479a259a --associate-public-ip-address)
[//]: # (```)

### 1.1 Tools

We need to install the following tools first:

- Make sure you have the latest version of [AWS CLI v2](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) or [AWS CLI v1](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv1.html) installed (at least version `1.18.82` or above).
- Make sure to have `kubectl` [installed](https://kubernetes.io/docs/tasks/tools/install-kubectl/), at least version `1.13` or above.
- Make sure to have `jq` [installed](https://stedolan.github.io/jq/download/).
- Make sure to have `helm` [installed](https://helm.sh/docs/intro/install/).
- Install [eksctl](https://eksctl.io/). Please make sure you have version `0.21.0` or above installed:
  ```sh
  curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
  sudo mv -v /tmp/eksctl /usr/local/bin
  ```
  ```sh
  eksctl version
  0.127.0
  ```
- Make sure you have [Python 3.9+](https://www.python.org/downloads/) installed. This walkthrough is tested with Python 3.9.6.
  ```shell
  python3 --version
  Python 3.9.6
  ```
- Make sure [pip3](https://pip.pypa.io/en/stable/installation/) is installed.
  ```shell
  pip3 --version
  pip 21.2.4
  ```
- Make sure [Go](https://go.dev/doc/install) is installed. This walkthrough is tested with go1.18.
- Make sure [Ginkgo](https://onsi.github.io/ginkgo/) v1.16.5 or later is installed:
  ```shell
  go install github.com/onsi/ginkgo/ginkgo@v1.16.5
  ```
  ```shell
  ginkgo version
  Ginkgo Version 1.16.5
  ```

### 1.2 Installing AppMesh Controller for EKS

Follow this [walkthrough: App Mesh with EKS](../eks/) for details about the AppMesh Controller for EKS. Don't forget to authenticate with your AWS account in case you get an `AccessDeniedException` or `GetCallerIdentity STS` error.

1. Make sure you have cloned the [AWS AppMesh controller repo](https://github.com/aws/aws-app-mesh-controller-for-k8s). We will need this controller repo path (`CONTROLLER_PATH`) in [step 2](#step-2:-set-environment-variables).

   ```
   git clone https://github.com/aws/aws-app-mesh-controller-for-k8s.git
   ```
2. Create an EKS cluster with `eksctl`. The following is an example command to create a cluster with the name `appmeshtest`:

   ```sh
   eksctl create cluster \
   --name appmeshtest \
   --nodes-min 2 \
   --nodes-max 3 \
   --nodes 2 \
   --auto-kubeconfig \
   --full-ecr-access \
   --appmesh-access
   # ...
   # [✔] EKS cluster "appmeshtest" in "us-west-2" region is ready
   ```
3. Update the `KUBECONFIG` environment variable according to the output of the above `eksctl` command:

   ```sh
   export KUBECONFIG=~/.kube/eksctl/clusters/appmeshtest
   ```
   If you need to update the `kubeconfig` file, you can follow this [guide](https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html) and run the following:
   ```shell
   aws eks update-kubeconfig --region $AWS_REGION --name $CLUSTER_NAME # in this example, $AWS_REGION is us-west-2 and $CLUSTER_NAME is appmeshtest
   ```
4. Run the following set of commands to install the App Mesh controller:

   ```sh
   helm repo add eks https://aws.github.io/eks-charts
   helm repo update
   kubectl create ns appmesh-system
   kubectl apply -k "https://github.com/aws/eks-charts/stable/appmesh-controller/crds?ref=master"
   helm upgrade -i appmesh-controller eks/appmesh-controller --namespace appmesh-system
   ```

### 1.3 Load Test Setup

Clone this repository and navigate to the `walkthroughs/howto-k8s-appmesh-load-test` folder. All the commands henceforth are assumed to be run from the same directory as this `README`.

1. Run the following command to install all the Python dependencies required for this test:
   ```shell
   pip3 install -r requirements.txt
   ```
2. Install "appmesh-prometheus". You may follow the [App Mesh Prometheus](https://github.com/aws/eks-charts/tree/master/stable/appmesh-prometheus) chart for installation support.

   ```sh
   helm upgrade -i appmesh-prometheus eks/appmesh-prometheus --namespace appmesh-system
   ```
3. Load test results will be stored in an S3 bucket, so give your `S3_BUCKET` a unique name in `scripts/constants.py` (see the sketch below).
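
As a minimal sketch (the actual contents and other constants in `scripts/constants.py` may differ), the bucket name is a module-level constant you edit before running the test:

```python
# scripts/constants.py (sketch; other constants such as URL_DEFAULT are omitted)
S3_BUCKET = "my-appmesh-load-test-results"  # replace with a globally unique bucket name
```
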
## Step 2: Set Environment Variables
We need to set a few environment variables before starting the load tests.
@@ -29,6 +152,8 @@ export KUBECONFIG=<If eksctl is used to create the cluster, the KUBECONFIG will
export AWS_REGION=us-west-2
export VPC_ID=<VPC ID of the cluster, can be found using: aws eks describe-cluster --name $CLUSTER_NAME | grep 'vpcId'>
```
You can also set these `env` variables in the `vars.env` file and then apply them using `source ./vars.env`.
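
For illustration, `vars.env` might look like the following; the exact set of variables in the file and the values shown here are assumptions for this example:

```shell
# vars.env (example values; adjust for your cluster and paths)
export CLUSTER_NAME=appmeshtest
export AWS_REGION=us-west-2
export KUBECONFIG=~/.kube/eksctl/clusters/appmeshtest
export VPC_ID=vpc-0123456789abcdef0
export CONTROLLER_PATH=~/aws-app-mesh-controller-for-k8s
```
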
@@ -44,10 +169,11 @@ a VirtualNode, Deployment, Service and VirtualService (with its VirtualNode as i
  "2": ["4"]
},
```
where the virtual node names are `"0"`, `"1"`, `"2"`, `"3"` and `"4"`.

* `load_tests` -: Array of the different test configurations that need to be run on the mesh.
  * `test_name`: Name of the experiment. This name will be used to store the experiment results in S3.
  * `url`: The service endpoint that Fortio (the load generator) will hit. The `url` format is: `http://service-<virtual-node-name>.tls-e2e.svc.cluster.local:9080/`. For example, based on the above `backends_map`, if we want to send the load traffic to the first virtual node `"0"`, then the `url` will look like: `http://service-0.tls-e2e.svc.cluster.local:9080/`.
  * `qps`: Total queries per second that Fortio sends to the endpoints.
@@ -57,32 +183,64 @@ a VirtualNode, Deployment, Service and VirtualService (with its VirtualNode as i

* `metrics` -: Map of metric name to the corresponding metric [PromQL logic](https://prometheus.io/docs/prometheus/latest/querying/operators/). An illustrative `config.json` skeleton is sketched below.
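
The following is a minimal sketch of how the fields described above might fit together in `config.json`. It is illustrative only: the exact schema, field names, and metric queries used by the walkthrough may differ.

```json
{
  "backends_map": {
    "0": ["1", "2"],
    "1": ["3"],
    "2": ["4"]
  },
  "load_tests": [
    {
      "test_name": "baseline_100qps",
      "url": "http://service-0.tls-e2e.svc.cluster.local:9080/",
      "qps": 100,
      "duration": "300s"
    }
  ],
  "metrics": {
    "max_memory": "max_over_time(container_memory_usage_bytes[5m])"
  }
}
```
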
### Description of other files

- `load_driver.py` -: Script which reads `config.json`, triggers the load tests, reads metrics from Prometheus via PromQL, and writes them to S3. Called from within Ginkgo. A sketch of the kind of Prometheus query it issues is shown below.
- `fortio.yaml` -: Spec of the Fortio components which are created at runtime.
- `request_handler.py` and `request_handler_driver.sh` -: The custom service that runs in each of the pods to handle and route incoming requests according to the mapping in `backends_map`.
- `configmap.yaml` -: ConfigMap spec to mount the above request_handler* files into the cluster instead of creating Docker containers. Don't forget to use the absolute path of `request_handler_driver.sh`.
- `cluster.yaml` -: Optional example EKS cluster config file. It can be used to create an EKS cluster by running `eksctl create cluster -f cluster.yaml`.
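
To illustrate how a driver like `load_driver.py` can fetch aggregated metrics for a test's time range, here is a minimal sketch against the Prometheus HTTP API. The port-forwarded address, the query, and the time window are assumptions for this example:

```python
# Sketch: run a PromQL range query against a port-forwarded Prometheus server.
import time
import requests

PROMETHEUS_URL = "http://localhost:9090"  # assumed port-forward address

def query_range(promql: str, start: float, end: float, step: str = "30s"):
    """Return the time series matching a PromQL query between two Unix timestamps."""
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query_range",
        params={"query": promql, "start": start, "end": end, "step": step},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# Example: container memory usage over the last 5 minutes of a test run.
end_ts = time.time()
series = query_range("container_memory_usage_bytes", end_ts - 300, end_ts)
print(f"fetched {len(series)} series")
```
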
## Step 4: Running the Load Test

Run the driver script using the command below:

```sh
/bin/bash scripts/driver.sh
```

The driver script will perform the following:
1. Check that the necessary environment variables required to run this load test are set.
2. Port-forward the Prometheus service to local.
3. Run the Ginkgo test, which is the entry point for our load test.
4. Kill the Prometheus port-forwarding after the load test is done.
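
For orientation, the following is a condensed sketch of that flow; the real `scripts/driver.sh`, the Prometheus service name and port, and the exact Ginkgo invocation are assumptions here and may differ:

```sh
# Sketch of the driver flow (illustrative; see scripts/driver.sh for the real logic).
: "${CLUSTER_NAME:?CLUSTER_NAME must be set}"       # 1. check required environment variables
: "${CONTROLLER_PATH:?CONTROLLER_PATH must be set}"

kubectl -n appmesh-system port-forward svc/appmesh-prometheus 9090:9090 &   # 2. expose Prometheus locally
PF_PID=$!

(cd "$CONTROLLER_PATH" && ginkgo -v test/e2e/fishapp/load)                  # 3. run the Ginkgo load test

kill "$PF_PID"                                                              # 4. stop the port-forward
```
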

## Step 5: Analyze the Results
All the test results are saved in the `S3_BUCKET` which was specified in `scripts/constants.py`. Optionally, you can run `scripts/analyze_load_test_data.py` to visualize the results.
The `analyze_load_test_data.py` script will:
* First download all the load test results from the `S3_BUCKET` into the `scripts/data` directory, then
* Plot a graph of the actual QPS (queries per second) Fortio sends to the first VirtualNode vs. the max memory consumed by the container of that VirtualNode.
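
If you want to pull the raw results yourself instead of using the provided script, a minimal sketch with `boto3` (already listed in `requirements.txt`) is shown below; the bucket layout follows the per-`test_name`, per-timestamp structure described earlier, and the bucket and experiment names are placeholders:

```python
# Sketch: download all stored results for one experiment from S3.
import os
import boto3

S3_BUCKET = "my-appmesh-load-test-results"  # must match S3_BUCKET in scripts/constants.py
TEST_NAME = "baseline_100qps"               # hypothetical test_name from config.json

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=S3_BUCKET, Prefix=TEST_NAME):
    for obj in page.get("Contents", []):
        local_path = os.path.join("scripts", "data", obj["Key"])
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(S3_BUCKET, obj["Key"], local_path)
        print("downloaded", obj["Key"])
```
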
## Step 6: Clean-up

After the load test is finished, the mesh (including its dependent resources such as virtual nodes, services, etc.) and the corresponding Kubernetes namespace (currently this load test uses the `tls-e2e` namespace) will be cleaned up automatically. However, if the test is stopped, for example by manual intervention such as pressing Ctrl+C, the automatic cleanup may not finish. In that case we have to manually clean up the mesh and the namespace.
- Delete the namespace:
  ```sh
  kubectl delete ns tls-e2e
  ```
- The name of the mesh created in our load test starts with `$CLUSTER_NAME` followed by a 6-character alphanumeric random string. Search for the exact mesh name by running:
  ```sh
  kubectl get mesh --all-namespaces
  ```
  Then delete the mesh, substituting the name found above:
  ```shell
  kubectl delete mesh <cluster-name-plus-random-suffix>
  ```
- Delete the controller and Prometheus:

  ```shell
  helm delete appmesh-controller -n appmesh-system
  helm delete appmesh-prometheus -n appmesh-system
  kubectl delete ns appmesh-system
  ```
- Finally, get rid of the EKS cluster to free all compute, networking, and storage resources:

  ```sh
  eksctl delete cluster --name $CLUSTER_NAME # In our case $CLUSTER_NAME is appmeshtest
  ```

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
altair==4.2.0
boto3==1.26.14
botocore==1.29.14
matplotlib==3.5.3
numpy==1.21.6
pandas==1.3.5
requests==2.28.1
