Workflow updates #2
Conversation
Why these changes are being introduced:

Some testing was done in a separate feature branch to work out the details of how we will handle automated deployment using GitHub Actions. It turns out that a simpler repository folder structure will make this a little easier.

How this addresses that need:

* Remove the `nodejs` subfolder from lambda_function_url_checker and flatten the files/folders slightly (see the layout sketch below)
* Update the test code to find all the dependencies correctly in the new folder structure
* Update the package.json to reflect the new file/folder structure
* Create a `.runtime` file for the developer to designate the expected runtime for the application in CloudWatch Synthetic Canary
* Create a Makefile in the application folder for application-specific commands/targets
* Update the main Makefile to simply call the subfolder Makefiles instead
* Clean up the CI workflow file slightly (addressing some typos) and set the trigger for PRs to main (and manual triggers in the UI)

Side effects of this change: None.

Related Jira tickets:
* https://mitlibraries.atlassian.net/browse/IN-1532
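As a quick orientation for the flattened structure described in this commit, here is a sketch of the resulting layout. The file names come from this PR and the README excerpt later in this thread; the exact arrangement is illustrative, not copied from the repository.

```sh
# Sketch of the flattened repository layout (illustrative; names from this PR).
#
# Makefile                    <- root Makefile; simply calls the per-canary Makefiles
# lambda_function_url_checker/
#   .runtime                  <- expected CloudWatch Synthetics runtime for this canary
#   Makefile                  <- per-canary targets (install, update, test)
#   index.js                  <- the canary application code
#   package.json
#   zipmanifest.txt           <- how the deployable ZIP is assembled for the runtime
```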
Why these changes are being introduced:

This is the first pass at a deployment workflow in GitHub Actions to push the ZIP file to the S3 bucket and set SSM Parameter Store parameters so that the Terraform repository using this function can set the correct values.

How this is being implemented:

* Create a new dev-deploy.yml workflow
* Create a new generate-matrix.sh bash script to generate the matrix of deployment jobs

More details on the dev-deploy workflow:

* Use the generate-matrix.sh script to generate the list of top-level folders that have changes
* For each folder listed in the matrix, run a deployment job to zip the application, push the ZIP to the shared-files S3 bucket, and update two SSM Parameter Store parameters with the key to the ZIP and the expected runtime in CloudWatch Synthetic Canaries

More details on the generate-matrix.sh script (a sketch of the approach follows below):

* This was written with assistance from GitHub Copilot (but is mostly an update to the matrix method that was first used by DLS in the archived mitlib-terraform repository)
* It was moved to a shell script to allow for easier local testing, to ensure that it was actually doing what was expected without having to run it multiple times in GitHub as an Action
* It creates a list of top-level folders that have changes (from a git diff), accounting for the various ways the Action could be running (first push on a new branch, triggered by a PR, triggered manually in the GitHub UI)
* It exports the list to the GITHUB_OUTPUT environment so that the Action can use this information to build the matrix and run a job for as many folders as have updated code

Side effects: None.

Related Jira tickets:
* https://mitlibraries.atlassian.net/browse/IN-1531
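To make the matrix generation described above more concrete, here is a minimal sketch of the approach (changed top-level folders from a git diff, exported via GITHUB_OUTPUT). It is not the script from this PR: the diff base (`origin/main`), the output name (`folders`), and the `jq` dependency are assumptions, and the real script also distinguishes the first-push, PR, and manual-trigger cases.

```sh
#!/usr/bin/env bash
# Minimal sketch of a changed-folder matrix generator (illustrative, not the PR's script).
set -euo pipefail

# List top-level folders that contain changed files, relative to an assumed base branch.
changed_dirs=$(git diff --name-only origin/main...HEAD \
  | awk -F/ 'NF > 1 {print $1}' \
  | sort -u)

# Build a JSON array such as ["lambda_function_url_checker"] for the workflow's job matrix.
matrix="[]"
if [ -n "${changed_dirs}" ]; then
  matrix=$(printf '%s\n' ${changed_dirs} | jq --raw-input . | jq --slurp --compact-output .)
fi

# Expose the matrix to later steps; fall back to stdout for easy local testing.
echo "folders=${matrix}" >> "${GITHUB_OUTPUT:-/dev/stdout}"
```

Running this locally (with `GITHUB_OUTPUT` unset) prints the `folders=[...]` line to stdout, which is what makes local testing of the logic straightforward.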
Why these changes are being introduced:

To make testing easier in the near future for the stage & prod deployment workflows, we create stub workflows for now so that they will exist in the `main` branch and allow for easier testing when we finalize the steps.

How this addresses that need:

* Create a stage-deploy.yml workflow with just a stub set of steps
* Create a prod-deploy.yml workflow with just a stub set of steps

These two workflows will generate a matrix and also attempt to log in to AWS with OIDC, but will not create any ZIP files.

Side effects of this change: None.
Force-pushed from 1af5088 to 3473950.
* Update the main README with more context and details for this repo
* Update the lambda_function_url_checker README
Force-pushed from 3473950 to 5200b22.
@ghukill I figured you might enjoy a little break from USE to take a look at this repository for building ZIP files for Synthetic Canaries. Might also be worth looking at the first PR in this repo that Jeremy reviewed (for a little historical perspective). Hopefully the PR description is sufficient, but happy to have a detailed conversation about this here in the PR...
ghukill left a comment
Very near a full approval, but left a couple of README suggestions and a question about making the lambda URL callable locally for testing and development.
But I must say, I really like the overall architecture. At least for me, it was critical to understand that each canary may cover multiple projects, and that "all" this repository does is define them and then build ZIP files via GitHub Actions. Nice work!
I wonder if you could perform chmod +x on the file and git commit that? I had to do that locally to get the script to run and I see it performed here in the actions.
I wondered about that. I couldn't remember if the "execute" flag was passed properly into GHA (it probably is, but I should probably test this more thoroughly).
I did run chmod +x on the script, so I'll be curious to see if that is preserved through git push and git pull, for starters. I'm not quite ready to remove that line from the workflow yet...
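For reference, the executable bit can also be recorded explicitly in git, independent of the local filesystem. This is a general git technique, not a step from this PR, and the relative path below assumes you run it from the directory containing the script:

```sh
# Make the script executable locally and record the mode bit in the index,
# so a fresh clone gets an executable file without a chmod step in the workflow.
chmod +x generate-matrix.sh
git update-index --chmod=+x generate-matrix.sh
git commit -m "Mark generate-matrix.sh as executable"
```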
After you provided an overview and context, I think this README reads really well. It's dense, but the information one needs is there.
If I had one suggestion -- something I think would have helped me make a critical leap of understanding -- it would be to add a "Development" section. In this section, I think you could touch on some things:
- creating a new "canary" is as simple as creating a root level folder (repeating, but that's okay)
- each canary may perform checking for one or many applications; it provides a method but is agnostic to what application it's used for
- when this repository builds, we update any modified canary zip file(s) in S3, thereby providing updated builds for the actual projects' Terraform.
You likely say all this in the README already, but I just have this hunch that a "Development" section could be helpful. I'd hope to walk away from that section knowing:
- canaries are potentially one-to-many for applications
- I'm basically building a little canary lambda blueprint (zip file) that other terraform projects will utilize
Thank you -- this is exactly the kind of feedback I was hoping for: where the gaps are in the framing information, so that someone else can pick this up reasonably easily.
README updated. Curious to know if the new version has sufficient context.
## Overview

This is a CloudWatch Synthetic Canary application that is JavaScript/Node.js/Puppeteer-based. The following file structure is expected for an application like this:
I might propose, in the "Overview" section here, explaining what this canary does. I realize the first line basically explains this,
A generic-ish CloudWatch Synthetic Canary to health-check Lambda Function URLs using HTTP POST.
.... but I still think it could be helpful. Might be an opportunity to have language like,
"This canary can be used to test any deployed Lambda that has a Function URL. This is acheived by making an HTTP post request to the lambda's Function URL, with the URL and payload configured by env vars in that lambda's terraform project. Broadly speaking, a 200 response is all that's required to demonstrate the lambda is not idle."
Again, maybe this is touched on elsewhere? But I think it could be helpful in the first couple lines you encounter with the README.
README updated.
Commenting here on the lambda_function_url_checker's main index.js file.
As mentioned in our discussion, I wonder if it might be worth making this thing testable locally? Less about the synthetics + metrics integrations, and more about the Lambda Function URL requesting.
For example, the index.js file could pick up when WORKSPACE=local and then skip all the synthetics and metrics stuff.
I threw this at AI and got back an updated index.js that did just that, with what look like pretty minimal changes. I was able to confirm this for the Dev1 APT lambda:
Env vars in .env file:
NODE_ENV=test
WORKSPACE=local
CANARY_URL=<URL HERE>
CANARY_PAYLOAD={"action":"ping","challenge_secret":"<SECRET HERE>","verbose":"true"}

Invoked:

WORKSPACE=local node lambda_function_url_checker/index.js
# [dotenv@<version>] injecting env (3) from .env -- tip: 🔐 encrypt with Dotenvx: https://dotenvx.com
# Checking Lambda Function URL: <URL HERE> with payload {"action":"ping","challenge_secret":"<SECRET>","verbose":"true"}
# Received HTTP status code: 200
# Check completed in 154 ms

This might even lend itself to a Makefile command then, which is entirely driven by the env vars present (just like in the terraform / deployed context!).
Please consider this purely optional for the PR itself.
I gave this a shot with some simple conditionals. It feels slightly kludgy, but it was the most obvious path forward for me (you should have seen the over-engineered attempts that the LLMs were proposing...). And, there is now a `make local-exec` target to run it locally.
Why these changes are being introduced:

This should address most of the issues raised in the PR conversation and code comments.

How this addresses that need:

* Update the root README
* Update the lambda_function_url_checker README
* Update generate-matrix.sh with `chmod +x` to default it to executable
* Update index.js to allow for a local run (with some conditional logic to bypass Synthetics when running locally)
* Update the lambda_function_url_checker Makefile with a `local-exec` target that will run the app locally

Side effects of this change: None.
ghukill left a comment
Looks great! Thanks for the additions.
I was also able to run the `make local-exec` target for the function URL pinger canary.
Nice work!
* Creating a new canary is as simple as creating a root-level folder
* A single canary may perform checking for one or many applications; it provides a method but is agnostic to what application it's used to check
* Each canary in this repository (that is, each top-level folder) will have a few hard requirements:
  * A `.runtime` file in the folder that designates the expected Synthetics Canary runtime (see [AWS: Synthetics runtime versions](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Synthetics_Canaries_Library.html))
  * A `Makefile` with (at a minimum) `install`, `update`, and `test` targets (as these are expected by the root [Makefile](./Makefile))
  * A `zipmanifest.txt` file that designates how the ZIP will be built for loading into a Synthetics Canary (different Synthetic Canary runtimes have different requirements for the structure of the ZIP file)

## Deployment Overview

Deploying (or building) in this repository simply means copying a ZIP file of an application to a shared S3 bucket in AWS. That ZIP file is like a blueprint (or template) application that can be loaded into a Synthetic Canary built by Terraform to monitor an HTTP/HTTPS endpoint. It will be helpful to have both `zip` and `dev-publish` targets in the canary folder's `Makefile` so that it is easy for the developer to deploy the ZIP to Dev1 for testing in AWS.

The rest of the deployment process is handled by GitHub Actions, following the typical MITL GitHub flow:

* Opening a PR to `main` will deploy the ZIP to our dev AWS account (and create/update SSM Parameters)
* Merging a PR to `main` will deploy the ZIP to our stage AWS account (and create/update SSM Parameters)
* Tagging a release on `main` will deploy the ZIP to our prod AWS account (and create/update SSM Parameters)

## General Overview

The general idea of this repository is to de-couple the actual CloudWatch Synthetic Canary application code from the Terraform repositories that will deploy the Canary to monitor web services. Because the application code is decoupled from the Terraform repositories, we can write fewer, more generic applications (they are like blueprints or templates). We can customize the generic applications in each Terraform repository that builds the Canary infrastructure by using environment variables that can be set by Terraform. This requires that we export some critical information from this repository's deployment process to SSM Parameter Store so that the dependent Terraform repositories can read this information and build the infrastructure for the Canary correctly.
:chef's kiss:
Thanks for adding this! I feel like it's filling the gap I had on my first pass.
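To make the S3-plus-SSM hand-off described in the General Overview excerpt above a bit more concrete, a single per-folder deploy job boils down to roughly the following CLI calls. This is an illustrative sketch only, not the workflow's actual steps: the bucket name, object key layout, parameter names, and the `zip` Makefile target are assumptions.

```sh
# Illustrative sketch of one per-folder deploy job (names below are hypothetical).
FOLDER="lambda_function_url_checker"
BUCKET="example-shared-files-bucket"      # stand-in for the shared-files S3 bucket
KEY="canaries/${FOLDER}/${FOLDER}.zip"    # stand-in for the real object key layout
RUNTIME="$(cat "${FOLDER}/.runtime")"     # expected Synthetics runtime, from the .runtime file

# Build the ZIP per the folder's zipmanifest.txt (assumes a `zip` target in the folder's Makefile).
make --directory "${FOLDER}" zip

# Push the ZIP to the shared S3 bucket.
aws s3 cp "${FOLDER}/${FOLDER}.zip" "s3://${BUCKET}/${KEY}"

# Record the ZIP key and expected runtime for the consuming Terraform repositories to read.
aws ssm put-parameter --name "/canaries/${FOLDER}/zip-key" --value "${KEY}" --type String --overwrite
aws ssm put-parameter --name "/canaries/${FOLDER}/runtime" --value "${RUNTIME}" --type String --overwrite
```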
if (process.env.WORKSPACE === "local") {
  await _checkUrl(url, data);
} else {
  await synthetics.executeStep("check_lambda_function_url", async () => {
    await _checkUrl(url, data);
Looks good to me! This is feeling like all we need for a quick developer test locally.
Purpose and background context
The prior PR was just about getting a functional ZIP file into AWS for initial testing in CloudWatch Synthetic Canaries with lots of manual steps. This PR is all about automating the build process for our Dev environment (and prepping for the Stage and Prod deployment workflows).
There are NO changes to the JavaScript application itself; this is all about deployment.
Examples
If you would like to see this in action, there are two deployed Canaries in Dev1: one that is for testing, and one that is part of the APT infrastructure. In Dev1, navigate to CloudWatch and look for the Synthetic Canaries in the left navigation bar (CloudWatch Synthetic Canaries). The `js-check-lambda-function-url` Canary was built by the mitlib-tf-sandbox-examples repository as a PoC. The `apt-dev-function-url-check` Canary was built by the mitlib-tf-workloads-apt repository as the first real use of a Canary.

If you navigate to CloudWatch Metrics (All metrics) you will see the `CloudWatchSynthetics` custom namespace, and you can drill into that to see the metrics produced by the Canaries.
How can a reviewer manually see the effects of these changes?
* Run the `make` targets, both at the root of the repository and in the `lambda_function_url_checker` folder. All of the `make` commands should work with no errors.
* Run the `generate-matrix.sh` script locally to see that it generates the correct set of folders (just one) to pass to the 2nd job in the workflow. A possible command sequence is sketched below.
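One possible sequence for these manual checks, assuming the target names described elsewhere in this PR (`install` and `test` from the README excerpt, `local-exec` from the canary Makefile); the path to `generate-matrix.sh` is not shown in this thread, so run it from wherever it lives in the repository:

```sh
# From the repository root: exercise the root Makefile, which delegates to the per-canary Makefiles.
make install
make test

# From the canary folder: run the app locally, driven by the env vars in .env (see the example above).
cd lambda_function_url_checker
make local-exec
cd -

# Dry-run the matrix script locally; it should report a single changed folder for this branch.
bash generate-matrix.sh
```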
Includes new or updated dependencies?

NO
Changes expectations for external applications?
NO
What are the relevant tickets?