Conversation

@cabutlermit cabutlermit commented Nov 3, 2025

Purpose and background context

The prior PR was just about getting a functional ZIP file into AWS for initial testing in CloudWatch Synthetic Canaries with lots of manual steps. This PR is all about automating the build process for our Dev environment (and prepping for the Stage and Prod deployment workflows).

There are NO changes to the JavaScript application itself; this is all about deployment.

Examples

If you would like to see this in action, there are two deployed Canaries in Dev1, one that is for testing and one that is part of the APT infrastructure. In Dev1, navigate to CloudWatch and look for the Synthetic Canaries in the left navigation bar (CloudWatch Synthetic Canaries). The js-check-lambda-function-url was built by the mitlib-tf-sandbox-examples repository as a PoC. The apt-dev-function-url-check was built by the mitlib-tf-workloads-apt repository as the first real use of a Canary.

If you navigate to CloudWatch Metrics (All metrics) you will see the CloudWatchSynthetics custom namespace and you can drill into that to see the metrics that are produced by the Canaries.

How can a reviewer manually see the effects of these changes?

  1. Check out the branch and work through the various make targets, both at the root of the repository and in the lambda_function_url_checker folder. All of the make commands should work with no errors.
  2. Following the instructions in the main README, test the generate-matrix.sh script locally to see that it will generate the correct set of folders (just one) to pass to the 2nd job in the workflow.
  3. Check the Actions tab on this repository and review the actions that have run for this branch.

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

Why these changes are being introduced:
Some testing was done in a separate feature branch to work out the
details of how we will handle automated deployment using GitHub Actions.
It turns out that a simpler repository folder structure will make this
a little easier.

How this addresses that need:
* Remove the `nodejs` subfolder from the lambda_function_url_checker
and flatten the files/folders slightly
* Update the test code to find all the dependencies correctly in the
new folder structure
* Update the package.json to reflect the new file/folder structure
* Create a .runtime file for the developer to designate the expected
runtime for the application in CloudWatch Synthetic Canary
* Create a Makefile in the application folder for application-specific
commands/targets
* Update the main Makefile to simply call subfolder Makefiles instead
* Clean up the CI workflow file slightly (addressing some typos) and
set the trigger for PRs to main (and manual triggers in the UI)
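
For illustration, the `.runtime` file is just a one-line designation that the deploy tooling can read. This is a hypothetical sketch; the runtime string below is only an example value, not necessarily what this repository pins:

```shell
# Hypothetical sketch: a .runtime file holds a single Synthetics runtime
# identifier. The value here is illustrative -- use whichever runtime the
# canary actually targets (see the AWS Synthetics runtime version list).
workdir="$(mktemp -d)"
echo "syn-nodejs-puppeteer-9.1" > "${workdir}/.runtime"
RUNTIME="$(cat "${workdir}/.runtime")"
echo "Designated runtime: ${RUNTIME}"
```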

Side effects of this change:
None.

Related Jira Tickets:
* https://mitlibraries.atlassian.net/browse/IN-1532
Why these changes are being introduced:
This is the first pass at a deployment workflow in GitHub Actions to
push the ZIP file to the S3 bucket and set SSM Parameter Store
parameters so that the Terraform repository using this function can
set the correct values.

How this is being implemented:
* Create a new dev-deploy.yml workflow
* Create a new generate-matrix.sh bash script to generate the matrix of
deployment jobs

More details on the dev-deploy workflow:
* Use the generate-matrix.sh script to generate the list of top level
folders that have changes
* For each folder list in the matrix, run a deployment job to zip the
application, push the ZIP to the shared-files S3 bucket, and update two
SSM Parameter Store parameters with the key to the ZIP and the expected
runtime in CloudWatch Synthetic Canaries
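
As a hedged sketch of that per-folder deployment job (not the actual workflow): the bucket name, ZIP key layout, and SSM parameter paths below are hypothetical placeholders, and the AWS CLI calls are echoed rather than executed so the sketch runs without credentials:

```shell
# Hypothetical sketch of the per-folder deploy step. The bucket, key
# layout, and SSM parameter names are placeholders, not the real ones.
FOLDER="lambda_function_url_checker"
RUNTIME="syn-nodejs-puppeteer-9.1"   # in reality, read from ${FOLDER}/.runtime
ZIP_KEY="canaries/${FOLDER}.zip"

# Echo the commands instead of running them, so this is safe to execute
# outside AWS.
echo "zip -r ${FOLDER}.zip ${FOLDER}/"
echo "aws s3 cp ${FOLDER}.zip s3://<shared-files-bucket>/${ZIP_KEY}"
echo "aws ssm put-parameter --name /canaries/${FOLDER}/zip-key --value ${ZIP_KEY} --overwrite"
echo "aws ssm put-parameter --name /canaries/${FOLDER}/runtime --value ${RUNTIME} --overwrite"
```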

More details on the generate-matrix.sh script:
* This was written with assistance from GitHub Copilot (but is mostly
an update to the matrix method that was first used by DLS in the
archived mitlib-terraform repository)
* It was moved to a shell script to allow for easier local testing to
ensure that it was actually doing what was expected without having to
run it multiple times in GitHub as an Action
* It creates a list of top-level folders that have changes (from a
git diff), accounting for the various ways the Action could be running
(first push on new branch, triggered by a PR, triggered manually in the
GitHub UI)
* It exports the list to the GITHUB_OUTPUT environment so that the
Action can use this information to build the matrix, running a job for
as many folders as have updated code
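
A simplified sketch of that idea (not the actual generate-matrix.sh, which also handles the different trigger contexts and reads from a real git diff):

```shell
# Sketch: reduce a list of changed file paths to the unique set of
# top-level folders and emit them as a JSON array for GITHUB_OUTPUT.
# The sample paths below are hypothetical stand-ins for a git diff.
changed_paths="lambda_function_url_checker/index.js
lambda_function_url_checker/package.json
.github/workflows/dev-deploy.yml"

# Keep only paths inside a folder, take the first path component,
# skip dotfolders like .github, and de-duplicate.
folders=$(printf '%s\n' "$changed_paths" | grep '/' | cut -d/ -f1 | grep -v '^\.' | sort -u)

# Build a JSON array by hand (the real script could also use jq).
matrix=$(printf '%s\n' $folders | sed 's/.*/"&"/' | paste -sd, -)
echo "matrix=[${matrix}]"
# In the workflow, that line would be appended to "$GITHUB_OUTPUT".
```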

Side effects:
None.

Related Jira tickets:
* https://mitlibraries.atlassian.net/browse/IN-1531
Why these changes are being introduced:
To make testing easier in the near future for the stage & prod
deployment workflows, we create stub workflows for now so that they
will exist in the `main` branch and allow for easier testing when we
finalize the steps.

How this addresses that need:
* Create a stage-deploy.yml workflow with just a stub set of steps
* Create a prod-deploy.yml workflow with just a stub set of steps

These two workflows will generate a matrix and also attempt to log in
to AWS with OIDC but will not create any ZIP files.

Side effects of this change:
None.
* Update the main README with more context and details for this repo
* Update the lambda_function_url_checker README
@cabutlermit cabutlermit requested a review from ghukill November 4, 2025 17:16
@cabutlermit cabutlermit marked this pull request as ready for review November 4, 2025 17:16
@cabutlermit

@ghukill I figured you might enjoy a little break from USE to take a look at this repository for building ZIP files for Synthetic Canaries. Might also be worth looking at the first PR in this repo that Jeremy reviewed (for a little historical perspective).

Hopefully the PR description is sufficient, but happy to have a detailed conversation about this here in the PR...

@ghukill ghukill self-assigned this Nov 6, 2025
@ghukill ghukill left a comment


Very near a full approval, but left a couple of README suggestions and a question about making the lambda URL callable locally for testing and development.

But I must say, really like the overall architecture. At least for me, it was critical to understand that each canary may cover multiple projects, and that "all" this repository does is define them and then build zip files via GitHub Actions. Nice work!

Reviewer comment:

I wonder if you could perform chmod +x on the file and git commit that? I had to do that locally to get the script to run and I see it performed here in the actions.

Author reply:

I wondered about that. I couldn't remember if the "execute" flag was passed properly into GHA (it probably is, but I should probably test this more thoroughly).

Author reply:

I did run chmod +x on the script, so I'll be curious to see if that is preserved through git push and git pull for starters. I'm not quite ready to remove that line from the workflow yet...

Reviewer comment:

After you provided an overview and context, I think this README reads really well. It's dense, but the information one needs is there.

If I had one suggestion -- something I think would have helped me make a critical leap of understanding -- it would be to add a "Development" section. In this section, I think you could touch on some things:

  • creating a new "canary" is as simple as creating a root level folder (repeating, but that's okay)
  • each canary may perform checking for one or many applications; it provides a method but is agnostic to what application it's used for
  • when this repository builds, we update any modified canary zip file(s) in S3, thereby providing updated builds for the actual projects' terraform.

You likely say all this in the README already, but I just have this hunch that a "Development" section could be helpful. I'd hope to walk away from that section knowing:

  1. canaries are potentially one-to-many for applications
  2. I'm basically building a little canary lambda blueprint (zip file) that other terraform projects will utilize

Author reply:

Thank you -- this is exactly the kind of feedback I was hoping for: where are the gaps in the framing information so that it will be easier for someone to pick this up reasonably easily.

Author reply:

README updated. Curious to know if the new version has sufficient context.

Comment on lines 5 to 7
## Overview

This is a CloudWatch Synthetic Canary application that is JavaScript/Node.js/Puppeteer-based. The following file structure is expected for an application like this:
Reviewer comment:

I might propose in the "Overview" section here, explaining what this canary does. I realize the first line basically explains this,

A generic-ish CloudWatch Synthetic Canary to health-check Lambda Function URLs using HTTP POST.

.... but I still think it could be helpful. Might be an opportunity to have language like,

"This canary can be used to test any deployed Lambda that has a Function URL. This is achieved by making an HTTP POST request to the lambda's Function URL, with the URL and payload configured by env vars in that lambda's terraform project. Broadly speaking, a 200 response is all that's required to demonstrate the lambda is not idle."

Again, maybe this is touched on elsewhere? But I think it could be helpful in the first couple lines you encounter with the README.

Author reply:

README updated.

Reviewer comment:

Commenting here on the lambda_function_url_checker's main index.js file.

As mentioned in our discussion, I wonder if it might be worth making this thing testable locally? Less about the synthetics + metrics integrations, and more about the Lambda Function URL requesting.

For example, the index.js file could pick up when WORKSPACE=local and then skip all the synthetics and metrics stuff.

I threw this at AI and got back an updated index.js that did just that, with what look like pretty minimal changes. I was able to confirm this for the Dev1 APT lambda:

Env vars in .env file:

NODE_ENV=test
WORKSPACE=local
CANARY_URL=<URL HERE>
CANARY_PAYLOAD={"action":"ping","challenge_secret":"<SECRET HERE>","verbose":"true"}

Invoked:

WORKSPACE=local node lambda_function_url_checker/index.js
# [dotenv] injecting env (3) from .env -- tip: 🔐 encrypt with Dotenvx: https://dotenvx.com
# Checking Lambda Function URL: <URL HERE> with payload {"action":"ping","challenge_secret":"<SECRET>","verbose":"true"}
# Received HTTP status code: 200
# Check completed in 154 ms

This might even lend itself to a Makefile command then, which is entirely driven by the env vars present (just like in terraform / deployed context!).

Please consider this purely optional for the PR itself.

Author reply:

I gave this a shot with some simple conditionals. It feels slightly kludgy, but it was the most obvious path forward to me (you should have seen the over-engineered attempts that the LLMs were proposing...). And, there is now a make local-exec to run it locally.

Why these changes are being introduced:
This should address most of the issues raised in the PR conversation
and code comments.

How this addresses that need:
* Update the root README
* Update the lambda_function_url_checker README
* Update the generate-matrix.sh with `chmod +x` to default it to
executable
* Update index.js to allow for a local run (with some conditional logic
to bypass Synthetics when running locally)
* Update the lambda_function_url_checker Makefile with a local-exec
target that will run the app locally

Side effects of this change:
None.
@cabutlermit cabutlermit requested a review from ghukill November 7, 2025 20:59
@ghukill ghukill left a comment


Looks great! Thanks for the additions.

I was also able to run the make local-exec for the function URL pinger canary.

Nice work!

Comment on lines +21 to +40
* Creating a new canary is as simple as creating a root level folder
* A single canary may perform checking for one or many applications; it provides a method but is agnostic to what application it's used to check
* Each canary in this repository (that is, each top-level folder) will have a few hard requirements:
* A `.runtime` file in the folder that designates the expected Synthetics Canary runtime (see [AWS: Synthetics runtime versions](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Synthetics_Canaries_Library.html))
* A `Makefile` with (at a minimum) `install`, `update`, and `test` targets (as these are expected by the root [Makefile](./Makefile))
* A `zipmanifest.txt` file that designates how the ZIP will be built for loading into a Synthetics Canary (different Synthetic Canary runtimes have different requirements for the structure of the ZIP file)

## Deployment Overview

Deploying (or building) in this repository simply means copying a ZIP file of an application to a shared S3 bucket in AWS. That ZIP file is like a blueprint (or template) application that can be loaded into a Synthetic Canary that is built by Terraform to monitor an HTTP/HTTPS endpoint. It will be helpful to have both `zip` and `dev-publish` targets in the canary folder `Makefile` so that it will be easy for the developer to deploy the ZIP to Dev1 for testing in AWS.

The rest of the deployment process is handled by GitHub Actions, following the MITL typical GitHub-flow:

* Opening a PR to `main` will deploy the ZIP to our dev AWS account (and create/update SSM Parameters)
* Merging a PR to `main` will deploy the ZIP to our stage AWS account (and create/update SSM Parameters)
* Tagging a release on `main` will deploy the ZIP to our prod AWS account (and create/update SSM Parameters)

## General Overview

The general idea of this repository is to de-couple the actual CloudWatch Synthetic Canary application code from the Terraform repositories that will deploy the Canary to monitor web services. Since the application code is decoupled from the Terraform repositories, it means that we can write fewer applications that are more generic (they are like blueprints or templates). We can customize the generic applications in each Terraform repository that builds the Canary infrastructure by using environment variables that can be set by Terraform. This requires that we export some critical information from the deployment process for this repository to SSM Parameter Store so that the dependent Terraform repositories can read this information to build the infrastructure for the Canary correctly.
Reviewer comment:

:chef's kiss:

Thanks for adding this! I feel like it's filling the gap I had on my first pass.

Comment on lines +90 to +94
if (process.env.WORKSPACE === "local") {
await _checkUrl(url, data);
} else {
await synthetics.executeStep("check_lambda_function_url", async () => {
await _checkUrl(url, data);
Reviewer comment:

Looks good to me! This is feeling like all we need for a quick developer test locally.

@cabutlermit cabutlermit merged commit 8fe0da8 into main Nov 10, 2025
3 checks passed