Some of the unit tests appear to be non-deterministic. See, for example, the smoke test runs here: https://github.com/lee-group-cmu/FlexCode/actions/workflows/smoke-test.yml
Nothing has changed in the code, yet the unit tests do not pass reliably from run to run, so the workflow sends false alerts.
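
The cause is not identified here. One common source of this kind of flakiness is test data drawn from an unseeded random number generator, so as a possible direction (an assumption, not a diagnosis), here is a minimal sketch of pinning the seed in a test. `make_sample` and `test_sample_is_reproducible` are hypothetical names for illustration and are not part of FlexCode.

```python
import random


def make_sample(n, seed=None):
    """Hypothetical helper: draw n pseudo-random floats in [0, 1)."""
    # A local generator keeps the seed from leaking into global state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def test_sample_is_reproducible():
    # With a fixed seed, repeated calls return identical data, so any
    # assertion built on that data behaves the same on every CI run.
    assert make_sample(5, seed=42) == make_sample(5, seed=42)


test_sample_is_reproducible()
```

If the tests compare fitted estimates against thresholds, the same idea would apply to whatever generator produces the training data, assuming that generator accepts a seed.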