Contributor

@schnecle schnecle commented Nov 24, 2025

Description

  • Adds e2e tests with Gemini CLI for Crashlytics
  • Adds functionality to interact with memories
  • Adds expectation negation, using "dont" to keep readability high (e.g. run.dont.expectText)
  • Converts agent-evals into a CommonJS library so that we can pull in dependencies from core firebase-tools. This is demonstrated with the firebase_get_environment tool, which now exports a render function that mocks can reuse, preventing test drift.
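
To illustrate the last bullet, here is a minimal sketch of how a shared render function can keep a tool and its mocks in sync. The `EnvironmentConfig` shape, the output format, and the sample IDs are hypothetical and are not taken from firebase-tools; only the idea of building mock content from the tool's own render function comes from this PR.

```typescript
// Hypothetical config shape; the real firebase_get_environment config differs.
interface EnvironmentConfig {
  projectId: string;
  detectedAppIds: Record<string, string>;
}

// Single source of truth for the tool's text output.
// The real tool and the test mocks would both call this function.
function renderTemplate(config: EnvironmentConfig): string {
  const apps = Object.entries(config.detectedAppIds)
    .map(([appId, bundleId]) => `- ${appId} (${bundleId})`)
    .join("\n");
  return `Project: ${config.projectId}\nDetected apps:\n${apps || "- none"}`;
}

// Illustrative base config for mocks (values are made up).
const BASE_ENVIRONMENT_CONFIG: EnvironmentConfig = {
  projectId: "demo-project",
  detectedAppIds: {},
};

// A mock built from the same render function: if the output format
// changes, the mock changes with it instead of drifting.
const mockWithIosApp = renderTemplate({
  ...BASE_ENVIRONMENT_CONFIG,
  detectedAppIds: { "1:123:ios:abc": "com.example.app" },
});

console.log(mockWithIosApp);
```

Because the mock text is generated rather than hand-copied, a change to the tool's output format cannot leave stale fixtures behind.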

Note: the bulk of the changes are in the template apps, which are small skeleton apps used to smoke test auto-detection for the most common cases. The content of those apps is largely irrelevant but accounts for many of the file changes.

Scenarios Tested

  • See scripts/agent-evals/tests/crashlytics/connect.spec.ts

Sample Commands

cd scripts/agent-evals
npm run test:dev

@schnecle
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request makes significant improvements by adding e2e tests for Crashlytics, introducing memory interaction and assertion negation for tests, and refactoring agent-evals to a CommonJS library. The use of a shared renderTemplate for mocks is a great change to prevent test drift. My review has identified a few issues. There are some incorrect paths in the test configuration (.mocharc.yml and package.json) and a critical path issue in the GeminiCliRunner that will likely cause tests to fail. I've also noted some debug console.log statements that should be removed and some copy-paste errors in the new template README files. Addressing these points will improve the robustness and clarity of the new testing infrastructure.

@schnecle schnecle force-pushed the schnecle/add-agent-evals branch from 8321bf4 to e3e7221 Compare November 24, 2025 20:53
@schnecle schnecle requested review from joehan and samedson November 24, 2025 20:53
@schnecle schnecle marked this pull request as ready for review November 24, 2025 20:59

export const getEnvironmentWithIosApp = {
  firebase_get_environment: toMockContent(
    renderTemplate({ ...BASE_ENVIRONMENT_CONFIG, detectedAppIds: { [IOS_APP_ID]: IOS_BUNDLE_ID } }),
Contributor

templates are 🔥

Contributor

@samedson samedson left a comment

Have some questions in there, but excited to get this in!

Contributor

@samedson samedson left a comment

LGTM! Just have that one nit on the dirs variable

@schnecle schnecle force-pushed the schnecle/add-agent-evals branch from 8872903 to c210cdc Compare November 25, 2025 20:32
Contributor

@joehan joehan left a comment

LGTM with some nits and small q's

await run.type("/crashlytics:connect");
await run.expectToolCalls(["firebase_get_environment"]);

await run.expectText("prioritize");
Contributor

Mostly for my own understanding - why do we look for the word 'prioritize' here?

Contributor Author

The end of the crashlytics:connect prompt tells the agent to ask the user which of the following actions they would like to take:

  1. Prioritize the most impactful stability issues
  2. Diagnose and propose a fix for a crash

I'm just making sure that it asks that question.
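
For readers unfamiliar with this style of check, below is a rough sketch of a text expectation with the `dont` negation mentioned in the PR description. `FakeRun` is a hypothetical stand-in, not the agent-evals runner: the real methods are awaited in the test above, and the case-insensitive match used here is an assumption.

```typescript
// Hypothetical stand-in for the agent-evals runner, for illustration only.
class FakeRun {
  constructor(
    private readonly output: string,
    private readonly negated = false,
  ) {}

  // Returns a view of the same output whose expectations are inverted.
  get dont(): FakeRun {
    return new FakeRun(this.output, !this.negated);
  }

  // Passes when the text is present (or absent, if negated).
  // Case-insensitive matching is an assumption of this sketch.
  expectText(text: string): void {
    const found = this.output.toLowerCase().includes(text.toLowerCase());
    if (found === this.negated) {
      const verb = this.negated ? "not to contain" : "to contain";
      throw new Error(`Expected agent output ${verb} "${text}"`);
    }
  }
}

// Sample output modeled on the two options quoted above.
const run = new FakeRun(
  "1. Prioritize the most impactful stability issues\n" +
    "2. Diagnose and propose a fix for a crash",
);

run.expectText("prioritize"); // passes: the question was asked
run.dont.expectText("login"); // passes: the text is absent
```

Matching on a single distinctive word keeps the check loose enough to tolerate rewording by the agent while still confirming that the follow-up question was asked.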

@schnecle schnecle merged commit 4f5d0e5 into master Nov 25, 2025
48 checks passed
@schnecle schnecle deleted the schnecle/add-agent-evals branch November 25, 2025 21:31
@github-project-automation github-project-automation bot moved this from Approved [PR] to Done in [Cloud] Extensions + Functions Nov 25, 2025
3 participants