Add in in-test retry for AI. #1388

Open
AustinBenoit wants to merge 3 commits into main from ailimitedretry

@AustinBenoit

The GitHub runners will retry, but only at the whole-SDK level. This means that even if we hit a retryable error in a single test, we must retry the whole product, which wastes our credits, in particular the limited image generation ones. We can limit this by retrying in the test itself. The only con is potentially retrying something that is doomed to fail; I have tried to put in some smarts to prevent this.

In the tests we see 429 errors that come in two flavours.

"code": 429,
"message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",

and

"code": 429,
"message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/predict_requests_per_model_per_day_paid_tier_1, limit: 70, model: imagen-4.0-generate",

The exponential backoff should help with "Resource exhausted. Please try again later." but not with the quota exceeded one.
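A minimal sketch of how the two 429 flavours could be told apart, assuming the status code and the error message are both available to the test. The class name `RetryClassifier` and the method signatures are hypothetical and chosen for illustration; the matched substring comes from the quota message quoted above.

```csharp
using System;
using System.Collections.Generic;

public static class RetryClassifier
{
  // Status codes treated as transient in this sketch (an assumed set).
  private static readonly HashSet<int> RetryableCodes = new HashSet<int> { 429, 503, 504 };

  // True when the message looks like the hard-quota flavour of 429.
  public static bool IsQuotaExhausted(string message)
  {
    return !string.IsNullOrEmpty(message) &&
           message.Contains("exceeded your current quota");
  }

  // Retry only transient codes, and never the quota flavour of 429.
  public static bool ShouldRetry(int statusCode, string message)
  {
    if (!RetryableCodes.Contains(statusCode)) return false;
    if (statusCode == 429 && IsQuotaExhausted(message)) return false;
    return true;
  }
}
```

With this split, a "Resource exhausted" 429 is retried, while a "You exceeded your current quota" 429 fails immediately instead of spending further requests against a daily limit.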

Description

Provide details of the change, and generalize the change in the PR title above.


Testing

Describe how you've tested these changes.

Tested this locally and saw some good recoveries that prevented the whole test run failing.


Type of Change

Place an x in the applicable box:

  • Bug fix. Add the issue # below if applicable.
  • New feature. A non-breaking change which adds functionality.
  • Other, such as a build process or documentation change.

@gemini-code-assist

Summary of Changes

Hello @AustinBenoit, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust in-test retry mechanism for AI SDK tests. The primary goal is to improve the stability and efficiency of automated tests by intelligently handling transient service errors, such as rate limiting. Instead of relying on costly, full-SDK-level retries, individual tests now employ an exponential backoff strategy with jitter, specifically designed to reattempt operations that are likely to succeed on a subsequent try, while avoiding retries for hard quota limits. This approach conserves testing resources and provides more accurate test results by isolating transient failures.

Highlights

  • Enhanced Test Reliability: Implemented in-test retry logic for AI SDK tests to handle transient errors like "429 Resource Exhausted" more gracefully, preventing entire test suite failures.
  • Optimized Resource Usage: By retrying individual tests rather than the whole SDK level, this change aims to reduce unnecessary resource consumption, especially for limited image generation credits.
  • Intelligent Retry Mechanism: Introduced logic to differentiate between retryable "429 Resource Exhausted" errors and non-retryable "429 Quota Exceeded" errors, applying exponential backoff only when appropriate.
  • Improved Error Context: Added the HTTP status code to exception data, providing more detailed information for error handling and retry decisions.

Changelog
  • firebaseai/src/Internal/HttpHelpers.cs
    • Added the HTTP status code to the Exception.Data dictionary when an HttpRequestException is thrown, allowing downstream code to inspect the status code.
  • firebaseai/testapp/Assets/Firebase/Sample/FirebaseAI/UIHandlerAutomated.cs
    • Defined a HashSet of RetryableCodes (429, 503, 504).
    • Introduced constants MaxRetries (5) and InitialRetryDelayMilliseconds (2000) for the retry mechanism.
    • Added IsQuotaExhausted helper to check if an exception message indicates a quota exhaustion.
    • Implemented ShouldRetry helper to determine if an exception warrants a retry, specifically excluding 429 Quota Exceeded errors.
    • Created RetryTestWithExponentialBackoff asynchronous method to wrap test actions, applying exponential backoff with random jitter for retryable errors.
    • Modified the test setup in Start() to wrap both multiBackendTests and singleTests with the new RetryTestWithExponentialBackoff logic.
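
The retry wrapper described in the changelog can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the class name `BackoffRetry`, the method names, the `maxJitterMs` parameter, and the doubling schedule are hypothetical, while the defaults of 5 retries and a 2000 ms initial delay mirror the constants listed above.

```csharp
using System;
using System.Threading.Tasks;

public static class BackoffRetry
{
  // Shared jitter source; System.Random is not thread-safe, which is
  // acceptable here on the assumption that tests run one at a time.
  private static readonly Random Jitter = new Random();

  // Delay before retry number `attempt` (0-based):
  // initialDelayMs * 2^attempt, plus 0..maxJitterMs of random jitter.
  public static int ComputeDelayMs(int attempt, int initialDelayMs, int maxJitterMs)
  {
    int backoff = initialDelayMs * (1 << attempt);
    return backoff + Jitter.Next(0, maxJitterMs + 1);
  }

  // Runs `action`, retrying up to `maxRetries` times while `shouldRetry`
  // accepts the exception. Any other failure propagates to the caller.
  public static async Task RunAsync(
      Func<Task> action, Func<Exception, bool> shouldRetry,
      int maxRetries = 5, int initialDelayMs = 2000, int maxJitterMs = 1000)
  {
    for (int attempt = 0; ; attempt++)
    {
      try
      {
        await action();
        return;
      }
      catch (Exception ex) when (attempt < maxRetries && shouldRetry(ex))
      {
        await Task.Delay(ComputeDelayMs(attempt, initialDelayMs, maxJitterMs));
      }
    }
  }
}
```

With the defaults, the base delays are 2 s, 4 s, 8 s, 16 s and 32 s, so a test that keeps failing gives up after roughly a minute of waiting rather than triggering a rerun of the whole product.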
Activity
  • The author tested the changes locally and observed successful recoveries, preventing entire test runs from failing.

@AustinBenoit AustinBenoit added the tests-requested: quick Trigger a quick set of integration tests. label Feb 4, 2026

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces retry logic for AI tests to handle transient errors from the backend service, which is a great improvement for the stability of the test suite. The implementation uses exponential backoff with jitter, which is a solid approach. The logic correctly distinguishes between different types of 429 errors to avoid retrying non-recoverable quota issues. My only suggestion is to make the detection of quota errors more robust by parsing the error response instead of relying on a simple string match in the exception message.

Comment on lines +62 to +66
private bool IsQuotaExhausted(Exception ex)
{
  var msg = ex.Message;
  return !string.IsNullOrEmpty(msg) && msg.Contains("exceeded your current quota");
}

medium

Relying on the exception message string makes this check fragile. If the backend error message for quota exhaustion changes, this logic will break.

A more robust approach would be to parse the error content from the exception. The HttpRequestException message contains the JSON error response from the service. You could extract and parse this JSON to check the error message field directly, which would be less susceptible to changes in the overall exception message format.

For example:

// Requires: using System; using System.Collections.Generic;
private bool IsQuotaExhausted(Exception ex)
{
  const string quotaMarker = "exceeded your current quota";
  const string errorContentPrefix = "Error Content: ";
  var msg = ex.Message;
  if (string.IsNullOrEmpty(msg)) return false;

  int errorContentIndex = msg.IndexOf(errorContentPrefix, StringComparison.Ordinal);
  if (errorContentIndex != -1)
  {
    string errorJson = msg.Substring(errorContentIndex + errorContentPrefix.Length);
    try
    {
      if (Google.MiniJSON.Json.Deserialize(errorJson) is Dictionary<string, object> errorData)
      {
        // The service may nest the details under an "error" object; unwrap it if present.
        if (errorData.TryGetValue("error", out var inner) &&
            inner is Dictionary<string, object> innerData)
        {
          errorData = innerData;
        }
        if (errorData.TryGetValue("message", out var errorMessage) &&
            errorMessage is string s)
        {
          return s.Contains(quotaMarker);
        }
      }
    }
    catch
    {
      // Fall back to a string search on the whole message if JSON parsing fails.
    }
  }
  return msg.Contains(quotaMarker);
}

Since this is in test code, the current implementation might be acceptable for now, but this suggestion would improve robustness.

@github-actions github-actions bot added tests: in-progress This PR's integration tests are in progress. and removed tests-requested: quick Trigger a quick set of integration tests. labels Feb 4, 2026
@github-actions
github-actions bot commented Feb 4, 2026

❌  Integration test FAILED

Requested by @firebase-workflow-trigger[bot] on commit edee4c9
Last updated: Wed Feb 4 13:44 PST 2026

Failures | Configs
D:\a\firebase-unity-sdk\firebase-unity-sdk\testapps\Unity2021.3.43f1-NET4.6\FirebaseAI\testapp | [TEST] [ERROR] [2021] [1/2 Build OS(s): windows] [1/6 Platform(s): Playmode] [1/3 Test Device(s): github_runner]
firestore | [TEST] [ERROR] [2021] [1/2 Build OS(s): macos] [1/6 Platform(s): 14] [1/3 Test Device(s): iOS]
messaging | [TEST] [FAILURE] [2021] [1/2 Build OS(s): windows] [1/6 Platform(s): Android] [1/3 Test Device(s): android_target]

$"HTTP request failed with status code: {(int)response.StatusCode} ({response.ReasonPhrase}).\n" +
$"Error Content: {errorContent}"
);
ex.Data["StatusCode"] = (int)response.StatusCode;

Instead of trying to pass it along via this, it might be better to use one of the other constructors that takes in a StatusCode: https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httprequestexception.-ctor?view=net-10.0#system-net-http-httprequestexception-ctor(system-string-system-exception-system-nullable((system-net-httpstatuscode)))

Then below you can use the built-in StatusCode property of the exception.
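
A rough sketch of what this suggestion could look like. The three-argument `HttpRequestException` constructor and the nullable `StatusCode` property are documented for .NET 5 and later; whether the Unity scripting runtime used by the testapp exposes them is an assumption that would need checking. The class and method names here are hypothetical.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class StatusCodeExample
{
  // Builds the exception with the status code attached via the constructor
  // instead of stashing it in Exception.Data.
  public static HttpRequestException BuildException(
      HttpStatusCode status, string reason, string errorContent)
  {
    return new HttpRequestException(
        $"HTTP request failed with status code: {(int)status} ({reason}).\n" +
        $"Error Content: {errorContent}",
        null,     // no inner exception
        status);
  }

  // Reads the built-in StatusCode property to decide whether the failure
  // is one of the transient codes (429, 503, 504).
  public static bool IsRetryableStatus(Exception ex)
  {
    return ex is HttpRequestException httpEx &&
           httpEx.StatusCode.HasValue &&
           (httpEx.StatusCode.Value == HttpStatusCode.TooManyRequests ||
            httpEx.StatusCode.Value == HttpStatusCode.ServiceUnavailable ||
            httpEx.StatusCode.Value == HttpStatusCode.GatewayTimeout);
  }
}
```

The typed property avoids the cast and key lookup that `ex.Data["StatusCode"]` requires, at the cost of depending on a newer framework API.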

@github-actions github-actions bot added the tests: failed This PR's integration tests failed. label Feb 4, 2026
@firebase-workflow-trigger firebase-workflow-trigger bot removed the tests: in-progress This PR's integration tests are in progress. label Feb 4, 2026