The GitHub runners will retry, but only at the whole-SDK level. This means that even if we hit a retryable error in a single test, we must retry the whole product, which wastes our credits, in particular the limited image generation ones. We can limit this by retrying within the test itself. The only con is potentially retrying something that is doomed to fail; I have tried to put in some smarts to prevent this.
In the tests we see 429 resource exhausted errors that come in two flavours:
"code": 429,
"message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",
and
"code": 429,
"message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. \n* Quota exceeded for metric: generativelanguage.googleapis.com/predict_requests_per_model_per_day_paid_tier_1, limit: 70, model: imagen-4.0-generate",
The exponential backoff should help with "Resource exhausted. Please try again later." but not with the quota exceeded one.
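To make the distinction concrete, here is a minimal sketch of an in-test retry with exponential backoff and jitter. The class name, method names, attempt count, and delays are all hypothetical and are not taken from the PR; the sketch also assumes the two 429 flavours can be told apart by the message text shown above.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: names and defaults are illustrative, not the PR's actual code.
public static class RetrySketch
{
    private static readonly Random Jitter = new Random();

    // A hard quota error will not clear up during a test run, so it is never retried.
    public static bool IsQuotaExhausted(Exception ex) =>
        !string.IsNullOrEmpty(ex.Message) &&
        ex.Message.Contains("exceeded your current quota");

    // The transient 429 asks the caller to try again later, so it is worth retrying.
    public static bool IsRetryable(Exception ex) =>
        !IsQuotaExhausted(ex) &&
        !string.IsNullOrEmpty(ex.Message) &&
        ex.Message.Contains("Resource exhausted");

    public static async Task<T> RunWithRetryAsync<T>(
        Func<Task<T>> operation, int maxAttempts = 4, int baseDelayMs = 1000)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && IsRetryable(ex))
            {
                // Exponential backoff: base, 2x base, 4x base, ... plus random jitter.
                int backoffMs = baseDelayMs * (1 << (attempt - 1));
                int jitterMs = Jitter.Next(0, baseDelayMs);
                await Task.Delay(backoffMs + jitterMs);
            }
        }
    }
}
```

A test would wrap its backend call, for example `await RetrySketch.RunWithRetryAsync(() => model.GenerateAsync(prompt))`, where `model.GenerateAsync` stands in for whatever call the test makes. Quota errors and the final failed attempt propagate unchanged, so the test still fails with the original exception.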
Summary of Changes
Hello @AustinBenoit, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a robust in-test retry mechanism for AI SDK tests. The primary goal is to improve the stability and efficiency of automated tests by intelligently handling transient service errors, such as rate limiting. Instead of relying on costly, full-SDK-level retries, individual tests now employ an exponential backoff strategy with jitter, specifically designed to reattempt operations that are likely to succeed on a subsequent try, while avoiding retries for hard quota limits. This approach conserves testing resources and provides more accurate test results by isolating transient failures.
Code Review
This pull request introduces retry logic for AI tests to handle transient errors from the backend service, which is a great improvement for the stability of the test suite. The implementation uses exponential backoff with jitter, which is a solid approach. The logic correctly distinguishes between different types of 429 errors to avoid retrying non-recoverable quota issues. My only suggestion is to make the detection of quota errors more robust by parsing the error response instead of relying on a simple string match in the exception message.
private bool IsQuotaExhausted(Exception ex)
{
    var msg = ex.Message;
    return !string.IsNullOrEmpty(msg) && msg.Contains("exceeded your current quota");
}
Relying on the exception message string makes this check fragile. If the backend error message for quota exhaustion changes, this logic will break.
A more robust approach would be to parse the error content from the exception. The HttpRequestException message contains the JSON error response from the service. You could extract and parse this JSON to check the error message field directly, which would be less susceptible to changes in the overall exception message format.
For example:
private bool IsQuotaExhausted(Exception ex)
{
    var msg = ex.Message;
    const string errorContentPrefix = "Error Content: ";
    int errorContentIndex = msg.IndexOf(errorContentPrefix, StringComparison.Ordinal);
    if (errorContentIndex != -1)
    {
        string errorJson = msg.Substring(errorContentIndex + errorContentPrefix.Length);
        try
        {
            if (Google.MiniJSON.Json.Deserialize(errorJson) is Dictionary<string, object> errorData &&
                errorData.TryGetValue("message", out var errorMessage) &&
                errorMessage is string s)
            {
                return s.Contains("exceeded your current quota");
            }
        }
        catch
        {
            // Fallback to string search on the whole message if JSON parsing fails.
        }
    }
    return !string.IsNullOrEmpty(msg) && msg.Contains("exceeded your current quota");
}
Since this is in test code, the current implementation might be acceptable for now, but this suggestion would improve robustness.
❌ Integration test FAILED
Requested by @firebase-workflow-trigger[bot] on commit edee4c9
        $"HTTP request failed with status code: {(int)response.StatusCode} ({response.ReasonPhrase}).\n" +
        $"Error Content: {errorContent}"
    );
    ex.Data["StatusCode"] = (int)response.StatusCode;
Instead of passing the status code along via ex.Data, it might be better to use one of the other constructors that takes in a StatusCode: https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httprequestexception.-ctor?view=net-10.0#system-net-http-httprequestexception-ctor(system-string-system-exception-system-nullable((system-net-httpstatuscode)))
Then below you can use the built-in StatusCode property of the exception.
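A rough sketch of that suggestion follows. It assumes the target runtime exposes the `HttpRequestException(string, Exception, HttpStatusCode?)` overload and the `StatusCode` property (.NET 5 or later), which may not hold for every build target of this SDK. The `HttpErrorSketch` class and its method names are hypothetical.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper; only HttpRequestException and its StatusCode property are real API.
public static class HttpErrorSketch
{
    // Attaches the status code through the constructor instead of ex.Data.
    public static HttpRequestException Create(
        HttpStatusCode status, string reason, string errorContent)
    {
        return new HttpRequestException(
            $"HTTP request failed with status code: {(int)status} ({reason}).\n" +
            $"Error Content: {errorContent}",
            null,      // no inner exception
            status);
    }

    // Reads the code back from the built-in property, with no dictionary lookup or cast.
    public static bool IsTooManyRequests(Exception ex) =>
        ex is HttpRequestException http &&
        http.StatusCode == HttpStatusCode.TooManyRequests;
}
```

With this shape the retry logic can check `StatusCode` for 429 first and only then inspect the message to separate the transient case from the quota case.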
Description
Testing
Tested this locally and saw some good recoveries that prevented the whole test run from failing.