grpc: introduce ErrRetriesExhausted to wrap retry failures by arjan-bal · Pull Request #8894 · grpc/grpc-go

arjan-bal · 2026-02-10T09:17:19Z

This change wraps RPC errors when the maximum retry limit is reached. It introduces a custom error type that wraps the number of attempts made alongside the original error.

RELEASE NOTES:

grpc: Add ErrRetriesExhausted error type that wraps retry failures.

arjan-bal · 2026-02-10T09:38:24Z

/gemini review

codecov · 2026-02-10T09:38:47Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.29%. Comparing base (d7b3f93) to head (5cb347e).
⚠️ Report is 6 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8894      +/-   ##
==========================================
- Coverage   83.13%   80.29%   -2.85%     
==========================================
  Files         414      416       +2     
  Lines       32821    33523     +702     
==========================================
- Hits        27286    26917     -369     
- Misses       4098     4677     +579     
- Partials     1437     1929     +492

| Files with missing lines | Coverage Δ |
|---|---|
| stream.go | `61.77% <100.00%> (-20.11%)` ⬇️ |

... and 36 files with indirect coverage changes

gemini-code-assist

Code Review

This pull request introduces ErrRetriesExhausted to provide more context when an RPC fails after exhausting all retry attempts. The idea is good, as it improves error diagnostics for clients.

However, the implementation of error wrapping in stream.go uses fmt.Errorf with two %w verbs, which is invalid in Go and will lead to a compile error. I've left a critical comment with details on this issue and suggestions for a fix. The core challenge is to create an error that is identifiable as ErrRetriesExhausted and also preserves the original gRPC status, which requires careful error wrapping.

The test changes are good, especially the switch to errors.Is and the new test case for ErrRetriesExhausted which helped identify the design issue.

gemini-code-assist · 2026-02-10T09:40:41Z

stream.go

+		return false, fmt.Errorf("stopped after %d attempts: %w: %w",
+			cs.numRetries+1,
+			ErrRetriesExhausted,
+			err,
+		)


Using two %w verbs in fmt.Errorf is invalid and will cause a compilation error with standard Go compilers. The Go documentation states: 'It is also invalid to use %w multiple times in a single Errorf call.'

Beyond the compilation error, there's a design challenge. The tests require that the resulting error e satisfies both errors.Is(e, ErrRetriesExhausted) and that status.Code(e) returns the original error's code. This is difficult to achieve when ErrRetriesExhausted is a simple error from errors.New because it cannot wrap the original error.

To satisfy both requirements, a solution would be to use a custom error type that wraps the original err and also implements an Is method to identify as ErrRetriesExhausted.

For example:

// unexported custom error type
type exhaustedError struct {
	err      error
	attempts int
}

func (e *exhaustedError) Error() string {
	return fmt.Sprintf("stopped after %d attempts: %v: %v", e.attempts, ErrRetriesExhausted, e.err)
}

func (e *exhaustedError) Unwrap() error { return e.err }

func (e *exhaustedError) Is(target error) bool { return target == ErrRetriesExhausted }

Then you could construct the error as &exhaustedError{err: err, attempts: cs.numRetries + 1}.

arjan-bal

> Using two %w verbs in fmt.Errorf is invalid and will cause a compilation error with standard Go compilers. The Go documentation states: 'It is also invalid to use %w multiple times in a single Errorf call.'

Older versions of Go restricted %w to a single use per call, but multiple %w verbs in fmt.Errorf have been valid since Go 1.20. The fmt documentation states:

> If there is more than one %w verb, the returned error will implement an Unwrap method returning a []error containing all the %w operands in the order they appear in the arguments.

As a result, the error e satisfies errors.Is(e, ErrRetriesExhausted), while status.Code(e) still returns the original error's code. Both behaviors are covered by tests.

arjan-bal

Addressed the comments. Also added an experimental notice on the new error type and test cases to verify the wrapped error isn't present when retries are disabled.

easwars · 2026-02-19T21:50:24Z

stream.go

 	RecvMsg(m any) error
 }

+// ErrRetriesExhausted is returned when an operation exceeds its configured


Nit: This is not just any arbitrary operation. This only applies to RPCs, right? Can we make that more explicit here, since this will be part of the API?

easwars · 2026-02-19T21:53:46Z

test/retry_test.go

+        "RetryableStatusCodes": [ "UNAVAILABLE" ]
+      }
+    }]}`),
+		grpc.WithDisableRetry()); err != nil {


I think this falls under go/go-style/decisions#indentation-confusion.

Can we have the dial options initialized in a separate slice and have them be passed here, so that the ss.Start(....) can be on a single line?

easwars · 2026-02-19T21:55:19Z

test/retry_test.go

+	}
+	_, err = stream.Recv()
+	if err == nil {
+		t.Fatalf("client: Recv() = <nil>, <nil>; want <nil>, error")


I know we have many error strings like this. But this should remain a relic of the past. This error message is not very readable. Something like "stream.Recv() succeeded when expected to fail" would be more readable. Here and elsewhere where this applies.

easwars · 2026-02-19T21:57:51Z

test/retry_test.go

+	}
+}
+
+func (s) TestRetryNotConfigured(t *testing.T) {


The test logic here and in the above test seems identical. The only difference is the dial options. Can we make it a table-driven test instead?

easwars · 2026-02-19T22:04:02Z

status/status_ext_test.go

+		&testpb.Empty{},
+	}
+	s := status.New(codes.Canceled, "inner canceled")
+	sWithDetails, err := s.WithDetails(details...)


Nit: We could just inline the details proto here in the call to s.WithDetails and get rid of the slice and the unpacking here.

easwars · 2026-02-19T22:07:15Z

status/status_ext_test.go

+			if got := st.Code(); got != tc.wantCode {
+				t.Errorf("st.Code() = %v; want %v", got, tc.wantCode)
+			}
+			if got := st.Message(); got != tc.wantMessage {
+				t.Errorf("st.Message() = %q; want %q", got, tc.wantMessage)
+			}
+			if got := len(st.Details()); got != tc.wantDetails {
+				t.Errorf("len(st.Details()) = %v; want %v", got, tc.wantDetails)
+			}


Should we instead have a wantStatus in the test table and perform a full struct comparison here instead of comparing the individual fields? go/go-style/decisions#compare-full-structures

arjan-bal added this to the 1.80 Release milestone Feb 10, 2026

arjan-bal added the Type: Feature New features or improvements in behavior label Feb 10, 2026

Add error type for retries exhausted

d016ce4

arjan-bal force-pushed the wrap-retry-error branch from f34ddff to d016ce4 Compare February 10, 2026 09:35

gemini-code-assist bot reviewed Feb 10, 2026

arjan-bal requested a review from easwars February 10, 2026 10:00

arjan-bal assigned easwars Feb 10, 2026

easwars reviewed Feb 11, 2026

arjan-bal self-assigned this Feb 12, 2026

easwars removed their assignment Feb 13, 2026

arjan-bal commented Feb 17, 2026

More tests, experimental notice

5cb347e

arjan-bal force-pushed the wrap-retry-error branch from e1b6cba to 5cb347e Compare February 17, 2026 09:22

arjan-bal assigned easwars and unassigned arjan-bal Feb 17, 2026

easwars reviewed Feb 19, 2026

easwars assigned arjan-bal and unassigned easwars Feb 19, 2026
