grpc: introduce ErrRetriesExhausted to wrap retry failures#8894
grpc: introduce ErrRetriesExhausted to wrap retry failures#8894arjan-bal wants to merge 2 commits intogrpc:masterfrom
Conversation
f34ddff to
d016ce4
Compare
|
/gemini review |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #8894 +/- ##
==========================================
- Coverage 83.13% 80.29% -2.85%
==========================================
Files 414 416 +2
Lines 32821 33523 +702
==========================================
- Hits 27286 26917 -369
- Misses 4098 4677 +579
- Partials 1437 1929 +492
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request introduces ErrRetriesExhausted to provide more context when an RPC fails after exhausting all retry attempts. The idea is good, as it improves error diagnostics for clients.
However, the implementation of error wrapping in stream.go uses fmt.Errorf with two %w verbs, which is invalid in Go and will lead to a compile error. I've left a critical comment with details on this issue and suggestions for a fix. The core challenge is to create an error that both is identifiable as ErrRetriesExhausted and preserves the original gRPC status, which requires careful error wrapping.
The test changes are good, especially the switch to errors.Is and the new test case for ErrRetriesExhausted which helped identify the design issue.
| return false, fmt.Errorf("stopped after %d attempts: %w: %w", | ||
| cs.numRetries+1, | ||
| ErrRetriesExhausted, | ||
| err, | ||
| ) |
There was a problem hiding this comment.
Using two %w verbs in fmt.Errorf is invalid and will cause a compilation error with standard Go compilers. The Go documentation states: 'It is also invalid to use %w multiple times in a single Errorf call.'
Beyond the compilation error, there's a design challenge. The tests require that the resulting error e satisfies both errors.Is(e, ErrRetriesExhausted) and that status.Code(e) returns the original error's code. This is difficult to achieve when ErrRetriesExhausted is a simple error from errors.New because it cannot wrap the original error.
To satisfy both requirements, a solution would be to use a custom error type that wraps the original err and also implements an Is method to identify as ErrRetriesExhausted.
For example:
// unexported custom error type
type exhaustedError struct {
err error
attempts int
}
func (e *exhaustedError) Error() string {
return fmt.Sprintf("stopped after %d attempts: %v: %v", e.attempts, ErrRetriesExhausted, e.err)
}
func (e *exhaustedError) Unwrap() error {
return e.err
}
func (e *exhaustedError) Is(target error) bool {
return target == ErrRetriesExhausted
}Then you could construct the error as &exhaustedError{err: err, attempts: cs.numRetries + 1}.
There was a problem hiding this comment.
Using two %w verbs in fmt.Errorf is invalid and will cause a compilation error with standard Go compilers. The Go documentation states: 'It is also invalid to use %w multiple times in a single Errorf call.'
While older versions of Go restricted the use of %w to a single instance, using multiple %w verbs in fmt.Errorf is valid since Go 1.20. Here is what the doc states:
If there is more than one %w verb, the returned error will implement an Unwrap method returning a []error containing all the %w operands in the order they appear in the arguments.
As a result, the error e satisfies errors.Is(e, ErrRetriesExhausted), while status.Code(e) correctly returns the original error's code. Both behaviors are covered by tests.
arjan-bal
left a comment
There was a problem hiding this comment.
Addressed the comments. Also added an experimental notice on the new error type and tests cases to verify the wrapped error isn't present when retries are disabled.
e1b6cba to
5cb347e
Compare
| RecvMsg(m any) error | ||
| } | ||
|
|
||
| // ErrRetriesExhausted is returned when an operation exceeds its configured |
There was a problem hiding this comment.
Nit: This is not any arbitrary operation. This only applies to RPCs, right. Can we make that more explicit here since this will be part of the API.
| "RetryableStatusCodes": [ "UNAVAILABLE" ] | ||
| } | ||
| }]}`), | ||
| grpc.WithDisableRetry()); err != nil { |
There was a problem hiding this comment.
I think this falls under go/go-style/decisions#indentation-confusion.
Can we have the dial options initialized in a separate slice and have them be passed here, so that the ss.Start(....) can be on a single line?
| } | ||
| _, err = stream.Recv() | ||
| if err == nil { | ||
| t.Fatalf("client: Recv() = <nil>, <nil>; want <nil>, error") |
There was a problem hiding this comment.
I know we have many error strings like this. But this should remain a relic of the past. This error message is not very readable. Something like "stream.Recv() succeeded when expected to fail" would be more readable. Here and elsewhere where this applies.
| } | ||
| } | ||
|
|
||
| func (s) TestRetryNotConfigured(t *testing.T) { |
There was a problem hiding this comment.
The test logic here and in the above test seems identical. The only difference is the dial options. Can we make it a table driven test instead?
| &testpb.Empty{}, | ||
| } | ||
| s := status.New(codes.Canceled, "inner canceled") | ||
| sWithDetails, err := s.WithDetails(details...) |
There was a problem hiding this comment.
Nit: We could just inline the details proto here in the call to s.WithDetails and get rid of the slice and the unpacking here.
| if got := st.Code(); got != tc.wantCode { | ||
| t.Errorf("st.Code() = %v; want %v", got, tc.wantCode) | ||
| } | ||
| if got := st.Message(); got != tc.wantMessage { | ||
| t.Errorf("st.Message() = %q; want %q", got, tc.wantMessage) | ||
| } | ||
| if got := len(st.Details()); got != tc.wantDetails { | ||
| t.Errorf("len(st.Details()) = %v; want %v", got, tc.wantDetails) | ||
| } |
There was a problem hiding this comment.
Should we instead have a wantStatus in the test table and perform a full struct comparison here instead of comparing the individual fields? go/go-style/decisions#compare-full-structures
Fixes: #7023
This change wraps RPC errors when the maximum retry limit is reached. It introduces a custom error type that wraps the number of attempts made alongside the original error.
RELEASE NOTES:
ErrRetriesExhaustederror type that wraps retry failures.