@gfmio gfmio commented Aug 31, 2023

This PR extends the list of HTTP status codes considered retryable.

  • Under high load, Flink Statefun can send requests so slowly that the server's read timeout cuts a request off mid-body. The server then commonly returns a 400 error, because the truncated body cannot be parsed as a protobuf message.
  • Behind a load balancer such as Envoy, 503 and 504 errors can occur frequently when the upstream service is unavailable, for example while it is being redeployed.
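
As a rough illustration of the check described above, here is a minimal sketch of a retryable-status lookup. The class and member names (`RetryableStatusCodes`, `CODES_NAMED_IN_THIS_PR`, `isRetryable`) are hypothetical and not taken from the Statefun code base. The set holds only the three codes named in this description; the codes that were already retryable are not listed here, so they are left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not the actual Statefun class.
final class RetryableStatusCodes {

    // Assumption: only the codes mentioned in the PR description.
    // The pre-existing retryable codes are not shown in this conversation.
    static final Set<Integer> CODES_NAMED_IN_THIS_PR =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(400, 503, 504)));

    private RetryableStatusCodes() {}

    // Returns true if a response with this status should be retried.
    static boolean isRetryable(int statusCode) {
        return CODES_NAMED_IN_THIS_PR.contains(statusCode);
    }
}
```

With this sketch, `RetryableStatusCodes.isRetryable(503)` is true and `RetryableStatusCodes.isRetryable(404)` is false.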

@gfmio gfmio marked this pull request as ready for review August 31, 2023 11:11

gfmio commented Aug 31, 2023

Question for the maintainers: should this remain a fixed list, or should we make it configurable as a comma-separated list in either the Flink configuration or the module.yaml?
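
To make the configurable option concrete, here is a minimal sketch of parsing such a comma-separated value into a set of status codes. Everything in it is an assumption for discussion: the class name `RetryableCodesParser`, the `parse` method, and the choice to reject values outside 100 to 599 are hypothetical, and no actual configuration key is implied.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical parser for illustration only; not part of Statefun.
final class RetryableCodesParser {

    private RetryableCodesParser() {}

    // Parses a value such as "400, 503,504" into an unmodifiable set.
    // Blank entries are skipped; non-numeric entries raise
    // NumberFormatException via Integer.parseInt.
    static Set<Integer> parse(String commaSeparated) {
        Set<Integer> codes = new LinkedHashSet<>();
        if (commaSeparated == null) {
            return Collections.unmodifiableSet(codes);
        }
        for (String token : commaSeparated.split(",")) {
            String trimmed = token.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int code = Integer.parseInt(trimmed);
            // Assumption: anything outside the HTTP status range is a config error.
            if (code < 100 || code > 599) {
                throw new IllegalArgumentException("Not an HTTP status code: " + code);
            }
            codes.add(code);
        }
        return Collections.unmodifiableSet(codes);
    }
}
```

A fixed list avoids this parsing and validation entirely, while a configurable list lets operators tune retries for their own load balancer without a code change.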
