HankStat
Contributor

@HankStat HankStat commented Oct 2, 2025

Which issue(s) does the PR fix:

Fixes #403

Problem

The TestHealthCheck test fails intermittently on CI due to a timing race condition.

The test's final assertion expects the stream's health check endpoint to return Stream terminated\n immediately after calling strm.StopUnordered(). However, StopUnordered() is asynchronous, so the final status update may not have completed by the time the assertion runs.

In CI, the test can assert the status while the stream is in an intermediate state (e.g., input not connected\noutput not connected\n), leading to a failure.

Solution

This commit fixes the flakiness by using require.Eventually to assert the final state of the health check.
Instead of asserting immediately, the test now polls the health check endpoint every 100 milliseconds for up to 5 seconds. This gives the asynchronous shutdown enough time to complete and the stream enough time to transition to its final Stream terminated\n state.

Verification

The test now passes consistently on my local machine: it succeeded in all 100 runs of the loop below, which I used to try to reproduce the flaky behavior.

for i in {1..100}; do go test -v ./internal/stream -run '^TestHealthCheck$'; done

I also ran the following commands before submitting the PR:

make test
make lint
make fmt

Signed-off-by: Yong-Han Chen <[email protected]>
@HankStat HankStat marked this pull request as ready for review October 2, 2025 05:04
Collaborator

@gregfurman gregfurman left a comment

LGTM! Thanks for helping address this irritating flake!

@jem-davies jem-davies merged commit b573f89 into warpstreamlabs:main Oct 7, 2025
3 checks passed
Development

Successfully merging this pull request may close these issues.

Flaky Test: serverless/lambda::TestHealthCheck
