Flaky test: AzureEventHubsExtensionsTests.VerifyAzureEventHubsEmulatorResource times out under CI load #14823

@davidfowl

Description

Test

Aspire.Hosting.Azure.Tests.AzureEventHubsExtensionsTests.VerifyAzureEventHubsEmulatorResource(referenceHub: True, hubName: null)

Error

Polly.Timeout.TimeoutRejectedException : The operation didn't complete within the allowed timeout of '00:00:20'.
---- System.Threading.Tasks.TaskCanceledException : A task was canceled.

Build

https://github.com/dotnet/aspire/actions/runs/22538430952/job/65289833249

Analysis

The test times out waiting for the Event Hubs emulator health check to pass. CI heartbeat logs show that the machine was under extreme resource contention at the time:

  • CPU: ~96% sustained
  • VBCSCompiler: 170% CPU / 988 MB
  • bicep: ~100-120% CPU
  • 11-16 DCP processes running simultaneously (133-170% aggregate CPU)
  • Multiple node processes at 30-60% CPU each

The 20-second Polly timeout is not sufficient when the CI machine is this saturated. The emulator container was still starting up (Microsoft.Cloud.EventHub.Emulator.Host was visible at ~18-30% CPU), but the health check at /health did not return 200 within the timeout window.

This is a different root cause from the previously closed #6751, which was an AMQP entity-not-found error. This failure is purely a resource contention timeout.

Suggested Fix

Consider increasing the Polly timeout for this test, or quarantining the test if it continues to be flaky under CI load.
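
To make the first option concrete, here is a minimal sketch of a health-check wait with a configurable Polly timeout. It is not the code the test uses today: `HealthCheckWaiter`, `WaitForHealthyAsync`, `TimeoutForEnvironment`, the 1-second poll interval, and the 120-second CI value are all hypothetical and chosen for illustration. It assumes the Polly v8 package (`ResiliencePipelineBuilder`, `AddTimeout`) and relies on GitHub Actions setting the `CI` environment variable to `true`.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Polly;

// Hypothetical helper for illustration; not the Aspire test's actual code.
public static class HealthCheckWaiter
{
    // Assumed values: 20 seconds locally (the current limit), 120 seconds on CI.
    public static TimeSpan TimeoutForEnvironment() =>
        string.Equals(Environment.GetEnvironmentVariable("CI"), "true", StringComparison.OrdinalIgnoreCase)
            ? TimeSpan.FromSeconds(120)
            : TimeSpan.FromSeconds(20);

    // Polls healthUrl until it returns a success status code.
    // The whole polling loop is bounded by a single Polly timeout.
    public static async Task WaitForHealthyAsync(
        HttpClient client,
        Uri healthUrl,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        ResiliencePipeline pipeline = new ResiliencePipelineBuilder()
            .AddTimeout(timeout)
            .Build();

        await pipeline.ExecuteAsync(async ct =>
        {
            while (true)
            {
                try
                {
                    using var response = await client.GetAsync(healthUrl, ct);
                    if (response.IsSuccessStatusCode)
                    {
                        return;
                    }
                }
                catch (HttpRequestException)
                {
                    // The emulator is not accepting connections yet; keep polling.
                }

                // When the Polly timeout fires, ct is cancelled and this delay throws,
                // which Polly surfaces as a TimeoutRejectedException.
                await Task.Delay(TimeSpan.FromSeconds(1), ct);
            }
        }, cancellationToken);
    }
}
```

A call site would look like `await HealthCheckWaiter.WaitForHealthyAsync(client, healthUri, HealthCheckWaiter.TimeoutForEnvironment())`. Scaling the limit only on CI keeps local runs failing fast, though a longer timeout only masks the contention rather than removing it, so quarantining may still be needed if the failure recurs.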

Labels

  • area-integrations: Issues pertaining to Aspire Integrations packages
  • azure-eventhubs: Issues related to Azure Event Hubs integration
  • flaky-test