Fix tornado server (request) duration metric calculation #3679


Open · wants to merge 5 commits into main

Conversation

@devmonkey22 (Contributor) commented Aug 5, 2025

Description

Fix the tornado instrumentation to track a request's elapsed time in an async-safe way, so that concurrent requests each calculate their own elapsed time for the HTTP_SERVER_DURATION metric correctly. The instrumentation now tracks state (such as the request start time used to calculate the request duration) on the per-request handler instance, instead of on the server_histogram object, which is shared between requests.

Fixes #3486
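The race this fixes can be illustrated with a stdlib-only sketch (hypothetical names; this is not the actual instrumentation code): when the start time lives on an object shared between concurrent requests, a later request overwrites an earlier one's start time, shrinking the earlier request's measured duration.

```python
import asyncio
import time

# Hypothetical stand-in for state stored on a shared object (like the
# shared server_histogram): visible to every in-flight request.
shared_state = {}

async def handle_shared(delay, work, results, name):
    """Racy: records the start time on shared state."""
    await asyncio.sleep(delay)                # request arrives after `delay`
    shared_state["start"] = time.monotonic()  # overwrites any earlier request
    await asyncio.sleep(work)                 # simulated request handling
    results[name] = time.monotonic() - shared_state["start"]

async def handle_local(delay, work, results, name):
    """Async-safe: records the start time per request."""
    await asyncio.sleep(delay)
    start = time.monotonic()                  # local to this request
    await asyncio.sleep(work)
    results[name] = time.monotonic() - start

async def main():
    racy, safe = {}, {}
    # A slow request starts first (0.3 s of work); a fast request arrives
    # 0.1 s later and, in the racy version, resets the shared start time.
    await asyncio.gather(
        handle_shared(0.0, 0.3, racy, "slow"),
        handle_shared(0.1, 0.05, racy, "fast"),
    )
    await asyncio.gather(
        handle_local(0.0, 0.3, safe, "slow"),
        handle_local(0.1, 0.05, safe, "fast"),
    )
    return racy, safe

racy, safe = asyncio.run(main())
# racy["slow"] is ~0.2 s (truncated by the overwrite); safe["slow"] is ~0.3 s.
```

The per-request variant is what the fix does conceptually: the start time becomes local to each handler instance, so concurrent requests cannot clobber each other's measurements.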

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Added a tornado metrics test, test_metrics_concurrent_requests

Without the fix, the test run errors with:
[screenshot of the test failure]

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@devmonkey22 devmonkey22 changed the title [WIP] Fix tornado server (request) duration metric calculation Fix tornado server (request) duration metric calculation Aug 5, 2025
@tammy-baylis-swi (Contributor) commented:

Thank you for this fix. Please could you also add/update the tests?: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/74536f1a92a357c9a25fccba057ce5766d5a8f27/instrumentation/opentelemetry-instrumentation-tornado/tests/test_metrics_instrumentation.py

@devmonkey22 (Contributor, Author) commented:

> Thank you for this fix. Please could you also add/update the tests?: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/74536f1a92a357c9a25fccba057ce5766d5a8f27/instrumentation/opentelemetry-instrumentation-tornado/tests/test_metrics_instrumentation.py

@tammy-baylis-swi Can you think of a good way to test this race condition? I think we'd have to be able to start two concurrent fetches, where one hits prepare and stays running (in, say, a 1-second sleep), the 2nd request hits (updating start_time), and then both requests complete. Then we could test that the duration for the 1st request is similar to the client duration (i.e., not super short like the 2nd request's should be). Do you know of an example already? Otherwise, I was just going to rely on the existing tests to make sure duration is still calculated and similar to client duration (they just can't check that the race condition is fully fixed).

@tammy-baylis-swi (Contributor) commented:

> @tammy-baylis-swi Can you think of a good way to test this race condition? I think we'd have to be able start two concurrent fetches, where one hits prepare, stays running in a (maybe 1 second sleep), the 2nd request hits (updates start_time), then both requests can complete. Then we could test that the duration for the 1st request is similar to client duration (ie: not super short like 2nd request should be). Do you know of an example already? Otherwise, I was just going to rely on the existing tests to make sure duration is still calculated and similar to client duration (just can't check race condition is fully fixed).

I'm not actually familiar with Tornado but generally the checks for values resulting from concurrent fetches with predictable times would be helpful, if it's possible. Hmm. Would it make sense to make_app with any new routes as needed, then also use tornado.httpclient.AsyncHTTPClient to do two concurrent fetches and check resulting metrics in memory?

@devmonkey22 (Contributor, Author) commented:

> > @tammy-baylis-swi Can you think of a good way to test this race condition? I think we'd have to be able start two concurrent fetches, where one hits prepare, stays running in a (maybe 1 second sleep), the 2nd request hits (updates start_time), then both requests can complete. Then we could test that the duration for the 1st request is similar to client duration (ie: not super short like 2nd request should be). Do you know of an example already? Otherwise, I was just going to rely on the existing tests to make sure duration is still calculated and similar to client duration (just can't check race condition is fully fixed).
>
> I'm not actually familiar with Tornado but generally the checks for values resulting from concurrent fetches with predictable times would be helpful, if it's possible. Hmm. Would it make sense to make_app with any new routes as needed, then also use tornado.httpclient.AsyncHTTPClient to do two concurrent fetches and check resulting metrics in memory?

Thanks @tammy-baylis-swi I think I found the test approach. Just need to iron out CI test failures that didn't happen locally.
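The shape of such a concurrency test (illustrative names and timings; this is not the actual test code) can be sketched with stdlib asyncio: start a slow fetch, wait until it is inside its handler, start a fast fetch, then assert that the slow request's measured duration was not truncated by the fast one arriving mid-flight.

```python
import asyncio
import time

async def handler(path):
    # The "/slow" route sleeps long enough for a second request to arrive
    # while it is still in flight.
    await asyncio.sleep(0.3 if path == "/slow" else 0.05)

async def fetch(path, durations):
    """Simulated client fetch that records elapsed wall-clock time."""
    start = time.monotonic()
    await handler(path)
    durations[path] = time.monotonic() - start

async def run_scenario():
    durations = {}
    slow = asyncio.create_task(fetch("/slow", durations))
    await asyncio.sleep(0.1)  # ensure /slow is mid-handler before /fast starts
    fast = asyncio.create_task(fetch("/fast", durations))
    await asyncio.gather(slow, fast)
    return durations

durations = asyncio.run(run_scenario())
# With the fix, /slow's recorded duration should reflect its own elapsed
# time (~0.3 s) even though /fast completed in the middle of it.
```

In the real test the two fetches would go through tornado's client against an instrumented app, and the assertions would read the HTTP_SERVER_DURATION data points from the in-memory metric reader instead of the `durations` dict used here.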

@devmonkey22 (Contributor, Author) commented:

Should be ready for review again. Tests are passing.

@tammy-baylis-swi (Contributor) left a review comment:

Nice, thank you for the commented new tests and updated OP. This lgtm; the Maintainers will also have to have a look.

Development

Successfully merging this pull request may close these issues.

Tornado instrumentation HTTP_SERVER_DURATION metric is inaccurate and not async-safe
3 participants