Rewrite FastAPI instrumentor middleware stack to be failsafe #3664

Open
wants to merge 13 commits into
base: main

@outergod (Contributor) commented Jul 30, 2025

Description

Change the way the FastAPI instrumentor deals with the FastAPI middleware stack so that exception handling code doesn't get executed twice but still has a valid OTEL context available. At the same time, make sure failures in instrumentor hooks cannot crash the service itself.

Fixes #3642 #3637

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Tested with the MRE from the linked issue, plus newly added unit tests.

Does This PR Require a Core Repo Change?

  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@outergod changed the title from "Fix/gh 3642 fastapi exceptions" to "Rewrite FastAPI instrumentor middleware stack to be failsafe" Jul 30, 2025
@outergod outergod requested a review from a team as a code owner July 30, 2025 11:00
try:
    func(*args, **kwargs)
except Exception:  # pylint: disable=W0718
    pass
Contributor

Better to record this exception on the span rather than silently swallowing it.

Contributor Author

@alexmojaki good point! Addressed in new commit.
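
A minimal sketch of what recording the failure on the span could look like, not the code from the PR's follow-up commit. `SpanLike` and `RecordingSpan` are hypothetical stand-ins so the snippet runs without OpenTelemetry installed; the only span method assumed is `record_exception`, which the real OpenTelemetry `Span` also provides.

```python
from functools import wraps
from typing import Any, Callable, List, Protocol


class SpanLike(Protocol):
    """The single span method this sketch relies on."""

    def record_exception(self, exception: BaseException) -> None: ...


class RecordingSpan:
    """Hypothetical stand-in span that remembers recorded exceptions."""

    def __init__(self) -> None:
        self.recorded: List[BaseException] = []

    def record_exception(self, exception: BaseException) -> None:
        self.recorded.append(exception)


def failsafe(func: Callable[..., Any]) -> Callable[..., None]:
    """Wrap a hook so its failures land on the span instead of propagating."""

    @wraps(func)
    def wrapper(span: SpanLike, *args: Any, **kwargs: Any) -> None:
        try:
            func(span, *args, **kwargs)
        except Exception as exc:  # pylint: disable=broad-exception-caught
            # Keep the request alive, but leave a trace of the hook failure.
            span.record_exception(exc)

    return wrapper


def broken_hook(span: SpanLike, scope: dict) -> None:
    raise ValueError("hook bug")


span = RecordingSpan()
failsafe(broken_hook)(span, {"type": "http"})  # does not raise
```

The hook failure stays visible in the trace while the request continues to be processed.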

Comment on lines 320 to 325
if isinstance(
    inner_server_error_middleware, ServerErrorMiddleware
):  # usually true
    outer_server_error_middleware = ServerErrorMiddleware(
        app=otel_middleware,
    )
Contributor

It's not clear to me why this is being removed or how this relates to the added tests. If you keep this, do the tests fail?

I'm guessing this is meant to fix #3637 but I don't see how this PR does or how that's being tested.

Contributor Author @outergod, Aug 14, 2025

It's not related to #3637 but you are correct, this should probably stay as-is. No effect on the tests either way. Addressed in commit.

Comment on lines 315 to 317
server_request_hook=failsafe(server_request_hook),
client_request_hook=failsafe(client_request_hook),
client_response_hook=failsafe(client_response_hook),
Contributor

This wrapping should be in OpenTelemetryMiddleware itself.

Contributor Author

Addressed in commit.
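
For illustration only, here is roughly what moving the wrapping into the middleware could look like. `SketchTracingMiddleware`, `SketchSpan`, and `_failsafe` are hypothetical names; this is not the real `OpenTelemetryMiddleware`, only a toy ASGI middleware that wraps its hook once in `__init__` so every caller gets the protection without wrapping hooks themselves.

```python
import asyncio
from typing import Any, Callable, List, Optional


class SketchSpan:
    """Hypothetical stand-in span that remembers recorded exceptions."""

    def __init__(self) -> None:
        self.recorded: List[BaseException] = []

    def record_exception(self, exception: BaseException) -> None:
        self.recorded.append(exception)


def _failsafe(hook: Optional[Callable[..., Any]]) -> Optional[Callable[..., None]]:
    if hook is None:
        return None

    def wrapper(span: SketchSpan, *args: Any, **kwargs: Any) -> None:
        try:
            hook(span, *args, **kwargs)
        except Exception as exc:  # pylint: disable=broad-exception-caught
            span.record_exception(exc)

    return wrapper


class SketchTracingMiddleware:
    """Toy ASGI middleware: the hook is made failsafe here, not by the caller."""

    def __init__(self, app: Callable[..., Any], server_request_hook=None) -> None:
        self.app = app
        self.server_request_hook = _failsafe(server_request_hook)
        self.last_span: Optional[SketchSpan] = None  # kept only for inspection

    async def __call__(self, scope: dict, receive: Any, send: Any) -> None:
        span = SketchSpan()
        self.last_span = span
        if self.server_request_hook is not None:
            self.server_request_hook(span, scope)
        # The downstream app still runs even if the hook failed.
        await self.app(scope, receive, send)


async def downstream(scope: dict, receive: Any, send: Any) -> None:
    scope["reached"] = True


def bad_hook(span: SketchSpan, scope: dict) -> None:
    raise RuntimeError("boom")


middleware = SketchTracingMiddleware(downstream, server_request_hook=bad_hook)
scope: dict = {"type": "http"}
asyncio.run(middleware(scope, None, None))
```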

def failsafe(func):
    @wraps(func)
    def wrapper(span: Span, *args, **kwargs):
        if func is not None:
Contributor

Suggested change
if func is not None:
if func is not None:

This should be checked outside the wrapper.
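
A sketch of the check hoisted out of the wrapper, assuming the desired behavior is that an unset hook stays `None` so callers can skip it entirely. The `_StubSpan` class is a hypothetical stand-in, and recording the exception on the span is borrowed from the earlier review comment rather than taken from the PR.

```python
from functools import wraps
from typing import Any, Callable, List, Optional


class _StubSpan:
    """Hypothetical stand-in span for the usage lines below."""

    def __init__(self) -> None:
        self.recorded: List[BaseException] = []

    def record_exception(self, exception: BaseException) -> None:
        self.recorded.append(exception)


def failsafe(func: Optional[Callable[..., Any]]) -> Optional[Callable[..., None]]:
    # Decide once, at wrap time, instead of on every call.
    if func is None:
        return None

    @wraps(func)
    def wrapper(span: Any, *args: Any, **kwargs: Any) -> None:
        try:
            func(span, *args, **kwargs)
        except Exception as exc:  # pylint: disable=broad-exception-caught
            span.record_exception(exc)

    return wrapper


def noisy_hook(span: Any) -> None:
    raise KeyError("missing")


no_hook = failsafe(None)
stub = _StubSpan()
failsafe(noisy_hook)(stub)
```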

# In order to make traces available at any stage of the request
# processing - including exception handling - we wrap ourselves as
# the new, outermost middleware. However in order to prevent
# exceptions from user-provided hooks of tearing through, we wrap
Contributor

The wrapping is no longer here. I also don't think this comment is an improvement. It's now less clear than before that this approach ensures the correct http status code.

)
raise

app.add_middleware(ExceptionHandlerMiddleware)
Contributor

I don't actually know where this middleware is going to go; it'd be clearer if it were inserted in build_middleware_stack.
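
To make the placement question concrete, here is a simplified model of the layer ordering, with strings in place of real middleware. It assumes Starlette's usual behavior as I understand it: `add_middleware` inserts at the front of the user middleware list, and the built stack puts `ServerErrorMiddleware` outermost and `ExceptionMiddleware` innermost around the router. The order of the two `add_middleware` calls is illustrative, and the instrumentor's own patching of `build_middleware_stack` is not modeled, so the real position in this PR may differ.

```python
from typing import List

# Simplified model only: names instead of middleware instances.
user_middleware: List[str] = []


def add_middleware(name: str) -> None:
    # Assumed Starlette behavior: the most recently added middleware
    # becomes the outermost of the user-supplied ones.
    user_middleware.insert(0, name)


def build_order(layers: List[str]) -> List[str]:
    """Return layer names from outermost to innermost."""
    return ["ServerErrorMiddleware", *layers, "ExceptionMiddleware", "router"]


add_middleware("OpenTelemetryMiddleware")  # illustrative order of calls
add_middleware("ExceptionHandlerMiddleware")
order = build_order(user_middleware)
```

Under these assumptions the position depends on what else gets added later, which is the ambiguity the comment points at; inserting explicitly while building the stack would pin it down.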

def test_error_handler_context(self):
    """OTEL tracing contexts must be available during error handler execution"""

    @self.app.exception_handler(Exception)
Contributor

Please test if the exception and http status code are recorded on the span in this case.


self.assertEqual(len(spans), 3)
span = spans[2]
self.assertEqual(span.status.status_code, StatusCode.ERROR)
Contributor

Please also check the HTTP status code on the span.
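
A sketch of helpers such assertions could use. `FakeSpan` and `FakeEvent` are hypothetical stand-ins for a finished span, assuming only `attributes` and `events` with a `name`. The helper looks at both `http.response.status_code` and the older `http.status_code`, since which key is emitted depends on the semantic-convention mode in use, and it assumes `record_exception` adds an event named `exception`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeEvent:
    name: str


@dataclass
class FakeSpan:
    """Hypothetical stand-in for a finished span in a test."""

    attributes: Dict[str, Any] = field(default_factory=dict)
    events: List[FakeEvent] = field(default_factory=list)


def http_status_of(span: Any) -> Any:
    """Return the HTTP status attribute under either naming scheme."""
    attrs = span.attributes
    return attrs.get("http.response.status_code", attrs.get("http.status_code"))


def has_exception_event(span: Any) -> bool:
    """True if the span carries at least one recorded exception event."""
    return any(event.name == "exception" for event in span.events)


span = FakeSpan(
    attributes={"http.status_code": 500},
    events=[FakeEvent("exception")],
)
status = http_status_of(span)
recorded = has_exception_event(span)
```

In the test under review this would amount to something like `self.assertEqual(http_status_of(span), 500)` and `self.assertTrue(has_exception_event(span))`, where 500 is only an example value.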

Development

Successfully merging this pull request may close these issues.

FastAPI instrumentation: errors in hooks not handled properly
2 participants