Skip to content

Conversation

@nadavelkabets
Copy link
Contributor

@nadavelkabets nadavelkabets commented Jan 4, 2025

Addresses #1098.
Currently exceptions raised in user created tasks crash the executor. In my project I had to catch the exception and return it to allow communicating exceptions from the task to the main code.
This pull request mimics the behavior of asyncio, in which exceptions in tasks are ignored and logged upon garbage collection.
The logging part is already implemented, although this code is currently unreachable.
It was only necessary to add a line in the SingleThreadedExecutor to check if the task is related to an entity, since user created tasks are not bounded to any entity.

def __del__(self) -> None:
if self._exception is not None and not self._exception_fetched:
print(
'The following exception was never retrieved: ' + str(self._exception),
file=sys.stderr)

@fujitatomoya

@apockill
Copy link
Contributor

apockill commented Jan 5, 2025

yes thank you for this. I ran into this and didn't know enough about the Task system to know if this was expected behavior, so I ended up wrapping my function in an exception catcher, then re-raising it after getting the future.

Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but i would like to request other maintainers to review, test would be ideal.

CC: @sloretz @ahcorde

Copy link
Contributor

@sloretz sloretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

I took some time to understand #1098 better, and I misunderstood it before. I don't think ignoring the exception is the right fix. I think we need to somehow get the executor to get the task awaiting on the one that raised an exception to actually raise it.

@nadavelkabets
Copy link
Contributor Author

nadavelkabets commented Apr 16, 2025

I took some time to understand #1098 better, and I misunderstood it before. I don't think ignoring the exception is the right fix. I think we need to somehow get the executor to get the task awaiting on the one that raised an exception to actually raise it.

Hey @sloretz, sorry for taking the time...

We should consider the following cases when a task raises an exception:

Managed User-Created Tasks

For user-created tasks that are managed—either because another task is awaiting them or a done_callback is attached—the expected behavior (in line with asyncio) is for the exception to be raised in the awaiting task rather than crashing the executor. This is already implemented and to make this work we only need to remove the line in spin() that currently crashes the executor:

handler()
exception = handler.exception()
if exception is not None:
raise exception

Executor-Created Callback Tasks

For callback tasks created by the executor (in _make_handler), I believe we should maintain the current behavior of crashing the executor when an exception occurs. Since these tasks are internally managed by the executor and not exposed to the user for awaiting, it makes sense that an exception here should interrupt the program.
We can address this by implementing a mechanism similar to asyncio’s TaskGroup, where an executor task group manages these internal tasks and crashes the executor if an exception is raised.

Unmanaged User-Created Tasks

Both asyncio and rclpy insist that all tasks be managed in some form. The asyncio documentation for create_task() even advises:

Save a reference to the result of this function, to avoid a task disappearing mid-execution. The event loop only keeps weak references to tasks. A task that isn’t referenced elsewhere may get garbage collected at any time, even before it’s done.

In both rclpy and asyncio, if an unmanaged task is garbage collected while its exception is unfetched, the exception gets logged.
Although this logging can be unpredictable due to the timing of garbage collection in python, I think it’s acceptable as it reinforces that tasks should always be managed.

Moreover, it is very hard to understand at runtime if a task is managed. An awaiting task may be polled before the task that raises the exception does so in the same iteration, causing the exception to be caught only in the next iteration of the executor. Also, some users may prefer to deal with the exception in a done callback or by polling the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants