-
Notifications
You must be signed in to change notification settings - Fork 54
To fix the shutdown errors, we need to stop the tokio loop... #750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: gh/zdevito/62/base
Are you sure you want to change the base?
Conversation
I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) [ghstack-poisoned]
I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) ghstack-source-id: 300628159 Pull Request resolved: #750
This pull request was exported from Phabricator. Differential Revision: D79533010 |
…..." I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D79533010 |
…..." I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) [ghstack-poisoned]
Pull Request resolved: #750 I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However,` pyo3_async_runtime::tokio` makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime::tokio. This diff removes our use of pyo3_async_runtime::tokio. We instead replace with our own wrapper to get a python future which uses our get_tokio_runtime function. ghstack-source-id: 300750441 Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/)
This pull request was exported from Phabricator. Differential Revision: D79533010 |
…..." I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D79533010 |
…..." I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D79533010 |
…..." I am pretty sure our finalization errors are some combo of: 1. schedule some work that will hold or take GIL on a thread owned by tokio. This includes anything that holds a reference to a PyObject, because Drop on PyObject will take the GIL. 2. call Py_Finalize from the main thread 3. Python unloads 4. tokio thread tries to use GIL and crashes. We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop. To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything). However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown. This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the `test_proc_mesh_size` finalization issues. However, I have to avoid initializing pyo3_async_runtime. One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself. Differential Revision: [D79533010](https://our.internmc.facebook.com/intern/diff/D79533010/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D79533010 |
Stack from ghstack (oldest at bottom):
I am pretty sure our finalization errors are some combo of:
We also start at least one std::thread which runs python that will encounter this problem and not be on the tokio loop.
To avoid this, we need to shutdown the tokio loop (i.e. allow all tasks to reach an await, and then cancel everything).
However, pyo3_async_runtime makes this nearly impossible because it requires a 'static lifetime reference to the loop, so it cannot be shutdown.
This diff shows how to call shutdown on the tokio loop, and I observed it able to fix the
test_proc_mesh_size
finalization issues. However, I have to avoid initializing pyo3_async_runtime.One way to make this shippable would be to remove our usein of pyo3_async_runtime entirely. We primarily need it for future_into_py but we can just reimplement it directly ourself.
Differential Revision: D79533010