Detach socket on create_connection cancellation to prevent fd double-close#740
Detach socket on create_connection cancellation to prevent fd double-close#740junjzhang wants to merge 5 commits intoMagicStack:masterfrom
Conversation
When create_connection(sock=sock) or create_unix_connection(sock=sock) is cancelled or raises, tr._close() closes the fd via libuv but the Python socket object still believes it owns that fd number. Its __del__ later closes whatever fd the OS recycled into that slot, corrupting unrelated transports. Call sock.detach() on the error path so the Python socket sets its internal fd to -1, matching the semantics of standard asyncio where the transport always takes full ownership of the socket. Fixes MagicStack#738
|
@1st1 Please have a look! thx |
|
@fantix could you review? this seems like a critical issue under high concurrency |
|
Hey @junjzhang @6matt, perhaps one option for you would be to switch to aiofastnet? Not only it is faster, its networking source code is following python 3.14 asyncio implementation, which doesn't have this bug. And if there is any other bug, I can fix it and release much faster. I waited for more than a year to get some of my uvloop PRs merged, and they haven't been released yet. Just saying |
|
I think this is a duplicate of #646 |
- Move test_create_connection_sock_cancel_detaches to _TestTCP so it runs on both uvloop and asyncio - Add test_create_connection_sock_cancel_fd_leak that reproduces the full data leak chain: cancel → fd reuse → stale close → writev to wrong socket (see MagicStack#645, aio-libs/aiohttp#10506) - Fix ConnectionAbortedError in detach test server handler Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Good call! Though, #646 fixed it in a way that breaks asyncio compatibility. I think this PR is a better approach. Let me also add a theoretical test to reproduce #645, as well as aio-libs/aiohttp#10506. |
6d9acea to
9cc1d79
Compare
Mirror the TCP cancel/detach and data-leak tests for the create_unix_connection(sock=) path, covering the fix in both create_connection and create_unix_connection. Also fix server handlers to close writers (Python 3.12+ wait_closed() blocks until all connections are closed) and fix flake8 blank line issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
9cc1d79 to
74c24f1
Compare
Summary
Fixes #738.
When
create_connection(sock=sock)orcreate_unix_connection(sock=sock)is cancelled or raises,tr._close()closes the fd via libuv but the Python socket object still believes it owns that fd. Its__del__later closes whatever fd the OS recycled into that slot, silently corrupting an unrelated transport.This adds
sock.detach()on the error path of bothcreate_connectionandcreate_unix_connection, matching the semantics of standard asyncio where the transport always takes full ownership of the socket.Changes
loop.pyx: Callsock.detach()aftertr._close()in theexcept BaseExceptionblock for both TCP and Unixsock=pathstest_tcp.py: Addtest_create_connection_sock_cancel_detachesverifying thatsock.fileno() == -1after cancellation