What happened?
Description
When creating an HTTPProvider with an explicit session parameter, the provided session is only used by the thread/greenlet that created the provider. All other threads/greenlets create their own requests.Session instances, defeating the purpose of providing a shared session.
This is caused by the HTTPSessionManager.cache_and_return_session() method using threading.get_ident() as part of the cache key, even when a session is explicitly provided.
Environment
- Python version: 3.13
- web3.py version: 7.14.0
- OS: Linux
- Using: gevent
Steps to Reproduce
import threading
import requests
from web3 import Web3, HTTPProvider
from web3._utils.http_session_manager import HTTPSessionManager

# Create a shared session
shared_session = requests.Session()

# Create HTTPProvider with explicit session
provider = HTTPProvider("http://localhost:8545", session=shared_session)
w3 = Web3(provider)

# Simulate requests from different threads/greenlets
def make_request():
    # This internally calls HTTPSessionManager.cache_and_return_session()
    # WITHOUT passing the session parameter
    manager = provider._request_session_manager
    session = manager.cache_and_return_session("http://localhost:8545")
    print(f"Thread {threading.get_ident()}: Session is shared? {session is shared_session}")

# Run from main thread
make_request()  # True - uses the cached session

# Run from different threads
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(make_request) for _ in range(3)]
    concurrent.futures.wait(futures)
# All print False - each thread creates a NEW session!
With gevent (more impactful)
import threading
import gevent
from gevent import monkey
monkey.patch_all()

# Verify each greenlet gets unique thread ID
ids = set()

def get_id():
    ids.add(threading.get_ident())

greenlets = [gevent.spawn(get_id) for _ in range(100)]
gevent.joinall(greenlets)
print(f"100 greenlets produced {len(ids)} unique thread IDs")  # Prints: 100
Root Cause
In web3/_utils/http_session_manager.py, the cache_and_return_session method:
def cache_and_return_session(
    self,
    endpoint_uri: URI,
    session: requests.Session = None,  # Session parameter exists but...
    request_timeout: Optional[float] = None,
) -> requests.Session:
    # Cache key includes thread ID
    cache_key = generate_cache_key(f"{threading.get_ident()}:{endpoint_uri}")
    cached_session = self.session_cache.get_cache_entry(cache_key)
    if cached_session is not None:
        return cached_session
    if session is None:
        session = requests.Session()  # Creates NEW session if not in cache
    # ...
The problem is that when HTTPProvider._make_request() is called, it eventually calls get_response_from_post_request():
def get_response_from_post_request(
    self, endpoint_uri: URI, *args: Any, **kwargs: Any
) -> requests.Response:
    session = self.cache_and_return_session(
        endpoint_uri, request_timeout=kwargs["timeout"]
    )  # ⚠️ NO session parameter passed!
    return session.post(endpoint_uri, *args, **kwargs)
Since no session parameter is passed here, and the cache key is thread-specific, each thread/greenlet creates its own session.
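The mechanism can be reproduced without web3.py or a running node. The following is a minimal sketch, assuming a toy cache that mirrors only the thread-keyed lookup described above. `ToySessionManager` is a hypothetical stand-in, not web3.py's class, and plain `object()` instances stand in for `requests.Session`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToySessionManager:
    """Hypothetical stand-in for a thread-keyed session cache (not web3.py code)."""

    def __init__(self):
        self._cache = {}
        self._lock = threading.Lock()

    def cache_and_return_session(self, endpoint_uri, session=None):
        # The key combines the caller's thread ident with the endpoint.
        key = f"{threading.get_ident()}:{endpoint_uri}"
        with self._lock:
            cached = self._cache.get(key)
            if cached is not None:
                return cached
            if session is None:
                session = object()  # stands in for requests.Session()
            self._cache[key] = session
            return session


URI = "http://localhost:8545"
shared = object()
manager = ToySessionManager()

# The shared session is registered once, from the creating thread.
manager.cache_and_return_session(URI, session=shared)


def lookup():
    # Later lookups omit the session argument, as in the request path above.
    return manager.cache_and_return_session(URI) is shared


main_hit = lookup()  # same thread ident as the registration: cache hit
with ThreadPoolExecutor(max_workers=1) as pool:
    worker_hit = pool.submit(lookup).result()  # different ident: cache miss

print(main_hit, worker_hit)  # True False
```

The worker's lookup misses because its key differs from the one stored at registration, so it falls through to creating a fresh object.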
Expected Behavior
When a session is explicitly provided to HTTPProvider, it should be used for all requests regardless of which thread/greenlet makes them.
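As a rough illustration of this expectation, the sketch below uses a toy manager in which an explicitly provided session takes precedence over any per-thread cache. The class `ExplicitSessionManager` is hypothetical and exists only for this example; it is not web3.py's implementation, and `object()` stands in for `requests.Session`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ExplicitSessionManager:
    """Hypothetical toy model: an explicit session bypasses the per-thread cache."""

    def __init__(self, explicit_session=None):
        self._explicit_session = explicit_session
        self._cache = {}
        self._lock = threading.Lock()

    def cache_and_return_session(self, endpoint_uri):
        # An explicitly provided session is returned to every caller.
        if self._explicit_session is not None:
            return self._explicit_session
        # Otherwise fall back to one session per (thread, endpoint) pair.
        key = f"{threading.get_ident()}:{endpoint_uri}"
        with self._lock:
            if key not in self._cache:
                self._cache[key] = object()  # stands in for requests.Session()
            return self._cache[key]


shared = object()
manager = ExplicitSessionManager(explicit_session=shared)


def lookup(_):
    return manager.cache_and_return_session("http://localhost:8545") is shared


with ThreadPoolExecutor(max_workers=3) as pool:
    worker_results = list(pool.map(lookup, range(3)))

all_shared = lookup(None) and all(worker_results)
print(all_shared)  # True
```

With no explicit session, the same toy class keeps the per-thread behavior, so callers that never pass a session are unaffected.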
Actual Behavior
The provided session is only used by the thread/greenlet that created the HTTPProvider. All other threads/greenlets create new requests.Session instances.
Impact
This is a severe issue for applications using:
- gevent with Celery workers (common in production)
- Threading for concurrent requests
- asyncio with thread pools
In our case with Celery + gevent:
- Worker concurrency: 5000 greenlets
- Each greenlet gets a unique threading.get_ident()
- Results in up to 5000 requests.Session instances per worker
- Each session creates its own connection pool
- Leads to thousands of unclosed RPC connections
Workaround
Currently, the only workaround is to avoid using w3.eth methods for concurrent operations and instead make direct HTTP requests using the shared session, bypassing web3.py's provider layer.
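A minimal sketch of what such a direct call might look like, assuming a standard JSON-RPC 2.0 endpoint. The helper names `build_rpc_payload` and `rpc_call`, and the default timeout of 10 seconds, are illustrative choices and not part of web3.py. The session argument is expected to behave like `requests.Session`.

```python
import itertools
import threading

_request_ids = itertools.count(1)
_id_lock = threading.Lock()


def build_rpc_payload(method, params=None):
    """Build a JSON-RPC 2.0 request body with a process-unique id."""
    with _id_lock:
        request_id = next(_request_ids)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or [],
    }


def rpc_call(session, endpoint_uri, method, params=None, timeout=10):
    """POST one JSON-RPC call through the caller-supplied session."""
    response = session.post(
        endpoint_uri,
        json=build_rpc_payload(method, params),
        timeout=timeout,
    )
    response.raise_for_status()
    body = response.json()
    if "error" in body:
        raise RuntimeError(f"RPC error: {body['error']}")
    return body["result"]
```

Usage would be along the lines of `int(rpc_call(shared_session, "http://localhost:8545", "eth_blockNumber"), 16)`, called from any thread or greenlet. The trade-off is that this skips web3.py's middleware, result formatting, and retry handling.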
Code that produced the error
Identical to the script under Steps to Reproduce above.
Full error output
Thread 139907393236800: Session is shared? True
Thread 139907353547200: Session is shared? False
Thread 139907353547200: Session is shared? False
Thread 139907353547200: Session is shared? False
Fill this section in if you know how this could or should be fixed
Suggested Fix
Option 1: Store the explicitly provided session and always use it:
class HTTPProvider(JSONBaseProvider):
    def __init__(self, ..., session=None, ...):
        super().__init__(**kwargs)
        self._explicit_session = session  # Store the provided session
        self._request_session_manager = HTTPSessionManager(
            explicit_session=session  # Pass to manager
        )
        # ...
Option 2: Modify HTTPSessionManager to check for an explicit session:
class HTTPSessionManager:
    def __init__(self, ..., explicit_session=None):
        self._explicit_session = explicit_session
        # ...

    def cache_and_return_session(self, endpoint_uri, session=None, ...):
        # If explicit session was provided at init, always use it
        if self._explicit_session is not None:
            return self._explicit_session
        # Otherwise, use thread-based caching (current behavior)
        # ...
web3 Version
7.14.0
Python Version
3.13
Operating System
linux
Output from pip freeze
aiohappyeyeballs==2.6.1
aiohttp==3.13.2
aiosignal==1.4.0
amqp==5.3.1
annotated-types==0.7.0
asgiref==3.11.0
asttokens==3.0.1
attrs==25.4.0
billiard==4.2.4
bitarray==3.8.0
boto3==1.42.4
botocore==1.42.9
cached-property==2.0.1
cachetools==6.2.2
celery==5.5.3
certifi==2025.11.12
cfgv==3.5.0
charset-normalizer==3.4.4
ckzg==2.1.5
click==8.3.1
click-didyoumean==0.3.1
click-plugins==1.1.1.2
click-repl==0.3.0
coverage==7.11.0
cron_descriptor==2.0.6
cytoolz==1.1.0
debugpy==1.8.18
decorator==5.2.1
distlib==0.4.0
Django==5.2.9
django-appconf==1.2.0
django-cache-memoize==0.2.1
django-celery-beat==2.8.1
django-cors-headers==4.9.0
django-debug-toolbar==6.1.0
django-debug-toolbar-force==0.2
django-environ==0.12.0
django-extensions==4.1
django-filter==25.2
django-imagekit==6.0.0
django-model-utils==5.0.0
django-nine==0.2.7
django-redis==6.0.0
django-s3-storage==0.15.0
django-stubs==5.2.7
django-stubs-ext==5.2.8
django-test-migrations==1.5.0
django-timezone-field==7.1
djangorestframework==3.16.1
djangorestframework-camel-case==1.4.2
docker==7.1.0
docutils==0.22.3
drf-spectacular==0.29.0
eth-account==0.13.7
eth-bloom==3.1.0
eth-hash==0.7.1
eth-keyfile==0.8.1
eth-keys==0.7.0
eth-rlp==2.2.0
eth-typing==5.2.1
eth-utils==5.3.1
eth_abi==5.2.0
executing==2.2.1
factory_boy==3.3.3
Faker==37.12.0
filelock==3.20.0
flower==2.0.1
frozenlist==1.8.0
gevent==25.9.1
greenlet==3.3.0
gunicorn==23.0.0
hexbytes==1.3.1
hiredis==3.3.0
humanize==4.14.0
identify==2.6.15
idna==3.11
inflection==0.5.1
iniconfig==2.3.0
ipdb==0.13.13
ipython==9.8.0
ipython_pygments_lexers==1.1.1
jedi==0.19.2
jmespath==1.0.1
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
kombu==5.5.4
lru-dict==1.4.1
matplotlib-inline==0.2.1
multidict==6.7.0
mypy==1.18.2
mypy_extensions==1.1.0
nodeenv==1.9.1
packaging==25.0
parsimonious==0.10.0
parso==0.8.5
pathspec==0.12.1
pexpect==4.9.0
pika==1.3.2
pilkit==3.0
pillow==11.3.0
platformdirs==4.5.1
pluggy==1.6.0
pre_commit==4.5.0
prometheus_client==0.23.1
prompt_toolkit==3.0.52
propcache==0.4.1
psutil==7.1.3
psycopg==3.2.13
psycopg-binary==3.2.13
psycopg-pool==3.3.0
ptyprocess==0.7.0
pure_eval==0.2.3
py-ecc==8.0.0
py-evm==0.12.1b1
pycryptodome==3.23.0
pydantic==2.12.5
pydantic_core==2.41.5
Pygments==2.19.2
pytest==9.0.2
pytest-celery==1.2.1
pytest-django==4.11.1
pytest-env==1.2.0
pytest-rerunfailures==16.1
pytest-sugar==1.1.1
pytest_docker_tools==3.1.9
python-crontab==3.3.0
python-dateutil==2.9.0.post0
pytz==2025.2
pyunormalize==17.0.0
PyYAML==6.0.3
redis==7.1.0
referencing==0.37.0
regex==2025.11.3
requests==2.32.5
rlp==4.1.0
rpds-py==0.30.0
ruff==0.14.9
s3transfer==0.16.0
safe-eth-py==7.14.0
safe-pysha3==1.0.5
setuptools==80.9.0
six==1.17.0
sortedcontainers==2.4.0
sqlparse==0.5.4
stack-data==0.6.3
tenacity==9.1.2
termcolor==3.2.0
toolz==1.1.0
tornado==6.5.3
traitlets==5.14.3
trie==3.1.0
types-PyYAML==6.0.12.20250915
types-requests==2.32.4.20250913
typing-inspection==0.4.2
typing_extensions==4.15.0
tzdata==2025.3
uritemplate==4.2.0
urllib3==2.6.2
vine==5.1.0
virtualenv==20.35.4
wcwidth==0.2.14
web3==7.14.0
websockets==15.0.1
yarl==1.22.0
zope.event==6.1
zope.interface==8.1.1