fix: make all LiteLLM exception classes pickle-safe #24198
jasonmatthewsuhari wants to merge 3 commits into BerriAI:main
Standard Python exception pickling calls cls(*self.args) to reconstruct the object. LiteLLM exceptions require several positional args (message, llm_provider, model, …) that are not stored in self.args, so the default protocol raises TypeError. Additionally, response/request attributes hold httpx.Response/httpx.Request instances, which are not picklable.

Add _LiteLLMPickleMixin with __reduce__ and __getstate__ that:
- Uses Exception.__new__(cls) + dict restoration to bypass __init__
- Serialises httpx.Response status codes separately and recreates minimal response/request objects on unpickle

Apply the mixin to all 19 exception classes that directly inherit from openai exceptions or Exception, covering all 28 exception types via inheritance.

Fixes BerriAI#24136
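The failure mode described in this commit message can be reproduced without LiteLLM. The sketch below uses a hypothetical stand-in class, `ProviderError` (not a real LiteLLM class), whose `__init__` takes three positional arguments but forwards only the message to `Exception.__init__`. Serialising works; the `TypeError` surfaces when the default protocol tries to rebuild the object from `self.args`.

```python
import pickle


class ProviderError(Exception):
    """Hypothetical stand-in for a LiteLLM-style exception (illustration only)."""

    def __init__(self, message, llm_provider, model):
        # Only `message` is forwarded, so self.args == (message,).
        super().__init__(message)
        self.message = message
        self.llm_provider = llm_provider
        self.model = model


err = ProviderError("rate limited", "openai", "gpt-4")
payload = pickle.dumps(err)  # serialising succeeds

try:
    # The default protocol calls ProviderError(*err.args), i.e. with one argument.
    pickle.loads(payload)
    unpickle_error = None
except TypeError as e:
    unpickle_error = e

print(type(unpickle_error).__name__, "-", unpickle_error)
```

The two missing positional arguments (`llm_provider`, `model`) live in `__dict__`, which the default protocol only restores after the constructor call has already failed.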
Greptile Summary
This PR fixes a long-standing serialization bug where all LiteLLM exception classes failed to pickle/unpickle, breaking Celery task queues, Sentry error tracking, and multiprocessing. The fix introduces a _LiteLLMPickleMixin (with __reduce__/__getstate__) and a module-level _restore_litellm_exception helper.
Key changes:
Minor concerns:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/exceptions.py | Adds _LiteLLMPickleMixin (with __reduce__/__getstate__) and module-level _restore_litellm_exception to make all 19 base exception classes pickle-safe. The mixin is correctly placed as the first base class and subclasses inherit the fix automatically. Minor issues: pickle.loads(pickle.dumps(v)) in __getstate__ does a redundant round-trip deserialization just to test picklability, and the _pickled_* key prefix used for sentinel metadata can theoretically collide with genuine attribute names. |
| tests/test_litellm/test_exceptions_pickle.py | New test file covering pickle round-trips for all 28 exception types with parametrized tests for message, llm_provider, model, isinstance checks, raisability, and multiple round-trips; also includes targeted tests for request_data, cost fields, schema fields, guardrail names, and original_exception fallback. One unused variable request at line 272 is dead code. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[pickle.dumps exception] --> B["__reduce__ called"]
B --> C["returns _restore_litellm_exception, cls, state"]
C --> D["__getstate__ called"]
D --> E{"for each attr in __dict__"}
E --> F{"isinstance httpx.Response?"}
F -->|yes| G["store status_code only\nadd to http_attrs"]
F -->|no| H{"isinstance httpx.Request?"}
H -->|yes| I["add to http_attrs\nno metadata stored"]
H -->|no| J{"pickle round-trip\ntest succeeds?"}
J -->|yes| K["state[attr] = v"]
J -->|no| L["state[attr] = str(v)\nlossy fallback"]
E --> M{"more attrs?"}
M -->|yes| E
M -->|no| N["return state dict"]
N --> O["pickle.loads exception"]
O --> P["_restore_litellm_exception invoked"]
P --> Q["Exception.__new__(cls)"]
Q --> R["Exception.__init__(obj, message)"]
R --> S["obj.__dict__.update(clean_state)"]
S --> T{"for each attr in http_attrs"}
T --> U{"status key present?"}
U -->|yes| V["reconstruct httpx.Response\nstatus only, stub URL/method"]
U -->|no| W["reconstruct httpx.Request\nstub GET https://litellm.ai"]
V --> X{"more http_attrs?"}
W --> X
X -->|yes| T
X -->|no| Y["return restored exception"]
Last reviewed commit: "test: add pickle rou..."
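The flow in the chart above can be sketched end to end with stand-ins. This is a simplified illustration, not the PR's code: `FakeResponse`, `PickleMixin`, `_restore`, and `DemoError` are hypothetical names, and `FakeResponse` holds a `threading.Lock` purely to make it unpicklable in the way the PR describes for httpx objects. The picklability probe and the request branch are omitted for brevity.

```python
import pickle
import threading


class FakeResponse:
    """Stand-in for an unpicklable response object (assumption for this sketch)."""

    def __init__(self, status_code):
        self.status_code = status_code
        self._lock = threading.Lock()  # locks cannot be pickled


def _restore(cls, state):
    obj = Exception.__new__(cls)  # skip cls.__init__ entirely
    Exception.__init__(obj, state.get("message", ""))  # so str(obj) works
    http_attrs = state.get("_pickled_http_attrs", [])
    obj.__dict__.update(
        {k: v for k, v in state.items() if not k.startswith("_pickled_")}
    )
    for attr in http_attrs:
        # Rebuild a minimal response from the saved status code only.
        setattr(obj, attr, FakeResponse(state[f"_pickled_{attr}_status"]))
    return obj


class PickleMixin:
    def __reduce__(self):
        # pickle will call _restore(type(self), state) when loading.
        return (_restore, (type(self), self.__getstate__()))

    def __getstate__(self):
        state, http_attrs = {}, []
        for k, v in self.__dict__.items():
            if isinstance(v, FakeResponse):
                http_attrs.append(k)
                state[f"_pickled_{k}_status"] = v.status_code
            else:
                state[k] = v
        if http_attrs:
            state["_pickled_http_attrs"] = http_attrs
        return state


class DemoError(PickleMixin, Exception):
    def __init__(self, message, llm_provider, model, response):
        super().__init__(message)
        self.message = message
        self.llm_provider = llm_provider
        self.model = model
        self.response = response


original = DemoError("boom", "openai", "gpt-4", FakeResponse(429))
restored = pickle.loads(pickle.dumps(original))
print(type(restored).__name__, restored.model, restored.response.status_code)
```

Because `__reduce__` returns a two-element tuple, pickle never calls the class constructor on load; everything is rebuilt by the module-level helper.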
    obj = Exception.__new__(cls)
    # Restore Exception.args so str(obj) and repr(obj) work correctly.
    Exception.__init__(obj, state.get("message", ""))

    http_attrs = state.get("_pickled_http_attrs", [])
    clean_state = {k: v for k, v in state.items() if not k.startswith("_pickled_")}
    obj.__dict__.update(clean_state)

    for attr in http_attrs:
        status_key = f"_pickled_{attr}_status"
        if status_key in state:
            setattr(
                obj,
                attr,
                httpx.Response(
                    status_code=state[status_key],
                    request=httpx.Request(method="GET", url="https://litellm.ai"),
                ),
            )
        else:
            setattr(obj, attr, httpx.Request(method="GET", url="https://litellm.ai"))

    return obj
original_exception in MidStreamFallbackError still breaks pickle
MidStreamFallbackError stores self.original_exception (see line 1035). During __getstate__, this attribute lands in clean_state because it isn't an httpx.Response / httpx.Request. Python will then try to pickle it inline as part of the state dict.
The problem: in practice original_exception is often a raw openai-SDK exception (e.g., openai.RateLimitError, openai.APIConnectionError), which does not get the _LiteLLMPickleMixin. Attempting to pickle a MidStreamFallbackError with such a value will still raise TypeError at pickle.dumps time.
The test in test_exceptions_pickle.py only constructs MidStreamFallbackError without original_exception, so this failure path is not exercised.
A safe fix is to skip non-picklable attributes gracefully in __getstate__:

        def __getstate__(self):
            state = {}
            http_attrs = []
            for k, v in self.__dict__.items():
                if isinstance(v, httpx.Response):
                    http_attrs.append(k)
                    state[f"_pickled_{k}_status"] = v.status_code
                elif isinstance(v, httpx.Request):
                    http_attrs.append(k)
                else:
                    try:
                        pickle.dumps(v)  # probe only
                        state[k] = v
                    except Exception:
                        pass  # drop non-picklable attrs silently
            if http_attrs:
                state["_pickled_http_attrs"] = http_attrs
            return state

    for attr in http_attrs:
        status_key = f"_pickled_{attr}_status"
        if status_key in state:
            setattr(
                obj,
                attr,
                httpx.Response(
                    status_code=state[status_key],
                    request=httpx.Request(method="GET", url="https://litellm.ai"),
                ),
            )
        else:
            setattr(obj, attr, httpx.Request(method="GET", url="https://litellm.ai"))
Lossy httpx reconstruction discards URL, method, and headers
Both httpx.Response and httpx.Request attributes are rebuilt with hardcoded stub values:

    httpx.Request(method="GET", url="https://litellm.ai")

The original method, URL, headers, and (for Response) body are all lost. Any downstream code that inspects exception.request.url (e.g., to log the failing endpoint) or exception.response.headers (e.g., to read Retry-After from a RateLimitError) will see incorrect stub data after a round-trip.
Consider preserving at least method and url from the request, and headers from the response:

    # in __getstate__, for httpx.Response:
    state[f"_pickled_{k}_status"] = v.status_code
    state[f"_pickled_{k}_headers"] = dict(v.headers)
    # for httpx.Request:
    state[f"_pickled_{k}_method"] = v.method
    state[f"_pickled_{k}_url"] = str(v.url)

And reconstruct accordingly in _restore_litellm_exception.
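A minimal sketch of the capture side of that suggestion, using `SimpleNamespace` stand-ins rather than real httpx objects so it runs without third-party packages. The helper name `snapshot` and the example URL are hypothetical; the key names follow the `_pickled_*` convention used in the suggestion above.

```python
import pickle
from types import SimpleNamespace


def snapshot(name, obj):
    """Capture pickle-friendly metadata from a request- or response-like object."""
    if hasattr(obj, "status_code"):  # response-like
        return {
            f"_pickled_{name}_status": obj.status_code,
            f"_pickled_{name}_headers": dict(obj.headers),
        }
    return {  # request-like
        f"_pickled_{name}_method": obj.method,
        f"_pickled_{name}_url": str(obj.url),
    }


# Stand-ins shaped like the attributes the comment refers to (hypothetical values).
response = SimpleNamespace(status_code=429, headers={"Retry-After": "30"})
request = SimpleNamespace(method="POST", url="https://api.example.com/v1/chat")

state = {**snapshot("response", response), **snapshot("request", request)}
restored_state = pickle.loads(pickle.dumps(state))
print(restored_state["_pickled_response_headers"]["Retry-After"])
```

Since the captured values are plain strings, ints, and dicts, the state survives pickling unchanged, and a restore helper could use it to rebuild the request and response with their original method, URL, and headers instead of stub values.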
    obj = Exception.__new__(cls)
    # Restore Exception.args so str(obj) and repr(obj) work correctly.
    Exception.__init__(obj, state.get("message", ""))
Exception.args diverges for Timeout and APIConnectionError after restore
Timeout.__init__ calls super().__init__(request=request) — the openai SDK's APITimeoutError.__init__ does not forward a message string to Exception.__init__, so the original exception has args = ().
After unpickling, _restore_litellm_exception calls:

    Exception.__init__(obj, state.get("message", ""))

which sets args = ("litellm.Timeout: timed out",).
Any code that inspects exception.args directly (e.g., some Sentry integrations fingerprint on exc.args) will see different values for the original vs the restored object. The same applies to APIConnectionError, which also passes only request= to its parent.
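The divergence described in this comment can be illustrated with stand-in classes. This sketch assumes a parent class that behaves as the comment describes (its `__init__` forwards no message to `Exception.__init__`); `SdkTimeout` and `DemoTimeout` are hypothetical names, not the openai or LiteLLM classes.

```python
class SdkTimeout(Exception):
    """Hypothetical parent that forwards no message, as the comment describes."""

    def __init__(self, request=None):
        super().__init__()  # leaves self.args == ()
        self.request = request


class DemoTimeout(SdkTimeout):
    def __init__(self, message):
        super().__init__(request=None)
        self.message = message


original = DemoTimeout("timed out")

# Rebuild the way the reviewed helper does: seed args from `message`.
restored = Exception.__new__(DemoTimeout)
Exception.__init__(restored, original.message)
restored.__dict__.update(original.__dict__)

print(original.args, restored.args)
```

The attributes in `__dict__` match after the rebuild, but `args` does not, which is the mismatch the comment is pointing at.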
This isn't caught by any test because the parametrized tests don't assert on args. A simple guard would be to save args explicitly:

    # in __getstate__
    state["_args"] = list(self.args)
    # in _restore_litellm_exception
    Exception.__init__(obj, *state.get("_args", [state.get("message", "")]))

def test_pickle_isinstance_checks_still_work(original):
    """After round-trip, isinstance checks against the original class must pass."""
    restored = _roundtrip(original)
    assert isinstance(restored, type(original))
    assert isinstance(restored, Exception)


def test_pickle_rejected_request_preserves_request_data():
    original = exc.RejectedRequestError(
        message="blocked",
        model="gpt-4",
        llm_provider="openai",
        request_data={"messages": [{"role": "user", "content": "hello"}]},
    )
    restored = _roundtrip(original)
Missing test for MidStreamFallbackError with a non-None original_exception
test_pickle_midstream_preserves_generated_content constructs MidStreamFallbackError without original_exception. The primary use-case for this class is to wrap another exception:

    exc.MidStreamFallbackError(
        ...,
        original_exception=exc.RateLimitError(
            message="rate limited", llm_provider="openai", model="gpt-4"
        ),
    )

Since original_exception is now picklable (it's a LiteLLM exception), this should round-trip correctly, but it is worth verifying explicitly. Without a test covering this path, the most realistic usage scenario is unverified.
The __getstate__ mixin was storing attributes as-is into the pickle state dict. For openai SDK exceptions (e.g. openai.RateLimitError), pickle.dumps succeeds but pickle.loads fails because the openai __init__ requires response and body keyword arguments that the default pickle reconstruction does not supply. Fix: probe each non-httpx attribute with a full round-trip (pickle.loads(pickle.dumps(v))) before storing it. If the round-trip fails, store str(v) instead so the attribute is still meaningful after unpickling. Covers MidStreamFallbackError.original_exception and any other attribute that is dumps-able but not loads-able.
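The dumps-succeeds, loads-fails behaviour described in this commit message, and the round-trip probe with a `str(v)` fallback, can be sketched as follows. `SdkStatusError` is a hypothetical stand-in with required keyword-only arguments (it is not the openai class), and `picklable_or_str` is an illustrative helper name, not a function from the PR.

```python
import pickle


class SdkStatusError(Exception):
    """Hypothetical stand-in for an SDK exception with required keyword-only args."""

    def __init__(self, message, *, response, body):
        super().__init__(message)
        self.response = response
        self.body = body


def picklable_or_str(value):
    """Return value if it survives a full pickle round trip, else str(value)."""
    try:
        pickle.loads(pickle.dumps(value))
        return value
    except Exception:
        return str(value)  # lossy fallback, but still meaningful after unpickling


sdk_err = SdkStatusError("rate limited", response=None, body=None)

dumped = pickle.dumps(sdk_err)  # succeeds: args and __dict__ are plain values
try:
    pickle.loads(dumped)  # fails: the constructor is called without response/body
    loads_error = None
except TypeError as e:
    loads_error = e

print(type(loads_error).__name__)
print(repr(picklable_or_str(sdk_err)))
print(picklable_or_str({"tokens": 12}))
```

A probe that only calls `pickle.dumps` would accept `sdk_err` here, which is why the commit probes with the full round trip.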
…picklable original_exception Covers the case where original_exception holds an openai SDK exception (openai.RateLimitError) that cannot be reconstructed by pickle.loads. Verifies that generated_content, model, and llm_provider are preserved and that original_exception is stored as its string representation.
Addressed the P1 raised by Greptile (two new commits on this branch):
Root cause confirmed:
Fix (commit 1113598): In
Test (commit 64a75f1): Added
On the two P2 items: the lossy httpx stubs are intentional (the original
Also signed the CLA, oops forgot about that :-)
    try:
        pickle.loads(pickle.dumps(v))
        state[k] = v
    except Exception:
        state[k] = str(v)
Double round-trip for picklability check is wasteful
pickle.loads(pickle.dumps(v)) fully deserializes v just to verify it is picklable, but the deserialized result is immediately discarded. The deserialization step is unnecessary: a pickle.dumps call alone is sufficient to confirm that v can be serialized. The outer pickle.dumps of the exception state will then serialize v again, meaning every picklable attribute is processed three times in total (serialized once to test, deserialized once and discarded, then serialized once for real).
For exceptions carrying large payloads — e.g., RejectedRequestError.request_data with many chat messages — this triples the serialization cost on the hot path (Celery workers, Sentry, multiprocessing).
    -    try:
    -        pickle.loads(pickle.dumps(v))
    -        state[k] = v
    -    except Exception:
    -        state[k] = str(v)
    +    try:
    +        pickle.dumps(v)
    +        state[k] = v
    +    except Exception:
    +        state[k] = str(v)
    The original_exception should be preserved as its string representation."""
    import openai

    request = _make_response(429).request

    http_attrs = state.get("_pickled_http_attrs", [])
    clean_state = {k: v for k, v in state.items() if not k.startswith("_pickled_")}
    obj.__dict__.update(clean_state)
_pickled_* key prefix creates a fragile namespace collision risk
clean_state is built by filtering out every key whose name starts with "_pickled_". This means any exception that legitimately stores an attribute named like _pickled_response_status (or any _pickled_ prefix) would have that attribute silently dropped on restore. The filtering is correct for the sentinel keys this PR introduces, but the prefix is generic enough that a future attribute addition could accidentally collide.
Consider using a single dedicated key (e.g., "__litellm_pickle_meta__") to store all pickle-time metadata as a nested dict, rather than scattering sentinel keys across the flat state namespace. This eliminates the collision risk entirely:

    # __getstate__
    meta = {}
    for attr in http_attrs:
        if isinstance(self.__dict__[attr], httpx.Response):
            meta[attr] = {"type": "response", "status_code": self.__dict__[attr].status_code}
        else:
            meta[attr] = {"type": "request"}
    state["__litellm_pickle_meta__"] = meta

    # _restore_litellm_exception
    clean_state = {k: v for k, v in state.items() if k != "__litellm_pickle_meta__"}

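The difference between the two filtering strategies can be shown on plain dictionaries. In this sketch the attribute name `_pickled_note` is a hypothetical user attribute chosen to trigger the collision; `META_KEY` reuses the key name proposed in the comment above.

```python
META_KEY = "__litellm_pickle_meta__"  # key name taken from the suggestion above

# Prefix-based sentinels: a genuine attribute that happens to share the prefix
# is filtered out together with the bookkeeping keys.
flat_state = {
    "message": "boom",
    "_pickled_note": "user data",  # hypothetical genuine attribute
    "_pickled_http_attrs": [],
}
prefix_filtered = {
    k: v for k, v in flat_state.items() if not k.startswith("_pickled_")
}

# Single dedicated key: only the bookkeeping entry is removed.
nested_state = {
    "message": "boom",
    "_pickled_note": "user data",
    META_KEY: {},
}
key_filtered = {k: v for k, v in nested_state.items() if k != META_KEY}

print(sorted(prefix_filtered))
print(sorted(key_filtered))
```

With the dedicated key, the only name that can collide is the metadata key itself.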
Summary
Fixes #24136
LiteLLM exception classes fail to pickle/unpickle, breaking Celery task queues, Sentry error tracking, multiprocessing, and any other system that serialises exceptions.
Root causes:
- Standard Python exception pickling calls cls(*self.args) to reconstruct the object. LiteLLM exceptions require several positional args (message, llm_provider, model, …) not stored in self.args, so reconstruction raises TypeError.
- response/request attributes hold httpx.Response/httpx.Request instances, which are not picklable.
Changes
litellm/exceptions.py
- _restore_litellm_exception(cls, state): module-level callable that pickle uses to reconstruct exceptions. Uses Exception.__new__(cls) to bypass __init__, restores attributes via __dict__, and recreates minimal httpx.Response/httpx.Request objects from their serialized status codes.
- _LiteLLMPickleMixin: mixin with __reduce__ and __getstate__. Applied as the first base class on 19 exception classes that directly inherit from openai exceptions or Exception. The remaining subclasses (ContextWindowExceededError, RejectedRequestError, MidStreamFallbackError, etc.) inherit the fix automatically.
Test plan
- tests/test_litellm/test_exceptions_pickle.py (149 tests, 139 pass, 10 skip where attribute is absent)
- message, llm_provider, model, and class-specific fields (request_data, current_cost, raw_response, generated_content, guardrail_name, etc.)
- isinstance checks still work after unpickling
- pytest tests/test_litellm/test_exceptions_pickle.py -v
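On the note that the mixin is applied as the first base class: base-class order decides whose `__reduce__` pickle finds, because `BaseException` already defines one. The sketch below uses hypothetical class names and a helper, `reduce_owner`, that walks the MRO to report which class supplies `__reduce__`.

```python
class PickleMixin:
    def __reduce__(self):
        # Placeholder body; only the lookup order matters for this sketch.
        return (type(self), self.args)


class MixinFirst(PickleMixin, Exception):
    pass


class MixinLast(Exception, PickleMixin):
    pass


def reduce_owner(cls):
    """Return the first class in the MRO that defines __reduce__."""
    return next(c for c in cls.__mro__ if "__reduce__" in vars(c))


print(reduce_owner(MixinFirst).__name__)
print(reduce_owner(MixinLast).__name__)
```

With the mixin listed last, `BaseException.__reduce__` comes earlier in the MRO and the mixin's override is never used, so the default reconstruction path would still apply.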