
fix: make all LiteLLM exception classes pickle-safe #24198

Open
jasonmatthewsuhari wants to merge 3 commits into BerriAI:main from jasonmatthewsuhari:fix/exception-pickle-safety

Conversation

@jasonmatthewsuhari

Summary

Fixes #24136

LiteLLM exception classes fail to pickle/unpickle, breaking Celery task queues, Sentry error tracking, multiprocessing, and any other system that serialises exceptions.

Root causes:

  1. Python's default exception pickling protocol reconstructs the object by calling cls(*self.args). LiteLLM exceptions require several positional args (message, llm_provider, model, …) that are not stored in self.args, so reconstruction raises TypeError.
  2. response / request attributes hold httpx.Response / httpx.Request instances, which are not picklable.
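The first root cause is easy to reproduce with a minimal stand-in class (hypothetical names, not the actual LiteLLM hierarchy):

```python
import pickle

# Stand-in for a LiteLLM-style exception: __init__ takes required
# positional args beyond what ends up in self.args.
class ProviderError(Exception):
    def __init__(self, message, llm_provider, model):
        super().__init__(message)      # only `message` lands in self.args
        self.llm_provider = llm_provider
        self.model = model

err = ProviderError("rate limited", "openai", "gpt-4")
data = pickle.dumps(err)               # serialising succeeds
try:
    pickle.loads(data)                 # reconstruction calls cls(*self.args)
except TypeError as e:
    print("unpickle failed:", e)
```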

Changes

litellm/exceptions.py

  • Added _restore_litellm_exception(cls, state) — module-level callable that pickle uses to reconstruct exceptions. Uses Exception.__new__(cls) to bypass __init__, restores attributes via __dict__, and recreates minimal httpx.Response/httpx.Request objects from their serialized status codes.
  • Added _LiteLLMPickleMixin — mixin with __reduce__ and __getstate__. Applied as the first base class on 19 exception classes that directly inherit from openai exceptions or Exception. The remaining subclasses (ContextWindowExceededError, RejectedRequestError, MidStreamFallbackError, etc.) inherit the fix automatically.
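The mechanism can be sketched with a simplified stand-in (illustrative names only, not the exact litellm/exceptions.py code, and without the httpx handling):

```python
import pickle

def _restore(cls, state):
    # Bypass __init__ entirely; rebuild from the saved attribute dict.
    obj = Exception.__new__(cls)
    Exception.__init__(obj, state.get("message", ""))  # keep str()/repr() sane
    obj.__dict__.update(state)
    return obj

class PickleSafeMixin:
    def __reduce__(self):
        # (callable, args) form: unpickling calls _restore(type(self), state).
        return (_restore, (type(self), self.__getstate__()))

    def __getstate__(self):
        return dict(self.__dict__)

class ProviderError(PickleSafeMixin, Exception):
    def __init__(self, message, llm_provider, model):
        super().__init__(message)
        self.message = message
        self.llm_provider = llm_provider
        self.model = model

err = pickle.loads(pickle.dumps(ProviderError("boom", "openai", "gpt-4")))
print(err.llm_provider, err.model)  # attributes survive the round-trip
```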

Test plan

  • New test file tests/test_litellm/test_exceptions_pickle.py (149 tests, 139 pass, 10 skip where attribute is absent)
  • Round-trip survival for all 28 exception types
  • Attribute preservation: message, llm_provider, model, and class-specific fields (request_data, current_cost, raw_response, generated_content, guardrail_name, etc.)
  • isinstance checks still work after unpickling
  • Exceptions are still raise-able / catchable after round-trip
  • Multiple sequential round-trips
  • Run pytest tests/test_litellm/test_exceptions_pickle.py -v
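The round-trip assertions follow this general shape (a self-contained sketch with a stand-in exception, not the actual test file):

```python
import pickle

def _roundtrip(e):
    return pickle.loads(pickle.dumps(e))

class DemoError(Exception):
    """Stand-in for a pickle-safe LiteLLM exception."""

original = DemoError("boom")
restored = _roundtrip(_roundtrip(original))  # multiple sequential round-trips
assert isinstance(restored, DemoError)       # isinstance survives
assert str(restored) == str(original)        # message survives
try:
    raise restored                           # still raise-able / catchable
except DemoError:
    pass
```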

Standard Python exception pickling calls cls(*self.args) to reconstruct
the object.  LiteLLM exceptions require several positional args (message,
llm_provider, model, …) that are not stored in self.args, so the default
protocol raises TypeError.  Additionally, response/request attributes
hold httpx.Response/httpx.Request instances which are not picklable.

Add _LiteLLMPickleMixin with __reduce__ and __getstate__ that:
- Uses Exception.__new__(cls) + dict restoration to bypass __init__
- Serialises httpx.Response status codes separately and recreates
  minimal response/request objects on unpickle

Apply the mixin to all 19 exception classes that directly inherit from
openai exceptions or Exception, covering all 28 exception types via
inheritance.

Fixes BerriAI#24136
@vercel

vercel bot commented Mar 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Mar 21, 2026 7:55am |


@CLAassistant

CLAassistant commented Mar 20, 2026

CLA assistant check
All committers have signed the CLA.

@codspeed-hq

codspeed-hq bot commented Mar 20, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing jasonmatthewsuhari:fix/exception-pickle-safety (64a75f1) with main (d8e4fc4)

Open in CodSpeed

@greptile-apps

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR fixes a long-standing serialization bug where all LiteLLM exception classes failed to pickle/unpickle, breaking Celery task queues, Sentry error tracking, and multiprocessing. The fix introduces a _LiteLLMPickleMixin and a module-level _restore_litellm_exception function that bypass Python's default cls(*self.args) reconstruction path in favour of Exception.__new__ + __dict__ restoration.

Key changes:

  • _LiteLLMPickleMixin added as first base class to 19 exception classes; subclasses inherit the fix automatically
  • _restore_litellm_exception reconstructs httpx.Response/httpx.Request attributes from serialized status codes (other fields are lost — noted in existing review threads)
  • Non-picklable attributes (e.g. a third-party original_exception) are silently degraded to their str() representation — this is tested and documented
  • Comprehensive test file with 149 parametrized and targeted tests covering round-trips, attribute preservation, isinstance checks, raisability, and multiple sequential round-trips

Minor concerns:

  • __getstate__ calls pickle.loads(pickle.dumps(v)) to test picklability, but the deserialized value is immediately discarded; pickle.dumps(v) alone is sufficient and avoids a redundant deserialization on every attribute
  • Sentinel keys are prefixed with _pickled_ in the flat state dict, creating a theoretical namespace collision if a future exception attribute uses the same prefix
  • One unused variable (request) in test_pickle_midstream_with_non_picklable_original_exception

Confidence Score: 4/5

  • Safe to merge — the mixin is additive, does not change exception semantics for non-pickle code paths, and is well-tested.
  • The core fix is correct and addresses both root causes. Existing threads cover the lossy httpx reconstruction and args divergence issues. New comments are style/performance suggestions only. No regressions introduced to the public exception API.
  • No files require special attention beyond the inline comments on litellm/exceptions.py.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/exceptions.py | Adds _LiteLLMPickleMixin (with __reduce__/__getstate__) and module-level _restore_litellm_exception to make all 19 base exception classes pickle-safe. The mixin is correctly placed as the first base class and subclasses inherit the fix automatically. Minor issues: pickle.loads(pickle.dumps(v)) in __getstate__ does a redundant round-trip deserialization just to test picklability, and the _pickled_* key prefix used for sentinel metadata can theoretically collide with genuine attribute names. |
| tests/test_litellm/test_exceptions_pickle.py | New test file covering pickle round-trips for all 28 exception types with parametrized tests for message, llm_provider, model, isinstance checks, raisability, and multiple round-trips; also includes targeted tests for request_data, cost fields, schema fields, guardrail names, and original_exception fallback. One unused variable request at line 272 is dead code. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pickle.dumps exception] --> B["__reduce__ called"]
    B --> C["returns _restore_litellm_exception, cls, state"]
    C --> D["__getstate__ called"]
    D --> E{"for each attr in __dict__"}
    E --> F{"isinstance httpx.Response?"}
    F -->|yes| G["store status_code only\nadd to http_attrs"]
    F -->|no| H{"isinstance httpx.Request?"}
    H -->|yes| I["add to http_attrs\nno metadata stored"]
    H -->|no| J{"pickle round-trip\ntest succeeds?"}
    J -->|yes| K["state[attr] = v"]
    J -->|no| L["state[attr] = str(v)\nlossy fallback"]
    E --> M{"more attrs?"}
    M -->|yes| E
    M -->|no| N["return state dict"]

    N --> O["pickle.loads exception"]
    O --> P["_restore_litellm_exception invoked"]
    P --> Q["Exception.__new__(cls)"]
    Q --> R["Exception.__init__(obj, message)"]
    R --> S["obj.__dict__.update(clean_state)"]
    S --> T{"for each attr in http_attrs"}
    T --> U{"status key present?"}
    U -->|yes| V["reconstruct httpx.Response\nstatus only, stub URL/method"]
    U -->|no| W["reconstruct httpx.Request\nstub GET https://litellm.ai"]
    V --> X{"more http_attrs?"}
    W --> X
    X -->|yes| T
    X -->|no| Y["return restored exception"]
```

Last reviewed commit: "test: add pickle rou..."

Comment on lines +37 to +59
```python
obj = Exception.__new__(cls)
# Restore Exception.args so str(obj) and repr(obj) work correctly.
Exception.__init__(obj, state.get("message", ""))

http_attrs = state.get("_pickled_http_attrs", [])
clean_state = {k: v for k, v in state.items() if not k.startswith("_pickled_")}
obj.__dict__.update(clean_state)

for attr in http_attrs:
    status_key = f"_pickled_{attr}_status"
    if status_key in state:
        setattr(
            obj,
            attr,
            httpx.Response(
                status_code=state[status_key],
                request=httpx.Request(method="GET", url="https://litellm.ai"),
            ),
        )
    else:
        setattr(obj, attr, httpx.Request(method="GET", url="https://litellm.ai"))

return obj
```

P1 original_exception in MidStreamFallbackError still breaks pickle

MidStreamFallbackError stores self.original_exception (see line 1035). During __getstate__, this attribute lands in clean_state because it isn't an httpx.Response / httpx.Request. Python will then try to pickle it inline as part of the state dict.

The problem: in practice original_exception is often a raw openai-SDK exception (e.g., openai.RateLimitError, openai.APIConnectionError), which does not get the _LiteLLMPickleMixin. Attempting to pickle a MidStreamFallbackError with such a value will still raise TypeError at pickle.dumps time.

The test in test_exceptions_pickle.py only constructs MidStreamFallbackError without original_exception, so this failure path is not exercised.

A safe fix is to skip non-picklable attributes gracefully in __getstate__:

def __getstate__(self):
    state = {}
    http_attrs = []
    for k, v in self.__dict__.items():
        if isinstance(v, httpx.Response):
            http_attrs.append(k)
            state[f"_pickled_{k}_status"] = v.status_code
        elif isinstance(v, httpx.Request):
            http_attrs.append(k)
        else:
            try:
                pickle.dumps(v)   # probe only
                state[k] = v
            except Exception:
                pass  # drop non-picklable attrs silently
    if http_attrs:
        state["_pickled_http_attrs"] = http_attrs
    return state

Comment on lines +45 to +57
```python
for attr in http_attrs:
    status_key = f"_pickled_{attr}_status"
    if status_key in state:
        setattr(
            obj,
            attr,
            httpx.Response(
                status_code=state[status_key],
                request=httpx.Request(method="GET", url="https://litellm.ai"),
            ),
        )
    else:
        setattr(obj, attr, httpx.Request(method="GET", url="https://litellm.ai"))
```

P2 Lossy httpx reconstruction discards URL, method, and headers

Both httpx.Response and httpx.Request attributes are rebuilt with hardcoded stub values:

httpx.Request(method="GET", url="https://litellm.ai")

The original method, URL, headers, and (for Response) body are all lost. Any downstream code that inspects exception.request.url (e.g., to log the failing endpoint) or exception.response.headers (e.g., to read Retry-After from a RateLimitError) will see incorrect stub data after a round-trip.

Consider preserving at least method and url from the request, and headers from the response:

# in __getstate__, for httpx.Response:
state[f"_pickled_{k}_status"]  = v.status_code
state[f"_pickled_{k}_headers"] = dict(v.headers)
# for httpx.Request:
state[f"_pickled_{k}_method"] = v.method
state[f"_pickled_{k}_url"]    = str(v.url)

And reconstruct accordingly in _restore_litellm_exception.

Comment on lines +37 to +39
obj = Exception.__new__(cls)
# Restore Exception.args so str(obj) and repr(obj) work correctly.
Exception.__init__(obj, state.get("message", ""))

P2 Exception.args diverges for Timeout and APIConnectionError after restore

Timeout.__init__ calls super().__init__(request=request) — the openai SDK's APITimeoutError.__init__ does not forward a message string to Exception.__init__, so the original exception has args = ().

After unpickling, _restore_litellm_exception calls:

Exception.__init__(obj, state.get("message", ""))

which sets args = ("litellm.Timeout: timed out",).

Any code that inspects exception.args directly (e.g., some Sentry integrations fingerprint on exc.args) will see different values for the original vs the restored object. The same applies to APIConnectionError, which also passes only request= to its parent.

This isn't caught by any test because the parametrized tests don't assert on args. A simple guard would be to save args explicitly:

# in __getstate__
state["_args"] = list(self.args)

# in _restore_litellm_exception
Exception.__init__(obj, *state.get("_args", [state.get("message", "")]))

Comment on lines +198 to +212
```python
def test_pickle_isinstance_checks_still_work(original):
    """After round-trip, isinstance checks against the original class must pass."""
    restored = _roundtrip(original)
    assert isinstance(restored, type(original))
    assert isinstance(restored, Exception)


def test_pickle_rejected_request_preserves_request_data():
    original = exc.RejectedRequestError(
        message="blocked",
        model="gpt-4",
        llm_provider="openai",
        request_data={"messages": [{"role": "user", "content": "hello"}]},
    )
    restored = _roundtrip(original)
```

P2 Missing test for MidStreamFallbackError with a non-None original_exception

test_pickle_midstream_preserves_generated_content constructs MidStreamFallbackError without original_exception. The primary use-case for this class is to wrap another exception:

exc.MidStreamFallbackError(
    ...,
    original_exception=exc.RateLimitError(
        message="rate limited", llm_provider="openai", model="gpt-4"
    ),
)

Since original_exception is now picklable (it's a LiteLLM exception), this should round-trip correctly — but it is worth verifying explicitly. Without a test covering this path, the most realistic usage scenario is unverified.
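A sketch of the shape such a test could take, using stand-in classes rather than the real litellm hierarchy (so the structure, not the exact API, is what is illustrated):

```python
import pickle

def _restore(cls, state):
    # Stand-in for _restore_litellm_exception: bypass __init__ on restore.
    obj = Exception.__new__(cls)
    Exception.__init__(obj, state.get("message", ""))
    obj.__dict__.update(state)
    return obj

class PickleSafe:
    def __reduce__(self):
        return (_restore, (type(self), dict(self.__dict__)))

class RateLimitError(PickleSafe, Exception):
    def __init__(self, message, llm_provider, model):
        super().__init__(message)
        self.message, self.llm_provider, self.model = message, llm_provider, model

class MidStreamFallbackError(PickleSafe, Exception):
    def __init__(self, message, original_exception=None):
        super().__init__(message)
        self.message = message
        self.original_exception = original_exception

# The wrapped exception is itself pickle-safe, so it should survive intact.
wrapped = MidStreamFallbackError(
    "fallback", original_exception=RateLimitError("429", "openai", "gpt-4")
)
restored = pickle.loads(pickle.dumps(wrapped))
assert isinstance(restored.original_exception, RateLimitError)
```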

The __getstate__ mixin was storing attributes as-is into the pickle
state dict. For openai SDK exceptions (e.g. openai.RateLimitError),
pickle.dumps succeeds but pickle.loads fails because the openai
__init__ requires response and body keyword arguments that the
default pickle reconstruction does not supply.

Fix: probe each non-httpx attribute with a full round-trip
(pickle.loads(pickle.dumps(v))) before storing it. If the round-trip
fails, store str(v) instead so the attribute is still meaningful
after unpickling.

Covers MidStreamFallbackError.original_exception and any other
attribute that is dumps-able but not loads-able.
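The dumps-vs-loads asymmetry can be demonstrated with a stand-in class (illustrative; it mirrors the shape of a keyword-only __init__ like the openai SDK's, not the SDK's actual code):

```python
import pickle

# Instances serialise fine but cannot be reconstructed: pickle.dumps
# succeeds, pickle.loads raises, because default reconstruction calls
# cls(*self.args) without the required keyword-only arguments.
class NeedsKwargs(Exception):
    def __init__(self, message, *, response, body):
        super().__init__(message)
        self.response, self.body = response, body

err = NeedsKwargs("rate limited", response=None, body=None)
data = pickle.dumps(err)   # succeeds: class, args, and __dict__ serialise
try:
    pickle.loads(data)     # fails: cls("rate limited") lacks the kwargs
except TypeError as e:
    print("loads failed:", e)
```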
…picklable original_exception

Covers the case where original_exception holds an openai SDK exception
(openai.RateLimitError) that cannot be reconstructed by pickle.loads.
Verifies that generated_content, model, and llm_provider are preserved
and that original_exception is stored as its string representation.
@jasonmatthewsuhari

jasonmatthewsuhari commented Mar 21, 2026

Addressed the P1 raised by Greptile (two new commits on this branch):

Root cause confirmed: pickle.dumps on an openai.RateLimitError succeeds, but pickle.loads fails with TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'. So the original try: pickle.dumps(v) guard was insufficient.

Fix (commit 1113598): In _LiteLLMPickleMixin.__getstate__, each non-httpx attribute is now probed with a full round-trip (pickle.loads(pickle.dumps(v))). If that round-trip raises, the attribute is stored as str(v) instead. This covers MidStreamFallbackError.original_exception and any other attribute that is serialisable but not reconstructable.

Test (commit 64a75f1): Added test_pickle_midstream_with_non_picklable_original_exception which constructs a MidStreamFallbackError with original_exception=openai.RateLimitError(...) and asserts the round-trip succeeds, with original_exception preserved as its string representation.

On the two P2 items: the lossy httpx stubs are intentional (the original __init__ methods already used hardcoded stubs, so nothing real is lost). The Timeout.args divergence is real but pre-existing behaviour from the openai SDK itself and is not a regression from this PR.

Also signed the CLA, oops forgot about that :-)

Comment on lines +91 to +95
```python
try:
    pickle.loads(pickle.dumps(v))
    state[k] = v
except Exception:
    state[k] = str(v)
```

P2 Double round-trip for picklability check is wasteful

pickle.loads(pickle.dumps(v)) fully deserializes v just to verify it is picklable — but the deserialized result is immediately discarded. The deserialization step is unnecessary: a pickle.dumps call alone is sufficient to confirm that v can be serialized. The outer pickle.dumps of the exception state will then serialize v again, meaning every picklable attribute gets serialized three times in total (once to test, once discarded, once for real).

For exceptions carrying large payloads — e.g., RejectedRequestError.request_data with many chat messages — this triples the serialization cost on the hot path (Celery workers, Sentry, multiprocessing).

Suggested change

```diff
 try:
-    pickle.loads(pickle.dumps(v))
+    pickle.dumps(v)
     state[k] = v
 except Exception:
     state[k] = str(v)
```

```python
    The original_exception should be preserved as its string representation."""
    import openai

    request = _make_response(429).request
```

P2 Unused variable request

request is assigned but never referenced anywhere in the test body. It can be removed without affecting the test.

Suggested change

```diff
-    request = _make_response(429).request
```

Comment on lines +42 to +44
http_attrs = state.get("_pickled_http_attrs", [])
clean_state = {k: v for k, v in state.items() if not k.startswith("_pickled_")}
obj.__dict__.update(clean_state)

P2 _pickled_* key prefix creates a fragile namespace collision risk

clean_state is built by filtering out every key whose name starts with "_pickled_". This means any exception that legitimately stores an attribute named like _pickled_response_status (or any _pickled_ prefix) would have that attribute silently dropped on restore. The filtering is correct for the sentinel keys this PR introduces, but the prefix is generic enough that a future attribute addition could accidentally collide.

Consider using a single dedicated key (e.g., "__litellm_pickle_meta__") to store all pickle-time metadata as a nested dict, rather than scattering sentinel keys across the flat state namespace. This eliminates the collision risk entirely:

# __getstate__
meta = {}
for attr in http_attrs:
    if isinstance(self.__dict__[attr], httpx.Response):
        meta[attr] = {"type": "response", "status_code": self.__dict__[attr].status_code}
    else:
        meta[attr] = {"type": "request"}
state["__litellm_pickle_meta__"] = meta

# _restore_litellm_exception
clean_state = {k: v for k, v in state.items() if k != "__litellm_pickle_meta__"}


Development

Successfully merging this pull request may close these issues.

Exception classes are not pickle-safe