Description:

When attempting to use litellm.completion, the Python SDK throws an httpx.InvalidURL error. The error appears to be caused by incorrect URL construction before the request is even sent to the litellm proxy Docker container, and it happens irrespective of the model being called.

The litellm proxy itself is running within a Docker environment, and other services are able to use the proxy without issues, which suggests the core litellm proxy functionality is operational. The issue appears specific to the Python SDK's handling of the base URL when using litellm.completion.

Using the OpenAI SDK and pointing it to the same litellm proxy instance works as expected, indicating the proxy's general connectivity and configuration are correct. The problem seems isolated to how the litellm Python SDK constructs the request URL when litellm.completion is called.
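The URL that reaches httpx appears to be the configured api_base with ":generateContent" appended directly (compare the 404 seen with a trailing slash in the code example below). The following is a minimal sketch that reproduces the same parsing failure with httpx alone; the appended ":generateContent" suffix is an assumption inferred from the error messages, not taken from litellm's source:

import httpx

# Assumption (inferred from the error messages below): the Gemini code path
# appends ":generateContent" directly to the configured api_base.
url = "http://litellm:4000" + ":generateContent"

try:
    # httpx splits the authority at the first colon: host "litellm",
    # port "4000:generateContent", which is not a valid integer port.
    httpx.URL(url)
except httpx.InvalidURL as e:
    print(e)  # Invalid port: '4000:generateContent'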
Steps to Reproduce:
- Ensure a litellm proxy Docker container is running (version 1.75.8, OAS 3.1).
- Set the api_base and api_key for the litellm Python SDK (version 1.75.8) to point to the litellm proxy.
Code Example:

import litellm

api_base = "http://litellm:4000"  # Replace with your litellm proxy address
# Same problem with a trailing slash: "http://litellm:4000/"
# HTTPStatusError: Client error '404 Not Found' for url 'http://litellm:4000/:generateContent'
api_key = "xyz"  # Replace with your actual api key (if needed by litellm)

litellm.api_base = api_base
litellm.api_key = api_key

messages = [{"content": "Hello, how are you?", "role": "user"}]

try:
    response = litellm.completion(
        model="gpt-3.5-turbo",  # Or any other model; the traceback below was produced with "gemini/gemini-2.0-flash"
        messages=messages,
    )
    print(response)
except Exception as e:
    print(f"Error: {e}")
Expected Behavior:
The litellm.completion call should successfully route the request to the litellm proxy, which in turn should forward it to the appropriate backend API (assuming correct litellm proxy configuration). The function should return a completion response.

Actual Behavior:
The code raises an httpx.InvalidURL exception. The error message indicates that the port in the constructed URL is invalid, likely due to an incorrect concatenation of the base URL and the API endpoint.
Error Message:
(Full debug message below)
ValueError: invalid literal for int() with base 10: '4000:generateContent'
During handling of the above exception, another exception occurred:
httpx.InvalidURL: Invalid port: '4000:generateContent'
Working OpenAI Example:
The following code works correctly, demonstrating that the litellm proxy is generally accessible:
import openai

api_base = "http://litellm:4000"  # Replace with your litellm proxy address
api_key = "xyz"  # Replace with your actual api key (if needed by litellm)

client = openai.OpenAI(
    api_key=api_key,
    base_url=api_base,  # LiteLLM Proxy is OpenAI compatible, Read More: https://docs.litellm.ai/docs/proxy/user_keys
)

response = client.chat.completions.create(
    model="gemini/gemini-2.0-flash",  # model to send to the proxy
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
)
print(response)
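The same pattern can also be expressed through the litellm SDK by treating the proxy as a generic OpenAI-compatible endpoint (model prefixed with openai/, with the proxy address passed as api_base). This is a sketch based on LiteLLM's OpenAI-compatible endpoint support, included only to mirror the working example above; it has not been verified against this setup:

import litellm

# Sketch (assumes the "openai/" prefix for OpenAI-compatible endpoints):
# keep the request on the OpenAI-compatible code path, mirroring the
# working openai-SDK example above.
response = litellm.completion(
    model="openai/gemini-2.0-flash",  # model name as exposed by the proxy
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
    api_base="http://litellm:4000",
    api_key="xyz",
)
print(response)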
Environment:
- litellm Python SDK version: 1.75.8
- litellm proxy Docker image version: 1.75.8 (OAS 3.1)
- Python version: 3.12 (per the site-packages paths in the traceback below)
Additional Information:
- Using the OpenAI SDK with the same api_base and api_key (pointing to the litellm proxy) works correctly. This confirms the proxy is reachable and configured for basic requests.
- The error appears to occur before the request is sent to the litellm proxy Docker container, suggesting a problem with URL construction within the Python SDK's litellm.completion function. A debug-logging sketch to check this is included after this list.
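To support the claim that the request never leaves the client, LiteLLM's debug logging can be enabled before the call; the LITELLM_LOG environment variable is the documented debug switch, though exactly what the pre-call log prints for this code path is an assumption:

import os

# Set before importing litellm so the debug logger picks it up.
os.environ["LITELLM_LOG"] = "DEBUG"

import litellm

litellm.api_base = "http://litellm:4000"
litellm.api_key = "xyz"

try:
    litellm.completion(
        model="gemini/gemini-2.0-flash",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
    )
except Exception as e:
    # The InvalidURL/APIConnectionError surfaces before any HTTP request is sent;
    # the debug output should show the constructed URL in the pre-call log.
    print(f"Error: {e}")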
Full debug message
ValueError Traceback (most recent call last)
File ~/.local/lib/python3.12/site-packages/httpx/_urlparse.py:409, in normalize_port(port, scheme)
408 try:
--> 409 port_as_int = int(port)
410 except ValueError:
ValueError: invalid literal for int() with base 10: '4000:generateContent'
During handling of the above exception, another exception occurred:
InvalidURL Traceback (most recent call last)
File /usr/local/lib/python3.12/dist-packages/litellm/main.py:2685, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, web_search_options, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
2684 new_params = deepcopy(optional_params)
-> 2685 response = vertex_chat_completion.completion( # type: ignore
2686 model=model,
2687 messages=messages,
2688 model_response=model_response,
2689 print_verbose=print_verbose,
2690 optional_params=new_params,
2691 litellm_params=litellm_params, # type: ignore
2692 logger_fn=logger_fn,
2693 encoding=encoding,
2694 vertex_location=vertex_ai_location,
2695 vertex_project=vertex_ai_project,
2696 vertex_credentials=vertex_credentials,
2697 gemini_api_key=gemini_api_key,
2698 logging_obj=logging,
2699 acompletion=acompletion,
2700 timeout=timeout,
2701 custom_llm_provider=custom_llm_provider,
2702 client=client,
2703 api_base=api_base,
2704 extra_headers=extra_headers,
2705 )
2707 elif custom_llm_provider == "vertex_ai":
File /usr/local/lib/python3.12/dist-packages/litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py:1890, in VertexLLM.completion(self, model, messages, model_response, print_verbose, custom_llm_provider, encoding, logging_obj, optional_params, acompletion, timeout, vertex_project, vertex_location, vertex_credentials, gemini_api_key, litellm_params, logger_fn, extra_headers, client, api_base)
1889 try:
-> 1890 response = client.post(url=url, headers=headers, json=data) # type: ignore
1891 response.raise_for_status()
File /usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py:782, in HTTPHandler.post(self, url, data, json, params, headers, stream, timeout, files, content, logging_obj)
781 except Exception as e:
--> 782 raise e
File /usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py:758, in HTTPHandler.post(self, url, data, json, params, headers, stream, timeout, files, content, logging_obj)
757 else:
--> 758 req = self.client.build_request(
759 "POST", url, data=data, json=json, params=params, headers=headers, files=files, content=content # type: ignore
760 )
761 response = self.client.send(req, stream=stream)
File ~/.local/lib/python3.12/site-packages/httpx/_client.py:366, in BaseClient.build_request(self, method, url, content, data, files, json, params, headers, cookies, timeout, extensions)
355 """
356 Build and return a request instance.
357
(...) 364 [0]: /advanced/clients/#request-instances
365 """
--> 366 url = self._merge_url(url)
367 headers = self._merge_headers(headers)
File ~/.local/lib/python3.12/site-packages/httpx/_client.py:396, in BaseClient._merge_url(self, url)
392 """
393 Merge a URL argument together with any 'base_url' on the client,
394 to create the URL used for the outgoing request.
395 """
--> 396 merge_url = URL(url)
397 if merge_url.is_relative_url:
398 # To merge URLs we always append to the base URL. To get this
399 # behaviour correct we always ensure the base URL ends in a '/'
(...) 407 # >>> client.build_request("GET", "/path").url
408 # URL('https://www.example.com/subpath/path')
File ~/.local/lib/python3.12/site-packages/httpx/_urls.py:117, in URL.__init__(self, url, **kwargs)
116 if isinstance(url, str):
--> 117 self._uri_reference = urlparse(url, **kwargs)
118 elif isinstance(url, URL):
File ~/.local/lib/python3.12/site-packages/httpx/_urlparse.py:321, in urlparse(url, **kwargs)
320 parsed_host: str = encode_host(host)
--> 321 parsed_port: int | None = normalize_port(port, scheme)
323 has_scheme = parsed_scheme != ""
File ~/.local/lib/python3.12/site-packages/httpx/_urlparse.py:411, in normalize_port(port, scheme)
410 except ValueError:
--> 411 raise InvalidURL(f"Invalid port: {port!r}")
413 # See https://url.spec.whatwg.org/#url-miscellaneous
InvalidURL: Invalid port: '4000:generateContent'
During handling of the above exception, another exception occurred:
APIConnectionError Traceback (most recent call last)
Cell In[20], line 22
20 messages = [{ "content": "Hello, how are you?","role": "user"}]
21 # response = litellm.completion(messages, model="gemini/gemini-2.0-flash")
---> 22 response = litellm.completion(
23 model = "gemini/gemini-2.0-flash",
24 messages = messages
25 )
File /usr/local/lib/python3.12/dist-packages/litellm/utils.py:1332, in client.<locals>.wrapper(*args, **kwargs)
1328 if logging_obj:
1329 logging_obj.failure_handler(
1330 e, traceback_exception, start_time, end_time
1331 ) # DO NOT MAKE THREADED - router retry fallback relies on this!
-> 1332 raise e
File /usr/local/lib/python3.12/dist-packages/litellm/utils.py:1207, in client.<locals>.wrapper(*args, **kwargs)
1205 print_verbose(f"Error while checking max token limit: {str(e)}")
1206 # MODEL CALL
-> 1207 result = original_function(*args, **kwargs)
1208 end_time = datetime.datetime.now()
1209 if _is_streaming_request(
1210 kwargs=kwargs,
1211 call_type=call_type,
1212 ):
File /usr/local/lib/python3.12/dist-packages/litellm/main.py:3525, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, web_search_options, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
3522 return response
3523 except Exception as e:
3524 ## Map to OpenAI Exception
-> 3525 raise exception_type(
3526 model=model,
3527 custom_llm_provider=custom_llm_provider,
3528 original_exception=e,
3529 completion_kwargs=args,
3530 extra_kwargs=kwargs,
3531 )
File /usr/local/lib/python3.12/dist-packages/litellm/litellm_core_utils/exception_mapping_utils.py:2301, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
2299 if exception_mapping_worked:
2300 setattr(e, "litellm_response_headers", litellm_response_headers)
-> 2301 raise e
2302 else:
2303 for error_type in litellm.LITELLM_EXCEPTION_TYPES:
File /usr/local/lib/python3.12/dist-packages/litellm/litellm_core_utils/exception_mapping_utils.py:2277, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
2270 raise APIConnectionError(
2271 message="{} - {}".format(exception_provider, error_str),
2272 llm_provider=custom_llm_provider,
2273 model=model,
2274 request=original_exception.request,
2275 )
2276 else:
-> 2277 raise APIConnectionError(
2278 message="{}\n{}".format(
2279 str(original_exception), traceback.format_exc()
2280 ),
2281 llm_provider=custom_llm_provider,
2282 model=model,
2283 request=httpx.Request(
2284 method="POST", url="https://api.openai.com/v1/"
2285 ), # stub the request
2286 )
2287 except Exception as e:
2288 # LOGGING
2289 exception_logging(
2290 logger_fn=None,
2291 additional_args={
(...) 2295 exception=e,
2296 )
APIConnectionError: litellm.APIConnectionError: Invalid port: '4000:generateContent'
Traceback (most recent call last):
File "/config/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 409, in normalize_port
port_as_int = int(port)
^^^^^^^^^
ValueError: invalid literal for int() with base 10: '4000:generateContent'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/litellm/main.py", line 2685, in completion
response = vertex_chat_completion.completion( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py", line 1890, in completion
response = client.post(url=url, headers=headers, json=data) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py", line 782, in post
raise e
File "/usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py", line 758, in post
req = self.client.build_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_client.py", line 366, in build_request
url = self._merge_url(url)
^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_client.py", line 396, in _merge_url
merge_url = URL(url)
^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_urls.py", line 117, in init
self._uri_reference = urlparse(url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 321, in urlparse
parsed_port: int | None = normalize_port(port, scheme)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 411, in normalize_port
raise InvalidURL(f"Invalid port: {port!r}")
httpx.InvalidURL: Invalid port: '4000:generateContent'