Description:

When attempting to use litellm.completion, the Python SDK throws an httpx.InvalidURL error. The error appears to be caused by incorrect URL construction before the request is even sent to the litellm proxy Docker container, and it happens irrespective of the model being called.

The litellm proxy itself is running within a Docker environment, and other services are able to use the proxy without issues, which suggests the core litellm proxy functionality is operational. The issue appears specific to the Python SDK's handling of the base URL when using litellm.completion.

Using the OpenAI SDK and pointing it to the same litellm proxy instance works as expected, indicating the proxy's general connectivity and configuration are correct. The problem seems isolated to how the litellm Python SDK constructs the request URL when litellm.completion is called.
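The URL that reaches httpx appears to be the configured api_base with ":generateContent" appended directly (compare the 404 seen with a trailing slash in the code example below). The following is a minimal sketch that reproduces the same parsing failure with httpx alone; the appended ":generateContent" suffix is an assumption inferred from the error messages, not taken from litellm's source:

import httpx

# Assumption (inferred from the error messages below): the Gemini code path
# appends ":generateContent" directly to the configured api_base.
url = "http://litellm:4000" + ":generateContent"

try:
    # httpx splits the authority at the first colon: host "litellm",
    # port "4000:generateContent", which is not a valid integer port.
    httpx.URL(url)
except httpx.InvalidURL as e:
    print(e)  # Invalid port: '4000:generateContent'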
Steps to Reproduce:
- Ensure a litellm proxy Docker container is running (version 1.75.8, OAS 3.1).
- Set the api_base and api_key for the litellm Python SDK (version 1.75.8) to point to the litellm proxy.
Code Example:

import litellm

api_base = "http://litellm:4000"  # Replace with your litellm proxy address
# Same problem with a trailing slash: "http://litellm:4000/"
# HTTPStatusError: Client error '404 Not Found' for url 'http://litellm:4000/:generateContent'
api_key = "xyz"  # Replace with your actual api key (if needed by litellm)

litellm.api_base = api_base
litellm.api_key = api_key

messages = [{"content": "Hello, how are you?", "role": "user"}]

try:
    response = litellm.completion(
        model="gpt-3.5-turbo",  # Or any other model; the traceback below was produced with "gemini/gemini-2.0-flash"
        messages=messages,
    )
    print(response)
except Exception as e:
    print(f"Error: {e}")
Expected Behavior:
The litellm.completion call should successfully route the request to the litellm proxy, which in turn should forward it to the appropriate backend API (assuming correct litellm proxy configuration). The function should return a completion response.

Actual Behavior:
The code raises an httpx.InvalidURL exception. The error message indicates that the port in the constructed URL is invalid, likely due to an incorrect concatenation of the base URL and the API endpoint.
Error Message:
(Full debug message below)
ValueError: invalid literal for int() with base 10: '4000:generateContent'
During handling of the above exception, another exception occurred:
httpx.InvalidURL: Invalid port: '4000:generateContent'
Working OpenAI Example:
The following code works correctly, demonstrating that the litellm proxy is generally accessible:
import openai

api_base = "http://litellm:4000"  # Replace with your litellm proxy address
api_key = "xyz"  # Replace with your actual api key (if needed by litellm)

client = openai.OpenAI(
    api_key=api_key,
    base_url=api_base,  # LiteLLM Proxy is OpenAI compatible, Read More: https://docs.litellm.ai/docs/proxy/user_keys
)

response = client.chat.completions.create(
    model="gemini/gemini-2.0-flash",  # model to send to the proxy
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
)
print(response)
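The same pattern can also be expressed through the litellm SDK by treating the proxy as a generic OpenAI-compatible endpoint (model prefixed with openai/, with the proxy address passed as api_base). This is a sketch based on LiteLLM's OpenAI-compatible endpoint support, included only to mirror the working example above; it has not been verified against this setup:

import litellm

# Sketch (assumes the "openai/" prefix for OpenAI-compatible endpoints):
# keep the request on the OpenAI-compatible code path, mirroring the
# working openai-SDK example above.
response = litellm.completion(
    model="openai/gemini-2.0-flash",  # model name as exposed by the proxy
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
    api_base="http://litellm:4000",
    api_key="xyz",
)
print(response)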
Environment:
- litellm Python SDK version: 1.75.8
- litellm proxy Docker image version: 1.75.8 (OAS 3.1)
- Python version: 3.12 (per the site-packages paths in the traceback below)
Additional Information:
- Using the OpenAI SDK with the same api_base and api_key (pointing to the litellm proxy) works correctly. This confirms the proxy is reachable and configured for basic requests.
- The error appears to occur before the request is sent to the litellm proxy Docker container, suggesting a problem with URL construction within the Python SDK's litellm.completion function. A debug-logging sketch to check this is included after this list.
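To support the claim that the request never leaves the client, LiteLLM's debug logging can be enabled before the call; the LITELLM_LOG environment variable is the documented debug switch, though exactly what the pre-call log prints for this code path is an assumption:

import os

# Set before importing litellm so the debug logger picks it up.
os.environ["LITELLM_LOG"] = "DEBUG"

import litellm

litellm.api_base = "http://litellm:4000"
litellm.api_key = "xyz"

try:
    litellm.completion(
        model="gemini/gemini-2.0-flash",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
    )
except Exception as e:
    # The InvalidURL/APIConnectionError surfaces before any HTTP request is sent;
    # the debug output should show the constructed URL in the pre-call log.
    print(f"Error: {e}")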
Full debug message
ValueError Traceback (most recent call last)
File ~/.local/lib/python3.12/site-packages/httpx/_urlparse.py:409, in normalize_port(port, scheme)
408 try:
--> 409 port_as_int = int(port)
410 except ValueError:
ValueError: invalid literal for int() with base 10: '4000:generateContent'
During handling of the above exception, another exception occurred:
InvalidURL Traceback (most recent call last)
File /usr/local/lib/python3.12/dist-packages/litellm/main.py:2685, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, web_search_options, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
2684 new_params = deepcopy(optional_params)
-> 2685 response = vertex_chat_completion.completion( # type: ignore
2686 model=model,
2687 messages=messages,
2688 model_response=model_response,
2689 print_verbose=print_verbose,
2690 optional_params=new_params,
2691 litellm_params=litellm_params, # type: ignore
2692 logger_fn=logger_fn,
2693 encoding=encoding,
2694 vertex_location=vertex_ai_location,
2695 vertex_project=vertex_ai_project,
2696 vertex_credentials=vertex_credentials,
2697 gemini_api_key=gemini_api_key,
2698 logging_obj=logging,
2699 acompletion=acompletion,
2700 timeout=timeout,
2701 custom_llm_provider=custom_llm_provider,
2702 client=client,
2703 api_base=api_base,
2704 extra_headers=extra_headers,
2705 )
2707 elif custom_llm_provider == "vertex_ai":
File /usr/local/lib/python3.12/dist-packages/litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py:1890, in VertexLLM.completion(self, model, messages, model_response, print_verbose, custom_llm_provider, encoding, logging_obj, optional_params, acompletion, timeout, vertex_project, vertex_location, vertex_credentials, gemini_api_key, litellm_params, logger_fn, extra_headers, client, api_base)
1889 try:
-> 1890 response = client.post(url=url, headers=headers, json=data) # type: ignore
1891 response.raise_for_status()
File /usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py:782, in HTTPHandler.post(self, url, data, json, params, headers, stream, timeout, files, content, logging_obj)
781 except Exception as e:
--> 782 raise e
File /usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py:758, in HTTPHandler.post(self, url, data, json, params, headers, stream, timeout, files, content, logging_obj)
757 else:
--> 758 req = self.client.build_request(
759 "POST", url, data=data, json=json, params=params, headers=headers, files=files, content=content # type: ignore
760 )
761 response = self.client.send(req, stream=stream)
File ~/.local/lib/python3.12/site-packages/httpx/_client.py:366, in BaseClient.build_request(self, method, url, content, data, files, json, params, headers, cookies, timeout, extensions)
355 """
356 Build and return a request instance.
357
(...) 364 [0]: /advanced/clients/#request-instances
365 """
--> 366 url = self._merge_url(url)
367 headers = self._merge_headers(headers)
File ~/.local/lib/python3.12/site-packages/httpx/_client.py:396, in BaseClient._merge_url(self, url)
392 """
393 Merge a URL argument together with any 'base_url' on the client,
394 to create the URL used for the outgoing request.
395 """
--> 396 merge_url = URL(url)
397 if merge_url.is_relative_url:
398 # To merge URLs we always append to the base URL. To get this
399 # behaviour correct we always ensure the base URL ends in a '/'
(...) 407 # >>> client.build_request("GET", "/path").url
408 # URL('https://www.example.com/subpath/path')
File ~/.local/lib/python3.12/site-packages/httpx/_urls.py:117, in URL.__init__(self, url, **kwargs)
116 if isinstance(url, str):
--> 117 self._uri_reference = urlparse(url, **kwargs)
118 elif isinstance(url, URL):
File ~/.local/lib/python3.12/site-packages/httpx/_urlparse.py:321, in urlparse(url, **kwargs)
320 parsed_host: str = encode_host(host)
--> 321 parsed_port: int | None = normalize_port(port, scheme)
323 has_scheme = parsed_scheme != ""
File ~/.local/lib/python3.12/site-packages/httpx/_urlparse.py:411, in normalize_port(port, scheme)
410 except ValueError:
--> 411 raise InvalidURL(f"Invalid port: {port!r}")
413 # See https://url.spec.whatwg.org/#url-miscellaneous
InvalidURL: Invalid port: '4000:generateContent'
During handling of the above exception, another exception occurred:
APIConnectionError Traceback (most recent call last)
Cell In[20], line 22
20 messages = [{ "content": "Hello, how are you?","role": "user"}]
21 # response = litellm.completion(messages, model="gemini/gemini-2.0-flash")
---> 22 response = litellm.completion(
23 model = "gemini/gemini-2.0-flash",
24 messages = messages
25 )
File /usr/local/lib/python3.12/dist-packages/litellm/utils.py:1332, in client.<locals>.wrapper(*args, **kwargs)
1328 if logging_obj:
1329 logging_obj.failure_handler(
1330 e, traceback_exception, start_time, end_time
1331 ) # DO NOT MAKE THREADED - router retry fallback relies on this!
-> 1332 raise e
File /usr/local/lib/python3.12/dist-packages/litellm/utils.py:1207, in client.<locals>.wrapper(*args, **kwargs)
1205 print_verbose(f"Error while checking max token limit: {str(e)}")
1206 # MODEL CALL
-> 1207 result = original_function(*args, **kwargs)
1208 end_time = datetime.datetime.now()
1209 if _is_streaming_request(
1210 kwargs=kwargs,
1211 call_type=call_type,
1212 ):
File /usr/local/lib/python3.12/dist-packages/litellm/main.py:3525, in completion(model, messages, timeout, temperature, top_p, n, stream, stream_options, stop, max_completion_tokens, max_tokens, modalities, prediction, audio, presence_penalty, frequency_penalty, logit_bias, user, reasoning_effort, response_format, seed, tools, tool_choice, logprobs, top_logprobs, parallel_tool_calls, web_search_options, deployment_id, extra_headers, functions, function_call, base_url, api_version, api_key, model_list, thinking, **kwargs)
3522 return response
3523 except Exception as e:
3524 ## Map to OpenAI Exception
-> 3525 raise exception_type(
3526 model=model,
3527 custom_llm_provider=custom_llm_provider,
3528 original_exception=e,
3529 completion_kwargs=args,
3530 extra_kwargs=kwargs,
3531 )
File /usr/local/lib/python3.12/dist-packages/litellm/litellm_core_utils/exception_mapping_utils.py:2301, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
2299 if exception_mapping_worked:
2300 setattr(e, "litellm_response_headers", litellm_response_headers)
-> 2301 raise e
2302 else:
2303 for error_type in litellm.LITELLM_EXCEPTION_TYPES:
File /usr/local/lib/python3.12/dist-packages/litellm/litellm_core_utils/exception_mapping_utils.py:2277, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs, extra_kwargs)
2270 raise APIConnectionError(
2271 message="{} - {}".format(exception_provider, error_str),
2272 llm_provider=custom_llm_provider,
2273 model=model,
2274 request=original_exception.request,
2275 )
2276 else:
-> 2277 raise APIConnectionError(
2278 message="{}\n{}".format(
2279 str(original_exception), traceback.format_exc()
2280 ),
2281 llm_provider=custom_llm_provider,
2282 model=model,
2283 request=httpx.Request(
2284 method="POST", url="https://api.openai.com/v1/"
2285 ), # stub the request
2286 )
2287 except Exception as e:
2288 # LOGGING
2289 exception_logging(
2290 logger_fn=None,
2291 additional_args={
(...) 2295 exception=e,
2296 )
APIConnectionError: litellm.APIConnectionError: Invalid port: '4000:generateContent'
Traceback (most recent call last):
File "/config/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 409, in normalize_port
port_as_int = int(port)
^^^^^^^^^
ValueError: invalid literal for int() with base 10: '4000:generateContent'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/litellm/main.py", line 2685, in completion
response = vertex_chat_completion.completion( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py", line 1890, in completion
response = client.post(url=url, headers=headers, json=data) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py", line 782, in post
raise e
File "/usr/local/lib/python3.12/dist-packages/litellm/llms/custom_httpx/http_handler.py", line 758, in post
req = self.client.build_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_client.py", line 366, in build_request
url = self._merge_url(url)
^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_client.py", line 396, in _merge_url
merge_url = URL(url)
^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_urls.py", line 117, in init
self._uri_reference = urlparse(url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 321, in urlparse
parsed_port: int | None = normalize_port(port, scheme)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/config/.local/lib/python3.12/site-packages/httpx/_urlparse.py", line 411, in normalize_port
raise InvalidURL(f"Invalid port: {port!r}")
httpx.InvalidURL: Invalid port: '4000:generateContent'