Apex AI Proxy is a free, personal AI Gateway that runs on Cloudflare Workers. It aggregates multiple AI service providers behind a unified OpenAI-compatible API, allowing you to overcome rate limits and take advantage of free quotas from different providers.
Why you'll care:
- Completely Free: Runs entirely on Cloudflare Workers' free plan
- Load Balancing: Distributes requests across multiple providers to overcome rate limits
- Maximize Free Quotas: Take advantage of free tiers from different AI providers
- Multiple API Keys: Register multiple keys for the same service provider
- OpenAI Client Compatible: Works with any library that speaks OpenAI's API format
2025-04 Update

Apex AI Proxy now supports the new OpenAI `/v1/responses`-style API, which is the latest standard for OpenAI-compatible services. This update is crucial for:

- Ecosystem Compatibility: Seamless integration with the latest OpenAI tools (e.g., Codex) and clients that require the `/v1/responses` API.
- Future-Proofing: Ensures your proxy remains compatible with evolving OpenAI standards.
- `/v1/responses` API Support: You can now use the new response-based endpoints, unlocking compatibility with next-gen OpenAI clients and tools.
- Response ID-based Endpoints: Some endpoints now operate based on a `response_id`. To support this, a new `kv_namespaces` configuration is required for caching and managing response data.
- Configuration Change: Add the `kv_namespaces` field to your configuration (see below) to enable proper response caching and retrieval.
```js
module.exports = {
  // ...existing config...
  kv_namespaces: [
    { binding: 'RESPONSE_KV', id: 'your-kv-namespace-id' }
  ],
};
```
Note: Without this configuration, some `/v1/responses` endpoints will not function correctly.
In short, this update:

- Unlocks new OpenAI ecosystem tools (like Codex)
- Aligns with the latest API standards
- Enables advanced features that require response ID tracking

For more details, see the updated usage and configuration sections below.
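Once the KV namespace is in place, any client that speaks the Responses API can target the proxy. Here is a minimal sketch using the official OpenAI Python SDK; the model name is a placeholder, and the retrieval call assumes your proxy serves the response ID-based endpoints described above:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-proxy.workers.dev/v1",
    api_key="your-configured-api-key",
)

# Create a response via the new /v1/responses endpoint
response = client.responses.create(
    model="gpt-4o-mini",  # any model configured in your proxy
    input="Say hello from the Responses API.",
)
print(response.output_text)

# Response ID-based retrieval relies on the RESPONSE_KV namespace configured above
fetched = client.responses.retrieve(response.id)
```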
- Multi-Provider Support: Aggregate Azure, DeepSeek, Aliyun, and more behind one API
- Smart Request Distribution: Automatically routes requests to available providers (see the sketch after this list)
- Multiple API Key Management: Register multiple keys for the same provider to further increase limits
- Protocol Translation: Handles different provider authentication methods and API formats
- Robust Error Handling: Gracefully handles provider errors and fails over when needed
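To make the distribution idea concrete, here is a minimal Python sketch of picking a (provider, key) pair for a requested model. It only illustrates the strategy; the Worker's actual implementation differs:

```python
import random

# Illustrative config: two providers serve the same logical model
model_providers = {
    "DeepSeek-R1": [
        {"provider": "aliyuncs", "model": "deepseek-r1", "api_keys": ["key-a", "key-b"]},
        {"provider": "deepinfra", "model": "deepseek-ai/DeepSeek-R1", "api_keys": ["key-c"]},
    ],
}

def pick_target(model: str) -> tuple[str, str, str]:
    """Spread load across providers, then across keys within a provider."""
    candidate = random.choice(model_providers[model])
    key = random.choice(candidate["api_keys"])
    return candidate["provider"], candidate["model"], key

print(pick_target("DeepSeek-R1"))
```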
- Clone the repository:
```sh
git clone https://github.com/loadchange/apex-ai-proxy.git
cd apex-ai-proxy
```
- Install dependencies:
```sh
pnpm install
```
- Configure your providers (in `wrangler-config.js`):
```js
// First, define your providers with their base URLs and API keys
const providerConfig = {
  aliyuncs: {
    base_url: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
    api_keys: ['your-aliyun-key'],
  },
  deepinfra: {
    base_url: 'https://api.deepinfra.com/v1/openai',
    api_keys: ['your-deepinfra-key'],
  },
  azure: {
    base_url: 'https://:name.azure.com/openai/deployments/:model',
    api_keys: ['your-azure-key'],
  },
  // Add more providers as needed
};

// Then, configure your models and assign providers to them
const modelProviderConfig = {
  'gpt-4o-mini': {
    providers: [
      {
        provider: 'azure',
        model: 'gpt-4o-mini',
      },
      // Add more providers for the same model
    ],
  },
  'DeepSeek-R1': {
    providers: [
      {
        provider: 'aliyuncs',
        model: 'deepseek-r1',
      },
      {
        provider: 'deepinfra',
        model: 'deepseek-ai/DeepSeek-R1',
      },
      // You can still override provider settings for specific models if needed
      {
        provider: 'azure',
        base_url: 'https://your-custom-endpoint.azure.com/openai/deployments/DeepSeek-R1',
        api_key: 'your-custom-azure-key',
        model: 'DeepSeek-R1',
      },
    ],
  },
};
```
- Deploy to Cloudflare Workers:
```sh
pnpm run deploy
```
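After deploying, you can sanity-check the Worker by listing the models you configured. This sketch assumes the proxy exposes the standard OpenAI `/v1/models` endpoint:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-proxy.workers.dev/v1",
    api_key="your-configured-api-key",
)

# Should print the model names from modelProviderConfig, e.g. gpt-4o-mini, DeepSeek-R1
for model in client.models.list():
    print(model.id)
```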
- Rate Limit Issues: By distributing requests across multiple providers, you can overcome rate limits imposed by individual services
- Cost Optimization: Take advantage of free tiers from different providers
- API Consistency: Use a single, consistent API format (OpenAI-compatible) regardless of the underlying provider
- Simplified Integration: No need to modify your existing code that uses OpenAI clients
```python
# Works with ANY OpenAI client!
from openai import OpenAI

client = OpenAI(
    base_url="https://your-proxy.workers.dev/v1",
    api_key="your-configured-api-key"
)

# Use any model you've configured in your proxy
response = client.chat.completions.create(
    model="DeepSeek-R1",  # This will be routed to one of your configured providers
    messages=[{"role": "user", "content": "Why is this proxy awesome?"}]
)
```
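Streaming works the same way; a sketch reusing the client above, assuming the provider backing the model supports streaming:

```python
stream = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role": "user", "content": "Stream a short haiku."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```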
You can configure multiple API keys for the same provider to further increase your rate limits:
```js
{
  provider: 'aliyuncs',
  base_url: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
  api_keys: [
    'your-first-aliyun-key',
    'your-second-aliyun-key',
    'your-third-aliyun-key'
  ],
  model: 'deepseek-r1',
}
```
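The effect is additive: if each key allows R requests per minute, rotating across N keys gives you roughly N × R. A toy Python sketch of the rotation idea (illustrative only, not the Worker's actual code):

```python
from itertools import cycle

# Each key absorbs roughly 1/N of the traffic, so N keys ≈ N× the rate limit
aliyun_keys = cycle([
    "your-first-aliyun-key",
    "your-second-aliyun-key",
    "your-third-aliyun-key",
])

def next_key() -> str:
    return next(aliyun_keys)

print(next_key())  # your-first-aliyun-key
print(next_key())  # your-second-aliyun-key
```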
Found a bug or want to add support for more providers? PRs are welcome!
Apex AI Proxy provides comprehensive support for the Anthropic API format, enabling seamless integration with Claude Code and other Anthropic-compatible tools. This allows you to use any OpenAI-compatible provider (like DeepSeek, Azure OpenAI, etc.) with Anthropic ecosystem tools.
- Install Claude Code:
```sh
npm install -g @anthropic-ai/claude-code
```
- Configure Environment Variables:
```sh
export ANTHROPIC_BASE_URL=https://your-proxy.workers.dev
export ANTHROPIC_AUTH_TOKEN=your-configured-api-key
export API_TIMEOUT_MS=600000
export ANTHROPIC_MODEL=your-model-name
export ANTHROPIC_SMALL_FAST_MODEL=your-fast-model-name
```
- Enter Your Project Directory and Execute Claude Code:
```sh
cd my-project
claude
```
- Install Anthropic SDK:
```sh
pip install anthropic
```
- Configure and Use:
```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://your-proxy.workers.dev",
    api_key="your-configured-api-key"
)

message = client.messages.create(
    model="your-model-name",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)
print(message.content)
```
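Streaming through the Anthropic-compatible endpoint uses the SDK's standard streaming helper; a sketch reusing the client above:

```python
with client.messages.stream(
    model="your-model-name",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Tell me a short story."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```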
Our implementation provides comprehensive support for the Anthropic Messages API with automatic conversion to/from OpenAI format for backend providers.
| Field | Support Status | Notes |
|---|---|---|
| `anthropic-beta` | Fully Supported | Automatically set for tool use (`tools-2024-04-04`) |
| `anthropic-version` | Fully Supported | Set to `2023-06-01` |
| `x-api-key` | Fully Supported | Primary authentication method |
| Field | Support Status | Notes |
|---|---|---|
| `model` | Fully Supported | Routes to configured provider models |
| `max_tokens` | Fully Supported | Required field, passed through to providers |
| `messages` | Fully Supported | Full message format conversion |
| `system` | Fully Supported | Supports both string and array formats |
| `temperature` | Fully Supported | Range [0.0 ~ 2.0] |
| `top_p` | Fully Supported | Passed through to compatible providers |
| `top_k` | Ignored | Not supported by most OpenAI-compatible providers |
| `stop_sequences` | Fully Supported | Converted to OpenAI `stop` parameter |
| `stream` | Fully Supported | Full streaming support with proper event formatting |
| `metadata` | Ignored | Not applicable for proxy use case |
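As the `system` row notes, both forms are accepted and collapse to a single system prompt after conversion; a sketch using the anthropic client from the earlier example:

```python
# String form
client.messages.create(
    model="your-model-name",
    max_tokens=256,
    system="You are terse.",
    messages=[{"role": "user", "content": "Hi"}],
)

# Array-of-blocks form, equivalent after conversion
client.messages.create(
    model="your-model-name",
    max_tokens=256,
    system=[{"type": "text", "text": "You are terse."}],
    messages=[{"role": "user", "content": "Hi"}],
)
```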
| Field | Sub-Field | Support Status | Notes |
|---|---|---|---|
| `tools` | `name` | Fully Supported | Function name mapping |
| | `description` | Fully Supported | Function description |
| | `input_schema` | Fully Supported | JSON schema with automatic cleaning |
| `tool_choice` | `auto` | Fully Supported | Default tool selection |
| | `any` | Supported | Mapped to `auto` for OpenAI compatibility |
| | `tool` | Fully Supported | Specific tool selection |
| | `none` | Fully Supported | No tool usage |
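Tool definitions pass through with their name, description, and input_schema, so a standard Anthropic tool-use request works unchanged. A sketch using the client from the earlier example (the tool name and schema are illustrative):

```python
message = client.messages.create(
    model="your-model-name",
    max_tokens=1000,
    tools=[{
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    tool_choice={"type": "auto"},
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
)
```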
| Type | Sub-Field | Support Status | Notes |
|---|---|---|---|
| `text` | `text` | Fully Supported | Plain text content |
| `image` | | Not Supported | Requires provider-specific implementation |
| `tool_use` | `id` | Fully Supported | Tool call identification |
| | `name` | Fully Supported | Function name |
| | `input` | Fully Supported | Function arguments |
| `tool_result` | `tool_use_id` | Fully Supported | Tool response linking |
| | `content` | Fully Supported | Tool execution results |
| | `is_error` | Supported | Error state indication |
| Field | Support Status | Notes |
|---|---|---|
| `id` | Fully Supported | Generated message ID |
| `type` | Fully Supported | Always `message` |
| `role` | Fully Supported | Always `assistant` |
| `content` | Fully Supported | Array of content blocks |
| `stop_reason` | Fully Supported | `end_turn`, `max_tokens`, `tool_use` |
| `usage` | Fully Supported | Token usage statistics |
| Event Type | Support Status | Notes |
|---|---|---|
| `message_start` | Fully Supported | Message initialization |
| `content_block_start` | Fully Supported | Content block initialization |
| `content_block_delta` | Fully Supported | Incremental content updates |
| `content_block_stop` | Fully Supported | Content block completion |
| `message_stop` | Fully Supported | Message completion |
- Automatic Protocol Conversion: Seamlessly converts between Anthropic and OpenAI formats (a simplified sketch follows this list)
- Tool Call Buffering: Handles incremental tool call data in streaming responses
- Error Handling: Comprehensive error mapping and user-friendly error messages
- Provider Fallback: Automatic failover to alternative providers when available
- Schema Cleaning: Removes unsupported JSON schema fields for provider compatibility
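To make the conversion concrete, here is a simplified Python sketch of the Anthropic-to-OpenAI request mapping. The Worker's real implementation is more thorough (tools, streaming, error mapping); this only shows the shape of the translation:

```python
def anthropic_to_openai(req: dict) -> dict:
    """Toy translation of an Anthropic Messages request into OpenAI chat format."""
    messages = []

    # `system` may be a string or an array of text blocks
    system = req.get("system")
    if isinstance(system, str):
        messages.append({"role": "system", "content": system})
    elif isinstance(system, list):
        text = "".join(b.get("text", "") for b in system)
        messages.append({"role": "system", "content": text})

    for m in req.get("messages", []):
        content = m["content"]
        if isinstance(content, list):  # flatten text content blocks
            content = "".join(b.get("text", "") for b in content if b.get("type") == "text")
        messages.append({"role": m["role"], "content": content})

    out = {
        "model": req["model"],
        "max_tokens": req["max_tokens"],
        "messages": messages,
    }
    if "stop_sequences" in req:
        out["stop"] = req["stop_sequences"]  # renamed parameter
    # top_k and metadata are dropped (ignored), per the tables above
    return out
```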
- Image Content: Not supported (requires provider-specific multimodal implementations)
- Document Processing: Not supported in current version
- Advanced Content Types: Some specialized content types are not implemented
- Cache Control: Caching directives are ignored (handled at proxy level)
```js
// Configure a model for Anthropic API usage
const modelProviderConfig = {
  'claude-3-sonnet': {
    providers: [
      {
        provider: 'deepseek',
        model: 'deepseek-chat',
        base_url: 'https://api.deepseek.com/v1',
        api_key: 'your-deepseek-key',
      },
      {
        provider: 'azure',
        model: 'gpt-4o',
        base_url: 'https://your-azure.openai.azure.com',
        api_key: 'your-azure-key',
      },
    ],
  },
};
```
This configuration allows you to use `claude-3-sonnet` as the model name in Anthropic API calls, while the proxy routes requests to your configured OpenAI-compatible providers.