diff --git a/models/cerebras.mdx b/models/cerebras.mdx
index 6409e9f8..dd9d57d3 100644
--- a/models/cerebras.mdx
+++ b/models/cerebras.mdx
@@ -3,7 +3,7 @@ title: Cerebras
 description: Learn how to use Cerebras models in Agno.
 ---
 
-[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates directly with the Cerebras Python SDK, allowing you to use state-of-the-art Llama models with a simple interface.
+[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates directly with the Cerebras Python SDK, allowing you to use state-of-the-art models with a simple interface.
 
 ## Prerequisites
 
@@ -29,7 +29,7 @@ from agno.agent import Agent
 from agno.models.cerebras import Cerebras
 
 agent = Agent(
-    model=Cerebras(id="llama-4-scout-17b-16e-instruct"),
+    model=Cerebras(id="llama-3.3-70b"),
     markdown=True,
 )
 
@@ -39,32 +39,64 @@ agent.print_response("write a two sentence horror story")
 
 ## Supported Models
 
-Cerebras currently supports the following models (see [docs](https://inference-docs.cerebras.ai/introduction) for the latest list):
+Cerebras currently supports the following production models (see [docs](https://inference-docs.cerebras.ai/models) for the latest list):
 
-| Model Name | Model ID | Parameters | Knowledge |
-| ------------------------------- | ------------------------------ | ----------- | ------------- |
-| Llama 4 Scout | llama-4-scout-17b-16e-instruct | 109 billion | August 2024 |
-| Llama 3.1 8B | llama3.1-8b | 8 billion | March 2023 |
-| Llama 3.3 70B | llama-3.3-70b | 70 billion | December 2023 |
-| DeepSeek R1 Distill Llama 70B* | deepseek-r1-distill-llama-70b | 70 billion | December 2023 |
-
-\* DeepSeek R1 Distill Llama 70B is available in private preview.
+| Model Name     | Model ID        | Parameters  | Speed (tokens/s) |
+| :------------- | :-------------- | :---------- | :--------------- |
+| Llama 3.1 8B   | `llama3.1-8b`   | 8 billion   | ~2200            |
+| Llama 3.3 70B  | `llama-3.3-70b` | 70 billion  | ~2100            |
+| OpenAI GPT OSS | `gpt-oss-120b`  | 120 billion | ~3000            |
+| Qwen 3 32B     | `qwen-3-32b`    | 32 billion  | ~2600            |
 
 ## Configuration Options
 
 The `Cerebras` class accepts the following parameters:
 
-| Parameter | Type | Description | Default |
-| ------------- | -------------- | --------------------------------------------------- | ------------------------------ |
-| `id` | str | Model identifier (e.g., "llama-4-scout-17b-16e-instruct") | **Required** |
-| `name` | str | Display name for the model | "Cerebras" |
-| `provider` | str | Provider name | "Cerebras" |
-| `api_key` | Optional[str] | API key (falls back to `CEREBRAS_API_KEY` env var) | None |
-| `max_tokens` | Optional[int] | Maximum tokens in the response | None |
-| `temperature` | float | Sampling temperature | 0.7 |
-| `top_p` | float | Top-p sampling value | 1.0 |
-| `request_params` | Optional[Dict[str, Any]] | Additional request parameters | None |
+| Parameter        | Type                     | Description                                        | Default      |
+| :--------------- | :----------------------- | :------------------------------------------------- | :----------- |
+| `id`             | str                      | Model identifier (e.g., "llama-3.3-70b")           | **Required** |
+| `name`           | str                      | Display name for the model                         | "Cerebras"   |
+| `provider`       | str                      | Provider name                                      | "Cerebras"   |
+| `api_key`        | Optional[str]            | API key (falls back to `CEREBRAS_API_KEY` env var) | None         |
+| `temperature`    | float                    | Sampling temperature                               | 0.7          |
+| `top_p`          | float                    | Top-p sampling value                               | 1.0          |
+| `request_params` | Optional[Dict[str, Any]] | Additional request parameters                      | None         |
+
+### Example with Custom Parameters
+
+```python
+from agno.agent import Agent
+from agno.models.cerebras import Cerebras
+
+agent = Agent(
+    model=Cerebras(
+        id="llama-3.3-70b",
+        temperature=0.7
+    ),
+    markdown=True
+)
+
+agent.print_response("Explain quantum computing in simple terms")
+```
+
+### Example with API Key
+
+If you need to pass the API key directly instead of using an environment variable:
+
+```python
+from agno.agent import Agent
+from agno.models.cerebras import Cerebras
+
+agent = Agent(
+    model=Cerebras(
+        id="llama-3.3-70b",
+        api_key="your-api-key-here"
+    ),
+    markdown=True
+)
+
+agent.print_response("write a two sentence horror story")
+```
 
 ## Resources
 
@@ -72,4 +104,4 @@ The `Cerebras` class accepts the following parameters:
 - [Cerebras API Reference](https://inference-docs.cerebras.ai/api-reference/chat-completions)
 
 ### SDK Examples
-- View more examples [here](../examples/models/cerebras).
\ No newline at end of file
+- View more examples [here](../examples/models/cerebras).
diff --git a/models/cerebras_openai.mdx b/models/cerebras_openai.mdx
index 70ebc0c2..eb10d367 100644
--- a/models/cerebras_openai.mdx
+++ b/models/cerebras_openai.mdx
@@ -3,30 +3,33 @@ title: Cerebras OpenAI
 description: Learn how to use Cerebras OpenAI with Agno.
 ---
 
-## OpenAI-Compatible Integration
+[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates with Cerebras through an OpenAI-compatible interface, making it easy to use with tools and libraries that expect the OpenAI API.
 
-Cerebras can also be used via an OpenAI-compatible interface, making it easy to integrate with tools and libraries that expect the OpenAI API.
+## Prerequisites
 
-### Using the OpenAI-Compatible Class
+To use Cerebras with the OpenAI-compatible interface, you need to:
 
-The `CerebrasOpenAI` class provides an OpenAI-style interface for Cerebras models:
+1. **Install the required packages:**
+   ```shell
+   pip install openai
+   ```
 
-First, install openai:
+2. **Set your API key:**
+   The OpenAI SDK expects your API key to be available as an environment variable:
+   ```shell
+   export CEREBRAS_API_KEY=your_api_key_here
+   ```
 
-```shell
-pip install openai
-```
+## Basic Usage
 
+The `CerebrasOpenAI` class provides an OpenAI-style interface for Cerebras models:
 
 ```python
 from agno.agent import Agent
 from agno.models.cerebras import CerebrasOpenAI
 
 agent = Agent(
-    model=CerebrasOpenAI(
-        id="llama-4-scout-17b-16e-instruct",  # Model ID to use
-        # base_url="https://api.cerebras.ai", # Optional: default endpoint for Cerebras
-    ),
+    model=CerebrasOpenAI(id="llama-3.3-70b"),
     markdown=True,
 )
 
@@ -34,17 +37,69 @@ agent = Agent(
 agent.print_response("write a two sentence horror story")
 ```
 
-### Configuration Options
+## Supported Models
+
+Cerebras currently supports the following production models (see [docs](https://inference-docs.cerebras.ai/models) for the latest list):
+
+| Model Name     | Model ID        | Parameters  | Speed (tokens/s) |
+| :------------- | :-------------- | :---------- | :--------------- |
+| Llama 3.1 8B   | `llama3.1-8b`   | 8 billion   | ~2200            |
+| Llama 3.3 70B  | `llama-3.3-70b` | 70 billion  | ~2100            |
+| OpenAI GPT OSS | `gpt-oss-120b`  | 120 billion | ~3000            |
+| Qwen 3 32B     | `qwen-3-32b`    | 32 billion  | ~2600            |
+
+## Configuration Options
 
 The `CerebrasOpenAI` class accepts the following parameters:
 
-| Parameter | Type | Description | Default |
-| ----------- | ------------ | -------------------------------------------------------------- | ------------------------------ |
-| `id` | str | Model identifier (e.g., "llama-4-scout-17b-16e-instruct") | **Required** |
-| `name` | str | Display name for the model | "Cerebras" |
-| `provider` | str | Provider name | "Cerebras" |
-| `api_key` | str | API key (falls back to CEREBRAS_API_KEY environment variable) | None |
-| `base_url` | str | URL of the Cerebras OpenAI-compatible endpoint | "https://api.cerebras.ai" |
+| Parameter  | Type | Description                                        | Default                   |
+| :--------- | :--- | :------------------------------------------------- | :------------------------ |
+| `id`       | str  | Model identifier (e.g., "llama-3.3-70b")           | **Required**              |
+| `name`     | str  | Display name for the model                         | "Cerebras"                |
+| `provider` | str  | Provider name                                      | "Cerebras"                |
+| `api_key`  | str  | API key (falls back to `CEREBRAS_API_KEY` env var) | None                      |
+| `base_url` | str  | URL of the Cerebras OpenAI-compatible endpoint     | "https://api.cerebras.ai" |
+
+### Example with Custom Parameters
+
+```python
+from agno.agent import Agent
+from agno.models.cerebras import CerebrasOpenAI
+
+agent = Agent(
+    model=CerebrasOpenAI(
+        id="llama-3.3-70b",
+        base_url="https://api.cerebras.ai/v1"
+    ),
+    markdown=True
+)
+
+agent.print_response("Explain quantum computing in simple terms")
+```
+
+### Example with API Key
+
+If you need to pass the API key directly instead of using an environment variable:
+
+```python
+from agno.agent import Agent
+from agno.models.cerebras import CerebrasOpenAI
+
+agent = Agent(
+    model=CerebrasOpenAI(
+        id="llama-3.3-70b",
+        api_key="your-api-key-here"
+    ),
+    markdown=True
+)
+
+agent.print_response("write a two sentence horror story")
+```
+
+## Resources
+
+- [Cerebras Inference Documentation](https://inference-docs.cerebras.ai/introduction)
+- [Cerebras API Reference](https://inference-docs.cerebras.ai/api-reference/chat-completions)
 
-### Examples
-- View more examples [here](../examples/models/cerebras_openai).
\ No newline at end of file
+### SDK Examples
+- View more examples [here](../examples/models/cerebras_openai).