76 changes: 54 additions & 22 deletions models/cerebras.mdx
---
title: Cerebras
description: Learn how to use Cerebras models in Agno.
---

[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates directly with the Cerebras Python SDK, allowing you to use state-of-the-art models with a simple interface.

## Prerequisites

## Basic Usage

```python
from agno.agent import Agent
from agno.models.cerebras import Cerebras

agent = Agent(
    model=Cerebras(id="llama-3.3-70b"),
    markdown=True,
)

agent.print_response("write a two sentence horror story")
```

## Supported Models

Cerebras currently supports the following production models (see [docs](https://inference-docs.cerebras.ai/models) for the latest list):

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| :--------------- | :-------------- | :---------- | :--------------- |
| Llama 3.1 8B | `llama3.1-8b` | 8 billion | ~2200 |
| Llama 3.3 70B | `llama-3.3-70b` | 70 billion | ~2100 |
| OpenAI GPT OSS | `gpt-oss-120b` | 120 billion | ~3000 |
| Qwen 3 32B | `qwen-3-32b` | 32 billion | ~2600 |
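The advertised throughput figures above give a rough sense of latency. As a back-of-the-envelope illustration (these are published approximations, not benchmarks — real-world speeds will vary):

```python
# Approximate per-model throughput from the table above (tokens per second).
MODEL_SPEEDS = {
    "llama3.1-8b": 2200,
    "llama-3.3-70b": 2100,
    "gpt-oss-120b": 3000,
    "qwen-3-32b": 2600,
}

def estimated_generation_seconds(model_id: str, output_tokens: int) -> float:
    """Rough estimate of wall-clock generation time for a response."""
    return output_tokens / MODEL_SPEEDS[model_id]

# A ~1000-token response from llama-3.3-70b takes on the order of half a second.
print(f"{estimated_generation_seconds('llama-3.3-70b', 1050):.1f}s")  # → 0.5s
```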

## Configuration Options

The `Cerebras` class accepts the following parameters:

| Parameter | Type | Description | Default |
| :--------------- | :----------------------- | :-------------------------------------------------- | :----------- |
| `id` | str | Model identifier (e.g., "llama-3.3-70b") | **Required** |
| `name` | str | Display name for the model | "Cerebras" |
| `provider` | str | Provider name | "Cerebras" |
| `api_key` | Optional[str] | API key (falls back to `CEREBRAS_API_KEY` env var) | None |
| `temperature` | float | Sampling temperature | 0.7 |
| `top_p` | float | Top-p sampling value | 1.0 |
| `request_params` | Optional[Dict[str, Any]] | Additional request parameters | None |
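The `api_key` fallback documented in the table can be sketched as follows. This is an illustration of the documented behavior, not Agno's actual implementation — `resolve_api_key` is a hypothetical helper:

```python
import os
from typing import Optional

def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Illustrate the documented fallback: an explicit argument wins,
    otherwise the CEREBRAS_API_KEY environment variable is used."""
    key = api_key or os.environ.get("CEREBRAS_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided and CEREBRAS_API_KEY is not set."
        )
    return key
```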

### Example with Custom Parameters

```python
from agno.agent import Agent
from agno.models.cerebras import Cerebras

agent = Agent(
    model=Cerebras(
        id="llama-3.3-70b",
        temperature=0.7,
    ),
    markdown=True,
)

agent.print_response("Explain quantum computing in simple terms")
```

### Example with API Key

If you need to pass the API key directly instead of using an environment variable:

```python
from agno.agent import Agent
from agno.models.cerebras import Cerebras

agent = Agent(
    model=Cerebras(
        id="llama-3.3-70b",
        api_key="your-api-key-here",
    ),
    markdown=True,
)

agent.print_response("write a two sentence horror story")
```

## Resources

- [Cerebras Inference Documentation](https://inference-docs.cerebras.ai/introduction)
- [Cerebras API Reference](https://inference-docs.cerebras.ai/api-reference/chat-completions)

### SDK Examples
- View more examples [here](../examples/models/cerebras).
99 changes: 77 additions & 22 deletions models/cerebras_openai.mdx
---
title: Cerebras OpenAI
description: Learn how to use Cerebras OpenAI with Agno.
---

[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates with Cerebras through an OpenAI-compatible interface, making it easy to use with tools and libraries that expect the OpenAI API.

## Prerequisites

To use Cerebras with the OpenAI-compatible interface, you need to:

1. **Install the required packages:**

   ```shell
   pip install openai
   ```

2. **Set your API key:**

   The OpenAI SDK expects your API key to be available as an environment variable:

   ```shell
   export CEREBRAS_API_KEY=your_api_key_here
   ```

## Basic Usage

The `CerebrasOpenAI` class provides an OpenAI-style interface for Cerebras models:

```python
from agno.agent import Agent
from agno.models.cerebras import CerebrasOpenAI

agent = Agent(
    model=CerebrasOpenAI(id="llama-3.3-70b"),
    markdown=True,
)

# Print the response in the terminal
agent.print_response("write a two sentence horror story")
```
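Because the endpoint speaks the standard OpenAI chat-completions protocol, any OpenAI-style client can target it. As a minimal standard-library sketch of the request shape (`build_chat_request` is a hypothetical helper; the `/v1/chat/completions` path follows the usual OpenAI convention, and nothing is actually sent here without a valid key):

```python
import json
import os
import urllib.request

def build_chat_request(model_id: str, prompt: str) -> urllib.request.Request:
    """Construct a standard OpenAI-style chat-completions request
    aimed at the Cerebras endpoint (the request is built, not sent)."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.cerebras.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('CEREBRAS_API_KEY', '')}",
        },
        method="POST",
    )

req = build_chat_request("llama-3.3-70b", "write a two sentence horror story")
# To actually send it: urllib.request.urlopen(req)
```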

## Supported Models

Cerebras currently supports the following production models (see [docs](https://inference-docs.cerebras.ai/models) for the latest list):

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| :--------------- | :-------------- | :---------- | :--------------- |
| Llama 3.1 8B | `llama3.1-8b` | 8 billion | ~2200 |
| Llama 3.3 70B | `llama-3.3-70b` | 70 billion | ~2100 |
| OpenAI GPT OSS | `gpt-oss-120b` | 120 billion | ~3000 |
| Qwen 3 32B | `qwen-3-32b` | 32 billion | ~2600 |

## Configuration Options

The `CerebrasOpenAI` class accepts the following parameters:

| Parameter | Type | Description | Default |
| :--------- | :----------- | :------------------------------------------------------------ | :------------------------ |
| `id` | str | Model identifier (e.g., "llama-3.3-70b") | **Required** |
| `name` | str | Display name for the model | "Cerebras" |
| `provider` | str | Provider name | "Cerebras" |
| `api_key` | str | API key (falls back to `CEREBRAS_API_KEY` env var) | None |
| `base_url` | str | URL of the Cerebras OpenAI-compatible endpoint | "https://api.cerebras.ai" |

### Example with Custom Parameters

```python
from agno.agent import Agent
from agno.models.cerebras import CerebrasOpenAI

agent = Agent(
    model=CerebrasOpenAI(
        id="llama-3.3-70b",
        base_url="https://api.cerebras.ai/v1",
    ),
    markdown=True,
)

agent.print_response("Explain quantum computing in simple terms")
```

### Example with API Key

If you need to pass the API key directly instead of using an environment variable:

```python
from agno.agent import Agent
from agno.models.cerebras import CerebrasOpenAI

agent = Agent(
    model=CerebrasOpenAI(
        id="llama-3.3-70b",
        api_key="your-api-key-here",
    ),
    markdown=True,
)

agent.print_response("write a two sentence horror story")
```

## Resources

- [Cerebras Inference Documentation](https://inference-docs.cerebras.ai/introduction)
- [Cerebras API Reference](https://inference-docs.cerebras.ai/api-reference/chat-completions)

### SDK Examples
- View more examples [here](../examples/models/cerebras_openai).