76 changes: 54 additions & 22 deletions models/cerebras.mdx
---
title: Cerebras
description: Learn how to use Cerebras models in Agno.
---

[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates directly with the Cerebras Python SDK, allowing you to use state-of-the-art models with a simple interface.

## Prerequisites

## Basic Usage

```python
from agno.agent import Agent
from agno.models.cerebras import Cerebras

agent = Agent(
    model=Cerebras(id="llama-3.3-70b"),
    markdown=True,
)

agent.print_response("write a two sentence horror story")
```

## Supported Models

Cerebras currently supports the following production models (see [docs](https://inference-docs.cerebras.ai/models) for the latest list):

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| :--------------- | :-------------- | :---------- | :--------------- |
| Llama 3.1 8B | `llama3.1-8b` | 8 billion | ~2200 |
| Llama 3.3 70B | `llama-3.3-70b` | 70 billion | ~2100 |
| OpenAI GPT OSS | `gpt-oss-120b` | 120 billion | ~3000 |
| Qwen 3 32B | `qwen-3-32b` | 32 billion | ~2600 |
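The advertised throughput figures above give a rough sense of latency. As a back-of-the-envelope illustration (these are published approximations, not benchmarks — real-world speeds will vary):

```python
# Approximate per-model throughput from the table above (tokens per second).
MODEL_SPEEDS = {
    "llama3.1-8b": 2200,
    "llama-3.3-70b": 2100,
    "gpt-oss-120b": 3000,
    "qwen-3-32b": 2600,
}

def estimated_generation_seconds(model_id: str, output_tokens: int) -> float:
    """Rough estimate of wall-clock generation time for a response."""
    return output_tokens / MODEL_SPEEDS[model_id]

# A ~1000-token response from llama-3.3-70b takes on the order of half a second.
print(f"{estimated_generation_seconds('llama-3.3-70b', 1050):.1f}s")  # → 0.5s
```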

## Configuration Options

The `Cerebras` class accepts the following parameters:

| Parameter | Type | Description | Default |
| :--------------- | :----------------------- | :-------------------------------------------------- | :----------- |
| `id` | str | Model identifier (e.g., "llama-3.3-70b") | **Required** |
| `name` | str | Display name for the model | "Cerebras" |
| `provider` | str | Provider name | "Cerebras" |
| `api_key` | Optional[str] | API key (falls back to `CEREBRAS_API_KEY` env var) | None |
| `temperature` | float | Sampling temperature | 0.7 |
| `top_p` | float | Top-p sampling value | 1.0 |
| `request_params` | Optional[Dict[str, Any]] | Additional request parameters | None |
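The `api_key` fallback documented in the table can be sketched as follows. This is an illustration of the documented behavior, not Agno's actual implementation — `resolve_api_key` is a hypothetical helper:

```python
import os
from typing import Optional

def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Illustrate the documented fallback: an explicit argument wins,
    otherwise the CEREBRAS_API_KEY environment variable is used."""
    key = api_key or os.environ.get("CEREBRAS_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided and CEREBRAS_API_KEY is not set."
        )
    return key
```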

### Example with Custom Parameters

```python
from agno.agent import Agent
from agno.models.cerebras import Cerebras

agent = Agent(
    model=Cerebras(
        id="llama-3.3-70b",
        temperature=0.7,
    ),
    markdown=True,
)

agent.print_response("Explain quantum computing in simple terms")
```

### Example with API Key

If you need to pass the API key directly instead of using an environment variable:

```python
from agno.agent import Agent
from agno.models.cerebras import Cerebras

agent = Agent(
    model=Cerebras(
        id="llama-3.3-70b",
        api_key="your-api-key-here",
    ),
    markdown=True,
)

agent.print_response("write a two sentence horror story")
```

## Resources

- [Cerebras Inference Documentation](https://inference-docs.cerebras.ai/introduction)
- [Cerebras API Reference](https://inference-docs.cerebras.ai/api-reference/chat-completions)

### SDK Examples
- View more examples [here](../examples/models/cerebras).
99 changes: 77 additions & 22 deletions models/cerebras_openai.mdx
---
title: Cerebras OpenAI
description: Learn how to use Cerebras OpenAI with Agno.
---

[Cerebras Inference](https://inference-docs.cerebras.ai/introduction) provides high-speed, low-latency AI model inference powered by Cerebras Wafer-Scale Engines and CS-3 systems. Agno integrates with Cerebras through an OpenAI-compatible interface, making it easy to use with tools and libraries that expect the OpenAI API.

## Prerequisites

To use Cerebras with the OpenAI-compatible interface, you need to:

1. **Install the required packages:**

   ```shell
   pip install openai
   ```

2. **Set your API key:**

   The OpenAI SDK expects your API key to be available as an environment variable:

   ```shell
   export CEREBRAS_API_KEY=your_api_key_here
   ```

## Basic Usage

The `CerebrasOpenAI` class provides an OpenAI-style interface for Cerebras models:

```python
from agno.agent import Agent
from agno.models.cerebras import CerebrasOpenAI

agent = Agent(
    model=CerebrasOpenAI(id="llama-3.3-70b"),
    markdown=True,
)

# Print the response in the terminal
agent.print_response("write a two sentence horror story")
```
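Because the endpoint speaks the standard OpenAI chat-completions protocol, any OpenAI-style client can target it. As a minimal standard-library sketch of the request shape (`build_chat_request` is a hypothetical helper; the `/v1/chat/completions` path follows the usual OpenAI convention, and nothing is actually sent here without a valid key):

```python
import json
import os
import urllib.request

def build_chat_request(model_id: str, prompt: str) -> urllib.request.Request:
    """Construct a standard OpenAI-style chat-completions request
    aimed at the Cerebras endpoint (the request is built, not sent)."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.cerebras.ai/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('CEREBRAS_API_KEY', '')}",
        },
        method="POST",
    )

req = build_chat_request("llama-3.3-70b", "write a two sentence horror story")
# To actually send it: urllib.request.urlopen(req)
```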

## Supported Models

Cerebras currently supports the following production models (see [docs](https://inference-docs.cerebras.ai/models) for the latest list):

| Model Name | Model ID | Parameters | Speed (tokens/s) |
| :--------------- | :-------------- | :---------- | :--------------- |
| Llama 3.1 8B | `llama3.1-8b` | 8 billion | ~2200 |
| Llama 3.3 70B | `llama-3.3-70b` | 70 billion | ~2100 |
| OpenAI GPT OSS | `gpt-oss-120b` | 120 billion | ~3000 |
| Qwen 3 32B | `qwen-3-32b` | 32 billion | ~2600 |

## Configuration Options

The `CerebrasOpenAI` class accepts the following parameters:

| Parameter | Type | Description | Default |
| :--------- | :----------- | :------------------------------------------------------------ | :------------------------ |
| `id` | str | Model identifier (e.g., "llama-3.3-70b") | **Required** |
| `name` | str | Display name for the model | "Cerebras" |
| `provider` | str | Provider name | "Cerebras" |
| `api_key` | str | API key (falls back to `CEREBRAS_API_KEY` env var) | None |
| `base_url` | str | URL of the Cerebras OpenAI-compatible endpoint | "https://api.cerebras.ai" |

### Example with Custom Parameters

```python
from agno.agent import Agent
from agno.models.cerebras import CerebrasOpenAI

agent = Agent(
    model=CerebrasOpenAI(
        id="llama-3.3-70b",
        base_url="https://api.cerebras.ai/v1",
    ),
    markdown=True,
)

agent.print_response("Explain quantum computing in simple terms")
```

### Example with API Key

If you need to pass the API key directly instead of using an environment variable:

```python
from agno.agent import Agent
from agno.models.cerebras import CerebrasOpenAI

agent = Agent(
    model=CerebrasOpenAI(
        id="llama-3.3-70b",
        api_key="your-api-key-here",
    ),
    markdown=True,
)

agent.print_response("write a two sentence horror story")
```

## Resources

- [Cerebras Inference Documentation](https://inference-docs.cerebras.ai/introduction)
- [Cerebras API Reference](https://inference-docs.cerebras.ai/api-reference/chat-completions)

### SDK Examples
- View more examples [here](../examples/models/cerebras_openai).