[AI Endpoints] - Billing Guide - Update prices, columns name and add new models #8223

Merged (4 commits) on Jul 31, 2025
@@ -1,7 +1,7 @@
---
title: AI Endpoints - Billing and lifecycle
excerpt: Learn how we bill AI Endpoints
-updated: 2025-04-28
+updated: 2025-07-31
---

> [!primary]
@@ -31,36 +31,36 @@ By following this model lifecycle process, OVHcloud ensures that customers are w

## Billing principles

-Here is the model billing overview for AI Endpoints.
-
-| Category | Model | Price ($) | Price (€) | Unit Price |
-| -------------- | --------------- | ------ | ------ | ---------- |
-| Large Language Model (LLM) | Llama 3.3 70B Instruct | 0.74 | 0.67 | per 1M tokens |
-| Large Language Model (LLM) | Llama 3.1 70B Instruct | 0.74 | 0.67 | per 1M tokens |
-| Large Language Model (LLM) | Mixtral 8x7B Instruct v0.1 | 0.70 | 0.63 | per 1M tokens |
-| Large Language Model (LLM) | Mistral-Nemo-Instruct-2407 | 0.14 | 0.13 | per 1M tokens |
-| Large Language Model (LLM) | Llama 3.1 8B Instruct | 0.10 | 0.10 | per 1M tokens |
-| Large Language Model (LLM) | Mistral 7B Instruct v0.3 | 0.11 | 0.10 | per 1M tokens |
-| Reasoning LLM | DeepSeek R1 Distill Llama 70B | 0.74 | 0.67 | per 1M tokens |
-| Reasoning LLM | Qwen3 32B | 0.09 | 0.08 | per 1M tokens |
-| Code LLM | Qwen2.5 Coder 32B Instruct | 0.96 | 0.87 | per 1M tokens |
-| Code LLM | Mamba Codestral 7B v0.1 | 0.21 | 0.19 | per 1M tokens |
-| Visual LLM | Mistral Small 3.2 24B Instruct 2506 | 0.10 | 0.09 | per 1M tokens |
-| Visual LLM | Qwen2.5 VL 72B Instruct | 1.01 | 0.91 | per 1M tokens |
-| Visual LLM | Llava Next Mistral 7B | 0.32 | 0.29 | per 1M tokens |
-| Embeddings | BGE Multilingual Gemma2 | 0.01 | 0.01 | per 1M tokens |
-| Embeddings | BGE-M3 | 0.01 | 0.01 | per 1M tokens |
-| Embeddings | BGE Base EN v1.5 | 0.01 | 0.005 | per 1M tokens |
-| Natural Language Processing (NLP) | Roberta Base Go Emotions | Free | Free | per 1M characters |
-| Natural Language Processing (NLP) | Bert Base Multilingual uncased sentiment | Free | Free | per 1M characters |
-| Natural Language Processing (NLP) | Bert Base NER | Free | Free | per 1M characters |
-| Natural Language Processing (NLP) | Bart Large CNN | Free | Free | per 1M characters |
-| Image generation| Stable Diffusion XL | Free | Free | per image |
-| Speech to Text | RIVA Automatic Speech Recognition | Free | Free | per hour |
-| Text to Speech | RIVA Text-to-Speech | Free | Free | per hour |
-| Translation | T5-Large | Free | Free | per 1M characters |
-| Computer vision | YOLOv11 Object Detection | Free | Free | per image |
-| Computer vision | YOLOv11 Image Segmentation | Free | Free | per image |
+Here is the model billing overview for AI Endpoints models:
+
+| Category | Model | Input Price (\$) | Output Price (\$) | Input Price (€) | Output Price (€) | Unit Price |
+| -------------------------- | -------------------------- | ---------------- | ----------------- | --------------- | ---------------- | --------------------------------- |
+| Large Language Model (LLM) | Llama 3.3 70B Instruct | 0.74 | 0.74 | 0.67 | 0.67 | Per 1 million tokens |
+| Large Language Model (LLM) | Llama 3.1 70B Instruct | 0.74 | 0.74 | 0.67 | 0.67 | Per 1 million tokens |
+| Large Language Model (LLM) | Mixtral 8x7B Instruct v0.1 | 0.70 | 0.70 | 0.63 | 0.63 | Per 1 million tokens |
+| Large Language Model (LLM) | Mistral Nemo Instruct 2407 | 0.14 | 0.14 | 0.13 | 0.13 | Per 1 million tokens |
+| Large Language Model (LLM) | Llama 3.1 8B Instruct | 0.11 | 0.11 | 0.10 | 0.10 | Per 1 million tokens |
+| Large Language Model (LLM) | Mistral 7B Instruct v0.3 | 0.11 | 0.11 | 0.10 | 0.10 | Per 1 million tokens |
+| Reasoning LLM | Qwen 3 32B | 0.09 | 0.25 | 0.08 | 0.23 | Per 1 million tokens |
+| Reasoning LLM | DeepSeek R1 Distill Llama 70B | 0.74 | 0.74 | 0.67 | 0.67 | Per 1 million tokens |
+| Code LLM | Qwen 2.5 Coder 32B Instruct | 0.96 | 0.96 | 0.87 | 0.87 | Per 1 million tokens |
+| Code LLM | Mamba Codestral 7B v0.1 | 0.21 | 0.21 | 0.19 | 0.19 | Per 1 million tokens |
+| Visual LLM | Mistral Small 3.2 24B Instruct 2506 | 0.10 | 0.31 | 0.09 | 0.28 | Per 1 million tokens |
+| Visual LLM | Qwen 2.5 VL 72B Instruct | 1.01 | 1.01 | 0.91 | 0.91 | Per 1 million tokens |
+| Visual LLM | Llava Next Mistral 7B | 0.32 | 0.32 | 0.29 | 0.29 | Per 1 million tokens |
+| Embeddings | BGE Multilingual Gemma2 | 0.01 | 0.01 | Free | Free | Per 1 million tokens |
+| Embeddings | BGE M3 | 0.01 | 0.01 | Free | Free | Per 1 million tokens |
+| Embeddings | BGE Base EN v1.5 | 0.01 | 0.01 | Free | Free | Per 1 million tokens |
+| Natural Language Processing (NLP) | Roberta Base Go Emotions | Free | Free | Free | Free | Per 1 million characters |
+| Natural Language Processing (NLP) | Bert Base Multilingual uncased sentiment | Free | Free | Free | Free | Per 1 million characters |
+| Natural Language Processing (NLP) | Bert Base NER | Free | Free | Free | Free | Per 1 million characters |
+| Natural Language Processing (NLP) | Bart Large CNN | Free | Free | Free | Free | Per 1 million characters |
+| Image generation | Stable Diffusion XL | Free | Free | Free | Free | Per image |
+| Audio Analysis | RIVA Automatic Speech Recognition | Free | Free | Free | Free | Per hour |
+| Audio Analysis | RIVA Text-to-Speech | Free | Free | Free | Free | Per hour |
+| Translation | T5-Large | Free | Free | Free | Free | Per 1 million characters |
+| Computer vision | YOLOv11 Object Detection | Free | Free | Free | Free | Per image |
+| Computer vision | YOLOv11 Image Segmentation | Free | Free | Free | Free | Per image |
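
Reviewer note: since the updated table splits prices into input and output columns, a rough cost estimate for a token-billed model could be sketched as below. This is only an illustration, not OVHcloud's billing logic: it assumes input and output tokens are each billed linearly at the per-million rate in their column, with no rounding, minimum charge, or free tier. The `PRICES_EUR` dictionary and the `estimate_cost_eur` function are hypothetical names; the euro figures are copied from three rows of the updated table.

```python
from decimal import Decimal

# The table's unit for LLMs is "Per 1 million tokens".
TOKENS_PER_UNIT = 1_000_000

# (input price, output price) in EUR per 1 million tokens,
# copied from the updated table. Hypothetical structure for illustration.
PRICES_EUR = {
    "Llama 3.3 70B Instruct": (Decimal("0.67"), Decimal("0.67")),
    "Qwen 3 32B": (Decimal("0.08"), Decimal("0.23")),
    "Mistral Small 3.2 24B Instruct 2506": (Decimal("0.09"), Decimal("0.28")),
}


def estimate_cost_eur(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate a cost in EUR, assuming simple linear per-token billing."""
    input_price, output_price = PRICES_EUR[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 2M input tokens and 500k output tokens on Qwen 3 32B.
    print(estimate_cost_eur("Qwen 3 32B", 2_000_000, 500_000))
```

`Decimal` is used rather than `float` so that small per-token amounts add up without binary rounding error. Actual invoices may differ from this estimate.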

## Feedback
