Commit 0aed252

fpagny and RoRoJ authored
feat(genapi): add gpt-oss, mistral-small 3.2 and qwen3 coder (#5415)
* feat(genapi): add gpt-oss, mistral-small 3.2 and qwen3 coder
* feat(genapi): update model catalog
* fix(genapi): update qwen3 parallel tool call feature
* Update pages/managed-inference/reference-content/model-catalog.mdx

---------

Co-authored-by: Rowena Jones <[email protected]>
1 parent 6f6e7cf commit 0aed252

2 files changed: +60 -6 lines changed

pages/generative-apis/reference-content/supported-models.mdx

Lines changed: 7 additions & 4 deletions
@@ -14,22 +14,23 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 | Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License | Model card |
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
 | Google (Preview) | `gemma-3-27b-it` | 40k | 8192 | [Gemma](https://ai.google.dev/gemma/terms) | [HF](https://huggingface.co/google/gemma-3-27b-it) |
-| Mistral | `mistral-small-3.1-24b-instruct-2503` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503) |
+| Mistral | `mistral-small-3.2-24b-instruct-2506` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Small-3.2-24B-Instruct-2506) |

 ## Chat models

 | Provider | Model string | Context window (Tokens) | Maximum output (Tokens)| License | Model card |
 |-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
+| OpenAI | `gpt-oss-120b` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/openai/gpt-oss-120b) |
 | Mistral | `devstral-small-2505` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Devstral-Small-2505) |
 | Meta | `llama-3.3-70b-instruct` | 100k | 4096 | [Llama 3.3 Community](https://www.llama.com/llama3_3/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | Meta | `llama-3.1-8b-instruct` | 128k | 16384 | [Llama 3.1 Community](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
 | Mistral | `mistral-nemo-instruct-2407` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
 | Qwen | `qwen3-235b-a22b-instruct-2507` | 40k | 4096 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507) |
-| Qwen | `qwen2.5-coder-32b-instruct` | 32k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
+| Qwen | `qwen3-coder-30b-a3b-instruct` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) |
 | DeepSeek | `deepseek-r1-distill-llama-70b` | 32k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |

 <Message type="tip">
-If you are unsure which chat model to use, we currently recommend Mistral Small 3.1 24B Instruct (`mistral-small-3.1-24b-instruct-2503`) to get started.
+If you are unsure which chat model to use, we currently recommend Mistral Small 3.2 24B Instruct (`mistral-small-3.2-24b-instruct-2506`) to get started.
 </Message>

 ## Vision models
@@ -61,7 +62,8 @@ Deprecated models should not be queried anymore. We recommend to use newer model

 | Provider | Model string | End of Life (EOL) date
 |-----------------|-----------------|-----------------|
-| Meta | `llama-3.1-70b-instruct` | 25th May, 2025 |
+| Mistral | `mistral-small-3.1-24b-instruct-2503` | 14th November, 2025 |
+| Qwen | `qwen2.5-coder-32b-instruct` | 14th November, 2025 |

 <Message type="note">
 Llama 3.1 70B is now deprecated. The new Llama 3.3 70B is available with similar or better performance in most use cases.
@@ -74,4 +76,5 @@ These models are not accessible anymore from Generative APIs. They can still how

 | Provider | Model string | EOL date
 |-----------------|-----------------|-----------------|
+| Meta | `llama-3.1-70b-instruct` | 25th May, 2025 |
 | SBERT | `sentence-t5-xxl` | 26 February, 2025 |
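
The deprecation and EOL tables changed in this file lend themselves to a small client-side guard. The sketch below is illustrative only: `resolve_model`, `EOL_DATES` and `REPLACEMENTS` are hypothetical names, the successor mapping is inferred from the rows this commit swaps (plus the Llama 3.3 note), and treating a model as unavailable from its EOL date onward is an assumption rather than documented behavior.

```python
from datetime import date

# EOL dates copied from the deprecated / EOL tables in the diff above.
EOL_DATES = {
    "mistral-small-3.1-24b-instruct-2503": date(2025, 11, 14),
    "qwen2.5-coder-32b-instruct": date(2025, 11, 14),
    "llama-3.1-70b-instruct": date(2025, 5, 25),
    "sentence-t5-xxl": date(2025, 2, 26),
}

# Assumed successors, inferred from the rows replaced in this commit.
# The source names no successor for sentence-t5-xxl, so none is listed.
REPLACEMENTS = {
    "mistral-small-3.1-24b-instruct-2503": "mistral-small-3.2-24b-instruct-2506",
    "qwen2.5-coder-32b-instruct": "qwen3-coder-30b-a3b-instruct",
    "llama-3.1-70b-instruct": "llama-3.3-70b-instruct",
}


def resolve_model(model: str, today: date) -> str:
    """Return `model`, or its assumed successor once the EOL date is reached."""
    eol = EOL_DATES.get(model)
    if eol is None or today < eol:
        # Unknown to the tables, or still before its EOL date: keep as-is.
        return model
    replacement = REPLACEMENTS.get(model)
    if replacement is None:
        raise ValueError(f"{model} reached EOL on {eol} and has no known successor")
    return replacement
```

Passing `today` explicitly keeps the helper deterministic; a caller would typically supply `date.today()`.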

pages/managed-inference/reference-content/model-catalog.mdx

Lines changed: 53 additions & 2 deletions
@@ -16,6 +16,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib

 | Model name | Provider | Maximum Context length (tokens) | Modalities | Compatible Instances (Max Context in tokens\*) | License |
 |------------|----------|--------------|------------|-----------|---------|
+| [`gpt-oss-120b`](#gpt-oss-120b) | OpenAI | 128k | Text | H100 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen3-235b-a22b-instruct-2507`](#qwen3-235b-a22b-instruct-2507) | Qwen | 40k | Text | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`gemma-3-27b-it`](#gemma-3-27b-it) | Google | 40k | Text, Vision | H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`llama-3.3-70b-instruct`](#llama-33-70b-instruct) | Meta | 128k | Text | H100 (15k), H100-2 | [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
@@ -26,6 +27,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | [`deepseek-r1-distill-70b`](#deepseek-r1-distill-llama-70b) | Deepseek | 128k | Text | H100 (13k), H100-2 | [MIT](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/blob/main/LICENSE) and [Llama 3.3 Community](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/blob/main/LICENSE) |
 | [`deepseek-r1-distill-8b`](#deepseek-r1-distill-llama-8b) | Deepseek | 128k | Text | L4 (90k), L40S, H100, H100-2 | [MIT](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/blob/main/LICENSE) and [Llama 3.1 Community](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/blob/main/LICENSE) |
 | [`mistral-7b-instruct-v0.3`](#mistral-7b-instruct-v03) | Mistral | 32k | Text | L4, L40S, H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| [`mistral-small-3.2-24b-instruct-2506`](#mistral-small-32-24b-instruct-2506) | Mistral | 128k | Text, Vision | H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`mistral-small-3.1-24b-instruct-2503`](#mistral-small-31-24b-instruct-2503) | Mistral | 128k | Text, Vision | H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`mistral-small-24b-instruct-2501`](#mistral-small-24b-instruct-2501) | Mistral | 32k | Text | L40S (20k), H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`mistral-nemo-instruct-2407`](#mistral-nemo-instruct-2407) | Mistral | 128k | Text | L40S, H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
@@ -36,6 +38,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | [`moshika-0.1-8b`](#moshika-01-8b) | Kyutai | 4k | Audio to Audio| L4, H100 | [CC-BY-4.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/cc-by-4.0.md) |
 | [`pixtral-12b-2409`](#pixtral-12b-2409) | Mistral | 128k | Text, Vision | L40S (50k), H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`molmo-72b-0924`](#molmo-72b-0924) | Allen AI | 50k | Text, Vision | H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) and [Twonyi Qianwen license](https://huggingface.co/Qwen/Qwen2-72B/blob/main/LICENSE)|
+| [`qwen3-coder-30b-a3b-instruct`](#qwen3-coder-30b-a3b-instruct) | Qwen | 128k | Code | L40S, H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`qwen2.5-coder-32b-instruct`](#qwen25-coder-32b-instruct) | Qwen | 32k | Code | H100, H100-2 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
 | [`bge-multilingual-gemma2`](#bge-multilingual-gemma2) | BAAI | 4k | Embeddings | L4, L40S, H100, H100-2 | [Gemma](https://ai.google.dev/gemma/terms) |
 | [`sentence-t5-xxl`](#sentence-t5-xxl) | Sentence transformers | 512 | Embeddings | L4 | [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
@@ -45,6 +48,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 ## Models feature summary
 | Model name | Structured output supported | Function calling | Supported languages |
 | --- | --- | --- | --- |
+| `gpt-oss-120b` | Yes | Yes | English |
 | `qwen3-235b-a22b-instruct-2507` | Yes | Yes | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
 | `gemma-3-27b-it` | Yes | Partial | English, Chinese, Japanese, Korean and 31 additional languages |
 | `llama-3.3-70b-instruct` | Yes | Yes | English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai |
@@ -55,6 +59,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | `deepseek-r1-distill-llama-70B` | Yes | Yes | English, Chinese |
 | `deepseek-r1-distill-llama-8B` | Yes | Yes | English, Chinese |
 | `mistral-7b-instruct-v0.3` | Yes | Yes | English |
+| `mistral-small-3.2-24b-instruct-2506` | Yes | Yes | English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi |
 | `mistral-small-3.1-24b-instruct-2503` | Yes | Yes | English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi |
 | `mistral-small-24b-instruct-2501` | Yes | Yes | English, French, German, Dutch, Spanish, Italian, Polish, Portuguese, Chinese, Japanese, Korean |
 | `mistral-nemo-instruct-2407` | Yes | Yes | English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, Japanese |
@@ -65,6 +70,7 @@ A quick overview of available models in Scaleway's catalog and their core attrib
 | `moshika-0.1-8b` | No | No | English |
 | `pixtral-12b-2409` | Yes | Yes | English |
 | `molmo-72b-0924` | Yes | No | English |
+| `qwen3-coder-30b-a3b-instruct` | Yes | Yes | English, French, German, Chinese, Japanese, Korean and 113 additional languages and dialects |
 | `qwen2.5-coder-32b-instruct` | Yes | Yes | English, French, Spanish, Portuguese, German, Italian, Russian, Chinese, Japanese, Korean, Vietnamese, Thai, Arabic and 16 additional languages. |
 | `bge-multilingual-gemma2` | No | No | English, French, Chinese, Japanese, Korean |
 | `sentence-t5-xxl` | No | No | English |
@@ -98,6 +104,22 @@ google/gemma-3-27b-it:bf16

 - Pan & Scan is not yet supported for Gemma 3 images. This means that high resolution images are currently resized to 896x896 resolution that may generate artifacts and lead to a lower accuracy.

+### Mistral-small-3.2-24b-instruct-2506
+Mistral-small-3.2-24b-instruct-2506 is an improved version of Mistral-small-3.1 which performs better on tool calling.
+This model was optimized to have a dense knowledge and faster tokens throughput compared to its size.
+
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+| Supported images formats | PNG, JPEG, WEBP, and non-animated GIFs |
+| Maximum image resolution (pixels) | 1540x1540 |
+| Token dimension (pixels)| 28x28 |
+
+#### Model names
+```
+mistral/mistral-small-3.2-24b-instruct-2506:fp8
+```
+
 ### Mistral-small-3.1-24b-instruct-2503
 Mistral-small-3.1-24b-instruct-2503 is a model developed by Mistral to perform text processing and image analysis on many languages.
 This model was optimized to have a dense knowledge and faster tokens throughput compared to its size.
@@ -112,6 +134,7 @@ This model was optimized to have a dense knowledge and faster tokens throughput
 #### Model names
 ```
 mistral/mistral-small-3.1-24b-instruct-2503:bf16
+mistral/mistral-small-3.1-24b-instruct-2503:fp8
 ```

 - Bitmap (or raster) image formats, meaning storing images as grids of individual pixels, are supported. Vector image formats (SVG, PSD) are not supported, neither PDFs nor videos.
@@ -147,16 +170,31 @@ allenai/molmo-72b-0924:fp8

 ## Text models

-### Qwen3-235b-a22b-instruct-2507
+### Gpt-oss-120b
 Released July 23, 2025, Qwen 3 235B A22B is an open-weight model, competitive in multiple benchmarks (such as [LM Arena for text use cases](https://lmarena.ai/leaderboard)) compared to Gemini 2.5 Pro and GPT4.5.

 | Attribute | Value |
 |-----------|-------|
 | Supports parallel tool calling | Yes |

+
+
 #### Model name
 ```
-qwen/qwen3-235b-a22b-instruct-2507:awq
+openai/gpt-oss-120b:fp4
+```
+
+### Gpt-oss-120b
+Released August 5, 2025, GPT OSS 120B is an open-weight model providing significant throughput performance and reasoning capabilities.
+Currently, this model should be used through Responses API, as Chat Completion does not yet support tool calling for this model.
+
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+
+#### Model name
+```
+openai/gpt-oss-120b:fp4
 ```

 ### Llama-3.3-70b-instruct
@@ -333,6 +371,19 @@ kyutai/moshika-0.1-8b:fp8

 ## Code models

+### Qwen3-coder-30b-a3b-instruct
+Qwen3-coder is an improved version of Qwen2.5 with better accuracy and throughput.
+Thanks to its a3b architecture, only a subset of its weights are activated for a given generation, leading to much faster input and output token processing, ideal for code completion.
+
+| Attribute | Value |
+|-----------|-------|
+| Supports parallel tool calling | Yes |
+
+#### Model name
+```
+qwen/qwen3-coder-30b-a3b-instruct:fp8
+```
+
 ### Qwen2.5-coder-32b-instruct
 Qwen2.5-coder is your intelligent programming assistant familiar with more than 40 programming languages.
 With Qwen2.5-coder deployed at Scaleway, your company can benefit from code generation, AI-assisted code repair, and code reasoning.
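
The "Model name" entries touched in this file (for example `mistral/mistral-small-3.2-24b-instruct-2506:fp8` or `openai/gpt-oss-120b:fp4`) all appear to follow a `provider/model:quantization` shape. That pattern is inferred from the examples in the diff, not from a documented grammar, and `ModelName` and `parse_model_name` below are hypothetical helpers written only to illustrate it.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    provider: str      # e.g. "mistral", "openai", "qwen"
    model: str         # e.g. "gpt-oss-120b"
    quantization: str  # e.g. "fp8", "fp4", "bf16"


def parse_model_name(name: str) -> ModelName:
    """Split a catalog model name, assuming a provider/model:quantization shape."""
    provider, slash, rest = name.partition("/")
    # rpartition: the quantization is whatever follows the last colon.
    model, colon, quantization = rest.rpartition(":")
    if not (slash and colon and provider and model and quantization):
        raise ValueError(f"unexpected model name format: {name!r}")
    return ModelName(provider, model, quantization)
```

A helper like this can be handy for checking which quantization a deployment script refers to, for instance telling apart the `bf16` and `fp8` variants of `mistral-small-3.1-24b-instruct-2503` listed above.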
