Commit 2ae0aa3

fix(provider/openai): file search tool optional query param
1 parent: 970163a

365 files changed (+4886 −850 lines)

.changeset/gorgeous-pets-tie.md

Lines changed: 5 additions & 0 deletions

@@ -0,0 +1,5 @@
+---
+'@ai-sdk/anthropic': patch
+---
+
+fix(provider/anthropic): add cacheControl to AnthropicProviderOptions
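For context, a minimal sketch of what the `cacheControl` provider option described above can look like in application code; the model ID and the exact option shape are assumptions based on the Anthropic provider docs, not taken from this commit:

```ts
import { anthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

// Minimal sketch (assumed shape, not taken from this commit): mark a message
// as cacheable through Anthropic provider options.
const result = await generateText({
  model: anthropic('claude-3-5-sonnet-20241022'),
  messages: [
    {
      role: 'system',
      content: 'You are an assistant for a long reference document.',
      providerOptions: {
        anthropic: { cacheControl: { type: 'ephemeral' } },
      },
    },
    { role: 'user', content: 'Summarize the key points.' },
  ],
});

console.log(result.text);
```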

.changeset/rich-ghosts-dream.md

Lines changed: 5 additions & 0 deletions

@@ -0,0 +1,5 @@
+---
+'@ai-sdk/openai': patch
+---
+
+Fix openai file_search tool to take optional query param
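For context, a minimal sketch of wiring up the OpenAI file search tool that this patch touches; the Responses model, tool option names, and vector store ID are assumptions drawn from the provider docs rather than from this diff:

```ts
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

// Minimal sketch (option names assumed, not taken from this diff):
// provider-executed file search over a vector store. 'vs_...' is a placeholder ID.
const result = await generateText({
  model: openai.responses('gpt-4o-mini'),
  prompt: 'What does the onboarding guide say about vacation policy?',
  tools: {
    file_search: openai.tools.fileSearch({
      vectorStoreIds: ['vs_0000000000000000'],
    }),
  },
});

console.log(result.text);
```

Per the changeset above, the tool's `query` parameter is now optional, so tool calls that omit an explicit query should no longer fail validation.
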
Lines changed: 107 additions & 0 deletions

@@ -0,0 +1,107 @@
+---
+title: Google Gemini Image Generation
+description: Generate and edit images with Google Gemini 2.5 Flash Image using the AI SDK.
+tags: ['image-generation', 'google', 'gemini']
+---
+
+# Generate and Edit Images with Google Gemini 2.5 Flash
+
+This guide will show you how to generate and edit images with the AI SDK and Google's latest multimodal language model Gemini 2.5 Flash Image.
+
+## Generating Images
+
+As Gemini 2.5 Flash Image is a language model with multimodal capabilities, you can use the `generateText` or `streamText` functions (not `generateImage`) to create images. The model determines which modality to respond in based on your prompt and configuration. Here's how to create your first image:
+
+```ts
+import { google } from '@ai-sdk/google';
+import { generateText } from 'ai';
+import fs from 'node:fs';
+import 'dotenv/config';
+
+async function generateImage() {
+  const result = await generateText({
+    model: google('gemini-2.5-flash-image-preview'),
+    prompt:
+      'Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme',
+  });
+
+  // Save generated images
+  for (const file of result.files) {
+    if (file.mediaType.startsWith('image/')) {
+      const timestamp = Date.now();
+      const fileName = `generated-${timestamp}.png`;
+
+      fs.mkdirSync('output', { recursive: true });
+      await fs.promises.writeFile(`output/${fileName}`, file.uint8Array);
+
+      console.log(`Generated and saved image: output/${fileName}`);
+    }
+  }
+}
+
+generateImage().catch(console.error);
+```
+
+Here are some key points to remember:
+
+- Generated images are returned in the `result.files` array
+- Images are returned as `Uint8Array` data
+- The model leverages Gemini's world knowledge, so detailed prompts yield better results
+
+## Editing Images
+
+Gemini 2.5 Flash Image excels at editing existing images with natural language instructions. You can add elements, modify styles, or transform images while maintaining their core characteristics:
+
+```ts
+import { google } from '@ai-sdk/google';
+import { generateText } from 'ai';
+import fs from 'node:fs';
+import 'dotenv/config';
+
+async function editImage() {
+  const editResult = await generateText({
+    model: google('gemini-2.5-flash-image-preview'),
+    prompt: [
+      {
+        role: 'user',
+        content: [
+          {
+            type: 'text',
+            text: 'Add a small wizard hat to this cat. Keep everything else the same.',
+          },
+          {
+            type: 'image',
+            // image: DataContent (string | Uint8Array | ArrayBuffer | Buffer) or URL
+            image: new URL(
+              'https://raw.githubusercontent.com/vercel/ai/refs/heads/main/examples/ai-core/data/comic-cat.png',
+            ),
+            mediaType: 'image/png',
+          },
+        ],
+      },
+    ],
+  });
+
+  // Save the edited image
+  const timestamp = Date.now();
+  fs.mkdirSync('output', { recursive: true });
+
+  for (const file of editResult.files) {
+    if (file.mediaType.startsWith('image/')) {
+      await fs.promises.writeFile(
+        `output/edited-${timestamp}.png`,
+        file.uint8Array,
+      );
+      console.log(`Saved edited image: output/edited-${timestamp}.png`);
+    }
+  }
+}
+
+editImage().catch(console.error);
+```
+
+## What's Next?
+
+You've learned how to generate new images from text prompts and edit existing images using natural language instructions with Google's Gemini 2.5 Flash Image model.
+
+For more advanced techniques, integration patterns, and practical examples, check out our [Cookbook](/cookbook) where you'll find comprehensive guides for building sophisticated AI-powered applications.

content/docs/03-ai-sdk-core/10-generating-structured-data.mdx

Lines changed: 1 addition & 1 deletion

@@ -214,7 +214,7 @@ You can access the reasoning used by the language model to generate the object v
 ```ts
 import { openai, OpenAIResponsesProviderOptions } from '@ai-sdk/openai';
 import { generateObject } from 'ai';
-import { z } from 'zod/v4';
+import { z } from 'zod';
 
 const result = await generateObject({
   model: openai('gpt-5'),

content/docs/03-ai-sdk-core/35-image-generation.mdx

Lines changed: 2 additions & 5 deletions

@@ -231,18 +231,15 @@ try {
 
 ## Generating Images with Language Models
 
-Some language models such as Google `gemini-2.0-flash-exp` support multi-modal outputs including images.
+Some language models such as Google `gemini-2.5-flash-image-preview` support multi-modal outputs including images.
 With such models, you can access the generated images using the `files` property of the response.
 
 ```ts
 import { google } from '@ai-sdk/google';
 import { generateText } from 'ai';
 
 const result = await generateText({
-  model: google('gemini-2.0-flash-exp'),
-  providerOptions: {
-    google: { responseModalities: ['TEXT', 'IMAGE'] },
-  },
+  model: google('gemini-2.5-flash-image-preview'),
   prompt: 'Generate an image of a comic cat',
 });

content/docs/04-ai-sdk-ui/02-chatbot.mdx

Lines changed: 1 addition & 1 deletion

@@ -960,7 +960,7 @@ messages.map(message => (
 
 ## Image Generation
 
-Some models such as Google `gemini-2.0-flash-exp` support image generation.
+Some models such as Google `gemini-2.5-flash-image-preview` support image generation.
 When images are generated, they are exposed as files to the client.
 On the client side, you can access file parts of the message object
 and render them as images.
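A minimal sketch of that client-side rendering, assuming the UI message shape where file parts carry a `mediaType` and a displayable `url`; this component is illustrative and not part of the diff:

```tsx
import type { UIMessage } from 'ai';

// Illustrative sketch (not part of this diff): render the image file parts of
// a message. Assumes file parts expose a browser-displayable `url` and a
// `mediaType`.
export function MessageImages({ message }: { message: UIMessage }) {
  return (
    <>
      {message.parts.map((part, index) =>
        part.type === 'file' && part.mediaType.startsWith('image/') ? (
          <img key={index} src={part.url} alt="Generated image" />
        ) : null,
      )}
    </>
  );
}
```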

content/docs/04-ai-sdk-ui/08-object-generation.mdx

Lines changed: 4 additions & 1 deletion

@@ -5,7 +5,10 @@ description: Learn how to use the useObject hook.
 
 # Object Generation
 
-<Note>`useObject` is an experimental feature and only available in React.</Note>
+<Note>
+  `useObject` is an experimental feature and only available in React, Svelte,
+  and Vue.
+</Note>
 
 The [`useObject`](/docs/reference/ai-sdk-ui/use-object) hook allows you to create interfaces that represent a structured JSON object that is being streamed.

content/docs/07-reference/01-ai-sdk-core/01-generate-text.mdx

Lines changed: 8 additions & 2 deletions

@@ -847,12 +847,12 @@ To see `generateText` in action, check out [these examples](#examples).
   },
   {
     name: 'reasoning',
-    type: 'Array<ReasoningPart>',
+    type: 'Array<ReasoningOutput>',
     description:
       'The full reasoning that the model has generated in the last step.',
     properties: [
       {
-        type: 'ReasoningPart',
+        type: 'ReasoningOutput',
        parameters: [
          {
            name: 'type',
@@ -864,6 +864,12 @@ To see `generateText` in action, check out [these examples](#examples).
            type: 'string',
            description: 'The reasoning text.',
          },
+          {
+            name: 'providerMetadata',
+            type: 'SharedV2ProviderMetadata',
+            isOptional: true,
+            description: 'Additional provider metadata for the source.',
+          },
        ],
      },
    ],
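For orientation, a minimal sketch of reading the `reasoning` array documented above from a `generateText` result; the model ID is an assumption, and only the documented `type`, `text`, and optional `providerMetadata` fields are read:

```ts
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

// Sketch: inspect the reasoning parts of a result from a reasoning-capable
// model. The model ID is an assumption for illustration.
const result = await generateText({
  model: openai('o4-mini'),
  prompt: 'Which is larger: 9.11 or 9.9? Explain briefly.',
});

for (const part of result.reasoning) {
  console.log(part.type, part.text, part.providerMetadata);
}
```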

content/docs/07-reference/01-ai-sdk-core/02-stream-text.mdx

Lines changed: 9 additions & 3 deletions

@@ -1540,23 +1540,29 @@ To see `streamText` in action, check out [these examples](#examples).
   },
   {
     name: 'reasoning',
-    type: 'Promise<Array<ReasoningPart>>',
+    type: 'Promise<Array<ReasoningOutput>>',
     description:
       'The full reasoning that the model has generated in the last step. Automatically consumes the stream.',
     properties: [
       {
-        type: 'ReasoningPart',
+        type: 'ReasoningOutput',
        parameters: [
          {
            name: 'type',
            type: "'reasoning'",
-            description: 'The type of the reasoning part.',
+            description: 'The type of the message part.',
          },
          {
            name: 'text',
            type: 'string',
            description: 'The reasoning text.',
          },
+          {
+            name: 'providerMetadata',
+            type: 'SharedV2ProviderMetadata',
+            isOptional: true,
+            description: 'Additional provider metadata for the source.',
+          },
        ],
      },
    ],

content/docs/07-reference/02-ai-sdk-ui/03-use-object.mdx

Lines changed: 2 additions & 1 deletion

@@ -6,7 +6,8 @@ description: API reference for the useObject hook.
 # `experimental_useObject()`
 
 <Note>
-  `useObject` is an experimental feature and only available in React and Svelte.
+  `useObject` is an experimental feature and only available in React, Svelte,
+  and Vue.
 </Note>
 
 Allows you to consume text streams that represent a JSON object and parse them into a complete object based on a schema.
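For orientation, a minimal sketch of the hook in a React client component; the schema, API route, and prompt are hypothetical and only illustrate the hook described above:

```tsx
'use client';

import { experimental_useObject as useObject } from '@ai-sdk/react';
import { z } from 'zod';

// Hypothetical schema and API route, for illustration only.
const recipeSchema = z.object({
  recipe: z.object({
    name: z.string(),
    steps: z.array(z.string()),
  }),
});

export default function Page() {
  const { object, submit, isLoading } = useObject({
    api: '/api/recipe',
    schema: recipeSchema,
  });

  return (
    <div>
      <button
        onClick={() => submit('A simple lentil soup')}
        disabled={isLoading}
      >
        Generate recipe
      </button>
      <h2>{object?.recipe?.name}</h2>
      <ol>
        {object?.recipe?.steps?.map((step, index) => (
          <li key={index}>{step}</li>
        ))}
      </ol>
    </div>
  );
}
```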
