add detail to image_url for openai chat completions #2221

@redbmk

Description

Describe the bug
When sending messages containing image content to OpenAI's chat completions endpoint, the detail parameter is ignored. OpenAI expects it inside the image_url object (see here), but llamaindex's MessageContentImageDetail type places it next to image_url:

export type MessageContentImageDetail = {
type: "image_url";
image_url: { url: string };
detail?: "high" | "low" | "auto";
};

and no translation happens when using llm.chat:

// Keep other types as is (text, image_url, etc.)
return item;
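
For contrast, here is the shape the chat completions endpoint accepts, as I read the OpenAI SDK's content-part types (a paraphrase for illustration, not llamaindex code):

```typescript
// Paraphrased shape of an OpenAI chat-completions image part:
// `detail` sits inside `image_url`, not alongside it.
type OpenAIChatImagePart = {
  type: "image_url";
  image_url: { url: string; detail?: "auto" | "low" | "high" };
};

// A part written in this shape type-checks against the nested layout.
const part: OpenAIChatImagePart = {
  type: "image_url",
  image_url: { url: "data:image/jpeg;base64,aGVsbG8=", detail: "high" },
};

console.log(JSON.stringify(part));
```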

To Reproduce
Code to reproduce the behavior:

import { OpenAI, OpenAIResponses } from "@llamaindex/openai";
import type { ChatMessage } from "llamaindex";

const followingTypes: ChatMessage = {
  role: "user",
  content: [
    {
      type: "image_url",
      detail: "high",
      image_url: { url: "data:image/jpeg;base64,aGVsbG8=" },
    },
  ],
};

const workaroundForChatCompletionsOnly: ChatMessage = {
  role: "user",
  content: [
    {
      type: "image_url",
      image_url: {
        url: "data:image/jpeg;base64,aGVsbG8=",
        // @ts-expect-error
        detail: "high",
      },
    },
  ],
};

const workaroundForBoth: ChatMessage = {
  role: "user",
  content: [
    {
      type: "image_url",
      detail: "high",
      image_url: {
        url: "data:image/jpeg;base64,aGVsbG8=",
        // @ts-expect-error
        detail: "high",
      },
    },
  ],
};

const messages = [
  followingTypes,
  workaroundForChatCompletionsOnly,
  workaroundForBoth,
];

const chat = OpenAI.toOpenAIMessage(messages);
const responses = new OpenAIResponses().toOpenAIResponseMessages(messages);

console.dir({ chat, responses }, { depth: null });

The workaroundForBoth ends up putting detail in two places for chat completions. I still need to confirm whether this triggers a 4xx error from OpenAI or is simply ignored.

{
  "type": "image_url",
  "detail": "high",
  "image_url": {
    "url": "data:image/jpeg;base64,aGVsbG8=",
    "detail": "high"
  }
} 

Expected behavior
We should be able to provide messages that adhere to llamaindex's ChatMessage type and still have the detail param land in the correct spot for both the Chat Completions and Responses APIs.

i.e. OpenAI.toOpenAIMessage([followingTypes]) should return:

[
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {
          "url": "data:image/jpeg;base64,aGVsbG8=",
          "detail": "high"
        }
      }
    ]
  }
]
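
A minimal sketch of the missing translation, assuming the fix lives in the per-item mapping (the function name and types here are illustrative, not llamaindex internals):

```typescript
// llamaindex's current type: `detail` sits next to `image_url`.
type LlamaIndexImagePart = {
  type: "image_url";
  image_url: { url: string };
  detail?: "high" | "low" | "auto";
};

// Move a top-level `detail` inside `image_url` so the chat-completions
// payload matches what OpenAI expects.
function toOpenAIImagePart(item: LlamaIndexImagePart) {
  const { detail, image_url, ...rest } = item;
  return {
    ...rest,
    image_url: detail ? { ...image_url, detail } : image_url,
  };
}

console.log(
  JSON.stringify(
    toOpenAIImagePart({
      type: "image_url",
      detail: "high",
      image_url: { url: "data:image/jpeg;base64,aGVsbG8=" },
    }),
  ),
);
// → {"type":"image_url","image_url":{"url":"data:image/jpeg;base64,aGVsbG8=","detail":"high"}}
```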

And new OpenAIResponses().toOpenAIResponseMessages([followingTypes]) should return:

[
  {
    "role": "user",
    "content": [
      {
        "type": "input_image",
        "image_url": "data:image/jpeg;base64,aGVsbG8=",
        "detail": "high"
      }
    ]
  }
]
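
The Responses-API side of the same fix could look like this sketch, assuming input_image flattens image_url to a string and defaults detail to "auto" (illustrative only, not llamaindex code):

```typescript
// llamaindex's current type: `detail` sits next to `image_url`.
type LlamaIndexImagePart = {
  type: "image_url";
  image_url: { url: string };
  detail?: "high" | "low" | "auto";
};

// Flatten to the Responses-API `input_image` shape, carrying the
// top-level `detail` over directly.
function toResponsesImagePart(item: LlamaIndexImagePart) {
  return {
    type: "input_image" as const,
    image_url: item.image_url.url,
    detail: item.detail ?? "auto",
  };
}

console.log(
  JSON.stringify(
    toResponsesImagePart({
      type: "image_url",
      detail: "high",
      image_url: { url: "data:image/jpeg;base64,aGVsbG8=" },
    }),
  ),
);
// → {"type":"input_image","image_url":"data:image/jpeg;base64,aGVsbG8=","detail":"high"}
```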

Screenshots
N/A

Desktop (please complete the following information):

  • OS: macOS
  • JS Runtime / Framework / Bundler (select all applicable)
  • Node.js
  • Deno
  • Bun
  • Next.js
  • ESBuild
  • Rollup
  • Webpack
  • Turbopack
  • Vite
  • Waku
  • Edge Runtime
  • AWS Lambda
  • Cloudflare Worker
  • Others (please elaborate on this)
  • Version: 1.2.22

Additional context
N/A

Metadata

Labels: bug (Something isn't working), good first issue (Good for newcomers), help wanted (Extra attention is needed)
