Skip to content

Accept a Hash of options on image_generation=#64

Merged
raghubetina merged 2 commits into
mainfrom
image-generation-options
Apr 22, 2026
Merged

Accept a Hash of options on image_generation=#64
raghubetina merged 2 commits into
mainfrom
image-generation-options

Conversation

@raghubetina
Copy link
Copy Markdown
Contributor

@raghubetina raghubetina commented Apr 21, 2026

Summary

  • chat.image_generation= now accepts a Hash of tool options in addition to the existing true/false — e.g. chat.image_generation = { size: "1536x1024", quality: "low", model: "gpt-image-2" }. The Hash is merged into the OpenAI Responses API image generation tool spec, with type: "image_generation" always enforced.
  • chat.image_generation = true is unchanged; students who don't care about options keep the one-liner.
  • nil normalizes to false; anything that isn't true/false/nil/Hash raises ArgumentError.

Why now

OpenAI announced ChatGPT Images 2.0 on 2026-04-21, adding gpt-image-2 to the API with 4K resolutions, flexible aspect ratios, thinking mode, and an action parameter. Today the gem has no way to expose any of these — image_generation is boolean-only, so the only path is to drop down to the raw openai-ruby client.

The openai-ruby SDK already types every option on the image_generation tool (see lib/openai/models/responses/tool.rb in the SDK). Model selection works via "gpt-image-2" as a String, since model is a Union with String as a variant. Unknown keys pass through the SDK's BaseModel coerce/dump unchanged.

Stacked on #65

This PR is based on #65 (bump openai from ~> 0.43 to ~> 0.59) so the typed action field and the up-to-date image_generation tool surface land first. Review order: merge #65, then this PR's base auto-retargets to main.

Still deliberately out of scope

  • Default model bump (gpt-5.2gpt-5.4). OpenAI's current image-generation docs use gpt-5.4. Worth considering, separate PR.
  • Gem version bump. Left CHANGELOG entry under [Unreleased]; whoever cuts the release can decide the number (minor bump looks right).

Test plan

  • Unit specs added for the setter (true/false/nil/Hash/invalid) and the tools method's output in each case
  • Safety case: a Hash with a stray type: key does not override the tool type
  • bundle exec rspec spec/unit/chat_spec.rb — 65 examples, 0 failures (also after rebase onto Bump openai dep from ~> 0.43 to ~> 0.59 #65)
  • bundle exec standardrb lib/ai/chat.rb spec/unit/chat_spec.rb — clean on touched lines (two pre-existing warnings on untouched code left alone)
  • Manual smoke test against real API with { size: "1536x1024", quality: "low" } — leaving to the reviewer since it burns credits

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 372d8c4dd1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread lib/ai/chat.rb
if image_generation
tools_list << {type: "image_generation"}
options = image_generation.is_a?(Hash) ? image_generation : {}
tools_list << options.merge(type: "image_generation")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Strip string type before merging image tool options

When image_generation is set from parsed JSON/YAML (string keys), options.merge(type: "image_generation") does not overwrite an existing "type" key, so the resulting tool hash can carry both "type" and :type. That breaks the intended invariant that the tool type is always forced, and can produce ambiguous payloads with duplicate type fields when serialized. Normalize keys (or delete both key variants) before merging so user input cannot preserve a conflicting tool type.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a83f9ed. The setter now calls value.transform_keys(&:to_sym) before storing, so downstream code can assume symbol keys and the :type override in tools works regardless of how the caller built the hash (literal, JSON.parse, YAML, etc.). Added two unit cases: the setter normalizes string keys, and a string-keyed "type" is correctly overridden in the tool spec.

Tracks the upstream SDK forward by sixteen minor versions
(0.43 → 0.59, released 2026-04-14). No code changes in lib/ are
required: every method we call (OpenAI::Client.new, responses.create,
conversations.items.list) keeps the same signature, and the response
object shape we consume (id / output / output_text / model / usage /
image_generation_call items) is unchanged.

Notable things this brings along:

- Typed `action` field on the image_generation tool (auto/generate/
  edit), added after 0.43. Unknown keys were passing through via
  BaseModel anyway, but the typed field makes the SDK-facing surface
  closer to the docs.
- gpt-5.4 mini/nano model slugs in ChatModel (added in 0.56).
- Short-lived tokens, phase field on Conversations messages, and the
  WEB_SEARCH_CALL_RESULTS Includable (0.57–0.58) — not used here yet,
  but available for future work.

The SDK still has no typed `gpt-image-2` constant (the announcement
landed on 2026-04-21, a week after 0.59 shipped), but the model field
is a String|Symbol union, so `"gpt-image-2"` passes through.

72 unit examples pass; StandardRB is clean on the diff (the two
pre-existing warnings on lib/ai/chat.rb:49 and :178 are unrelated).
@raghubetina raghubetina force-pushed the image-generation-options branch from 372d8c4 to 21f6c5e Compare April 21, 2026 22:51
@raghubetina raghubetina changed the base branch from main to bump-openai-0.59 April 21, 2026 22:51
raghubetina added a commit that referenced this pull request Apr 21, 2026
When a caller set image_generation to a hash with string keys (e.g.
from JSON.parse), `options.merge(type: "image_generation")` did not
overwrite an existing `"type"` string key — it only overrides symbol
keys. The resulting tool hash could carry both `"type"` and `:type`,
breaking the "tool type is always forced" invariant and producing a
payload with duplicate type fields.

Fix this at the setter by running transform_keys(&:to_sym) on the
incoming Hash. Downstream code can assume symbol keys, and the stored
hash is consistent regardless of how it was built (literal, JSON,
YAML, etc.). Adds two unit cases: the setter normalizes string keys,
and a string-keyed `"type"` is correctly overridden in the tool spec.

Reported by Codex review on PR #64.
raghubetina added a commit that referenced this pull request Apr 22, 2026
Addresses the Codex adversarial review on #64.

`extract_and_save_images` was writing every decoded image to `001.png`,
which silently corrupted the file boundary once the new Hash options
let callers pass `output_format: "jpeg"` or `"webp"` — bytes in one
format with the filename of another would mis-route through any
downstream code that trusts the extension (image viewers, MIME
probes, Rails uploads, etc.).

Switch to sniffing the decoded bytes with Marcel and mapping the
recognized image MIME type to an extension (`.png`, `.jpg`, `.webp`).
Unrecognized bytes fall back to `.png`, matching the prior default.
Marcel is already a runtime dependency; no new gem work required.

Also trim `partial_images` from the README's supported-keys list and
reword the CHANGELOG to stop claiming partial-image streaming.
`create_response` does a blocking `client.responses.create(...)` with
no `stream: true`, so partial images would never be delivered even if
the API produced them. Streaming support is a future feature; until
then, documenting the key is worse than omitting it.

- lib/ai/chat.rb: new `image_extension_for(bytes)` helper and
  `IMAGE_EXTENSIONS_BY_MIME_TYPE` table; use the helper when choosing
  the filename inside `extract_and_save_images`.
- spec/unit/chat_spec.rb: four new cases covering PNG / JPEG / WebP
  magic-byte payloads and the fallback for unrecognized bytes.
- README.md: drop `partial_images` from the supported-keys list; add
  a one-liner explaining the auto-selected extension and a note
  about why `partial_images` isn't exposed.
- CHANGELOG.md: drop the partial-image-streaming claim; add a
  "Fixed" entry for the extension bug.
`chat.image_generation = true` still works unchanged — it forwards
`{type: "image_generation"}` and lets OpenAI pick the defaults. It now
also accepts a Hash so students can configure any option the OpenAI
Responses API image generation tool exposes:

    chat.image_generation = {
      size: "1536x1024",
      quality: "low",
      model: "gpt-image-2"
    }

The Hash is merged into the tool payload in `tools`, with
`type: "image_generation"` always forced so a stray `type:` in the
user Hash cannot hijack the tool slot. Incoming keys are normalized
to Symbols via `transform_keys(&:to_sym)` so that JSON-parsed or
YAML-parsed config (string keys) behaves identically to a Ruby
literal — without this, `"type" => "..."` and `:type` could coexist
in the final payload.

Why now: ChatGPT Images 2.0 (`gpt-image-2`, announced 2026-04-21)
brought 4K resolutions, flexible aspect ratios, and a new `action`
parameter. The gem's previous boolean `image_generation` flag could
not reach any of this; students' only escape was dropping to the raw
openai-ruby client. The openai-ruby SDK already types every tool
option and accepts `"gpt-image-2"` through the String variant on
`ImageGeneration#model`, so the plumbing is all there.

Validation:
- `true` stays truthy; `false`/`nil` both collapse to off; anything
  else (String, Integer, Array, …) raises ArgumentError.
- `inspectable_attributes` lists `@image_generation` whenever it's
  non-false, so the inspect output reveals which Hash a student set.

Saved-image extension fix:
- `extract_and_save_images` previously wrote every decoded image to
  `001.png`, which becomes silent file-boundary corruption once the
  Hash lets callers pass `output_format: "jpeg"` or `"webp"`. The
  decoded bytes are now sniffed with Marcel (already a runtime
  dependency) and saved as `.png`, `.jpg`, or `.webp` based on what
  Marcel sees. Unknown bytes fall back to `.png`, matching prior
  behavior. The fallback MIME table is a private constant, and the
  sniffing logic is factored into a `image_extension_for(bytes)`
  helper so it can be unit-tested directly with magic-byte payloads.

Docs:
- README gains a "Configuring the Tool" subsection listing the keys
  that actually do something in a blocking call, plus a note about
  `partial_images` being intentionally omitted (it only does anything
  under `stream: true`, which this gem does not yet implement).
- CHANGELOG [Unreleased] gets Added/Changed/Fixed entries covering
  the Hash API, the setter validation, and the extension fix.
- `examples/10_image_generation.rb` gains a fourth example that
  configures `size:` and `quality:`.

Spec coverage:
- `#image_generation=` — accepts true/false/nil/Hash/empty-Hash,
  normalizes string keys to symbols, rejects other types.
- `#tools` — omits the tool when off, emits a bare tool when `true`,
  merges Hash options into the tool spec, and enforces
  `type: "image_generation"` for both symbol- and string-keyed input.
- `#image_extension_for` — returns `png` / `jpg` / `webp` for the
  matching magic bytes and falls back to `png` for unknown bytes.
- `#inspectable_attributes` — surfaces `@image_generation` when it's
  a Hash of options, not just when it's `true`.
@raghubetina raghubetina force-pushed the image-generation-options branch from ed22f1e to 0cf93c6 Compare April 22, 2026 21:45
@raghubetina raghubetina changed the base branch from bump-openai-0.59 to main April 22, 2026 21:56
@raghubetina raghubetina merged commit 6bd56b3 into main Apr 22, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant