Accept a Hash of options on image_generation=#64
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 372d8c4dd1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if image_generation | ||
| tools_list << {type: "image_generation"} | ||
| options = image_generation.is_a?(Hash) ? image_generation : {} | ||
| tools_list << options.merge(type: "image_generation") |
There was a problem hiding this comment.
Strip string
type before merging image tool options
When image_generation is set from parsed JSON/YAML (string keys), options.merge(type: "image_generation") does not overwrite an existing "type" key, so the resulting tool hash can carry both "type" and :type. That breaks the intended invariant that the tool type is always forced, and can produce ambiguous payloads with duplicate type fields when serialized. Normalize keys (or delete both key variants) before merging so user input cannot preserve a conflicting tool type.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in a83f9ed. The setter now calls value.transform_keys(&:to_sym) before storing, so downstream code can assume symbol keys and the :type override in tools works regardless of how the caller built the hash (literal, JSON.parse, YAML, etc.). Added two unit cases: the setter normalizes string keys, and a string-keyed "type" is correctly overridden in the tool spec.
Tracks the upstream SDK forward by sixteen minor versions (0.43 → 0.59, released 2026-04-14). No code changes in lib/ are required: every method we call (OpenAI::Client.new, responses.create, conversations.items.list) keeps the same signature, and the response object shape we consume (id / output / output_text / model / usage / image_generation_call items) is unchanged. Notable things this brings along: - Typed `action` field on the image_generation tool (auto/generate/ edit), added after 0.43. Unknown keys were passing through via BaseModel anyway, but the typed field makes the SDK-facing surface closer to the docs. - gpt-5.4 mini/nano model slugs in ChatModel (added in 0.56). - Short-lived tokens, phase field on Conversations messages, and the WEB_SEARCH_CALL_RESULTS Includable (0.57–0.58) — not used here yet, but available for future work. The SDK still has no typed `gpt-image-2` constant (the announcement landed on 2026-04-21, a week after 0.59 shipped), but the model field is a String|Symbol union, so `"gpt-image-2"` passes through. 72 unit examples pass; StandardRB is clean on the diff (the two pre-existing warnings on lib/ai/chat.rb:49 and :178 are unrelated).
372d8c4 to
21f6c5e
Compare
When a caller set image_generation to a hash with string keys (e.g. from JSON.parse), `options.merge(type: "image_generation")` did not overwrite an existing `"type"` string key — it only overrides symbol keys. The resulting tool hash could carry both `"type"` and `:type`, breaking the "tool type is always forced" invariant and producing a payload with duplicate type fields. Fix this at the setter by running transform_keys(&:to_sym) on the incoming Hash. Downstream code can assume symbol keys, and the stored hash is consistent regardless of how it was built (literal, JSON, YAML, etc.). Adds two unit cases: the setter normalizes string keys, and a string-keyed `"type"` is correctly overridden in the tool spec. Reported by Codex review on PR #64.
Addresses the Codex adversarial review on #64. `extract_and_save_images` was writing every decoded image to `001.png`, which silently corrupted the file boundary once the new Hash options let callers pass `output_format: "jpeg"` or `"webp"` — bytes in one format with the filename of another would mis-route through any downstream code that trusts the extension (image viewers, MIME probes, Rails uploads, etc.). Switch to sniffing the decoded bytes with Marcel and mapping the recognized image MIME type to an extension (`.png`, `.jpg`, `.webp`). Unrecognized bytes fall back to `.png`, matching the prior default. Marcel is already a runtime dependency; no new gem work required. Also trim `partial_images` from the README's supported-keys list and reword the CHANGELOG to stop claiming partial-image streaming. `create_response` does a blocking `client.responses.create(...)` with no `stream: true`, so partial images would never be delivered even if the API produced them. Streaming support is a future feature; until then, documenting the key is worse than omitting it. - lib/ai/chat.rb: new `image_extension_for(bytes)` helper and `IMAGE_EXTENSIONS_BY_MIME_TYPE` table; use the helper when choosing the filename inside `extract_and_save_images`. - spec/unit/chat_spec.rb: four new cases covering PNG / JPEG / WebP magic-byte payloads and the fallback for unrecognized bytes. - README.md: drop `partial_images` from the supported-keys list; add a one-liner explaining the auto-selected extension and a note about why `partial_images` isn't exposed. - CHANGELOG.md: drop the partial-image-streaming claim; add a "Fixed" entry for the extension bug.
`chat.image_generation = true` still works unchanged — it forwards
`{type: "image_generation"}` and lets OpenAI pick the defaults. It now
also accepts a Hash so students can configure any option the OpenAI
Responses API image generation tool exposes:
chat.image_generation = {
size: "1536x1024",
quality: "low",
model: "gpt-image-2"
}
The Hash is merged into the tool payload in `tools`, with
`type: "image_generation"` always forced so a stray `type:` in the
user Hash cannot hijack the tool slot. Incoming keys are normalized
to Symbols via `transform_keys(&:to_sym)` so that JSON-parsed or
YAML-parsed config (string keys) behaves identically to a Ruby
literal — without this, `"type" => "..."` and `:type` could coexist
in the final payload.
Why now: ChatGPT Images 2.0 (`gpt-image-2`, announced 2026-04-21)
brought 4K resolutions, flexible aspect ratios, and a new `action`
parameter. The gem's previous boolean `image_generation` flag could
not reach any of this; students' only escape was dropping to the raw
openai-ruby client. The openai-ruby SDK already types every tool
option and accepts `"gpt-image-2"` through the String variant on
`ImageGeneration#model`, so the plumbing is all there.
Validation:
- `true` stays truthy; `false`/`nil` both collapse to off; anything
else (String, Integer, Array, …) raises ArgumentError.
- `inspectable_attributes` lists `@image_generation` whenever it's
non-false, so the inspect output reveals which Hash a student set.
Saved-image extension fix:
- `extract_and_save_images` previously wrote every decoded image to
`001.png`, which becomes silent file-boundary corruption once the
Hash lets callers pass `output_format: "jpeg"` or `"webp"`. The
decoded bytes are now sniffed with Marcel (already a runtime
dependency) and saved as `.png`, `.jpg`, or `.webp` based on what
Marcel sees. Unknown bytes fall back to `.png`, matching prior
behavior. The fallback MIME table is a private constant, and the
sniffing logic is factored into a `image_extension_for(bytes)`
helper so it can be unit-tested directly with magic-byte payloads.
Docs:
- README gains a "Configuring the Tool" subsection listing the keys
that actually do something in a blocking call, plus a note about
`partial_images` being intentionally omitted (it only does anything
under `stream: true`, which this gem does not yet implement).
- CHANGELOG [Unreleased] gets Added/Changed/Fixed entries covering
the Hash API, the setter validation, and the extension fix.
- `examples/10_image_generation.rb` gains a fourth example that
configures `size:` and `quality:`.
Spec coverage:
- `#image_generation=` — accepts true/false/nil/Hash/empty-Hash,
normalizes string keys to symbols, rejects other types.
- `#tools` — omits the tool when off, emits a bare tool when `true`,
merges Hash options into the tool spec, and enforces
`type: "image_generation"` for both symbol- and string-keyed input.
- `#image_extension_for` — returns `png` / `jpg` / `webp` for the
matching magic bytes and falls back to `png` for unknown bytes.
- `#inspectable_attributes` — surfaces `@image_generation` when it's
a Hash of options, not just when it's `true`.
ed22f1e to
0cf93c6
Compare
Summary
chat.image_generation=now accepts aHashof tool options in addition to the existingtrue/false— e.g.chat.image_generation = { size: "1536x1024", quality: "low", model: "gpt-image-2" }. The Hash is merged into the OpenAI Responses API image generation tool spec, withtype: "image_generation"always enforced.chat.image_generation = trueis unchanged; students who don't care about options keep the one-liner.nilnormalizes tofalse; anything that isn'ttrue/false/nil/HashraisesArgumentError.Why now
OpenAI announced ChatGPT Images 2.0 on 2026-04-21, adding
gpt-image-2to the API with 4K resolutions, flexible aspect ratios, thinking mode, and anactionparameter. Today the gem has no way to expose any of these —image_generationis boolean-only, so the only path is to drop down to the rawopenai-rubyclient.The
openai-rubySDK already types every option on theimage_generationtool (seelib/openai/models/responses/tool.rbin the SDK). Model selection works via"gpt-image-2"as aString, sincemodelis a Union withStringas a variant. Unknown keys pass through the SDK'sBaseModelcoerce/dump unchanged.Stacked on #65
This PR is based on #65 (bump openai from
~> 0.43to~> 0.59) so the typedactionfield and the up-to-dateimage_generationtool surface land first. Review order: merge #65, then this PR's base auto-retargets tomain.Still deliberately out of scope
gpt-5.2→gpt-5.4). OpenAI's current image-generation docs usegpt-5.4. Worth considering, separate PR.[Unreleased]; whoever cuts the release can decide the number (minor bump looks right).Test plan
true/false/nil/Hash/invalid) and thetoolsmethod's output in each casetype:key does not override the tool typebundle exec rspec spec/unit/chat_spec.rb— 65 examples, 0 failures (also after rebase onto Bump openai dep from ~> 0.43 to ~> 0.59 #65)bundle exec standardrb lib/ai/chat.rb spec/unit/chat_spec.rb— clean on touched lines (two pre-existing warnings on untouched code left alone){ size: "1536x1024", quality: "low" }— leaving to the reviewer since it burns credits