Structured Output & JSON mode response support #131
@@ -261,6 +261,54 @@ end
chat.ask "What is metaprogramming in Ruby?"
```

## Receiving Structured Responses
You can ensure the responses follow a schema you define like this:
```ruby
chat = RubyLLM.chat

chat.with_response_format(:integer).ask("What is 2 + 2?").to_i
# => 4

chat.with_response_format(:string).ask("Say 'Hello World' and nothing else.").content
# => "Hello World"

chat.with_response_format(:array, items: { type: :string })
chat.ask('What are the 2 largest countries? Only respond with country names.').content
# => ["Russia", "Canada"]

chat.with_response_format(:object, properties: { age: { type: :integer } })
chat.ask('Provide sample customer age between 10 and 100.').content
# => { "age" => 42 }

chat.with_response_format(
  :object,
  properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } }
)
chat.ask('Provide at least 1 hobby.').content
# => { "hobbies" => ["Soccer"] }
```

You can also provide the JSON schema you want directly to the method like this:
```ruby
chat.with_response_format(type: :object, properties: { age: { type: :integer } })
chat.ask('Provide a sample customer age.').content
# => { "age" => 31 }
```

In this example the code automatically switches to OpenAI's json_mode, since no object properties are requested:
```ruby
chat.with_response_format(:json) # Don't care about structure, just give me JSON

chat.ask('Provide a sample customer data object with name and email keys.').content
# => { "name" => "Tobias", "email" => "[email protected]" }

chat.ask('Provide a sample customer data object with name and email keys.').content
# => { "first_name" => "Michael", "email_address" => "[email protected]" }
```
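
For reference, these two modes correspond roughly to the following `response_format` payloads on OpenAI's chat completions API. This is a sketch using field names from OpenAI's documentation; the exact payload this library builds may differ:
```ruby
# Illustrative OpenAI request bodies, not RubyLLM's exact output.

# Structured output: the model is constrained to the supplied JSON schema.
{
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'response', # hypothetical schema name
      schema: { type: 'object', properties: { age: { type: 'integer' } } }
    }
  }
}

# JSON mode: the model must return valid JSON, with no particular shape enforced.
{ response_format: { type: 'json_object' } }
```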

{: .note }
**Only OpenAI supported for now:** OpenAI models are the only ones that support this feature at the moment. Support for other providers will follow shortly.

## Next Steps

This guide covered the core `Chat` interface. Now you might want to explore:

@@ -269,4 +317,4 @@ This guide covered the core `Chat` interface. Now you might want to explore:
* [Using Tools]({% link guides/tools.md %}): Enable the AI to call your Ruby code.
* [Streaming Responses]({% link guides/streaming.md %}): Get real-time feedback from the AI.
* [Rails Integration]({% link guides/rails.md %}): Persist your chat conversations easily.
* [Error Handling]({% link guides/error-handling.md %}): Build robust applications that handle API issues.
@@ -31,6 +31,56 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: nil)
  }
end

##
# This method lets you ensure the responses follow a schema you define, like this:
#
# chat.with_response_format(:integer).ask("What is 2 + 2?").to_i
# # => 4
# chat.with_response_format(:string).ask("Say 'Hello World' and nothing else.").content
# # => "Hello World"
# chat.with_response_format(:array, items: { type: :string })
# chat.ask('What are the 2 largest countries? Only respond with country names.').content
# # => ["Russia", "Canada"]
# chat.with_response_format(:object, properties: { age: { type: :integer } })
# chat.ask('Provide sample customer age between 10 and 100.').content
# # => { "age" => 42 }
# chat.with_response_format(
#   :object,
#   properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } }
# )
# chat.ask('Provide at least 1 hobby.').content
# # => { "hobbies" => ["Soccer"] }
#
# You can also provide the JSON schema you want directly to the method, like this:
#
# chat.with_response_format(type: :object, properties: { age: { type: :integer } })
# # => { "age" => 31 }
#
# In this example the code automatically switches to OpenAI's json_mode, since no object
# properties are requested:
#
# chat.with_response_format(:json) # Don't care about structure, just give me JSON
# chat.ask('Provide a sample customer data object with name and email keys.').content
# # => { "name" => "Tobias", "email" => "[email protected]" }
# chat.ask('Provide a sample customer data object with name and email keys.').content
# # => { "first_name" => "Michael", "email_address" => "[email protected]" }
#
# @param type [Symbol] (optional) Any JSON schema type supported by the API (integer, object, etc.),
#   or :json to request free-form JSON (json_mode)
# @param schema [Hash] The schema for the response format. It can be a full JSON schema or a simple hash.
# @return [Chat] (self)
def with_response_format(type = nil, **schema)
  schema_hash = if type.is_a?(Symbol) || type.is_a?(String)
                  { type: type == :json ? :object : type }

**Review comment:** isn't this `:json_object`? [0]

[0] https://platform.openai.com/docs/guides/structured-outputs?api-mode=chat#json-mode
                elsif type.is_a?(Hash)
                  type
                else
                  {}
                end.merge(schema)

  @response_schema = Schema.new(schema_hash)

  self
end
alias with_structured_response with_response_format
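
To make the branching concrete, here is what `schema_hash` works out to for each calling style. This is derived by reading the branches above, not output from the PR's test suite:

```ruby
chat.with_response_format(:integer) # schema_hash == { type: :integer }
chat.with_response_format(:json)    # schema_hash == { type: :object }, which triggers json_mode
chat.with_response_format(:array, items: { type: :string })
# schema_hash == { type: :array, items: { type: :string } }
chat.with_response_format(type: :object, properties: { age: { type: :integer } })
# schema_hash == { type: :object, properties: { age: { type: :integer } } }
```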

def ask(message = nil, with: {}, &)
  add_message role: :user, content: Content.new(message, with)
  complete(&)

@@ -86,17 +136,23 @@ def each(&)

def complete(&) # rubocop:disable Metrics/MethodLength
  @on[:new_message]&.call
- response = @provider.complete(
-   messages,
-   tools: @tools,
-   temperature: @temperature,
-   model: @model.id,
-   connection: @connection,
-   &
- )
+ response = @provider.with_response_schema(@response_schema) do

**Review comment:** @crmne should we reset the `@response_schema` after the completion so it doesn't apply to subsequent messages in the chat? Learning from your comment on my other PR about temperature, I realize now that your …

**Reply:** no, …

**Reply:** Oh, I actually updated the code to reset the response schema after completion after your comment on my other PR about temperature already! I've been battle-testing this code at my company Osello (which is now also a sponsor of this project), and I realized it does make more sense to reset the response_schema after completion, because in practice subsequent chat messages are most likely not meant to follow the same format:

```ruby
chat.with_response_format(type: :string, enum: %w[Toronto Ottawa])
    .ask("What's the capital of Canada?")
    .content
# => "Ottawa"

chat.ask("How long has it been the capital?")
# => "Ottawa has been the capital of Canada since 1857."

chat.with_response_format(type: :integer).ask("How many years is that?")
# => 168
```

**Reply:** Given the design of the current interface, scoping could work like this:

```ruby
# Applies to all messages
chat = RubyLLM.chat.with_response_format(type: :string)
chat.ask("What's the capital of Canada?")
# => "Ottawa"

# Applies to current message
chat.as(type: :integer).ask("How many years is that?")
# => 168

# Resets back
chat.ask("How long has it been the capital?")
# => "Ottawa has been the capital of Canada since 1857."
```

**Reply:** @sirwolfgang good idea! Although …

**Reply:** Another idea is that I could make it so that:

```ruby
chat.with_response_format(type: :string, enum: %w[Toronto Ottawa]) do
  chat.ask("What's the capital of Canada?").content
end
# => "Ottawa"

chat.ask("How long has it been the capital?")
# => "Ottawa has been the capital of Canada since 1857."

chat.with_response_format(type: :integer)
chat.ask("How many years ago is that?")
# => 168

chat.ask("How many years ago will that be next year?")
# => 169
```

**Reply:** @jayelkaake I think linguistically/ergonomically it would be better to split the two and not use the same method. Totally open to ideas other than … Could extend it to … Otherwise, if we could delay execution like AR does, I can see the argument for making it a post setting; something like:

```ruby
agent.ask("...?").in(type: :integer)
agent.ask("...?").as(type: :integer)
agent.ask("...?").structured_as(type: :integer)
agent.ask("...?").formated_as(type: :integer)
```

**Reply:** I like the idea, it just doesn't read well in English ("ask someone to do something as…"), and it should also be clear that you're not modifying the query like you are with AR; you're modifying the response. I like the postfix format better. Maybe something like … Most APIs don't let you mutate the response format in real time, so LLMs are kind of introducing the need for a new pattern, maybe? I've been scratching my head about these things a lot over the last couple weeks! 😅

**Reply:** Yeah, I think refactoring to support more dynamic method chaining should be a different PR, but we could set up the expected syntax here and build towards that, since it should functionally work in either order. response/d feels a little weird to me. I could see this interface also making sense for loading personas, like:

```ruby
timekeeper = RubyLLM.chat.with_tool(TIME)
groot = RubyLLM.chat.with_instructions(GROOT)
timekeeper.ask("What time is it?").respond_as(groot) # => "I am grooooooot"
```

+   @provider.complete(
+     messages,
+     tools: @tools,
+     temperature: @temperature,
+     model: @model.id,
+     connection: @connection,
+     &
+   )
+ end

  @on[:end_message]&.call(response)

  add_message response

+ @response_schema = nil # Reset the response schema after completion of this chat thread
+
  if response.tool_call?
    handle_tool_calls(response, &)
  else
@@ -31,6 +31,29 @@ def list_models(connection:)
  parse_list_models_response response, slug, capabilities
end

##
# @return [::RubyLLM::Schema, NilClass]
def response_schema
  Thread.current['RubyLLM::Provider::Methods.response_schema']
end

##
# @param response_schema [::RubyLLM::Schema]
def with_response_schema(response_schema)
  prev_response_schema = Thread.current['RubyLLM::Provider::Methods.response_schema']

  result = nil
  begin
    Thread.current['RubyLLM::Provider::Methods.response_schema'] = response_schema

    result = yield
  ensure
    Thread.current['RubyLLM::Provider::Methods.response_schema'] = prev_response_schema
  end

  result
end

**Review comment (on lines +34 to +56):** what's all this about threads? we have contexts now

**Reply:** This was a thread-safe way to set the response schema. I'll be able to refactor it to use a context instead (assuming the context system is thread-safe; I haven't looked yet).
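
To illustrate the save/restore behavior that the `begin`/`ensure` block above provides, here is a hypothetical usage sketch; `outer_schema` and `inner_schema` are placeholder objects:

```ruby
provider.with_response_schema(outer_schema) do
  provider.with_response_schema(inner_schema) do
    provider.response_schema # => inner_schema
  end
  provider.response_schema # => outer_schema (restored by ensure)
end
provider.response_schema # => whatever was set before the outer block (nil by default)
```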

def embed(text, model:, connection:, dimensions:)
  payload = render_embedding_payload(text, model:, dimensions:)
  response = connection.post(embedding_url(model:), payload)

**Review comment:** @jayelkaake @crmne Thanks for all the work put into this! I've been trying the API and found it a bit surprising; it adds complexity on top of OpenAI's nuances.

Comparing the calls

```ruby
chat.with_response_format(:object, properties: { age: { type: :integer } })
chat.with_response_format(type: :object, properties: { age: { type: :integer } })
```

the inclusion or omission of `type:` changes a lot about how we invoke OpenAI: one relies on json_mode and the other on json_schema, but it's not very clear from the API. In addition, support for structured output or json_mode also depends on the model used. Older models will not support json_schema, so maybe that's something we want to factor in, i.e. use one or the other based on the model.

OpenAI doesn't recommend JSON mode except for older models, and I do believe its API is a product of its time. The fact that we have to append extra instructions to the original prompt asking for JSON is a sign of that (why do it in English, btw?), so I don't think RubyLLM should default to it over json_schema.
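
The model-based selection suggested above could look roughly like the following. This is a hypothetical sketch: the capability check and the `response_format_for` helper are illustrative assumptions, not part of this PR:

```ruby
# Assumption: only newer OpenAI models support json_schema structured output.
def supports_structured_output?(model_id)
  model_id.start_with?('gpt-4o', 'gpt-4.1', 'o1', 'o3')
end

def response_format_for(model_id, schema)
  if schema && supports_structured_output?(model_id)
    { type: 'json_schema', json_schema: { name: 'response', schema: schema } }
  else
    { type: 'json_object' } # older models: fall back to plain JSON mode
  end
end
```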