diff --git a/.gitignore b/.gitignore
index b2ed8ad2..976fafc7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -47,8 +47,8 @@ build-iPhoneSimulator/
 # for a library or gem, you might want to ignore these files since the code is
 # intended to run in multiple environments; otherwise, check them in:
 Gemfile.lock
-# .ruby-version
-# .ruby-gemset
+.ruby-version
+.ruby-gemset
 
 # unless supporting rvm < 1.11.0 or doing something fancy, ignore this:
 .rvmrc
@@ -57,3 +57,4 @@ Gemfile.lock
 # .rubocop-https?--*
 
 repomix-output.*
+/.idea/
diff --git a/Gemfile b/Gemfile
index 95d97963..0ae07d06 100644
--- a/Gemfile
+++ b/Gemfile
@@ -18,6 +18,7 @@ group :development do
   gem 'nokogiri'
   gem 'overcommit', '>= 0.66'
   gem 'pry', '>= 0.14'
+  gem 'pry-byebug', '>= 3.11'
   gem 'rake', '>= 13.0'
   gem 'rdoc'
   gem 'reline'
diff --git a/README.md b/README.md
index 974a95b0..a808083d 100644
--- a/README.md
+++ b/README.md
@@ -60,6 +60,9 @@ chat.ask "Tell me a story about a Ruby programmer" do |chunk|
   print chunk.content
 end
 
+# Get structured responses easily (OpenAI only for now)
+chat.with_response_format(:integer).ask("What is 2 + 2?").to_i # => 4
+
 # Generate images
 RubyLLM.paint "a sunset over mountains in watercolor style"
 
diff --git a/docs/guides/chat.md b/docs/guides/chat.md
index b818d2bd..bfda175d 100644
--- a/docs/guides/chat.md
+++ b/docs/guides/chat.md
@@ -261,6 +261,54 @@ end
 chat.ask "What is metaprogramming in Ruby?"
 ```
 
+## Receiving Structured Responses
+You can ensure responses follow a schema you define:
+```ruby
+chat = RubyLLM.chat
+
+chat.with_response_format(:integer).ask("What is 2 + 2?").to_i
+# => 4
+
+chat.with_response_format(:string).ask("Say 'Hello World' and nothing else.").content
+# => "Hello World"
+
+chat.with_response_format(:array, items: { type: :string })
+chat.ask('What are the 2 largest countries? Only respond with country names.').content
+# => ["Russia", "Canada"]
+
+chat.with_response_format(:object, properties: { age: { type: :integer } })
+chat.ask('Provide sample customer age between 10 and 100.').content
+# => { "age" => 42 }
+
+chat.with_response_format(
+  :object,
+  properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } }
+)
+chat.ask('Provide at least 1 hobby.').content
+# => { "hobbies" => ["Soccer"] }
+```
+
+You can also pass a complete JSON schema directly to the method:
+```ruby
+chat.with_response_format(type: :object, properties: { age: { type: :integer } })
+chat.ask('Provide sample customer age between 10 and 100.').content
+# => { "age" => 31 }
+```
+
+In the example below, RubyLLM automatically switches to OpenAI's json_mode because no object properties are requested. Note that without a schema, the exact keys can vary between calls:
+```ruby
+chat.with_response_format(:json) # Don't care about structure, just give me JSON
+
+chat.ask('Provide a sample customer data object with name and email keys.').content
+# => { "name" => "Tobias", "email" => "tobias@example.com" }
+
+chat.ask('Provide a sample customer data object with name and email keys.').content
+# => { "first_name" => "Michael", "email_address" => "michael@example.com" }
+```
+
+{: .note }
+**OpenAI only for now:** Only OpenAI models currently support this feature. Support for other providers is coming soon.
+
 ## Next Steps
 
 This guide covered the core `Chat` interface. Now you might want to explore:
@@ -269,4 +317,4 @@ This guide covered the core `Chat` interface. Now you might want to explore:
 * [Using Tools]({% link guides/tools.md %}): Enable the AI to call your Ruby code.
* [Streaming Responses]({% link guides/streaming.md %}): Get real-time feedback from the AI.
 * [Rails Integration]({% link guides/rails.md %}): Persist your chat conversations easily.
-* [Error Handling]({% link guides/error-handling.md %}): Build robust applications that handle API issues.
\ No newline at end of file
+* [Error Handling]({% link guides/error-handling.md %}): Build robust applications that handle API issues.
diff --git a/lib/ruby_llm/active_record/acts_as.rb b/lib/ruby_llm/active_record/acts_as.rb
index 3520cf60..9698dc79 100644
--- a/lib/ruby_llm/active_record/acts_as.rb
+++ b/lib/ruby_llm/active_record/acts_as.rb
@@ -93,6 +93,12 @@ def with_instructions(instructions, replace: false)
       self
     end
 
+    # @see RubyLLM::Chat#with_response_format
+    def with_response_format(...)
+      to_llm.with_response_format(...)
+      self
+    end
+
     def with_tool(...)
       to_llm.with_tool(...)
       self
@@ -158,14 +164,19 @@ def persist_message_completion(message) # rubocop:disable Metrics/AbcSize,Metric
       end
 
       transaction do
-        @message.update!(
-          role: message.role,
-          content: message.content,
-          model_id: message.model_id,
-          tool_call_id: tool_call_id,
-          input_tokens: message.input_tokens,
-          output_tokens: message.output_tokens
-        )
+        # These are required fields:
+        @message.role = message.role
+        @message.content = message.content
+
+        # These are optional fields (set via #try so host apps without these columns still work):
+        @message.try(:model_id=, message.model_id)
+        @message.try(:tool_call_id=, tool_call_id)
+        @message.try(:input_tokens=, message.input_tokens)
+        @message.try(:output_tokens=, message.output_tokens)
+        @message.try(:content_schema=, message.content_schema)
+
+        @message.save!
+
         persist_tool_calls(message.tool_calls) if message.tool_calls.present?
       end
     end
diff --git a/lib/ruby_llm/chat.rb b/lib/ruby_llm/chat.rb
index 6462b656..cc227fc6 100644
--- a/lib/ruby_llm/chat.rb
+++ b/lib/ruby_llm/chat.rb
@@ -31,6 +31,56 @@ def initialize(model: nil, provider: nil, assume_model_exists: false, context: n
     }
   end
 
+  ##
+  # This method lets you ensure the responses follow a schema you define like this:
+  #
+  # chat.with_response_format(:integer).ask("What is 2 + 2?").to_i
+  # # => 4
+  # chat.with_response_format(:string).ask("Say 'Hello World' and nothing else.").content
+  # # => "Hello World"
+  # chat.with_response_format(:array, items: { type: :string })
+  # chat.ask('What are the 2 largest countries? Only respond with country names.').content
+  # # => ["Russia", "Canada"]
+  # chat.with_response_format(:object, properties: { age: { type: :integer } })
+  # chat.ask('Provide sample customer age between 10 and 100.').content
+  # # => { "age" => 42 }
+  # chat.with_response_format(
+  #   :object,
+  #   properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } }
+  # )
+  # chat.ask('Provide at least 1 hobby.').content
+  # # => { "hobbies" => ["Soccer"] }
+  #
+  # You can also pass a complete JSON schema directly to the method:
+  # chat.with_response_format(type: :object, properties: { age: { type: :integer } })
+  # chat.ask('Provide sample customer age between 10 and 100.').content # => { "age" => 31 }
+  #
+  # In the example below, the code automatically switches to OpenAI's json_mode because no object
+  # properties are requested:
+  # chat.with_response_format(:json) # Don't care about structure, just give me JSON
+  # chat.ask('Provide a sample customer data object with name and email keys.').content
+  # # => { "name" => "Tobias", "email" => "tobias@example.com" }
+  # chat.ask('Provide a sample customer data object with name and email keys.').content
+  # # => { "first_name" => "Michael", "email_address" => "michael@example.com" }
+  #
+  # @param type [Symbol, String, Hash] (optional) A JSON Schema type (:integer, :object, ...), :json for free-form JSON, or a full schema Hash
+  # @param schema [Hash] Additional schema details, such as :items for arrays or :properties for objects
+  # @return [Chat] (self)
+  def with_response_format(type = nil, **schema)
+    schema_hash = if type.is_a?(Symbol) || type.is_a?(String)
+                    { type: type == :json ? :object : type }
+                  elsif type.is_a?(Hash)
+                    type
+                  else
+                    {}
+                  end.merge(schema)
+
+    @response_schema = Schema.new(schema_hash)
+
+    self
+  end
+  alias with_structured_response with_response_format
+
   def ask(message = nil, with: {}, &)
     add_message role: :user, content: Content.new(message, with)
     complete(&)
@@ -86,17 +136,23 @@ def each(&)
 
   def complete(&) # rubocop:disable Metrics/MethodLength
     @on[:new_message]&.call
-    response = @provider.complete(
-      messages,
-      tools: @tools,
-      temperature: @temperature,
-      model: @model.id,
-      connection: @connection,
-      &
-    )
+    response = @provider.with_response_schema(@response_schema) do
+      @provider.complete(
+        messages,
+        tools: @tools,
+        temperature: @temperature,
+        model: @model.id,
+        connection: @connection,
+        &
+      )
+    end
+
     @on[:end_message]&.call(response)
     add_message response
+
+    @response_schema = nil # Reset the response schema now that this completion has finished
+
     if response.tool_call?
       handle_tool_calls(response, &)
     else
diff --git a/lib/ruby_llm/message.rb b/lib/ruby_llm/message.rb
index cf6ea7f3..7e8891d2 100644
--- a/lib/ruby_llm/message.rb
+++ b/lib/ruby_llm/message.rb
@@ -7,7 +7,9 @@ module RubyLLM
   class Message
     ROLES = %i[system user assistant tool].freeze
 
-    attr_reader :role, :content, :tool_calls, :tool_call_id, :input_tokens, :output_tokens, :model_id
+    attr_reader :role, :tool_calls, :tool_call_id, :input_tokens, :output_tokens, :model_id, :content_schema
+
+    delegate :to_i, :to_a, :to_s, to: :content
 
     def initialize(options = {})
       @role = options[:role].to_sym
@@ -17,10 +19,22 @@ def initialize(options = {})
       @output_tokens = options[:output_tokens]
       @model_id = options[:model_id]
       @tool_call_id = options[:tool_call_id]
+      @content_schema = options[:content_schema]
 
       ensure_valid_role
     end
 
+    def content
+      return @content unless @content_schema.present?
+      return @content if @content.nil?
+
+      if @content_schema[:type].to_s == 'object' && @content_schema[:properties].to_h.keys.none?
+        json_response
+      else
+        structured_content
+      end
+    end
+
     def tool_call?
       !tool_calls.nil? && !tool_calls.empty?
     end
@@ -47,6 +61,18 @@ def to_h
 
     private
 
+    def json_response
+      return nil if @content.nil?
+
+      JSON.parse(@content)
+    end
+
+    def structured_content
+      return nil if @content.nil?
+
+      json_response['result']
+    end
+
     def normalize_content(content)
       case content
       when Content then content.format
diff --git a/lib/ruby_llm/provider.rb b/lib/ruby_llm/provider.rb
index 8bd6f235..8bd6c3b4 100644
--- a/lib/ruby_llm/provider.rb
+++ b/lib/ruby_llm/provider.rb
@@ -31,6 +31,25 @@ def list_models(connection:)
       parse_list_models_response response, slug, capabilities
     end
 
+    ##
+    # @return [::RubyLLM::Schema, NilClass]
+    def response_schema
+      Thread.current['RubyLLM::Provider::Methods.response_schema']
+    end
+
+    ##
+    # Runs the block with the given response schema active for the current thread,
+    # restoring the previous schema afterwards.
+    # @param response_schema [::RubyLLM::Schema]
+    def with_response_schema(response_schema)
+      prev_response_schema = Thread.current['RubyLLM::Provider::Methods.response_schema']
+      Thread.current['RubyLLM::Provider::Methods.response_schema'] = response_schema
+
+      yield
+    ensure
+      Thread.current['RubyLLM::Provider::Methods.response_schema'] = prev_response_schema
+    end
+
     def embed(text, model:, connection:, dimensions:)
       payload = render_embedding_payload(text, model:, dimensions:)
       response = connection.post(embedding_url(model:), payload)
diff --git a/lib/ruby_llm/providers/openai/chat.rb b/lib/ruby_llm/providers/openai/chat.rb
index 598020c8..ebae7df3 100644
--- a/lib/ruby_llm/providers/openai/chat.rb
+++ b/lib/ruby_llm/providers/openai/chat.rb
@@ -22,11 +22,14 @@ def render_payload(messages, tools:, temperature:, model:, stream: false) # rubo
             payload[:tools] = tools.map { |_, tool| tool_for(tool) }
             payload[:tool_choice] = 'auto'
           end
+
+          add_response_schema_to_payload(payload) if response_schema.present?
+
           payload[:stream_options] = { include_usage: true } if stream
         end
       end
 
-      def parse_completion_response(response) # rubocop:disable Metrics/MethodLength
+      def parse_completion_response(response) # rubocop:disable Metrics/MethodLength, Metrics/AbcSize -- AbcSize is high because the JSON parsing reads best in one method
         data = response.body
         return if data.empty?
 
@@ -37,6 +40,7 @@
 
         Message.new(
           role: :assistant,
+          content_schema: response_schema,
           content: message_data['content'],
           tool_calls: parse_tool_calls(message_data['tool_calls']),
           input_tokens: data['usage']['prompt_tokens'],
@@ -64,6 +68,54 @@ def format_role(role)
           role.to_s
         end
       end
+
+      private
+
+      ##
+      # @param [Hash] payload
+      def add_response_schema_to_payload(payload)
+        payload[:response_format] = gen_response_format_request
+
+        return unless payload[:response_format][:type] == :json_object
+
+        # NOTE: this is required by the OpenAI API when requesting arbitrary JSON.
+        payload[:messages].unshift({ role: :developer, content: <<~GUIDANCE
+          You must format your output as a valid JSON object.
+          Format your entire response as valid JSON.
+          Do not include explanations, markdown formatting, or any text outside the JSON.
+        GUIDANCE
+        })
+      end
+
+      ##
+      # @return [Hash]
+      def gen_response_format_request
+        if response_schema[:type].to_s == 'object' && response_schema[:properties].to_h.keys.none?
+          { type: :json_object } # Assume we just want json_mode
+        else
+          gen_json_schema_format_request
+        end
+      end
+
+      def gen_json_schema_format_request # rubocop:disable Metrics/MethodLength -- it's mostly the standard request hash
+        result_schema = Schema.new(response_schema) # deep copy so we don't mutate the thread-local original
+        result_schema.add_to_each_object_type!(:additionalProperties, false)
+        result_schema.add_to_each_object_type!(:required, ->(schema) { schema[:properties].to_h.keys })
+
+        {
+          type: :json_schema,
+          json_schema: {
+            name: :response,
+            schema: {
+              type: :object,
+              properties: { result: result_schema.to_h },
+              additionalProperties: false,
+              required: [:result]
+            },
+            strict: true
+          }
+        }
+      end
    end
  end
end
diff --git a/lib/ruby_llm/schema.rb b/lib/ruby_llm/schema.rb
new file mode 100644
index 00000000..10dc8b15
--- /dev/null
+++ b/lib/ruby_llm/schema.rb
@@ -0,0 +1,76 @@
+# frozen_string_literal: true
+
+module RubyLLM
+  ##
+  # Schema class for defining the structure of data objects.
+  # Wraps a plain Hash with schema-specific helpers.
+  # @see Hash
+  class Schema
+    delegate_missing_to :@schema
+
+    def initialize(schema = {})
+      @schema = deep_transform_keys_in_object(schema.to_h.dup, &:to_sym)
+    end
+
+    def [](key)
+      @schema[key.to_sym]
+    end
+
+    def []=(key, new_value)
+      @schema[key.to_sym] = deep_transform_keys_in_object(new_value, &:to_sym)
+    end
+
+    # Adds new_value under new_key for every sub-schema that is of type: :object
+    # @param new_key [Symbol] The key to add to each object type.
+    # @param new_value [Boolean, String, Proc] The value to assign (a Proc is called with each object schema).
+    def add_to_each_object_type!(new_key, new_value)
+      add_to_each_object_type(new_key, new_value, @schema)
+    end
+
+    # @return [Boolean]
+    def present?
+      @schema.present? && @schema[:type].present?
+    end
+
+    private
+
+    def add_to_each_object_type(new_key, new_value, schema)
+      return schema unless schema.is_a?(Hash)
+
+      if schema[:type].to_s == 'object'
+        add_to_object_type(new_key, new_value, schema)
+      elsif schema[:type].to_s == 'array' && schema[:items]
+        schema[:items] = add_to_each_object_type(new_key, new_value, schema[:items])
+      end
+
+      schema
+    end
+
+    def add_to_object_type(new_key, new_value, schema)
+      if schema[new_key.to_sym].nil?
+        schema[new_key.to_sym] = new_value.is_a?(Proc) ? new_value.call(schema) : new_value
+      end
+
+      schema[:properties]&.transform_values! { |value| add_to_each_object_type(new_key, new_value, value) }
+    end
+
+    ##
+    # Recursively transforms keys in a hash or array to symbols.
+    # Borrowed from ActiveSupport's Hash#deep_transform_keys
+    # @param object [Object] The object to transform.
+    # @param block [Proc] The block to apply to each key.
+    # @return [Object] The transformed object.
+ def deep_transform_keys_in_object(object, &block) + case object + when Hash + object.each_with_object({}) do |(key, value), result| + result[yield(key)] = deep_transform_keys_in_object(value, &block) + end + when Array + object.map { |e| deep_transform_keys_in_object(e, &block) } + else + object + end + end + end +end diff --git a/lib/ruby_llm/version.rb b/lib/ruby_llm/version.rb index e80bfb73..73b80c7e 100644 --- a/lib/ruby_llm/version.rb +++ b/lib/ruby_llm/version.rb @@ -1,5 +1,5 @@ # frozen_string_literal: true module RubyLLM - VERSION = '1.2.0' + VERSION = '1.3.0' end diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_arrays_within_object.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_arrays_within_object.yml new file mode 100644 index 00000000..51da9dc2 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_arrays_within_object.yml @@ -0,0 +1,111 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"user","content":"Provide + at least 1 hobby."}],"temperature":0.7,"stream":false,"response_format":{"type":"json_schema","json_schema":{"name":"response","schema":{"type":"object","properties":{"result":{"type":"object","properties":{"hobbies":{"type":"array","items":{"type":"string","enum":["Soccer","Golf","Hockey"]}}},"additionalProperties":false,"required":["hobbies"]}},"additionalProperties":false,"required":["result"]},"strict":true}}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Mon, 21 Apr 2025 19:58:26 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '156' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999990' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOrZaCm30iMatE1i7SfWWWcA6izPO", + "object": "chat.completion", + "created": 1745265506, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\"result\":{\"hobbies\":[\"Soccer\"]}}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 69, + "completion_tokens": 11, + "total_tokens": 80, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + 
"system_fingerprint": "fp_c1fb89028d" + } + recorded_at: Mon, 21 Apr 2025 19:58:26 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_integers_within_object.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_integers_within_object.yml new file mode 100644 index 00000000..0c15c97e --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_integers_within_object.yml @@ -0,0 +1,111 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"user","content":"Provide + sample customer age between 10 and 100."}],"temperature":0.7,"stream":false,"response_format":{"type":"json_schema","json_schema":{"name":"response","schema":{"type":"object","properties":{"result":{"type":"object","properties":{"age":{"type":"integer"}},"additionalProperties":false,"required":["age"]}},"additionalProperties":false,"required":["result"]},"strict":true}}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Mon, 21 Apr 2025 19:58:26 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '109' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999985' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOrZaoIiVy8391UrqbPhLCvnzggQg", + "object": "chat.completion", + "created": 1745265506, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\"result\":{\"age\":45}}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 55, + "completion_tokens": 8, + "total_tokens": 63, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + "system_fingerprint": "fp_8fd43718b3" + } + recorded_at: Mon, 21 Apr 2025 19:58:26 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_strings_within_object.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_strings_within_object.yml new file mode 100644 index 00000000..3283ca07 --- /dev/null +++ 
b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_more_complex_schemas_returns_strings_within_object.yml @@ -0,0 +1,111 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"user","content":"Provide + a sample customer name."}],"temperature":0.7,"stream":false,"response_format":{"type":"json_schema","json_schema":{"name":"response","schema":{"type":"object","properties":{"result":{"type":"object","properties":{"name":{"type":"string"}},"additionalProperties":false,"required":["name"]}},"additionalProperties":false,"required":["result"]},"strict":true}}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Mon, 21 Apr 2025 19:58:26 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '2739' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999990' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOrZXRlg9PgcZkl0OVkxbhf5dikqq", + "object": "chat.completion", + "created": 1745265503, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\"result\":{\"name\":\"Jane Doe\"}}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 50, + "completion_tokens": 9, + "total_tokens": 59, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + "system_fingerprint": "fp_eede8f0d45" + } + recorded_at: Mon, 21 Apr 2025 19:58:26 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_array_response.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_array_response.yml new file mode 100644 index 00000000..4b972df5 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_array_response.yml @@ -0,0 +1,111 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"user","content":"What + are the 2 largest countries? 
Only respond with country names."}],"temperature":0.7,"stream":false,"response_format":{"type":"json_schema","json_schema":{"name":"response","schema":{"type":"object","properties":{"result":{"type":"array","items":{"type":"string"}}},"additionalProperties":false,"required":["result"]},"strict":true}}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Mon, 21 Apr 2025 19:58:23 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '117' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999981' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOrZXWH6RuDbrFE36GckviGfyCCOo", + "object": "chat.completion", + "created": 1745265503, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\"result\":[\"Russia\",\"Canada\"]}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 55, + "completion_tokens": 9, + "total_tokens": 64, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + "system_fingerprint": "fp_eede8f0d45" + } + recorded_at: Mon, 21 Apr 2025 19:58:23 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_object_response.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_object_response.yml new file mode 100644 index 00000000..0a6754b7 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_object_response.yml @@ -0,0 +1,114 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"developer","content":"You + must format your output as a valid JSON object.\nFormat your entire response + as valid JSON.\nDo not include explanations, markdown formatting, or any text + outside the JSON.\n"},{"role":"user","content":"Provide a sample customer + data object with name and email keys."}],"temperature":0.7,"stream":false,"response_format":{"type":"json_object"}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - 
Mon, 21 Apr 2025 20:27:06 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '182' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999937' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOs1JQeHVaDPQIKSSDxJWlT2neRiP", + "object": "chat.completion", + "created": 1745267225, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\n \"name\": \"John Doe\",\n \"email\": \"john.doe@example.com\"\n}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 57, + "completion_tokens": 22, + "total_tokens": 79, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + "system_fingerprint": "fp_eede8f0d45" + } + recorded_at: Mon, 21 Apr 2025 20:27:06 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_simple_integer_responses.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_simple_integer_responses.yml new file mode 100644 index 00000000..c76577bb --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_simple_integer_responses.yml @@ -0,0 +1,111 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"user","content":"What''s + 2 + 2?"}],"temperature":0.7,"stream":false,"response_format":{"type":"json_schema","json_schema":{"name":"response","schema":{"type":"object","properties":{"result":{"type":"integer"}},"additionalProperties":false,"required":["result"]},"strict":true}}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Mon, 21 Apr 2025 19:58:23 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '110' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999993' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; 
includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOrZXQv6MnfAEDRPiZFGSFM36kI1x", + "object": "chat.completion", + "created": 1745265503, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\"result\":4}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 42, + "completion_tokens": 6, + "total_tokens": 48, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + "system_fingerprint": "fp_8fd43718b3" + } + recorded_at: Mon, 21 Apr 2025 19:58:23 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_simple_string_responses.yml b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_simple_string_responses.yml new file mode 100644 index 00000000..d6f99ae2 --- /dev/null +++ b/spec/fixtures/vcr_cassettes/chat_with_response_format_with_openai_gpt-4_1-nano_simple_type_param_returns_simple_string_responses.yml @@ -0,0 +1,111 @@ +--- +http_interactions: +- request: + method: post + uri: https://api.openai.com/v1/chat/completions + body: + encoding: UTF-8 + string: '{"model":"gpt-4.1-nano","messages":[{"role":"user","content":"Say ''Hello + World'' and nothing else."}],"temperature":0.7,"stream":false,"response_format":{"type":"json_schema","json_schema":{"name":"response","schema":{"type":"object","properties":{"result":{"type":"string"}},"additionalProperties":false,"required":["result"]},"strict":true}}}' + headers: + User-Agent: + - Faraday v2.13.0 + Authorization: + - Bearer + Content-Type: + - application/json + Accept-Encoding: + - gzip;q=1.0,deflate;q=0.6,identity;q=0.3 + Accept: + - "*/*" + response: + status: + code: 200 + message: OK + headers: + Date: + - Mon, 21 Apr 2025 20:32:33 GMT + Content-Type: + - application/json + Transfer-Encoding: + - chunked + Connection: + - keep-alive + Access-Control-Expose-Headers: + - X-Request-ID + Openai-Organization: + - "" + Openai-Processing-Ms: + - '90' + Openai-Version: + - '2020-10-01' + X-Ratelimit-Limit-Requests: + - '30000' + X-Ratelimit-Limit-Tokens: + - '150000000' + X-Ratelimit-Remaining-Requests: + - '29999' + X-Ratelimit-Remaining-Tokens: + - '149999989' + X-Ratelimit-Reset-Requests: + - 2ms + X-Ratelimit-Reset-Tokens: + - 0s + X-Request-Id: + - "" + Strict-Transport-Security: + - max-age=31536000; includeSubDomains; preload + Cf-Cache-Status: + - DYNAMIC + Set-Cookie: + - "" + - "" + X-Content-Type-Options: + - nosniff + Server: + - cloudflare + Cf-Ray: + - "" + Alt-Svc: + - h3=":443"; ma=86400 + body: + encoding: ASCII-8BIT + string: | + { + "id": "chatcmpl-BOs6bQYZuN9HKN9YjzQg0mKJK7Xwb", + "object": "chat.completion", + "created": 1745267553, + "model": "gpt-4.1-nano-2025-04-14", + "choices": [ + { + "index": 0, + "message": { + "role": "assistant", + "content": "{\"result\":\"Hello World\"}", + "refusal": null, + "annotations": [] + }, + "logprobs": null, + "finish_reason": "stop" + } + ], + "usage": { + "prompt_tokens": 44, + 
"completion_tokens": 7, + "total_tokens": 51, + "prompt_tokens_details": { + "cached_tokens": 0, + "audio_tokens": 0 + }, + "completion_tokens_details": { + "reasoning_tokens": 0, + "audio_tokens": 0, + "accepted_prediction_tokens": 0, + "rejected_prediction_tokens": 0 + } + }, + "service_tier": "default", + "system_fingerprint": "fp_eede8f0d45" + } + recorded_at: Mon, 21 Apr 2025 20:32:33 GMT +recorded_with: VCR 6.3.1 diff --git a/spec/ruby_llm/chat/with_response_format_spec.rb b/spec/ruby_llm/chat/with_response_format_spec.rb new file mode 100644 index 00000000..06948f9d --- /dev/null +++ b/spec/ruby_llm/chat/with_response_format_spec.rb @@ -0,0 +1,69 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::Chat, '#with_response_format' do + include_context 'with configured RubyLLM' + + # TODO: Add support for other API types + # chat_models = %w[claude-3-5-haiku-20241022 + # anthropic.claude-3-5-haiku-20241022-v1:0 + # gemini-2.0-flash + # deepseek-chat + # gpt-4.1-nano].freeze + + chat_models = %w[gpt-4.1-nano].freeze + chat_models.each do |model| + provider = RubyLLM::Models.provider_for(model).slug + + context "with #{provider}/#{model}" do + let(:chat) { RubyLLM.chat(model: model) } + + describe 'simple type param' do + it 'returns simple integer responses' do + expect(chat.with_response_format(:integer).ask("What's 2 + 2?").content).to eq(4) + end + + it 'returns simple string responses' do + response = chat.with_response_format(:string).ask("Say 'Hello World' and nothing else.").content + expect(response).to eq('Hello World') + end + + it 'returns array response' do + chat.with_response_format(:array, items: { type: :string }) + response = chat.ask('What are the 2 largest countries? Only respond with country names.').content + expect(response).to be_a(Array) + end + + it 'returns object response' do + chat.with_response_format(:json) + result = chat.ask('Provide a sample customer data object with name and email keys.') + expect(result.content).to be_a(Hash) + end + end + + describe 'more complex schemas' do + it 'returns strings within object' do + chat.with_response_format(:object, properties: { name: { type: :string } }) + response = chat.ask('Provide a sample customer name.').content + expect(response['name']).to be_a(String) + end + + it 'returns integers within object' do + chat.with_response_format(:object, properties: { age: { type: :integer } }) + response = chat.ask('Provide sample customer age between 10 and 100.').content + expect(response['age']).to be_a(Integer) + end + + it 'returns arrays within object' do # rubocop:disable RSpec/ExampleLength -- Just trying to meet line length limit + chat.with_response_format( + :object, + properties: { hobbies: { type: :array, items: { type: :string, enum: %w[Soccer Golf Hockey] } } } + ) + response = chat.ask('Provide at least 1 hobby.').content + expect(response['hobbies']).to all(satisfy { |hobby| %w[Soccer Golf Hockey].include?(hobby) }) + end + end + end + end +end diff --git a/spec/ruby_llm/message_spec.rb b/spec/ruby_llm/message_spec.rb new file mode 100644 index 00000000..9c58b35d --- /dev/null +++ b/spec/ruby_llm/message_spec.rb @@ -0,0 +1,44 @@ +# frozen_string_literal: true + +require 'spec_helper' + +RSpec.describe RubyLLM::Message do + subject(:message) { described_class.new(role: :assistant, content: content, content_schema: content_schema) } + + let(:content_schema) { nil } + + describe '#content' do + let(:content) { 'Hello, world!' 
}
+
+    it 'returns string content by default' do
+      expect(message.content).to eq(content)
+    end
+
+    context 'when content has object schema' do
+      let(:content_schema) { { type: :object, properties: { foo: { type: :string } } } }
+      let(:content) { { 'result' => { 'foo' => 'bar' } }.to_json }
+
+      it 'returns hash' do
+        expect(message.content).to be_a(Hash).and include('foo' => 'bar')
+      end
+    end
+
+    context 'when content has integer schema' do
+      let(:content_schema) { { type: :integer } }
+      let(:content) { { 'result' => 123 }.to_json }
+
+      it 'returns integer' do
+        expect(message.content).to eq(123)
+      end
+    end
+
+    context 'when content has array schema' do
+      let(:content_schema) { { type: :array, items: { type: :boolean } } }
+      let(:content) { { 'result' => [true, false] }.to_json }
+
+      it 'returns array' do
+        expect(message.content).to eq([true, false])
+      end
+    end
+  end
+end
diff --git a/spec/ruby_llm/schema_spec.rb b/spec/ruby_llm/schema_spec.rb
new file mode 100644
index 00000000..71f56ce3
--- /dev/null
+++ b/spec/ruby_llm/schema_spec.rb
@@ -0,0 +1,86 @@
+# frozen_string_literal: true
+
+require 'spec_helper'
+
+RSpec.describe RubyLLM::Schema do
+  it 'deeply symbolizes keys in a hash automatically' do
+    hash = { foo: 'bar', bar: { foo: 'bar' } }
+    schema = described_class.new(hash)
+    expect(schema.to_h).to eq(foo: 'bar', bar: { foo: 'bar' })
+  end
+
+  describe '#[]' do
+    let(:schema) { described_class.new('foo' => { 'bar' => 'foo' }, 'arr' => [{ 'some' => :val }]) }
+
+    it 'deeply symbolizes keys in hash' do
+      expect(schema[:foo][:bar]).to eq('foo')
+    end
+
+    it 'deeply symbolizes keys in array' do
+      expect(schema[:arr][0][:some]).to eq(:val)
+    end
+  end
+
+  describe '#[]=' do
+    let(:schema) { described_class.new('foo' => {}) }
+
+    it 'sets schema values with symbol keys in nested objects' do
+      schema[:foo] = { 'bar' => 123 }
+      expect(schema[:foo][:bar]).to eq(123)
+    end
+  end
+
+  describe '#add_to_each_object_type!' do
+    before { schema.add_to_each_object_type!(:additionalProperties, true) }
+
+    context 'with root data object' do
+      let(:schema) { described_class.new(type: :object, properties: { name: { type: :string } }) }
+
+      it 'adds the key to the root object' do
+        expect(schema[:additionalProperties]).to be(true)
+      end
+    end
+
+    context 'with nested data object' do
+      let(:schema) do
+        described_class.new(type: :object,
+                            properties: { address: { type: :object,
+                                                     properties: { city: { type: :string } } } })
+      end
+
+      it 'adds the key to nested objects' do
+        expect(schema[:properties][:address][:additionalProperties]).to be(true)
+      end
+    end
+
+    context 'with array item objects' do
+      let(:schema) do
+        described_class.new(type: :object,
+                            properties: {
+                              coordinates: {
+                                type: :array,
+                                items: {
+                                  type: :object,
+                                  properties: { lat: { type: :number }, lon: { type: :number } }
+                                }
+                              }
+                            })
+      end
+
+      it 'adds the key to objects inside arrays' do
+        expect(schema[:properties][:coordinates][:items][:additionalProperties]).to be(true)
+      end
+    end
+  end
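+
+  # Supplementary sketch: add_to_each_object_type! also accepts a Proc, which the
+  # OpenAI provider uses to derive each object's :required list from its own properties.
+  describe '#add_to_each_object_type! with a Proc value' do
+    let(:schema) { described_class.new(type: :object, properties: { name: { type: :string } }) }
+
+    it 'derives the value from each object schema' do
+      schema.add_to_each_object_type!(:required, ->(s) { s[:properties].to_h.keys })
+      expect(schema[:required]).to eq([:name])
+    end
+  end
+end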