Add video file support #405
Conversation
```ruby
response = chat.ask('What do you see in this video?', with: { video: video_path })

expect(response.content).to be_present
expect(response.content).not_to include('RubyLLM::Content')
```
Can we add an expectation that it recognizes the actual content of the video, like at least includes the words "woman" and "beach"?
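A hedged sketch of what such an assertion might look like, assuming the fixture video actually shows a woman on a beach (keyword matching on model output is inherently somewhat flaky):

```ruby
# Hypothetical content check: verifies the model described the video's
# actual subject, not merely that some response came back.
expect(response.content.downcase).to include('woman').and include('beach')
```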
I don't think so, for two reasons:
- A bit out of scope. No other tests in this spec do this. If we were to make that change, it would be better to do it for all tests for consistency, probably in a separate PR.
- My understanding is that these are more of a boundary interface test. In other words, we are testing whether the lib correctly interacts with the providers (sends valid requests) and obtains responses. We are not testing the capability of the models themselves.

But I'll wait for more comments. If more people think it makes sense, I can add the assertions.
You're right that it doesn't exist in this spec file. I'm a bit surprised by that, though, as it does exist in the spec for the text models:
ruby_llm/spec/ruby_llm/chat_spec.rb, line 16 at fa10f0c:
```ruby
expect(response.content).to include('4')
```
ruby_llm/spec/ruby_llm/chat_spec.rb, line 37 at fa10f0c:
```ruby
expect(first.content).to include('Matz')
```
ruby_llm/spec/ruby_llm/chat_spec.rb, line 40 at fa10f0c:
```ruby
expect(followup.content).to include('199')
```
I do like how this makes the specs ensure that the models being used actually accomplish the user's intended purpose. But not a showstopper for me.
You have a good point, but there is also some common wisdom that a project does not need to test the functionality of its external dependencies.
This depends a little on the testing philosophy of this project, so I'd prefer to wait for maintainer feedback before making this change.
This is a good question to ponder. There's value in adding the content checks, as they test whether the LLM actually received the file. The only problem is that they also test understanding, which we don't want to test. Since I don't think we can separate the two, I'm leaning towards checking the content too. We should probably add similar content checks to the rest of the spec, but perhaps in another PR.
What a great PR! Clean, focused, following the spirit of the project. I would have loved support for other providers as well, if you have the API keys.
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main     #405      +/-   ##
==========================================
+ Coverage   84.29%   84.37%   +0.08%
==========================================
  Files          36       36
  Lines        1897     1907      +10
  Branches      493      495       +2
==========================================
+ Hits         1599     1609      +10
  Misses        298      298
```
I'm glad you enjoyed it! I love when my PRs are well received. I don't have other API keys, unfortunately, so I think my contributions will be limited to Google/Gemini, at least for now.
Thx @arnodirlam for doing the hard work for video support. Thx @altxtech for polishing the last remaining bits.
I added VCRs for VertexAI and couldn't find any other major provider supporting video input directly, so let's ship this!
Nice! 🎉 Yeah, I also tried with all the providers we have, and Gemini was the only one supporting video input.
What this does
Adds video file support to RubyLLM.
Supersedes #260, originally authored by @arnodirlam. Thank you for the groundwork.
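For context, a minimal usage sketch of the new capability. The `with: { video: ... }` option matches the spec in the diff above; the model name and file path here are illustrative, and a configured Gemini API key is assumed:

```ruby
require 'ruby_llm'

# Assumes credentials are already configured for Gemini.
chat = RubyLLM.chat(model: 'gemini-2.5-flash')

# Attach a local video file (placeholder path) to the prompt.
response = chat.ask('What do you see in this video?', with: { video: 'beach.mp4' })
puts response.content
```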
What changed vs #260
- Rebased on `main`
- `gemini-2.5-flash` as a video model for tests

Type of change
Scope check

Quality check
- `overcommit --install` and all hooks pass
- `bundle exec rake vcr:record[provider_name]`
- `bundle exec rspec`
- `models.json`, `aliases.json`

API changes
Related issues
Closes #259