feat(llm): pass llm params directly #1387
base: develop
Conversation
Implements tool call extraction and passthrough functionality in LLMRails:
- Add tool_calls_var context variable for storing LLM tool calls
- Refactor llm_call utils to extract and store tool calls from responses
- Support tool calls in both GenerationResponse and dict message formats
- Add ToolMessage support for langchain message conversion
- Comprehensive test coverage for tool calling integration
Add example configuration and documentation for using NVIDIA NeMoGuard NIMs, including content moderation, topic control, and jailbreak detection.
Update verbose logging to safely handle cases where log records may not have 'id' or 'task' attributes. Prevents potential AttributeError and improves robustness of LLM and prompt log output formatting.
… Runnable protocol support
- Implement comprehensive async/sync invoke, batch, and streaming support
- Add robust input/output transformation for all LangChain formats (ChatPromptValue, BaseMessage, dict, string)
- Enhance chaining behavior with intelligent __or__ method handling RunnableBinding and complex chains
- Add concurrency controls, error handling, and configurable blocking messages
- Implement proper tool calling support with tool call passthrough
- Add extensive test suite (14 test files, 2800+ lines) covering all major functionality including batching, streaming, composition, piping, and tool calling
- Reorganize and expand test structure for better maintainability

apply review suggestions
…Rails Ensure AIMessage responses from RunnableRails contain the same metadata fields (response_metadata, usage_metadata, additional_kwargs, id) as direct LLM calls, enabling consistent LangChain integration behavior.
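As an illustration of the metadata parity this commit describes, here is a minimal sketch of an `AIMessage` carrying the listed fields; the field names come from the commit message and langchain-core, while the values are invented:

```python
# Illustrative only: an AIMessage with the metadata fields listed above.
# Values are made up; RunnableRails would populate them from the LLM response.
from langchain_core.messages import AIMessage

msg = AIMessage(
    content="The weather is sunny.",
    response_metadata={"model_name": "gpt-4o-mini", "finish_reason": "stop"},
    usage_metadata={"input_tokens": 12, "output_tokens": 6, "total_tokens": 18},
    additional_kwargs={},
    id="run-abc123",
)
print(msg.response_metadata, msg.usage_metadata)
```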
Enhance streaming in RunnableRails to include generation metadata in streamed chunks. Skips END_OF_STREAM markers and updates chunk formatting to support metadata for AIMessageChunk outputs. This improves compatibility with consumers expecting metadata in streaming responses.

fix
fix
Introduce tool output/input rails configuration and Colang flows for tool call validation and parameter security checks. Add support for BotToolCall event emission in passthrough mode, enabling tool call guardrails before execution.
…ion and processing
- Add UserToolMessages event handling and tool input rails processing
- Fix message-to-event conversion to properly handle tool messages in conversation history
- Preserve tool call context in passthrough mode by using full conversation history
- Support tool_calls and tool message metadata in LangChain format conversion
- Include comprehensive test suite for tool input rails functionality

test(runnable_rails): fix prompt format in passthrough mode
feat: support ToolMessage in message dicts
refactor: rename BotToolCall to BotToolCalls
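As a rough illustration of the "support ToolMessage in message dicts" item, a sketch of converting a tool-role message dict into LangChain's `ToolMessage`; the helper name is hypothetical, and only `ToolMessage` and its fields come from langchain-core:

```python
# Hypothetical helper sketching the dict -> ToolMessage conversion mentioned
# above; not the actual conversion code from this PR.
from langchain_core.messages import ToolMessage

def tool_dict_to_message(msg: dict) -> ToolMessage:
    """Convert a {"role": "tool", ...} message dict into a LangChain ToolMessage."""
    return ToolMessage(
        content=msg["content"],
        tool_call_id=msg["tool_call_id"],
    )

print(tool_dict_to_message({"role": "tool", "content": "72°F", "tool_call_id": "call_1"}))
```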
Extend llm_call to accept an optional llm_params dictionary for passing configuration parameters (e.g., temperature, max_tokens) to the language model. This enables more flexible control over LLM behavior during calls.

refactor(llm): replace llm_params context manager with argument
Update all usages of the llm_params context manager to pass llm_params as an argument to llm_call instead. This simplifies parameter handling and improves code clarity for LLM calls.

docs: clarify prompt customization and llm_params usage
update LLMChain config usage
add unit and e2e tests
fix failing tests
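A hedged sketch of the calling convention these commits describe. `llm_call` lives in `nemoguardrails.actions.llm.utils`, but the exact signature of the new `llm_params` argument is an assumption here:

```python
# Sketch of per-call parameter passing; the llm_params keyword is the addition
# this PR describes, so treat the exact signature as an assumption.
import asyncio

from langchain_openai import ChatOpenAI
from nemoguardrails.actions.llm.utils import llm_call

async def main() -> None:
    llm = ChatOpenAI(model="gpt-4o-mini")
    # Previously: `with llm_params(llm, temperature=0.7, max_tokens=200): ...`
    result = await llm_call(
        llm,
        "Summarize the guardrails concept in one sentence.",
        llm_params={"temperature": 0.7, "max_tokens": 200},
    )
    print(result)

asyncio.run(main())
```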
Pull Request Overview
This PR migrates from using context managers for LLM parameter management to passing parameters directly to the `llm_call` function. The change leverages LangChain's universal `.bind()` method to pass parameters like temperature and max_tokens directly to LLM models without temporarily modifying their state.
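For context, a minimal sketch of the `.bind()` mechanism in isolation; the model name and parameter values are arbitrary examples, not taken from this PR:

```python
# .bind() returns a new runnable with the kwargs attached to every invocation;
# the original model object is left untouched, unlike the old context manager,
# which temporarily mutated it.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
bound_llm = llm.bind(temperature=1.0, max_tokens=100)

print(bound_llm.invoke("Say hello in five words or fewer").content)
```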
Key changes:
- Added `llm_params` parameter to the `llm_call` function for direct parameter passing
- Replaced all `with llm_params(...)` context manager usage with direct parameter passing
- Updated tests to cover the new parameter passing approach
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `nemoguardrails/actions/llm/utils.py` | Added llm_params parameter to llm_call and implemented LLM binding |
| `tests/test_tool_calling_utils.py` | Added comprehensive tests for new parameter passing functionality |
| `tests/test_llm_params_e2e.py` | New end-to-end tests for LLM parameter functionality with real providers |
| `tests/test_llm_params.py` | Added migration tests comparing context manager with direct parameter approach |
| Various action files | Updated all LLM calls to use direct parameter passing instead of context managers |
| `docs/user-guides/advanced/prompt-customization.md` | Updated documentation example to show new parameter passing syntax |
```diff
 chain = LLMChain(prompt=last_bot_prompt, llm=llm)

-# Generate multiple responses with temperature 1.
-with llm_params(llm, temperature=1.0, n=num_responses):
-    extra_llm_response = await chain.agenerate(
-        [{"text": last_bot_prompt_string}],
-        run_manager=logging_callback_manager_for_chain,
-    )
+# Use chain.with_config for runtime parameters
+configured_chain = chain.with_config(
+    configurable={"temperature": 1.0, "n": num_responses}
+)
+extra_llm_response = await configured_chain.agenerate(
```
The use of `chain.with_config()` with the `configurable` parameter differs from the pattern used elsewhere in the codebase. This should use the `llm_params` approach for consistency, or the LLM should be bound directly with `.bind()` before creating the chain.
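For reference, a sketch of the second alternative mentioned here (binding the LLM before building the chain). The variable names are taken from the diff above, the fragment assumes the same surrounding async function, and whether LLMChain accepts a bound runnable depends on the installed LangChain version:

```python
# Sketch only: bind per-call parameters to the LLM, then build the chain.
# Names (llm, last_bot_prompt, num_responses, ...) come from the diff above.
bound_llm = llm.bind(temperature=1.0, n=num_responses)
chain = LLMChain(prompt=last_bot_prompt, llm=bound_llm)

extra_llm_response = await chain.agenerate(
    [{"text": last_bot_prompt_string}],
    run_manager=logging_callback_manager_for_chain,
)
```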
```python
)
negative_answer_result = create_negatives_chain.invoke(
    {"evidence": evidence, "answer": answer},
    config={"temperature": 0.8, "max_tokens": 300},
```
The use of `chain.invoke()` with the `config` parameter is inconsistent with the migration pattern used throughout the rest of the codebase. This should follow the same `llm_params` pattern for consistency.
config={"temperature": 0.8, "max_tokens": 300}, | |
llm_params={"temperature": 0.8, "max_tokens": 300}, |
Looks good overall, but 4k LOC is too large for a single PR.
I'm a little confused about a few things:
- Why did we use a context manager to pass a dict of LLM parameters in the first place? Normally they're used to make sure we close files/DB connections so we don't forget.
- Does a context manager break some LangChain functionality?
- Can you add some local integration tests to make sure this works when calling tools with production LLMs?
Description
langchain-community models support the `.bind()` method:

```
langchain-core (contains Runnable interface with .bind())
├── langchain-openai (inherits .bind())
├── langchain-community (inherits .bind())
├── langchain-anthropic (inherits .bind())
└── langchain-* (all inherit .bind()?)
```
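A quick, hedged way to sanity-check this locally (assuming langchain-openai and langchain-anthropic are installed; other providers can be checked the same way):

```python
# Confirms that .bind() is inherited from langchain-core's Runnable rather than
# re-implemented by each provider package.
from langchain_core.runnables import Runnable
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

for cls in (ChatOpenAI, ChatAnthropic):
    assert issubclass(cls, Runnable)
    assert "bind" not in cls.__dict__  # not overridden; comes from Runnable

print("All checked chat models expose .bind() via langchain-core's Runnable")
```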
Related Issue(s)
Checklist