Infinite tool calling loop with Meta Llama models in LangGraph agents #49

@fede-kamel

Description


When using ChatOCIGenAI with bind_tools() in a LangGraph agent, Meta Llama models (and potentially other models) enter an infinite tool calling loop. After a tool is called and its results are returned via ToolMessage, the model continues to make the same tool call repeatedly instead of generating a final response.

Steps to Reproduce

  1. Create a LangGraph agent with a tool:
from langchain_oci.chat_models import ChatOCIGenAI
from langchain_core.messages import SystemMessage, HumanMessage
from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.prebuilt import ToolNode
from langchain.tools import StructuredTool

def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 65°F"

weather_tool = StructuredTool.from_function(
    func=get_weather,
    name="get_weather",
    description="Get the current weather for a given city name.",
)

# Create model and bind tools
chat_model = ChatOCIGenAI(
    model_id="meta.llama-4-scout-17b-16e-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment...",
    auth_type="SECURITY_TOKEN",
    disable_streaming="tool_calling",
)
model_with_tools = chat_model.bind_tools([weather_tool])

# Create agent graph
def call_model(state: MessagesState):
    return {"messages": [model_with_tools.invoke(state["messages"])]}

def should_continue(state: MessagesState):
    if state["messages"][-1].tool_calls:
        return "tools"
    return END

builder = StateGraph(MessagesState)
builder.add_node("call_model", call_model)
builder.add_node("tools", ToolNode(tools=[weather_tool]))
builder.add_edge(START, "call_model")
builder.add_conditional_edges("call_model", should_continue, ["tools", END])
builder.add_edge("tools", "call_model")
agent = builder.compile()

# Invoke agent
result = agent.invoke({
    "messages": [
        SystemMessage("You are a helpful assistant. Use tools when needed."),
        HumanMessage("What's the weather in Chicago?")
    ]
})
  2. The agent hits the recursion limit (GraphRecursionError).

Expected Behavior

After the tool is called and results are returned:

  1. Model calls get_weather tool for "Chicago"
  2. Tool returns result: "Weather in Chicago: Sunny, 65°F"
  3. Model receives the ToolMessage with the result
  4. Model generates final response using the weather data
  5. Agent completes successfully
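In terms of the graph state, a successful run would end with a message tail shaped like this. The classes below are stand-ins for the langchain_core.messages types, and the final answer text is illustrative:

```python
from dataclasses import dataclass, field

# Stand-ins for langchain_core.messages classes, to illustrate the shape only.
@dataclass
class AIMessage:
    content: str
    tool_calls: list = field(default_factory=list)

@dataclass
class ToolMessage:
    content: str
    tool_call_id: str = ""

# What the tail of state["messages"] should look like on a successful run:
expected_tail = [
    AIMessage("", tool_calls=[{"name": "get_weather",
                               "args": {"city": "Chicago"}, "id": "call_1"}]),
    ToolMessage("Weather in Chicago: Sunny, 65°F", tool_call_id="call_1"),
    AIMessage("It's sunny and 65°F in Chicago."),  # final answer, no tool calls
]
```

Because the last AIMessage carries no tool_calls, should_continue routes to END and the agent terminates.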

Actual Behavior (Before Fix)

🔍 Model response:
   Content: 
   Tool calls: [{'name': 'get_weather', 'args': {'city': 'Chicago'}, 'id': 'call_1'}]

🔍 Tool executed: Weather in Chicago: Sunny, 65°F

🔍 Model response:
   Content: 
   Tool calls: [{'name': 'get_weather', 'args': {'city': 'Chicago'}, 'id': 'call_2'}]

🔍 Tool executed: Weather in Chicago: Sunny, 65°F

🔍 Model response:
   Content: 
   Tool calls: [{'name': 'get_weather', 'args': {'city': 'Chicago'}, 'id': 'call_3'}]

... (repeats 22 more times) ...

langgraph.errors.GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition. 
This may indicate an infinite loop in the graph, or a graph that requires more steps than the limit.

The model keeps calling the same tool over and over, never generating a final response.

Root Cause

The bind_tools() method sends the tools parameter in every API request to OCI Generative AI. When ToolMessage is present in the conversation history, the model should be instructed to stop calling tools by setting tool_choice="none".

Currently, this is not being done automatically. As a result:

  • The model sees tools are still available (because tools is in the request)
  • The model doesn't know it should stop calling tools
  • The model calls the same tool again, creating an infinite loop
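The trigger condition is easy to state in code. A minimal sketch of the "tool results already present" check, using stand-in message classes (the real ones live in langchain_core.messages):

```python
from dataclasses import dataclass

# Stand-ins for langchain_core message types, for illustration only.
@dataclass
class HumanMessage:
    content: str

@dataclass
class ToolMessage:
    content: str
    tool_call_id: str = ""

def has_tool_results(messages) -> bool:
    """True once any tool result is in the history -- the point at which
    the model should be told to stop calling tools."""
    return any(isinstance(m, ToolMessage) for m in messages)

turn_1 = [HumanMessage("What's the weather in Chicago?")]
turn_2 = turn_1 + [ToolMessage("Weather in Chicago: Sunny, 65°F",
                               tool_call_id="call_1")]

print(has_tool_results(turn_1))  # False: first request may offer tools freely
print(has_tool_results(turn_2))  # True: next request should disable tool calls
```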

Affected Models

Confirmed affected (infinite loop):

  • meta.llama-4-scout-17b-16e-instruct ❌
  • meta.llama-3.3-70b-instruct ❌

Cohere models are affected to a lesser degree; they handle the loop better but still benefit from the fix:

  • cohere.command-a-03-2025
  • cohere.command-r-plus-08-2024

Environment

  • langchain-oci version: latest (from main branch)
  • LangChain Core version: 0.3+
  • LangGraph version: 0.2+
  • Python version: 3.11+
  • OCI Generative AI Service: us-chicago-1 region

Proposed Solution

In the GenericProvider.messages_to_oci_params() method, automatically set tool_choice="none" when:

  1. Tool results (ToolMessage) have been received in the conversation
  2. Tools are bound to the request
  3. User hasn't explicitly set tool_choice

This tells the model to stop calling tools and generate a final response instead.

Implementation:

# In messages_to_oci_params() method
has_tool_results = any(isinstance(msg, ToolMessage) for msg in messages)
if has_tool_results and "tools" in kwargs and "tool_choice" not in kwargs:
    result["tool_choice"] = self.oci_tool_choice_none()
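Put together, the request-building logic would behave roughly like this simplified sketch. build_oci_params is a stand-in for GenericProvider.messages_to_oci_params(), modeling only the proposed tool_choice logic, not the real provider code:

```python
class ToolMessage:
    """Stand-in for langchain_core.messages.ToolMessage."""
    def __init__(self, content):
        self.content = content

def build_oci_params(messages, **kwargs):
    """Simplified stand-in for messages_to_oci_params(): only the
    proposed tool_choice override is modeled here."""
    result = dict(kwargs)
    has_tool_results = any(isinstance(m, ToolMessage) for m in messages)
    # All three conditions from the proposal must hold before overriding.
    if has_tool_results and "tools" in kwargs and "tool_choice" not in kwargs:
        result["tool_choice"] = "none"  # stand-in for self.oci_tool_choice_none()
    return result

# First request: no tool results yet, tools remain fully available.
p1 = build_oci_params([], tools=["get_weather"])
# After a tool result: tool_choice is forced to "none".
p2 = build_oci_params([ToolMessage("Sunny, 65°F")], tools=["get_weather"])
# An explicit user setting is never overridden.
p3 = build_oci_params([ToolMessage("Sunny, 65°F")], tools=["get_weather"],
                      tool_choice="auto")
```

The third call shows why the "user hasn't explicitly set tool_choice" condition matters: an agent that intentionally allows multi-step tool use can still opt out of the override.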

Workarounds (Not Recommended)

Before this fix, users had to implement defensive workarounds like:

  • Manually counting tool calls and forcing a stop
  • Adding custom logic to strip tools after N iterations
  • Using custom nodes to generate responses when loops detected

These workarounds are fragile and shouldn't be necessary. The fix addresses the root cause.
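For reference, the "force a stop after N iterations" style of workaround looked roughly like this hypothetical sketch (stand-in message classes; the cap of 3 is arbitrary):

```python
class AIMessage:
    """Stand-in for the langchain AIMessage."""
    def __init__(self, tool_calls=None):
        self.tool_calls = tool_calls or []

class ToolMessage:
    """Stand-in for the langchain ToolMessage."""
    def __init__(self, content):
        self.content = content

MAX_TOOL_ROUNDS = 3  # arbitrary cap; purely defensive

def should_continue(state):
    """Defensive router: refuse to visit the tools node after a fixed
    number of tool rounds, even if the model keeps emitting tool calls."""
    messages = state["messages"]
    tool_rounds = sum(1 for m in messages if isinstance(m, ToolMessage))
    last = messages[-1]
    if getattr(last, "tool_calls", None) and tool_rounds < MAX_TOOL_ROUNDS:
        return "tools"
    return "__end__"  # langgraph.graph.END is the string "__end__"
```

Note the weakness: when the cap triggers, the run ends on an AIMessage that still contains unanswered tool calls, so the user gets no final answer. The proposed tool_choice fix avoids this by making the model itself produce a closing response.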
