Fix: Resolve JSON malformation causing infinite loops and TypeError #1037

Open
gdeyoung wants to merge 1 commit into agent0ai:main from gdeyoung:main

Conversation

@gdeyoung

Summary

Fixed multiple critical bugs causing JSON malformation, infinite loops, and TypeError crashes in Agent Zero.

Issues Fixed

1. JSON Object Extraction Bug (rfind)

File: python/helpers/extract_tools.py - Used rfind to find the LAST closing brace in the text instead of the brace that matches the opening one
Fix: Proper nested brace tracking using depth counter

2. Escape Handling Logic Error

File: python/helpers/extract_tools.py - Escaped quotes (\") were not handled correctly when toggling the in_string flag
Fix: Proper check for escaped quotes
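
A minimal sketch of fixes 1 and 2 combined, for illustration only. It is not the code from this PR: the function name `extract_first_json_object` and the variable names are assumptions. It walks the text once, counts brace depth, and ignores braces and escaped quotes that appear inside JSON strings.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[str]:
    """Return the first balanced {...} span in text, or None if there is none.

    Hypothetical helper: illustrates depth tracking plus escape handling.
    """
    start = text.find("{")
    if start == -1:
        return None

    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False      # this character was escaped, skip it
            elif ch == "\\":
                escaped = True       # the next character is escaped
            elif ch == '"':
                in_string = False    # an unescaped quote closes the string
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start : i + 1]
    return None  # braces never balanced


# The string value contains escaped quotes and a stray closing brace.
sample = 'note {"tool_name": "response", "tool_args": {"text": "say \\"hi\\" }"}} trailing }'
extracted = extract_first_json_object(sample)
print(json.loads(extracted)["tool_args"]["text"])
```

Returning `None` for unbalanced input is a design choice of this sketch; the caller can then treat the message as a misformat instead of parsing a partial object.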

3. No Loop Protection

File: agent.py - No protection against consecutive misformat errors
Fix: Added consecutive_misformat counter with 5-attempt limit + HandledException
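
The loop protection could look roughly like the sketch below. Only the `consecutive_misformat` name, the limit of 5, and `HandledException` come from the description above; the `MisformatGuard` class, its `record` method, and the choice to raise on the fifth consecutive failure are assumptions made for illustration. The local `HandledException` is a stand-in for the class in python.helpers.errors so that the example runs on its own.

```python
class HandledException(Exception):
    """Stand-in for python.helpers.errors.HandledException (assumption)."""


MAX_CONSECUTIVE_MISFORMATS = 5  # the 5-attempt limit from the PR description


class MisformatGuard:
    """Hypothetical helper that counts consecutive misformatted replies."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_MISFORMATS) -> None:
        self.limit = limit
        self.consecutive_misformat = 0

    def record(self, parsed_ok: bool) -> None:
        if parsed_ok:
            # Any well-formed reply resets the streak.
            self.consecutive_misformat = 0
            return
        self.consecutive_misformat += 1
        if self.consecutive_misformat >= self.limit:
            raise HandledException(
                f"{self.consecutive_misformat} consecutive misformatted replies, stopping"
            )


guard = MisformatGuard()
try:
    for _ in range(10):          # simulate a model that never produces valid JSON
        guard.record(parsed_ok=False)
except HandledException as exc:
    print("loop terminated:", exc)
```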

4. TypeError: tool_args must be a mapping

File: agent.py - .get() returns a string when the key exists with a string value, which then fails when unpacked as keyword arguments
Fix: isinstance(tool_args, dict) validation
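
To illustrate the failure and the guard, here is a small sketch. The names `run_tool` and `normalize_tool_args` are hypothetical, and falling back to an empty dict is an assumption of this example; the PR may handle a non-dict value differently.

```python
from typing import Any, Dict


def run_tool(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical tool entry point that receives unpacked tool_args."""
    return kwargs


def normalize_tool_args(tool_request: Dict[str, Any]) -> Dict[str, Any]:
    """Return tool_args only if it is a dict; otherwise fall back to {} (assumption)."""
    tool_args = tool_request.get("tool_args", {})
    if not isinstance(tool_args, dict):
        return {}
    return tool_args


# The model put a plain string where a JSON object was expected.
bad_request = {"tool_name": "code_execution_tool", "tool_args": "ls -la"}

try:
    run_tool(**bad_request["tool_args"])   # unpacking a str raises TypeError
except TypeError as exc:
    print("without the check:", exc)

print("with the check:", run_tool(**normalize_tool_args(bad_request)))
```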

- Fixed rfind bug in extract_tools.py - proper nested brace tracking
- Fixed escape handling for escaped quotes in JSON strings
- Added HandledException class for graceful loop termination
- Added consecutive_misformat counter (5 attempt limit)
- Fixed tool_args TypeError - ensure dict before unpacking
@gdeyoung
Author

Additional Fix v1.2 - HandledException Shadowing

Issue Found

A duplicate local class definition of HandledException in agent.py was shadowing the imported class from python.helpers.errors. This caused:

  • isinstance(exception, HandledException) check to fail in handle_critical_exception
  • Unhandled exceptions causing crashes instead of graceful loop termination

Root Cause

  • Line 36: from python.helpers.errors import RepairableException, HandledException ✅ (import)
  • Line 353: class HandledException(Exception): pass ❌ (duplicate local definition)

Fix Applied

Removed the duplicate class definition (lines 349-355) in agent.py. Now uses only the imported HandledException from errors.py.
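
The effect of such shadowing can be reproduced in isolation. The sketch below is a simplified stand-in, not the agent.py code: `ImportedHandledException` and `is_handled` are hypothetical names used to keep a reference to the first class after the name is rebound.

```python
class HandledException(Exception):
    """Stands in for the class imported from python.helpers.errors."""


# Keep a reference to the "imported" class, as code elsewhere would have.
ImportedHandledException = HandledException


class HandledException(Exception):  # noqa: F811 - duplicate definition rebinds the name
    """Stands in for the duplicate local definition."""


def is_handled(exc: BaseException) -> bool:
    """Check against the originally imported class, like a handler in another module."""
    return isinstance(exc, ImportedHandledException)


# Same class name, different class object: the isinstance check fails.
print(is_handled(HandledException("stop")))          # raised via the local duplicate
print(is_handled(ImportedHandledException("stop")))  # raised via the imported class
```

Two classes with the same name are unrelated types unless one inherits from the other, which is why removing the duplicate restores the check.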

Files Changed

  • agent.py: Removed duplicate HandledException class definition

Verification

  • Python syntax validated
  • Import now works correctly (same class object used throughout)
  • Exception handling now works as intended

Added: 2026-02-14

@longman391

I can confirm this is a critical bug. I traced tool call failures across multiple chat sessions on my instance (v0.9.8.1, Claude Opus 4.6 via GitHub Copilot, 128k context) and found 105 empty tool_name failures in a single log file — all caused by the rfind bug in extract_json_object_string().

Evidence from my logs:

  • Chat mYlPuJkf: 38 consecutive 'Tool not found' errors (messages 131-168) — the agent was stuck in an infinite loop of malformed output → error → retry
  • The extract_json_object_string() function grabs everything between the first { and last }, which means incidental curly braces in LLM output (file paths like /restore/{backup_id}/, inline JSON examples, etc.) get misinterpreted as tool calls with empty tool_name
  • I verified this by testing DirtyJson directly: input 'The backup is at /restore/{backup_id}/files' parses to {'backup_id': ''} with no tool_name → triggers 'Tool not found'
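
The first-`{`-to-last-`}` behaviour described above can be reproduced with a few lines. This is a simplified stand-in, not the actual extract_json_object_string() implementation, and `naive_slice` is a hypothetical name. DirtyJson is not used here; the strict `json` module is used only to show that the slice is not a valid object.

```python
import json
from typing import Optional


def naive_slice(text: str) -> Optional[str]:
    """Hypothetical reproduction: slice from the first '{' to the last '}'."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    return text[start : end + 1]


message = 'Saved to /restore/{backup_id}/ then {"tool_name": "response", "tool_args": {}}'
candidate = naive_slice(message)
print(candidate)  # starts at the path placeholder, not at the real tool call

try:
    json.loads(candidate)
except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
    print("not a valid JSON object:", exc)
```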

Additional issue not covered by this PR: The LLM frequently hallucinates tool names from training data instead of using the actual tool names in the system prompt:

  • code_execution instead of code_execution_tool
  • web_search instead of search_engine
  • browser_tool instead of browser_agent
  • response_tool / message_tool instead of response
  • terminal instead of code_execution_tool

A simple alias mapping in get_tool() would catch these. Happy to submit a PR for that.
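
A minimal sketch of such an alias mapping, using only the name pairs listed above. `TOOL_ALIASES` and `resolve_tool_name` are illustrative names, and how the lookup would be wired into get_tool() is left open because that function's signature is not shown here.

```python
# Hallucinated name -> actual tool name, taken from the list above.
TOOL_ALIASES = {
    "code_execution": "code_execution_tool",
    "terminal": "code_execution_tool",
    "web_search": "search_engine",
    "browser_tool": "browser_agent",
    "response_tool": "response",
    "message_tool": "response",
}


def resolve_tool_name(name: str) -> str:
    """Map a known alias to the real tool name; pass other names through unchanged."""
    cleaned = name.strip()
    return TOOL_ALIASES.get(cleaned, cleaned)


for requested in ("web_search", "terminal", "search_engine"):
    print(requested, "->", resolve_tool_name(requested))
```

Unknown names pass through unchanged, so a genuinely missing tool still surfaces as an error instead of being silently remapped.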

The consecutive misformat counter in this PR would have prevented the 38-message infinite loop. Please consider merging this — it's a significant stability improvement.

longman391 added a commit to longman391/agent-zero that referenced this pull request Feb 20, 2026
Fix agent0ai#3 - Empty tool_name validation:
- When DirtyJson parses valid JSON but the object has no tool_name field,
  the agent previously dispatched with an empty string, triggering
  'Tool  not found' errors. Now treats this as a misformat and increments
  the consecutive_misformat counter (integrates with PR agent0ai#1037's circuit breaker).
- Evidence: 105 empty tool_name failures found in a single log session.

Fix agent0ai#4 - Tool name alias mapping:
- LLMs frequently hallucinate tool names from training data instead of
  using the actual names in the system prompt. Added TOOL_ALIASES dict
  that maps common hallucinated names to actual Agent Zero tool names:
  - code_execution/terminal/shell -> code_execution_tool
  - web_search/search -> search_engine
  - browser_tool/browser -> browser_agent
  - response_tool/message_tool/message/reply -> response
  - knowledge_tool/memory_tool -> memory
  - task_manager -> scheduler
- Evidence: 20+ hallucinated tool name failures across multiple chat logs.

Related: agent0ai#1031, agent0ai#805
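
The empty tool_name check described in this commit message could be sketched as follows. The function name `has_usable_tool_name` is hypothetical; the sketch only shows the validation step, not how the commit hooks it into the consecutive_misformat counter.

```python
from typing import Any


def has_usable_tool_name(parsed: Any) -> bool:
    """Return True only if parsed is a dict with a non-empty string tool_name."""
    if not isinstance(parsed, dict):
        return False
    name = parsed.get("tool_name")
    return isinstance(name, str) and bool(name.strip())


# Object of the kind reported earlier in this thread: parsed, but no tool_name.
print(has_usable_tool_name({"backup_id": ""}))
print(has_usable_tool_name({"tool_name": "response", "tool_args": {"text": "done"}}))
```

A request that fails this check would be counted as a misformat rather than dispatched with an empty name.
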
longman391 added a commit to longman391/agent-zero that referenced this pull request Feb 20, 2026
…ng resistance

Claude subordinates interpret Agent Zero system prompt as "prompt injection"
and refuse to output JSON, causing infinite misformat loops (even with
circuit breaker from PR agent0ai#1037).

GLM-5 reliably follows JSON formatting instructions and is capable
enough for agentic subordinate work.

Uses existing initialize_agent(override_settings=) mechanism.