fix: include tool and trace state in evaluation cache keys by aerosta · Pull Request #2561 · confident-ai/deepeval

aerosta · 2026-03-19T00:35:54Z

Summary

The evaluation cache key originally only considered the text fields on LLMTestCase: input, actual_output, expected_output, context, and retrieval_context.

For tool-based and trace-based evaluation, that was incomplete. Metrics such as ToolCorrectnessMetric and trace-level metrics also depend on execution-state fields like tools_called, expected_tools, MCP call data, and _trace_dict. Two test cases with the same text but different tool calls or trace structure could therefore produce the same cache key and reuse stale metric results.

This change moves cache key construction into CachedTestCase.create_cache_key() and includes the execution-state fields that affect scoring.

Changes

add CachedTestCase.create_cache_key() in deepeval/test_run/cache.py
replace inline cache-key construction in get_cached_test_case() and cache_test_case()
include tool, MCP, and trace fields in the cache key
normalize nested values before serialization so key generation stays deterministic
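
As an illustration of the normalization step, here is a minimal sketch of how nested values might be made order-stable before serialization. It is not the code from this PR: `normalize` and `build_cache_key` are hypothetical names, and the real `CachedTestCase.create_cache_key()` may handle types differently.

```python
import json
from dataclasses import asdict, is_dataclass
from typing import Any


def normalize(value: Any) -> Any:
    """Recursively convert a value into a JSON-friendly, order-stable form."""
    if is_dataclass(value) and not isinstance(value, type):
        return normalize(asdict(value))
    if isinstance(value, dict):
        # Sort by stringified key so insertion order cannot change the result.
        items = sorted(value.items(), key=lambda kv: str(kv[0]))
        return {str(k): normalize(v) for k, v in items}
    if isinstance(value, (list, tuple)):
        # Sequence order is kept: the order of tool calls can be meaningful.
        return [normalize(v) for v in value]
    if isinstance(value, (set, frozenset)):
        # Sets have no defined order, so sort their normalized members.
        return sorted((normalize(v) for v in value), key=repr)
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    # Fallback (assumption): only stable if the object defines a stable repr.
    return repr(value)


def build_cache_key(fields: dict) -> str:
    """Serialize normalized fields into a deterministic string key."""
    return json.dumps(normalize(fields), sort_keys=True)
```

With this sketch, two payloads that differ only in dict ordering serialize to the same key, while a different tool name yields a different key.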

Context

Tool-calling and trace-level metrics evaluate fields such as tools_called, expected_tools, and _trace_dict. Because those fields were not part of the cache key, changing only the execution state between runs could still return cached scores from a previous run.
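
The collision can be reproduced with a small stand-alone sketch. It uses plain dicts rather than `LLMTestCase`, and `text_only_key` and `full_key` are hypothetical helpers that only mimic the old and new key shapes; the field names are the ones listed in this description.

```python
import json

TEXT_FIELDS = (
    "input",
    "actual_output",
    "expected_output",
    "context",
    "retrieval_context",
)
EXECUTION_FIELDS = ("tools_called", "expected_tools", "_trace_dict")


def text_only_key(case: dict) -> str:
    # Mimics the old behaviour: only text fields contribute to the key.
    return json.dumps({f: case.get(f) for f in TEXT_FIELDS}, sort_keys=True)


def full_key(case: dict) -> str:
    # Mimics the new behaviour: execution-state fields contribute as well.
    fields = TEXT_FIELDS + EXECUTION_FIELDS
    return json.dumps({f: case.get(f) for f in fields}, sort_keys=True)


run_1 = {
    "input": "Weather in Paris?",
    "actual_output": "Sunny",
    "tools_called": [{"name": "get_weather"}],
}
# Same text, different tool call.
run_2 = {**run_1, "tools_called": [{"name": "web_search"}]}

# Text-only keys collide, so run_2 would reuse run_1's cached score.
assert text_only_key(run_1) == text_only_key(run_2)
# Including execution state separates the two runs.
assert full_key(run_1) != full_key(run_2)
```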

This updates the cache key format, so existing cache entries will miss on the first run after upgrade. That is intentional: a one-time re-evaluation is safer than serving stale results.

Tests

add tests/test_core/test_run/test_cache_keys.py
cover tool differentiation, trace differentiation, identity, and text-only cases
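
The four cases might look roughly like the following pytest-style sketch. `make_key` is a hypothetical stand-in so the example runs on its own; the actual tests in `test_cache_keys.py` presumably exercise `CachedTestCase.create_cache_key()` instead.

```python
import json


def make_key(case: dict) -> str:
    # Hypothetical stand-in for the real key builder.
    # sort_keys makes dict insertion order irrelevant.
    return json.dumps(case, sort_keys=True)


BASE = {"input": "Book a flight", "actual_output": "Booked"}


def test_tool_differentiation():
    a = {**BASE, "tools_called": [{"name": "search_flights"}]}
    b = {**BASE, "tools_called": [{"name": "book_flight"}]}
    assert make_key(a) != make_key(b)


def test_trace_differentiation():
    a = {**BASE, "_trace_dict": {"name": "agent", "children": []}}
    b = {**BASE, "_trace_dict": {"name": "agent", "children": [{"name": "llm"}]}}
    assert make_key(a) != make_key(b)


def test_identity():
    # Same content built in a different field order must give the same key.
    a = {**BASE, "tools_called": [{"name": "book_flight"}]}
    b = {"tools_called": [{"name": "book_flight"}], **BASE}
    assert make_key(a) == make_key(b)


def test_text_only():
    # Text-only cases still match themselves and differ when the text differs.
    assert make_key(dict(BASE)) == make_key(dict(BASE))
    assert make_key(BASE) != make_key({**BASE, "actual_output": "Failed"})
```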

vercel · 2026-03-19T00:35:59Z

@aerosta is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

fix: include tool and trace state in evaluation cache keys

5205257

aerosta wants to merge 1 commit into confident-ai:main from aerosta:fix/cache-key-include-execution-state
