Turn._mcp_interaction PrivateAttr always False — MultiTurnMCPUseMetric/MCPTaskCompletionMetric scores degrade to near-zero

## Summary

`Turn._mcp_interaction` is declared as a `PrivateAttr(default=False)` and is intended to be set to `True` in the `mode="before"` model validator when any of `mcp_tools_called`, `mcp_resources_called`, or `mcp_prompts_called` is set. Under Pydantic v2 the assignment has no effect: the flag stays `False` for every turn. As a result, `MultiTurnMCPUseMetric` and `MCPTaskCompletionMetric` produce severely degraded scores on all conversational test cases that use MCP tool calls.

**deepeval version:** 3.9.2  
**pydantic version:** 2.12.5

---

## Root Cause

In `deepeval/test_case/conversational_test_case.py`, the `mode="before"` validator sets:

```python
data["_mcp_interaction"] = True
```

This pattern relies on private attributes being populated from the constructor data dict, which reportedly worked under **Pydantic v1**. In **Pydantic v2**, `PrivateAttr` attributes are not model fields: they are excluded from `__init__` and from the validated field set, so an input key such as `_mcp_interaction` is silently dropped after the validator returns (the default `extra="ignore"` behavior). `__pydantic_private__` is then initialized from the declared defaults, giving `{"_mcp_interaction": False}` regardless of what the validator wrote.

You can verify:

```python
import mcp.types
from deepeval.test_case import MCPToolCall, Turn

result = mcp.types.CallToolResult(content=[], structuredContent={"result": {}}, isError=False)
t = Turn(
    role="assistant",
    content="Looking up...",
    mcp_tools_called=[MCPToolCall(name="lookup_legislator", args={}, result=result)],
)
print(t.__pydantic_private__)  # {'_mcp_interaction': False}
print(t._mcp_interaction)      # False — should be True
```
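The same behavior can be reproduced without deepeval or mcp installed. The sketch below uses a hypothetical stand-in model (`TurnSketch` and `flag_mcp_interaction` are illustrative names, not deepeval code) that copies only the `PrivateAttr` plus `mode="before"` pattern, assuming Pydantic v2:

```python
from typing import List, Optional

from pydantic import BaseModel, PrivateAttr, model_validator


class TurnSketch(BaseModel):
    """Hypothetical stand-in for deepeval's Turn; not the real class."""

    role: str
    content: str
    mcp_tools_called: Optional[List[str]] = None
    _mcp_interaction: bool = PrivateAttr(default=False)

    @model_validator(mode="before")
    @classmethod
    def flag_mcp_interaction(cls, data):
        # Same idea as the validator in the issue: write the flag into the
        # raw input dict before field validation runs.
        if isinstance(data, dict) and data.get("mcp_tools_called") is not None:
            data["_mcp_interaction"] = True
        return data


turn = TurnSketch(
    role="assistant",
    content="Looking up...",
    mcp_tools_called=["lookup_legislator"],
)
# The key written by the validator is not a field, so it never reaches the
# private attribute; the declared default is used instead.
print(turn._mcp_interaction)
print(turn.__pydantic_private__)
```

If this prints `False` for the flag, the problem is in the pattern itself rather than in anything specific to `MCPToolCall` or `CallToolResult`.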

---

## Impact

`MultiTurnMCPUseMetric._get_tasks()` and `MCPTaskCompletionMetric._get_tasks()` gate all tool-call rendering on `turn._mcp_interaction`:

```python
if turn._mcp_interaction:
    # render <Tool Called> block for the judge
else:
    new_task.steps_taken.append("Agent's response to user: \n" + turn.content)
```

Since `_mcp_interaction` is always `False`, every turn, including turns with `mcp_tools_called` set, falls into the `else` branch. The judge prompt therefore never contains a `<Tool Called>` section: for MCP turns the judge sees only `turn.content`, not the tool name, args, or result. With no visibility into which tools were invoked or what they returned, it scores these conversations far lower than it should.

Issue #2138 / PR #2141 patched a downstream `ZeroDivisionError` caused by this bug (empty task lists), but didn't address the root cause.

---

## Fix

Replace the `PrivateAttr` + broken validator with a `@property`. The value is fully derivable from existing fields — there's no reason to store it:

```python
# Remove this:
_mcp_interaction: bool = PrivateAttr(default=False)

# And the data["_mcp_interaction"] = True line in the validator.

# Add this:
@property
def _mcp_interaction(self) -> bool:
    return (
        self.mcp_tools_called is not None
        or self.mcp_resources_called is not None
        or self.mcp_prompts_called is not None
    )
```

Alternatively, change the validator to `mode="after"` so `self` is the live instance:

```python
@model_validator(mode="after")
def set_mcp_interaction(self):
    if (
        self.mcp_tools_called is not None
        or self.mcp_resources_called is not None
        or self.mcp_prompts_called is not None
    ):
        self._mcp_interaction = True
    return self
```

The `@property` approach is cleaner since it eliminates stored state entirely.
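To compare the two options side by side, here is a minimal sketch on hypothetical stand-in models (`PropertyTurn` and `AfterValidatorTurn` are illustrative names, reduced to one MCP field for brevity), assuming Pydantic v2. It is not a patch against deepeval itself:

```python
from typing import List, Optional

from pydantic import BaseModel, PrivateAttr, model_validator


class PropertyTurn(BaseModel):
    """Hypothetical stand-in using the @property approach."""

    content: str
    mcp_tools_called: Optional[List[str]] = None

    @property
    def _mcp_interaction(self) -> bool:
        # Derived on every access, so there is no stored state to go stale.
        return self.mcp_tools_called is not None


class AfterValidatorTurn(BaseModel):
    """Hypothetical stand-in using the mode="after" approach."""

    content: str
    mcp_tools_called: Optional[List[str]] = None
    _mcp_interaction: bool = PrivateAttr(default=False)

    @model_validator(mode="after")
    def set_mcp_interaction(self):
        # Runs on the constructed instance, after private attributes exist.
        if self.mcp_tools_called is not None:
            self._mcp_interaction = True
        return self


with_tools = {"content": "Looking up...", "mcp_tools_called": ["lookup_legislator"]}
prop_turn = PropertyTurn(**with_tools)
after_turn = AfterValidatorTurn(**with_tools)
plain_turn = PropertyTurn(content="Hello")

print(prop_turn._mcp_interaction, after_turn._mcp_interaction)
print(plain_turn._mcp_interaction)

# Mutating a field after construction: the property follows the change,
# while the stored flag is only computed when validation runs.
late_prop = PropertyTurn(content="Hello")
late_after = AfterValidatorTurn(content="Hello")
late_prop.mcp_tools_called = ["lookup_legislator"]
late_after.mcp_tools_called = ["lookup_legislator"]
print(late_prop._mcp_interaction, late_after._mcp_interaction)
```

The last comparison shows one practical difference: unless `validate_assignment` is enabled, the `mode="after"` flag reflects the fields only as they were at construction time.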

---

## Secondary Issue: `turn.content` Ignored on MCP Turns

A related gap in `_get_tasks()`: when `_mcp_interaction` is `True`, the method renders only the tool name, args, and `structuredContent` result — it ignores `turn.content` entirely. I'd like to use `turn.content` on MCP turns as a user-visible status message (e.g. "Looking up your address..."). This context is never surfaced to the judge, so the judge evaluates the conversation as if the user experienced a silent gap during tool execution. Including `turn.content` when non-empty would give the judge a more accurate picture of what the user actually saw.
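A rough sketch of the requested behavior, using plain dataclasses rather than deepeval's classes. Everything here is hypothetical: `ToolCallSketch`, `TurnSketch`, `render_steps`, and the exact text of the `<Tool Called>` block are placeholders, since the real rendering lives inside `_get_tasks()`:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ToolCallSketch:
    """Hypothetical stand-in for an MCP tool call record."""

    name: str
    args: Dict[str, Any]
    result: Any


@dataclass
class TurnSketch:
    """Hypothetical stand-in for a conversation turn."""

    content: str
    mcp_tools_called: Optional[List[ToolCallSketch]] = None


def render_steps(turn: TurnSketch) -> List[str]:
    """Render one turn into judge-visible steps, keeping turn.content."""
    steps: List[str] = []
    if turn.mcp_tools_called is not None:
        # Proposed change: surface the status message the user saw,
        # but only when it is non-empty.
        if turn.content:
            steps.append("Agent's response to user: \n" + turn.content)
        for call in turn.mcp_tools_called:
            # Placeholder format; the real block layout is deepeval's.
            steps.append(
                "<Tool Called>\n"
                f"Name: {call.name}\n"
                f"Args: {call.args}\n"
                f"Result: {call.result}\n"
                "</Tool Called>"
            )
    else:
        steps.append("Agent's response to user: \n" + turn.content)
    return steps


mcp_turn = TurnSketch(
    content="Looking up your address...",
    mcp_tools_called=[ToolCallSketch("lookup_legislator", {}, {"result": {}})],
)
steps = render_steps(mcp_turn)
silent_steps = render_steps(
    TurnSketch(content="", mcp_tools_called=[ToolCallSketch("lookup_legislator", {}, {})])
)
```

With this shape, an MCP turn that carries a status message yields two steps (the message, then the tool block), while an MCP turn with empty content yields only the tool block.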
