Handle errors more gracefully in pydantic-evals

### Description

Right now, if a single task run raises an error, the whole `dataset.evaluate` run fails. We should handle both task errors and evaluator errors gracefully, and should have built-in support for retrying tasks and evaluators (e.g. for evaluators like `LLMJudge`, where you might hit an intermittent failure).
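To make the desired behavior concrete, here is a rough sketch of per-case error capture with retries. It is not the pydantic-evals API and not necessarily what the linked PR does: `CaseResult`, `run_case_with_retries`, `evaluate_all`, and `demo_task` are hypothetical names made up for illustration, and it only uses the standard library.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List, Optional, Tuple


@dataclass
class CaseResult:
    """Hypothetical result type: holds either an output or a captured error."""
    name: str
    output: Any = None
    error: Optional[str] = None
    attempts: int = 0


async def run_case_with_retries(
    name: str,
    task: Callable[[Any], Awaitable[Any]],
    inputs: Any,
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> CaseResult:
    """Run one task, retrying on any exception, and never raise to the caller."""
    last_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        try:
            output = await task(inputs)
            return CaseResult(name=name, output=output, attempts=attempt)
        except Exception as exc:  # broad on purpose: one bad case must not abort the run
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt < max_attempts:
                # Exponential backoff between attempts (no delay when base_delay is 0).
                await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    return CaseResult(name=name, error=last_error, attempts=max_attempts)


async def evaluate_all(
    task: Callable[[Any], Awaitable[Any]],
    cases: List[Tuple[str, Any]],
    max_attempts: int = 3,
) -> List[CaseResult]:
    """Run every case concurrently; failures are recorded per case, not raised."""
    return await asyncio.gather(
        *(run_case_with_retries(name, task, inputs, max_attempts) for name, inputs in cases)
    )


# Demo task (an assumption for illustration): "flaky" fails once, "bad" always fails.
_calls = {}


async def demo_task(text: str) -> str:
    _calls[text] = _calls.get(text, 0) + 1
    if text == "bad":
        raise ValueError("always fails")
    if text == "flaky" and _calls[text] < 2:
        raise TimeoutError("intermittent failure")
    return text.upper()


results = asyncio.run(
    evaluate_all(demo_task, [("ok", "ok"), ("flaky", "flaky"), ("bad", "bad")])
)
for result in results:
    print(result)
```

With something along these lines, the intermittent failure is absorbed by a retry, and the permanently failing case shows up as a recorded error next to the successful results instead of aborting the whole run. The same wrapper could in principle be applied to evaluator calls.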

This will be closed by https://github.com/pydantic/pydantic-ai/pull/2295, but that PR may introduce breaking changes, so I want to get it merged before V1.

### References

_No response_

Handle errors more gracefully in pydantic-evals #2612