
Handle errors more gracefully in pydantic-evals #2612

@dmontagu

Description


Right now, if a task run raises an error, it fails the whole `dataset.evaluate` run. We should gracefully handle both task errors and evaluator errors, and we should have built-in support for retrying tasks and evaluators (e.g. for things like LLMJudge, where you might hit an intermittent failure).
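As a rough illustration of the retry behavior being asked for, here is a minimal sketch of wrapping an async task so that intermittent failures are retried with exponential backoff instead of aborting the whole run. The `with_retries` helper is hypothetical, not part of the pydantic-evals API:

```python
import asyncio
import random


def with_retries(task, *, max_attempts=3, base_delay=0.1):
    """Hypothetical helper: retry an async task on failure with
    exponential backoff, so one flaky call (e.g. an LLMJudge
    evaluator) doesn't fail the whole evaluation run."""

    async def wrapped(inputs):
        for attempt in range(1, max_attempts + 1):
            try:
                return await task(inputs)
            except Exception:
                if attempt == max_attempts:
                    raise  # out of retries: surface the error
                # exponential backoff with a little jitter
                delay = base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5)
                await asyncio.sleep(delay)

    return wrapped


# Usage: a task that fails twice, then succeeds on the third attempt.
calls = {"n": 0}


async def flaky_task(x):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("intermittent failure")
    return x * 2


result = asyncio.run(with_retries(flaky_task)(21))
print(result, calls["n"])  # the third attempt succeeds
```

A built-in version would presumably also record the failure in the evaluation report (rather than only retrying), so that persistent errors show up per-case instead of crashing the run.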

This will be closed by #2295 but that PR may introduce breaking changes, so I want to get it merged before V1.

