Distinguishing model failure vs task inadmissibility in evaluation #1089
Unanswered
finkeissen asked this question in Q&A
Replies: 0 comments
In some evaluation workflows, a task can become ill-posed rather than simply fail.
For example:
Do you distinguish between model failure and task inadmissibility during evaluation?
I’m curious whether this is tracked explicitly in Kiln workflows or treated as a regular failure.
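To make the distinction concrete, here is a minimal Python sketch of the kind of three-way outcome I have in mind. It does not use Kiln's API: `Outcome`, `EvalResult`, and `summarize` are hypothetical names for illustration only, and reporting the pass rate over admissible tasks alone is an assumption about how one might score, not a description of what Kiln does.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """Hypothetical three-way result for a single evaluated task."""
    PASS = "pass"
    MODEL_FAILURE = "model_failure"  # task was well-posed, model output was wrong
    INADMISSIBLE = "inadmissible"    # task itself was ill-posed; no verdict on the model


@dataclass
class EvalResult:
    task_id: str
    outcome: Outcome


def summarize(results):
    """Count outcomes and compute the pass rate over admissible tasks only."""
    counts = {o: 0 for o in Outcome}
    for r in results:
        counts[r.outcome] += 1
    # Inadmissible tasks are tracked separately and excluded from the denominator.
    admissible = counts[Outcome.PASS] + counts[Outcome.MODEL_FAILURE]
    pass_rate = counts[Outcome.PASS] / admissible if admissible else None
    return {"counts": counts, "admissible": admissible, "pass_rate": pass_rate}


results = [
    EvalResult("t1", Outcome.PASS),
    EvalResult("t2", Outcome.MODEL_FAILURE),
    EvalResult("t3", Outcome.INADMISSIBLE),
    EvalResult("t4", Outcome.PASS),
]
summary = summarize(results)
print(summary["counts"][Outcome.INADMISSIBLE], summary["pass_rate"])
```

Treated as a regular failure, `t3` would lower the model's score even though the task itself was the problem; tracked explicitly, it is counted on its own and left out of the pass rate.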