adding readme - evaluator #43189
Open
w-javed wants to merge 17 commits into main from new_evaluator_in_eval_catalog
b983843 adding readme (w-javed)
9a4b6ef adding readme (w-javed)
8ff4d7a Apply suggestion from @Copilot (w-javed)
dda0064 Apply suggestion from @Copilot (w-javed)
6101c9c Apply suggestion from @Copilot (w-javed)
97377a9 adding readme (w-javed)
5451b9d fixed (w-javed)
4fbb6e5 Merge branch 'new_evaluator_in_eval_catalog' of https://github.com/Az… (w-javed)
7b98558 fixed (w-javed)
3754b2f fixed (w-javed)
bad07d9 fixed (w-javed)
7c39d05 fixed (w-javed)
dbb5134 fixed (w-javed)
c49b7f6 fixed (w-javed)
a2a4507 fixed (w-javed)
6a9c774 linkfix (w-javed)
f50dc69 linkfix (w-javed)
113 changes: 113 additions & 0 deletions
sdk/evaluation/azure-ai-evaluation/samples/evaluator_catalog/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
# How to publish a new evaluator in the Evaluator Catalog

This guide helps our partners bring their evaluators into the Microsoft-provided Evaluator Catalog in the Next Gen UI.

## Context

We are building an Evaluator Catalog that stores Microsoft-provided built-in evaluators as well as customer-provided custom evaluators. The catalog supports versioning, so customers can maintain different versions of their custom evaluators.

Using this catalog, customers can publish their custom evaluators under a project. Post Ignite, we'll allow them to promote evaluators from projects to registries so that they can share evaluators among different projects.

This Evaluator Catalog is backed by the Generic Asset Service, which provides scalable, multi-region support for storing all your assets in CosmosDB.

## Types of Built-in Evaluators

We support 4 types of built-in evaluators:

1. Code Based - contains a Python file.
2. Code + Prompt Based - contains a Python file and a Prompty file.
3. Prompt Based - contains only a Prompty file.
4. Service Based - references an evaluator from the Evaluation SDK or the RAI Service.
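The four types above correspond to the `evaluatorSubType` values described in the spec.yaml section of this guide: types #1 and #2 use "code", type #3 uses "prompt", and type #4 uses "service". A minimal Python sketch of that mapping follows. The helper name `sub_type_for` and its arguments are hypothetical and are not part of any SDK.

```python
def sub_type_for(has_python: bool, has_prompty: bool, is_service: bool = False) -> str:
    """Return the evaluatorSubType for an evaluator (hypothetical helper).

    Mapping taken from this guide:
      Code Based and Code + Prompt Based -> "code"
      Prompt Based                       -> "prompt"
      Service Based                      -> "service"
    """
    if is_service:
        return "service"
    if has_python:
        # Covers both Code Based and Code + Prompt Based evaluators.
        return "code"
    if has_prompty:
        return "prompt"
    raise ValueError("An evaluator needs a Python file, a Prompty file, or a service reference")


if __name__ == "__main__":
    print(sub_type_for(has_python=True, has_prompty=True))
```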
## Step 1: Create the evaluator and run it locally

Create the built-in evaluator and use the azure-ai-evaluation SDK to run it locally.
The list of existing evaluators can be found [here](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators).
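As a rough illustration of Step 1, a code-based evaluator can be sketched as a plain Python callable class. Everything below is an assumption made for illustration: the class name `ExactMatchEvaluator`, the `threshold` parameter, and the output keys are hypothetical and are not taken from the existing built-in evaluators.

```python
class ExactMatchEvaluator:
    """Hypothetical code-based evaluator: 1.0 when the response matches the ground truth."""

    def __init__(self, threshold: float = 0.5):
        # Init parameter; in spec.yaml it would be described by initParameterSchema.
        self.threshold = threshold

    def __call__(self, *, response: str, ground_truth: str, **kwargs) -> dict:
        # Compare case-insensitively and ignore surrounding whitespace.
        matched = response.strip().lower() == ground_truth.strip().lower()
        score = 1.0 if matched else 0.0
        return {
            "exact_match": score,  # hypothetical metric name
            "exact_match_result": "pass" if score >= self.threshold else "fail",
        }


if __name__ == "__main__":
    # Local smoke test: call the evaluator directly on one row of data.
    evaluator = ExactMatchEvaluator(threshold=0.5)
    print(evaluator(response="Paris", ground_truth="paris"))
```

Calling the class directly is the quickest local check. A callable like this can usually also be passed to the `evaluate` function of the azure-ai-evaluation SDK through its `evaluators` argument; that call is left out here so the sketch has no dependencies.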
## Step 2: Create a PR

Add a new folder named after the evaluator.

Please include the following files:

* asset.yaml
* spec.yaml
* 'evaluator' folder. Please include the Python files and Prompty files in this folder.

Please look at the existing built-in evaluators for reference.
Location: [/assets/evaluators/builtin](https://msdata.visualstudio.com/Vienna/_git/azureml-asset?path=/assets/evaluators/builtin)

Sample PR: [pullrequest/1816050](https://msdata.visualstudio.com/Vienna/_git/azureml-asset/pullrequest/1816050?_a=files)

Please follow the directions given below.
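Before opening the PR, the folder layout from Step 2 can be checked with a few lines of Python. This is only a sketch under stated assumptions: the helper `missing_entries` and the folder name `my_evaluator` are hypothetical, and the check covers only the three entries this guide lists.

```python
import tempfile
from pathlib import Path
from typing import List

# Entries this guide asks for in the evaluator folder.
REQUIRED_FILES = ("asset.yaml", "spec.yaml")
REQUIRED_DIRS = ("evaluator",)


def missing_entries(evaluator_dir: Path) -> List[str]:
    """Return the required files and folders that are absent (hypothetical helper)."""
    missing = [name for name in REQUIRED_FILES if not (evaluator_dir / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (evaluator_dir / name).is_dir()]
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "my_evaluator"  # hypothetical evaluator name
        (folder / "evaluator").mkdir(parents=True)
        (folder / "spec.yaml").write_text('type: "evaluator"\n')
        # asset.yaml is deliberately left out to show a failing check.
        print(missing_entries(folder))
```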
## spec.yaml content

```yml
type: "evaluator"
name: "test.{name}"
version: 1
displayName: "{display name}"
description: "{description}"
evaluatorType: "builtin"

# evaluatorSubType: the type of evaluator.
#   For #1 and #2 type evaluators, use "code".
#   For #3 type evaluators, use "prompt".
#   For #4 type evaluators, use "service".
evaluatorSubType: "code"

# categories: an array of categories (Quality, Safety, Agents).
categories: ["Quality", "Safety"]

# initParameterSchema: the JSON schema (Draft 2020-12) for the evaluator's
# input parameters. It includes keys such as type, properties, and required.
initParameterSchema:
  type: "object"
  properties:
    threshold:
      type: "number"
      minimumValue: 0
      maximumValue: 1
      step: 0.1
  required: ["threshold"]

# dataMappingSchema: the JSON schema (Draft 2020-12) for the evaluator's
# input data. It includes keys such as type, properties, and required.
dataMappingSchema:
  type: "object"
  properties:
    ground_truth:
      type: "string"
    response:
      type: "string"
  required: ["ground_truth", "response"]

# outputSchema: the list of output metrics produced by this evaluator.
outputSchema:
  bleu:
    type: "continuous"
    desirable_direction: "increase"
    min_value: 0
    max_value: 1

path: ./evaluator
```
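To make the role of `initParameterSchema` concrete, the sketch below checks a set of init parameters against the `threshold` example from the spec. It is an illustration only, not how the catalog validates specs: the function `check_init_params` is hypothetical, and it reads just the `required`, `type`, `minimumValue`, and `maximumValue` keys shown in this guide.

```python
from typing import List

# Mirrors the initParameterSchema example from spec.yaml in this guide.
INIT_PARAMETER_SCHEMA = {
    "type": "object",
    "properties": {
        "threshold": {"type": "number", "minimumValue": 0, "maximumValue": 1, "step": 0.1},
    },
    "required": ["threshold"],
}


def check_init_params(schema: dict, params: dict) -> List[str]:
    """Return a list of problems found in params (hypothetical helper)."""
    problems = []
    for name in schema.get("required", []):
        if name not in params:
            problems.append(f"missing required parameter: {name}")
    for name, value in params.items():
        rules = schema.get("properties", {}).get(name)
        if rules is None:
            problems.append(f"unknown parameter: {name}")
            continue
        if rules.get("type") == "number":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name} must be a number")
                continue
            low, high = rules.get("minimumValue"), rules.get("maximumValue")
            if low is not None and value < low:
                problems.append(f"{name} is below minimumValue {low}")
            if high is not None and value > high:
                problems.append(f"{name} is above maximumValue {high}")
    return problems


if __name__ == "__main__":
    print(check_init_params(INIT_PARAMETER_SCHEMA, {"threshold": 0.7}))
    print(check_init_params(INIT_PARAMETER_SCHEMA, {"threshold": 1.5}))
```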
## Step 3: Publish the evaluator

Once the PR is merged, the Evaluation Team can kick off the CI pipeline to publish the evaluator in the Evaluator Catalog.
This is done in 2 steps.

In Step 1, the new evaluator is published in the azureml-dev registry so that it can be tested in the INT environment. Once all looks good, Step 2 is performed.

In Step 2, the new evaluator is published in the azure-ml registry (for Production).

## Step 4: Verify the evaluator

Now, use the Evaluators CRUD APIs to view the evaluator in the GET /evaluator list.

Use the following links:

INT:
PROD: