Reasoning support for evaluators #42482

nagkumar91 · 2025-08-12T15:40:14Z

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

The pull request does not introduce breaking changes.
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.

Copilot

Pull Request Overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_legacy/prompty/_prompty.py

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_legacy/_adapters/_flows.py

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py

...valuation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py

sdk/evaluation/azure-ai-evaluation/tests/unittests/test_reasoning_model_plumbing.py

Copilot

Pull Request Overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_legacy/prompty/_prompty.py

...valuation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_batch_run/code_client.py

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_legacy/_adapters/_flows.py

- Simplify client selection: default code_client; support _use_pf_client/_use_run_submitter_client with conflict check.
- Groundedness: always pass is_reasoning_model to AsyncPrompty when switching templates.
- Remove stray debug prints in CodeClient.get_metrics.
- Tidy imports in groundedness evaluator (separate os/logging).
- Reasoning model params: robust dict handling for parameters in AsyncPrompty to avoid dict() pitfalls.
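The reasoning-model parameter handling described in the last item, together with the commit notes about coalescing None/falsy and non-dict values to {} and removing temperature/max_tokens, might look roughly like the sketch below. It is an illustration only, not the code from this PR: the function names `coalesce_parameters` and `sanitize_for_reasoning` and the constant `_UNSUPPORTED_FOR_REASONING` are hypothetical, and the list of dropped keys is assumed from the commit messages.

```python
from collections.abc import Mapping
from typing import Any, Dict

# Assumption: the keys removed for reasoning models, taken from the commit
# message "reasoning models removing temperature/max_tokens".
_UNSUPPORTED_FOR_REASONING = ("temperature", "max_tokens")


def coalesce_parameters(parameters: Any) -> Dict[str, Any]:
    """Return a fresh dict; None, falsy values, and non-mappings become {}.

    Calling dict() directly is fragile here: dict(None) raises TypeError and
    dict("ab") raises ValueError, so the input type is checked first.
    """
    if not parameters or not isinstance(parameters, Mapping):
        return {}
    return dict(parameters)


def sanitize_for_reasoning(parameters: Any, is_reasoning_model: bool) -> Dict[str, Any]:
    """Hypothetical helper: drop sampling keys when targeting a reasoning model."""
    params = coalesce_parameters(parameters)
    if is_reasoning_model:
        for key in _UNSUPPORTED_FOR_REASONING:
            params.pop(key, None)  # tolerate keys that are already absent
    return params


configured = {"temperature": 0.0, "max_tokens": 800, "top_p": 1.0}
print(sanitize_for_reasoning(configured, is_reasoning_model=True))
print(sanitize_for_reasoning(None, is_reasoning_model=True))
```

Because `coalesce_parameters` copies its input, the caller's configuration dict is left unmodified, which matters if the same configuration is reused for a non-reasoning deployment.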

Copilot

Pull Request Overview

Copilot reviewed 22 out of 23 changed files in this pull request and generated 1 comment.

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py

Nagkumar Arkalgud and others added 30 commits May 28, 2025 11:11

Prepare evals SDK Release

4318329

Fix bug

192b980

Fix for ADV_CONV for FDP projects

758adb4

Update release date

de09fd1

Merge branch 'main' into main

ef60fe6

Merge branch 'Azure:main' into main

8ca51d0

Merge branch 'Azure:main' into main

98bfc3a

Merge branch 'Azure:main' into main

a5f32e8

Merge branch 'Azure:main' into main

5fd88b6

Merge branch 'Azure:main' into main

51f2b44

Merge branch 'Azure:main' into main

a5be8b5

Merge branch 'Azure:main' into main

75965b7

Merge branch 'Azure:main' into main

d0c5e53

Merge branch 'Azure:main' into main

b790276

Merge branch 'Azure:main' into main

d5ca243

re-add pyrit to matrix

8d62e36

Change grader ids

59a70f2

Merge branch 'Azure:main' into main

4d146d7

Update unit test

f7a4c83

replace all old grader IDs in tests

79e3a40

Merge branch 'main' into main

588cbec

Update platform-matrix.json

7514472

Add pyrit and not remove the other one

Update test to ensure everything is mocked

28b2513

tox/black fixes

8603e0e

Skip that test with issues

895f226

Merge branch 'Azure:main' into main

b4b2daf

update grader ID according to API View feedback

023f07f

Update test

45b5f5d

remove string check for grader ID

1ccb4db

Merge branch 'Azure:main' into main

6fd9aa5

Nagkumar Arkalgud and others added 3 commits September 22, 2025 16:55

evaluation(evaluate): revert flag renames and defensive default; keep evaluate_kwargs pop scope fix

72388cc

Prefer SDK prompty for reasoning; PF wrapper; fix param sanitization; docstring; tests; add AZEVAL_USE_PROMPTFLOW override.

f403b99

Merge branch 'Azure:main' into diff-20250811-171736

2dde27d

nagkumar91 requested a review from singankit September 23, 2025 17:46

Nagkumar Arkalgud and others added 4 commits September 23, 2025 14:57

prompty: remove AZEVAL_USE_LEGACY_PROMPTY, unify selection via kwargs; improve reasoning error hints; add tests

4581b0e

Delete sdk/evaluation/azure-ai-evaluation/samples/aoai_score_model_grader_order_debug_sample.py

a8869e5

Remove tracked .log files and samples directory

cea9be1

Restore samples directory from origin/main

ed1f9f7

nagkumar91 requested a review from Copilot September 24, 2025 15:15

Copilot AI reviewed Sep 24, 2025

Nagkumar Arkalgud added 3 commits September 24, 2025 08:54

Prompty: robust parameters handling for reasoning models (coalesce None/falsy and non-dict to {})

4bfd6e9

Evaluate: rename get_client_type param to kwargs for clarity

c100fc8

Tests: clarify comments about reasoning models removing temperature/max_tokens; only extra_headers added

91567e2

nagkumar91 requested a review from Copilot September 25, 2025 15:09

Copilot AI reviewed Sep 25, 2025

Nagkumar Arkalgud added 3 commits September 25, 2025 08:47

Apply evaluation refactors and cleanup into diff branch

40fb39c

Revert client selection to legacy tri-state behavior to satisfy tests; keep other refactors intact.

2f2ceec

nagkumar91 requested a review from Copilot September 25, 2025 19:03

Copilot AI reviewed Sep 25, 2025

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py (review comment resolved)

nagkumar91 added 2 commits September 25, 2025 13:30

Delete sdk/evaluation/azure-ai-evaluation/tests/2025_09_25__08_49.log

3115a75

Delete sdk/evaluation/azure-ai-evaluation/samples/.gitignore

4ff307d

luigiw approved these changes Sep 29, 2025

nagkumar91 added 2 commits September 30, 2025 06:36

lint fixes

6363cc8

skip until new way of passing credentials is supported

1b613f9

nagkumar91 enabled auto-merge (squash) September 30, 2025 14:01

slister1001 approved these changes Sep 30, 2025

nagkumar91 disabled auto-merge September 30, 2025 14:34

Fix issue

d9845d4

nagkumar91 closed this Oct 1, 2025

4 participants
