Releases: Corbell-AI/evalmonkey

v1.0.1

09 May 06:22
d91c46d

What's Changed

  • feat: add framework adapters for LangGraph, LlamaIndex, and PydanticAI by @himmi-01 in #4

Full Changelog: v1.0.0...v1.0.1

v1.0.0

06 May 05:56
1ad2b0e

What's Changed

  • feat: evalmonkey web ui and benchmark stability fixes by @himmi-01 in #3

Full Changelog: v0.1.3...v1.0.0

v0.1.3

03 May 20:14
82a5925

  • implement automated eval asset generation and improvement prompts for failed benchmark traces (ea99606)
  • optimize benchmark loading by enabling streaming mode and add testing/inspection utilities (88a8fb5)
  • strip markdown code fences from LLM judge responses (46be866)
  • remove WebArena benchmark (179d5ab)
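
The fence-stripping change in this release is the kind of fix that is easier to picture with code. What follows is a minimal illustrative sketch, not EvalMonkey's actual implementation: the function name `strip_code_fences`, the regular expression, and the sample judge response are all assumptions made for the example. It shows why a judge reply wrapped in a Markdown fence has to be unwrapped before it can be parsed as JSON.

```python
import json
import re

# Hypothetical helper, not taken from the EvalMonkey codebase.
# Matches one surrounding fence: an opening ``` with an optional language
# tag (e.g. ```json), the body, and a closing ```.
_FENCE_RE = re.compile(r"^\s*```[\w-]*\s*\n(.*?)\n?\s*```\s*$", re.DOTALL)


def strip_code_fences(text: str) -> str:
    """Return text without a surrounding Markdown code fence, if present."""
    match = _FENCE_RE.match(text)
    # Unfenced input is returned as-is, apart from surrounding whitespace.
    return match.group(1).strip() if match else text.strip()


# Assumed example of a judge response that wraps its JSON verdict in a fence.
raw_response = '```json\n{"score": 1}\n```'
verdict = json.loads(strip_code_fences(raw_response))
```

Passing `raw_response` to `json.loads` directly would raise `json.JSONDecodeError`, because the fence markers are not valid JSON. Unfenced responses pass through the helper unchanged.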

v0.1.2

26 Apr 01:15

v0.1.1

21 Apr 02:08

Let users bring their own benchmark dataset and run all chaos tests via a single CLI command (88fecf8)

v0.1.0

18 Apr 22:52

EvalMonkey's first release with MCP server support.

  • 8 agent frameworks supported
  • 10 off-the-shelf benchmarks supported
  • 7 chaos scenarios supported
  • Historical benchmark data in the TUI