Conversation

@yuval-qf
Collaborator

@yuval-qf yuval-qf commented Jun 27, 2025

Summary by CodeRabbit

  • Refactor

    • Improved the performance of policy scenario evaluations by enabling concurrent processing, resulting in faster update streaming and more responsive status updates during evaluations.
  • Chores

    • Updated configuration for pre-commit checks to improve compatibility with argument parsing.

@coderabbitai
Contributor

coderabbitai bot commented Jun 27, 2025

Walkthrough

The updates refactor the policy scenario evaluation process to run each scenario concurrently using asynchronous tasks and a shared queue for status updates, replacing the previous sequential approach. Additionally, the mypy pre-commit hook configuration is adjusted to split the config file argument into two separate entries.

Changes

| File(s) | Change Summary |
| --- | --- |
| .pre-commit-config.yaml | Modified mypy hook arguments to split --config-file .mypy.ini into two separate arguments. |
| rogue/services/scenario_evaluation_service.py | Refactored scenario evaluation to run scenarios concurrently via async tasks and a queue; added _evaluate_single_policy_scenario method and updated _evaluate_policy_scenarios method signature and implementation. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ScenarioEvaluationService
    participant asyncio.Queue
    participant arun_evaluator_agent

    Client->>ScenarioEvaluationService: _evaluate_policy_scenarios()
    loop For each scenario
        ScenarioEvaluationService->>ScenarioEvaluationService: _evaluate_single_policy_scenario(scenario, queue) (async task)
        ScenarioEvaluationService->>asyncio.Queue: Put status update ("in_progress")
        ScenarioEvaluationService->>arun_evaluator_agent: Evaluate scenario (async)
        arun_evaluator_agent-->>ScenarioEvaluationService: Yield updates
        ScenarioEvaluationService->>asyncio.Queue: Put update
        ScenarioEvaluationService->>asyncio.Queue: Put result
        alt Exception
            ScenarioEvaluationService->>asyncio.Queue: Put error status update
        end
    end
    ScenarioEvaluationService->>Client: Yield updates from queue_watcher
    ScenarioEvaluationService->>ScenarioEvaluationService: Await all tasks (asyncio.gather)

Poem

In the warren of code, tasks now hop with delight,
Scenarios run side by side, not just through the night.
Updates queue up, swift as a hare,
While mypy’s config splits with care.
With async and order, our work multiplies—
A rabbit’s proud of this clever surprise!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite

📥 Commits

Reviewing files that changed from the base of the PR and between de71cf7 and 2231b6c.

📒 Files selected for processing (2)
  • .pre-commit-config.yaml (1 hunks)
  • rogue/services/scenario_evaluation_service.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
PR: qualifire-dev/qualifire#0
File: .cursor/rules/self_improve.mdc:0-0
Timestamp: 2025-06-23T12:36:40.835Z
Learning: Update rules after major refactors to ensure they remain relevant and accurate.
Learnt from: CR
PR: qualifire-dev/qualifire#0
File: .cursor/rules/dev_workflow.mdc:0-0
Timestamp: 2025-06-23T12:36:21.756Z
Learning: After completing a subtask, review all code changes and chat history to identify new or modified code patterns, conventions, or best practices, and update or create rules accordingly.
🔇 Additional comments (3)
.pre-commit-config.yaml (1)

21-21: LGTM! Correct argument format for mypy.

Splitting the config file argument into separate list elements is the proper way to pass command-line arguments in pre-commit hooks.
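A minimal Python sketch of why the split matters. It assumes each entry in the hook's `args` list reaches the tool as one argv token, and it uses a hypothetical `argparse` parser as a stand-in; this is not mypy's actual command-line parser.

```python
import argparse

# Hypothetical stand-in for a CLI that accepts --config-file (not mypy's real parser).
parser = argparse.ArgumentParser()
parser.add_argument("--config-file")

# One list entry containing a space arrives as a single argv token,
# so it never matches the --config-file option.
joined, joined_extra = parser.parse_known_args(["--config-file .mypy.ini"])

# Two list entries arrive as two tokens: the flag, then its value.
split, split_extra = parser.parse_known_args(["--config-file", ".mypy.ini"])

print("joined:", joined.config_file, joined_extra)
print("split: ", split.config_file, split_extra)
```

In this sketch only the split form populates `config_file`; the joined form is left over as an unrecognized argument.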

rogue/services/scenario_evaluation_service.py (2)

37-65: Well-implemented async scenario evaluation with proper error handling.

The method correctly handles individual scenario evaluation with:

  • Status updates via the shared queue
  • Proper async iteration over evaluation results
  • Exception handling with contextual logging
  • Error status reporting on failures
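The per-scenario behavior listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `fake_evaluator`, `evaluate_single_scenario`, and the `(kind, payload)` update tuples are hypothetical names and shapes standing in for the real evaluator agent and status types.

```python
import asyncio
import logging
from typing import Any, AsyncIterator

logger = logging.getLogger(__name__)


async def fake_evaluator(scenario: str) -> AsyncIterator[tuple[str, Any]]:
    """Hypothetical stand-in for the evaluator agent's async generator."""
    yield "status", f"evaluating {scenario}"
    if scenario == "bad":
        raise RuntimeError("evaluator failed")
    yield "result", {"scenario": scenario, "passed": True}


async def evaluate_single_scenario(scenario: str, queue: asyncio.Queue) -> None:
    """Evaluate one scenario, pushing every update onto the shared queue."""
    await queue.put(("status", f"{scenario}: in_progress"))
    try:
        async for update in fake_evaluator(scenario):
            await queue.put(update)
    except Exception:
        # Log with the scenario as context, then report the failure
        # through the same queue so the consumer still sees it.
        logger.exception("Scenario evaluation failed: %s", scenario)
        await queue.put(("status", f"{scenario}: error"))


async def collect(scenario: str) -> list[tuple[str, Any]]:
    """Run one scenario and drain everything it queued."""
    queue: asyncio.Queue = asyncio.Queue()
    await evaluate_single_scenario(scenario, queue)
    return [queue.get_nowait() for _ in range(queue.qsize())]


ok_updates = asyncio.run(collect("good"))
failed_updates = asyncio.run(collect("bad"))
```

Catching the exception inside the task means one failing scenario reports an error status without cancelling the others.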

66-84: Excellent parallelization implementation with proper queue handling.

The refactoring successfully parallelizes scenario evaluation with:

  • Concurrent task execution for each scenario
  • Non-blocking queue monitoring with timeout
  • Correct handling of the race condition between task completion and queue emptiness
  • Proper task synchronization using asyncio.gather

This should significantly improve performance when evaluating multiple scenarios.
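A rough sketch of the fan-out and queue-watching pattern described in this comment, assuming a simple string update format. The names `worker` and `evaluate_all`, the delays, and the 0.1-second timeout are illustrative choices, not values taken from the PR.

```python
import asyncio
from typing import AsyncIterator


async def worker(name: str, delay: float, queue: asyncio.Queue) -> None:
    """Hypothetical scenario task: emits two updates onto the shared queue."""
    await queue.put(f"{name}: in_progress")
    await asyncio.sleep(delay)
    await queue.put(f"{name}: done")


async def evaluate_all(scenarios: dict[str, float]) -> AsyncIterator[str]:
    queue: asyncio.Queue = asyncio.Queue()
    tasks = [
        asyncio.create_task(worker(name, delay, queue))
        for name, delay in scenarios.items()
    ]
    # Stop only when every task has finished AND the queue is drained;
    # checking just one of the two could drop updates queued at the very end.
    while not (all(t.done() for t in tasks) and queue.empty()):
        try:
            # The timeout keeps the loop from blocking forever on an empty queue.
            yield await asyncio.wait_for(queue.get(), timeout=0.1)
        except asyncio.TimeoutError:
            continue
    # Surface any exception raised inside a task.
    await asyncio.gather(*tasks)


async def main() -> list[str]:
    return [u async for u in evaluate_all({"slow": 0.05, "fast": 0.01})]


updates = asyncio.run(main())
```

Updates from different scenarios interleave in the order they are queued, so a consumer that needs per-scenario grouping would have to tag each update with its scenario, as the strings do here.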


@yuval-qf yuval-qf closed this Jul 1, 2025
@yuval-qf yuval-qf deleted the parallelize-scenarios branch July 1, 2025 19:28
yuval-qf added a commit that referenced this pull request Aug 14, 2025
yuval-qf added a commit that referenced this pull request Aug 17, 2025
* wip on vibing a new architecture

* wip on vibing a new architecture

* wip on vibing a new architecture

* wip

* wip

* wip

* wip

* wip

* --wip-- [skip ci]

* fixed!!!!!

* fixed!!!!!

* --wip-- [skip ci]

* fixed!!!!!

* fixed!!!!!

* wip

* wip

* wip

* wip

* Update rogue/server/api/health.py

Co-authored-by: yuval-qf <[email protected]>

* wip

* wip

* wip

* wip

* wip

* remove the non functional UI

* getting started with the go tui

* --wip-- [skip ci]

* --wip-- [skip ci]

* --wip-- [skip ci]

* getting started with the go tui

* wip

* wip on tui

* wip on tui

* ci

* fix the tests

* ci fix

* ci fix

* ci fix

* ci fix

* wip

* fix the tests

* tui in another pr

* Update sdks/python/rogue_client/client.py

Co-authored-by: yuval-qf <[email protected]>

* wip

* Update rogue/evaluator_agent/run_evaluator_agent.py

Co-authored-by: drorIvry <[email protected]>

* Update rogue/evaluator_agent/run_evaluator_agent.py

Co-authored-by: drorIvry <[email protected]>

* CR - part 1

* Consolidate types

* interviewer cr

* wip

* wip

* Remove legacy

* Consolidate services

* Consolidate services

* fix business context hardcoded

* wip

* websocket manager singleton

* endpoint prefixes

* rename

* fix license in pyproject

* fix model/judge_llm confusion

* Rename rogue_client -> rogue_sdk

* install sdk and server in gh action

* install on system

* rename rogue_sdk directory

* rename rogue_sdk directory #2

* Use venv

* typo in __init__

* Consolidate __main__ files

* fix init

* fix main

* Consolidate more types

* Consolidate more types

* Fix tests

* replace judge_llm_model -> judge_llm

* replace judge_llm_model -> judge_llm

* Fix mypy

* Fix mypy

* Fix model validate

* Remove unused aliases

* Fix AgentConfig initialization

* Fix tests cicd

* Fix tests

* rabbit ci

* Typing

* better backoff

* Add task lock

* mypy fixes

* exception handling

* rabbit cr

* wip

* wip

* Remove refresh button

* gh action

---------

Co-authored-by: yuval-qf <[email protected]>
yuval-qf added a commit that referenced this pull request Oct 8, 2025
yuval-qf added a commit that referenced this pull request Oct 8, 2025
* Reduce rogue startup import time

* Reduce rogue startup import time #2

* Reduce rogue startup import time #3