Parallelize scenarios #2

yuval-qf · 2025-06-27T12:18:25Z

Summary by CodeRabbit

Refactor
- Improved the performance of policy scenario evaluations by enabling concurrent processing, resulting in faster update streaming and more responsive status updates during evaluations.
Chores
- Updated configuration for pre-commit checks to improve compatibility with argument parsing.

coderabbitai · 2025-06-27T12:18:45Z

Walkthrough

The updates refactor the policy scenario evaluation process to run each scenario concurrently using asynchronous tasks and a shared queue for status updates, replacing the previous sequential approach. Additionally, the mypy pre-commit hook configuration is adjusted to split the config file argument into two separate entries.
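To make the sequential-versus-concurrent difference described in the walkthrough concrete, here is a minimal sketch. It is not the code from this PR: `evaluate`, `run_sequentially`, `run_concurrently`, and `timed` are hypothetical names, and the 0.1 s sleep stands in for whatever I/O-bound waiting a real scenario evaluation does.

```python
import asyncio
import time


async def evaluate(scenario: str) -> str:
    """Hypothetical scenario evaluation that spends its time waiting on I/O."""
    await asyncio.sleep(0.1)
    return f"{scenario}: ok"


async def run_sequentially(scenarios: list[str]) -> list[str]:
    # Each scenario starts only after the previous one has finished.
    return [await evaluate(s) for s in scenarios]


async def run_concurrently(scenarios: list[str]) -> list[str]:
    # gather schedules every coroutine at once and preserves input order.
    return list(await asyncio.gather(*(evaluate(s) for s in scenarios)))


async def timed(runner, scenarios: list[str]) -> tuple[list[str], float]:
    """Run `runner` and return its results together with the elapsed seconds."""
    start = time.perf_counter()
    results = await runner(scenarios)
    return results, time.perf_counter() - start


scenarios = ["a", "b", "c"]
seq_results, seq_elapsed = asyncio.run(timed(run_sequentially, scenarios))
con_results, con_elapsed = asyncio.run(timed(run_concurrently, scenarios))
print(f"sequential: {seq_elapsed:.2f}s, concurrent: {con_elapsed:.2f}s")
```

Both runners return the same results in the same order. Only the wall-clock time differs, because the concurrent version overlaps the waiting.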

Changes

| File(s) | Change Summary |
|---|---|
| `.pre-commit-config.yaml` | Modified `mypy` hook arguments to split `--config-file .mypy.ini` into two separate arguments. |
| `rogue/services/scenario_evaluation_service.py` | Refactored scenario evaluation to run scenarios concurrently via async tasks and a queue; added `_evaluate_single_policy_scenario` method and updated `_evaluate_policy_scenarios` method signature and implementation. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ScenarioEvaluationService
    participant asyncio.Queue
    participant arun_evaluator_agent

    Client->>ScenarioEvaluationService: _evaluate_policy_scenarios()
    loop For each scenario
        ScenarioEvaluationService->>ScenarioEvaluationService: _evaluate_single_policy_scenario(scenario, queue) (async task)
        ScenarioEvaluationService->>asyncio.Queue: Put status update ("in_progress")
        ScenarioEvaluationService->>arun_evaluator_agent: Evaluate scenario (async)
        arun_evaluator_agent-->>ScenarioEvaluationService: Yield updates
        ScenarioEvaluationService->>asyncio.Queue: Put update
        ScenarioEvaluationService->>asyncio.Queue: Put result
        alt Exception
            ScenarioEvaluationService->>asyncio.Queue: Put error status update
        end
    end
    ScenarioEvaluationService->>Client: Yield updates from queue_watcher
    ScenarioEvaluationService->>ScenarioEvaluationService: Await all tasks (asyncio.gather)

Poem

In the warren of code, tasks now hop with delight,
Scenarios run side by side, not just through the night.
Updates queue up, swift as a hare,
While mypy’s config splits with care.
With async and order, our work multiplies—
A rabbit’s proud of this clever surprise!
🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite

📥 Commits

Reviewing files that changed from the base of the PR and between de71cf7 and 2231b6c.

📒 Files selected for processing (2)

.pre-commit-config.yaml (1 hunks)
rogue/services/scenario_evaluation_service.py (2 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
PR: qualifire-dev/qualifire#0
File: .cursor/rules/self_improve.mdc:0-0
Timestamp: 2025-06-23T12:36:40.835Z
Learning: Update rules after major refactors to ensure they remain relevant and accurate.

Learnt from: CR
PR: qualifire-dev/qualifire#0
File: .cursor/rules/dev_workflow.mdc:0-0
Timestamp: 2025-06-23T12:36:21.756Z
Learning: After completing a subtask, review all code changes and chat history to identify new or modified code patterns, conventions, or best practices, and update or create rules accordingly.

🔇 Additional comments (3)

.pre-commit-config.yaml (1)

21-21: LGTM! Correct argument format for mypy.

Splitting the config file argument into separate list elements is the proper way to pass command-line arguments in pre-commit hooks.
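A small sketch of why the split matters, assuming the hook runner forwards each list entry as exactly one command-line argument with no shell word splitting. `argv_seen_by_child` is a hypothetical helper that launches a child Python process and reports the arguments it received; it does not invoke mypy or pre-commit.

```python
import json
import subprocess
import sys


def argv_seen_by_child(args: list[str]) -> list[str]:
    """Run a child Python process with `args` and return the argv it received."""
    child_code = "import json, sys; print(json.dumps(sys.argv[1:]))"
    completed = subprocess.run(
        [sys.executable, "-c", child_code, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# One list element containing a space arrives as ONE argument.
joined = argv_seen_by_child(["--config-file .mypy.ini"])
# Two list elements arrive as the flag followed by its value.
split = argv_seen_by_child(["--config-file", ".mypy.ini"])

print(joined)
print(split)
```

In the joined form the program sees a single token that contains a space, so its argument parser cannot match it to the flag and its value as two separate tokens.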

rogue/services/scenario_evaluation_service.py (2)

37-65: Well-implemented async scenario evaluation with proper error handling.

The method correctly handles individual scenario evaluation with:

- Status updates via the shared queue
- Proper async iteration over evaluation results
- Exception handling with contextual logging
- Error status reporting on failures
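The four points above can be sketched roughly as follows. This is an illustration under assumptions, not the method from the PR: `fake_evaluator` is a hypothetical stand-in for the evaluator agent, `evaluate_single_scenario` is a hypothetical name, and the `(kind, message)` tuples are an invented update format.

```python
import asyncio
import logging
from typing import AsyncIterator

logger = logging.getLogger(__name__)


async def fake_evaluator(scenario: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for the evaluator agent: yields updates, may fail."""
    yield f"{scenario}: started"
    await asyncio.sleep(0)
    if scenario == "bad":
        raise RuntimeError("evaluator crashed")
    yield f"{scenario}: passed"


async def evaluate_single_scenario(scenario: str, queue: asyncio.Queue) -> None:
    """Evaluate one scenario, reporting progress and failures via the queue."""
    await queue.put(("status", f"{scenario}: in_progress"))
    try:
        # Forward every update the evaluator yields to the shared queue.
        async for update in fake_evaluator(scenario):
            await queue.put(("update", update))
    except Exception as exc:
        # Log with context, then report the failure instead of letting it
        # escape the task and go unnoticed by the consumer.
        logger.exception("Scenario %r failed", scenario)
        await queue.put(("status", f"{scenario}: error ({exc})"))


async def demo() -> list[tuple[str, str]]:
    queue: asyncio.Queue = asyncio.Queue()
    await evaluate_single_scenario("good", queue)
    await evaluate_single_scenario("bad", queue)
    events = []
    while not queue.empty():
        events.append(queue.get_nowait())
    return events


events = asyncio.run(demo())
for event in events:
    print(event)
```

Because the failure is converted into a queue message, whoever consumes the queue sees an error status for the failing scenario while the other scenario's updates are unaffected.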

66-84: Excellent parallelization implementation with proper queue handling.

The refactoring successfully parallelizes scenario evaluation with:

- Concurrent task execution for each scenario
- Non-blocking queue monitoring with timeout
- Correct handling of the race condition between task completion and queue emptiness
- Proper task synchronization using asyncio.gather

This should significantly improve performance when evaluating multiple scenarios.
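One way the queue-watching loop described above might look, as a sketch only. `worker` and `stream_updates` are hypothetical names, the 0.05 s timeout is an arbitrary choice, and the real service may structure this differently.

```python
import asyncio
from typing import AsyncIterator


async def worker(name: str, delay: float, queue: asyncio.Queue) -> None:
    """Hypothetical scenario task: waits, then reports a single result."""
    await asyncio.sleep(delay)
    await queue.put(f"{name}: done")


async def stream_updates(scenarios: dict[str, float]) -> AsyncIterator[str]:
    """Run all scenarios concurrently and yield their updates as they arrive."""
    queue: asyncio.Queue = asyncio.Queue()
    tasks = [
        asyncio.create_task(worker(name, delay, queue))
        for name, delay in scenarios.items()
    ]
    while True:
        try:
            # Short timeout so the loop can re-check whether tasks are finished.
            yield await asyncio.wait_for(queue.get(), timeout=0.05)
        except asyncio.TimeoutError:
            # Stop only when every task is done AND nothing is left queued.
            # Checking just one of the two could drop a final update.
            if all(task.done() for task in tasks) and queue.empty():
                break
    # Surface any exception raised inside a task.
    await asyncio.gather(*tasks)


async def demo() -> list[str]:
    return [u async for u in stream_updates({"slow": 0.2, "fast": 0.01})]


updates = asyncio.run(demo())
print(updates)
```

The faster scenario is reported first even though it was listed second, which shows that updates stream in completion order rather than submission order.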

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve to resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

* wip on vibing a new architecture
* wip on vibing a new architecture
* wip on vibing a new architecture
* wip
* wip
* wip
* wip
* wip
* --wip-- [skip ci]
* fixed!!!!!
* fixed!!!!!
* --wip-- [skip ci]
* fixed!!!!!
* fixed!!!!!
* wip
* wip
* wip
* wip
* Update rogue/server/api/health.py Co-authored-by: yuval-qf <[email protected]>
* wip
* wip
* wip
* wip
* wip
* remove the non functional UI
* getting started with the go tui
* --wip-- [skip ci]
* --wip-- [skip ci]
* --wip-- [skip ci]
* getting started with the go tui
* wip
* wip on tui
* wip on tui
* ci
* fix the tests
* ci fix
* ci fix
* ci fix
* ci fix
* wip
* fix the tests
* tui in another pr
* Update sdks/python/rogue_client/client.py Co-authored-by: yuval-qf <[email protected]>
* wip
* Update rogue/evaluator_agent/run_evaluator_agent.py Co-authored-by: drorIvry <[email protected]>
* Update rogue/evaluator_agent/run_evaluator_agent.py Co-authored-by: drorIvry <[email protected]>
* CR - part 1
* Consolidate types
* interviewer cr
* wip
* wip
* Remove legacy
* Consolidate services
* Consolidate services
* fix business context hardcoded
* wip
* websocket manager singleton
* endpoint prefixes
* rename
* fix license in pyproject
* fix model/judge_llm confusion
* Rename rogue_client -> rogue_sdk
* install sdk and server in gh action
* install on system
* rename rogue_sdk directory
* rename rogue_sdk directory #2
* Use venv
* typo in __init__
* Consolidate __main__ files
* fix init
* fix main
* Consolidate more types
* Consolidate more types
* Fix tests
* replace judge_llm_model -> judge_llm
* replace judge_llm_model -> judge_llm
* Fix mypy
* Fix mypy
* Fix model validate
* Remove unused aliases
* Fix AgentConfig initialization
* Fix tests cicd
* Fix tests
* rabbit ci
* Typing
* better backoff
* Add task lock
* mypy fixes
* exception handling
* rabbit cr
* wip
* wip
* Remove refresh button
* gh action

--------- Co-authored-by: yuval-qf <[email protected]>


Parallelize scenarios

2231b6c

yuval-qf closed this Jul 1, 2025

yuval-qf deleted the parallelize-scenarios branch July 1, 2025 19:28

yuval-qf added a commit that referenced this pull request Aug 14, 2025

rename rogue_sdk directory #2

531fdff

yuval-qf added a commit that referenced this pull request Oct 8, 2025

Reduce rogue startup import time #2

d649c23

yuval-qf added a commit that referenced this pull request Oct 8, 2025

FIRE-815 | Bugfix | Reduce rogue import time on startup (#55)

8567820

* Reduce rogue startup import time
* Reduce rogue startup import time #2
* Reduce rogue startup import time #3


2 participants
