
Add provider-aware parallel execution to run_benchmarks.py #71

@MHindermann

Description

Plan: Provider-Aware Parallel Execution

  1. Analyze current structure
  • Review how benchmarks currently execute
  • Understand the test_config structure and provider information
  • Check if there are any threading/multiprocessing concerns with current code
  2. Create benchmark grouping function
  • Add function group_tests_by_provider(test_configs) that:
    • Takes a list of test configurations
    • Groups them by provider (openai, genai, anthropic, mistral, etc.)
    • Returns dict: {provider: [test_configs]}
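A minimal sketch of the grouping function, assuming each test config is a dict with a `"provider"` key (the actual schema in run_benchmarks.py may differ):

```python
from collections import defaultdict


def group_tests_by_provider(test_configs):
    """Group test configurations by their provider.

    Assumes each config is a dict with a "provider" key holding
    values like "openai", "genai", "anthropic", or "mistral".
    """
    groups = defaultdict(list)
    for config in test_configs:
        groups[config["provider"]].append(config)
    return dict(groups)
```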
  3. Implement per-provider parallel executor
  • Add function run_provider_tests_parallel(provider, test_configs, max_workers) that:
    • Uses ThreadPoolExecutor or ProcessPoolExecutor
    • Runs multiple tests for one provider in parallel
    • Limits concurrency to max_workers (default: 2)
    • Handles exceptions and logs results
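One way the per-provider executor could look, using `ThreadPoolExecutor` (appropriate if each test is I/O-bound, e.g. an API call). The `run_test` parameter is a hypothetical stand-in for the single-test function in run_benchmarks.py, injected here so the sketch is self-contained:

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger(__name__)


def run_provider_tests_parallel(provider, test_configs, run_test, max_workers=2):
    """Run all tests for one provider with bounded concurrency.

    Returns a list of (config, result, error) tuples; error is None
    on success, the raised exception on failure.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(run_test, cfg): cfg for cfg in test_configs}
        for future in as_completed(futures):
            cfg = futures[future]
            try:
                results.append((cfg, future.result(), None))
            except Exception as exc:
                logger.error("[%s] test %r failed: %s", provider, cfg, exc)
                results.append((cfg, None, exc))
    return results
```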
  4. Create main parallel orchestrator
  • Add function main_parallel(test_ids, max_workers_per_provider=2) that:
    • Reads the CSV and filters by test_ids
    • Groups tests by provider
    • Spawns separate executor for each provider
    • All providers run simultaneously, each with limited concurrency
    • Waits for all to complete and reports summary
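The orchestrator could be sketched as below: an outer pool with one worker per provider, and an inner pool per provider capped at `max_workers_per_provider`. CSV reading and `test_ids` filtering from run_benchmarks.py are omitted; `run_test` is again a hypothetical single-test function:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, as_completed


def main_parallel(test_configs, run_test, max_workers_per_provider=2):
    """Run all providers simultaneously, each with bounded concurrency.

    Returns a summary dict: {provider: {"success": n, "failure": m}}.
    """
    groups = defaultdict(list)
    for cfg in test_configs:
        groups[cfg["provider"]].append(cfg)

    def run_provider(provider, cfgs):
        successes = failures = 0
        with ThreadPoolExecutor(max_workers=max_workers_per_provider) as inner:
            futures = [inner.submit(run_test, c) for c in cfgs]
            for future in as_completed(futures):
                try:
                    future.result()
                    successes += 1
                except Exception:
                    failures += 1
        return provider, successes, failures

    summary = {}
    # Outer pool: one worker per provider, so every provider makes
    # progress at the same time.
    with ThreadPoolExecutor(max_workers=max(len(groups), 1)) as outer:
        for provider, ok, bad in outer.map(run_provider, groups, groups.values()):
            summary[provider] = {"success": ok, "failure": bad}
    return summary
```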
  5. Keep backward compatibility
  • Keep existing main() function unchanged for sequential execution
  • Add the new main_parallel() as an alternative entry point
  • Update the `if __name__ == "__main__":` block to use main_parallel()
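The entry-point selection could look like this; the `--sequential` flag and the `select_entry_point` helper are hypothetical additions, and `main`/`main_parallel` are placeholders for the real functions:

```python
import argparse


def main():
    """Existing sequential entry point (placeholder in this sketch)."""


def main_parallel(max_workers_per_provider=2):
    """New parallel entry point (placeholder in this sketch)."""


def select_entry_point(argv):
    # Hypothetical CLI toggle: parallel by default, --sequential opts out,
    # so the original behavior stays one flag away.
    parser = argparse.ArgumentParser()
    parser.add_argument("--sequential", action="store_true",
                        help="use the original sequential main()")
    args = parser.parse_args(argv)
    return main if args.sequential else main_parallel


if __name__ == "__main__":
    select_entry_point(None)()  # parse_args(None) reads sys.argv
```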
  6. Add configuration and safety
  • Add MAX_WORKERS_PER_PROVIDER config variable at top of file
  • Add proper error handling for parallel execution
  • Ensure logging is thread-safe
  • Add summary report at the end (success/failure counts per provider)
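For the safety items: Python's stdlib `logging` module is already thread-safe at the handler level, so a shared module-level logger is fine, but any mutable summary state updated from worker threads needs its own lock. A sketch, with `ResultCounter` as a hypothetical helper name:

```python
import logging
import threading

MAX_WORKERS_PER_PROVIDER = 2  # module-level config, as the plan proposes

# stdlib logging is thread-safe; including threadName in the format
# makes interleaved per-provider output easy to attribute.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(threadName)s %(levelname)s %(message)s",
)
logger = logging.getLogger("run_benchmarks")


class ResultCounter:
    """Thread-safe per-provider success/failure tally for the summary report."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}  # provider -> (successes, failures)

    def record(self, provider, succeeded):
        with self._lock:
            ok, bad = self._counts.get(provider, (0, 0))
            self._counts[provider] = (ok + 1, bad) if succeeded else (ok, bad + 1)

    def summary(self):
        with self._lock:
            return dict(self._counts)
```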
