Feature request: Save/load attack datasets for cross-session re-evaluation #200

@Pitchlab

Use Case

When iterating on an LLM system's safety, you want to:

  1. Generate attacks once (expensive — uses simulator LLM)
  2. Re-run the same attacks against improved versions of your model over time
  3. Compare results across iterations

Currently this workflow is not supported end-to-end.

Current State

  • RiskAssessment.save(to) exists and writes a JSON file with all test cases, scores, and metadata
  • There is no load(), from_json(), or from_file() method to deserialize a saved RiskAssessment back into Python objects
  • reuse_simulated_test_cases=True only works within the same Python session (reuses in-memory self.test_cases)
  • The EnumEncoder used in save() has no companion decoder
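
To illustrate the missing decoder half, here is a minimal sketch of an enum-aware JSON round trip. It does not reproduce DeepTeam's actual EnumEncoder or its output format. `VulnerabilityType`, `TaggedEnumEncoder`, `enum_object_hook`, `ENUM_REGISTRY`, and the `{"__enum__": ...}` tagging scheme are all hypothetical names and choices made for this example.

```python
import json
from enum import Enum


class VulnerabilityType(Enum):  # hypothetical stand-in, not a DeepTeam enum
    RACE = "race"
    PII_LEAK = "pii_leak"


# Maps class names found in the JSON back to enum classes (assumed registry).
ENUM_REGISTRY = {cls.__name__: cls for cls in (VulnerabilityType,)}


class TaggedEnumEncoder(json.JSONEncoder):
    """Writes enums as {"__enum__": "ClassName.MEMBER"} so they can be restored."""

    def default(self, o):
        if isinstance(o, Enum):
            return {"__enum__": f"{type(o).__name__}.{o.name}"}
        return super().default(o)


def enum_object_hook(d):
    """Companion decoder: turns tagged dicts back into enum members."""
    if set(d) == {"__enum__"}:
        cls_name, member = d["__enum__"].split(".", 1)
        return ENUM_REGISTRY[cls_name][member]
    return d


payload = {"vulnerability_type": VulnerabilityType.PII_LEAK, "input": "example attack"}
text = json.dumps(payload, cls=TaggedEnumEncoder)
restored = json.loads(text, object_hook=enum_object_hook)
```

Tagging each enum with its class name is one possible design. If the saved file stores bare strings instead, the loader would need a per-field mapping from field name to enum class.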

Proposed Solution

Minimum: Add RiskAssessment.load(path) / RedTeamer.load_test_cases(path)

A class method that deserializes saved JSON back into RTTestCase objects with proper enum types, so they can be injected into red_teamer.test_cases for reuse.

# Save after generation
risk_assessment.save(to='./red-team-attacks/')

# Load in a new session
red_teamer = RedTeamer(...)
red_teamer.load_test_cases('./red-team-attacks/results_20260309.json')

# Re-run with fresh model callback
risk_assessment = red_teamer.red_team(
    model_callback=my_callback,
    reuse_simulated_test_cases=True,
)
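
For discussion, a loader along these lines could look roughly like the sketch below. It is written against stand-in types, not the real RTTestCase: `ToyTestCase`, `ToyVulnerabilityType`, the field names, and the top-level `"test_cases"` key are assumptions about what the saved file might contain, not the actual schema written by `save()`.

```python
import json
import tempfile
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class ToyVulnerabilityType(Enum):  # hypothetical; stands in for the library's enums
    RACE = "race"
    GENDER = "gender"


@dataclass
class ToyTestCase:  # hypothetical; stands in for RTTestCase
    vulnerability: str
    vulnerability_type: ToyVulnerabilityType
    input: str


def load_test_cases(path):
    """Read a saved JSON file and rebuild typed test cases."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    cases = []
    for raw in data["test_cases"]:  # assumed top-level key
        cases.append(
            ToyTestCase(
                vulnerability=raw["vulnerability"],
                # Lookup by value raises ValueError on unknown strings,
                # which surfaces schema drift early instead of silently.
                vulnerability_type=ToyVulnerabilityType(raw["vulnerability_type"]),
                input=raw["input"],
            )
        )
    return cases


# Demo: write a throwaway file, then load it back.
saved = {
    "test_cases": [
        {"vulnerability": "Bias", "vulnerability_type": "race", "input": "example attack prompt"}
    ]
}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.json"
    path.write_text(json.dumps(saved), encoding="utf-8")
    loaded = load_test_cases(path)
```

Only the attack inputs and their typed metadata are restored here. Scores and model outputs from the earlier run are deliberately ignored, since the point is to re-evaluate against a new model.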

Ideal: First-class dataset support

A Dataset or AttackDataset class (similar to how AISafetyFramework subclasses work with _has_dataset=True) that:

Workaround

We currently:

  1. Save generated attacks to JSON manually (extracting fields from RTTestCase)
  2. Reconstruct RTTestCase objects from JSON with manual enum mapping
  3. Call the API ourselves for all test cases
  4. Use vulnerability._get_metric(type).a_measure(test_case) directly

This works but requires significant boilerplate and knowledge of DeepTeam internals.
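
Steps 3 and 4 of the workaround have roughly the shape sketched below. This is not our actual code, and it does not call DeepTeam. `ToyTestCase`, `ToyRefusalMetric`, `model_callback`, and `reevaluate` are hypothetical stand-ins, and having `a_measure` return the score directly is an assumption made to keep the example self-contained.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTestCase:  # hypothetical stand-in for RTTestCase
    input: str
    actual_output: Optional[str] = None
    score: Optional[float] = None


class ToyRefusalMetric:  # hypothetical stand-in for vulnerability._get_metric(type)
    async def a_measure(self, test_case):
        # Toy rule: a response counts as safe (1.0) only if it refuses.
        return 1.0 if "cannot help" in test_case.actual_output.lower() else 0.0


async def model_callback(prompt):
    # Placeholder for a call to the system under test.
    return "I cannot help with that."


async def reevaluate(test_cases, callback, metric):
    """Run every saved attack through the model, then score each response."""

    async def run_one(tc):
        tc.actual_output = await callback(tc.input)
        tc.score = await metric.a_measure(tc)
        return tc

    # Run all test cases concurrently; results keep the input order.
    return await asyncio.gather(*(run_one(tc) for tc in test_cases))


cases = [ToyTestCase(input="attack 1"), ToyTestCase(input="attack 2")]
results = asyncio.run(reevaluate(cases, model_callback, ToyRefusalMetric()))
pass_rate = sum(tc.score for tc in results) / len(results)
```

Computing a single pass rate per run is what makes results comparable across model iterations, provided the same saved attacks are used each time.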

Environment

  • deepteam 1.0.6
  • Python 3.12
