Feature request: Save/load attack datasets for cross-session re-evaluation #200

## Use Case

When iterating on an LLM system's safety, you want to:
1. **Generate attacks once** (expensive — uses simulator LLM)
2. **Re-run the same attacks** against improved versions of your model over time
3. **Compare results** across iterations

Currently this workflow is not supported end-to-end: generated attacks can be saved to disk, but they cannot be loaded back in a later session.

## Current State

- `RiskAssessment.save(to)` exists and writes a JSON file with all test cases, scores, and metadata
- **There is no `load()`, `from_json()`, or `from_file()` method** to deserialize a saved `RiskAssessment` back into Python objects
- `reuse_simulated_test_cases=True` only works within the same Python session (reuses in-memory `self.test_cases`)
- The `EnumEncoder` used in `save()` has no companion decoder
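
To illustrate the missing decoder, here is a minimal sketch of an enum-aware round trip. It assumes the encoder writes each enum member as its plain `.value`, which may not match what `save()` actually emits. `BiasType`, `ValueEnumEncoder`, and `decode_enum` are hypothetical names used only for this example, not DeepTeam APIs.

```python
import json
from enum import Enum


class BiasType(Enum):  # hypothetical stand-in for a vulnerability-type enum
    RACE = "race"
    GENDER = "gender"


class ValueEnumEncoder(json.JSONEncoder):
    """Writes enum members as their plain values (assumed behaviour of save())."""

    def default(self, obj):
        if isinstance(obj, Enum):
            return obj.value
        return super().default(obj)


def decode_enum(value, enum_classes):
    """Return the first enum member whose value matches, else the raw value."""
    for enum_cls in enum_classes:
        try:
            return enum_cls(value)
        except ValueError:
            continue
    return value


# Encoding loses the enum type: only the string "race" reaches the file.
payload = json.dumps({"vulnerability_type": BiasType.RACE}, cls=ValueEnumEncoder)

# Decoding therefore needs to know which enum classes to try.
raw = json.loads(payload)
restored = decode_enum(raw["vulnerability_type"], [BiasType])
```

Because the type information is dropped on write, a loader has to be told (or has to infer from the field name) which enum each value belongs to. That is the part a companion decoder would need to own.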

## Proposed Solution

### Minimum: Add `RiskAssessment.load(path)` / `RedTeamer.load_test_cases(path)`

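A method that deserializes saved JSON back into `RTTestCase` objects with proper enum types. `RiskAssessment.load(path)` could be a class method that returns the test cases, while `RedTeamer.load_test_cases(path)` would inject them into `red_teamer.test_cases` for reuse.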

```python
# Save after generation
risk_assessment.save(to='./red-team-attacks/')

# Load in a new session (proposed API, does not exist yet)
red_teamer = RedTeamer(...)
red_teamer.load_test_cases('./red-team-attacks/results_20260309.json')

# Re-run with fresh model callback
risk_assessment = red_teamer.red_team(
    model_callback=my_callback,
    reuse_simulated_test_cases=True,
)
```

### Ideal: First-class dataset support

A `Dataset` or `AttackDataset` class (similar to how `AISafetyFramework` subclasses work with `_has_dataset=True`) that:
- Loads attacks from a file
- Calls the model callback for ALL attacks (single + multi-turn) — see #199 for the multi-turn bug
- Evaluates with the appropriate vulnerability metrics
- Returns a `RiskAssessment`
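
As a rough illustration of the shape such a class could take, here is a self-contained sketch. Everything in it is hypothetical: `Attack`, `AttackDataset`, `from_file`, and `evaluate` are not DeepTeam names, the file format is an assumed flat JSON list, metrics are modelled as plain scoring functions, and the result is a list of dicts rather than a real `RiskAssessment`. Multi-turn attacks are simplified to sending each turn independently, without conversation history.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List


@dataclass
class Attack:  # hypothetical record; not DeepTeam's RTTestCase
    vulnerability: str
    turns: List[str]  # one entry for single-turn, several for multi-turn


class AttackDataset:
    """Hypothetical dataset wrapper; none of these names exist in DeepTeam."""

    def __init__(self, attacks: List[Attack]):
        self.attacks = attacks

    @classmethod
    def from_file(cls, path: str) -> "AttackDataset":
        # Assumed format: a JSON list of {"vulnerability": ..., "turns": [...]}.
        records = json.loads(Path(path).read_text(encoding="utf-8"))
        return cls([Attack(**record) for record in records])

    def evaluate(
        self,
        model_callback: Callable[[str], str],
        metrics: Dict[str, Callable[[str, str], float]],
    ) -> List[dict]:
        results = []
        for attack in self.attacks:
            score_fn = metrics[attack.vulnerability]
            # Every turn is sent to the model, single- and multi-turn alike.
            for turn in attack.turns:
                output = model_callback(turn)
                results.append({
                    "vulnerability": attack.vulnerability,
                    "input": turn,
                    "output": output,
                    "score": score_fn(turn, output),
                })
        return results


dataset = AttackDataset([
    Attack("bias", ["single-turn prompt"]),
    Attack("toxicity", ["turn one", "turn two"]),
])


def refusal_metric(prompt: str, output: str) -> float:
    # Toy metric: a refusal counts as a pass.
    return 1.0 if "cannot" in output else 0.0


results = dataset.evaluate(
    model_callback=lambda prompt: "I cannot help with that.",
    metrics={"bias": refusal_metric, "toxicity": refusal_metric},
)
```

The point of the sketch is the separation of concerns: loading is independent of generation, and evaluation takes the model callback as its only moving part, so the same file can be replayed against each new model version.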

## Workaround

We currently:
1. Save generated attacks to JSON manually (extracting fields from `RTTestCase`)
2. Reconstruct `RTTestCase` objects from JSON with manual enum mapping
3. Call the model API ourselves for all test cases
4. Use `vulnerability._get_metric(type).a_measure(test_case)` directly

This works but requires significant boilerplate and knowledge of DeepTeam internals.
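
For reference, steps 2 to 4 look roughly like the sketch below. It uses stand-ins so that it runs on its own: `VulnType`, `StubTestCase`, and `StubMetric` are hypothetical placeholders for the DeepTeam enum, `RTTestCase`, and the metric returned by `_get_metric(type)`, and the record layout in `saved` is an assumption about our own hand-written JSON, not the format `save()` produces.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class VulnType(Enum):  # hypothetical stand-in for a vulnerability-type enum
    BIAS_RACE = "race"


@dataclass
class StubTestCase:  # hypothetical stand-in for RTTestCase
    input: str
    vulnerability_type: VulnType
    actual_output: Optional[str] = None


class StubMetric:  # hypothetical stand-in for the metric from _get_metric(type)
    async def a_measure(self, test_case: StubTestCase) -> float:
        # Toy scoring rule: a refusal counts as a pass.
        return 1.0 if "cannot" in test_case.actual_output.lower() else 0.0


async def model_callback(prompt: str) -> str:
    return "I cannot help with that."  # replace with a real model API call


async def rerun(record: dict) -> float:
    # Step 2: manual enum mapping from the stored string value.
    case = StubTestCase(record["input"], VulnType(record["vulnerability_type"]))
    # Step 3: call the model ourselves.
    case.actual_output = await model_callback(case.input)
    # Step 4: score with the metric for this vulnerability type.
    return await StubMetric().a_measure(case)


saved = [{"input": "example attack prompt", "vulnerability_type": "race"}]


async def main() -> List[float]:
    # Re-run all saved attacks concurrently.
    return await asyncio.gather(*(rerun(record) for record in saved))


scores = asyncio.run(main())
```

In the real workaround, each stand-in is replaced by the corresponding DeepTeam object, which is where the dependence on internals such as `_get_metric` comes from.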

## Environment

- deepteam 1.0.6
- Python 3.12
