Tag generated cyber tasks with MITRE ATT&CK technique IDs #88

@westonbrown

Updated 2026-05-29. Paths refreshed after the SDK extraction (#225)
and typed-property-graph refactor (#223). The cyber pack now lives in
packs/cyber_webapp/. Concept unchanged; still a good first issue and
aligned with the cyber-priority roadmap.

Context

Cyber-offense scenarios in cyber.webapp carry no reference to the
MITRE ATT&CK framework, the standard taxonomy for adversary techniques
(T1190 Exploit Public-Facing Application, T1078 Valid Accounts, T1059
Command and Scripting Interpreter, etc.). Tagging generated tasks and
reference attack paths with technique IDs (a) makes reports readable by
analysts in their native taxonomy and (b) lets defense agents speak a
vocabulary that maps to real defender training data.

What this issue tracks

Add MITRE ATT&CK technique IDs to:

  1. The vulnerability catalog — each entry gets a mitre_techniques field.
  2. Generated tasks — task instruction (and verifier details on success)
    reference the technique IDs the agent's path maps to.
  3. The request log / final state — when actions match a technique
    pattern, the log row carries the technique ID.
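
One way item 3 could work is sketched below. Everything here is an assumption for illustration: the log row shape (a plain dict with a `query` key), the `RULES` list, and the `tag_row` helper are all hypothetical and not taken from the pack. The single rule is a toy matcher; real patterns would presumably come from the vulnerability catalog.

```python
from typing import Callable

# Hypothetical log row shape: a plain dict. The real request log may differ.
LogRow = dict[str, object]
# A rule pairs a predicate over a row with the technique ID it implies.
Rule = tuple[Callable[[LogRow], bool], str]

# Toy rule for illustration only: flags a classic SQL injection probe.
RULES: list[Rule] = [
    (lambda row: "' OR " in str(row.get("query", "")).upper(), "T1190"),
]


def tag_row(row: LogRow, rules: list[Rule]) -> LogRow:
    """Return a copy of the row, adding mitre_techniques when any rule matches."""
    matched = [technique_id for predicate, technique_id in rules if predicate(row)]
    tagged = dict(row)
    if matched:
        tagged["mitre_techniques"] = matched
    return tagged
```

Rows that match no rule are returned unchanged, so existing log consumers would not see a new key unless a technique was actually identified.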

Where to start (current paths)

  • packs/cyber_webapp/cyber_webapp/vulnerabilities/__init__.py — the
    vulnerability catalog. Add a mitre_techniques: list[str] field.
  • packs/cyber_webapp/cyber_webapp/vulnerabilities/templates/ — handler
    templates, if technique IDs are template-aware.
  • packs/cyber_webapp/cyber_webapp/llm_generation.py — the LLM that writes
    task instructions; with ATT&CK metadata in the graph it can reference
    techniques.
  • ATT&CK reference: https://attack.mitre.org/

Suggested scope

Per-vuln mapping first (e.g. sql_injection → T1190, ssrf → T1090/T1071,
broken_authz → T1078). Per-task and per-log tagging come after.
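
A minimal sketch of the per-vuln mapping, assuming catalog entries can be modeled as a small dataclass. `VulnEntry`, `CATALOG`, and `techniques_for` are hypothetical names; the actual catalog in vulnerabilities/__init__.py may be shaped differently. The technique IDs are the ones suggested above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VulnEntry:
    """Hypothetical catalog entry; the real catalog's shape may differ."""
    name: str
    mitre_techniques: list[str] = field(default_factory=list)


# Mappings copied from the suggested scope in this issue.
CATALOG = {
    "sql_injection": VulnEntry("sql_injection", ["T1190"]),
    "ssrf": VulnEntry("ssrf", ["T1090", "T1071"]),
    "broken_authz": VulnEntry("broken_authz", ["T1078"]),
}


def techniques_for(vuln_name: str) -> list[str]:
    """Return the technique IDs for a vulnerability, or [] if it is unknown."""
    entry = CATALOG.get(vuln_name)
    return list(entry.mitre_techniques) if entry else []
```

Returning a copy of the list keeps callers from mutating the catalog by accident, and defaulting to an empty list lets untagged entries coexist with tagged ones while the mapping is filled in.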

Acceptance

  • Catalog entries carry mitre_techniques: list[str].
  • Generated tasks reference technique IDs in instruction text.
  • A test verifies IDs match the ATT&CK format (regex).
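
The format check could look roughly like the following. ATT&CK technique IDs are `T` plus four digits, with an optional three-digit sub-technique suffix such as `T1059.001`. The helper names and the dict-of-lists catalog shape are assumptions for the sketch, not the pack's API.

```python
import re

# Technique: T + 4 digits. Sub-technique: an extra ".NNN" suffix.
ATTACK_ID_RE = re.compile(r"T[0-9]{4}(?:\.[0-9]{3})?")


def is_attack_technique_id(value: str) -> bool:
    """True if the whole string is a well-formed ATT&CK technique ID."""
    return ATTACK_ID_RE.fullmatch(value) is not None


def invalid_technique_ids(catalog: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each vulnerability name to its malformed IDs, omitting clean entries."""
    problems = {}
    for vuln_name, technique_ids in catalog.items():
        bad = [tid for tid in technique_ids if not is_attack_technique_id(tid)]
        if bad:
            problems[vuln_name] = bad
    return problems
```

A test can then assert that `invalid_technique_ids(...)` returns an empty dict for the real catalog. This only validates the format; it does not confirm that an ID exists in the current ATT&CK release.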

Don't over-engineer the data structure — list of strings is fine until a
consumer needs more.
