cyber gym: POST/body channel + difficulty-curriculum knobs (deferred from #257) #258

@larstalian

Two transfer/learnability improvements deliberately deferred from PR #257 (the per-class transfer-validity realignment). Neither blocks #257 — the H2 intra-class confound is closed there — but both are real follow-ups the audits flagged.

1. POST/body request channel (audit-rated MAJOR)

The webapp runtime (packs/cyber_webapp/cyber_webapp/codegen/templates/app.py.j2) only implements do_GET, so every exploit, including body-shaped ones (XXE documents, login credentials, command-injection payloads), is delivered as a URL query parameter. On real XBOW/CVE-Bench targets these are POST/body operations with a Content-Type and body framing. An agent trained only on GET-query delivery therefore learns the wrong request shape. This is a cross-cutting transfer gap affecting ~5 classes.

  • Add do_POST + body parsing to the runtime; let a vuln/endpoint sample its method (GET query vs POST form/JSON body), and route body-shaped classes through it.
  • This is orthogonal runtime/transport work (touches every handler's call convention), which is why it's its own PR rather than bundled into the exploit-content realignment.
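A minimal sketch of what the added channel could look like, assuming the generated runtime is built on the standard library's `http.server`. The names `parse_body`, `Handler`, and `_dispatch` are hypothetical and are not taken from app.py.j2. The real template's handler call convention may differ.

```python
import json
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlsplit


def parse_body(content_type: str, raw: bytes) -> dict:
    """Decode a request body into a flat param dict (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" before comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        data = json.loads(raw.decode("utf-8") or "{}")
        return data if isinstance(data, dict) else {"body": data}
    if media_type == "application/x-www-form-urlencoded":
        form = parse_qs(raw.decode("utf-8"), keep_blank_values=True)
        return {key: values[0] for key, values in form.items()}
    # XML and other body-shaped payloads (e.g. XXE documents) pass through raw.
    return {"body": raw.decode("utf-8", errors="replace")}


class Handler(BaseHTTPRequestHandler):
    def _dispatch(self, params: dict) -> None:
        # Placeholder for the per-vuln handler call; it only echoes the params.
        payload = json.dumps(params).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self) -> None:
        # Existing delivery path: parameters arrive in the URL query string.
        query = parse_qs(urlsplit(self.path).query, keep_blank_values=True)
        self._dispatch({key: values[0] for key, values in query.items()})

    def do_POST(self) -> None:
        # New delivery path: parameters arrive in the framed request body.
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        self._dispatch(parse_body(self.headers.get("Content-Type", ""), raw))
```

Both methods feed one `_dispatch` with the same param dict, so in this sketch a handler would not need to know which transport delivered its input. Whether the real handlers can share a convention like that is the open design question for this PR.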

2. Difficulty-curriculum knobs

  • families/pentest.py hardcodes difficulty=0.7 independent of the world. Derive it from loot shape + pool sizes + chain depth.
  • The pools that actually set difficulty (flag-path entropy, param/cred pools, hint-config presence) are module constants with no manifest path. Expose them as manifest knobs (pool sizes, hint on/off) so a curriculum can order easy→hard worlds.
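One possible shape for a derived difficulty, as a sketch only. `WorldKnobs`, its field names and defaults, the weights, and the normalisation caps are all assumptions made for illustration; none of them come from families/pentest.py or the manifest schema. Loot shape is left out because its representation is not described here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldKnobs:
    """Hypothetical manifest knobs; names and defaults are illustrative."""
    flag_path_pool: int = 8     # number of candidate flag paths
    param_pool: int = 4         # number of candidate parameter names
    cred_pool: int = 4          # number of candidate credentials
    chain_depth: int = 1        # exploit steps needed to reach the flag
    hints_enabled: bool = True  # whether a hint config is present


def derive_difficulty(knobs: WorldKnobs, max_bits: float = 12.0,
                      max_depth: int = 4) -> float:
    """Map knobs to a score in [0, 1]; weights are arbitrary placeholders."""
    pools = (knobs.flag_path_pool, knobs.param_pool, knobs.cred_pool)
    if min(pools) < 1 or knobs.chain_depth < 1:
        raise ValueError("pool sizes and chain depth must be at least 1")
    # Search-space entropy in bits: sum of log2 over the independent pools.
    bits = sum(math.log2(size) for size in pools)
    entropy_term = min(bits / max_bits, 1.0)
    depth_term = min((knobs.chain_depth - 1) / (max_depth - 1), 1.0)
    hint_term = 0.0 if knobs.hints_enabled else 1.0
    score = 0.5 * entropy_term + 0.3 * depth_term + 0.2 * hint_term
    return round(score, 3)


# Ordering worlds easy to hard for a curriculum.
worlds = [
    WorldKnobs(flag_path_pool=64, param_pool=8, cred_pool=8,
               chain_depth=4, hints_enabled=False),
    WorldKnobs(flag_path_pool=2, param_pool=1, cred_pool=1),
    WorldKnobs(),
]
curriculum = sorted(worlds, key=derive_difficulty)
```

The point of the sketch is that difficulty becomes a pure function of the same knobs the manifest exposes, so a curriculum can sort worlds without running them.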

Context

Found by the adversarial audits during #257. The per-class exploit content is now transfer-valid (faithful engines, discoverable flags, mutually-exclusive payload contexts at a ~33% replay floor); see packs/cyber_webapp/DESIGN.md. These two items are the remaining transfer/curriculum enhancements to a now-valid gym.
