fixing secure_exec sandbox escape via getattr vulnerability #2592

Open
AseemPrasad wants to merge 1 commit into confident-ai:main from AseemPrasad:fix/sandbox-escape

Conversation

@AseemPrasad

The vulnerability I found is a secure_exec sandbox escape via getattr.

The secure_exec function runs LLM-generated code inside a restricted globals sandbox, intended to prevent the executing code from accessing dangerous built-ins. However, it includes getattr and type in the allow-list of safe builtins:

```python
safe_globals = {
    "__builtins__": {
        ...
        "getattr": getattr,  # ← CRITICAL: allows full object traversal
        "type": type,        # ← CRITICAL: allows type introspection
        "hasattr": hasattr,
        ...
    }
}
```
getattr allows traversing Python's entire object hierarchy by attribute name (as a plain string). Any literal value available in code (e.g., `[]`, `""`, `{}`) can be used as a starting point to reach `object`, then `object.__subclasses__()`, and from there dangerous subclasses like `subprocess.Popen`, `os._wrap_close`, or `_io.FileIO`:

```python
# Classic Python sandbox escape, using only the allow-listed builtins:
subclasses = getattr(getattr(getattr([], "__class__"), "__mro__")[-1], "__subclasses__")()
popen = [c for c in subclasses if c.__name__ == "Popen"][0]
popen(["id"], stdout=-1).communicate()  # executes an arbitrary OS command
```
Impact: An adversarial or jailbroken LLM (the benchmark is literally evaluating LLM code generation) can generate code that escapes the sandbox, executing arbitrary OS commands with the privileges of the deepeval process — read SSH keys, exfiltrate API credentials from environment variables, establish reverse shells. This is a textbook Python sandbox escape, and HumanEval is exactly the scenario where it will be triggered by LLM output.

Root cause: Python's exec() sandboxing is inherently broken. There is no reliable way to sandbox Python exec at the interpreter level using only restricted globals. The inclusion of getattr makes escape trivial.
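This claim can be demonstrated without getattr at all: plain attribute syntax requires no builtins, so even an empty `__builtins__` leaves the object graph reachable. A minimal illustration (not part of the patch):

```python
# A sandbox with NO builtins whatsoever:
sandbox_globals = {"__builtins__": {}}

# Attribute access needs no builtins, so the class hierarchy is still open.
exec("subclasses = ().__class__.__mro__[-1].__subclasses__()", sandbox_globals)

# Every class loaded in the interpreter is now enumerable from inside the sandbox.
print(len(sandbox_globals["subclasses"]) > 0)  # True
```

Removing getattr from the allow-list therefore narrows the attack surface but cannot close it, which is why the fix moves execution out of the interpreter entirely.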

Here's what I did:

Implemented a full fix for the HumanEval code-execution vulnerability by replacing interpreter-level sandboxing with isolated subprocess execution in human_eval.py.

  1. Added a new subprocess runner that executes candidate code in a separate Python process instead of using direct exec.

  2. The runner writes code to a temporary .py file, launches it with subprocess.Popen([sys.executable, temp_file]) (no shell), captures output, enforces a timeout, and returns pass/fail via exit code.

  3. Child-process hardening was added:

    • env={} to strip inherited secrets/API keys.
    • cwd=tempfile.gettempdir() to avoid running from project root.
    • POSIX-only resource limits (RLIMIT_CPU, RLIMIT_AS/RLIMIT_DATA, RLIMIT_NOFILE, RLIMIT_NPROC) via preexec_fn.
  4. Added AST defense-in-depth pre-scan before execution:

    • Blocks all imports outside a strict whitelist (math, collections, itertools, etc.).
    • Rejects dunder attribute access patterns (e.g., `x.__class__`, `x.__subclasses__`) to reduce object-model escape vectors.
  5. Updated the predict loop to call the new subprocess path and count success only when the subprocess exits cleanly.

  6. Hardened the old secure_exec fallback by removing dangerous builtins (getattr, hasattr, type, repr) so even fallback behavior is safer.
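Steps 1–3 above can be sketched roughly as follows. The function and variable names here are illustrative assumptions, not the actual human_eval.py code, and only a subset of the resource limits is shown:

```python
import os
import subprocess
import sys
import tempfile

def run_candidate(code: str, timeout: float = 10.0) -> bool:
    """Run candidate code in an isolated child process; True iff it exits 0."""
    def _limits():
        # POSIX-only child hardening (a subset of the limits listed above).
        import resource
        resource.setrlimit(resource.RLIMIT_CPU, (5, 5))       # CPU seconds
        resource.setrlimit(resource.RLIMIT_NOFILE, (64, 64))  # open file descriptors

    # Write the code to a temporary .py file.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.Popen(
            [sys.executable, path],        # argv list, no shell
            env={},                        # strip inherited secrets/API keys
            cwd=tempfile.gettempdir(),     # avoid running from the project root
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            preexec_fn=_limits if os.name == "posix" else None,
        )
        try:
            proc.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.communicate()
            return False
        return proc.returncode == 0        # pass/fail via exit code
    finally:
        os.unlink(path)
```

Even if LLM-generated code escapes every in-process restriction, it now runs in a throwaway process with no inherited environment and hard resource ceilings.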

@vercel

vercel bot commented Apr 2, 2026

@AseemPrasad is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.
