fixing secure_exec sandbox escape via getattr vulnerability #2592
Open
AseemPrasad wants to merge 1 commit into confident-ai:main from
@AseemPrasad is attempting to deploy a commit to the Confident AI Team on Vercel. A member of the Team first needs to authorize it.
The vulnerability I found: secure_exec sandbox escape via getattr
The secure_exec function runs LLM-generated code inside a restricted globals sandbox, intended to prevent the executing code from accessing dangerous built-ins. However, it includes getattr and type in the allow-list of safe builtins:
Python
safe_globals = {
    "__builtins__": {
        ...
        "getattr": getattr,  # ← CRITICAL: allows full object traversal
        "type": type,  # ← CRITICAL: allows type introspection
        "hasattr": hasattr,
        ...
    }
}
getattr allows traversing Python's entire object hierarchy by attribute name, passed as a plain string. Any literal available in the code (e.g., [], "", {}) can serve as a starting point to reach object, then object.__subclasses__(), and from there classes such as subprocess.Popen, os._wrap_close, or _io.FileIO:
Python
# Classic Python sandbox escape, using only the allow-listed builtins:
subclasses = getattr(getattr(getattr([], "__class__"), "__mro__")[-1], "__subclasses__")()
popen = [c for c in subclasses if c.__name__ == "Popen"][0]
popen(["id"], stdout=-1).communicate()  # runs the OS command `id` (stdout=-1 is subprocess.PIPE)
Impact: An adversarial or jailbroken LLM (the benchmark literally evaluates LLM code generation) can generate code that escapes the sandbox and executes arbitrary OS commands with the privileges of the deepeval process: reading SSH keys, exfiltrating API credentials from environment variables, or establishing reverse shells. This is a textbook Python sandbox escape, and HumanEval is exactly the scenario where LLM output can trigger it.
Root cause: Restricting the globals passed to exec() is not a reliable security boundary; there is no dependable way to sandbox Python code at the interpreter level this way. Including getattr makes the escape trivial, and plain attribute syntax such as [].__class__ reaches the same objects even without it.
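To make the root cause concrete without running any OS command, here is a minimal sketch. The one-entry allow-list is an assumption for illustration and is not deepeval's actual list; the snippet only counts the classes it can reach.

```python
# Hypothetical, minimal allow-list: getattr and type are NOT included.
restricted_globals = {"__builtins__": {"len": len}}

# Untrusted snippet: plain attribute syntax, no getattr needed.
untrusted = "found = len([].__class__.__mro__[-1].__subclasses__())"

exec(untrusted, restricted_globals)

# The snippet walked from a list literal up to object and listed its subclasses,
# even though the builtins were restricted.
reachable = restricted_globals["found"]
print(reachable > 0)
```

Every class loaded in the host process is reachable this way, which is why trimming the allow-list alone does not contain the executed code.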
Here's what I did:
Implemented a full fix for the HumanEval code-execution vulnerability by replacing interpreter-level sandboxing with isolated subprocess execution in human_eval.py.
Added a new subprocess runner that executes candidate code in a separate Python process instead of using direct exec.
The runner writes code to a temporary .py file, launches it with subprocess.Popen([sys.executable, temp_file]) (no shell), captures output, enforces a timeout, and returns pass/fail via exit code.
Child-process hardening was added:
Added AST defense-in-depth pre-scan before execution:
Updated the predict loop to call the new subprocess path and count success only when the subprocess exits cleanly.
Hardened the old secure_exec fallback by removing dangerous builtins (getattr, hasattr, type, repr) so even fallback behavior is safer.
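The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names prescan and run_in_subprocess, the deny-lists, and the default timeout are all hypothetical, and the child-process hardening details are not reproduced here.

```python
import ast
import os
import subprocess
import sys
import tempfile

# Hypothetical deny-lists for the pre-scan; the PR's actual rules are not shown.
BLOCKED_NAMES = {"eval", "exec", "__import__", "getattr", "open"}
BLOCKED_MODULES = {"os", "subprocess", "socket"}


def prescan(code: str) -> bool:
    """Return True if the code parses and uses no blocked name, import, or dunder attribute."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            return False
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return False
        if isinstance(node, ast.Import):
            if any(a.name.split(".")[0] in BLOCKED_MODULES for a in node.names):
                return False
        if isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                return False
    return True


def run_in_subprocess(code: str, timeout: float = 10.0) -> bool:
    """Run code in a separate interpreter; True only on a clean exit (code 0)."""
    if not prescan(code):
        return False
    fd, temp_file = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(code)
        # Argument list rather than a command string, so no shell is involved.
        proc = subprocess.Popen(
            [sys.executable, temp_file],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        try:
            proc.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.communicate()  # reap the killed child
            return False
        return proc.returncode == 0
    finally:
        os.remove(temp_file)
```

Called as run_in_subprocess(candidate_code + test_code), a failing assertion in the child produces a non-zero exit code and counts as a failure. The AST pre-scan is only defense in depth; the process boundary is what actually separates the candidate code from the evaluator's memory, though the child still runs with the same OS user unless further isolation is applied.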