Pensar - auto fix for Unvalidated LLM-Generated Code Execution in Autonomous Workflow #12

pensarapp · 2025-04-01T23:10:22Z

Type	Identifier	Message	Severity	Link
Application	ML09, CWE-94, CWE-20	The workflow takes the output of the generate_code function and passes it to run_locally, which executes it in a Docker environment. This corresponds to ML09 (Manipulation of ML Model Outputs Affecting Integrity): the LLM-generated outputs (including the dockerfile and files) reach a code execution environment without strict guardrails or sanitization. An adversary who can manipulate the input prompt or test conditions could drive the LLM to generate malicious code, which would then be executed, potentially leading to unauthorized system modifications or container escapes.	critical	Link

The vulnerability (ML09/CWE-94/CWE-20) exists because the workflow directly executes LLM-generated code in a Docker environment without validating it first. If an attacker manipulates the inputs, they could trick the LLM into generating malicious code that would then be executed.

The patch adds a security validation layer that checks both the Dockerfile and generated files for potentially dangerous patterns before executing them. I've implemented a SecurityValidator class with methods to:

Validate Dockerfile content for dangerous patterns like privileged mode, host network access, volume mounting, etc.
Validate file content based on file type (Python, JavaScript, Shell) for risky operations like arbitrary command execution, dynamic code evaluation, etc.
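The two methods described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the method names `validate_dockerfile` and `validate_file`, the helper `_scan`, and every regular expression here are assumptions, since the PR text does not list the actual patterns.

```python
import os
import re
from typing import Dict, List

# Hypothetical patterns; the real lists in the patch are not shown in this PR.
DOCKERFILE_PATTERNS: Dict[str, str] = {
    r"--privileged": "privileged mode",
    r"--net(?:work)?[= ]host": "host network access",
    r"^\s*VOLUME\b": "volume mounting",
}

# Keyed by file extension so each file type gets its own checks.
FILE_PATTERNS: Dict[str, Dict[str, str]] = {
    ".py": {
        r"\b(?:os\.system|subprocess\.\w+)\s*\(": "arbitrary command execution",
        r"\b(?:eval|exec)\s*\(": "dynamic code evaluation",
    },
    ".js": {
        r"\bchild_process\b": "arbitrary command execution",
        r"\beval\s*\(": "dynamic code evaluation",
    },
    ".sh": {
        r"\bcurl\b[^\n|]*\|\s*(?:ba)?sh\b": "piping a download into a shell",
    },
}


def _scan(content: str, patterns: Dict[str, str]) -> List[str]:
    """Return the description of every pattern that matches the content."""
    flags = re.IGNORECASE | re.MULTILINE
    return [desc for pat, desc in patterns.items() if re.search(pat, content, flags)]


class SecurityValidator:
    @staticmethod
    def validate_dockerfile(content: str) -> List[str]:
        return _scan(content, DOCKERFILE_PATTERNS)

    @staticmethod
    def validate_file(filename: str, content: str) -> List[str]:
        # Unknown extensions have no patterns and therefore report no issues.
        ext = os.path.splitext(filename)[1].lower()
        return _scan(content, FILE_PATTERNS.get(ext, {}))
```

An empty list means no pattern matched. Files with an unrecognized extension pass unchecked in this sketch, which a real implementation might want to treat more strictly.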

The validation occurs at two critical points:

Before executing the run_locally function
When changes are made to files or the Dockerfile by the validate_output function

If security issues are detected, the workflow fails immediately with appropriate error logging, preventing the execution of potentially malicious code.
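The fail-fast behaviour could look something like the sketch below. The names `guarded_run`, `find_issues`, and `SecurityValidationError` are hypothetical, `find_issues` is a one-pattern stand-in for the validator, and the `run` callable stands in for run_locally, whose real signature is not shown in this PR.

```python
import logging
import re
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)


class SecurityValidationError(RuntimeError):
    """Raised when generated content fails the pre-execution check (hypothetical)."""


def find_issues(name: str, content: str) -> List[str]:
    # Stand-in for the validator: a single illustrative pattern.
    if re.search(r"\b(?:eval|exec)\s*\(", content):
        return [f"{name}: dynamic code evaluation"]
    return []


def guarded_run(
    dockerfile: str,
    files: Dict[str, str],
    run: Callable[[str, Dict[str, str]], str],
) -> str:
    """Validate the Dockerfile and every file, then execute only if all are clean."""
    issues = find_issues("Dockerfile", dockerfile)
    for filename, content in files.items():
        issues.extend(find_issues(filename, content))
    if issues:
        for issue in issues:
            logger.error("Security validation failed: %s", issue)
        # Fail before anything is handed to the execution environment.
        raise SecurityValidationError("; ".join(issues))
    return run(dockerfile, files)
```

If the workflow loops (generate, run, validate_output, run again), routing every execution through one guard like this would cover both checkpoints, because content changed by validate_output is re-checked before the next run.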

The patch imports re for pattern matching and typing for type hints. Both are part of the Python standard library, so the patch adds no external dependencies. The pattern-based approach balances security with maintainability and can be extended with additional patterns as needed.

This solution enforces proper guardrails around the LLM-generated outputs, addressing the core vulnerability while maintaining the original workflow functionality.

restack-app · 2025-04-01T23:10:25Z

No applications have been configured for previews targeting branch: master. To do so, go to the restack console and configure your applications for previews.

Fix security issue: Unvalidated LLM-Generated Code Execution in Auton…

5418f9a

…omous Workflow (ML09, CWE-94, CWE-20)
