Skip to content

Security hardening#13530

Draft
jordanrfrazier wants to merge 14 commits into
release-1.10.0from
security-hardening
Draft

Security hardening#13530
jordanrfrazier wants to merge 14 commits into
release-1.10.0from
security-hardening

Conversation

@jordanrfrazier

Copy link
Copy Markdown
Collaborator

No description provided.

Three critical issues for multi-tenant deployments where users may not author
custom components:

1. Code-execution core components (Python Interpreter/REPL, Python Code
   Structured tool, Smart Transform) are official, so their class-code hash is
   valid and they pass the allow_custom_components=False policy, yet they execute
   arbitrary user Python from their input fields. Add the
   LANGFLOW_BLOCK_CODE_INTERPRETER_COMPONENTS setting (default off), enforced in
   flow_validation at the Graph.from_payload choke point with recursion into
   nested flows so all build paths are covered.

2. LambdaFilterComponent evaluated an LLM-generated lambda with full builtins
   (prompt-injection -> RCE). Reject escape gadgets via validate_code_safety and
   eval with safe_builtins() (reuses python_repl_security).

3. The global-variable -> env-var fallback did os.getenv(<tenant-supplied name>),
   letting any tenant read LANGFLOW_SECRET_KEY / DATABASE_URL etc. Add
   env_var_security.safe_getenv with a reserved-name denylist and apply it at all
   fallback sites in lfx and langflow.
…holes

Four more critical issues for multi-tenant deployments where tenants use all core
components but cannot author custom ones:

1. Variable Credential->Generic type-confusion: PATCH /variables flipping a credential
   row's type to Generic without a value left the Fernet ciphertext in place; get_all
   then decrypted it and returned plaintext via GET /variables, exposing the server's
   shared provider keys. Reject the transition (write path) and never decrypt a
   Fernet-token value labeled Generic (read path).

2. SQL Database components (sql_executor, langchain sql/sql_database) accepted arbitrary
   connection URIs -> SSRF to internal DBs and sqlite:////abs/path local file
   read/write. Add validate_database_url_for_ssrf: host validated against SSRF blocked
   ranges (default-on), local-file dialects blocked under LANGFLOW_RESTRICT_LOCAL_FILE_ACCESS
   so single-tenant sqlite keeps working.

3. Web Search RSS/web fetches used bare requests.get with no SSRF guard (cloud-metadata
   cred theft). Route the RSS URL and result-link fetches through validate_url_for_ssrf.

4. File/Directory/JSON-to-Data/CSV-to-Data read arbitrary server files via uncontained
   resolve_path. Add LANGFLOW_RESTRICT_LOCAL_FILE_ACCESS (default off) +
   enforce_local_file_access, confining resolved paths to the storage data dir at every
   read sink. resolve_path itself is unchanged so persistence-dir components are unaffected.
Five more critical issues for multi-tenant deployments where tenants use all core
components + curated trusted custom components but cannot author custom ones.
All distinct from the issues fixed in 6e4ec5a and 1cdc305.

1. MCP stdio flow-embedded command execution (RCE): a tenant-built flow can embed an
   MCP stdio config (command/args/env) directly in the MCPTools component value, which
   reached `bash -c "exec <command>"` with no validation -- the MCPServerConfig allowlist
   only ran at the REST /api/v2/mcp/servers layer, never in the flow-execution path, and
   mcp_servers_locked / allow_custom_components=False do not cover it. New single-source
   lfx.base.mcp.security (constants + validate_mcp_stdio_config) enforced at the update_tools
   sink and the legacy deactivated/mcp_stdio sink; MCPServerConfig now imports the shared
   constants/helper so the two enforcement points cannot drift. MCP HTTP/SSE url is now
   SSRF-validated too.

2. Code-agent components bypass the code-interpreter lockdown: CodeActAgentSmolagents
   (smolagents LocalPythonExecutor) and OpenDsStarAgent (bare exec) run LLM-generated
   Python in-process but were absent from CODE_EXECUTION_COMPONENT_TYPES, so
   LANGFLOW_BLOCK_CODE_INTERPRETER_COMPONENTS=true did not block them. Added both class
   names + display-name aliases.

3. Git components clone arbitrary tenant URLs: GitExtractor/GitLoader pass repository_url
   /clone_url to git clone -> ext:: remote-helper RCE, file://+local-path file read,
   leading-'-' option injection, internal-host SSRF. Add validate_git_repository_url
   (always blocks remote helpers + option injection; blocks local-file clones under SSRF
   or local-file restriction; SSRF-validates network hosts incl. scp-like).

4. Home Assistant reflective SSRF -> cloud-metadata credential theft: List States (GET)
   and Control (POST) fetch f"{base_url}/api/.." and reflect the body to the tenant; a
   trailing '#' reaches the IMDS credential path. Route base_url through
   validate_url_for_ssrf before the request.

5. Model-provider base_url SSRF in build-config discovery: Ollama/LM Studio fetch a
   tenant-set base_url (even on field edit, no flow run) with no SSRF guard. Validate
   before each fetch. (Consistent with the existing url.py hard-block posture: local model
   servers need LANGFLOW_SSRF_ALLOWED_HOSTS or SSRF disabled.)

Tests: 220 lfx + 121 backend tests pass; ruff clean.
…commits

Follow-up to 6e4ec5a / 1cdc305 / c31eeb8 from an extensive PR review.
Fixes real defects in those commits and adds opt-in SSRF for connector components.
No backwards-incompatible default behavior: all host-blocking SSRF is opt-in, and
local/private hosts keep working out of the box.

Defect fixes (always active; only affect abuse/abnormal inputs, not legit flows):
- MCP stdio: close a command-injection bypass where a tenant packs the whole payload
  into `command` with empty `args` (e.g. "bash -c '<payload>'"). The validator now
  tokenizes the command, and MCPServerConfig delegates to the shared
  validate_mcp_stdio_config (single source of truth).
- Code-exec block list: add the missing "Python Code Structured" display-name alias
  and the PythonFunction component so a hash-valid node can't bypass the block.
- Env-var fallback: route the remaining tenant-controlled os.getenv sites through
  safe_getenv (VariableService, credentials, knowledge-base sources), harden the
  GetEnvVar component, and expand the infra-secret denylist.
- web_search + Home Assistant: allow_redirects=False so a 3xx can't bypass the SSRF guard.
- git: enforce the scheme allowlist regardless of the SSRF toggle; confine the Local
  repo_path under LANGFLOW_RESTRICT_LOCAL_FILE_ACCESS.
- api_request: reduce the Content-Disposition filename to a basename (no path traversal).

Opt-in connector SSRF (new LANGFLOW_CONNECTOR_SSRF_VALIDATION_ENABLED, default off):
- New validate_connector_url_for_ssrf / validate_connector_database_url_for_ssrf wrappers
  that no-op unless the flag is set, so connectors keep reaching localhost/private hosts
  by default. When enabled they defer to LANGFLOW_SSRF_PROTECTION_ENABLED / _ALLOWED_HOSTS.
- Applied to vector stores (chroma/clickhouse/qdrant/elasticsearch/opensearch/milvus/
  supabase/upstash/weaviate), the SQL Database components, glean, astradb_cql, model
  discovery (litellm/huggingface/xai/deepseek/groq/watsonx), Ollama (chat + embeddings),
  LM Studio, Home Assistant, and the MCP HTTP-mode URL.
- The SQL local-file dialect restriction stays on its own LANGFLOW_RESTRICT_LOCAL_FILE_ACCESS
  toggle, independent of the connector flag.

Local file-access confinement (no-op unless LANGFLOW_RESTRICT_LOCAL_FILE_ACCESS=true):
CSV/JSON/OpenAPI agents + save_file write.

Docs: document LANGFLOW_CONNECTOR_SSRF_VALIDATION_ENABLED + a multi-tenant recommendation
in api-keys-and-authentication.mdx. Re-export the MCP constants from langflow.api.v2.schemas
for backwards compatibility.
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0303b4cc-8831-45d6-bf0d-7b26eb1a6f65

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch security-hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

- Log (instead of silently swallowing) settings-read failures in the
  fail-open enable-switch helpers so a security control silently
  disabling is observable: restrict_local_file_access and
  connector_ssrf_validation_enabled gates.
- Log when get_all() skips a GENERIC variable holding ciphertext
  (likely a CREDENTIAL row relabeled GENERIC) instead of dropping it
  silently.
- Remove unreachable non-http/https scheme branch in SSRF validators
  (_validate_url_scheme already raises) and fix the misleading comment
  claiming such schemes are "not subject to SSRF protection".
- Close the unbalanced paren in the WebSearch SSRF-blocked message.
- Correct getenvvar comment: only LANGFLOW_*/LFX_* are prefix-matched;
  AWS protection is specific names, not an AWS_* glob.
- Fix three code-execution denylist comments to point at the real
  exec sites (get_function, LocalPythonInterpreter, external
  agents.ds_star executor).
- Correct LANGFLOW_SSRF_PROTECTION_ENABLED docs default (True, not False).
- Add symlink-escape containment tests for enforce_local_file_access
  (the docstring promises symlink resolution; previously untested).
…oggles

Address PR-review findings on the connector SSRF wrapper and docs:
- validate_connector_url_for_ssrf: raise a clear, actionable error for a
  scheme-less / host-less connector URL (e.g. Milvus "host:19530") instead
  of the shared validator's confusing "Invalid URL scheme ''". The message
  tells the operator to use an explicit http(s) scheme and notes that
  allowlisting alone does not permit a scheme-less host (the format gate
  runs before the allowlist check). Only fires when host validation would
  actually run (global SSRF on); stays a no-op otherwise.
- Document the DNS-rebinding residual in the wrapper docstring: connectors
  hand the URL to third-party clients that re-resolve DNS at connect time
  and expose no pinned-IP hook (would break TLS SNI), so unlike api_request
  this guard is validate-then-connect. Literal-IP targets (metadata,
  RFC1918) are blocked identically.
- Docs: add a "Multi-tenant component hardening" section covering
  LANGFLOW_BLOCK_CODE_INTERPRETER_COMPONENTS and
  LANGFLOW_RESTRICT_LOCAL_FILE_ACCESS, recommended alongside
  LANGFLOW_ALLOW_CUSTOM_COMPONENTS=false.
- Tests: add TestConnectorURLValidation (disabled no-op, metadata blocked,
  scheme-less clear-error, no-op when global SSRF off).
…undary

restrict_local_file_access confined reads to config_dir, but config_dir IS
the storage dir AND holds the server-managed secrets as siblings of the
per-flow upload subdirs: secret_key (Fernet master key), private_key.pem /
public_key.pem (JWT signing keys), and the SQLite DB (save_db_in_config_dir).
A tenant File-component input of "<flow>/../secret_key" routed through
build_full_path (no '..' check) resolved back to <config_dir>/secret_key,
passed the is_relative_to(config_dir) boundary, and was read -- disclosing the
key that decrypts every tenant's stored credentials. The control thus failed
its own stated goal under the very multi-tenant mode it adds.

enforce_local_file_access now denies the exact config_dir locations of
secret_key / private_key.pem / public_key.pem and the sqlite DB derived from
database_url (including the -wal / -shm / -journal sidecars that hold the same
row data, and the async sqlite+aiosqlite:/// + ?query URL forms), even though
they resolve inside the boundary. Matched at exact location only, not by
basename, so a tenant upload that merely shares a reserved name inside a flow
subdir stays readable. Fails safe on non-sqlite / empty / unavailable settings.

Known limitation (tracked separately): this does not scope reads per tenant,
so cross-tenant reads of <config_dir>/<other_flow_id>/<file> uploads remain
possible and need per-user/per-flow scoping in a follow-up.

Tests: reserved file blocked, traversal-to-reserved blocked, DB + WAL/SHM/
journal sidecars blocked, async+query URL blocked, same-named upload in a flow
subdir still allowed.
…e route footguns

Agentic MCP cross-tenant read/write (Issue 14): the langflow-agentic stdio MCP
server's flow/component tools took user_id as a caller-supplied (often optional,
defaulting to None) parameter, so an authenticated tenant could read or write
ANY tenant's flow data/component values by id -- reachable by embedding
`python -m langflow.agentic.mcp` in a flow's MCP stdio config, independent of the
agentic_experience flag. Bind the acting user from the authenticated request
instead of trusting the caller:
- agentic/mcp/server.py: drop user_id from all 9 flow/component tools; add
  _bound_user_id() reading LANGFLOW_AGENTIC_USER_ID and failing closed when absent.
- lfx.base.mcp.security: add AGENTIC_USER_ID_ENV_VAR / AGENTIC_MCP_MODULE; add
  langflow_agentic_user_id to DANGEROUS_ENV_VARS so a tenant stdio config cannot
  supply it.
- lfx.base.mcp.util.update_tools: new current_user_id; after validation, inject
  the authenticated id into the spawn env when the command targets the agentic
  module (auto-provisioned server AND any tenant-authored config -> tenant only
  ever binds their own id). Callers pass it (MCPTools component, v2 mcp servers).

Route footguns (round-4 review):
- api/v1/projects.py update_project: validate the supplied parent_id references a
  folder owned by the caller (404 otherwise) instead of assigning it blind.
- helpers/flow.py get_flow_by_id_or_endpoint_name: document the user_id=None
  unscoped contract and the requirement that callers pass an authenticated id.

Docs: add a "Session cookie hardening" section for LANGFLOW_ACCESS_SECURE /
ACCESS_HTTPONLY / ACCESS_SAME_SITE with the JS-frontend and HTTP caveats.

Tests: MCP stdio denylist + update_tools inject/fail-closed; agentic server
_bound_user_id env/fail-closed; update_project unowned-parent rejection.
The /files/images/{flow_id}/{file_name} endpoint streamed stored bytes with a
content type derived from the file extension and no anti-sniffing/disposition
headers. A tenant-uploaded SVG (served image/svg+xml) or HTML would execute
inline in the app origin when the URL is opened directly. The route is
owner-gated (so this is self-XSS, not cross-tenant), but harden it to match the
v1/v2 download_file endpoints: add X-Content-Type-Options: nosniff and
Content-Disposition: attachment. attachment forces a download on direct
navigation (image/svg+xml is scriptable regardless of nosniff); <img>/blob
embedding -- the intended use -- is unaffected.

Test asserts both headers in test_download_image_for_browser.
…oyments

External tracing integrations (LangSmith, Langfuse, Phoenix, Arize, Opik, ...)
are configured process-wide from env vars, not per user. Enabling one in a
multi-tenant deployment sends every tenant's flow inputs/outputs/prompts to a
single external project readable by anyone with that tracing account, and a
secret echoed into a component output may not be redacted. Add a warning to the
multi-tenant hardening guidance to use built-in local tracing instead.
…ce + allow_custom_components

The agentic assistant generates component code and EXECUTES it in-process
(validate_component_runtime -> build_custom_component_template -> compile/exec
of the module + class body + output methods; re-executed by the user-components
overlay on each assistant tool call). This path was reachable by any
authenticated user regardless of agentic_experience (routers mounted
unconditionally, no backend flag check) and ignored allow_custom_components,
so in a locked-down multi-tenant deployment a tenant could get arbitrary
in-process code execution -- defeating the very policy the lockdown enforces.
The only barrier was an AST denylist, which is trivially bypassable
(aliased imports, getattr, pathlib writes, socket/urllib not blocked).

Two complementary gates:
- agentic_experience: new langflow/agentic/api/deps.py::require_agentic_experience
  (404 when off), applied per-route to /assist, /assist/stream, /execute and
  router-level to the files + sessions routers. The read-only /agentic/check-config
  probe is intentionally exempt so non-agentic deployments can still query
  provider config.
- allow_custom_components: validate_component_runtime refuses before any
  instantiation, and the user-components overlay returns only the base registry
  (no .components walk, no template build) when it is false. The shared lfx
  build_custom_component_template is untouched, so the normal hash-gated
  custom-component path is unaffected.

Tests: require_agentic_experience 404/allow; validate_component_runtime refuses
and never builds; overlay returns base-only and never walks/executes. Agentic
api/services/helpers/mcp suite green (no regressions).
Defense-in-depth/consistency pass from the SQLi review (no SQLi found; these
harden parameterized-but-unescaped LIKE patterns and an order_by error-oracle):

- LIKE/ILIKE search terms were interpolated into the pattern without escaping
  the %/_/\ metacharacters in users (username), authz_roles (name), authz_teams
  (team/adom name), and the tracing repository (trace search) -> wildcard
  over-match + mild Postgres ReDoS. Add shared lfx.utils.util_strings.
  escape_like_pattern and apply it with escape="\\" at those sites; projects.py
  already escaped and now aliases the shared helper.

- order_by was passed to getattr(MessageTable, order_by) with no allowlist in
  monitor.get_messages and the langflow.memory query helper, so an invalid name
  raised and surfaced as a 500 error-oracle (get_shared_messages already had an
  allowlist). Move ALLOWED_MESSAGE_ORDER_FIELDS to the message model module;
  get_messages now returns 400 on an invalid field, get_shared_messages reuses
  the constant, and the memory helper raises ValueError.

Tests: lfx test_util_strings (escape behavior); projects/monitor/users suites green.
…esource key

build_orchestrate_run_payload resolved the run's agent_id as
provider_data.get("agent_id") or deployment_id. The caller-supplied branch is
dead today (the API input slot WatsonxApiExecutionInput is extra="forbid" and
has no agent_id field), so it always fell back to deployment_id. Drop the dead
fallback and pin agent_id to the deployment's own owner-controlled resource key
so a future caller that bypasses the input slot cannot redirect the run to an
arbitrary agent in the owner's WxO tenant. Defense-in-depth; no behavior change.
Quality-only cleanups from a post-review pass (no behavior change):
- Remove the dead `extract_base_command as _extract_base_command` re-export
  alias in api/v2/schemas.py (underscore-private, zero references) and trim the
  comment that claimed the base-command helper is re-exported.
- flow_validation.py: read block_code_interpreter_components as a direct
  settings attribute instead of getattr(..., False); the field is declared and
  allow_custom_components on the line above is already accessed directly.
- mcp/security.py: extract the duplicated file-path heuristic into _is_file_path()
  shared by extract_base_command and validate_mcp_stdio_config, and collapse the
  now-redundant nested if.

Tests: lfx mcp-stdio-security + flow_validation suites green; ruff clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant