envex

Extract and analyze environment variables from running Linux processes.


envex extracts environment variables from running processes on Linux via /proc/<pid>/environ, flags secrets and credentials using a three-layer detection engine, and reports findings with confidence levels. Useful for penetration testing, security auditing, and incident response.

Why environment variables?

Environment variables are a primary vector for secret exposure in modern infrastructure. Processes routinely receive sensitive values through their environment:

  • API keys and tokens — cloud providers (AWS, GCP, Azure), SaaS platforms (Stripe, SendGrid, Twilio), and AI services (OpenAI, Anthropic) all use env vars for authentication
  • Database credentials — connection strings, passwords, and DSNs passed via DATABASE_URL, DB_PASSWORD, and similar variables
  • CI/CD secrets — GitHub Actions, GitLab CI, Jenkins, and other pipelines inject secrets as env vars by default
  • Container orchestration — Kubernetes Secrets, Docker Compose, ECS task definitions, and Nomad all project secrets into the process environment
  • Service mesh and infrastructure — HashiCorp Vault agent injection, AWS Parameter Store, and sidecar patterns routinely populate env vars with tokens

These secrets persist in /proc/<pid>/environ for the entire lifetime of the process, are rarely rotated at runtime, and are readable by anyone with appropriate file permissions on the proc filesystem.
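
As a rough illustration of what the file contains, the sketch below parses the NUL-delimited KEY=VALUE layout of /proc/<pid>/environ. It is not envex's implementation; the function names `parse_environ` and `read_environ` and the sample values are made up for this example.

```python
from __future__ import annotations

from pathlib import Path


def parse_environ(raw: bytes) -> dict[str, str]:
    """Split a NUL-delimited environ blob into a name -> value mapping."""
    env: dict[str, str] = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # the trailing NUL leaves an empty final chunk
        name, sep, value = entry.partition(b"=")
        if not sep:
            continue  # not a KEY=VALUE entry
        # Values are arbitrary bytes; replace anything that is not valid UTF-8.
        env[name.decode("utf-8", "replace")] = value.decode("utf-8", "replace")
    return env


def read_environ(pid: int | str = "self") -> dict[str, str]:
    """Read and parse /proc/<pid>/environ (Linux only)."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())


# Hypothetical sample data in the same layout the kernel exposes.
sample = b"HOME=/root\0DATABASE_URL=postgres://app:hunter2@db/app\0"
print(parse_environ(sample))
```

Only the first "=" separates name from value, so values that themselves contain "=" (connection strings, base64 padding) survive intact.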

Why not just grep?

The manual approach works:

cat /proc/*/environ 2>/dev/null | tr '\0' '\n' | grep -iE 'password|secret|token|key'

This catches variables with obvious names like DB_PASSWORD or API_KEY, but misses everything else:

  • Provider-specific tokens — STRIPE_SECRET_KEY, SENDGRID_API_KEY, VAULT_TOKEN, SENTRY_DSN: you'd need dozens of patterns to cover the major providers
  • Value-based secrets — an AWS access key (AKIA...), a GitHub PAT (ghp_...), a Slack bot token (xoxb-...), or a PEM private key don't need a telltale variable name — the value itself is identifiable
  • High-entropy strings — a secret stored in a generic variable like CUSTOM_CONFIG or APP_SETTING won't match any keyword grep, but its Shannon entropy gives it away
  • Process attribution — grep gives you raw lines with no context about which process, executable, user, or container the secret belongs to
  • Noise — a simple keyword grep returns every PATH, HOME, and HOSTNAME that happens to match, with no confidence ranking

envex replaces that manual workflow with 103 detection rules across three layers (name heuristics, value pattern matching, entropy analysis), deduplicates findings across processes, and attributes each secret to its source process.
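
Deduplication with attribution can be pictured as grouping findings by (name, value) while remembering every process that carried them. The sketch below assumes findings arrive as (pid, name, value) tuples; that shape and the sample data are illustrative, not envex's internal model.

```python
from collections import defaultdict


def deduplicate(findings):
    """Group findings by (name, value) and keep the sorted list of source PIDs."""
    grouped = defaultdict(set)
    for pid, name, value in findings:
        grouped[(name, value)].add(pid)
    return {key: sorted(pids) for key, pids in grouped.items()}


# Hypothetical findings: the same password is visible in two processes.
findings = [
    (101, "DB_PASSWORD", "hunter2"),
    (102, "DB_PASSWORD", "hunter2"),
    (102, "API_KEY", "abc123"),
]
for (name, value), pids in deduplicate(findings).items():
    print(f"{name} seen in PIDs {pids}")
```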

Advantages over memory scraping

Traditional post-exploitation credential harvesting often involves reading process memory via ptrace or /proc/<pid>/mem. Environment variable extraction has significant advantages:

| Aspect | Env var extraction | Memory scraping |
|---|---|---|
| Access mechanism | File read (/proc/<pid>/environ) | ptrace attach or /proc/<pid>/mem read |
| Permissions required | Standard UNIX file permissions (same UID or root) | CAP_SYS_PTRACE capability for cross-user access |
| Yama ptrace_scope | Not affected | ptrace_scope=1 (the default on Ubuntu and some other distributions) blocks non-parent tracing; ptrace_scope=2 limits attach to CAP_SYS_PTRACE holders; ptrace_scope=3 disables attach entirely, even for root |
| SELinux / AppArmor | Typically allowed between same-context processes | Policies frequently deny ptrace and process_vm_readv |
| seccomp filters | read/openat are rarely filtered | ptrace is one of the most commonly blocked syscalls |
| Container boundaries | Works within the same PID namespace | Docker's default seccomp profile blocks ptrace; requires --cap-add=SYS_PTRACE |
| Data structure | Structured key=value pairs, null-delimited | Unstructured byte streams requiring pattern scanning |
| Data volume | Kilobytes per process (tens to hundreds of variables) | Megabytes to gigabytes of heap, stack, and mapped regions |
| Stability | Values set at exec time, unchanging | Secrets in memory may be overwritten, freed, or fragmented |
| Detection surface | Minimal — normal file read operations | Suspicious — ptrace attach generates audit events and may trigger EDR alerts |

In practice, many hardened environments that block memory inspection still permit reading /proc/<pid>/environ for same-user processes. Container runtimes and cloud environments increasingly rely on env vars as the primary secret delivery mechanism, making environment variable extraction a high-signal, low-noise approach to credential discovery.

Features

  • Three-layer detection engine — name heuristics, value pattern matching (regex), and Shannon entropy analysis
  • 103 built-in rules — 59 name-based and 44 value-based patterns covering major cloud providers, CI/CD platforms, SaaS services, and common secret formats
  • Deduplication — identical secrets shared across processes are grouped by default
  • Container awareness — detects Docker, containerd, CRI-O, and Kubernetes container IDs from cgroup data
  • Process identification — captures exe path, full command line, UID, username, PPID, and start time for each process
  • Watch mode — continuously monitors for new processes at a configurable interval
  • Flexible output — text, JSON, JSONL (line-delimited), SARIF, and CSV formats
  • External rules — load additional rules from Gitleaks TOML, Kingfisher YAML, or envex YAML files
  • Value redaction — optionally mask secret values in output
  • Tunable entropy — adjust the Shannon entropy threshold for entropy-based detection
  • Exit codes — returns 1 when secrets are found, 0 when clean, for use in scripts and pipelines
  • UID filtering — scan only processes owned by a specific user with --uid
  • Allowlist patterns — suppress known-safe findings by variable name (--allow-name) or value substring (--allow-value)
  • Self-sandbox — seccomp BPF denylist blocks dangerous syscalls (ptrace, exec, network) by default
  • Summary mode — print only aggregate counts with --summary
  • Shell completions — generates completions for bash, zsh, fish, elvish, and PowerShell
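
To illustrate the container awareness item, the sketch below pulls a container ID out of /proc/<pid>/cgroup text. It assumes the runtime embeds the full 64-character hex ID somewhere in the cgroup path, which is common for Docker, containerd, and CRI-O but not guaranteed; the regex, the function name, and the sample lines are assumptions, and envex's own matching logic may differ.

```python
import re

# 64 lowercase hex characters that are not part of a longer hex run.
CONTAINER_ID = re.compile(r"(?<![0-9a-f])[0-9a-f]{64}(?![0-9a-f])")


def container_id_from_cgroup(cgroup_text):
    """Return the first 64-hex container ID found in cgroup data, or None."""
    for line in cgroup_text.splitlines():
        match = CONTAINER_ID.search(line)
        if match:
            return match.group(0)
    return None


# Hypothetical cgroup contents for a containerised and a host process.
fake_id = "ab12" * 16
in_container = f"0::/system.slice/docker-{fake_id}.scope\n"
on_host = "0::/user.slice/user-1000.slice/session-3.scope\n"
print(container_id_from_cgroup(in_container))
print(container_id_from_cgroup(on_host))
```

A process that sits in its own cgroup namespace may only show "0::/", in which case no ID can be recovered this way.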

Installation

From source

Requires Rust 1.85+ (edition 2024):

git clone https://github.com/bcoles/envex && \
cd envex && \
cargo build --release

The binary is at target/release/envex.

From releases

Pre-built static binaries (musl) are available for:

  • x86_64-unknown-linux-musl
  • i686-unknown-linux-musl
  • aarch64-unknown-linux-musl
  • armv7-unknown-linux-musleabihf

Download from the Releases page.

Documentation

A man page and optional shell completions are provided.

# Install man page
sudo cp man/envex.1 /usr/local/share/man/man1/

# Bash completions
sudo cp completions/envex.bash /etc/bash_completion.d/envex

# Zsh completions
sudo cp completions/_envex /usr/local/share/zsh/site-functions/_envex

# Fish completions
cp completions/envex.fish ~/.config/fish/completions/

Usage

envex [OPTIONS]

Basic scanning

# Scan all accessible processes (requires appropriate permissions)
envex

# Scan a specific process
envex --pid 1234

# Scan processes matching a name
envex -n nginx

# Show only high-confidence findings
envex -c high

Output formats

# JSON output
envex -f json

# JSONL (one JSON object per line, suitable for jq pipelines)
envex -f jsonl

# CSV output
envex -f csv

# SARIF output (for GitHub Code Scanning / CI integration)
envex -f sarif -o results.sarif

# Write to file
envex -f json -o findings.json

Filtering

# Exclude specific processes
envex --exclude-name systemd --exclude-pid 1

# Exclude common non-secret variables
envex --exclude-var PATH --exclude-var HOME --exclude-var LANG

# Scan only processes owned by a specific UID
envex --uid 1000

# Suppress known-safe findings by variable name or value
envex --allow-name TERM --allow-value /usr/bin

# Redact values in output
envex --redact

Watch mode

# Rescan every 5 seconds (default interval)
envex -w

# Rescan with a custom interval
envex -w 10

# Auto-stop after 60 seconds
envex -w --timeout 60
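
Conceptually, watch mode is a polling loop that scans only the PIDs that appeared since the previous pass. The sketch below is one way to write such a loop, not envex's code; `watch`, `list_pids`, and the injectable `snapshot`, `sleep`, and `clock` parameters are hypothetical names chosen so the loop can be exercised without touching /proc.

```python
import os
import time


def list_pids(proc_root="/proc"):
    """Return the set of numeric entries (PIDs) under the proc filesystem."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def watch(scan, interval=5.0, timeout=None,
          snapshot=list_pids, sleep=time.sleep, clock=time.monotonic):
    """Call scan(pid) once for each PID that is new since the last pass."""
    seen = set()
    deadline = None if timeout is None else clock() + timeout
    while True:
        current = snapshot()
        for pid in sorted(current - seen):
            scan(pid)
        # Forget exited PIDs so a recycled PID number is scanned again.
        seen = current
        if deadline is not None and clock() >= deadline:
            break
        sleep(interval)


# Deterministic dry run with fake snapshots and a fake clock.
snapshots = iter([{1, 2}, {1, 2, 3}, {2, 3, 4}])
now = [0.0]
scanned = []


def fake_sleep(seconds):
    now[0] += seconds


watch(scanned.append, interval=5, timeout=10,
      snapshot=lambda: next(snapshots), sleep=fake_sleep, clock=lambda: now[0])
print(scanned)
```

A process that starts and exits between two snapshots never appears in either set, which is the short-lived process gap inherent to any polling design.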

Dump mode

# Dump all environment variables across all processes (de-duplicated)
envex --dump

# Dump as JSON
envex --dump -f json

Tuning detection

# Disable entropy-based detection (reduce false positives)
envex --no-entropy

# Raise entropy threshold (more selective)
envex --min-entropy 5.0

# Load additional Gitleaks rules
envex -r /path/to/gitleaks.toml

# Load Kingfisher rules
envex -r /path/to/kingfisher-rules.yaml

Scripting

# Silent mode — exit code only (1 = secrets found)
envex -q && echo "Clean" || echo "Secrets found"

# Show all occurrences (skip deduplication)
envex --show-all

# List all loaded detection rules
envex --list-rules

# Suppress banner / disable colors
envex --no-banner --no-color

# Generate shell completions
envex --completions bash | sudo tee /etc/bash_completion.d/envex > /dev/null
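
The shell one-liner above treats every non-zero exit status as "Secrets found", including a failure to run at all. A wrapper can separate those cases. The sketch below relies only on the documented 0 and 1 exit codes and the -q flag; how envex reports other failures is not stated here, so anything else is labelled as unexpected. The function names are illustrative.

```python
import shutil
import subprocess


def interpret(returncode):
    """Map the documented exit codes to a label; treat the rest as unknown."""
    if returncode == 0:
        return "clean"
    if returncode == 1:
        return "secrets found"
    return f"unexpected exit code {returncode}"


def run_quiet_scan(binary="envex"):
    """Run `envex -q` if the binary is on PATH and interpret its exit code."""
    path = shutil.which(binary)
    if path is None:
        return "envex not installed"
    completed = subprocess.run([path, "-q"], check=False)
    return interpret(completed.returncode)


print(interpret(0), "|", interpret(1), "|", interpret(2))
```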

Detection engine

The detection engine applies three layers in order:

  1. Name heuristics — case-insensitive matching of environment variable names against known patterns (e.g., *_PASSWORD, AWS_SECRET_ACCESS_KEY, *_TOKEN). The most specific match wins (exact > prefix/suffix > contains).

  2. Value patterns — regex matching of values against known token formats (e.g., GitHub PAT ghp_[A-Za-z0-9]{36}, AWS secret key format, PEM private keys, JWTs).

  3. Entropy fallback — if no name or value rule matches, Shannon entropy analysis flags high-entropy values (default threshold: 4.5 bits/char, minimum 16 characters) that may be random secrets.
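
The entropy layer rests on the Shannon entropy of the value's characters, H = -Σ p(c) · log2 p(c), where p(c) is the relative frequency of character c. A minimal sketch follows, using the default threshold (4.5 bits/char) and minimum length (16) stated above; the function names are invented for the example, and envex may compute or normalise entropy differently.

```python
import math
from collections import Counter


def shannon_entropy(value):
    """Entropy in bits per character of the string's character distribution."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(value, min_entropy=4.5, min_length=16):
    """Flag values that are long enough and dense enough to be random secrets."""
    return len(value) >= min_length and shannon_entropy(value) >= min_entropy


for candidate in ("aaaaaaaaaaaaaaaa", "abcdabcdabcdabcd",
                  "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"):
    print(candidate, round(shannon_entropy(candidate), 2), looks_random(candidate))
```

A string needs at least 23 distinct characters to reach 4.5 bits/char (2^4.5 ≈ 22.6), which is why short or repetitive values never trip this layer.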

Each detection produces a confidence level:

| Confidence | Meaning |
|---|---|
| High | Matches a specific provider token format or a known secret variable name |
| Medium | Matches a generic pattern (e.g., *_KEY suffix) or a broad value regex |
| Low | Flagged only by entropy analysis |
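
Putting the three layers and the confidence levels together, a classifier could look like the sketch below. The rule list is a tiny hypothetical subset, the confidence assigned to each rule is an assumption, and stopping at the first layer that matches is one reading of "in order" rather than confirmed behaviour.

```python
import math
import re
from collections import Counter

# Hypothetical rules; match kinds are ranked exact > prefix/suffix > contains.
SPECIFICITY = {"exact": 3, "prefix": 2, "suffix": 2, "contains": 1}
NAME_RULES = [
    ("exact", "AWS_SECRET_ACCESS_KEY", "high"),
    ("suffix", "_PASSWORD", "high"),
    ("suffix", "_KEY", "medium"),
    ("contains", "SECRET", "medium"),
]
VALUE_RULES = [
    (re.compile(r"ghp_[A-Za-z0-9]{36}"), "high"),  # GitHub PAT format from the text
    (re.compile(r"AKIA[0-9A-Z]{16}"), "high"),     # AWS access key ID prefix
]


def name_matches(kind, pattern, name):
    name = name.upper()  # name matching is case-insensitive
    if kind == "exact":
        return name == pattern
    if kind == "prefix":
        return name.startswith(pattern)
    if kind == "suffix":
        return name.endswith(pattern)
    return pattern in name


def entropy(value):
    total = len(value)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(value).values()) if value else 0.0


def classify(name, value, min_entropy=4.5, min_length=16):
    """Return (layer, confidence) for the first layer that matches, else None."""
    hits = [rule for rule in NAME_RULES if name_matches(rule[0], rule[1], name)]
    if hits:
        best = max(hits, key=lambda rule: SPECIFICITY[rule[0]])
        return ("name", best[2])
    for regex, confidence in VALUE_RULES:
        if regex.search(value):
            return ("value", confidence)
    if len(value) >= min_length and entropy(value) >= min_entropy:
        return ("entropy", "low")
    return None


print(classify("AWS_SECRET_ACCESS_KEY", "x"))
print(classify("CUSTOM_CONFIG", "ghp_" + "a" * 36))
print(classify("HOME", "/root"))
```

AWS_SECRET_ACCESS_KEY matches three of the name rules here; ranking by specificity is what lets the exact rule's confidence override the generic suffix and substring rules.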

External rule formats

envex can load additional rules from three formats:

  • envex YAML — native format supporting both name and value rules with configurable match types and confidence levels
  • Gitleaks TOML — compatible with Gitleaks rule files (regex-based value rules)
  • Kingfisher YAML — compatible with Kingfisher rule files (regex-based value rules with severity mapping)

Demo

A demo script is included in tests/demo.sh. It launches three simulated services (web API, background worker, deploy agent) with realistic secrets injected via environment variables, then scans all three with envex.

bash tests/demo.sh

Limitations

  • Initial environment only — /proc/<pid>/environ contains the environment variables set when the process was created (via execve). Variables added or modified at runtime (e.g., via setenv() or putenv()) are not reflected in /proc/<pid>/environ. This is a kernel limitation, not a tool limitation.

  • Linux only — relies on the /proc filesystem, which is specific to Linux. Not available on macOS, Windows, or BSDs.

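  • Permission-gated — reading /proc/<pid>/environ requires the same UID as the target process, or root. The kernel also applies a ptrace read-mode access check (PTRACE_MODE_READ_FSCREDS) to this file, so non-dumpable processes and LSM policies can deny the read even when the file mode would allow it. Processes that cannot be read are silently skipped.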

  • Kernel threads have no environment — kernel threads (kthreadd children) do not have a meaningful /proc/<pid>/environ and are skipped.

  • Short-lived processes — processes that start and exit between scan intervals in watch mode will be missed. The tool reads a snapshot from /proc at each scan pass.

  • No memory inspection — by design, envex only reads structured environment data. Secrets stored only in process memory, files, or other IPC mechanisms are not detected.

  • Pattern coverage — the built-in rules cover common providers and formats but cannot detect every possible secret. Custom rules can be added via external rule files.
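
The initial-environment limitation is easy to observe from inside a process. In the sketch below a variable is set at runtime (CPython calls putenv() when `os.environ` is assigned) and then looked up in /proc/self/environ. The variable name is made up for the demonstration, and the check is skipped on systems without a Linux-style /proc.

```python
import os
from pathlib import Path

PROC_ENVIRON = Path("/proc/self/environ")
DEMO_NAME = "ENVEX_DEMO_RUNTIME_ONLY"  # hypothetical variable name


def initial_environ_names():
    """Names present in the exec-time environment block exposed by the kernel."""
    raw = PROC_ENVIRON.read_bytes()
    return {
        entry.split(b"=", 1)[0].decode("utf-8", "replace")
        for entry in raw.split(b"\0")
        if b"=" in entry
    }


os.environ[DEMO_NAME] = "set-after-exec"  # runtime change via putenv()

if PROC_ENVIRON.exists():
    print("visible in os.environ:        ", DEMO_NAME in os.environ)
    print("visible in /proc/self/environ:", DEMO_NAME in initial_environ_names())
else:
    print("no /proc/self/environ on this platform; nothing to compare")
```

Child processes started afterwards do inherit the new variable, and it then appears in their own /proc/<pid>/environ, because for them it is part of the exec-time block.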

Permissions

envex reads from /proc/<pid>/environ, /proc/<pid>/comm, /proc/<pid>/exe, /proc/<pid>/cmdline, /proc/<pid>/status, /proc/<pid>/stat, and /proc/<pid>/cgroup. Access is governed by standard Linux file permissions:

  • Same user — a user can read the environment of processes running under their own UID, unless the target has been marked non-dumpable (for example after a setuid transition)
  • Root — can read all processes (subject to LSM policies)
  • No special capabilities required — unlike memory scraping tools, envex does not need CAP_SYS_PTRACE or any other capability beyond normal proc filesystem access

For container environments, envex must run in the same PID namespace as the target processes.
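
In code, these rules mean a scanner has to expect failures on individual PIDs and move on. The sketch below walks a proc-like directory, skips entries it cannot read, and ignores empty environments such as those of kernel threads. It runs against a throwaway directory so the behaviour is reproducible; the function name and the fake PIDs are illustrative.

```python
import os
import tempfile


def readable_environs(proc_root="/proc"):
    """Yield (pid, raw_bytes) for each process whose environ can be read."""
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, name, "environ"), "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # permission denied, or the process exited mid-scan
        if raw:  # kernel threads expose an empty environ
            yield int(name), raw


# Demo against a temporary directory laid out like /proc (hypothetical PIDs).
with tempfile.TemporaryDirectory() as fake_proc:
    os.makedirs(os.path.join(fake_proc, "123"))
    with open(os.path.join(fake_proc, "123", "environ"), "wb") as handle:
        handle.write(b"API_KEY=abc\0")
    os.makedirs(os.path.join(fake_proc, "456"))   # no environ file: skipped
    os.makedirs(os.path.join(fake_proc, "self"))  # non-numeric: ignored
    demo = list(readable_environs(fake_proc))

print(demo)
```

Catching `OSError` covers `PermissionError`, `FileNotFoundError`, and `ProcessLookupError` in one place, since a process can vanish between the directory listing and the read.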

Running Tests

cargo test

License

This project is licensed under the MIT License. See the LICENSE file for details.
