Commit 52d7298

hidai25 and claude committed
Add SSRF protection and prompt injection mitigation
Security improvements for open source release:

- Add SSRF protection to all HTTP adapters
  - Block private IP ranges (10.x, 172.16.x, 192.168.x)
  - Block localhost and loopback addresses
  - Block cloud metadata endpoints (169.254.169.254)
  - Configurable via allow_private_urls setting
- Add prompt injection mitigation to LLM-as-judge
  - Sanitize agent outputs before evaluation
  - Truncate long outputs to prevent token exhaustion
  - Escape common prompt delimiters
  - Add unique boundary markers for untrusted content
  - Harden system prompt with security instructions
- New security module: evalview/core/security.py
- Update SECURITY.md with documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 6c6fdf3 commit 52d7298

10 files changed: +535 -35 lines changed


SECURITY.md

Lines changed: 65 additions & 0 deletions
@@ -76,24 +76,89 @@ When using EvalView, please follow these security best practices:
 - **Review dependencies**: Use tools like `pip-audit` to check for known vulnerabilities
 - **Lock versions**: Use `requirements.txt` or `poetry.lock` to pin dependency versions
 
+## Built-in Security Features
+
+### SSRF (Server-Side Request Forgery) Protection
+
+EvalView includes built-in protection against SSRF attacks. By default in production mode, requests to the following destinations are blocked:
+
+- **Private IP ranges**: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
+- **Loopback addresses**: localhost, 127.0.0.0/8
+- **Cloud metadata endpoints**: 169.254.169.254 (AWS, GCP, Azure)
+- **Link-local addresses**: 169.254.0.0/16
+- **Internal hostnames**: kubernetes.default, metadata.google.internal
+
+#### Configuration
+
+For local development, SSRF protection allows private URLs by default. To enable strict mode in production:
+
+```yaml
+# .evalview/config.yaml
+allow_private_urls: false # Block private/internal networks (recommended for production)
+```
+
+#### Security Considerations
+
+- When running EvalView in production environments, set `allow_private_urls: false`
+- Be cautious when loading test cases from untrusted sources - they can specify arbitrary endpoints
+- Review test case YAML files before running them in sensitive environments
+
+### LLM Prompt Injection Mitigation
+
+The LLM-as-judge feature includes protections against prompt injection attacks:
+
+1. **Output Sanitization**: Agent outputs are sanitized before being sent to the LLM judge
+   - Long outputs are truncated (default: 10,000 chars) to prevent token exhaustion
+   - Control characters are removed
+   - Common prompt delimiters are escaped (```, ###, ---, XML tags, etc.)
+
+2. **Boundary Markers**: Untrusted content is wrapped in unique cryptographic boundary markers
+
+3. **Security Instructions**: The judge prompt explicitly instructs the LLM to:
+   - Ignore any instructions within the agent output
+   - Only evaluate content quality, not meta-instructions
+   - Not follow commands embedded in the evaluated content
+
+#### Limitations
+
+While these mitigations reduce risk, they cannot completely prevent sophisticated prompt injection attacks. Consider:
+
+- Agent outputs could still influence LLM evaluation through subtle manipulation
+- Very long outputs may be truncated, potentially hiding issues
+- New prompt injection techniques may bypass current protections
+
+For high-stakes evaluations, consider:
+- Manual review of agent outputs
+- Multiple evaluation models
+- Structured evaluation criteria that are harder to manipulate
+
 ## Known Security Considerations
 
 ### LLM-as-Judge Evaluation
 
 - EvalView uses OpenAI's API for output quality evaluation
 - Test outputs and expected outputs are sent to OpenAI for comparison
+- Agent outputs are sanitized to mitigate prompt injection, but no protection is 100% effective
 - **Recommendation**: Don't include sensitive/proprietary data in test cases if using LLM-as-judge
 
 ### HTTP Adapters
 
 - Custom HTTP adapters may expose your agent endpoints
+- SSRF protection is enabled by default but can be bypassed with `allow_private_urls: true`
 - **Recommendation**: Use authentication, HTTPS, and rate limiting on agent endpoints
 
 ### Trace Data
 
 - Execution traces may contain sensitive information from agent responses
 - **Recommendation**: Sanitize traces before sharing or storing long-term
 
+### Verbose Mode
+
+The `--verbose` flag may expose sensitive information in logs:
+- API request/response payloads
+- Query content and agent outputs
+- **Recommendation**: Avoid using verbose mode in production or when processing sensitive data
+
 ## Security Updates
 
 We will disclose security vulnerabilities through:
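
Note on the prompt injection mitigations documented in SECURITY.md: the actual implementation lives in `evalview/core/security.py`, which is not shown in this part of the diff. The following is only a minimal sketch of the three documented steps (truncate, strip control characters and escape delimiters, wrap in a random boundary). The function names `sanitize_output` and `wrap_untrusted`, the marker format, and the specific escaping choices are assumptions made for illustration, not the project's API.

```python
import re
import secrets
from typing import Tuple

MAX_OUTPUT_CHARS = 10_000  # default truncation length stated in SECURITY.md

# Control characters, excluding tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_output(text: str, max_chars: int = MAX_OUTPUT_CHARS) -> str:
    """Hypothetical sanitizer: truncate, drop control chars, defuse delimiters."""
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[truncated]"
    text = _CONTROL_CHARS.sub("", text)
    # Replace code fences so the output cannot close a fenced block in the prompt.
    text = text.replace("```", "'''")
    # Backslash-escape heading and rule markers (### or ---) at the start of a line.
    text = re.sub(
        r"^(#{3,}|-{3,})",
        lambda m: "\\" + m.group(0),
        text,
        flags=re.MULTILINE,
    )
    # Escape angle brackets so XML-style tags are treated as plain text.
    return text.replace("<", "&lt;").replace(">", "&gt;")


def wrap_untrusted(text: str) -> Tuple[str, str]:
    """Wrap sanitized text in a random boundary the agent cannot predict."""
    boundary = f"UNTRUSTED_{secrets.token_hex(16)}"
    wrapped = f"<<<{boundary}>>>\n{sanitize_output(text)}\n<<<END_{boundary}>>>"
    return wrapped, boundary
```

Because the boundary is generated per call with `secrets.token_hex`, an agent output cannot contain a matching closing marker in advance, and the judge prompt can refer to the marker by name when telling the model to treat the enclosed text as data.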

evalview/adapters/base.py

Lines changed: 35 additions & 2 deletions
@@ -1,12 +1,25 @@
 """Base agent adapter interface."""
 
 from abc import ABC, abstractmethod
-from typing import Any, Optional, Dict
+from typing import Any, Optional, Dict, Set
 from evalview.core.types import ExecutionTrace
+from evalview.core.security import validate_url, SSRFProtectionError
 
 
 class AgentAdapter(ABC):
-    """Abstract adapter for connecting to different agent frameworks."""
+    """Abstract adapter for connecting to different agent frameworks.
+
+    Security Note:
+        All adapters include SSRF (Server-Side Request Forgery) protection by default.
+        This prevents requests to internal networks, cloud metadata endpoints, and
+        other potentially dangerous destinations. Set `allow_private_urls=True` only
+        in trusted development environments.
+    """
+
+    # SSRF protection settings (can be overridden in subclasses or instances)
+    allow_private_urls: bool = False
+    allowed_hosts: Optional[Set[str]] = None
+    blocked_hosts: Optional[Set[str]] = None
 
     @property
     @abstractmethod
@@ -36,3 +49,23 @@ async def health_check(self) -> bool:
             True if agent is healthy, False otherwise
         """
         return True
+
+    def validate_endpoint(self, url: str) -> str:
+        """
+        Validate an endpoint URL for SSRF protection.
+
+        Args:
+            url: The URL to validate
+
+        Returns:
+            The validated URL
+
+        Raises:
+            SSRFProtectionError: If the URL fails security validation
+        """
+        return validate_url(
+            url,
+            allow_private=self.allow_private_urls,
+            allowed_hosts=self.allowed_hosts,
+            blocked_hosts=self.blocked_hosts,
+        )
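
Note on `validate_url`: its body is in `evalview/core/security.py` and does not appear in this part of the diff. As a rough sketch of what a check with this call signature could look like, the stand-in below blocks the ranges and hostnames listed in SECURITY.md using only the standard library. The precedence between `blocked_hosts`, `allowed_hosts`, and `allow_private` is an assumption, as are the added IPv6 loopback entry and the scheme restriction. It does not resolve DNS, so a public hostname that resolves to a private address would pass.

```python
import ipaddress
from typing import Optional, Set
from urllib.parse import urlparse


class SSRFProtectionError(ValueError):
    """Stand-in for the exception imported in base.py."""


# Ranges taken from SECURITY.md, plus IPv6 loopback (an added assumption).
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "127.0.0.0/8",
        "169.254.0.0/16",  # link-local, includes 169.254.169.254
        "::1/128",
    )
]
BLOCKED_HOSTNAMES = {"localhost", "kubernetes.default", "metadata.google.internal"}


def validate_url(
    url: str,
    allow_private: bool = False,
    allowed_hosts: Optional[Set[str]] = None,
    blocked_hosts: Optional[Set[str]] = None,
) -> str:
    """Return the URL unchanged if it passes the checks, else raise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise SSRFProtectionError(f"unsupported scheme: {parsed.scheme!r}")
    host = (parsed.hostname or "").lower()
    if not host:
        raise SSRFProtectionError("URL has no hostname")
    # Assumed precedence: explicit block list, then allow list, then defaults.
    if blocked_hosts and host in blocked_hosts:
        raise SSRFProtectionError(f"host is explicitly blocked: {host}")
    if (allowed_hosts and host in allowed_hosts) or allow_private:
        return url
    if host in BLOCKED_HOSTNAMES:
        raise SSRFProtectionError(f"internal hostname: {host}")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return url  # not an IP literal; this sketch does not resolve DNS
    if any(ip in net for net in BLOCKED_NETWORKS):
        raise SSRFProtectionError(f"private or internal address: {ip}")
    return url
```

Returning the URL rather than a boolean matches how `validate_endpoint` uses the result: the adapter can assign the return value directly to `self.endpoint`.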

evalview/adapters/crewai_adapter.py

Lines changed: 15 additions & 2 deletions
@@ -6,7 +6,7 @@
 import httpx
 import json
 from datetime import datetime
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Set
 import logging
 
 from evalview.adapters.base import AgentAdapter
@@ -27,6 +27,11 @@ class CrewAIAdapter(AgentAdapter):
     - tasks: List of task executions
     - result: Final crew output
     - usage_metrics: Token usage
+
+    Security Note:
+        SSRF protection is enabled by default. URLs targeting private/internal
+        networks will be rejected. Set `allow_private_urls=True` only in trusted
+        development environments.
     """
 
     def __init__(
@@ -36,8 +41,16 @@ def __init__(
         timeout: float = 120.0,  # CrewAI can be slow
         verbose: bool = False,
         model_config: Optional[Dict[str, Any]] = None,
+        allow_private_urls: bool = False,
+        allowed_hosts: Optional[Set[str]] = None,
     ):
-        self.endpoint = endpoint
+        # Set SSRF protection settings before validation
+        self.allow_private_urls = allow_private_urls
+        self.allowed_hosts = allowed_hosts
+
+        # Validate endpoint URL for SSRF protection
+        self.endpoint = self.validate_endpoint(endpoint)
+
         self.headers = headers or {"Content-Type": "application/json"}
         self.timeout = timeout
         self.verbose = verbose

evalview/adapters/http_adapter.py

Lines changed: 20 additions & 3 deletions
@@ -1,7 +1,7 @@
 """Generic HTTP adapter for REST API agents."""
 
 from datetime import datetime
-from typing import Any, Optional, Dict, List
+from typing import Any, Optional, Dict, List, Set
 import httpx
 import logging
 from evalview.adapters.base import AgentAdapter
@@ -18,14 +18,22 @@
 
 
 class HTTPAdapter(AgentAdapter):
-    """Generic HTTP adapter for REST API agents."""
+    """Generic HTTP adapter for REST API agents.
+
+    Security Note:
+        SSRF protection is enabled by default. URLs targeting private/internal
+        networks will be rejected. Set `allow_private_urls=True` only in trusted
+        development environments.
+    """
 
     def __init__(
         self,
         endpoint: str,
         headers: Optional[Dict[str, str]] = None,
         timeout: float = 30.0,
         model_config: Optional[Dict[str, Any]] = None,
+        allow_private_urls: bool = False,
+        allowed_hosts: Optional[Set[str]] = None,
     ):
         """
         Initialize HTTP adapter.
@@ -35,8 +43,17 @@ def __init__(
             headers: Optional HTTP headers
             timeout: Request timeout in seconds
             model_config: Model configuration with name and optional custom pricing
+            allow_private_urls: If True, allow requests to private/internal networks
+                (default: False for security)
+            allowed_hosts: Optional set of explicitly allowed hostnames
         """
-        self.endpoint = endpoint
+        # Set SSRF protection settings before validation
+        self.allow_private_urls = allow_private_urls
+        self.allowed_hosts = allowed_hosts
+
+        # Validate endpoint URL for SSRF protection
+        self.endpoint = self.validate_endpoint(endpoint)
+
         self.headers = headers or {}
         self.timeout = timeout
         self.model_config = model_config or {}
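
Note on the constructor ordering used in each adapter: the instance settings are assigned before `validate_endpoint` is called, so validation sees the caller's values rather than the class-level defaults, and a disallowed endpoint fails at construction time instead of on the first request. The sketch below reproduces only that pattern. `SketchAdapterBase`, `SketchHTTPAdapter`, and the two-host block list are hypothetical stand-ins, not EvalView classes.

```python
from typing import Optional, Set
from urllib.parse import urlparse


class SSRFProtectionError(ValueError):
    """Stand-in for the exception raised by the real validator."""


class SketchAdapterBase:
    # Class-level defaults, mirroring the attributes added to AgentAdapter.
    allow_private_urls: bool = False
    allowed_hosts: Optional[Set[str]] = None

    def validate_endpoint(self, url: str) -> str:
        # Deliberately tiny check; it only knows about two loopback names.
        host = (urlparse(url).hostname or "").lower()
        if self.allowed_hosts and host in self.allowed_hosts:
            return url
        if not self.allow_private_urls and host in {"localhost", "127.0.0.1"}:
            raise SSRFProtectionError(f"blocked host: {host}")
        return url


class SketchHTTPAdapter(SketchAdapterBase):
    def __init__(
        self,
        endpoint: str,
        allow_private_urls: bool = False,
        allowed_hosts: Optional[Set[str]] = None,
    ):
        # Settings first: validate_endpoint reads them from the instance.
        self.allow_private_urls = allow_private_urls
        self.allowed_hosts = allowed_hosts
        # Raises here, before any request is made, if the endpoint is blocked.
        self.endpoint = self.validate_endpoint(endpoint)


# A local agent endpoint needs an explicit opt-in.
adapter = SketchHTTPAdapter("http://localhost:8000/invoke", allow_private_urls=True)
```

If the two assignments came after the `validate_endpoint` call, the opt-in above would be ignored and the constructor would raise, because the lookup would fall back to the class default of `False`.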

evalview/adapters/langgraph_adapter.py

Lines changed: 15 additions & 2 deletions
@@ -6,7 +6,7 @@
 import httpx
 import json
 from datetime import datetime
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Set
 import logging
 
 from evalview.adapters.base import AgentAdapter
@@ -31,6 +31,11 @@ class LangGraphAdapter(AgentAdapter):
     Response formats:
     - {"messages": [...], "steps": [...]}
     - Streaming: data: {"type": "step", "content": "...", ...}
+
+    Security Note:
+        SSRF protection is enabled by default. URLs targeting private/internal
+        networks will be rejected. Set `allow_private_urls=True` only in trusted
+        development environments.
     """
 
     def __init__(
@@ -43,8 +48,16 @@ def __init__(
         model_config: Optional[Dict[str, Any]] = None,
         assistant_id: Optional[str] = None,
         use_cloud_api: Optional[bool] = None,  # Auto-detect if None
+        allow_private_urls: bool = False,
+        allowed_hosts: Optional[Set[str]] = None,
     ):
-        self.endpoint = endpoint
+        # Set SSRF protection settings before validation
+        self.allow_private_urls = allow_private_urls
+        self.allowed_hosts = allowed_hosts
+
+        # Validate endpoint URL for SSRF protection
+        self.endpoint = self.validate_endpoint(endpoint)
+
         self.headers = headers or {"Content-Type": "application/json"}
         self.timeout = timeout
         self.streaming = streaming

evalview/adapters/tapescope_adapter.py

Lines changed: 18 additions & 2 deletions
@@ -1,7 +1,7 @@
 """Custom adapter for TapeScope streaming API and other streaming agents."""
 
 from datetime import datetime
-from typing import Any, Optional, Dict
+from typing import Any, Optional, Dict, Set
 import httpx
 import json
 import logging
@@ -46,6 +46,11 @@ class TapeScopeAdapter(AgentAdapter):
     - LangServe streaming endpoints
     - Custom streaming agents
     - Any JSONL-based API
+
+    Security Note:
+        SSRF protection is enabled by default. URLs targeting private/internal
+        networks will be rejected. Set `allow_private_urls=True` only in trusted
+        development environments.
     """
 
     def __init__(
@@ -55,6 +60,8 @@ def __init__(
         timeout: float = 60.0,
         verbose: bool = False,
         model_config: Optional[Dict[str, Any]] = None,
+        allow_private_urls: bool = False,
+        allowed_hosts: Optional[Set[str]] = None,
     ):
         """
         Initialize streaming adapter.
@@ -65,8 +72,17 @@ def __init__(
             timeout: Request timeout in seconds
             verbose: Enable verbose logging (overrides DEBUG env var)
             model_config: Model configuration with name and optional custom pricing
+            allow_private_urls: If True, allow requests to private/internal networks
+                (default: False for security)
+            allowed_hosts: Optional set of explicitly allowed hostnames
         """
-        self.endpoint = endpoint
+        # Set SSRF protection settings before validation
+        self.allow_private_urls = allow_private_urls
+        self.allowed_hosts = allowed_hosts
+
+        # Validate endpoint URL for SSRF protection
+        self.endpoint = self.validate_endpoint(endpoint)
+
         self.headers = headers or {}
         self.timeout = timeout
         self.verbose = verbose or os.getenv("DEBUG") == "1"
