Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,28 @@ async def process_data(data: dict) -> Result:
raise ValidationError(f"Missing required field: {e}") from e
```

**SQL Safety patterns**:
- NEVER use string formatting for SQL queries (f-strings, .format(), string concatenation)
- ALWAYS use parameterized queries with `$1, $2, $3` placeholders
- Use `SafeQueryBuilder` from `agents_api.queries.sql_utils` for complex dynamic queries
- Use `safe_format_query` for simple queries with validated ORDER BY clauses
- Validate all identifiers with `sanitize_identifier()` when needed
- Whitelist allowed sort fields and directions

Example:
```python
from agents_api.queries.sql_utils import SafeQueryBuilder

# Good - using SafeQueryBuilder
builder = SafeQueryBuilder("SELECT * FROM agents WHERE developer_id = $1", [dev_id])
builder.add_condition(" AND status = {}", status)
builder.add_order_by(sort_field, direction, allowed_fields={"created_at", "name"})
query, params = builder.build()

# Bad - NEVER do this
query = f"SELECT * FROM agents WHERE status = '{status}' ORDER BY {sort_field}"
```

---

## 4. Project layout & Core Components
Expand Down Expand Up @@ -213,6 +235,7 @@ async def create_entry(
* Large AI refactors in a single commit (makes `git bisect` difficult).
* Delegating test/spec writing entirely to AI (can lead to false confidence).
* **Note about `src/`**: Only the `cli` component has a `src/` directory. For `agents-api`, code is directly in `agents_api/`. Follow the existing pattern for each component.
* **SQL Injection vulnerabilities**: Using string formatting (f-strings, .format(), %) for SQL queries instead of parameterized queries and the SQL safety utilities in `agents_api/queries/sql_utils.py`.

---

Expand Down
21 changes: 21 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,27 @@ To get a comprehensive understanding of Julep, we recommend exploring the codeba
- Add new tests for new functionality
- Ensure all tests pass before submitting your changes

### Security Guidelines

When contributing code that interacts with databases:

1. **SQL Injection Prevention**
- NEVER use string formatting for SQL queries (f-strings, .format(), %)
- ALWAYS use the SQL safety utilities from `agents-api/agents_api/queries/sql_utils.py`
- Use `SafeQueryBuilder` for complex queries with dynamic conditions
- Use `safe_format_query` for simple queries with ORDER BY clauses
- All user inputs must be parameterized using placeholder syntax ($1, $2, etc.)

2. **Input Validation**
- Validate and sanitize all user inputs before using in queries
- Use whitelisting for sort fields and directions
- Check identifier patterns with `sanitize_identifier()`

3. **Testing Security**
- Add tests for any new query functions that handle user input
- See `agents-api/tests/test_sql_injection_prevention.py` for examples
- Ensure your code passes SQL injection prevention tests

5. **Submit a Pull Request**
- Provide a clear description of your changes
- Reference any related issues
Expand Down
32 changes: 32 additions & 0 deletions agents-api/AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,3 +75,35 @@ Key Uses
- Expression validation checks syntax, undefined names, unsafe operations
- Task validation checks all expressions in workflow steps
- Security: Sandbox with limited function/module access

## SQL Safety Requirements
- ALWAYS use the SQL utilities from `agents_api/queries/sql_utils.py`
- NEVER use string formatting (f-strings, .format(), %) for SQL queries
- Use `SafeQueryBuilder` for complex queries with dynamic WHERE, ORDER BY, LIMIT
- Use `safe_format_query` for simple ORDER BY validation
- All sort fields must be whitelisted explicitly
- Run SQL injection tests: `poe test --search "sql_injection_prevention"`

### Example Usage
```python
# For complex queries
from agents_api.queries.sql_utils import SafeQueryBuilder

builder = SafeQueryBuilder("SELECT * FROM docs WHERE developer_id = $1", [dev_id])
if metadata_filter:
for key, value in metadata_filter.items():
builder.add_condition(" AND metadata->>{}::text = {}", key, str(value))
builder.add_order_by("created_at", "desc", allowed_fields={"created_at", "updated_at"})
builder.add_limit_offset(limit, offset)
query, params = builder.build()

# For simple ORDER BY queries
from agents_api.queries.sql_utils import safe_format_query

query = safe_format_query(
"SELECT * FROM entries ORDER BY {sort_by} {direction}",
sort_by="created_at",
direction="desc",
allowed_sort_fields={"created_at", "timestamp"}
)
```
4 changes: 4 additions & 0 deletions agents-api/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,10 @@ The `agents-api` project serves as the foundation of the agent management system

The `models` module encapsulates all data interactions with the CozoDB database, providing a structured way to perform CRUD operations and other specific data manipulations across various entities.

### Queries

The `queries` module contains database query builders organized by resource types. It includes critical SQL safety utilities in `sql_utils.py` to prevent SQL injection attacks through parameterized queries and input validation.

### Routers

The `routers` module handles HTTP routing for different parts of the application, directing incoming HTTP requests to the appropriate handler functions.
Expand Down
51 changes: 51 additions & 0 deletions agents-api/agents_api/queries/AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ Key Points
- Add new queries in `queries/` and index them if needed.
- Tests reside under `agents-api/tests/`.
- Add `AIDEV-NOTE` anchors at the top of query modules to clarify module purpose.
- **ALWAYS use SQL safety utilities** from `sql_utils.py` to prevent SQL injection.

# Queries

Expand Down Expand Up @@ -70,3 +71,53 @@ Key Points
- `prepare_execution_input`: Builds inputs for Temporal workflows
- `create_execution_transition`: Records state changes in executions
- `search_docs_hybrid`: Combined embedding and text search

## SQL Safety and Security

### SQL Injection Prevention
All queries MUST use the utilities from `sql_utils.py` to prevent SQL injection attacks:

1. **SafeQueryBuilder**: For complex queries with dynamic conditions
```python
from ..sql_utils import SafeQueryBuilder

builder = SafeQueryBuilder(base_query, initial_params)
builder.add_condition(" AND status = {}", status_value)
builder.add_order_by("created_at", "desc", allowed_fields={"created_at", "updated_at"})
builder.add_limit_offset(limit, offset)
query, params = builder.build()
```

2. **safe_format_query**: For simple queries with ORDER BY clauses
```python
from ..sql_utils import safe_format_query

query = safe_format_query(
query_template,
sort_by="created_at",
direction="desc",
allowed_sort_fields={"created_at", "updated_at", "name"},
table_prefix="t."
)
```

3. **validate_sort_field** and **validate_sort_direction**: For validating user inputs
- Uses whitelisting approach
- Validates against SQL identifier patterns
- Prevents SQL keyword injection

### Never Do This
- ❌ NEVER use f-strings for SQL: `f"SELECT * FROM {table} WHERE {field} = '{value}'"`
- ❌ NEVER use .format() for SQL: `"SELECT * FROM {} WHERE {}".format(table, condition)`
- ❌ NEVER concatenate user input into SQL: `query + " ORDER BY " + user_input`

### Always Do This
- ✅ Use parameterized queries: `$1, $2, $3` placeholders
- ✅ Use SafeQueryBuilder for complex dynamic queries
- ✅ Validate all identifiers with sanitize_identifier()
- ✅ Whitelist allowed sort fields and directions

### Testing
- SQL injection prevention tests are in `tests/test_sql_injection_prevention.py`
- Run with: `poe test --search "sql_injection_prevention"`
- 27 comprehensive tests covering various attack vectors
7 changes: 3 additions & 4 deletions agents-api/agents_api/queries/agents/list_agents.py
Original file line number Diff line number Diff line change
Expand Up @@ -89,10 +89,9 @@ async def list_agents(
direction,
]

# Handle metadata filter differently - using JSONB containment
agent_query = raw_query.format(
metadata_filter_query="AND a.metadata @> $6::jsonb" if metadata_filter else "",
)
# AIDEV-NOTE: Build metadata filter query safely to prevent SQL injection
metadata_filter_query = "AND a.metadata @> $6::jsonb" if metadata_filter else ""
agent_query = raw_query.replace("{metadata_filter_query}", metadata_filter_query)

# If we have metadata filters, safely add them as a parameter
if metadata_filter:
Expand Down
9 changes: 7 additions & 2 deletions agents-api/agents_api/queries/docs/bulk_delete_docs.py
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,9 @@ async def bulk_delete_docs(
params, metadata_filter if not delete_all else {}, table_alias="d."
)

query = f"""
# AIDEV-NOTE: Build query with proper parameter placeholders to avoid SQL injection
query = (
"""
WITH deleted_docs AS (
DELETE FROM docs d
WHERE d.developer_id = $1
Expand All @@ -63,7 +65,9 @@ async def bulk_delete_docs(
AND o.developer_id = d.developer_id
AND o.owner_type = $2
AND o.owner_id = $3
{metadata_conditions}
"""
+ metadata_conditions
+ """
)
RETURNING d.doc_id
),
Expand All @@ -76,5 +80,6 @@ async def bulk_delete_docs(
)
SELECT doc_id FROM deleted_docs;
"""
)

return query, params
31 changes: 16 additions & 15 deletions agents-api/agents_api/queries/docs/list_docs.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,8 @@

from ...autogen.openapi_model import Doc
from ...common.utils.db_exceptions import common_db_exceptions
from ..sql_utils import SafeQueryBuilder
from ..utils import (
build_metadata_filter_conditions,
make_num_validator,
pg_query,
rewrap_exceptions,
Expand Down Expand Up @@ -105,18 +105,19 @@ async def list_docs(

# AIDEV-NOTE: avoid mutable default; initialize metadata_filter
metadata_filter = metadata_filter if metadata_filter is not None else {}
# Start with the base query
query = base_docs_query
params = [developer_id, include_without_embeddings, owner_type, owner_id]

# Add metadata filtering before GROUP BY using the utility function with table alias
metadata_conditions, params = build_metadata_filter_conditions(
params, metadata_filter, table_alias="d."
# Build query using SafeQueryBuilder to prevent SQL injection
builder = SafeQueryBuilder(
base_docs_query, [developer_id, include_without_embeddings, owner_type, owner_id]
)
query += metadata_conditions

# Add metadata filtering using SafeQueryBuilder's condition system
if metadata_filter:
for key, value in metadata_filter.items():
builder.add_condition(" AND d.metadata->>{}::text = {}", key, str(value))

# Add GROUP BY clause
query += """
builder.add_raw_condition("""
GROUP BY
d.doc_id,
d.developer_id,
Expand All @@ -126,12 +127,12 @@ async def list_docs(
d.embedding_dimensions,
d.language,
d.metadata,
d.created_at"""
d.created_at""")

# Add sorting and pagination
query += (
f" ORDER BY {sort_by} {direction} LIMIT ${len(params) + 1} OFFSET ${len(params) + 2}"
# Add sorting and pagination with validation
builder.add_order_by(
sort_by, direction, allowed_fields={"created_at", "updated_at"}, table_prefix=""
)
params.extend([limit, offset])
builder.add_limit_offset(limit, offset)

return query, params
return builder.build()
10 changes: 8 additions & 2 deletions agents-api/agents_api/queries/entries/list_entries.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
from ...common.utils.datetime import utcnow
from ...common.utils.db_exceptions import common_db_exceptions
from ...metrics.counters import query_metrics
from ..sql_utils import safe_format_query
from ..utils import make_num_validator, pg_query, rewrap_exceptions, wrap_in_class

# Query for checking if the session exists
Expand Down Expand Up @@ -43,7 +44,7 @@
AND (er.relation IS NULL OR er.relation != ALL($6))
AND e.created_at >= $7
AND e.created_at >= (select created_at from sessions where session_id = $1 LIMIT 1)
ORDER BY e.{sort_by} {direction} -- safe to interpolate
ORDER BY e.{sort_by} {direction}
LIMIT $3
OFFSET $4;
"""
Expand Down Expand Up @@ -96,9 +97,14 @@ async def list_entries(
allowed_sources if allowed_sources is not None else ["api_request", "api_response"]
)
exclude_relations = exclude_relations if exclude_relations is not None else []
query = list_entries_query.format(

# AIDEV-NOTE: Use safe_format_query to prevent SQL injection in ORDER BY clause
query = safe_format_query(
list_entries_query,
sort_by=sort_by,
direction=direction,
allowed_sort_fields={"created_at", "timestamp"},
table_prefix="",
)

# Parameters for the entry query
Expand Down
26 changes: 12 additions & 14 deletions agents-api/agents_api/queries/files/list_files.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@

from ...autogen.openapi_model import File
from ...common.utils.db_exceptions import common_db_exceptions
from ..sql_utils import SafeQueryBuilder
from ..utils import make_num_validator, pg_query, rewrap_exceptions, wrap_in_class

# Base query for listing files
Expand Down Expand Up @@ -62,25 +63,22 @@ async def list_files(
"""
Lists files with optional owner and project filtering, pagination, and sorting.
"""
# Start with the base query
query = base_files_query
params = [developer_id]
param_index = 2
# Build query using SafeQueryBuilder to prevent SQL injection
builder = SafeQueryBuilder(base_files_query, [developer_id])

# Add owner filtering
if owner_type and owner_id:
query += f" AND fo.owner_type = ${param_index} AND fo.owner_id = ${param_index + 1}"
params.extend([owner_type, owner_id])
param_index += 2
builder.add_condition(" AND fo.owner_type = {}", owner_type)
builder.add_condition(" AND fo.owner_id = {}", owner_id)

# Add project filtering
if project:
query += f" AND p.canonical_name = ${param_index}"
params.append(project)
param_index += 1
builder.add_condition(" AND p.canonical_name = {}", project)

# Add sorting and pagination
query += f" ORDER BY f.{sort_by} {direction} LIMIT ${param_index} OFFSET ${param_index + 1}"
params.extend([limit, offset])
# Add sorting and pagination with validation
builder.add_order_by(
sort_by, direction, allowed_fields={"created_at", "updated_at"}, table_prefix="f."
)
builder.add_limit_offset(limit, offset)

return query, params
return builder.build()
Loading
Loading