LLM completion logs are trapped in Docker container when using RemoteConversation #1158

@xingyaoww

Description

When setting log_completions=True for an LLM instance in a RemoteConversation (e.g., when using DockerWorkspace), the completion logs are written inside the Docker container and are not accessible from the client side. This makes it difficult to debug and analyze LLM behavior in remote execution scenarios.

Root Cause Analysis

The issue stems from the architecture of RemoteConversation:

  1. Client-side LLM configuration: When creating an LLM with log_completions=True, the log_completions_folder defaults to os.path.join(ENV_LOG_DIR, "completions") where ENV_LOG_DIR = os.getenv("LOG_DIR", "logs") (evaluated on the client machine).

  2. Serialization and transmission: When creating a RemoteConversation, the agent (containing the LLM) is serialized via agent.model_dump(mode="json", context={"expose_secrets": True}) and sent to the agent server inside the Docker container (remote_conversation.py:466-468).

  3. Server-side logging: The LLM is deserialized and instantiated on the server side (inside Docker), where the actual logging occurs via the Telemetry class (telemetry.py:86-87, 215-303).

  4. Inaccessible logs: The logs end up in the container's filesystem at the path that was evaluated on the client side, making them inaccessible from the client.
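
A minimal, stdlib-only sketch of the failure mode (the field names follow the code references below; the payload shape is illustrative):

import os

# 1. Client side: the default folder is resolved against the *client* environment.
ENV_LOG_DIR = os.getenv("LOG_DIR", "logs")
log_completions_folder = os.path.join(ENV_LOG_DIR, "completions")  # e.g. "logs/completions"

# 2. That already-resolved string is what ends up in the serialized agent payload ...
payload = {"llm": {"log_completions": True, "log_completions_folder": log_completions_folder}}

# 3. ... so the server inside the container writes to "logs/completions", which
#    resolves against the container's filesystem, not the client's.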

Code References

  • LLM logging configuration: openhands-sdk/openhands/sdk/llm/llm.py:207-214
  • Telemetry logging: openhands-sdk/openhands/sdk/llm/utils/telemetry.py:215-303
  • Agent serialization: openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py:466-468
  • Log directory default: openhands-sdk/openhands/sdk/logger/logger.py:36

Example to Reproduce

import os
from pydantic import SecretStr
from openhands.sdk import LLM, Conversation
from openhands.tools.preset.default import get_default_agent
from openhands.workspace import DockerWorkspace

api_key = os.getenv("LLM_API_KEY")

# Configure LLM with log_completions enabled
llm = LLM(
    usage_id="agent",
    model="anthropic/claude-sonnet-4-5-20250929",
    api_key=SecretStr(api_key),
    log_completions=True,  # ← This won't work as expected!
)

with DockerWorkspace(
    base_image="nikolaik/python-nodejs:python3.12-nodejs22",
    host_port=8010,
) as workspace:
    agent = get_default_agent(llm=llm, cli_mode=True)
    conversation = Conversation(agent=agent, workspace=workspace)
    
    conversation.send_message("Write hello world in Python")
    conversation.run()
    
    # Expected: Completion logs in ./logs/completions/
    # Actual: Logs are inside the Docker container, not accessible from client
    conversation.close()

Proposed Solutions

Solution 1: Stream Completion Logs via WebSocket Events (Recommended)

Approach: Create a new event type LLMCompletionLogEvent that streams log data back to the client through the existing WebSocket connection.

Pros:

  • Works seamlessly with existing RemoteConversation architecture
  • Real-time streaming of logs as they're generated
  • No infrastructure changes required
  • Preserves the user's intent behind the client-side log directory configuration

Cons:

  • Requires implementation of a new event type
  • Slightly increases WebSocket traffic
  • May need buffering/compression for large completion logs

Implementation outline:

  1. Add LLMCompletionLogEvent to event types
  2. Modify Telemetry.log_llm_call() to detect if running in a remote context
  3. If remote, emit log data as an event instead of writing to file
  4. Client-side callback receives the event and writes to local filesystem

Complexity: Medium
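
Below is a rough sketch of what the new event and the client-side handling could look like. The class name, fields, and callback signature are assumptions for illustration, not the existing SDK API:

import json
import os

from pydantic import BaseModel


class LLMCompletionLogEvent(BaseModel):  # hypothetical event type
    kind: str = "llm_completion_log"
    usage_id: str   # which LLM instance produced the completion
    filename: str   # file name the server-side Telemetry would have written
    log_data: dict  # serialized request/response payload


def write_completion_log_locally(event, log_dir: str = "logs/completions") -> None:
    """Client-side callback: persist a streamed completion log to local disk."""
    if not isinstance(event, LLMCompletionLogEvent):
        return
    os.makedirs(log_dir, exist_ok=True)
    with open(os.path.join(log_dir, event.filename), "w") as f:
        json.dump(event.log_data, f, indent=2)

Since RemoteConversation already forwards events to client-side callbacks over the WebSocket, a callback like the one above could write the streamed logs into the locally configured log_completions_folder.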


Solution 2: Post-Execution Log Retrieval API

Approach: Add REST API endpoints to retrieve completion logs from the server after execution.

Pros:

  • Simpler implementation (no new event types)
  • Logs remain on server for debugging server-side issues
  • Can retrieve logs at any time, not just during execution

Cons:

  • Not real-time (logs only available after completion)
  • Requires additional API calls
  • Logs accumulate on server until explicitly retrieved/cleaned up
  • Requires storage management strategy

Implementation outline:

  1. Add GET /api/conversations/{conversation_id}/logs endpoint
  2. Server maintains log files indexed by conversation ID
  3. Client calls endpoint to download logs after execution
  4. Add cleanup mechanism for old log files

Complexity: Low-Medium
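
A rough sketch of the retrieval endpoints, assuming the agent server is a FastAPI app and logs are stored per conversation (paths and storage layout are illustrative):

import os

from fastapi import FastAPI, HTTPException
from fastapi.responses import FileResponse

app = FastAPI()
LOG_ROOT = os.getenv("LOG_DIR", "logs")


@app.get("/api/conversations/{conversation_id}/logs")
def list_completion_logs(conversation_id: str) -> list[str]:
    """List the completion log files recorded for a conversation."""
    log_dir = os.path.join(LOG_ROOT, "completions", conversation_id)
    if not os.path.isdir(log_dir):
        raise HTTPException(status_code=404, detail="No logs for this conversation")
    return sorted(os.listdir(log_dir))


@app.get("/api/conversations/{conversation_id}/logs/{filename}")
def download_completion_log(conversation_id: str, filename: str) -> FileResponse:
    """Download a single completion log file."""
    path = os.path.join(LOG_ROOT, "completions", conversation_id, os.path.basename(filename))
    if not os.path.isfile(path):
        raise HTTPException(status_code=404, detail="Log file not found")
    return FileResponse(path, media_type="application/json")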


Solution 3: Docker Volume Mounting

Approach: Allow users to mount a host directory into the Docker container for shared log access.

Pros:

  • Simple concept (standard Docker practice)
  • Logs immediately accessible on host filesystem
  • No code changes to logging infrastructure

Cons:

  • Requires infrastructure/configuration changes
  • Only works for Docker deployments (not other remote execution models)
  • Path coordination between client and container is error-prone
  • Security implications of mounting host directories

Implementation outline:

  1. Add log_volume_mount parameter to DockerWorkspace
  2. Configure Docker container to mount host directory
  3. Ensure LLM uses the mounted path

Complexity: Low (Docker-specific)
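
A sketch of how the proposed parameter might be used (log_volume_mount does not exist today; the name and tuple shape are illustrative):

import os

from openhands.workspace import DockerWorkspace  # same import as in the reproduce example

host_log_dir = os.path.abspath("./logs")

with DockerWorkspace(
    base_image="nikolaik/python-nodejs:python3.12-nodejs22",
    host_port=8010,
    log_volume_mount=(host_log_dir, "/openhands/logs"),  # hypothetical: (host_path, container_path)
) as workspace:
    # The workspace would bind-mount host_log_dir into the container and export
    # LOG_DIR=/openhands/logs there, so the container-side LLM writes its
    # completion logs straight onto the host filesystem.
    ...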


Solution 4: Disable Remote Logging with Warning

Approach: Detect when LLM is being used in a remote context and automatically disable log_completions with a warning.

Pros:

  • Prevents user confusion
  • No incorrect behavior
  • Simple to implement

Cons:

  • Doesn't actually solve the problem
  • Removes functionality rather than fixing it
  • Users still can't get completion logs in remote scenarios

Implementation outline:

  1. Add remote execution context detection to LLM
  2. Override log_completions=False when serializing for remote execution
  3. Emit warning to user

Complexity: Low (workaround, not a solution)
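
A sketch of the short-term guard (the helper name and payload shape are assumptions; the real change would live in the remote serialization path in remote_conversation.py):

import warnings


def strip_remote_completion_logging(agent_payload: dict) -> dict:
    """Disable log_completions in a serialized agent payload and warn the user."""
    llm_cfg = agent_payload.get("llm", {})
    if llm_cfg.get("log_completions"):
        warnings.warn(
            "log_completions=True is not yet supported with RemoteConversation; "
            "completion logs would be written inside the sandbox. Disabling it.",
            UserWarning,
        )
        llm_cfg = {**llm_cfg, "log_completions": False}
    return {**agent_payload, "llm": llm_cfg}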


Solution 5: Client-Side Interception and Logging

Approach: Intercept LLM calls on the client side before they're sent to the remote server, and log them client-side.

Pros:

  • Logs on client where user expects them
  • Works for any remote execution model

Cons:

  • Architectural complexity (splits LLM execution from logging)
  • May miss server-side retries or modifications
  • Duplicates logic between client and server
  • Doesn't capture server-side processing details

Complexity: High


Recommendation

I recommend implementing Solution 1 (Stream Completion Logs via WebSocket Events) as the primary solution because:

  1. It aligns with the existing RemoteConversation event-driven architecture
  2. It preserves the user's intent when setting log_completions=True
  3. It works for all remote execution models (not just Docker)
  4. It provides real-time access to logs during execution
  5. The implementation complexity is manageable

Solution 2 (Post-Execution Log Retrieval API) could be implemented as a complementary feature for scenarios where users want to retrieve historical logs.

Solution 4 (Disable with Warning) should be implemented immediately as a short-term fix to prevent user confusion while the proper solution is being developed.

Additional Considerations

  • Similar issues may exist for other file-based outputs (e.g., agent logs and workspace files that need special handling)
  • The solution should be consistent with how other remote execution artifacts are handled
  • Need to consider log size limits and potential WebSocket message size constraints
  • Should support both streaming (real-time) and batch (post-execution) retrieval patterns

Related Files

  • openhands-sdk/openhands/sdk/llm/llm.py
  • openhands-sdk/openhands/sdk/llm/utils/telemetry.py
  • openhands-sdk/openhands/sdk/conversation/impl/remote_conversation.py
  • openhands-sdk/openhands/sdk/event/base.py
  • openhands-agent-server/openhands/agent_server/conversation_service.py
