Description
CI Run Link: https://github.com/coder/coder/actions/runs/19331399689
Commit: 5bfbb0301f956a31e7fde53179a030153d3742e4 by david-fraley
Job: test-go-pg-17 (run_attempt=1) ended 2025-11-13T12:30:24Z; Slack alert at 12:32:01Z (same run, within minutes).
Failure summary
- Failing test: agent TestWriteFile/IsDir (subtest of TestWriteFile)
- Workflow: ci → test-go-pg-17
- Other matrix jobs: passed (no matrix cancellation artifact)
Key error evidence (from logs)
=== FAIL: agent TestWriteFile/IsDir (25.00s)
coderdtest.go:1607:
Error Trace: .../coderd/coderdtest/coderdtest.go:1607
.../agent/files_test.go:377
Error: Should be true
Test: TestWriteFile/IsDir
Messages: should be SDK error, got do request: Post "http://[fd7a:115c:a1e0:46f9:9895:fcb1:79a2:cc1b]:4/api/v0/write-file?path=/tmp/directory": context deadline exceeded
Additional context around the failure shows repeated tailnet/Tailscale coordination and pings, then the client dialing IPv6 addr with port :4, followed by timeouts:
client: dial tcp addr_port="[fd7a:115c:a1e0:46f9:9895:fcb1:79a2:cc1b]:4"
agent.net.tailnet.tcp: accepted connection ... dst=[fd7a:...]:4
The test subcase expected a 400 with message "is a directory" but instead the SDK request timed out, leading to coderdtest.SDKError asserting failure.
Root cause classification
- Type: Flaky test (network/timing). The agent connection/peer port appeared as :4 and the request to /api/v0/write-file timed out under the test's context deadline. No data race indicators, panics, or OOM found in the logs. Other tests finished successfully in the same job.
Ownership analysis (assignment)
Primary test function blame
- File: agent/files_test.go
- Function: TestWriteFile
- Location: lines 280–388 on commit 5bfbb03 (start/end located via grep)
Commands used:
# Locate function boundaries
grep -n "^func TestWriteFile(" agent/files_test.go # -> start ~280
grep -n "^func TestEditFiles(" agent/files_test.go # -> next func ~389
# Blame the function lines
git blame -L 280,388 agent/files_test.go
Recent commit history touching this file (for context):
- d5a02d570fc2 (Asher) feat: add coder_workspace_write_file MCP tool — introduced TestWriteFile
- 30330abaea64 (Asher) feat: add coder_workspace_edit_file MCP tool — added TestEditFiles; did not modify TestWriteFile body (line shift only)
- 4bf63b4068a1 (Asher) feat: add coder_workspace_read_file MCP tool
Based on this, the last meaningful changes to TestWriteFile appear to have been authored by Asher. If a different owner maintains the agent file I/O and tailnet test harness, please reassign accordingly.
Error analysis details
- Expected: 400 Bad Request with error containing "is a directory" for path=/tmp/directory
- Actual: network timeout while posting to the agent's write-file endpoint via the tailnet address; coderdtest.SDKError could not unwrap a codersdk error response.
- No signs of infra instability (DB/network outside the tailnet emulation) in this job; other packages passed.
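The mismatch between the expected and actual errors can be illustrated with a minimal Go sketch. It assumes the SDK-error assertion boils down to an errors.As check against a structured error type; apiError, isAPIError, expectedErr, and timeoutErr are hypothetical stand-ins written for this illustration, not the real codersdk or coderdtest code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// apiError is a hypothetical stand-in for a structured SDK error that
// carries an HTTP status code and a message from the server.
type apiError struct {
	StatusCode int
	Message    string
}

func (e *apiError) Error() string {
	return fmt.Sprintf("status %d: %s", e.StatusCode, e.Message)
}

// isAPIError reports whether err wraps an *apiError. This mirrors the kind
// of errors.As check an "should be SDK error" assertion would perform
// (an assumption about the helper, not its actual source).
func isAPIError(err error) bool {
	var target *apiError
	return errors.As(err, &target)
}

// expectedErr models what the subtest wants: the server answered with 400.
func expectedErr() error {
	return fmt.Errorf("write file: %w", &apiError{
		StatusCode: http.StatusBadRequest,
		Message:    "is a directory",
	})
}

// timeoutErr models what the flaky run saw: the request never produced a
// response, so only a transport-level deadline error exists.
func timeoutErr() error {
	return fmt.Errorf("do request: %w", context.DeadlineExceeded)
}

func main() {
	fmt.Println(isAPIError(expectedErr()))                         // true
	fmt.Println(isAPIError(timeoutErr()))                          // false
	fmt.Println(errors.Is(timeoutErr(), context.DeadlineExceeded)) // true
}
```

Under these assumptions, the assertion fails because no response was ever received, not because the handler returned the wrong status. That is consistent with classifying the failure as a timing flake rather than a logic bug in the write-file handler.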
Duplicates search (past and closed issues)
Searched coder/internal with multiple queries; no matches found:
- "TestWriteFile IsDir"
- "do request: Post "/api/v0/write-file"
- "TestWriteFile"
- Also checked closed issues in last 30 days for the above terms.
Proposed next steps
- Stabilize TestWriteFile/IsDir by ensuring the agent connection is fully ready before issuing write-file, or avoid reliance on transient peer API port discovery for this negative-path assertion.
- Investigate why the client attempted IPv6 with the :4 port; confirm peer API port propagation and readiness in the tailnet test harness.
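One possible shape for the "fully ready before issuing write-file" step is a bounded readiness poll, sketched below in Go. This is only an illustration under assumptions: waitReady, flakyProbe, and demo are hypothetical names, and the probe function stands in for whatever cheap reachability check the test harness can offer. It is not taken from the coder repository.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitReady calls probe repeatedly until it succeeds or ctx is done.
// It returns nil on the first successful probe, otherwise an error that
// wraps ctx.Err() and mentions the last probe failure.
func waitReady(ctx context.Context, interval time.Duration, probe func(context.Context) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		lastErr := probe(ctx)
		if lastErr == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("not ready: %w (last probe error: %v)", ctx.Err(), lastErr)
		case <-ticker.C:
			// Try again on the next tick.
		}
	}
}

// flakyProbe returns a hypothetical probe that fails its first `failures`
// calls and succeeds afterwards, imitating a peer that becomes reachable.
func flakyProbe(failures int) func(context.Context) error {
	calls := 0
	return func(context.Context) error {
		calls++
		if calls <= failures {
			return errors.New("peer not reachable yet")
		}
		return nil
	}
}

// demo runs waitReady with a timeout given in milliseconds.
func demo(failures, timeoutMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return waitReady(ctx, 5*time.Millisecond, flakyProbe(failures))
}

func main() {
	fmt.Println(demo(3, 1000) == nil)   // peer becomes ready within the deadline
	fmt.Println(demo(1<<30, 50) != nil) // peer never becomes ready; deadline hit
}
```

In a test, the negative-path request would be issued only after waitReady returns nil, so that the context budget for the assertion is not consumed by connection setup. Whether this fits the actual harness depends on what readiness signal it exposes.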
Reproduction
- Run the test-go-pg-17 job on main; the failure hits intermittently during agent package tests. Locally, run go test ./agent -run TestWriteFile/IsDir -count=100 to attempt to reproduce.
Related issues
- None found.