Writing tests to break the server #12
Open
gwbischof wants to merge 33 commits into main from more_tests
Conversation
…gaps Learned: Focusing on lightweight edge cases revealed actual crashes better than resource-intensive stress tests.
… server vulnerabilities Learned: Using pytest.mark.timeout prevented hanging tests and allowed systematic testing of edge cases.
…nt indefinite blocking Learned: Selective timeout application only on hanging tests maintains test efficiency while ensuring robust execution.
…ging behavior Learned: TODOs should focus on actionable bugs rather than design choices to maintain clear development priorities.
…uding crashes, hangs, and race conditions Learned: Removing validation tests and focusing only on tests that expose real bugs creates a more valuable test suite than broadly testing edge cases.
…suite from 12 to 20 comprehensive tests that all fail when server has bugs Learned: Moving imports to module level and removing defensive exception handling makes tests more maintainable and clearly exposes server issues.
…t suite on 18 actual bugs requiring fixes Learned: Keeping only failing tests that expose real bugs makes the test suite more actionable and prevents confusion about what needs to be fixed.
…tionality and expected fixes. Learned: Merging tests by server code path and creating separate files by functional area makes the test suite more maintainable than one large file.
…oint that returns proper 400 responses. Learned: Testing failure first, then implementing the fix, then verifying success creates a clear development cycle that ensures the fix actually works.
… attacks with 16MB payload, 8KB header, and 1MB WebSocket frame limits. Learned: Adding size limits early in the request pipeline prevents resource exhaustion while maintaining performance for legitimate requests.
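A minimal sketch of the early size checks this commit describes. The limits come from the commit message, but the helper name, signature, and status-code mapping are illustrative assumptions rather than the server's actual code:

```python
# Limits quoted from the commit message; everything else is a sketch.
MAX_BODY_BYTES = 16 * 1024 * 1024      # 16 MB request payload
MAX_HEADER_BYTES = 8 * 1024            # 8 KB of request headers
MAX_WS_FRAME_BYTES = 1 * 1024 * 1024   # 1 MB WebSocket frame

def check_request_size(content_length, header_bytes):
    """Reject oversized requests early in the pipeline.

    Returns a (status, reason) rejection, or None to accept. Running this
    before the body is read prevents resource exhaustion without adding
    cost to legitimate requests.
    """
    if header_bytes > MAX_HEADER_BYTES:
        return (431, "Request Header Fields Too Large")
    if content_length is not None and content_length > MAX_BODY_BYTES:
        return (413, "Payload Too Large")
    return None
```

Checking the declared content-length up front means the server can refuse a 16 MB-plus upload without ever buffering it.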
…n real client-accessible bugs only. Learned: Tests should only validate scenarios that clients can actually trigger through public APIs rather than artificial edge cases requiring direct database access.
…een bugs and their solutions. Learned: Linking fixes directly to the tests that expose the bugs improves code maintainability and helps future developers understand the purpose of each fix.
…r 400 responses while letting server errors bubble up as 500. Learned: Catching overly broad exception types can mask serious server issues and mislead users about the actual cause of errors.
…s and improve user experience with clear, actionable error descriptions. Learned: Raw exception details should never be exposed to clients as they leak implementation details and provide potential attack surface information.
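The two commits above, narrow exception handling and sanitized error details, can be sketched together. `HTTPError` and `parse_body` are hypothetical stand-ins for the server's actual framework types:

```python
import json

class HTTPError(Exception):
    """Hypothetical stand-in for the framework's client-error exception."""
    def __init__(self, status, detail):
        super().__init__(detail)
        self.status = status
        self.detail = detail

def parse_body(raw):
    """Parse a JSON request body, mapping only client mistakes to a 400.

    Only json.JSONDecodeError is caught, so genuine server bugs still
    bubble up as 500s, and the raw exception text (which leaks parser
    internals) is replaced with a clear, actionable message.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        raise HTTPError(400, "Request body must be valid JSON")
```

The key point is that the `except` clause names one exception type and substitutes its message: a broad `except Exception` here would both mask real server defects and echo implementation details to clients.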
…elopment principles and eliminate dead code. Learned: Only implement code that has corresponding tests to ensure functionality is validated and maintainable.
… crashes Summary: Server now validates request sizes and handles malformed JSON gracefully with proper HTTP error responses instead of crashing. Learned: Middleware-level validation was unnecessary overhead; endpoint-specific validation is more targeted and easier to test consistently.
Summary: The /close endpoint now uses a Pydantic model for automatic JSON parsing and validation instead of manual error handling. Learned: Pydantic models eliminate boilerplate error handling code and provide better type safety than manual JSON parsing.
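A sketch of the Pydantic approach this summary describes, assuming Pydantic v2. The `CloseRequest` fields and the handler shape are guesses for illustration, not the PR's actual model:

```python
from pydantic import BaseModel, ValidationError

class CloseRequest(BaseModel):
    # Hypothetical request body for the /close endpoint.
    node_id: str

def handle_close(raw_json):
    """Validate the body with a Pydantic model instead of manual checks.

    model_validate_json handles both JSON parsing and field validation,
    so the endpoint needs no hand-rolled error-handling boilerplate.
    """
    try:
        req = CloseRequest.model_validate_json(raw_json)
    except ValidationError:
        return (400, "invalid request body")
    return (200, f"closed {req.node_id}")
```

One `except ValidationError` covers malformed JSON, missing fields, and wrong types alike, which is the boilerplate reduction the commit message refers to.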
danielballan requested changes Jul 8, 2025
…/close/{node_id} to a DELETE and made idempotent.
…seq_num: {node_id} doesn't exist.
…content-length header to let clients know that their data is too large.
Notes on denying websocket connections: https://www.starlette.io/websockets/#send-denial-response
I think we squash-merge this one; the commits aren't interesting. I was experimenting with using AI to write tests to find bugs in the server code. I ended up removing most of the AI tests: they mostly verified that things were working correctly, and they pointed out a couple of minor issues.