Writing tests to break the server #12
Open
gwbischof wants to merge 33 commits into main from more_tests
Conversation
…gaps Learned: Focusing on lightweight edge cases revealed actual crashes better than resource-intensive stress tests.
… server vulnerabilities Learned: Using pytest.mark.timeout prevented hanging tests and allowed systematic testing of edge cases.
…nt indefinite blocking Learned: Selective timeout application only on hanging tests maintains test efficiency while ensuring robust execution.
…ging behavior Learned: TODOs should focus on actionable bugs rather than design choices to maintain clear development priorities.
…uding crashes, hangs, and race conditions Learned: Removing validation tests and focusing only on tests that expose real bugs creates a more valuable test suite than broadly testing edge cases.
…suite from 12 to 20 comprehensive tests that all fail when server has bugs Learned: Moving imports to module level and removing defensive exception handling makes tests more maintainable and clearly exposes server issues.
…t suite on 18 actual bugs requiring fixes Learned: Keeping only failing tests that expose real bugs makes the test suite more actionable and prevents confusion about what needs to be fixed.
…tionality and expected fixes. Learned: Merging tests by server code path and creating separate files by functional area makes the test suite more maintainable than one large file.
…oint that returns proper 400 responses. Learned: Testing failure first, then implementing the fix, then verifying success creates a clear development cycle that ensures the fix actually works.
… attacks with 16MB payload, 8KB header, and 1MB WebSocket frame limits. Learned: Adding size limits early in the request pipeline prevents resource exhaustion while maintaining performance for legitimate requests.
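A minimal sketch of the early size checks this commit describes. The limits come from the commit message, but the helper name, signature, and status-code mapping are illustrative assumptions rather than the server's actual code:

```python
# Limits quoted from the commit message; everything else is a sketch.
MAX_BODY_BYTES = 16 * 1024 * 1024      # 16 MB request payload
MAX_HEADER_BYTES = 8 * 1024            # 8 KB of request headers
MAX_WS_FRAME_BYTES = 1 * 1024 * 1024   # 1 MB WebSocket frame

def check_request_size(content_length, header_bytes):
    """Reject oversized requests early in the pipeline.

    Returns a (status, reason) rejection, or None to accept. Running this
    before the body is read prevents resource exhaustion without adding
    cost to legitimate requests.
    """
    if header_bytes > MAX_HEADER_BYTES:
        return (431, "Request Header Fields Too Large")
    if content_length is not None and content_length > MAX_BODY_BYTES:
        return (413, "Payload Too Large")
    return None
```

Checking the declared content-length up front means the server can refuse a 16 MB-plus upload without ever buffering it.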
…n real client-accessible bugs only. Learned: Tests should only validate scenarios that clients can actually trigger through public APIs rather than artificial edge cases requiring direct database access.
…een bugs and their solutions. Learned: Linking fixes directly to the tests that expose the bugs improves code maintainability and helps future developers understand the purpose of each fix.
…r 400 responses while letting server errors bubble up as 500. Learned: Catching overly broad exception types can mask serious server issues and mislead users about the actual cause of errors.
…s and improve user experience with clear, actionable error descriptions. Learned: Raw exception details should never be exposed to clients as they leak implementation details and provide potential attack surface information.
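The two commits above, narrow exception handling and sanitized error details, can be sketched together. `HTTPError` and `parse_body` are hypothetical stand-ins for the server's actual framework types:

```python
import json

class HTTPError(Exception):
    """Hypothetical stand-in for the framework's client-error exception."""
    def __init__(self, status, detail):
        super().__init__(detail)
        self.status = status
        self.detail = detail

def parse_body(raw):
    """Parse a JSON request body, mapping only client mistakes to a 400.

    Only json.JSONDecodeError is caught, so genuine server bugs still
    bubble up as 500s, and the raw exception text (which leaks parser
    internals) is replaced with a clear, actionable message.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        raise HTTPError(400, "Request body must be valid JSON")
```

The key point is that the `except` clause names one exception type and substitutes its message: a broad `except Exception` here would both mask real server defects and echo implementation details to clients.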
…elopment principles and eliminate dead code. Learned: Only implement code that has corresponding tests to ensure functionality is validated and maintainable.
… crashes Summary: Server now validates request sizes and handles malformed JSON gracefully with proper HTTP error responses instead of crashing. Learned: Middleware-level validation was unnecessary overhead; endpoint-specific validation is more targeted and easier to test consistently.
Summary: The /close endpoint now uses a Pydantic model for automatic JSON parsing and validation instead of manual error handling. Learned: Pydantic models eliminate boilerplate error handling code and provide better type safety than manual JSON parsing.
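A sketch of the Pydantic approach this summary describes, assuming Pydantic v2. The `CloseRequest` fields and the handler shape are guesses for illustration, not the PR's actual model:

```python
from pydantic import BaseModel, ValidationError

class CloseRequest(BaseModel):
    # Hypothetical request body for the /close endpoint.
    node_id: str

def handle_close(raw_json):
    """Validate the body with a Pydantic model instead of manual checks.

    model_validate_json handles both JSON parsing and field validation,
    so the endpoint needs no hand-rolled error-handling boilerplate.
    """
    try:
        req = CloseRequest.model_validate_json(raw_json)
    except ValidationError:
        return (400, "invalid request body")
    return (200, f"closed {req.node_id}")
```

One `except ValidationError` covers malformed JSON, missing fields, and wrong types alike, which is the boilerplate reduction the commit message refers to.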
danielballan requested changes Jul 8, 2025
…/close/{node_id} to a DELETE and made idempotent.
…seq_num: {node_id} doesn't exist.
…content-length header to let clients know that their data is too large.
Notes on denying websocket connections: https://www.starlette.io/websockets/#send-denial-response
I think we squash-merge this one; the commits aren't interesting. I was experimenting with using AI to write tests to find bugs in the server code. I ended up removing most of the AI tests: they mostly verified that things were working correctly, and they pointed out a couple of minor issues.