Add Response Caching Middleware #1845
Pull Request Overview
This PR implements a comprehensive response caching middleware system for FastMCP. It adds caching capabilities for various MCP operations including tool calls, resource reads, and prompt requests to improve performance and reduce server load.
- Implements both in-memory and disk-based caching backends
- Adds configurable TTL settings and filtering options for different operation types
- Includes cache invalidation through MCP notifications and comprehensive test coverage
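To illustrate the TTL idea in the overview, here is a minimal in-memory sketch. The names `TTLCache` and `cached_call` are hypothetical and are not taken from this PR; the actual middleware delegates storage to a key-value library rather than a plain dict.

```python
from __future__ import annotations

import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds (illustrative only)."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Any | None:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self.clock() + self.ttl, value)


def cached_call(cache: TTLCache, key: str, compute: Callable[[], Any]) -> Any:
    """Return the cached value for `key`, computing and storing it on a miss."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = compute()
    cache.put(key, value)
    return value
```

A per-operation TTL would then amount to one such cache (or one TTL value) per operation type, for example one for tool calls and another for resource reads.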
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
tests/server/middleware/test_caching.py | Comprehensive test suite covering cache implementations, middleware functionality, and integration tests |
src/fastmcp/server/middleware/middleware.py | Updated middleware base class to use proper return type for resource reading operations |
src/fastmcp/server/middleware/caching.py | Core caching middleware implementation with cache protocols, backends, and operation handlers |
pyproject.toml | Added caching dependencies and test dependencies |
docs/servers/middleware.mdx | Documentation for the new caching middleware functionality |
TypedDicts are a lot less fun than I was expecting, and I'm not sure the cache entry model is going to stick around. The implementation for the cache entries is mostly for the benefit of distributed cache implementations that can't just pickle.
Pull Request Overview
Copilot reviewed 8 out of 11 changed files in this pull request and generated 3 comments.
src/fastmcp/contrib/middleware/caching/elasticsearch/elasticsearch_cache.py
Love this idea! The |
Prompt cannot be instantiated directly as it's an ABC, so what comes through the middleware is a FunctionPrompt, which can't be serialized/deserialized due to the function reference -- so the CachedPrompt offers a serializable/deserializable model. Would love other ideas if you have them.
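A rough sketch of the serialization problem described above, using only the standard library. `FunctionBackedPrompt` and `CacheablePromptRecord` are hypothetical stand-ins, not the FastMCP `FunctionPrompt` or `CachedPrompt` classes, and JSON stands in for whatever format a distributed cache would use.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class FunctionBackedPrompt:
    """Hypothetical stand-in for a prompt that carries a function reference."""

    name: str
    description: str
    fn: Callable[[], str]


@dataclass
class CacheablePromptRecord:
    """Hypothetical data-only record that can round-trip through JSON."""

    name: str
    description: str


def to_record(prompt: FunctionBackedPrompt) -> CacheablePromptRecord:
    # Keep only the plain-data fields; the function reference is dropped.
    return CacheablePromptRecord(name=prompt.name, description=prompt.description)


def dump(record: CacheablePromptRecord) -> str:
    return json.dumps(asdict(record))


def load(raw: str) -> CacheablePromptRecord:
    return CacheablePromptRecord(**json.loads(raw))
```

Calling `json.dumps(asdict(prompt))` on the function-backed object raises `TypeError` because a function is not JSON serializable, whereas the data-only record survives `load(dump(record))` unchanged.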
Oh! What do you think about removing the ABC from all components and just raising NotImplementedError() instead? Less self-documenting, more compatible?
Wait, I ran into this in a different place and made this change in #2031; hopefully that makes this simpler.
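The trade-off raised in this thread can be sketched as follows. Both classes are hypothetical illustrations and do not reproduce the FastMCP components or the change made in #2031.

```python
from abc import ABC, abstractmethod


class AbstractComponent(ABC):
    """ABC style: the class itself cannot be instantiated."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def render(self) -> str:
        ...


class PlainComponent:
    """NotImplementedError style: instantiable, but render() fails until overridden."""

    def __init__(self, name: str) -> None:
        self.name = name

    def render(self) -> str:
        raise NotImplementedError("subclasses must implement render()")
```

With the ABC, `AbstractComponent("x")` raises `TypeError` at construction time, which is why a cache cannot rebuild the base class directly from stored data. The plain class can be constructed from data alone, at the cost of deferring the error to the first `render()` call.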
Looking pretty good to me! I think you could simplify a little by replacing the Cacheable* classes with direct instantiations of the "normal" classes now, but it has no functional impact.
I'm going to update the PydanticAdapter in the py-key-value library to support lists of basemodels (and transparently nest them in an I should then be able to get rid of the cacheable lists and most of the cacheable entries
I have a couple more updates pending, including size limiting via a new kv store wrapper (strawgate/py-key-value#50), and then this will be ready to go, hopefully by end of day today.
I think this is almost ready to merge -- I'm considering either:
|
Pull Request Overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
src/fastmcp/server/middleware/caching.py:1
- Corrected spelling of 'istenchars' to 'tenchars' in comment - the string itself appears intentional for testing.
"""A middleware for response caching."""
def very_large_response(self) -> str:
    self.very_large_response_calls += 1
    return "istenchars" * 100000 # 1,000,000 characters, 1mb
Copilot AI, Oct 16, 2025
The comment states '1,000,000 characters, 1mb' but 'istenchars' is 10 characters, so 100,000 repetitions would be 1,000,000 characters. However, 1,000,000 characters would typically be around 1MB in UTF-8, not exactly 1MB. Consider updating the comment to be more precise about the actual size.
- return "istenchars" * 100000 # 1,000,000 characters, 1mb
+ return "istenchars" * 100000 # 1,000,000 characters (~0.95MB in UTF-8)
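The arithmetic behind the two comments can be checked directly. This is a small worked sketch; the variable names are illustrative and do not come from the test suite.

```python
# The test string is ASCII, so each character is one byte in UTF-8.
payload = "istenchars" * 100000

char_count = len(payload)                  # 10 characters * 100,000 repetitions
byte_count = len(payload.encode("utf-8"))  # same as char_count for ASCII text

size_mb = byte_count / 1_000_000           # decimal megabytes (10**6 bytes)
size_mib = byte_count / (1024 ** 2)        # binary mebibytes (2**20 bytes)
```

The payload is exactly 1.0 MB in decimal units and roughly 0.95 MiB in binary units, so the two comments differ only in which definition of a megabyte they assume.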
@jlowin this is ready. I did change ToolResult to require list[ContentBlock] instead of taking Any and just figuring out what to do with it. I think this matches more closely our goal of "ToolResult is you saying you want the result to look like X". Though I think if we add contentcompatibilitymiddleware later, there will be a second change here.
Really like this -- I think the ToolResult behavior change is a step too far but the middleware is 👍 |
Pull Request Overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/fastmcp/server/middleware/caching.py:1
- Corrected spelling of 'istenchars' to 'testchars' in the comment.
"""A middleware for response caching."""
@jlowin I rolled back the ToolResult change and am merging!
Description
Add a response caching middleware that leverages our new key-value library, https://github.com/strawgate/py-key-value
py-key-value-aio
py-key-value-aio has great store support for local, distributed, and secret stores and offers wrappers for namespacing collections, limiting item sizes, retries, timeouts, etc.
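The wrapper idea mentioned above (namespacing collections, limiting item sizes) can be sketched with plain classes. Everything below is hypothetical and synchronous for brevity; it is not the py-key-value-aio API, whose actual class names and signatures are not shown in this conversation.

```python
from typing import Optional


class DictStore:
    """Hypothetical base store backed by a dict."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> bool:
        self._data[key] = value
        return True


class NamespaceWrapper:
    """Prefixes every key so several callers can share one underlying store."""

    def __init__(self, store, namespace: str) -> None:
        self.store = store
        self.namespace = namespace

    def _key(self, key: str) -> str:
        return f"{self.namespace}:{key}"

    def get(self, key: str) -> Optional[str]:
        return self.store.get(self._key(key))

    def put(self, key: str, value: str) -> bool:
        return self.store.put(self._key(key), value)


class SizeLimitWrapper:
    """Refuses to store values whose UTF-8 encoding exceeds max_bytes."""

    def __init__(self, store, max_bytes: int) -> None:
        self.store = store
        self.max_bytes = max_bytes

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    def put(self, key: str, value: str) -> bool:
        if len(value.encode("utf-8")) > self.max_bytes:
            return False  # skip oversized items instead of caching them
        return self.store.put(key, value)
```

Because each wrapper exposes the same `get`/`put` shape as the store it wraps, they compose, for example `SizeLimitWrapper(NamespaceWrapper(DictStore(), "tools"), max_bytes=1024)`.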