strawgate
Collaborator

@strawgate strawgate commented Sep 17, 2025

Description

Add a response caching middleware that leverages our new key-value library, py-key-value-aio (https://github.com/strawgate/py-key-value).

py-key-value-aio supports local, distributed, and secret stores and offers wrappers for namespacing collections, limiting item sizes, retries, timeouts, and more.

  • Cache list calls
  • Cache call/read tool/prompt/resource
  • Customize TTL by method
  • Bust caches on update notifications
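To make the feature list above concrete, here is a minimal in-memory sketch of per-method TTLs and notification-driven invalidation. It is not the middleware's actual implementation: `MethodTTLCache`, its parameters, and the injected `clock` are hypothetical names chosen for illustration, and the real middleware stores entries through py-key-value-aio rather than a dict.

```python
import time
from typing import Any, Callable, Optional


class MethodTTLCache:
    """Toy cache with a per-method TTL (hypothetical, for illustration only)."""

    def __init__(
        self,
        ttls: dict[str, float],
        default_ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttls = ttls
        self._default_ttl = default_ttl
        self._clock = clock
        # (method, key) -> (expires_at, value)
        self._entries: dict[tuple[str, str], tuple[float, Any]] = {}

    def put(self, method: str, key: str, value: Any) -> None:
        ttl = self._ttls.get(method, self._default_ttl)
        self._entries[(method, key)] = (self._clock() + ttl, value)

    def get(self, method: str, key: str) -> Optional[Any]:
        entry = self._entries.get((method, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._entries[(method, key)]
            return None
        return value

    def invalidate(self, method: str) -> None:
        """Drop every entry for one method, e.g. after an update notification."""
        for entry_key in [k for k in self._entries if k[0] == method]:
            del self._entries[entry_key]


# Usage: list results live longer than individual tool-call results.
cache = MethodTTLCache({"tools/list": 300.0, "tools/call": 30.0})
cache.put("tools/list", "all", ["add", "subtract"])
cache.invalidate("tools/list")  # a list-changed notification busts the cache
```

Injecting the clock keeps the expiry logic testable without sleeping.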

Contributors Checklist

  • My change closes Response Caching Middleware #1844
  • I have followed the repository's development workflow
  • I have tested my changes manually and by adding relevant tests
  • I have performed all required documentation updates

Review Checklist

  • I have self-reviewed my changes
  • My Pull Request is ready for review

@marvin-context-protocol marvin-context-protocol bot added enhancement Improvement to existing functionality. For issues and smaller PR improvements. server Related to FastMCP server implementation or server-side functionality. labels Sep 17, 2025
@strawgate strawgate force-pushed the responsecachingmiddleware branch from 6850047 to 09369b5 Compare September 17, 2025 03:47
@strawgate strawgate changed the title Add Response Caching Middleware [Draft] Add Response Caching Middleware Sep 17, 2025
@strawgate strawgate changed the title [Draft] Add Response Caching Middleware Add Response Caching Middleware Sep 18, 2025
@strawgate strawgate changed the title Add Response Caching Middleware [Draft] Add Response Caching Middleware Sep 18, 2025
@strawgate strawgate requested a review from Copilot September 18, 2025 00:45
@strawgate strawgate force-pushed the responsecachingmiddleware branch from 03da107 to d5f01da Compare September 18, 2025 00:46
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements a comprehensive response caching middleware system for FastMCP. It adds caching capabilities for various MCP operations including tool calls, resource reads, and prompt requests to improve performance and reduce server load.

  • Implements both in-memory and disk-based caching backends
  • Adds configurable TTL settings and filtering options for different operation types
  • Includes cache invalidation through MCP notifications and comprehensive test coverage

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/server/middleware/test_caching.py | Comprehensive test suite covering cache implementations, middleware functionality, and integration tests |
| src/fastmcp/server/middleware/middleware.py | Updated middleware base class to use proper return type for resource reading operations |
| src/fastmcp/server/middleware/caching.py | Core caching middleware implementation with cache protocols, backends, and operation handlers |
| pyproject.toml | Added caching dependencies and test dependencies |
| docs/servers/middleware.mdx | Documentation for the new caching middleware functionality |

@strawgate
Collaborator Author

strawgate commented Sep 18, 2025

TypedDicts are a lot less fun than I was expecting and I'm not sure the cache entry model is going to stick around

The implementation for the cache entries is mostly for the benefit of distributed cache implementations that can't just pickle

@strawgate strawgate self-assigned this Sep 18, 2025
@strawgate strawgate requested a review from Copilot September 18, 2025 18:48
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 8 out of 11 changed files in this pull request and generated 3 comments.

@jlowin
Owner

jlowin commented Sep 19, 2025

Love this idea! The CachedPrompt (and similar) classes feel a little overwrought -- I'm not totally clear on what makes the call_next results inherently uncacheable already? Or perhaps I'm missing something you're designing around

@strawgate
Collaborator Author

strawgate commented Sep 19, 2025

> Love this idea! The CachedPrompt (and similar) classes feel a little overwrought -- I'm not totally clear on what makes the call_next results inherently uncacheable already? Or perhaps I'm missing something you're designing around

Prompt cannot be instantiated directly because it's an ABC, so what comes through the middleware is a FunctionPrompt, which can't be serialized/deserialized because of its function reference. CachedPrompt therefore offers a serializable/deserializable model.
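A rough illustration of the serialization problem described above, using only the standard library. `FunctionBackedPrompt` and `SerializablePrompt` are hypothetical stand-ins, not FastMCP's FunctionPrompt or CachedPrompt; the point is only that an object holding a function reference does not survive a JSON round trip, while a plain-data snapshot does.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class FunctionBackedPrompt:
    """Hypothetical prompt that keeps a reference to the function behind it."""

    name: str
    fn: Callable[..., str]


@dataclass
class SerializablePrompt:
    """Hypothetical cache-friendly snapshot holding plain data only."""

    name: str
    description: str


def to_cache_entry(prompt: FunctionBackedPrompt) -> SerializablePrompt:
    # Keep only what can be written to a distributed store.
    return SerializablePrompt(name=prompt.name, description=prompt.fn.__doc__ or "")


def greet(user: str) -> str:
    """Greet a user."""
    return f"Hello, {user}!"


live = FunctionBackedPrompt(name="greet", fn=greet)

try:
    json.dumps(asdict(live))
    live_is_serializable = True
except TypeError:
    # json cannot encode the function object stored in `fn`.
    live_is_serializable = False

# The snapshot round-trips cleanly.
payload = json.dumps(asdict(to_cache_entry(live)))
restored = SerializablePrompt(**json.loads(payload))
```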

Would love other ideas if you have them

@jlowin
Owner

jlowin commented Sep 21, 2025

Oh! what do you think about removing the ABC from all components and just raising NotImplementedError() instead? Less self-documenting, more compatible?
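The trade-off in this suggestion can be sketched with two toy classes. `AbstractPrompt` and `PlainPrompt` are hypothetical names, not FastMCP components: the ABC version refuses to instantiate, while the plain version can be constructed directly (for example when rebuilding an object from cached data) and only fails when the unimplemented method is called.

```python
from abc import ABC, abstractmethod


class AbstractPrompt(ABC):
    """Hypothetical component declared as an ABC."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def render(self) -> str:
        ...


class PlainPrompt:
    """Hypothetical component without ABC: instantiable, fails only on use."""

    def __init__(self, name: str) -> None:
        self.name = name

    def render(self) -> str:
        raise NotImplementedError("Subclasses must implement render()")


try:
    AbstractPrompt("greet")
    abc_instantiable = True
except TypeError:
    # Python refuses to instantiate a class with abstract methods.
    abc_instantiable = False

# The plain class can be built directly from stored fields.
restored = PlainPrompt("greet")
```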

@jlowin
Owner

jlowin commented Oct 10, 2025

> Oh! what do you think about removing the ABC from all components and just raising NotImplementedError() instead? Less self-documenting, more compatible?

Wait, I ran into this in a different place and made this change in #2031. Hopefully that makes this simpler.

Owner

@jlowin jlowin left a comment

Looking pretty good to me! I think you could simplify a little by replacing the Cacheable* classes with direct instantiations of the "normal" classes now, but it has no functional impact.

@strawgate
Collaborator Author

I'm going to update the PydanticAdapter in the py-key-value library to support lists of BaseModels (and transparently nest them in an items key).

I should then be able to get rid of the cacheable lists and most of the cacheable entries.
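The "nest them in an items key" idea can be sketched as follows. This is not the PydanticAdapter's code: it uses standard-library dataclasses instead of Pydantic so it runs without dependencies, and `ToolInfo`, `dump_list`, and `load_list` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ToolInfo:
    """Hypothetical model standing in for a Pydantic BaseModel."""

    name: str
    description: str


def dump_list(models: list[ToolInfo]) -> str:
    # Wrap the list so the stored value is always a JSON object,
    # which is the shape a dict-oriented key-value store expects.
    return json.dumps({"items": [asdict(model) for model in models]})


def load_list(raw: str) -> list[ToolInfo]:
    # Unwrap the "items" key and rebuild each model.
    return [ToolInfo(**item) for item in json.loads(raw)["items"]]


tools = [ToolInfo("add", "Add two numbers"), ToolInfo("echo", "Echo the input")]
stored = dump_list(tools)
loaded = load_list(stored)
```

Because the wrapping happens inside the dump/load pair, callers keep working with plain lists and never see the `items` envelope.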

@strawgate
Collaborator Author

I have a couple more updates pending, including size limiting via a new kv store wrapper (strawgate/py-key-value#50), and then this will hopefully be ready to go by end of day today.
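As a rough picture of what size limiting means for a cache, the sketch below skips any value whose encoded form exceeds a byte budget. It is an assumption-laden toy, not the wrapper from strawgate/py-key-value#50: `SizeLimitedStore`, `max_bytes`, and the choice to silently skip oversized values are all hypothetical.

```python
import json
from typing import Any, Optional


class SizeLimitedStore:
    """Hypothetical store that refuses values larger than `max_bytes` when encoded."""

    def __init__(self, max_bytes: int) -> None:
        self._max_bytes = max_bytes
        self._data: dict[str, bytes] = {}

    def put(self, key: str, value: Any) -> bool:
        encoded = json.dumps(value).encode("utf-8")
        if len(encoded) > self._max_bytes:
            # Too large to cache: the caller simply serves the uncached response.
            return False
        self._data[key] = encoded
        return True

    def get(self, key: str) -> Optional[Any]:
        encoded = self._data.get(key)
        return None if encoded is None else json.loads(encoded)


store = SizeLimitedStore(max_bytes=1024)
small_stored = store.put("small", "ok")
large_stored = store.put("large", "istenchars" * 100_000)
```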

@strawgate
Collaborator Author

I think this is almost ready to merge -- I'm considering either:

  1. Namespace keys using the fastmcp server name
  2. Document how users can leverage the PrefixCollectionWrapper in py-key-value-aio to easily share a single distributed key-value store with multiple servers
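Both options above come down to prefixing keys so that several servers can share one store without collisions. The sketch below shows the idea with a plain dict as the shared store; `PrefixedStore` is a hypothetical class and does not reproduce the PrefixCollectionWrapper API in py-key-value-aio.

```python
from typing import Optional


class PrefixedStore:
    """Hypothetical view over a shared store that namespaces every key."""

    def __init__(self, backing: dict[str, str], prefix: str) -> None:
        self._backing = backing
        self._prefix = prefix

    def _key(self, key: str) -> str:
        return f"{self._prefix}:{key}"

    def put(self, key: str, value: str) -> None:
        self._backing[self._key(key)] = value

    def get(self, key: str) -> Optional[str]:
        return self._backing.get(self._key(key))


# Two servers share one backing store but never see each other's entries.
shared: dict[str, str] = {}
weather_cache = PrefixedStore(shared, "weather-server")
billing_cache = PrefixedStore(shared, "billing-server")

weather_cache.put("tools/list", "forecast,alerts")
billing_cache.put("tools/list", "invoice,refund")
```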

@strawgate strawgate requested a review from Copilot October 16, 2025 18:11
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/fastmcp/server/middleware/caching.py:1

  • Corrected spelling of 'istenchars' to 'tenchars' in comment - the string itself appears intentional for testing.
"""A middleware for response caching."""


def very_large_response(self) -> str:
    self.very_large_response_calls += 1
    return "istenchars" * 100000  # 1,000,000 characters, 1mb

Copilot AI Oct 16, 2025

The comment states '1,000,000 characters, 1mb' but 'istenchars' is 10 characters, so 100,000 repetitions would be 1,000,000 characters. However, 1,000,000 characters would typically be around 1MB in UTF-8, not exactly 1MB. Consider updating the comment to be more precise about the actual size.

Suggested change
- return "istenchars" * 100000 # 1,000,000 characters, 1mb
+ return "istenchars" * 100000 # 1,000,000 characters (~0.95MB in UTF-8)
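The two size figures in this exchange can be reconciled with a quick check. Every character in "istenchars" is ASCII and therefore one byte in UTF-8, so the string is exactly 1,000,000 bytes: 1 MB in decimal units and about 0.95 MiB in binary units.

```python
payload = "istenchars" * 100_000

size_bytes = len(payload.encode("utf-8"))  # one byte per ASCII character
size_mb = size_bytes / 1_000_000           # decimal megabytes
size_mib = size_bytes / (1024 * 1024)      # binary mebibytes

print(f"{len(payload):,} chars, {size_bytes:,} bytes, {size_mb} MB, {size_mib:.2f} MiB")
```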

@strawgate
Collaborator Author

@jlowin this is ready

I did change ToolResult to require list[ContentBlock] instead of taking Any and just figuring out what to do with it. I think this matches more closely our goal of "ToolResult is you saying you want the result to look like X"

Though I think if we add contentcompatibilitymiddleware later there will be a second change here

@jlowin
Owner

jlowin commented Oct 16, 2025

Really like this -- I think the ToolResult behavior change is a step too far but the middleware is 👍

@jlowin jlowin added this to the 2.13.0 milestone Oct 16, 2025
@strawgate strawgate force-pushed the responsecachingmiddleware branch from 047dca9 to 5831c4b Compare October 17, 2025 02:57
@strawgate strawgate requested a review from Copilot October 17, 2025 03:01
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/fastmcp/server/middleware/caching.py:1

  • Corrected spelling of 'istenchars' to 'testchars' in the comment.
"""A middleware for response caching."""

@strawgate
Collaborator Author

@jlowin I rolled back the ToolResult change and am merging!

@strawgate strawgate merged commit 83adbc0 into main Oct 17, 2025
11 checks passed
@strawgate strawgate deleted the responsecachingmiddleware branch October 17, 2025 03:12