Add direct cache docs#1728

Open
Georgehe4 wants to merge 5 commits into maximhq:main from Georgehe4:direct-cache-docs

Conversation

@Georgehe4

Summary

Documents the existing direct hash mode (embedding-free) for the semantic cache plugin. Users looking for simple exact-match caching without the cost/complexity of an embedding provider had no way to discover this capability from the docs.

Changes

  • Added a new "Direct Hash Mode (Embedding-Free)" section to docs/features/semantic-caching.mdx covering setup (omit provider/keys/embedding_model from config), recommended vector store (Redis), and per-request cache type override behavior.
  • Included a warning that Qdrant and Pinecone are not compatible with this mode when no embedding provider is configured, since the zero-vector placeholder codepath requires plugin.client != nil.

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

  • Verify the new section renders correctly on the Mintlify docs site (PR preview or npx mintlify dev)
  • Confirm anchor links (#direct-hash-mode-embedding-free, #recommended-vector-store, #cache-type-control) resolve correctly
  • Validate the Go SDK and config.json examples match actual plugin behavior (omitting provider/keys triggers direct-only fallback per main.go lines 319-321)
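
The direct-only fallback mentioned in the last step can be sketched as follows. This is a minimal illustration, not the plugin's code: the `Config` struct and the `directOnly` helper are hypothetical stand-ins, and the condition is taken from the review comment further down, which states that the plugin checks `Provider == ""` or `len(Keys) == 0`.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the plugin configuration.
// Only the two fields named in the review are modelled here.
type Config struct {
	Provider string   // embedding provider name; empty when omitted
	Keys     []string // provider API keys; empty when omitted
}

// directOnly reports whether the cache would fall back to direct hash mode,
// assuming the check is Provider == "" || len(Keys) == 0 as quoted in the review.
func directOnly(c *Config) bool {
	return c.Provider == "" || len(c.Keys) == 0
}

func main() {
	// No provider and no keys: direct-only mode.
	fmt.Println(directOnly(&Config{}))
	// Provider and keys present: the dual-layer path stays available.
	fmt.Println(directOnly(&Config{Provider: "openai", Keys: []string{"example-key"}}))
}
```

Under this assumption, omitting either the provider or the keys is enough to select direct-only mode; the sketch does not consult `embedding_model` at all.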

Screenshots/Recordings

N/A -- docs-only change.

Breaking changes

  • Yes
  • No

Related issues

N/A

Security considerations

None.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai

coderabbitai bot commented Feb 22, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Documentation
    • Added comprehensive "Direct Hash Mode" docs for exact-match caching without embeddings, covering global setup, per-request behaviors (direct-only vs dual-layer) and per-request header override.
    • Changed semantic cache initialization to use a pointer-style configuration reference.
    • Added setup examples (Go SDK, Helm, config), recommended Redis vector store, compatibility warnings, and clarified that direct-mode requests always use direct hashing.

Walkthrough

Added a Direct Hash Mode (Embedding‑Free) section to semantic caching docs describing exact-match caching that bypasses embeddings, global enablement by omitting embedding provider/keys, per-request override via x-bf-cache-type, Go SDK/Helm/config.json setup examples, and Redis vector store guidance.
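
As a rough illustration of the per-request override, the sketch below builds a request that carries the `x-bf-cache-type` header using Go's standard `net/http` package. The header name and the `direct` value come from this PR; the gateway URL, the request body, and the `newDirectCacheRequest` helper are assumptions made only for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newDirectCacheRequest builds a POST request that asks for direct (exact-match)
// caching via the x-bf-cache-type header. The helper itself is hypothetical.
func newDirectCacheRequest(url, body string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-bf-cache-type", "direct")
	return req, nil
}

func main() {
	// The URL and body are placeholders, not documented endpoints or payloads.
	req, err := newDirectCacheRequest("http://localhost:8080/", "{}")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Header.Get("x-bf-cache-type"))
}
```

Header names are case-insensitive in HTTP, so the canonicalised form that `net/http` stores internally should still match on the server side.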

Changes

| Cohort / File(s) | Summary |
|---|---|
| Semantic Caching Documentation: docs/features/semantic-caching.mdx | Added Direct Hash Mode (Embedding‑Free) content: exact-match (direct) caching behavior, global enablement by omitting embedding provider/keys/embedding_model, per-request modes (direct-only vs dual-layer) and x-bf-cache-type override, Go SDK/Helm/config.json examples, recommended Redis vector store config snippets and warnings about vector store requirements. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Core
  participant Cache
  participant VectorStore
  Client->>Core: HTTP request (+ optional header x-bf-cache-type)
  Core->>Core: Inspect global config (embedding provider present?)
  alt No embedding provider (global direct-only)
    Core->>Cache: Compute direct hash -> Exact-match lookup
    alt Hit
      Cache-->>Core: Cached response
      Core-->>Client: Return cached response
    else Miss
      Core-->>Client: Return miss or compute fallback (no embeddings)
    end
  else Embedding provider present
    alt Header x-bf-cache-type == "direct"
      Core->>Cache: Compute direct hash -> Exact-match lookup
      alt Hit
        Cache-->>Core: Cached response
        Core-->>Client: Return cached response
      else Miss
        Core->>VectorStore: Compute embeddings -> Vector lookup/store (dual-layer fallback)
        VectorStore-->>Core: Results
        Core-->>Client: Return results
      end
    else Default (dual-layer)
      Core->>VectorStore: Compute embeddings -> Vector lookup/store
      VectorStore-->>Core: Results
      Core-->>Client: Return results (optionally populate Cache)
    end
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I nibble docs and plant a seed,
Direct hash hops in when it's needed.
Redis burrows deep and steady,
Headers choose the path already,
Exact matches bloom, quick and neat!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Add direct cache docs' is concise and accurately reflects the main change: documentation for the direct cache (embedding-free) mode in semantic caching. |
| Description check | ✅ Passed | The description comprehensively covers the template with well-structured sections: summary, changes, type of change, affected areas, testing instructions, breaking changes, and checklist. All critical sections are present and detailed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/features/semantic-caching.mdx`:
- Around line 275-277: The warning text inaccurately says vectors are not
generated; instead update the copy to explain the real failure: the codepath
that inserts a zero-vector placeholder exists but requires plugin.client != nil,
and without an embedding provider plugin.client is nil so the placeholder
generation fails; change the Warning block to mention Qdrant and Pinecone
require vectors and that the zero-vector fallback only works when plugin.client
is present (so absent embedding provider leads to a plugin.client == nil error),
and optionally note Weaviate may work but is not recommended.
- Around line 230-241: The example at line 106 constructs a value
`semanticcache.Config{}` but `semanticcache.Init(ctx, config, logger, store)`
expects a pointer `*semanticcache.Config`; update the example to create a
pointer (e.g., `cacheConfig := &semanticcache.Config{...}`) and pass
`cacheConfig` into `semanticcache.Init`. Locate the value construction of
`Config` in the sample and change it to a pointer so it matches the
`semanticcache.Init` signature and the working example that uses `cacheConfig :=
&semanticcache.Config{}` shown later.
- Around line 213-219: The docs incorrectly state that omitting embedding_model
triggers direct-only mode; update the setup text to say direct-only is enabled
by omitting provider and keys (the code uses Provider == "" or len(Keys) == 0 to
decide). Also fix the example type mismatch: change the example that creates
cacheConfig so it allocates a pointer to semanticcache.Config to match the Init
signature (func Init(..., config *Config, ...)) — i.e., use a pointer for
cacheConfig where Init is called. Reference symbols: Provider, Keys,
embedding_model, Init, Config, semanticcache.Config.
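
The pointer mismatch raised in the last two items can be shown with a small self-contained sketch. The `Config` type and `Init` function below are hypothetical stand-ins; they mirror only the pointer-typed `config *Config` parameter quoted above, not the real `semanticcache.Init` signature, which takes further arguments.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical stand-in for semanticcache.Config.
type Config struct {
	Dimension int
}

// Init mirrors only the pointer-typed config parameter quoted in the review.
// The real function takes additional arguments that are omitted here.
func Init(config *Config) error {
	if config == nil {
		return errors.New("config is nil")
	}
	return nil
}

func main() {
	// Allocate the config as a pointer so it matches the parameter type.
	cacheConfig := &Config{Dimension: 1}
	if err := Init(cacheConfig); err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println("initialised with dimension", cacheConfig.Dimension)

	// Passing a value instead would not compile:
	//   value := Config{Dimension: 1}
	//   Init(value) // cannot use value (type Config) as *Config
}
```

Because the mismatch is a compile-time error, a docs example that constructs a value rather than a pointer would fail for any reader who copies it verbatim.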

Comment on lines 275 to 277
<Warning>
Qdrant and Pinecone are not compatible with direct hash mode when no embedding provider is configured. These stores require vectors for all entries, and without an embedding provider, no vectors are generated, causing storage to fail. Weaviate may work but is not recommended for this use case.
</Warning>

⚠️ Potential issue | 🟡 Minor

Inaccurate technical explanation of Qdrant/Pinecone failure mode

The warning states "without an embedding provider, no vectors are generated, causing storage to fail." Per the PR description, the actual failure mechanism is that a zero-vector placeholder codepath exists but requires plugin.client != nil — the client is nil without an embedding provider, so the attempt to generate the placeholder fails. The current wording implies there is no vector-generation path at all, which misrepresents the actual error a user would encounter while debugging.
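
To make the described failure mode concrete, here is a minimal sketch of a zero-vector placeholder guarded by a nil-client check. Everything in it is hypothetical: the `plugin` and `embeddingClient` types, the `placeholderVector` method, and the error text are illustrative names rather than the plugin's real identifiers. Only the `plugin.client != nil` requirement is taken from the PR description.

```go
package main

import (
	"errors"
	"fmt"
)

// embeddingClient is a hypothetical placeholder for the embedding provider client.
type embeddingClient struct{}

// plugin is a hypothetical stand-in holding only the fields the sketch needs.
type plugin struct {
	client    *embeddingClient // nil when no embedding provider is configured
	dimension int
}

var errNoClient = errors.New("embedding client is nil")

// placeholderVector returns an all-zero vector of the configured dimension,
// but only when a client is present, mirroring the plugin.client != nil guard.
func (p *plugin) placeholderVector() ([]float32, error) {
	if p.client == nil {
		return nil, errNoClient
	}
	return make([]float32, p.dimension), nil
}

func main() {
	// Without an embedding provider the placeholder path returns an error.
	if _, err := (&plugin{dimension: 3}).placeholderVector(); err != nil {
		fmt.Println("no provider:", err)
	}
	// With a client the placeholder is a zero vector of the configured size.
	v, _ := (&plugin{client: &embeddingClient{}, dimension: 3}).placeholderVector()
	fmt.Println("with provider:", v)
}
```

Under these assumptions, a store that insists on a vector for every entry never receives one when no provider is configured, even though a placeholder path exists.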

Consider updating the warning to reflect the actual failure:

📝 Suggested wording update
-Qdrant and Pinecone are not compatible with direct hash mode when no embedding
-provider is configured. These stores require vectors for all entries, and without
-an embedding provider, no vectors are generated, causing storage to fail. Weaviate
-may work but is not recommended for this use case.
+Qdrant and Pinecone are not compatible with direct hash mode when no embedding
+provider is configured. These stores require a vector for every entry; the plugin's
+zero-vector placeholder codepath requires an initialised embedding client, so
+storage will fail if no provider is set. Weaviate requires a vector per entry as
+well and is therefore also not recommended for direct-only mode.

@Georgehe4 Georgehe4 changed the title from "Direct cache docs" to "Add direct cache docs" on Feb 22, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/features/semantic-caching.mdx`:
- Around line 264-279: The plugin name in the config example is incorrect:
replace the incorrect string "semanticcache" with the registered plugin name
"semantic_cache" in the example JSON so the plugin loader will match the
registered plugin; update the example in the docs where the config object
contains "name": "semanticcache" (the snippet showing the plugins array and its
config fields like "dimension", "ttl", "cleanup_on_shutdown", "cache_by_model",
"cache_by_provider") to use "semantic_cache" instead.

---

Duplicate comments:
In `@docs/features/semantic-caching.mdx`:
- Around line 288-331: Add the missing compatibility warning under the
"Recommended Vector Store" / direct-hash-redis tab: insert a short warning block
explaining that Qdrant and Pinecone do not work with direct hash mode when no
embedding provider is configured because the zero-vector placeholder path
requires plugin.client != nil, so users should use Redis for metadata-only
direct mode or ensure an embedding provider/plugin is configured; place this
warning immediately after the existing Redis example within the Tabs group
"direct-hash-redis" and use clear labels like "Warning: Qdrant/Pinecone
incompatibility" to make it prominent.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
docs/features/semantic-caching.mdx (1)

230-238: Clarify what Dimension: 1 means in this context

The comment // Minimal value; no real vectors are stored may mislead users into thinking nothing is written to the vector store. Cache entries are persisted (as metadata-only records); it is only the embedding vector component that is absent. A future attempt to switch to dual-layer mode (by adding a provider) with a mismatch between 1 and the embedding model's actual dimension will trigger the mixed-dimension warning (line 645). Consider updating the comment:

📝 Suggested comment update
-    Dimension: 1, // Minimal value; no real vectors are stored
+    Dimension: 1, // Sentinel value — no embedding vectors are computed or stored.
+                  // Changing this later requires a namespace reset (see Dimension Changes warning).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/features/semantic-caching.mdx` around lines 230 - 238, Update the inline
comment for the Dimension field in the semanticcache.Config initialization to
clarify that Dimension: 1 does not mean nothing is persisted; cache entries are
still stored as metadata-only records but the embedding vector component is
empty/minimal, and note that switching to dual-layer mode (adding a Provider)
must use a matching embedding dimension to avoid the mixed-dimension warning
referenced by the code that checks dimensions; reference the
semanticcache.Config struct and the Dimension field so the maintainer updates
this specific comment to explain persistence of metadata, absence of real
vectors, and the future warning if dimensions mismatch.
