Conversation

centdix
Collaborator

@centdix centdix commented Aug 28, 2025

fixes #6396


Important

Add caching for prompts when using Anthropic models, with updates to backend and frontend components for improved handling and performance.

  • Behavior:
    • Cache prompts for Anthropic models in ai.rs by checking X-Anthropic-SDK header.
    • Update URL handling for Anthropic SDK in AIRequestConfig.
  • Frontend:
    • Add @anthropic-ai/sdk dependency in package.json.
    • Implement getAnthropicCompletion and parseAnthropicCompletion in anthropic.ts for handling Anthropic model completions.
    • Modify AIChatManager.svelte.ts to use Anthropic-specific functions for completions.
  • Misc:
    • Update openai dependency version in package.json.
    • Refactor tool handling in shared.ts and apiTools.ts to support Anthropic models.

This description was created by Ellipsis for b89c2cd. You can customize this summary. It will automatically update as commits are pushed.


cloudflare-workers-and-pages bot commented Aug 28, 2025

Deploying windmill with Cloudflare Pages

Latest commit: b89c2cd
Status: ✅  Deploy successful!
Preview URL: https://8b27b482.windmill.pages.dev
Branch Preview URL: https://fg-cache-anthropic.windmill.pages.dev


@windmill-labs windmill-labs deleted a comment from claude bot Aug 28, 2025
@centdix centdix marked this pull request as ready for review August 28, 2025 15:13
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed everything up to 0bbb368 in 2 minutes and 18 seconds.
  • Reviewed 409 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. frontend/src/lib/components/copilot/chat/anthropic.ts:265
  • Draft comment:
    Consider verifying that concatenating JSON fragments in currentToolCall.args produces valid JSON. It may be beneficial to accumulate fragments and parse once complete, or add validation to ensure the final JSON is well-formed.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. This is a real potential issue - concatenating JSON fragments could produce invalid JSON if the API sends malformed or incomplete chunks. However, since this is an adapter for Anthropic's official API, we should be able to trust that their streaming format sends valid JSON fragments that concatenate correctly. The API contract likely guarantees this. I could be wrong about trusting the API - even official APIs can have bugs or edge cases. Invalid JSON could cause runtime errors. While API bugs are possible, adding validation here would add complexity for an edge case that should be handled by the API provider. If there are issues, they should be fixed in Anthropic's implementation. The comment raises a valid concern but suggests adding complexity to handle what should be guaranteed by the API contract. The comment should be removed.
2. frontend/src/lib/components/copilot/chat/anthropic.ts:206
  • Draft comment:
    Using Date.now() for messageId may lead to collisions under high load. Consider incorporating additional entropy (e.g., a random component) for more robust unique ID generation.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. The comment raises a valid point about potential ID collisions under high load. However, looking at the code context, this is a stream converter function where messageId is only used within a single stream conversion. Since each stream conversion creates its own messageId and the IDs only need to be unique within that stream, collisions across concurrent requests aren't actually a problem. The current implementation is sufficient for its purpose. I could be wrong about the scope of the messageId - perhaps it needs to be globally unique across all streams for some reason not visible in this file. The OpenAI API might have requirements about message ID uniqueness. Looking at how messageId is used, it's only referenced within this single stream conversion function and doesn't persist beyond it. The OpenAI API documentation doesn't specify any requirements about global uniqueness for streaming chunk IDs. The comment should be deleted because the current implementation is sufficient for its purpose - the messageId only needs to be unique within a single stream conversion, not globally.
3. frontend/src/lib/components/copilot/lib.ts:639
  • Draft comment:
    Ensure that config.max_tokens is defined before using it in anthropicRequest. Consider providing a default or a validation check to avoid potential runtime issues.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. frontend/src/lib/components/copilot/lib.ts:757
  • Draft comment:
    Fix the typo in the comment ('udpating' should be 'updating').
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
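
The accumulate-then-parse approach floated in draft comment 1 could look like the following sketch (a hypothetical helper, not code from this PR):

```typescript
// Sketch: buffer streamed tool-call argument fragments and parse only once
// the stream is complete, instead of validating each partial chunk.
class ToolArgsAccumulator {
	private fragments: string[] = []

	push(fragment: string): void {
		this.fragments.push(fragment)
	}

	// Throws if the concatenated fragments do not form valid JSON.
	finish(): unknown {
		const raw = this.fragments.join('')
		return raw === '' ? {} : JSON.parse(raw)
	}
}
```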

Workflow ID: wflow_pCcFxRmRrXpFyYUf

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 07e4e0d in 1 minute and 28 seconds.
  • Reviewed 32 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. frontend/src/lib/components/copilot/chat/anthropic.ts:116
  • Draft comment:
    Good use of try/catch to avoid JSON.parse errors. Consider checking if toolCall.function.arguments is actually a string before parsing, and possibly using console.warn instead of console.error, to indicate non-critical parsing failures.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%.
2. frontend/src/lib/components/copilot/chat/anthropic.ts:141
  • Draft comment:
    Removing the fallback empty string for tool_use_id might lead to undefined values. Ensure that message.tool_call_id is always defined or provide a safe default.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. Looking at the type definitions, tool_use_id is required in the AnthropicContentBlock interface (line 22). The code is handling a 'tool' role message, which by definition should have a tool_call_id. If tool_call_id was undefined, it would indicate a malformed message. Adding a fallback would hide this error rather than expose it. I could be wrong about the OpenAI message format - maybe tool messages can legitimately have undefined tool_call_ids in some cases? The type definitions for the input format aren't shown. Even if undefined tool_call_ids are possible in OpenAI format, silently converting them to empty strings would be the wrong approach - it would be better to fail fast and fix the root cause. The comment should be deleted. Adding a fallback would mask potential errors rather than expose them, and the tool_use_id should be required by the message format.

Workflow ID: wflow_vx4t26YdFfYU2Y6k


Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 93ffdab in 3 minutes and 24 seconds.
  • Reviewed 208 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
1. frontend/src/lib/components/copilot/chat/AIChatManager.svelte.ts:439
  • Draft comment:
    Removing the error log in the catch block may hinder debugging. Consider conditionally logging errors (when not aborted) to aid troubleshooting.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. Looking at the code: 1. The error is still thrown if not aborted. 2. This means it will be caught by higher level error handlers. 3. Those handlers likely have their own logging. 4. The removed log was redundant with the throw. 5. The comment is suggesting a conditional log, but that's not clearly better. I could be wrong about higher level error handling - maybe this error gets swallowed somewhere and the log was important. The error might contain sensitive info that shouldn't be logged. The code explicitly throws the error when not aborted, so it will propagate up. If logging is needed, it should be handled at a higher level consistently rather than scattered throughout catch blocks. The comment should be deleted. The error handling is appropriate without the log since errors are thrown and can be handled/logged at a higher level.
2. frontend/src/lib/components/copilot/lib.ts:675
  • Draft comment:
    Nice use of async iteration for processing the streaming response. Ensure any errors during iteration are handled as needed.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment starts with a compliment about the use of async iteration, which is purely informative. The second part of the comment suggests ensuring error handling, which falls under asking the author to ensure behavior is intended or tested. This violates the rules.
3. frontend/src/lib/components/copilot/lib.ts:688
  • Draft comment:
    Resetting the 'answer' variable when tool calls are detected might discard accumulated text tokens. Please verify that this behavior is intentional.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. frontend/src/lib/components/copilot/lib.ts:697
  • Draft comment:
    The merging logic for tool call arguments concatenates strings; ensure that this preserves the expected format (e.g., valid JSON) for function arguments.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the author to ensure that the merging logic preserves the expected format, which is a form of asking the author to double-check their work. This violates the rule against asking the author to ensure behavior is intended or tested.
5. frontend/src/lib/components/copilot/lib.ts:650
  • Draft comment:
    The PR title mentions caching prompts for anthropic models, but no caching logic is visible in these changes. Please confirm that caching is implemented in the corresponding anthropic functions.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to confirm whether caching is implemented, which violates the rule against asking for confirmation of intentions or ensuring behavior. It does not provide a specific suggestion or point out a specific issue in the code.

Workflow ID: wflow_IL370nKEfztP0UtU


@windmill-labs windmill-labs deleted a comment from claude bot Aug 29, 2025
@centdix
Collaborator Author

centdix commented Aug 29, 2025

/ai review this PR

Contributor

claude bot commented Aug 29, 2025

Claude finished @centdix's task.


PR Review: feat(aichat): cache prompts when using anthropic models

Todo List:

  • Read and analyze backend changes (ai.rs)
  • Read and analyze frontend Anthropic integration (anthropic.ts)
  • Review frontend lib.ts changes for completion handling
  • Review AIChatManager.svelte.ts refactoring
  • Check shared.ts and other supporting files
  • Provide comprehensive code review feedback

Review Summary

This PR successfully implements Anthropic prompt caching by adding the official Anthropic SDK and introducing proper message conversion between OpenAI and Anthropic formats. The implementation looks solid overall with good separation of concerns and proper error handling. Here are my detailed findings:

Strengths

  1. Good Architecture: Clean separation between OpenAI and Anthropic completion handling with dedicated functions (getAnthropicCompletion, parseAnthropicCompletion)

  2. Proper Caching Implementation:

    • Backend correctly detects Anthropic SDK usage via X-Anthropic-SDK header (ai.rs:159)
    • Frontend adds proper cache control markers (ephemeral) to system messages, last user message, and tools (anthropic.ts:153, anthropic.ts:226, anthropic.ts:260)
  3. Message Format Conversion: Well-implemented conversion between OpenAI and Anthropic message formats with proper handling of tool calls and system messages

  4. Error Handling: Proper try/catch blocks and error propagation throughout the completion flow
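
The cache-marker placement described above can be sketched roughly as follows; the types are simplified stand-ins for the @anthropic-ai/sdk message shapes, and the function name is hypothetical:

```typescript
// Sketch: append an ephemeral cache marker to the system prompt and the
// last user message so Anthropic can reuse the shared prefix across requests.
// ContentBlock and Message are simplified stand-ins for the SDK's types.
type ContentBlock = { type: 'text'; text: string; cache_control?: { type: 'ephemeral' } }
type Message = { role: 'user' | 'assistant'; content: ContentBlock[] }

function markCacheBreakpoints(system: ContentBlock[], messages: Message[]): void {
	// Mark the static system prompt as cacheable.
	if (system.length > 0) {
		system[system.length - 1].cache_control = { type: 'ephemeral' }
	}
	// Mark the last user message so everything up to it is cacheable.
	const lastUser = [...messages].reverse().find((m) => m.role === 'user')
	if (lastUser && lastUser.content.length > 0) {
		lastUser.content[lastUser.content.length - 1].cache_control = { type: 'ephemeral' }
	}
}
```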

⚠️ Issues Found

High Priority

  1. JSON Parsing Vulnerability (anthropic.ts:181-184)

    try {
      input = JSON.parse(toolCall.function.arguments || '{}')
    } catch (e) {
      console.error('Failed to parse tool call arguments', e)
    }

    Issue: Silent failures on malformed JSON could lead to tools receiving empty objects instead of intended arguments, potentially causing runtime errors.
    Fix: Either throw the error or provide a more robust fallback.
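
One possible shape for that fix (a sketch with a hypothetical helper name, not the PR's actual change) is to fail loudly instead of silently substituting an empty object:

```typescript
// Sketch: parse tool-call arguments strictly, surfacing malformed JSON
// instead of silently handing the tool an empty object.
function parseToolArguments(raw: string | undefined): Record<string, unknown> {
	if (!raw || raw.trim() === '') {
		return {} // genuinely absent arguments are fine
	}
	try {
		return JSON.parse(raw)
	} catch {
		throw new Error(`Malformed tool call arguments: ${raw}`)
	}
}
```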

  2. Type Safety Concern (AIChatManager.svelte.ts:426)

    const continueCompletion = await parseFn(
      completion as any, // ← Type assertion bypasses safety
      callbacks,
      // ...
    )

    Issue: Using as any defeats TypeScript's type checking and could hide interface mismatches.
    Fix: Define proper union types or use type guards.
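
A sketch of the type-guard alternative, using hypothetical completion shapes in place of the real SDK types:

```typescript
// Sketch: a discriminated union plus a type guard lets the parser accept
// either stream without an `as any` cast. These shapes are hypothetical
// placeholders for the OpenAI / Anthropic SDK completion types.
interface OpenAICompletion { kind: 'openai' }
interface AnthropicCompletion { kind: 'anthropic' }
type Completion = OpenAICompletion | AnthropicCompletion

function isAnthropic(c: Completion): c is AnthropicCompletion {
	return c.kind === 'anthropic'
}

function parseCompletion(c: Completion): string {
	// TypeScript narrows the union in each branch; no cast needed.
	return isAnthropic(c) ? 'parsed anthropic stream' : 'parsed openai stream'
}
```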

Medium Priority

  1. Incomplete URL Handling (ai.rs:171-176)

    } else if is_anthropic_sdk {
        let truncated_base_url = base_url.trim_end_matches("/v1");
        format!("{}/{}", truncated_base_url, path)
    }

    Issue: This logic assumes base URLs end with /v1, but what if they don't? The URL construction could be malformed.
    Fix: More robust URL path joining logic.
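
The more defensive join could look like this sketch (written in TypeScript for illustration; the real code lives in Rust in ai.rs):

```typescript
// Sketch: join a base URL and a path without assuming the base ends in "/v1".
// The Anthropic SDK appends /v1 itself, so a trailing /v1 is stripped first.
function joinAnthropicUrl(baseUrl: string, path: string): string {
	let base = baseUrl.replace(/\/+$/, '') // drop trailing slashes
	if (base.endsWith('/v1')) {
		base = base.slice(0, -3) // remove the "/v1" suffix
	}
	return `${base}/${path.replace(/^\/+/, '')}`
}
```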

  2. Missing Error Context (anthropic.ts:97-100)

    completion.on('error', (error: any) => {
      console.error('Anthropic stream error:', error)
      error = error // This just reassigns the same value
    })

    Issue: The error variable assignment doesn't provide additional context, making debugging harder.
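
A sketch of the intended capture, assuming the surrounding code inspects an outer error variable after the stream ends:

```typescript
// Sketch: name the callback parameter `e` so it does not shadow the outer
// `error` variable that the caller reads once the stream finishes.
type ErrorHandler = (e: unknown) => void

function makeErrorCapture(): { handler: ErrorHandler; getError: () => unknown } {
	let error: unknown = undefined
	const handler: ErrorHandler = (e) => {
		console.error('Anthropic stream error:', e)
		error = e // assigns the outer variable instead of reassigning the parameter
	}
	return { handler, getError: () => error }
}
```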

Low Priority

  1. Hardcoded API Version (anthropic.ts:42)

    'anthropic-version': '2023-06-01'

    Issue: API version is hardcoded and might need updates.
    Suggestion: Consider making this configurable.
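
A minimal sketch of that suggestion, with a hypothetical config parameter and a pinned default:

```typescript
// Sketch: pin a default Anthropic API version but allow it to be overridden
// from configuration. The parameter name is hypothetical.
const DEFAULT_ANTHROPIC_VERSION = '2023-06-01'

function anthropicVersionHeader(configuredVersion?: string): Record<string, string> {
	return { 'anthropic-version': configuredVersion ?? DEFAULT_ANTHROPIC_VERSION }
}
```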

  2. Potential Performance Impact: Adding cache control to every message/tool could increase payload size, though this is likely acceptable for the caching benefits.

📝 Observations

  1. Dependencies: Clean addition of @anthropic-ai/sdk@^0.60.0 and OpenAI update to ^5.16.0

  2. Backend Caching: The backend caching logic in ai.rs looks correct with proper cache key generation and expiration handling

  3. Tool Compatibility: Good handling of tool call format conversion between the two APIs

  4. Stream Handling: Both completion parsers properly handle streaming responses with appropriate callbacks

🔧 Recommendations

  1. Fix the JSON parsing in anthropic.ts - This should be addressed before merging
  2. Remove the as any type assertion - Add proper typing
  3. Improve URL construction logic in the backend
  4. Add integration tests for the Anthropic completion flow if not already present
  5. Consider adding fallback logic for cache control failures

Overall Assessment

This is a well-structured PR that correctly implements Anthropic prompt caching with good separation of concerns. The main issues are relatively minor and should be straightforward to address. The caching implementation follows Anthropic's recommended patterns and should provide the intended performance benefits.

Recommendation: Approve with requested changes - address the JSON parsing and type safety issues before merging.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 2f4c8c1 in 47 seconds.
  • Reviewed 17 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
1. frontend/src/lib/components/copilot/chat/anthropic.ts:97
  • Draft comment:
    Renamed the error callback parameter from ‘error’ to ‘e’ to avoid shadowing the outer variable. This ensures that the outer 'error' is properly assigned.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative, explaining a change that was made without suggesting any action or asking for confirmation. It doesn't provide any actionable feedback or raise any concerns about the code.

Workflow ID: wflow_In8N1mij6wShyaSA


Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed b89c2cd in 55 seconds. Click for details.
  • Reviewed 28 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. frontend/src/lib/components/copilot/chat/flow/core.ts:641
  • Draft comment:
    Why disable strict mode for setForLoopOptions? Add a comment explaining this change.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to explain their change, which violates the rule against asking for explanations or confirmations of intent. It does not provide a specific code suggestion or identify a potential issue with the code itself.
2. frontend/src/lib/components/copilot/chat/flow/core.ts:662
  • Draft comment:
    Why disable strict mode for setModuleControlOptions? Please document the rationale.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to document their rationale for disabling strict mode, which is not allowed according to the rules. It is not providing a specific code suggestion or asking for a test to be written. Therefore, this comment should be removed.

Workflow ID: wflow_CiTsdwdlThcKitsT


@hugocasa hugocasa added this pull request to the merge queue Sep 1, 2025
Merged via the queue into main with commit 8b3ae19 Sep 1, 2025
11 checks passed
@hugocasa hugocasa deleted the fg/cache-anthropic branch September 1, 2025 08:55
@github-actions github-actions bot locked and limited conversation to collaborators Sep 1, 2025
Development

Successfully merging this pull request may close these issues.

feature: Support cache for Anthropic model
2 participants