Conversation

centdix
Collaborator

@centdix centdix commented Aug 28, 2025

fixes #6396


Important

Add caching for prompts when using Anthropic models, with updates to backend and frontend components for improved handling and performance.

  • Behavior:
    • Cache prompts for Anthropic models in ai.rs by checking X-Anthropic-SDK header.
    • Update URL handling for Anthropic SDK in AIRequestConfig.
  • Frontend:
    • Add @anthropic-ai/sdk dependency in package.json.
    • Implement getAnthropicCompletion and parseAnthropicCompletion in anthropic.ts for handling Anthropic model completions.
    • Modify AIChatManager.svelte.ts to use Anthropic-specific functions for completions.
  • Misc:
    • Update openai dependency version in package.json.
    • Refactor tool handling in shared.ts and apiTools.ts to support Anthropic models.

This description was created by Ellipsis for b89c2cd. You can customize this summary. It will automatically update as commits are pushed.


cloudflare-workers-and-pages bot commented Aug 28, 2025

Deploying windmill with Cloudflare Pages

Latest commit: b89c2cd
Status: ✅  Deploy successful!
Preview URL: https://8b27b482.windmill.pages.dev
Branch Preview URL: https://fg-cache-anthropic.windmill.pages.dev


@windmill-labs windmill-labs deleted a comment from claude bot Aug 28, 2025
@centdix centdix marked this pull request as ready for review August 28, 2025 15:13
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed everything up to 0bbb368 in 2 minutes and 18 seconds.
  • Reviewed 409 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. frontend/src/lib/components/copilot/chat/anthropic.ts:265
  • Draft comment:
    Consider verifying that concatenating JSON fragments in currentToolCall.args produces valid JSON. It may be beneficial to accumulate fragments and parse once complete, or add validation to ensure the final JSON is well-formed.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. This is a real potential issue - concatenating JSON fragments could produce invalid JSON if the API sends malformed or incomplete chunks. However, since this is an adapter for Anthropic's official API, we should be able to trust that their streaming format sends valid JSON fragments that concatenate correctly. The API contract likely guarantees this. I could be wrong about trusting the API - even official APIs can have bugs or edge cases. Invalid JSON could cause runtime errors. While API bugs are possible, adding validation here would add complexity for an edge case that should be handled by the API provider. If there are issues, they should be fixed in Anthropic's implementation. The comment raises a valid concern but suggests adding complexity to handle what should be guaranteed by the API contract. The comment should be removed.
2. frontend/src/lib/components/copilot/chat/anthropic.ts:206
  • Draft comment:
    Using Date.now() for messageId may lead to collisions under high load. Consider incorporating additional entropy (e.g., a random component) for more robust unique ID generation.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. The comment raises a valid point about potential ID collisions under high load. However, looking at the code context, this is a stream converter function where messageId is only used within a single stream conversion. Since each stream conversion creates its own messageId and the IDs only need to be unique within that stream, collisions across concurrent requests aren't actually a problem. The current implementation is sufficient for its purpose. I could be wrong about the scope of the messageId - perhaps it needs to be globally unique across all streams for some reason not visible in this file. The OpenAI API might have requirements about message ID uniqueness. Looking at how messageId is used, it's only referenced within this single stream conversion function and doesn't persist beyond it. The OpenAI API documentation doesn't specify any requirements about global uniqueness for streaming chunk IDs. The comment should be deleted because the current implementation is sufficient for its purpose - the messageId only needs to be unique within a single stream conversion, not globally.
3. frontend/src/lib/components/copilot/lib.ts:639
  • Draft comment:
    Ensure that config.max_tokens is defined before using it in anthropicRequest. Consider providing a default or a validation check to avoid potential runtime issues.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. frontend/src/lib/components/copilot/lib.ts:757
  • Draft comment:
    Fix the typo in the comment ('udpating' should be 'updating').
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
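
The accumulate-then-parse approach floated in draft comment 1 could look like the following sketch (a hypothetical helper, not code from this PR):

```typescript
// Sketch: buffer streamed tool-call argument fragments and parse only once
// the stream is complete, instead of validating each partial chunk.
class ToolArgsAccumulator {
	private fragments: string[] = []

	push(fragment: string): void {
		this.fragments.push(fragment)
	}

	// Throws if the concatenated fragments do not form valid JSON.
	finish(): unknown {
		const raw = this.fragments.join('')
		return raw === '' ? {} : JSON.parse(raw)
	}
}
```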

Workflow ID: wflow_pCcFxRmRrXpFyYUf

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 07e4e0d in 1 minute and 28 seconds.
  • Reviewed 32 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. frontend/src/lib/components/copilot/chat/anthropic.ts:116
  • Draft comment:
    Good use of try/catch to avoid JSON.parse errors. Consider checking if toolCall.function.arguments is actually a string before parsing, and possibly using console.warn instead of console.error, to indicate non-critical parsing failures.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%.
2. frontend/src/lib/components/copilot/chat/anthropic.ts:141
  • Draft comment:
    Removing the fallback empty string for tool_use_id might lead to undefined values. Ensure that message.tool_call_id is always defined or provide a safe default.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. Looking at the type definitions, tool_use_id is required in the AnthropicContentBlock interface (line 22). The code is handling a 'tool' role message, which by definition should have a tool_call_id. If tool_call_id was undefined, it would indicate a malformed message. Adding a fallback would hide this error rather than expose it. I could be wrong about the OpenAI message format - maybe tool messages can legitimately have undefined tool_call_ids in some cases? The type definitions for the input format aren't shown. Even if undefined tool_call_ids are possible in OpenAI format, silently converting them to empty strings would be the wrong approach - it would be better to fail fast and fix the root cause. The comment should be deleted. Adding a fallback would mask potential errors rather than expose them, and the tool_use_id should be required by the message format.

Workflow ID: wflow_vx4t26YdFfYU2Y6k


Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 93ffdab in 3 minutes and 24 seconds.
  • Reviewed 208 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
1. frontend/src/lib/components/copilot/chat/AIChatManager.svelte.ts:439
  • Draft comment:
    Removing the error log in the catch block may hinder debugging. Consider conditionally logging errors (when not aborted) to aid troubleshooting.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. Looking at the code: 1. The error is still thrown if not aborted. 2. This means it will be caught by higher level error handlers. 3. Those handlers likely have their own logging. 4. The removed log was redundant with the throw. 5. The comment is suggesting a conditional log, but that's not clearly better. I could be wrong about higher level error handling - maybe this error gets swallowed somewhere and the log was important. The error might contain sensitive info that shouldn't be logged. The code explicitly throws the error when not aborted, so it will propagate up. If logging is needed, it should be handled at a higher level consistently rather than scattered throughout catch blocks. The comment should be deleted. The error handling is appropriate without the log since errors are thrown and can be handled/logged at a higher level.
2. frontend/src/lib/components/copilot/lib.ts:675
  • Draft comment:
    Nice use of async iteration for processing the streaming response. Ensure any errors during iteration are handled as needed.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment starts with a compliment about the use of async iteration, which is purely informative. The second part of the comment suggests ensuring error handling, which falls under asking the author to ensure behavior is intended or tested. This violates the rules.
3. frontend/src/lib/components/copilot/lib.ts:688
  • Draft comment:
    Resetting the 'answer' variable when tool calls are detected might discard accumulated text tokens. Please verify that this behavior is intentional.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. frontend/src/lib/components/copilot/lib.ts:697
  • Draft comment:
    The merging logic for tool call arguments concatenates strings; ensure that this preserves the expected format (e.g., valid JSON) for function arguments.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the author to ensure that the merging logic preserves the expected format, which is a form of asking the author to double-check their work. This violates the rule against asking the author to ensure behavior is intended or tested.
5. frontend/src/lib/components/copilot/lib.ts:650
  • Draft comment:
    The PR title mentions caching prompts for anthropic models, but no caching logic is visible in these changes. Please confirm that caching is implemented in the corresponding anthropic functions.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to confirm whether caching is implemented, which violates the rule against asking for confirmation of intentions or ensuring behavior. It does not provide a specific suggestion or point out a specific issue in the code.

Workflow ID: wflow_IL370nKEfztP0UtU


@windmill-labs windmill-labs deleted a comment from claude bot Aug 29, 2025
@centdix
Collaborator Author

centdix commented Aug 29, 2025

/ai review this PR

Contributor

claude bot commented Aug 29, 2025

Claude finished @centdix's task.


PR Review: feat(aichat): cache prompts when using anthropic models

Todo List:

  • Read and analyze backend changes (ai.rs)
  • Read and analyze frontend Anthropic integration (anthropic.ts)
  • Review frontend lib.ts changes for completion handling
  • Review AIChatManager.svelte.ts refactoring
  • Check shared.ts and other supporting files
  • Provide comprehensive code review feedback

Review Summary

This PR successfully implements Anthropic prompt caching by adding the official Anthropic SDK and introducing proper message conversion between OpenAI and Anthropic formats. The implementation looks solid overall with good separation of concerns and proper error handling. Here are my detailed findings:

Strengths

  1. Good Architecture: Clean separation between OpenAI and Anthropic completion handling with dedicated functions (getAnthropicCompletion, parseAnthropicCompletion)

  2. Proper Caching Implementation:

    • Backend correctly detects Anthropic SDK usage via X-Anthropic-SDK header (ai.rs:159)
    • Frontend adds proper cache control markers (ephemeral) to system messages, last user message, and tools (anthropic.ts:153, anthropic.ts:226, anthropic.ts:260)
  3. Message Format Conversion: Well-implemented conversion between OpenAI and Anthropic message formats with proper handling of tool calls and system messages

  4. Error Handling: Proper try/catch blocks and error propagation throughout the completion flow
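
The cache-marker placement described above can be sketched roughly as follows; the types are simplified stand-ins for the @anthropic-ai/sdk message shapes, and the function name is hypothetical:

```typescript
// Sketch: append an ephemeral cache marker to the system prompt and the
// last user message so Anthropic can reuse the shared prefix across requests.
// ContentBlock and Message are simplified stand-ins for the SDK's types.
type ContentBlock = { type: 'text'; text: string; cache_control?: { type: 'ephemeral' } }
type Message = { role: 'user' | 'assistant'; content: ContentBlock[] }

function markCacheBreakpoints(system: ContentBlock[], messages: Message[]): void {
	// Mark the static system prompt as cacheable.
	if (system.length > 0) {
		system[system.length - 1].cache_control = { type: 'ephemeral' }
	}
	// Mark the last user message so everything up to it is cacheable.
	const lastUser = [...messages].reverse().find((m) => m.role === 'user')
	if (lastUser && lastUser.content.length > 0) {
		lastUser.content[lastUser.content.length - 1].cache_control = { type: 'ephemeral' }
	}
}
```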

⚠️ Issues Found

High Priority

  1. JSON Parsing Vulnerability (anthropic.ts:181-184)

    try {
      input = JSON.parse(toolCall.function.arguments || '{}')
    } catch (e) {
      console.error('Failed to parse tool call arguments', e)
    }

    Issue: Silent failures on malformed JSON could lead to tools receiving empty objects instead of intended arguments, potentially causing runtime errors.
    Fix: Either throw the error or provide a more robust fallback.
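
One possible shape for that fix (a sketch with a hypothetical helper name, not the PR's actual change) is to fail loudly instead of silently substituting an empty object:

```typescript
// Sketch: parse tool-call arguments strictly, surfacing malformed JSON
// instead of silently handing the tool an empty object.
function parseToolArguments(raw: string | undefined): Record<string, unknown> {
	if (!raw || raw.trim() === '') {
		return {} // genuinely absent arguments are fine
	}
	try {
		return JSON.parse(raw)
	} catch {
		throw new Error(`Malformed tool call arguments: ${raw}`)
	}
}
```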

  2. Type Safety Concern (AIChatManager.svelte.ts:426)

    const continueCompletion = await parseFn(
      completion as any, // ← Type assertion bypasses safety
      callbacks,
      // ...
    )

    Issue: Using as any defeats TypeScript's type checking and could hide interface mismatches.
    Fix: Define proper union types or use type guards.
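
A sketch of the type-guard alternative, using hypothetical completion shapes in place of the real SDK types:

```typescript
// Sketch: a discriminated union plus a type guard lets the parser accept
// either stream without an `as any` cast. These shapes are hypothetical
// placeholders for the OpenAI / Anthropic SDK completion types.
interface OpenAICompletion { kind: 'openai' }
interface AnthropicCompletion { kind: 'anthropic' }
type Completion = OpenAICompletion | AnthropicCompletion

function isAnthropic(c: Completion): c is AnthropicCompletion {
	return c.kind === 'anthropic'
}

function parseCompletion(c: Completion): string {
	// TypeScript narrows the union in each branch; no cast needed.
	return isAnthropic(c) ? 'parsed anthropic stream' : 'parsed openai stream'
}
```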

Medium Priority

  1. Incomplete URL Handling (ai.rs:171-176)

    } else if is_anthropic_sdk {
        let truncated_base_url = base_url.trim_end_matches("/v1");
        format!("{}/{}", truncated_base_url, path)
    }

    Issue: This logic assumes base URLs end with /v1, but what if they don't? The URL construction could be malformed.
    Fix: More robust URL path joining logic.
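
The more defensive join could look like this sketch (written in TypeScript for illustration; the real code lives in Rust in ai.rs):

```typescript
// Sketch: join a base URL and a path without assuming the base ends in "/v1".
// The Anthropic SDK appends /v1 itself, so a trailing /v1 is stripped first.
function joinAnthropicUrl(baseUrl: string, path: string): string {
	let base = baseUrl.replace(/\/+$/, '') // drop trailing slashes
	if (base.endsWith('/v1')) {
		base = base.slice(0, -3) // remove the "/v1" suffix
	}
	return `${base}/${path.replace(/^\/+/, '')}`
}
```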

  2. Missing Error Context (anthropic.ts:97-100)

    completion.on('error', (error: any) => {
      console.error('Anthropic stream error:', error)
      error = error // This just reassigns the same value
    })

    Issue: The error variable assignment doesn't provide additional context, making debugging harder.
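
A sketch of the intended capture, assuming the surrounding code inspects an outer error variable after the stream ends:

```typescript
// Sketch: name the callback parameter `e` so it does not shadow the outer
// `error` variable that the caller reads once the stream finishes.
type ErrorHandler = (e: unknown) => void

function makeErrorCapture(): { handler: ErrorHandler; getError: () => unknown } {
	let error: unknown = undefined
	const handler: ErrorHandler = (e) => {
		console.error('Anthropic stream error:', e)
		error = e // assigns the outer variable instead of reassigning the parameter
	}
	return { handler, getError: () => error }
}
```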

Low Priority

  1. Hardcoded API Version (anthropic.ts:42)

    'anthropic-version': '2023-06-01'

    Issue: API version is hardcoded and might need updates.
    Suggestion: Consider making this configurable.
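
A minimal sketch of that suggestion, with a hypothetical config parameter and a pinned default:

```typescript
// Sketch: pin a default Anthropic API version but allow it to be overridden
// from configuration. The parameter name is hypothetical.
const DEFAULT_ANTHROPIC_VERSION = '2023-06-01'

function anthropicVersionHeader(configuredVersion?: string): Record<string, string> {
	return { 'anthropic-version': configuredVersion ?? DEFAULT_ANTHROPIC_VERSION }
}
```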

  2. Potential Performance Impact: Adding cache control to every message/tool could increase payload size, though this is likely acceptable for the caching benefits.

📝 Observations

  1. Dependencies: Clean addition of @anthropic-ai/sdk@^0.60.0 and OpenAI update to ^5.16.0

  2. Backend Caching: The backend caching logic in ai.rs looks correct with proper cache key generation and expiration handling

  3. Tool Compatibility: Good handling of tool call format conversion between the two APIs

  4. Stream Handling: Both completion parsers properly handle streaming responses with appropriate callbacks

🔧 Recommendations

  1. Fix the JSON parsing in anthropic.ts - This should be addressed before merging
  2. Remove the as any type assertion - Add proper typing
  3. Improve URL construction logic in the backend
  4. Add integration tests for the Anthropic completion flow if not already present
  5. Consider adding fallback logic for cache control failures

Overall Assessment

This is a well-structured PR that correctly implements Anthropic prompt caching with good separation of concerns. The main issues are relatively minor and should be straightforward to address. The caching implementation follows Anthropic's recommended patterns and should provide the intended performance benefits.

Recommendation: Approve with requested changes - address the JSON parsing and type safety issues before merging.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 2f4c8c1 in 47 seconds.
  • Reviewed 17 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
1. frontend/src/lib/components/copilot/chat/anthropic.ts:97
  • Draft comment:
    Renamed the error callback parameter from ‘error’ to ‘e’ to avoid shadowing the outer variable. This ensures that the outer 'error' is properly assigned.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative, explaining a change that was made without suggesting any action or asking for confirmation. It doesn't provide any actionable feedback or raise any concerns about the code.

Workflow ID: wflow_In8N1mij6wShyaSA


Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed b89c2cd in 55 seconds. Click for details.
  • Reviewed 28 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. frontend/src/lib/components/copilot/chat/flow/core.ts:641
  • Draft comment:
    Why disable strict mode for setForLoopOptions? Add a comment explaining this change.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to explain their change, which violates the rule against asking for explanations or confirmations of intent. It does not provide a specific code suggestion or identify a potential issue with the code itself.
2. frontend/src/lib/components/copilot/chat/flow/core.ts:662
  • Draft comment:
    Why disable strict mode for setModuleControlOptions? Please document the rationale.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to document their rationale for disabling strict mode, which is not allowed according to the rules. It is not providing a specific code suggestion or asking for a test to be written. Therefore, this comment should be removed.

Workflow ID: wflow_CiTsdwdlThcKitsT


@hugocasa hugocasa added this pull request to the merge queue Sep 1, 2025
Merged via the queue into main with commit 8b3ae19 Sep 1, 2025
11 checks passed
@hugocasa hugocasa deleted the fg/cache-anthropic branch September 1, 2025 08:55
@github-actions github-actions bot locked and limited conversation to collaborators Sep 1, 2025
Development

Successfully merging this pull request may close these issues.

feature: Support cache for Anthropic model
2 participants