refactor(inference): modular multi-provider AI architecture#2343
camjac251 wants to merge 1 commit into karakeep-app:main
Conversation
Walkthrough: Adds multi-provider AI inference: new provider clients, factories, types, config/env vars, clientConfig fields, provider-status UI and i18n strings, docs updates, tests, tooling changes, and bumps zod plus new SDK deps.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
Pre-merge checks: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 16
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
apps/web/lib/i18n/locales/pt_BR/translation.json (1)
173-173: Critical: Duplicate JSON key `"summarization"`. The key `"summarization"` appears twice within the same `"ai"` object (lines 173 and 188). Duplicate keys are permitted by the JSON grammar but their handling is unspecified; most parsers silently use the last occurrence, making line 173 inaccessible.

🔎 Proposed fix: Remove the duplicate at line 188

```diff
   "auto_summarization": "Resumo automático",
-  "summarization": "Resumo",
   "provider_status": "Status do provedor",
```

The key already exists at line 173, so line 188 should be removed entirely.
Also applies to: 188-188
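To see concretely why the reviewer flags this, the following sketch (with hypothetical values) shows that `JSON.parse` keeps only the last occurrence of a duplicated key, as specified for ECMAScript:

```typescript
// Raw JSON with the same key twice inside one object, mirroring the
// translation file. JSON.parse keeps the value of the LAST occurrence
// of a duplicate key, so the first entry is silently lost.
const raw = '{"ai": {"summarization": "Resumo", "summarization": "Resumo 2"}}';
const parsed = JSON.parse(raw) as { ai: { summarization: string } };
console.log(parsed.ai.summarization); // "Resumo 2": the first value is discarded
```

This is why the first entry becomes unreachable at runtime even though the file still contains it.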
apps/web/lib/i18n/locales/cs/translation.json (1)
105-105: Critical: Duplicate JSON key `"summarization"`. The key `"summarization"` appears twice in the `settings.ai` object (lines 105 and 120). JSON parsers typically use the last occurrence and silently discard the first, which will cause data loss. This is likely unintentional.

🔎 Recommended fix

Review the intended structure and remove the duplicate. If line 105 should remain as the general "Shrnutí" entry and line 120 was meant to be a different key, remove line 120:

```diff
   "auto_tagging_description": "Automaticky generovat štítky pro tvoje záložky pomocí umělý inteligence.",
   "camelCase": "camelCase",
   "auto_summarization": "Automatický shrnutí",
-  "summarization": "Shrnutí",
   "provider_status": "Stav poskytovatele",
```

Alternatively, if both entries serve different purposes, one should be renamed to a unique key.
Also applies to: 120-120
apps/web/components/settings/AISettings.tsx (1)
379-383: Add missing translation key to all locale files. The `no_preference` key exists only in `en/translation.json` but is missing from 29 other locale files. This will cause non-English users to see the raw translation key `settings.ai.no_preference` instead of translated text. Add this key to all locale files under `settings.ai`: `"no_preference": "No preference"`

docs/docs/03-configuration/01-environment-variables.md (1)
85-91: Update intro text to reflect all supported providers. The intro states "Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` need to be set" but now Anthropic and Google are also supported. This should be updated to reflect all provider options.

🔎 Proposed fix

```diff
-Either `OPENAI_API_KEY` or `OLLAMA_BASE_URL` need to be set for automatic tagging to be enabled. Otherwise, automatic tagging will be skipped.
+One of `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, or `OLLAMA_BASE_URL` must be set for automatic tagging to be enabled. Otherwise, automatic tagging will be skipped.
```
♻️ Duplicate comments (4)
apps/browser-extension/package.json (1)
41-41: LGTM - Part of coordinated zod upgrade. This change is consistent with the monorepo-wide zod upgrade to ^3.25.0 for Anthropic SDK compatibility. The verification requested in apps/mobile/package.json applies here as well.
apps/web/package.json (1)
104-104: LGTM - Part of coordinated zod upgrade. This change is consistent with the monorepo-wide zod upgrade to ^3.25.0 for Anthropic SDK compatibility. The verification requested in apps/mobile/package.json applies here as well.
packages/trpc/package.json (1)
29-29: LGTM - Part of coordinated zod upgrade. This change is consistent with the monorepo-wide zod upgrade to ^3.25.0 for Anthropic SDK compatibility. The verification requested in apps/mobile/package.json applies here as well.
packages/e2e_tests/package.json (1)
24-24: LGTM - Part of coordinated zod upgrade. This change is consistent with the monorepo-wide zod upgrade to ^3.25.0 for Anthropic SDK compatibility. The verification requested in apps/mobile/package.json applies here as well.
🟡 Minor comments (11)
apps/web/lib/i18n/locales/cs/translation.json-119-119 (1)
119-119: Fix gender agreement in Czech translation. "Automatický shrnutí" has incorrect gender agreement: the adjective "Automatický" (masculine) does not agree with the neuter noun "shrnutí".

🔎 Proposed fix

```diff
-    "auto_summarization": "Automatický shrnutí",
+    "auto_summarization": "Automatické shrnutí",
```

docs/docs/03-configuration/02-different-ai-providers.md-9-14 (1)
9-14: Add language identifiers to fenced code blocks. All fenced code blocks should specify a language identifier for proper syntax highlighting and better readability. Since these are environment variable configuration examples, use `bash` or `shell`.

🔎 Example fix for the first few code blocks

````diff
-```
+```bash
 INFERENCE_PROVIDER=openai     # Use OpenAI
 INFERENCE_PROVIDER=anthropic  # Use Anthropic Claude
 INFERENCE_PROVIDER=google     # Use Google Gemini
 INFERENCE_PROVIDER=ollama     # Use Ollama (local)
````

````diff
-```
+```bash
 INFERENCE_PROVIDER=openai  # Optional, auto-detected if OPENAI_API_KEY is set
 OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
````

Apply the same pattern to all other fenced code blocks in this file (lines 60, 74, 93, 103, 123, 135, 176, 192, 210, 225, 247, 265).

Based on static analysis hints.
Also applies to: 20-27, 60-68, 74-88, 93-99, 103-115, 123-129, 135-150, 176-186, 192-199, 210-213, 225-237, 247-257, 265-272

apps/web/lib/i18n/locales/pl/translation.json-201-209 (1)

201-209: Duplicate key `image_tagging` detected. Line 202 duplicates the `image_tagging` key that already exists at line 187. This should be removed.

🔎 Proposed fix

```diff
   "auto_summarization": "Automatyczne podsumowywanie",
-  "image_tagging": "Tagowanie obrazów",
   "provider_status": "Status dostawcy",
```

apps/web/lib/i18n/locales/fi/translation.json-196-204 (1)
196-204: Duplicate key `summarization` detected. Line 197 duplicates the `summarization` key that already exists at line 182. It should be removed to prevent the duplicate from overwriting the original value.

🔎 Proposed fix

```diff
   "auto_summarization": "Automaattinen tiivistys",
-  "summarization": "Yhteenvedon luonti",
   "provider_status": "Palveluntarjoajan tila",
```

apps/web/lib/i18n/locales/ko/translation.json-395-403 (1)
395-403: Duplicate key `summarization` detected. Line 396 duplicates the `summarization` key that already exists at line 381. This should be removed.

🔎 Proposed fix

```diff
   "auto_summarization": "자동 요약",
-  "summarization": "요약",
   "provider_status": "제공자 상태",
```

apps/web/lib/i18n/locales/uk/translation.json-205-213 (1)
205-213: Duplicate key `tagging_rules` detected. Line 206 duplicates the `tagging_rules` key that already exists at line 191. In JSON, duplicate keys are technically allowed but the second value overwrites the first, which appears unintentional here. This line should be removed.

🔎 Proposed fix

```diff
   "auto_summarization": "Автоматичне підсумовування",
-  "tagging_rules": "Правила тегів",
   "provider_status": "Статус постачальника",
```

apps/web/lib/i18n/locales/es/translation.json-139-147 (1)
139-147: Duplicate key `image_tagging` detected. The key `image_tagging` is defined twice in this file, once at line 125 and again at line 140. This will cause one of the values to be silently overwritten when parsed.

🔎 Proposed fix

Remove the duplicate key at line 140:

```diff
   "camelCase": "camelCase",
   "auto_summarization": "Resumen automático",
-  "image_tagging": "Etiquetado de imágenes",
   "provider_status": "Estado del proveedor",
```

Committable suggestion skipped: line range outside the PR's diff.
apps/web/lib/i18n/locales/hr/translation.json-208-216 (1)
208-216: Duplicate key `all_tagging` detected. The key `all_tagging` is defined twice in this file, once at line 194 and again at line 209. JSON parsers will only retain one value (typically the last one), which could cause unexpected behavior.

🔎 Proposed fix

Remove the duplicate key at line 209:

```diff
   "camelCase": "camelCase",
   "auto_summarization": "Automatsko sažimanje",
-  "all_tagging": "Sve oznake",
   "provider_status": "Status pružatelja",
```

apps/web/lib/i18n/locales/it/translation.json-193-201 (1)
193-201: Duplicate translation key `summarization` detected. The key `summarization` appears twice in this file (line 179 and line 194). While both have the same value "Riassunto", the duplication should be removed for maintainability.

🔎 Proposed fix - remove the duplicate

```diff
   "camelCase": "camelCase",
   "auto_summarization": "Riassunto automatico",
-  "summarization": "Riassunto",
   "provider_status": "Stato del fornitore",
```

Committable suggestion skipped: line range outside the PR's diff.
apps/web/lib/i18n/locales/da/translation.json-162-170 (1)
162-170: Duplicate translation key `image_tagging` detected. The key `image_tagging` appears twice in this file (line 148 and line 163). JSON parsers will silently use the last occurrence, but this is likely unintended and could cause confusion during maintenance.

🔎 Proposed fix - remove the duplicate

```diff
   "camelCase": "camelCase",
-  "auto_summarization": "Automatisk opsummering",
-  "image_tagging": "Billede-tagging",
+  "auto_summarization": "Automatisk opsummering",
   "provider_status": "Udbyderstatus",
```

Committable suggestion skipped: line range outside the PR's diff.
tools/compare-models/src/inferenceClient.ts-58-80 (1)
58-80: Missing error handling for empty `choices` array. If `chatCompletion.choices` is empty, accessing `choices[0]` returns `undefined`, and `choices[0].message.content` will throw. Add a defensive check.

🔎 Proposed fix

```diff
   private async inferFromText(
     prompt: string,
     model: string,
     opts: InferenceOptions,
   ): Promise<InferenceResponse> {
     const chatCompletion = await this.client.chat.completions.create({
       messages: [{ role: "user", content: prompt }],
       model: model,
       response_format: opts.schema
         ? zodResponseFormat(opts.schema, "schema")
         : { type: "json_object" as const },
     });

-    const response = chatCompletion.choices[0].message.content;
+    const choice = chatCompletion.choices[0];
+    if (!choice) {
+      throw new Error("Got no choices from model response");
+    }
+    const response = choice.message.content;
     if (!response) {
       throw new Error("Got no message content from model");
     }
```
🧹 Nitpick comments (21)
packages/shared/index.ts (1)
1-3: Good practice: explicit path imports over barrel exports. The pattern of importing modules via specific paths (e.g., `@karakeep/shared/config`) rather than re-exporting everything through an index file is a solid choice for tree-shaking, build performance, and dependency clarity. The `export {}` statement correctly ensures this file is treated as an ES module.

One optional consideration: if your `package.json` already defines an `exports` field mapping these subpaths, this index file may not be strictly necessary. You could verify whether removing it would affect module resolution.

apps/web/lib/i18n/locales/cs/translation.json (1)
127-127: Consider an AI/ML-specific translation for "embeddings". "Vložení" (insertion) may not be the most idiomatic translation for "embeddings" in the AI/ML context, where it typically refers to vector representations. Consider using "embeddingy" (a borrowed term common in Czech AI/ML communities) or "vektorové reprezentace" (vector representations) for better domain clarity.
packages/shared/config.ts (1)
543-548: Consider documenting the read-only nature of exposed provider info. The comment on line 543 mentions "Provider info for read-only display", which is good. Consider whether these fields could potentially leak sensitive information or if additional documentation is needed for consumers of this client config.
This is just a documentation/clarity suggestion—the implementation looks correct for displaying provider status in the UI (as seen in AISettings.tsx).
packages/shared/inference/factory.test.ts (1)
1-159: Comprehensive factory test coverage looks excellent! The test suite thoroughly covers:
- All four inference providers (OpenAI, Anthropic, Google, Ollama)
- All three embedding providers (OpenAI, Google, Ollama)
- Null and unknown provider edge cases
- Proper verification that Anthropic returns null for embeddings (no embedding support)
The mocking strategy is sound and prevents real API client construction. The use of `beforeEach` with `vi.clearAllMocks()` ensures test isolation.

Optional: Consider helper for type assertions
The type assertions on lines 88, 97, 143, and 152 are necessary for testing runtime behavior but bypass TypeScript's safety. Consider extracting a test helper to make the intent clearer:
```diff
+function setInvalidProvider(config: any, type: 'inference' | 'embedding', value: string | null) {
+  config[type].provider = value;
+}
+
 it("should return null when provider is not configured", async () => {
   const { default: serverConfig } = await import("../config");
-  (serverConfig.inference as { provider: string | null }).provider = null;
+  setInvalidProvider(serverConfig, 'inference', null);
   const client = InferenceClientFactory.build();
   expect(client).toBeNull();
 });
```

This is purely optional as the current approach is acceptable for test code.
packages/shared/inference/openai.test.ts (2)
278-282: Config cleanup is incomplete - missing `textModel` restoration. The `afterEach` block restores `outputSchema`, but the `beforeEach` at line 273 also modifies `textModel`. While it may not affect other tests in this suite, it's good practice to restore all modified values for test isolation.

🔎 Proposed fix

```diff
 afterEach(async () => {
   // Restore original config values to prevent test pollution
   const { default: serverConfig } = await import("../config");
   serverConfig.inference.outputSchema = "structured";
+  serverConfig.inference.textModel = "gpt-4o-mini";
 });
```
330-334: Config cleanup is incomplete - missing `textModel` restoration. Same issue as the JSON output test suite: `textModel` is modified in `beforeEach` (line 325) but not restored in `afterEach`.

🔎 Proposed fix

```diff
 afterEach(async () => {
   // Restore original config values to prevent test pollution
   const { default: serverConfig } = await import("../config");
   serverConfig.inference.outputSchema = "structured";
+  serverConfig.inference.textModel = "gpt-4o-mini";
 });
```

packages/shared/inference/ollama.test.ts (2)
251-277: Missing `afterEach` cleanup for config restoration. The `beforeEach` modifies `serverConfig.inference.outputSchema` to `"json"` but there is no `afterEach` to restore it. This could cause test pollution if test execution order changes.

🔎 Proposed fix

```diff
 beforeEach(async () => {
   vi.clearAllMocks();

   // Set output schema to json (not structured)
   const { default: serverConfig } = await import("../config");
   serverConfig.inference.outputSchema = "json";

   client = new OllamaInferenceClient();
 });

+afterEach(async () => {
+  const { default: serverConfig } = await import("../config");
+  serverConfig.inference.outputSchema = "structured";
+});
+
 it("should use 'json' format string when outputSchema is json", async () => {
```
279-306: Missing `afterEach` cleanup for config restoration. Same issue as the JSON output test suite: `outputSchema` is modified to `"plain"` but not restored.

🔎 Proposed fix

```diff
 beforeEach(async () => {
   vi.clearAllMocks();

   // Set output schema to plain
   const { default: serverConfig } = await import("../config");
   serverConfig.inference.outputSchema = "plain";

   client = new OllamaInferenceClient();
 });

+afterEach(async () => {
+  const { default: serverConfig } = await import("../config");
+  serverConfig.inference.outputSchema = "structured";
+});
+
 it("should not set format when outputSchema is plain", async () => {
```

packages/shared/inference/anthropic.test.ts (1)
309-342: Consider adding a test for unsupported media type. The `supportsStructuredOutputs` tests are thorough. Consider adding a test in the `inferFromImage` describe block that verifies the error thrown for unsupported image types (e.g., `image/tiff`).

```typescript
it("should throw error for unsupported image type", async () => {
  await expect(
    client.inferFromImage("describe", "image/tiff", "imagedata", {}),
  ).rejects.toThrow("Unsupported image type");
});
```

packages/shared/inference/openai.ts (2)
316-360: Consider defining a type for the Responses API request. The `any` type with eslint-disable is pragmatic but reduces type safety. As the Responses API stabilizes, consider defining a proper interface.

```typescript
interface ResponsesApiRequest {
  model: string;
  input: string | unknown[];
  text: ReturnType<typeof this.getResponsesTextFormat>;
  store: boolean;
  temperature: number;
  top_p: number;
  reasoning?: { effort: string };
  max_output_tokens?: number;
  previous_response_id?: string;
}
```
389-398: Silent JSON parsing failure may hide issues. When JSON parsing fails, the code silently falls back to the raw `outputText`. This is reasonable for resilience, but consider logging a warning in development mode to help debug malformed responses.

packages/shared/inference/google.test.ts (1)
233-268: Config mutation in tests may cause flakiness if tests run in parallel. The pattern of mutating `serverConfig.inference.outputSchema` in `beforeEach`/`afterEach` could cause issues if Vitest runs tests concurrently. Consider using `vi.doMock` with a fresh import for each test, or explicitly setting `--no-threads` for this test file.

🔎 Alternative approach using isolated mocks

```diff
 describe("GoogleGeminiInferenceClient with plain output", () => {
   let client: GoogleGeminiInferenceClient;

-  beforeEach(async () => {
-    vi.clearAllMocks();
-
-    // Set output schema to plain text
-    const { default: serverConfig } = await import("../config");
-    serverConfig.inference.outputSchema = "plain";
-
-    client = new GoogleGeminiInferenceClient();
-  });
-
-  afterEach(async () => {
-    // Restore original config value to prevent test pollution
-    const { default: serverConfig } = await import("../config");
-    serverConfig.inference.outputSchema = "structured";
-  });
+  beforeEach(() => {
+    vi.clearAllMocks();
+    // Override the mock for this describe block
+    vi.doMock("../config", () => ({
+      default: {
+        inference: {
+          provider: "google",
+          geminiApiKey: "test-gemini-key",
+          textModel: "gemini-2.5-flash",
+          imageModel: "gemini-2.5-flash",
+          maxOutputTokens: 2048,
+          outputSchema: "plain",
+        },
+        embedding: {
+          provider: "google",
+          textModel: "text-embedding-004",
+        },
+      },
+    }));
+  });
```

packages/shared/inference/ollama.ts (1)
84-95: Consider using `undefined` instead of `NaN` for a failed token count. Setting `totalTokens = NaN` on error could cause unexpected behavior if downstream code performs arithmetic operations. The `InferenceResponse` type declares `totalTokens: number | undefined`, so `undefined` would be more consistent with the type contract.

🔎 Proposed fix

```diff
   // https://github.com/ollama/ollama-js/issues/72
-  totalTokens = NaN;
+  totalTokens = undefined;
   logger.warn(
     `Got an exception from ollama, will still attempt to deserialize the response we got so far: ${e}`,
   );
```

Note: This requires changing the return type handling, since `totalTokens` is initialized as `number`.

packages/shared/inference/live-test.ts (2)
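A minimal standalone demonstration of why `NaN` is a risky sentinel here (not the Ollama client code itself):

```typescript
// NaN silently propagates through arithmetic instead of failing fast,
// so any aggregated token count that touches it becomes NaN too.
const failedCallTokens: number = NaN;
const runningTotal = 1200 + failedCallTokens;
console.log(Number.isNaN(runningTotal)); // true: the whole total is poisoned
// With `undefined` and a `number | undefined` type, the compiler forces
// callers to handle the missing value before doing arithmetic on it.
```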
186-210: Source file inspection test is fragile and may break on refactoring. This test reads the `openai.ts` source file to verify string patterns exist. This approach couples tests to implementation details and will break if the code is refactored (e.g., extracting to constants, changing string format). Consider testing the actual behavior instead, either by calling the client with different model names or by exporting and testing a model detection utility function.

🔎 Alternative: test exported behavior

If model detection logic is important, export a utility function and test it directly:

```typescript
// In openai.ts - export the detection logic
export function shouldUseResponsesApi(model: string): boolean {
  return (
    model.startsWith("gpt-5") ||
    model.startsWith("o1") ||
    model.startsWith("o3") ||
    model.startsWith("o4")
  );
}

// In live-test.ts - test the function
await runTest("openai", "responses-model-check", async () => {
  const { shouldUseResponsesApi } = await import("./openai.js");
  if (!shouldUseResponsesApi("gpt-5.2")) throw new Error("Should detect gpt-5.2");
  if (!shouldUseResponsesApi("o1-mini")) throw new Error("Should detect o1-mini");
  if (shouldUseResponsesApi("gpt-4o")) throw new Error("Should NOT detect gpt-4o");
  return "Model detection working correctly";
});
```
499-513: Minor: `{ schema: undefined }` can be simplified. Using `{}` is cleaner since `schema` is optional in `Partial<InferenceOptions>`.

🔎 Proposed simplification

```diff
 await runTest(`factory-${provider}`, "inferFromText-plain", async () => {
   const result = await client.inferFromText(
     "Summarize in one sentence: TypeScript is a typed superset of JavaScript.",
-    { schema: undefined },
+    {},
   );
```

docs/docs/03-configuration/01-environment-variables.md (1)
100-101: Clarify `OPENAI_REASONING_EFFORT` default behavior. Line 101 states the default is `low`, but also mentions that `gpt-5.1` defaults to `none`. This could be confusing. Consider clarifying whether this env var overrides model defaults or only applies when the model doesn't have its own default.
Reasoning effort level for GPT-5 models when using Responses API. Options:
none,minimal,low,medium,high,xhigh. Default islow. Note: Some models have their own defaults (gpt-5.1usesnone,gpt-5-proonly supportshigh).packages/shared/inference/google.ts (4)
23-45: Consider validating the `outputSchema` config value. The function reads `serverConfig.inference.outputSchema` and branches on `"plain"` vs any other value. If an invalid value is configured, it silently defaults to JSON mode, which may not be the intended behavior.

🔎 Suggested improvement

```diff
 function buildGenerationConfig(
   opts: InferenceOptions,
 ): Record<string, unknown> {
   const config: Record<string, unknown> = {
     maxOutputTokens: serverConfig.inference.maxOutputTokens,
   };

   // Configure response format based on outputSchema setting
-  if (serverConfig.inference.outputSchema === "plain") {
+  const outputSchema = serverConfig.inference.outputSchema;
+  if (outputSchema === "plain") {
     config.responseMimeType = "text/plain";
-  } else {
+  } else if (outputSchema === "json" || outputSchema === "structured") {
     config.responseMimeType = "application/json";
     // If a Zod schema is provided, convert it to JSON schema for structured output
     if (opts.schema) {
       config.responseJsonSchema = zodToJsonSchema(opts.schema, {
         $refStrategy: "none",
       });
     }
+  } else {
+    throw new Error(`Invalid outputSchema value: ${outputSchema}`);
   }

   return config;
 }
```
80-110: Unused options: `reasoningEffort`, `previousResponseId`, and `store` are ignored. The `InferenceOptions` interface includes OpenAI Responses API-specific fields (`reasoningEffort`, `previousResponseId`, `store`) that are silently ignored here. This is acceptable since these are provider-specific, but consider documenting this behavior or logging a warning if these options are passed to a provider that doesn't support them.
169-187: Sequential batch processing may be slow for large inputs. The batching logic processes each batch sequentially with `await`. For very large input arrays this could be slow. Consider using `Promise.all` with rate limiting for parallel processing, though the current sequential approach is safer and avoids rate limit issues.
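The sequential pattern described above can be sketched provider-agnostically; `embedInBatches` and the fake embedder below are hypothetical stand-ins, not the real Google client:

```typescript
// Sketch: split inputs into fixed-size batches and await each batch in turn.
// Awaiting sequentially naturally throttles requests against rate limits.
async function embedInBatches(
  inputs: string[],
  batchSize: number,
  embedBatch: (batch: string[]) => Promise<number[][]>,
): Promise<number[][]> {
  const out: number[][] = [];
  for (let i = 0; i < inputs.length; i += batchSize) {
    out.push(...(await embedBatch(inputs.slice(i, i + batchSize))));
  }
  return out;
}

// Fake embedder for illustration: one vector per input
const fake = async (batch: string[]) => batch.map((s) => [s.length]);
embedInBatches(["a", "bb", "ccc"], 2, fake).then((v) => console.log(v)); // resolves to [[1], [2], [3]]
```

Swapping the loop body for `Promise.all` over a window of batches would parallelize it, at the cost of burstier traffic toward the provider.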
189-198: Handle an empty embeddings array defensively. If the API returns an embedding with `values` as `undefined`, it becomes an empty array. This may cause downstream issues if consumers expect embeddings of consistent dimensions.

🔎 Suggested improvement

```diff
 private async embedBatch(inputs: string[]): Promise<EmbeddingResponse> {
   const result = await this.ai.models.embedContent({
     model: serverConfig.embedding.textModel,
     contents: inputs,
   });
-  const embeddings = (result.embeddings ?? []).map((e) => e.values ?? []);
+  const embeddings = (result.embeddings ?? []).map((e) => {
+    if (!e.values || e.values.length === 0) {
+      throw new Error("Received empty embedding values from Google API");
+    }
+    return e.values;
+  });
   return { embeddings };
 }
```

tools/compare-models/src/inferenceClient.ts (1)
5-12: Type duplication with `packages/shared/inference/types.ts`. The `InferenceOptions` and `InferenceResponse` interfaces duplicate the definitions in `packages/shared/inference/types.ts`. Consider importing from the shared package to maintain consistency and avoid drift.

🔎 Suggested change

```diff
 import OpenAI from "openai";
 import { zodResponseFormat } from "openai/helpers/zod";
 import { z } from "zod";
+import type { InferenceOptions, InferenceResponse } from "@karakeep/shared/inference";

-export interface InferenceOptions {
-  schema: z.ZodSchema<any> | null;
-}
-
-export interface InferenceResponse {
-  response: string;
-  totalTokens: number | undefined;
-}
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (68)
- apps/browser-extension/package.json
- apps/mcp/package.json
- apps/mobile/package.json
- apps/web/components/settings/AISettings.tsx
- apps/web/lib/clientConfig.tsx
- apps/web/lib/i18n/locales/ar/translation.json
- apps/web/lib/i18n/locales/cs/translation.json
- apps/web/lib/i18n/locales/da/translation.json
- apps/web/lib/i18n/locales/de/translation.json
- apps/web/lib/i18n/locales/el/translation.json
- apps/web/lib/i18n/locales/en/translation.json
- apps/web/lib/i18n/locales/en_US/translation.json
- apps/web/lib/i18n/locales/es/translation.json
- apps/web/lib/i18n/locales/fa/translation.json
- apps/web/lib/i18n/locales/fi/translation.json
- apps/web/lib/i18n/locales/fr/translation.json
- apps/web/lib/i18n/locales/ga/translation.json
- apps/web/lib/i18n/locales/gl/translation.json
- apps/web/lib/i18n/locales/hr/translation.json
- apps/web/lib/i18n/locales/hu/translation.json
- apps/web/lib/i18n/locales/it/translation.json
- apps/web/lib/i18n/locales/ja/translation.json
- apps/web/lib/i18n/locales/ko/translation.json
- apps/web/lib/i18n/locales/nb_NO/translation.json
- apps/web/lib/i18n/locales/nl/translation.json
- apps/web/lib/i18n/locales/pl/translation.json
- apps/web/lib/i18n/locales/pt/translation.json
- apps/web/lib/i18n/locales/pt_BR/translation.json
- apps/web/lib/i18n/locales/ru/translation.json
- apps/web/lib/i18n/locales/sk/translation.json
- apps/web/lib/i18n/locales/sl/translation.json
- apps/web/lib/i18n/locales/sv/translation.json
- apps/web/lib/i18n/locales/tr/translation.json
- apps/web/lib/i18n/locales/uk/translation.json
- apps/web/lib/i18n/locales/vi/translation.json
- apps/web/lib/i18n/locales/zh/translation.json
- apps/web/lib/i18n/locales/zhtw/translation.json
- apps/web/package.json
- apps/workers/package.json
- docs/docs/03-configuration/01-environment-variables.md
- docs/docs/03-configuration/02-different-ai-providers.md
- package.json
- packages/api/package.json
- packages/benchmarks/package.json
- packages/e2e_tests/package.json
- packages/open-api/package.json
- packages/shared/config.ts
- packages/shared/index.ts
- packages/shared/inference.ts
- packages/shared/inference/anthropic.test.ts
- packages/shared/inference/anthropic.ts
- packages/shared/inference/factory.test.ts
- packages/shared/inference/factory.ts
- packages/shared/inference/google.test.ts
- packages/shared/inference/google.ts
- packages/shared/inference/index.ts
- packages/shared/inference/live-test.ts
- packages/shared/inference/ollama.test.ts
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/shared/inference/openai.ts
- packages/shared/inference/types.ts
- packages/shared/package.json
- packages/trpc/package.json
- tools/compare-models/package.json
- tools/compare-models/src/bookmarkProcessor.ts
- tools/compare-models/src/config.ts
- tools/compare-models/src/inferenceClient.ts
💤 Files with no reviewable changes (2)
- tools/compare-models/src/config.ts
- packages/shared/inference.ts
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx,js,jsx,json,css,md}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier according to project standards
Files:
packages/api/package.json, apps/workers/package.json, packages/open-api/package.json, apps/web/lib/i18n/locales/ja/translation.json, packages/e2e_tests/package.json, apps/web/package.json, packages/benchmarks/package.json, apps/web/lib/i18n/locales/cs/translation.json, packages/shared/inference/factory.test.ts, apps/web/lib/i18n/locales/de/translation.json, apps/web/components/settings/AISettings.tsx, apps/web/lib/i18n/locales/hu/translation.json, apps/web/lib/i18n/locales/en/translation.json, apps/web/lib/i18n/locales/gl/translation.json, packages/shared/inference/anthropic.test.ts, packages/shared/index.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/index.ts, apps/web/lib/i18n/locales/el/translation.json, packages/trpc/package.json, packages/shared/inference/openai.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/google.test.ts, packages/shared/inference/factory.ts, packages/shared/inference/types.ts, apps/web/lib/clientConfig.tsx, packages/shared/inference/ollama.ts, apps/web/lib/i18n/locales/en_US/translation.json, packages/shared/inference/google.ts, apps/web/lib/i18n/locales/sk/translation.json, apps/mobile/package.json, apps/web/lib/i18n/locales/nb_NO/translation.json, apps/web/lib/i18n/locales/zhtw/translation.json, apps/web/lib/i18n/locales/sv/translation.json, apps/web/lib/i18n/locales/tr/translation.json, apps/web/lib/i18n/locales/fa/translation.json, apps/web/lib/i18n/locales/vi/translation.json, apps/web/lib/i18n/locales/ru/translation.json, apps/browser-extension/package.json, packages/shared/inference/ollama.test.ts, apps/web/lib/i18n/locales/sl/translation.json, apps/web/lib/i18n/locales/ar/translation.json, packages/shared/package.json, packages/shared/config.ts, apps/mcp/package.json, apps/web/lib/i18n/locales/uk/translation.json, apps/web/lib/i18n/locales/pl/translation.json, apps/web/lib/i18n/locales/zh/translation.json, tools/compare-models/package.json, apps/web/lib/i18n/locales/it/translation.json, apps/web/lib/i18n/locales/fi/translation.json, apps/web/lib/i18n/locales/hr/translation.json, docs/docs/03-configuration/02-different-ai-providers.md, apps/web/lib/i18n/locales/es/translation.json, tools/compare-models/src/inferenceClient.ts, apps/web/lib/i18n/locales/pt_BR/translation.json, packages/shared/inference/live-test.ts, apps/web/lib/i18n/locales/da/translation.json, apps/web/lib/i18n/locales/pt/translation.json, apps/web/lib/i18n/locales/ko/translation.json, package.json, tools/compare-models/src/bookmarkProcessor.ts, apps/web/lib/i18n/locales/ga/translation.json, apps/web/lib/i18n/locales/fr/translation.json, apps/web/lib/i18n/locales/nl/translation.json, docs/docs/03-configuration/01-environment-variables.md
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use TypeScript for type safety in all source files
Files:
packages/shared/inference/factory.test.ts, apps/web/components/settings/AISettings.tsx, packages/shared/inference/anthropic.test.ts, packages/shared/index.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/index.ts, packages/shared/inference/openai.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/google.test.ts, packages/shared/inference/factory.ts, packages/shared/inference/types.ts, apps/web/lib/clientConfig.tsx, packages/shared/inference/ollama.ts, packages/shared/inference/google.ts, packages/shared/inference/ollama.test.ts, packages/shared/config.ts, tools/compare-models/src/inferenceClient.ts, packages/shared/inference/live-test.ts, tools/compare-models/src/bookmarkProcessor.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Lint code using oxlint and fix issues with
pnpm lint:fix
Files:
packages/shared/inference/factory.test.ts, apps/web/components/settings/AISettings.tsx, packages/shared/inference/anthropic.test.ts, packages/shared/index.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/index.ts, packages/shared/inference/openai.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/google.test.ts, packages/shared/inference/factory.ts, packages/shared/inference/types.ts, apps/web/lib/clientConfig.tsx, packages/shared/inference/ollama.ts, packages/shared/inference/google.ts, packages/shared/inference/ollama.test.ts, packages/shared/config.ts, tools/compare-models/src/inferenceClient.ts, packages/shared/inference/live-test.ts, tools/compare-models/src/bookmarkProcessor.ts
**/*.{test,spec}.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Vitest for writing and running tests
Files:
packages/shared/inference/factory.test.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/google.test.ts, packages/shared/inference/ollama.test.ts
packages/shared/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Organize shared code and types in the
packages/shareddirectory for use across packages
Files:
packages/shared/inference/factory.test.ts, packages/shared/inference/anthropic.test.ts, packages/shared/index.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/index.ts, packages/shared/inference/openai.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/google.test.ts, packages/shared/inference/factory.ts, packages/shared/inference/types.ts, packages/shared/inference/ollama.ts, packages/shared/inference/google.ts, packages/shared/inference/ollama.test.ts, packages/shared/config.ts, packages/shared/inference/live-test.ts
apps/web/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
apps/web/**/*.{ts,tsx}: Use Tailwind CSS for styling in the web application
Use Next.js for building the main web application
Files:
apps/web/components/settings/AISettings.tsx, apps/web/lib/clientConfig.tsx
🧠 Learnings (9)
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to **/*.{ts,tsx,js,jsx,json,css,md} : Format code using Prettier according to project standards
Applied to files:
apps/workers/package.json
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/e2e_tests/**/*.{ts,tsx,test,spec} : Write end-to-end tests in the `packages/e2e_tests` directory
Applied to files:
packages/shared/inference/factory.test.ts, packages/shared/inference/openai.test.ts
📚 Learning: 2026-01-03T11:36:34.916Z
Learnt from: RobertRosca
Repo: karakeep-app/karakeep PR: 2339
File: packages/shared/config.ts:62-62
Timestamp: 2026-01-03T11:36:34.916Z
Learning: In packages/shared/config.ts, enforce OpenAI SDK version compatibility: service_tier values are limited to ["auto", "default", "flex"]. The "priority" tier requires OpenAI SDK >= v5.8.0. Add a guard or validation in config to prevent using priority tier unless the SDK is upgraded (v5.8.0+). Consider documenting this constraint and adding a unit test or lint rule to ensure only allowed service_tier values are used based on the installed SDK version.
Applied to files:
packages/shared/inference/factory.test.ts, packages/shared/inference/anthropic.test.ts, packages/shared/index.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/index.ts, packages/shared/inference/openai.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/google.test.ts, packages/shared/inference/factory.ts, packages/shared/inference/types.ts, packages/shared/inference/ollama.ts, packages/shared/inference/google.ts, packages/shared/inference/ollama.test.ts, packages/shared/config.ts, packages/shared/inference/live-test.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/shared/**/*.{ts,tsx} : Organize shared code and types in the `packages/shared` directory for use across packages
Applied to files:
packages/shared/index.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/shared-server/**/*.{ts,tsx} : Place shared server-side logic in `packages/shared-server`
Applied to files:
packages/shared/index.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/shared-react/**/*.{ts,tsx} : Place shared React components and hooks in `packages/shared-react`
Applied to files:
packages/shared/index.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to **/*.{test,spec}.{ts,tsx,js,jsx} : Use Vitest for writing and running tests
Applied to files:
packages/shared/inference/openai.test.ts, packages/shared/inference/google.test.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Lint code using oxlint and fix issues with `pnpm lint:fix`
Applied to files:
package.json
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Use Turborepo for managing the monorepo build system and tasks
Applied to files:
package.json
🧬 Code graph analysis (10)
packages/shared/inference/factory.test.ts (1)
packages/shared/inference/factory.ts (2)
InferenceClientFactory(17-41), EmbeddingClientFactory(54-76)
apps/web/components/settings/AISettings.tsx (4)
apps/web/lib/i18n/client.ts (1)
useTranslation(33-33)apps/web/lib/i18n/server.ts (1)
useTranslation(23-36)packages/shared/config.ts (1)
clientConfig(524-552)apps/web/lib/clientConfig.tsx (1)
useClientConfig(29-31)
packages/shared/inference/openai.test.ts (1)
packages/shared/inference/openai.ts (2)
OpenAIInferenceClient(136-405), OpenAIEmbeddingClient(412-432)
packages/shared/inference/google.test.ts (1)
packages/shared/inference/google.ts (2)
GoogleGeminiInferenceClient(73-153), GoogleEmbeddingClient(162-199)
packages/shared/inference/factory.ts (1)
packages/shared/inference/types.ts (2)
InferenceClient(26-37), EmbeddingClient(39-41)
packages/shared/inference/types.ts (1)
tools/compare-models/src/inferenceClient.ts (3)
InferenceResponse(9-12), InferenceOptions(5-7), InferenceClient(14-131)
packages/shared/inference/ollama.ts (2)
packages/shared/inference/index.ts (7)
InferenceClient(19-19)InferenceOptions(23-23)InferenceResponse(21-21)defaultInferenceOptions(26-26)OllamaEmbeddingClient(35-35)EmbeddingClient(20-20)EmbeddingResponse(22-22)packages/shared/inference/types.ts (6)
InferenceClient(26-37)InferenceOptions(12-20)InferenceResponse(3-6)defaultInferenceOptions(22-24)EmbeddingClient(39-41)EmbeddingResponse(8-10)
packages/shared/inference/google.ts (2)
packages/shared/inference/index.ts (8)
InferenceOptions(23-23)GoogleGeminiInferenceClient(34-34)InferenceClient(19-19)InferenceResponse(21-21)defaultInferenceOptions(26-26)GoogleEmbeddingClient(34-34)EmbeddingClient(20-20)EmbeddingResponse(22-22)packages/shared/inference/types.ts (6)
InferenceOptions(12-20)InferenceClient(26-37)InferenceResponse(3-6)defaultInferenceOptions(22-24)EmbeddingClient(39-41)EmbeddingResponse(8-10)
packages/shared/inference/ollama.test.ts (2)
packages/shared/inference/index.ts (2)
OllamaInferenceClient(35-35), OllamaEmbeddingClient(35-35)
packages/shared/inference/ollama.ts (2)
OllamaInferenceClient(21-133)OllamaEmbeddingClient(140-162)
tools/compare-models/src/inferenceClient.ts (1)
packages/shared/inference/types.ts (1)
InferenceResponse(3-6)
🪛 markdownlint-cli2 (0.18.1)
docs/docs/03-configuration/02-different-ai-providers.md
9-9: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
20-20: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
60-60: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
74-74: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
93-93: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
103-103: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
123-123: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
135-135: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
176-176: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
192-192: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
210-210: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
225-225: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
247-247: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
265-265: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
Greptile Summary: Refactored the monolithic inference module into a modular multi-provider architecture supporting OpenAI, Anthropic Claude, Google Gemini, and Ollama with native SDK integrations. Key Changes:
Confidence Score: 5/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant App as Application Code
    participant Factory as InferenceClientFactory
    participant Config as serverConfig
    participant OpenAI as OpenAIInferenceClient
    participant Anthropic as AnthropicInferenceClient
    participant Google as GoogleGeminiInferenceClient
    participant Ollama as OllamaInferenceClient
    participant SDK as Provider SDK
    App->>Factory: InferenceClientFactory.build()
    Factory->>Config: Get inference.provider
    alt provider: openai
        Factory->>OpenAI: new OpenAIInferenceClient()
        OpenAI->>Config: Get OpenAI config
        OpenAI-->>Factory: client instance
    else provider: anthropic
        Factory->>Anthropic: new AnthropicInferenceClient()
        Anthropic->>Config: Get Anthropic config
        Anthropic-->>Factory: client instance
    else provider: google
        Factory->>Google: new GoogleGeminiInferenceClient()
        Google->>Config: Get Gemini config
        Google-->>Factory: client instance
    else provider: ollama
        Factory->>Ollama: new OllamaInferenceClient()
        Ollama->>Config: Get Ollama config
        Ollama-->>Factory: client instance
    else provider: null
        Factory-->>App: null
    end
    Factory-->>App: InferenceClient
    App->>OpenAI: inferFromText(prompt, options)
    alt shouldUseResponsesApi()
        OpenAI->>SDK: responses.create() [GPT-5/o-series]
        Note over OpenAI,SDK: Supports reasoning effort,<br/>conversation, structured outputs
    else
        OpenAI->>SDK: chat.completions.create()
        Note over OpenAI,SDK: Legacy Chat Completions API<br/>for GPT-4 and compatible models
    end
    SDK-->>OpenAI: API response
    OpenAI-->>App: InferenceResponse {response, totalTokens}
```
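The factory branch in the diagram above can be sketched in a few lines. This is an illustrative stand-in, not the PR's actual `InferenceClientFactory`: the real implementation reads `serverConfig` and constructs provider-specific SDK clients, while here each branch returns a stub so the selection logic is visible on its own.

```typescript
// Hypothetical sketch of the provider-selection factory; the client bodies are stubs.
type Provider = "openai" | "anthropic" | "google" | "ollama" | null;

interface InferenceClient {
  provider: Exclude<Provider, null>;
  inferFromText(prompt: string): Promise<{ response: string; totalTokens: number }>;
}

function makeStubClient(provider: Exclude<Provider, null>): InferenceClient {
  return {
    provider,
    // A real client would call the provider's SDK here (responses.create(), messages.create(), ...).
    inferFromText: async (prompt) => ({
      response: `[${provider}] ${prompt}`,
      totalTokens: 0,
    }),
  };
}

// Mirrors the alt-branches in the sequence diagram: one client per configured
// provider, and null when no provider is configured.
function buildInferenceClient(provider: Provider): InferenceClient | null {
  switch (provider) {
    case "openai":
    case "anthropic":
    case "google":
    case "ollama":
      return makeStubClient(provider);
    default:
      return null;
  }
}
```

Callers then only depend on the shared `InferenceClient` interface, which is the point of the refactor: swapping providers never touches call sites.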
Greptile found no issues! From now on, if a review finishes and we haven't found any issues, we will not post anything, but you can confirm that we reviewed your changes in the status check section. This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".
d52c0e7 to c0206dc (Compare)
Actionable comments posted: 8
♻️ Duplicate comments (1)
packages/shared/inference/types.ts (1)
16-24: Verify reasoning_effort values match OpenAI API specification. The `reasoningEffort` type includes values (`"none"`, `"minimal"`, `"xhigh"`) that may not be supported by the standard OpenAI Responses API. According to previous verification, the documented values for o-series models are `"low"`, `"medium"`, and `"high"` (default: `"medium"`). While the JSDoc comments describe model-specific capabilities, the type itself permits values that may cause runtime errors if passed to models that don't support them. Consider either:
- Restricting the type to the standard API values: `"low" | "medium" | "high"`
- Adding runtime validation that checks the model and reasoning effort combination
- Separating the type into different unions for different model families
Based on past review findings that identified API specification mismatches.
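The first two options above can be combined in a small guard. This is an illustrative sketch, not the PR's code: the restricted union follows the documented o-series values, and the model-prefix list (`o1`, `o3`, `o4-mini`, `gpt-5`) is an assumption standing in for a real capability table.

```typescript
// Restrict the type to the documented values, then validate the model/effort pairing.
type ReasoningEffort = "low" | "medium" | "high";

// Assumed prefixes for models that accept a reasoning_effort parameter.
const REASONING_MODELS = ["o1", "o3", "o4-mini", "gpt-5"];

function assertReasoningEffort(model: string, effort: ReasoningEffort): void {
  const supportsReasoning = REASONING_MODELS.some((m) => model.startsWith(m));
  if (!supportsReasoning) {
    throw new Error(
      `Model ${model} does not accept reasoning_effort "${effort}"`,
    );
  }
}
```

With the union narrowed, values like `"xhigh"` become compile-time errors, and the runtime check catches the remaining model/effort mismatches before the request is sent.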
🧹 Nitpick comments (13)
tools/compare-models/src/inferenceClient.ts (3)
104-130: Consider edge cases in JSON extraction fallback logic. The greedy regex `/\{[\s\S]*\}/` on line 119 matches from the first `{` to the last `}` in the string, which works for single objects but could produce invalid JSON if the response contains multiple separate JSON objects or malformed content between braces. Also, if all parsing strategies fail, the final `JSON.parse(trimmedResponse)` throws a generic `SyntaxError`. Wrapping this in a descriptive error would improve debuggability.

🔎 Optional improvement
```diff
-    return JSON.parse(trimmedResponse);
+    try {
+      return JSON.parse(trimmedResponse);
+    } catch (e) {
+      throw new Error(
+        `Failed to parse JSON from response: ${e instanceof Error ? e.message : String(e)}`,
+      );
+    }
   }
 }
```
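A balanced-brace scan is one hypothetical alternative to the greedy regex for the multiple-objects edge case. This helper is illustrative, not the tool's actual parser; it tracks quoted strings only naively and stops at the first top-level object instead of spanning first-`{` to last-`}`.

```typescript
// Extract the first balanced top-level JSON object from a noisy LLM response.
function extractFirstJsonObject(text: string): unknown {
  const start = text.indexOf("{");
  if (start === -1) throw new Error("No JSON object found in response");
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      // depth hits zero exactly when the first object closes.
      if (depth === 0) return JSON.parse(text.slice(start, i + 1));
    }
  }
  throw new Error("Unbalanced braces in response");
}
```

Unlike `/\{[\s\S]*\}/`, this never concatenates two separate objects into one invalid candidate, and it raises descriptive errors in both failure modes.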
133-133: Move import to the top of the file. The `config` import is placed after the class definition rather than with other imports at the top of the file. This is unconventional and reduces readability.

🔎 Proposed fix
Move the import to the top of the file with other imports:
```diff
 import OpenAI from "openai";
 import { zodResponseFormat } from "openai/helpers/zod";
 import { z } from "zod";
+
+import { config } from "./config";
```

Then remove line 133.
144-149: Wrapper function signature limits functionality. The `inferTags` wrapper only accepts `client` and `prompt`, but the underlying `client.inferTags()` method supports additional parameters (`model`, `lang`, `customPrompts`). If callers need these options, they must bypass the wrapper and call the class method directly. If this is intentional to keep the API simple for the compare-models tool, this is fine. Otherwise, consider exposing these options.
docs/docs/03-configuration/02-different-ai-providers.md (6)
9-14: Add language identifier to code block. The fenced code block should specify a language for proper syntax highlighting. Use `bash` or `env`.

🔎 Proposed fix: add `bash` after the opening triple-backtick fence:

INFERENCE_PROVIDER=openai     # Use OpenAI
INFERENCE_PROVIDER=anthropic  # Use Anthropic Claude
INFERENCE_PROVIDER=google     # Use Google Gemini
INFERENCE_PROVIDER=ollama     # Use Ollama (local)

Based on static analysis hints from markdownlint.

---

20-27: Add language identifier to code blocks. Multiple code blocks throughout this section lack language specifiers. They should be marked as `bash` or `env` for proper syntax highlighting. Example for this block, with `bash` added after the opening fence:

INFERENCE_PROVIDER=openai  # Optional, auto-detected if OPENAI_API_KEY is set
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

This applies to the code blocks at lines 20, 60, 74, 93, 103, 123 (same issue).
Based on static analysis hints from markdownlint.
135-150: Add language identifiers to code blocks. Code blocks at lines 135 and 176 need language specifiers (`bash` or `env`).
Also applies to: 176-186
192-241: Add language identifiers to code blocks. Code blocks at lines 192, 210, and 225 need language specifiers (`bash` or `env`).
247-257: Add language identifier to code block.

🔎 Proposed fix: add `bash` after the opening fence:

INFERENCE_PROVIDER=ollama  # Optional, auto-detected if OLLAMA_BASE_URL is set
OLLAMA_BASE_URL=http://ollama.mylab.com:11434

Based on static analysis hints from markdownlint.
265-272: Add language identifier to code block.

🔎 Proposed fix: add `bash` after the opening fence:

EMBEDDING_PROVIDER=openai  # Use OpenAI for embeddings
EMBEDDING_PROVIDER=google  # Use Google for embeddings
EMBEDDING_PROVIDER=ollama  # Use Ollama for embeddings

Based on static analysis hints from markdownlint.
packages/shared/inference/anthropic.ts (1)
67-71: Model matching with `startsWith` may be overly permissive. The `startsWith` check allows any model string beginning with a supported model name to pass validation. This enables versioned model names (e.g., `claude-sonnet-4-5-20250929-v2`) but could also match unintended models. This is likely intentional for flexibility with model versioning. Consider documenting this behavior in the function's JSDoc.
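A two-line repro makes the permissiveness concrete. The allowlist below is an example, not the PR's actual `supportsStructuredOutputs` table; the point is only how prefix matching behaves with versioned names.

```typescript
// Example allowlist; real model names would come from the provider's capability table.
const SUPPORTED_MODELS = ["claude-sonnet-4-5", "claude-opus-4"];

// Prefix matching accepts any string that starts with a supported model name,
// including dated/versioned suffixes.
function supportsStructuredOutputs(model: string): boolean {
  return SUPPORTED_MODELS.some((m) => model.startsWith(m));
}
```

So `claude-sonnet-4-5-20250929` passes (desired), but so would any future model whose name merely shares the prefix, which is the trade-off worth documenting in the JSDoc.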
packages/shared/inference/openai.ts (2)
58-61: Verify the necessity of custom headers. The `X-Title` and `HTTP-Referer` headers are non-standard for OpenAI API calls. While they may serve tracking or identification purposes, OpenAI's API doesn't require them. Confirm these headers provide value and don't cause issues with OpenAI's request validation.
321-329: Consider using a more specific type instead of `any`. While the `eslint-disable` comment acknowledges the issue, consider defining a proper interface for the Responses API request object to improve type safety. This would help catch configuration errors at compile time.

💡 Suggested improvement
```diff
+interface ResponsesApiRequest {
+  model: string;
+  input: string | unknown[];
+  text: ReturnType<typeof this.getResponsesTextFormat>;
+  store: boolean;
+  temperature: number;
+  top_p: number;
+  reasoning?: { effort: string };
+  max_output_tokens?: number;
+  previous_response_id?: string;
+}
+
 private buildResponsesRequest(
   model: string,
   input: string | unknown[],
   opts: InferenceOptions,
 ) {
-  // eslint-disable-next-line @typescript-eslint/no-explicit-any
-  const requestObj: any = {
+  const requestObj: ResponsesApiRequest = {
     model,
     input,
```

packages/shared/config.ts (1)
296-347: Document provider auto-detection priority order. The auto-detection logic (lines 308-311) prioritizes providers in order: OpenAI → Anthropic → Google → Ollama. When multiple API keys are configured, the first matched provider will be selected. This could be unexpected if users have multiple credentials configured and don't set `INFERENCE_PROVIDER` explicitly. Consider adding a log message or documentation to clarify this behavior and encourage explicit `INFERENCE_PROVIDER` configuration when multiple credentials exist.
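The first-match order described above can be sketched as a small pure function. This is illustrative, not the config code itself; the env-var names are assumptions drawn from the docs excerpts in this PR (`GEMINI_API_KEY` in particular may differ from the actual variable).

```typescript
// Assumed environment shape; only presence matters for detection.
interface Env {
  OPENAI_API_KEY?: string;
  ANTHROPIC_API_KEY?: string;
  GEMINI_API_KEY?: string;
  OLLAMA_BASE_URL?: string;
}

// First match wins: OpenAI, then Anthropic, then Google, then Ollama.
function detectProvider(env: Env): string | null {
  if (env.OPENAI_API_KEY) return "openai";
  if (env.ANTHROPIC_API_KEY) return "anthropic";
  if (env.GEMINI_API_KEY) return "google";
  if (env.OLLAMA_BASE_URL) return "ollama";
  return null;
}
```

With both an OpenAI and an Anthropic key present, OpenAI silently wins, which is exactly the surprise the comment suggests logging or documenting.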
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (68)
apps/browser-extension/package.json, apps/mcp/package.json, apps/mobile/package.json, apps/web/components/settings/AISettings.tsx, apps/web/lib/clientConfig.tsx, apps/web/lib/i18n/locales/ar/translation.json, apps/web/lib/i18n/locales/cs/translation.json, apps/web/lib/i18n/locales/da/translation.json, apps/web/lib/i18n/locales/de/translation.json, apps/web/lib/i18n/locales/el/translation.json, apps/web/lib/i18n/locales/en/translation.json, apps/web/lib/i18n/locales/en_US/translation.json, apps/web/lib/i18n/locales/es/translation.json, apps/web/lib/i18n/locales/fa/translation.json, apps/web/lib/i18n/locales/fi/translation.json, apps/web/lib/i18n/locales/fr/translation.json, apps/web/lib/i18n/locales/ga/translation.json, apps/web/lib/i18n/locales/gl/translation.json, apps/web/lib/i18n/locales/hr/translation.json, apps/web/lib/i18n/locales/hu/translation.json, apps/web/lib/i18n/locales/it/translation.json, apps/web/lib/i18n/locales/ja/translation.json, apps/web/lib/i18n/locales/ko/translation.json, apps/web/lib/i18n/locales/nb_NO/translation.json, apps/web/lib/i18n/locales/nl/translation.json, apps/web/lib/i18n/locales/pl/translation.json, apps/web/lib/i18n/locales/pt/translation.json, apps/web/lib/i18n/locales/pt_BR/translation.json, apps/web/lib/i18n/locales/ru/translation.json, apps/web/lib/i18n/locales/sk/translation.json, apps/web/lib/i18n/locales/sl/translation.json, apps/web/lib/i18n/locales/sv/translation.json, apps/web/lib/i18n/locales/tr/translation.json, apps/web/lib/i18n/locales/uk/translation.json, apps/web/lib/i18n/locales/vi/translation.json, apps/web/lib/i18n/locales/zh/translation.json, apps/web/lib/i18n/locales/zhtw/translation.json, apps/web/package.json, apps/workers/package.json, docs/docs/03-configuration/01-environment-variables.md, docs/docs/03-configuration/02-different-ai-providers.md, package.json, packages/api/package.json, packages/benchmarks/package.json, packages/e2e_tests/package.json, packages/open-api/package.json, packages/shared/config.ts, packages/shared/index.ts, packages/shared/inference.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/factory.test.ts, packages/shared/inference/factory.ts, packages/shared/inference/google.test.ts, packages/shared/inference/google.ts, packages/shared/inference/index.ts, packages/shared/inference/live-test.ts, packages/shared/inference/ollama.test.ts, packages/shared/inference/ollama.ts, packages/shared/inference/openai.test.ts, packages/shared/inference/openai.ts, packages/shared/inference/types.ts, packages/shared/package.json, packages/trpc/package.json, tools/compare-models/package.json, tools/compare-models/src/bookmarkProcessor.ts, tools/compare-models/src/config.ts, tools/compare-models/src/inferenceClient.ts
💤 Files with no reviewable changes (2)
- tools/compare-models/src/config.ts
- packages/shared/inference.ts
🚧 Files skipped from review as they are similar to previous changes (41)
- packages/shared/index.ts
- apps/web/lib/i18n/locales/sl/translation.json
- apps/mcp/package.json
- apps/web/lib/i18n/locales/sk/translation.json
- apps/web/lib/i18n/locales/nb_NO/translation.json
- apps/web/lib/i18n/locales/ja/translation.json
- packages/trpc/package.json
- tools/compare-models/package.json
- apps/web/lib/i18n/locales/ar/translation.json
- apps/web/lib/i18n/locales/gl/translation.json
- apps/web/lib/i18n/locales/sv/translation.json
- packages/e2e_tests/package.json
- apps/web/lib/i18n/locales/nl/translation.json
- apps/browser-extension/package.json
- packages/shared/inference/factory.test.ts
- apps/web/lib/i18n/locales/it/translation.json
- apps/web/lib/i18n/locales/hr/translation.json
- apps/web/lib/i18n/locales/hu/translation.json
- apps/mobile/package.json
- apps/web/lib/i18n/locales/cs/translation.json
- apps/web/lib/i18n/locales/fi/translation.json
- apps/web/lib/i18n/locales/de/translation.json
- apps/web/package.json
- apps/web/lib/i18n/locales/fr/translation.json
- apps/web/lib/i18n/locales/da/translation.json
- apps/web/lib/i18n/locales/tr/translation.json
- apps/web/lib/i18n/locales/ru/translation.json
- packages/shared/package.json
- apps/web/lib/i18n/locales/pl/translation.json
- packages/shared/inference/factory.ts
- packages/shared/inference/index.ts
- apps/web/lib/clientConfig.tsx
- apps/web/lib/i18n/locales/en/translation.json
- apps/web/lib/i18n/locales/en_US/translation.json
- packages/api/package.json
- tools/compare-models/src/bookmarkProcessor.ts
- package.json
- apps/web/lib/i18n/locales/uk/translation.json
- apps/web/lib/i18n/locales/ga/translation.json
- packages/benchmarks/package.json
- apps/web/lib/i18n/locales/zh/translation.json
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use TypeScript for type safety in all source files
Files:
packages/shared/inference/openai.test.ts, apps/web/components/settings/AISettings.tsx, packages/shared/inference/openai.ts, packages/shared/inference/google.test.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/ollama.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/live-test.ts, packages/shared/inference/ollama.test.ts, packages/shared/inference/google.ts, packages/shared/config.ts, packages/shared/inference/types.ts, tools/compare-models/src/inferenceClient.ts
**/*.{ts,tsx,js,jsx,json,css,md}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier according to project standards
Files:
packages/shared/inference/openai.test.ts, apps/web/components/settings/AISettings.tsx, packages/shared/inference/openai.ts, docs/docs/03-configuration/02-different-ai-providers.md, packages/shared/inference/google.test.ts, packages/shared/inference/anthropic.ts, apps/web/lib/i18n/locales/es/translation.json, packages/shared/inference/ollama.ts, apps/workers/package.json, packages/shared/inference/anthropic.test.ts, packages/shared/inference/live-test.ts, packages/open-api/package.json, docs/docs/03-configuration/01-environment-variables.md, packages/shared/inference/ollama.test.ts, apps/web/lib/i18n/locales/fa/translation.json, packages/shared/inference/google.ts, apps/web/lib/i18n/locales/pt_BR/translation.json, packages/shared/config.ts, packages/shared/inference/types.ts, apps/web/lib/i18n/locales/vi/translation.json, apps/web/lib/i18n/locales/el/translation.json, apps/web/lib/i18n/locales/zhtw/translation.json, tools/compare-models/src/inferenceClient.ts, apps/web/lib/i18n/locales/pt/translation.json, apps/web/lib/i18n/locales/ko/translation.json
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Lint code using oxlint and fix issues with
pnpm lint:fix
Files:
packages/shared/inference/openai.test.ts, apps/web/components/settings/AISettings.tsx, packages/shared/inference/openai.ts, packages/shared/inference/google.test.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/ollama.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/live-test.ts, packages/shared/inference/ollama.test.ts, packages/shared/inference/google.ts, packages/shared/config.ts, packages/shared/inference/types.ts, tools/compare-models/src/inferenceClient.ts
**/*.{test,spec}.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Vitest for writing and running tests
Files:
packages/shared/inference/openai.test.ts, packages/shared/inference/google.test.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/ollama.test.ts
packages/shared/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Organize shared code and types in the
packages/shareddirectory for use across packages
Files:
packages/shared/inference/openai.test.ts, packages/shared/inference/openai.ts, packages/shared/inference/google.test.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/ollama.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/live-test.ts, packages/shared/inference/ollama.test.ts, packages/shared/inference/google.ts, packages/shared/config.ts, packages/shared/inference/types.ts
apps/web/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
apps/web/**/*.{ts,tsx}: Use Tailwind CSS for styling in the web application
Use Next.js for building the main web application
Files:
apps/web/components/settings/AISettings.tsx
🧠 Learnings (4)
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/e2e_tests/**/*.{ts,tsx,test,spec} : Write end-to-end tests in the `packages/e2e_tests` directory
Applied to files:
packages/shared/inference/openai.test.ts, packages/shared/inference/live-test.ts
📚 Learning: 2026-01-03T11:36:34.916Z
Learnt from: RobertRosca
Repo: karakeep-app/karakeep PR: 2339
File: packages/shared/config.ts:62-62
Timestamp: 2026-01-03T11:36:34.916Z
Learning: In packages/shared/config.ts, enforce OpenAI SDK version compatibility: service_tier values are limited to ["auto", "default", "flex"]. The "priority" tier requires OpenAI SDK >= v5.8.0. Add a guard or validation in config to prevent using priority tier unless the SDK is upgraded (v5.8.0+). Consider documenting this constraint and adding a unit test or lint rule to ensure only allowed service_tier values are used based on the installed SDK version.
Applied to files:
packages/shared/inference/openai.test.ts, packages/shared/inference/openai.ts, packages/shared/inference/google.test.ts, packages/shared/inference/anthropic.ts, packages/shared/inference/ollama.ts, packages/shared/inference/anthropic.test.ts, packages/shared/inference/live-test.ts, packages/shared/inference/ollama.test.ts, packages/shared/inference/google.ts, packages/shared/config.ts, packages/shared/inference/types.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to **/*.{test,spec}.{ts,tsx,js,jsx} : Use Vitest for writing and running tests
Applied to files:
packages/shared/inference/google.test.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to **/*.{ts,tsx,js,jsx,json,css,md} : Format code using Prettier according to project standards
Applied to files:
apps/workers/package.json
🧬 Code graph analysis (7)
packages/shared/inference/openai.ts (2)
packages/shared/inference/index.ts (6)
InferenceOptions(23-23)InferenceResponse(21-21)defaultInferenceOptions(26-26)OpenAIEmbeddingClient(32-32)EmbeddingClient(20-20)EmbeddingResponse(22-22)packages/shared/inference/types.ts (5)
InferenceOptions(12-27)InferenceResponse(3-6)defaultInferenceOptions(29-31)EmbeddingClient(46-48)EmbeddingResponse(8-10)
packages/shared/inference/anthropic.ts (2)
packages/shared/inference/index.ts (4)
InferenceOptions(23-23)InferenceResponse(21-21)InferenceClient(19-19)defaultInferenceOptions(26-26)packages/shared/inference/types.ts (4)
InferenceOptions(12-27)InferenceResponse(3-6)InferenceClient(33-44)defaultInferenceOptions(29-31)
packages/shared/inference/ollama.ts (3)
packages/shared/inference/index.ts (8)
OllamaInferenceClient(35-35)InferenceClient(19-19)InferenceOptions(23-23)InferenceResponse(21-21)defaultInferenceOptions(26-26)OllamaEmbeddingClient(35-35)EmbeddingClient(20-20)EmbeddingResponse(22-22)packages/shared/inference/types.ts (6)
InferenceClient(33-44)InferenceOptions(12-27)InferenceResponse(3-6)defaultInferenceOptions(29-31)EmbeddingClient(46-48)EmbeddingResponse(8-10)packages/shared/customFetch.ts (1)
customFetch(24-24)
packages/shared/inference/anthropic.test.ts (1)
packages/shared/inference/anthropic.ts (2)
AnthropicInferenceClient(135-247), supportsStructuredOutputs(67-71)
packages/shared/inference/live-test.ts (2)
packages/shared/inference/anthropic.ts (1)
supportsStructuredOutputs(67-71)
packages/shared/inference/factory.ts (1)
InferenceClientFactory(17-41)
packages/shared/inference/ollama.test.ts (1)
packages/shared/inference/ollama.ts (2)
OllamaInferenceClient(21-133), OllamaEmbeddingClient(140-162)
packages/shared/inference/types.ts (2)
packages/shared/inference/index.ts (5)
InferenceResponse(21-21)EmbeddingResponse(22-22)InferenceOptions(23-23)InferenceClient(19-19)EmbeddingClient(20-20)tools/compare-models/src/inferenceClient.ts (3)
InferenceResponse(9-12), InferenceOptions(5-7), InferenceClient(14-131)
🪛 markdownlint-cli2 (0.18.1)
docs/docs/03-configuration/01-environment-variables.md
96-96: Link text should be descriptive
(MD059, descriptive-link-text)
🔇 Additional comments (31)
packages/open-api/package.json (1)
10-10: LGTM! Zod version bump aligns with multi-provider architecture. The upgrade from `^3.24.2` to `^3.25.0` is consistent with the PR's broader zod dependency update for Anthropic SDK compatibility. The existing `@asteasolutions/zod-to-openapi@^7.2.0` dependency is compatible with this zod version.
apps/workers/package.json (1)
59-59: LGTM! Consistent zod version update. The upgrade to `^3.25.0` maintains consistency with the repository-wide zod dependency update for Anthropic SDK compatibility.
apps/web/lib/i18n/locales/el/translation.json (1)
196-203: LGTM! AI provider translations added correctly. The Greek translations for the new AI provider status and model configuration keys are well-structured and consistent with the multi-locale expansion across the repository. The past duplicate `"summarization"` key issue flagged in previous reviews has been resolved—the current code is valid JSON.
apps/web/lib/i18n/locales/zhtw/translation.json (1)
193-200: LGTM! AI provider translations added correctly. The Traditional Chinese translations for the new AI provider status and model configuration keys are properly formatted and consistent with the multi-locale expansion. The past duplicate `"summarization_prompt"` key issue flagged in previous reviews has been resolved—the current code is valid JSON.
apps/web/lib/i18n/locales/es/translation.json (1)
139-146: LGTM! AI provider translations added correctly. The Spanish translations for the new AI provider status and model configuration keys are well-structured and consistent with the multi-locale expansion across 31 locales mentioned in the PR objectives.
apps/web/lib/i18n/locales/ko/translation.json (1)
395-402: LGTM! Clean i18n expansion for AI provider status. The Korean translations for the new AI provider status keys are well-structured and properly integrated into the existing translation file.
apps/web/lib/i18n/locales/pt/translation.json (1)
217-224: LGTM! AI provider status keys added correctly. The Portuguese translations for the new AI provider configuration keys are properly formatted and integrate well with the existing translation structure.
apps/web/lib/i18n/locales/fa/translation.json (1)
247-254: LGTM! Farsi translations properly added. The AI provider status translations are correctly formatted and consistent with the other locale files in this PR.
packages/shared/inference/types.ts (3)
1-10: LGTM! Clean response type definitions. The `InferenceResponse` and `EmbeddingResponse` interfaces are well-defined and appropriate for the multi-provider architecture.
12-27: Well-structured options interface. The `InferenceOptions` interface provides good flexibility with the schema field for structured outputs and proper support for request cancellation via `abortSignal`. The use of `// eslint-disable-next-line` for the generic schema type is appropriate here. Note: See separate comment regarding `reasoningEffort` values.
29-48: LGTM! Clean and well-designed client interfaces. The `defaultInferenceOptions` object provides a sensible default, and the `InferenceClient` and `EmbeddingClient` interfaces establish a clear contract for provider implementations. The use of `Partial<InferenceOptions>` in the client methods provides good flexibility.
docs/docs/03-configuration/02-different-ai-providers.md (1)
70-88: Documentation is accurate regarding `reasoning_effort` values and model support. The documentation correctly reflects the current OpenAI Responses API specification. The `reasoning_effort` values including `"none"`, `"minimal"`, and `"xhigh"` are indeed valid for certain GPT-5 models, as documented by OpenAI. The documentation appropriately describes model-dependent behavior where o-series models support the standard `"low"`, `"medium"`, `"high"` range, while newer GPT-5 variants expose additional options for reasoning control.
packages/shared/inference/openai.test.ts (2)
210-231: Consistent test isolation pattern. Good use of `beforeEach`/`afterEach` to modify and restore config values, preventing test pollution between suites. This pattern is consistently applied across the Responses API, JSON output, and plain output test blocks.
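The isolation pattern is easy to restate outside of any test framework; this sketch (all names hypothetical, not the repository's actual config) shows the save/override/restore cycle that `beforeEach`/`afterEach` perform around each test:

```typescript
// Hypothetical stand-in for a shared, mutable config object.
type OutputSchema = "structured" | "json" | "plain";
const config: { outputSchema: OutputSchema } = { outputSchema: "structured" };

let saved: OutputSchema = config.outputSchema;

// What beforeEach would do: remember the original value, then override it.
function setup(override: OutputSchema): void {
  saved = config.outputSchema;
  config.outputSchema = override;
}

// What afterEach would do: restore, so later suites see the original value.
function teardown(): void {
  config.outputSchema = saved;
}

setup("plain");
const duringTest = config.outputSchema; // overridden for this suite only
teardown();
const afterTest = config.outputSchema; // back to the shared default
```

Skipping the teardown step is exactly how cross-suite test pollution creeps in.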
1-398: Comprehensive test coverage for OpenAI clients. The test suite thoroughly covers:
- Chat Completions and Responses API paths
- Structured, JSON, and plain output modes
- Image inference with proper message formatting
- Token counting and error handling
- Abort signal propagation
- Embedding generation
The mocking strategy correctly mirrors the OpenAI SDK structure.
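Why mirroring the SDK structure matters can be shown with a self-contained sketch: the client code reaches through `chat.completions.create`, so a mock must expose the same property nesting. Everything below is illustrative, not the repository's actual mock:

```typescript
// Shape of the arguments the mocked SDK method receives.
type CreateArgs = { model: string; messages: { role: string; content: string }[] };

const calls: CreateArgs[] = [];

// The mock mirrors the SDK's nesting: openAI.chat.completions.create(...)
const mockOpenAI = {
  chat: {
    completions: {
      create: async (args: CreateArgs) => {
        calls.push(args); // record the call for later assertions
        return { choices: [{ message: { content: "ok" } }], usage: { total_tokens: 3 } };
      },
    },
  },
};

// Hypothetical client function exercising the mocked path.
async function inferFromText(prompt: string) {
  const resp = await mockOpenAI.chat.completions.create({
    model: "gpt-5-mini",
    messages: [{ role: "user", content: prompt }],
  });
  return { response: resp.choices[0].message.content, totalTokens: resp.usage.total_tokens };
}

const result = await inferFromText("hello");
```

If the mock flattened the nesting (say, a top-level `create`), the client under test would throw before any assertion ran.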
apps/web/lib/i18n/locales/pt_BR/translation.json (1)
187-194: Localization strings for AI provider status added correctly. The new Brazilian Portuguese translations for provider status UI are complete and match the keys used in `AISettings.tsx`. The translations appear grammatically correct.
apps/web/components/settings/AISettings.tsx (1)
117-171: Well-structured ProviderIndicator component. The component:
- Handles the no-provider state with a clear warning message
- Uses consistent styling with the existing SettingsSection pattern
- Properly maps provider identifiers to display names with fallback
- Conditionally renders embeddings only when both provider and model are available
The responsive grid layout (`sm:grid-cols-2 lg:grid-cols-4`) ensures good display across screen sizes.
131-138: Vietnamese translations for AI provider status added correctly.The new keys align with the UI component and other locale files. The translations appear appropriate for Vietnamese.
packages/shared/inference/anthropic.ts (1)
135-247: Clean implementation of the Anthropic inference client.The implementation:
- Properly validates model capabilities before making requests
- Conditionally applies beta headers only when structured outputs are needed
- Correctly handles both text and image inference
- Properly propagates abort signals
- Calculates total tokens from input + output
The separation of concerns (schema conversion, model validation, media type validation, response extraction) makes the code maintainable.
packages/shared/inference/ollama.test.ts (1)
66-207: Thorough test coverage for `OllamaInferenceClient`. The test suite comprehensively covers:
- Streaming response accumulation
- Token counting across multiple chunks
- Model and configuration parameter passing
- JSON schema format when schema is provided
- Image inference with base64 data
- keep_alive setting propagation
The mock streaming generator (`mockChatStream`) is a clean approach for simulating Ollama's streaming API.
packages/shared/inference/anthropic.test.ts (3)
264-307: Good test isolation pattern with `afterEach` cleanup. Unlike the Ollama tests, this test block correctly restores `serverConfig.inference.outputSchema` to `"structured"` in `afterEach`, preventing test pollution. This pattern should be applied consistently across all test files that modify shared config.
309-343: Comprehensive tests for the `supportsStructuredOutputs` utility. The tests cover:
- All documented model variants (Sonnet 4.5, Haiku 4.5, Opus 4.5, Opus 4.1)
- Older unsupported models (Claude 3.x, Claude 2.x)
- Edge cases (empty string, non-Claude models, partial matches)
This provides confidence in the model validation logic.
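As a rough illustration of what such a capability gate can look like (the model list and the prefix-matching rule are assumptions inferred from the test cases above, not the actual implementation):

```typescript
// Hypothetical re-creation of a model-capability check: returns true only
// for Claude model families said to support structured outputs.
function supportsStructuredOutputsSketch(model: string): boolean {
  const supportedFamilies = [
    "claude-sonnet-4-5",
    "claude-haiku-4-5",
    "claude-opus-4-5",
    "claude-opus-4-1",
  ];
  // Prefix match so dated variants (e.g. a "-20250805" suffix) still qualify.
  return supportedFamilies.some((family) => model.startsWith(family));
}
```

Edge cases from the test list fall out naturally: an empty string, a non-Claude model, or an older Claude 3.x name matches no family and returns false.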
1-261: Well-structured test suite for the Anthropic inference client. The tests thoroughly cover:
- API call construction and parameter passing
- Structured outputs beta header inclusion/exclusion
- Text and image inference paths
- Token accounting
- Abort signal propagation
- Different image media types
- Error handling for missing text content
The mock structure correctly mirrors the Anthropic SDK's beta API shape.
packages/shared/inference/google.test.ts (1)
1-316: LGTM! Comprehensive test coverage. The test suite is well-structured and covers all critical paths for both `GoogleGeminiInferenceClient` and `GoogleEmbeddingClient`:
- Text and image inference with various configurations
- JSON schema validation and plain text output modes
- Token counting with and without usage metadata
- Error handling for missing text content
- Abort signal propagation
- Batch embedding generation
The dynamic config modification for plain output tests (lines 236-250) is handled correctly with proper cleanup in `afterEach`.
packages/shared/inference/live-test.ts (1)
1-593: LGTM! Excellent integration test coverage. This live test script provides comprehensive coverage across all providers:
- Direct SDK tests for OpenAI, Anthropic, and Google
- Factory-based InferenceClient tests matching actual app usage
- Realistic prompts from the actual tagging implementation
- Structured output validation with schema parsing
- Image inference and embeddings testing
- Proper conditional skipping when API keys are missing
The use of `void $schema;` (line 271) to avoid unused variable warnings is an acceptable pattern.
packages/shared/inference/google.ts (1)
1-202: LGTM! Clean implementation of Google Gemini integration. The implementation is well-structured and correct:

- Proper API key validation in `createGoogleClient`
- Correct handling of structured vs plain output modes
- Appropriate token counting from `usageMetadata`
- Proper error handling for missing responses
- Abort signal propagation to the underlying API calls

The embedding client correctly handles batch sizes and accumulates results. The sequential batch processing (lines 174-186) is acceptable, though it could be optimized with `Promise.all()` for parallel processing if needed in the future.
packages/shared/inference/openai.ts (3)
333-348: Excellent validation for o-series reasoning constraints. The defensive check preventing o-series models from using `"none"` reasoning effort is well-implemented and prevents a common API error. The error message is clear and actionable.
412-431: LGTM: Clean embedding client implementation. The embedding client correctly implements the interface and properly maps the OpenAI API response to the expected 2D array format.
143-158: No action needed. The `InferenceOptions` interface is complete and properly defines all properties referenced in the code (`abortSignal`, `store`, `reasoningEffort`, `previousResponseId`). The original concern was based on incomplete context provided during review; the actual interface in `packages/shared/inference/types.ts` fully defines all these properties with appropriate types. No type safety issue exists.
Likely an incorrect or invalid review comment.
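For readers following along, here is a sketch of an options shape carrying those fields plus a defaults-merging helper (all types and the defaults object are assumptions for illustration; the real definitions live in `packages/shared/inference/types.ts`):

```typescript
// Illustrative options interface; field names come from the review context.
interface InferenceOptions {
  schema?: unknown; // a zod schema for structured outputs in the real code
  abortSignal?: AbortSignal;
  store?: boolean;
  reasoningEffort?: "minimal" | "low" | "medium" | "high";
  previousResponseId?: string;
}

// A defaultInferenceOptions-style baseline that callers can spread over.
const defaults: InferenceOptions = { store: false };

// Clients accept Partial<InferenceOptions> and merge onto the defaults.
function withDefaults(opts: Partial<InferenceOptions>): InferenceOptions {
  return { ...defaults, ...opts };
}

const merged = withDefaults({ reasoningEffort: "low" });
```

Accepting `Partial<InferenceOptions>` keeps call sites terse while guaranteeing every client sees a fully populated baseline.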
packages/shared/config.ts (3)
349-368: LGTM: Correct handling of embedding provider selection. The embedding provider logic correctly excludes Anthropic (which lacks an embeddings API) and provides sensible fallback behavior when `EMBEDDING_PROVIDER` is not explicitly set. The auto-detection aligns well with the inference provider selection while respecting provider capabilities.
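The fallback described above can be restated as a small pure function (the names and the exact behavior for Anthropic are assumptions for illustration; the real config may resolve the gap differently):

```typescript
type Provider = "openai" | "anthropic" | "google" | "ollama";

// explicit: an EMBEDDING_PROVIDER-style override, if set.
// inferenceProvider: whatever provider was chosen for inference, if any.
function pickEmbeddingProvider(
  explicit: Provider | undefined,
  inferenceProvider: Provider | null,
): Provider | null {
  if (explicit) return explicit; // an explicit setting always wins
  // Anthropic has no embeddings API, so it can never be the implicit choice;
  // returning null here (i.e. embeddings disabled) is an assumption.
  if (inferenceProvider === "anthropic") return null;
  return inferenceProvider; // otherwise follow the inference provider
}
```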
538-549: LGTM: Appropriate client-side config exposure. The expanded client config correctly exposes provider information for read-only UI display (e.g., ProviderIndicator component). The fields align with the multi-provider architecture and support the UI requirements mentioned in the PR objectives.
87-88: Remove the recommendation to revert to `gpt-4o-mini`. GPT-5 models were publicly released on August 7, 2025, and are available via the OpenAI API, making `gpt-5-mini` a valid default choice. The concern about unavailability is no longer applicable.
47530b0 to b1738bd (Compare)
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
packages/shared/inference/ollama.ts (1)
84-95: LGTM! Error handling correctly uses `undefined` for unknown token counts. The error handling properly sets `totalTokens = undefined` when an exception occurs, which aligns with the `InferenceResponse.totalTokens: number | undefined` type. The workaround for the Ollama streaming bug is well-documented with the GitHub issue reference.
docs/docs/03-configuration/01-environment-variables.md (1)
96-96: Use descriptive link text instead of "here". The link text "here" is not descriptive and harms accessibility and SEO. Consider using descriptive text that indicates the destination.
🔎 Proposed fix
-| OPENAI_API_KEY | No | Not set | The OpenAI API key for inference and embeddings. More on that in [here](../integrations/openai). +| OPENAI_API_KEY | No | Not set | The OpenAI API key for inference and embeddings. See [OpenAI integration guide](../integrations/openai) for details.
🧹 Nitpick comments (4)
packages/shared/inference/live-test.ts (1)
186-210: Consider testing behavior instead of source code inspection. This test reads the source file to verify string patterns exist. This is fragile—if the code is refactored (e.g., patterns moved to constants or the logic restructured), the test breaks even if behavior is unchanged. Consider testing the actual `shouldUseResponsesApi` function behavior with various model names instead.
🔎 Suggested alternative approach
// Test the exported function behavior instead of source inspection
await runTest("openai", "responses-model-check", async () => {
  const { shouldUseResponsesApi } = await import("./openai.js");
  // Verify GPT-5 models use Responses API (when enabled)
  // Note: This would need the config to enable openaiUseResponsesApi
  const modelsToCheck = ["gpt-5", "gpt-5.2", "o1-mini", "o3-mini", "o4-mini"];
  for (const model of modelsToCheck) {
    // Test that these models are recognized for Responses API routing
    // (actual behavior depends on config.openaiUseResponsesApi)
  }
  return "Model detection behavior verified";
});
packages/shared/inference/openai.ts (2)
136-141: Consider making the `openAI` field private. The `openAI` field is only used internally. Making it `private` would improve encapsulation and signal that it's not part of the public API.
🔎 Suggested change
export class OpenAIInferenceClient implements InferenceClient {
-  openAI: OpenAI;
+  private openAI: OpenAI;
   constructor() {
     this.openAI = createOpenAIClient();
   }
412-431: Consider making the `openAI` field private. Same suggestion as for `OpenAIInferenceClient` - the field is only used internally.
🔎 Suggested change
export class OpenAIEmbeddingClient implements EmbeddingClient {
-  openAI: OpenAI;
+  private openAI: OpenAI;
   constructor() {
     this.openAI = createOpenAIClient();
   }
tools/compare-models/src/inferenceClient.ts (1)
107-133: Consider preserving error context in fallback parsing logic. The multi-strategy JSON extraction is robust and handles various response formats well. However, the empty `catch` blocks at lines 112, 119, and 128 discard potentially useful error information that could aid debugging when all strategies fail.
🔎 Optional enhancement to preserve error context
private parseJsonFromResponse(response: string): unknown {
  const trimmedResponse = response.trim();
+  const errors: string[] = [];
  try {
    return JSON.parse(trimmedResponse);
-  } catch {
+  } catch (e) {
+    errors.push(`Direct parse: ${e instanceof Error ? e.message : String(e)}`);
    const jsonBlockRegex = /```(?:json)?\s*(\{[\s\S]*?\})\s*```/i;
    const match = trimmedResponse.match(jsonBlockRegex);
    if (match) {
      try {
        return JSON.parse(match[1]);
-      } catch {}
+      } catch (e) {
+        errors.push(`Markdown block: ${e instanceof Error ? e.message : String(e)}`);
+      }
    }
    const jsonObjectRegex = /\{[\s\S]*\}/;
    const objectMatch = trimmedResponse.match(jsonObjectRegex);
    if (objectMatch) {
      try {
        return JSON.parse(objectMatch[0]);
-      } catch {}
+      } catch (e) {
+        errors.push(`Object extraction: ${e instanceof Error ? e.message : String(e)}`);
+      }
    }
-    return JSON.parse(trimmedResponse);
+    try {
+      return JSON.parse(trimmedResponse);
+    } catch (e) {
+      throw new Error(
+        `Failed to parse JSON from response. Attempted strategies: ${errors.join("; ")}. Final error: ${e instanceof Error ? e.message : String(e)}`
+      );
+    }
  }
}
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (68)
- apps/browser-extension/package.json
- apps/mcp/package.json
- apps/mobile/package.json
- apps/web/components/settings/AISettings.tsx
- apps/web/lib/clientConfig.tsx
- apps/web/lib/i18n/locales/ar/translation.json
- apps/web/lib/i18n/locales/cs/translation.json
- apps/web/lib/i18n/locales/da/translation.json
- apps/web/lib/i18n/locales/de/translation.json
- apps/web/lib/i18n/locales/el/translation.json
- apps/web/lib/i18n/locales/en/translation.json
- apps/web/lib/i18n/locales/en_US/translation.json
- apps/web/lib/i18n/locales/es/translation.json
- apps/web/lib/i18n/locales/fa/translation.json
- apps/web/lib/i18n/locales/fi/translation.json
- apps/web/lib/i18n/locales/fr/translation.json
- apps/web/lib/i18n/locales/ga/translation.json
- apps/web/lib/i18n/locales/gl/translation.json
- apps/web/lib/i18n/locales/hr/translation.json
- apps/web/lib/i18n/locales/hu/translation.json
- apps/web/lib/i18n/locales/it/translation.json
- apps/web/lib/i18n/locales/ja/translation.json
- apps/web/lib/i18n/locales/ko/translation.json
- apps/web/lib/i18n/locales/nb_NO/translation.json
- apps/web/lib/i18n/locales/nl/translation.json
- apps/web/lib/i18n/locales/pl/translation.json
- apps/web/lib/i18n/locales/pt/translation.json
- apps/web/lib/i18n/locales/pt_BR/translation.json
- apps/web/lib/i18n/locales/ru/translation.json
- apps/web/lib/i18n/locales/sk/translation.json
- apps/web/lib/i18n/locales/sl/translation.json
- apps/web/lib/i18n/locales/sv/translation.json
- apps/web/lib/i18n/locales/tr/translation.json
- apps/web/lib/i18n/locales/uk/translation.json
- apps/web/lib/i18n/locales/vi/translation.json
- apps/web/lib/i18n/locales/zh/translation.json
- apps/web/lib/i18n/locales/zhtw/translation.json
- apps/web/package.json
- apps/workers/package.json
- docs/docs/03-configuration/01-environment-variables.md
- docs/docs/03-configuration/02-different-ai-providers.md
- package.json
- packages/api/package.json
- packages/benchmarks/package.json
- packages/e2e_tests/package.json
- packages/open-api/package.json
- packages/shared/config.ts
- packages/shared/index.ts
- packages/shared/inference.ts
- packages/shared/inference/anthropic.test.ts
- packages/shared/inference/anthropic.ts
- packages/shared/inference/factory.test.ts
- packages/shared/inference/factory.ts
- packages/shared/inference/google.test.ts
- packages/shared/inference/google.ts
- packages/shared/inference/index.ts
- packages/shared/inference/live-test.ts
- packages/shared/inference/ollama.test.ts
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/shared/inference/openai.ts
- packages/shared/inference/types.ts
- packages/shared/package.json
- packages/trpc/package.json
- tools/compare-models/package.json
- tools/compare-models/src/bookmarkProcessor.ts
- tools/compare-models/src/config.ts
- tools/compare-models/src/inferenceClient.ts
💤 Files with no reviewable changes (2)
- tools/compare-models/src/config.ts
- packages/shared/inference.ts
🚧 Files skipped from review as they are similar to previous changes (36)
- apps/web/lib/i18n/locales/fi/translation.json
- apps/web/lib/i18n/locales/en/translation.json
- packages/open-api/package.json
- apps/web/lib/i18n/locales/uk/translation.json
- apps/web/lib/i18n/locales/zh/translation.json
- apps/web/lib/i18n/locales/sv/translation.json
- packages/shared/inference/factory.test.ts
- apps/web/lib/i18n/locales/ru/translation.json
- apps/mobile/package.json
- apps/web/lib/i18n/locales/ja/translation.json
- packages/shared/inference/anthropic.ts
- packages/shared/package.json
- apps/web/lib/i18n/locales/pt/translation.json
- apps/web/components/settings/AISettings.tsx
- apps/web/lib/i18n/locales/da/translation.json
- apps/web/lib/i18n/locales/nl/translation.json
- apps/web/lib/i18n/locales/ar/translation.json
- apps/web/lib/i18n/locales/tr/translation.json
- packages/shared/inference/types.ts
- packages/shared/inference/ollama.test.ts
- apps/web/lib/i18n/locales/es/translation.json
- packages/api/package.json
- apps/web/lib/i18n/locales/pt_BR/translation.json
- apps/web/lib/i18n/locales/de/translation.json
- apps/web/lib/i18n/locales/vi/translation.json
- apps/workers/package.json
- apps/web/lib/i18n/locales/hr/translation.json
- apps/web/lib/i18n/locales/hu/translation.json
- apps/web/lib/i18n/locales/fa/translation.json
- packages/shared/inference/anthropic.test.ts
- package.json
- apps/web/lib/i18n/locales/zhtw/translation.json
- tools/compare-models/package.json
- apps/browser-extension/package.json
- apps/web/lib/i18n/locales/gl/translation.json
- apps/web/lib/i18n/locales/en_US/translation.json
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx,js,jsx,json,css,md}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier according to project standards
Files:
- apps/mcp/package.json
- apps/web/package.json
- packages/shared/inference/factory.ts
- apps/web/lib/i18n/locales/it/translation.json
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/trpc/package.json
- packages/shared/index.ts
- apps/web/lib/i18n/locales/fr/translation.json
- apps/web/lib/i18n/locales/ko/translation.json
- packages/shared/inference/google.ts
- docs/docs/03-configuration/01-environment-variables.md
- apps/web/lib/i18n/locales/el/translation.json
- packages/shared/inference/openai.ts
- packages/shared/inference/google.test.ts
- docs/docs/03-configuration/02-different-ai-providers.md
- apps/web/lib/i18n/locales/sl/translation.json
- apps/web/lib/clientConfig.tsx
- apps/web/lib/i18n/locales/pl/translation.json
- apps/web/lib/i18n/locales/nb_NO/translation.json
- packages/shared/config.ts
- packages/shared/inference/live-test.ts
- packages/benchmarks/package.json
- packages/e2e_tests/package.json
- apps/web/lib/i18n/locales/ga/translation.json
- tools/compare-models/src/inferenceClient.ts
- apps/web/lib/i18n/locales/cs/translation.json
- packages/shared/inference/index.ts
- apps/web/lib/i18n/locales/sk/translation.json
- tools/compare-models/src/bookmarkProcessor.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use TypeScript for type safety in all source files
Files:
- packages/shared/inference/factory.ts
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/shared/index.ts
- packages/shared/inference/google.ts
- packages/shared/inference/openai.ts
- packages/shared/inference/google.test.ts
- apps/web/lib/clientConfig.tsx
- packages/shared/config.ts
- packages/shared/inference/live-test.ts
- tools/compare-models/src/inferenceClient.ts
- packages/shared/inference/index.ts
- tools/compare-models/src/bookmarkProcessor.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Lint code using oxlint and fix issues with
pnpm lint:fix
Files:
- packages/shared/inference/factory.ts
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/shared/index.ts
- packages/shared/inference/google.ts
- packages/shared/inference/openai.ts
- packages/shared/inference/google.test.ts
- apps/web/lib/clientConfig.tsx
- packages/shared/config.ts
- packages/shared/inference/live-test.ts
- tools/compare-models/src/inferenceClient.ts
- packages/shared/inference/index.ts
- tools/compare-models/src/bookmarkProcessor.ts
packages/shared/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Organize shared code and types in the
packages/shareddirectory for use across packages
Files:
- packages/shared/inference/factory.ts
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/shared/index.ts
- packages/shared/inference/google.ts
- packages/shared/inference/openai.ts
- packages/shared/inference/google.test.ts
- packages/shared/config.ts
- packages/shared/inference/live-test.ts
- packages/shared/inference/index.ts
**/*.{test,spec}.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Vitest for writing and running tests
Files:
- packages/shared/inference/openai.test.ts
- packages/shared/inference/google.test.ts
apps/web/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
apps/web/**/*.{ts,tsx}: Use Tailwind CSS for styling in the web application
Use Next.js for building the main web application
Files:
apps/web/lib/clientConfig.tsx
🧠 Learnings (5)
📚 Learning: 2026-01-03T11:36:34.916Z
Learnt from: RobertRosca
Repo: karakeep-app/karakeep PR: 2339
File: packages/shared/config.ts:62-62
Timestamp: 2026-01-03T11:36:34.916Z
Learning: In packages/shared/config.ts, enforce OpenAI SDK version compatibility: service_tier values are limited to ["auto", "default", "flex"]. The "priority" tier requires OpenAI SDK >= v5.8.0. Add a guard or validation in config to prevent using priority tier unless the SDK is upgraded (v5.8.0+). Consider documenting this constraint and adding a unit test or lint rule to ensure only allowed service_tier values are used based on the installed SDK version.
Applied to files:
- packages/shared/inference/factory.ts
- packages/shared/inference/ollama.ts
- packages/shared/inference/openai.test.ts
- packages/shared/index.ts
- packages/shared/inference/google.ts
- packages/shared/inference/openai.ts
- packages/shared/inference/google.test.ts
- packages/shared/config.ts
- packages/shared/inference/live-test.ts
- packages/shared/inference/index.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/shared/**/*.{ts,tsx} : Organize shared code and types in the `packages/shared` directory for use across packages
Applied to files:
packages/shared/index.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/shared-server/**/*.{ts,tsx} : Place shared server-side logic in `packages/shared-server`
Applied to files:
packages/shared/index.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to **/*.{test,spec}.{ts,tsx,js,jsx} : Use Vitest for writing and running tests
Applied to files:
packages/shared/inference/google.test.ts
📚 Learning: 2025-11-27T16:02:47.440Z
Learnt from: CR
Repo: karakeep-app/karakeep PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T16:02:47.440Z
Learning: Applies to packages/e2e_tests/**/*.{ts,tsx,test,spec} : Write end-to-end tests in the `packages/e2e_tests` directory
Applied to files:
- packages/shared/inference/google.test.ts
- packages/shared/inference/live-test.ts
🧬 Code graph analysis (7)
packages/shared/inference/factory.ts (2)
packages/shared/inference/types.ts (2)
`InferenceClient` (33-44), `EmbeddingClient` (46-48)
packages/shared/inference/openai.ts (1)
OpenAIEmbeddingClient(412-432)
packages/shared/inference/ollama.ts (3)
packages/shared/inference/types.ts (6)
`InferenceClient` (33-44), `InferenceOptions` (12-27), `InferenceResponse` (3-6), `defaultInferenceOptions` (29-31), `EmbeddingClient` (46-48), `EmbeddingResponse` (8-10)
tools/compare-models/src/inferenceClient.ts (3)
`InferenceClient` (14-134), `InferenceOptions` (5-7), `InferenceResponse` (9-12)
packages/shared/customFetch.ts (1)
customFetch(24-24)
packages/shared/inference/openai.test.ts (1)
packages/shared/inference/openai.ts (2)
`OpenAIInferenceClient` (136-405), `OpenAIEmbeddingClient` (412-432)
packages/shared/inference/google.ts (1)
packages/shared/inference/types.ts (5)
`InferenceOptions` (12-27), `InferenceClient` (33-44), `InferenceResponse` (3-6), `EmbeddingClient` (46-48), `EmbeddingResponse` (8-10)
packages/shared/inference/google.test.ts (1)
packages/shared/inference/google.ts (2)
`GoogleGeminiInferenceClient` (73-153), `GoogleEmbeddingClient` (162-199)
packages/shared/inference/live-test.ts (3)
packages/shared/inference/anthropic.ts (1)
`supportsStructuredOutputs` (67-71)
packages/shared/inference/factory.ts (1)
`InferenceClientFactory` (17-41)
packages/shared/inference/index.ts (1)
InferenceClientFactory(29-29)
tools/compare-models/src/inferenceClient.ts (2)
packages/shared/inference/types.ts (3)
`InferenceOptions` (12-27), `InferenceResponse` (3-6), `InferenceClient` (33-44)
tools/compare-models/src/config.ts (1)
config(29-29)
🪛 markdownlint-cli2 (0.18.1)
docs/docs/03-configuration/02-different-ai-providers.md
96-96: Link text should be descriptive
(MD059, descriptive-link-text)
148-148: Bare URL used
(MD034, no-bare-urls)
190-190: Link text should be descriptive
(MD059, descriptive-link-text)
205-205: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (41)
apps/web/lib/i18n/locales/sk/translation.json (1)
287-294: LGTM! Previous duplicate issue resolved, and new translations are appropriate. The duplicate `summarization_prompt` key flagged in the previous review has been successfully removed. The new AI provider-related translation keys are properly formatted, and the Slovak translations are accurate and consistent with the multi-provider AI architecture additions described in the PR objectives.
apps/web/lib/i18n/locales/pl/translation.json (1)
201-208: LGTM! Polish translations for AI provider status are well-structured. The new translation keys for the provider status UI are properly formatted with correct JSON syntax, appropriate Polish translations, and consistent formatting. The addition of the trailing comma on line 201 and the structure of the new entries (lines 202-208) follow proper JSON conventions.
The translations are contextually appropriate for AI settings and maintain consistency with the existing translation patterns in the file.
packages/e2e_tests/package.json (1)
24-24: LGTM: Zod dependency bump aligns with PR objectives. The version bump from ^3.24.2 to ^3.25.0 is consistent with the repository-wide upgrade for Anthropic SDK compatibility, and test results confirm no breaking changes.
apps/web/package.json (1)
104-104: LGTM: Consistent dependency upgrade. The zod version bump is consistent with the monorepo-wide update and supports the new multi-provider inference architecture.
packages/trpc/package.json (1)
29-29: LGTM: Dependency version aligned across monorepo. The zod upgrade is part of the coordinated version bump across all packages for Anthropic SDK compatibility.
apps/mcp/package.json (1)
44-44: LGTM: Version bump supports new AI architecture. The zod upgrade is consistent with the multi-provider inference changes and passed all validation checks.
packages/benchmarks/package.json (1)
22-22: LGTM: Consistent monorepo-wide dependency update. The zod version bump completes the coordinated upgrade across all packages. Note that zod 3.25.0 introduced a significant architectural change (Zod 4 alongside Zod 3 via subpath exports), which has caused TypeScript compatibility issues in some projects. However, since the PR tests passed, this package appears unaffected by that change.
packages/shared/index.ts (1)
1-3: LGTM! Valid module boundary pattern. The empty export with documentation comments correctly establishes this as a module boundary file, directing users to import from specific subpaths (e.g., `@karakeep/shared/config`). This is a standard pattern for packages with subpath exports.
apps/web/lib/i18n/locales/nb_NO/translation.json (1)
315-322: LGTM! Norwegian translation keys added correctly. The new AI provider and model translation keys are properly formatted and integrate cleanly into the existing `ai` settings block. The JSON structure is valid with correct comma placement.
Note: The past review comment about a duplicate `images_prompt` key appears to have been resolved in this version.
apps/web/lib/i18n/locales/el/translation.json (1)
196-203: LGTM! Greek translation keys added correctly. The new AI provider and model translation keys are properly integrated into the `ai` settings block with valid JSON structure. All keys use appropriate Greek translations.
Note: The past review comment about a duplicate `summarization` key appears to have been resolved. The current diff shows `auto_summarization` (line 196) and `summarization` (line 182) as distinct keys.
apps/web/lib/i18n/locales/ga/translation.json (1)
119-126: LGTM! Irish translation keys added correctly. The new AI provider and model translation keys integrate properly into the `ai` settings block with valid JSON formatting. Irish translations are correctly provided for all new keys.
Note: The past review comment about a duplicate `summarization` key appears to have been resolved. The current diff shows `auto_summarization` (line 119) and `summarization` (line 105) as distinct keys.
apps/web/lib/i18n/locales/ko/translation.json (1)
395-402: LGTM! Korean translation keys added correctly. The new AI provider and model translation keys are properly formatted and consistent with the other locale files in this PR. The JSON structure is valid with appropriate Korean translations for all new keys.
tools/compare-models/src/bookmarkProcessor.ts (1)
3-3: LGTM! Clean refactoring for multi-provider architecture. The import change correctly shifts `InferenceClient` to a type-only import from the local module while preserving the `inferTags` runtime function. This aligns with the PR's goal of modularizing provider-specific clients.

apps/web/lib/i18n/locales/fr/translation.json (1)
193-200: LGTM! French translations correctly added. The new provider status and model configuration keys are properly formatted and grammatically correct. These support the new `ProviderIndicator` UI component for displaying active AI providers.
apps/web/lib/i18n/locales/sl/translation.json (1)
120-127: LGTM! Slovenian translations correctly added, and past duplicate key issue resolved. The new provider configuration keys are properly formatted. The previous review concern about a duplicate `tagging_rule_description` key at line 121 appears to have been addressed; line 121 now correctly contains `provider_status`.

apps/web/lib/clientConfig.tsx (1)
19-23: LGTM! Client config correctly extended for multi-provider support. The new inference fields properly separate inference and embedding providers (critical since Anthropic lacks an embeddings API). Default values are appropriate: `null` for optional providers and empty strings for model names.

apps/web/lib/i18n/locales/it/translation.json (1)
193-200: LGTM! Italian translations correctly added. The new provider status and model configuration keys are properly formatted and grammatically correct. These translations complete the i18n support for the multi-provider AI configuration UI.
apps/web/lib/i18n/locales/cs/translation.json (1)
119-127: LGTM! The new Czech translation keys for AI provider status are well-structured and align with the multi-provider UI additions. The translations appear linguistically appropriate.
packages/shared/inference/openai.test.ts (4)
1-47: LGTM! Well-structured test setup with comprehensive mocking of the OpenAI SDK and server config. The mock structure correctly mirrors the SDK's API surface (chat.completions, responses, embeddings).

57-207: Good test coverage for Chat Completions API paths. Tests comprehensively validate:
- Default API routing
- Prompt and model configuration
- Structured output with Zod schema
- Error handling for missing content
- Abort signal propagation
- Image inference with proper content structure
210-261: Good Responses API test coverage with proper config isolation. The beforeEach/afterEach pattern correctly mutates and restores config values to test GPT-5 model routing. The reasoning effort test validates the expected request structure.

365-397: LGTM! Embedding client tests properly validate model configuration usage and response mapping from the SDK format to the expected 2D array structure.
packages/shared/inference/google.test.ts (3)
1-42: LGTM! Well-structured mock setup for the Google Gen AI SDK. The Type enum mock correctly provides the constants needed for schema generation.
215-229: Good defensive test case. Testing graceful handling of missing `usageMetadata` ensures the client won't fail when the API response lacks token count information.
270-315: LGTM! Embedding tests properly validate:
- Batch embedding generation
- Single batch request composition
- Empty embeddings array handling
packages/shared/inference/factory.ts (2)
17-41: LGTM! Clean factory implementation with proper exhaustiveness checking. The `never` type pattern ensures compile-time errors if a new provider is added to the config type but not handled in the switch.
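As a minimal TypeScript sketch of this exhaustiveness pattern (the names here are illustrative, not the PR's actual code):

```typescript
type InferenceProvider = "openai" | "anthropic" | "google" | "ollama";

// Trimmed-down stand-in for a provider client.
interface InferenceClient {
  provider: InferenceProvider;
}

function buildClient(provider: InferenceProvider): InferenceClient {
  switch (provider) {
    case "openai":
    case "anthropic":
    case "google":
    case "ollama":
      return { provider };
    default: {
      // If a new member is added to the union but not handled above,
      // this assignment no longer type-checks, failing the build.
      const exhaustive: never = provider;
      throw new Error(`Unhandled provider: ${exhaustive}`);
    }
  }
}
```

The `never` assignment is what turns a forgotten switch case into a compile-time error rather than a runtime surprise.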
54-76: LGTM! Good documentation noting that Anthropic doesn't provide embeddings. The factory correctly excludes Anthropic from the embedding provider options.
packages/shared/inference/index.ts (1)
1-35: LGTM! Well-organized barrel file with clear documentation. The structure appropriately separates:
- Types (for consumers who need type information)
- Factories (recommended entry point)
- Individual clients (for advanced usage/testing)
packages/shared/inference/live-test.ts (2)
72-115: LGTM! Good test result tracking with status, duration, response preview, and error capture. The `runTest` helper provides consistent timing and logging across all provider tests.
471-553: Good integration test pattern. Testing the actual `InferenceClientFactory.build()` path validates the full code path the app uses. The three test cases (plain text, tagging with schema, image tagging) cover the primary use cases.

packages/shared/inference/openai.ts (4)
17-39: LGTM! Good defensive validation of image types with a clear error message listing supported formats.
65-98: LGTM! Good model detection helpers with clear separation of concerns:
- `requiresMaxCompletionTokens`: GPT-5 and o-series
- `isOSeriesModel`: only reasoning models (o1, o3, o4)
- `getTokenLimitParam`: returns the appropriate parameter name
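The behavior these helpers describe could be sketched like this; the exact matching rules below are assumptions for illustration, not the PR's verbatim implementation:

```typescript
// o1, o3, o4 reasoning models (e.g. "o1-mini", "o3", "o4-mini").
function isOSeriesModel(model: string): boolean {
  return /^o[134](-|$)/.test(model);
}

// GPT-5 family and o-series models use `max_completion_tokens`
// instead of the legacy `max_tokens` parameter.
function requiresMaxCompletionTokens(model: string): boolean {
  return model.startsWith("gpt-5") || isOSeriesModel(model);
}

function getTokenLimitParam(
  model: string,
): "max_tokens" | "max_completion_tokens" {
  return requiresMaxCompletionTokens(model)
    ? "max_completion_tokens"
    : "max_tokens";
}
```

Centralizing the detection this way keeps the request-building code free of model-name string checks.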
340-345: Good validation for o-series model constraints. Clear error message when attempting to use unsupported "none" reasoning effort with o-series models.
389-398: Good defensive JSON parsing. The try-catch fallback ensures the client doesn't fail if the model returns unexpected output format.
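A minimal sketch of the try-catch fallback described here (the function name is illustrative):

```typescript
// Attempt to parse the model's output as JSON; if the model returned
// plain text or malformed JSON, surface the raw string instead of
// failing the whole inference call.
function parseModelOutput(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}
```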
docs/docs/03-configuration/01-environment-variables.md (1)
95-110: LGTM! Multi-provider configuration is well-documented. The documentation clearly explains:
- Provider selection via `INFERENCE_PROVIDER` with auto-detection fallback
- Provider-specific API keys and base URLs for OpenAI, Anthropic, Google Gemini, and Ollama
- OpenAI-specific options for Responses API and reasoning effort
- Separate embedding provider configuration with appropriate defaults

The descriptions align well with the implementation in `packages/shared/config.ts`.

docs/docs/03-configuration/02-different-ai-providers.md (1)
1-292: LGTM! Comprehensive multi-provider documentation. This documentation provides excellent coverage of all four supported AI providers:
- Clear provider selection guidance with auto-detection fallback
- Detailed model families, pricing, and recommendations for each provider
- Important warnings (e.g., Anthropic's lack of embeddings, Azure deployment names)
- Structured output compatibility notes
- Comprehensive embedding configuration section
- Helpful reference tables for quick lookup
The documentation will significantly improve user experience when configuring multi-provider AI inference.
packages/shared/inference/google.ts (1)
1-202: LGTM! Google Gemini integration is well-implemented. The implementation correctly:
- Validates API key presence with clear error messages
- Handles three output schema modes (plain, json, structured) with appropriate MIME types
- Converts Zod schemas to JSON schema using `$refStrategy: "none"` for structured outputs
- Calculates token usage from `usageMetadata` (promptTokenCount + candidatesTokenCount)
- Follows the Google GenAI SDK patterns for both text and multimodal (image + text) inference
The code is clean, well-documented, and consistent with the other provider implementations in this PR.
packages/shared/config.ts (2)
59-83: LGTM! Multi-provider environment variable schema is well-structured.The schema additions are comprehensive:
- Explicit
INFERENCE_PROVIDERselection with four supported providers- Provider-specific configuration blocks (OpenAI, Anthropic, Google Gemini, Ollama)
- Separate
EMBEDDING_PROVIDERconfiguration for flexibilityOPENAI_SERVICE_TIERenum correctly limited to["auto", "default", "flex"]per SDK compatibilityBased on learnings, the service_tier constraint is properly enforced at the schema level (excluding "priority" tier which requires OpenAI SDK >= v5.8.0).
296-370: LGTM! Multi-provider configuration logic is well-designed.The implementation correctly handles:
Inference provider selection:
- Respects explicit
INFERENCE_PROVIDERwhen set- Auto-detects from available credentials in documented order (OpenAI → Anthropic → Google → Ollama)
- Returns
nullwhen no provider is configuredEmbedding provider selection:
- Respects explicit
EMBEDDING_PROVIDERwhen set- Intelligently falls back to inference provider when it supports embeddings (OpenAI, Google, Ollama)
- Handles Anthropic's lack of embeddings API by auto-detecting alternative providers
- Uses appropriate credential-based fallback order
The configuration structure cleanly separates provider-specific settings while maintaining backward compatibility. The
clientConfigexposure provides necessary read-only information for UI components without leaking sensitive credentials.tools/compare-models/src/inferenceClient.ts (2)
29-56: LGTM! Well-structured tagging method.The method properly:
- Defines a clear schema for validation
- Uses
safeParsewith explicit error handling- Delegates to helper methods for separation of concerns
- Provides sensible defaults for language and custom prompts
58-83: LGTM! Bounds check properly implemented.The method correctly:
- Validates the
choicesarray is non-empty (lines 71-72) ✅ — this addresses the previous review comment- Checks for null/empty content with a clear error message
- Uses appropriate
response_formatbased on whether a schema is provided- Returns structured response with token usage
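The bounds and content checks described here might look like this in isolation (the types are trimmed-down stand-ins for the OpenAI SDK shapes, and the function name is hypothetical):

```typescript
interface ChatCompletionLike {
  choices: { message: { content: string | null } }[];
}

// Guard against an empty `choices` array and null/empty content before
// trusting the completion's text.
function extractContent(completion: ChatCompletionLike): string {
  if (completion.choices.length === 0) {
    throw new Error("Inference provider returned no choices");
  }
  const content = completion.choices[0].message.content;
  if (!content) {
    throw new Error("Inference provider returned empty content");
  }
  return content;
}
```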
Replace monolithic inference module with provider-specific clients supporting OpenAI, Anthropic Claude, Google Gemini, and Ollama. Each provider now has native SDK integration with structured output support via JSON schema.

Architecture:
- Add packages/shared/inference/ with modular provider clients
- Add InferenceClientFactory and EmbeddingClientFactory for provider selection
- Separate inference and embedding providers (Anthropic lacks embeddings API)

New providers:
- OpenAI: Chat Completions + Responses API for GPT-5/o-series reasoning
- Google: Gemini 2.5/3.x with JSON schema, batch embeddings
- Anthropic: Claude 4.5 with structured outputs (beta), vision support

Configuration:
- Add INFERENCE_PROVIDER for explicit provider selection (auto-detects if unset)
- Add ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL
- Add GEMINI_API_KEY, GEMINI_BASE_URL
- Add EMBEDDING_PROVIDER for separate embedding configuration
- Add OPENAI_USE_RESPONSES_API, OPENAI_REASONING_EFFORT for GPT-5
- Add reasoning effort support for o-series models (o1, o3, o4)
- Auto-detect max_completion_tokens for GPT-5/o-series models
- Change default models to gpt-5-mini

Breaking changes:
- Remove INFERENCE_SUPPORTS_STRUCTURED_OUTPUT (use INFERENCE_OUTPUT_SCHEMA)
- Remove INFERENCE_USE_MAX_COMPLETION_TOKENS (now auto-detected)

Other:
- Add ProviderIndicator component showing active provider/models in settings
- Add @anthropic-ai/sdk and @google/genai dependencies
- Upgrade zod 3.24.2 -> 3.25.0 for Anthropic SDK compatibility
- Add 106 unit tests and 19 live API integration tests
- Rewrite AI provider documentation with model tables and examples
- Add translations for AI provider settings UI to all 31 locales
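For reference, the configuration variables listed above could be combined like this; the values are placeholders, and the provider split (Anthropic for inference, Google for embeddings) is just one illustrative setup:

```shell
# Explicit provider selection (otherwise auto-detected from available keys)
export INFERENCE_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-your-key-here

# Anthropic has no embeddings API, so point embeddings at another provider
export EMBEDDING_PROVIDER=google
export GEMINI_API_KEY=your-gemini-key-here
```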
Force-pushed from b1738bd to 15fd85b
@MohamedBassem could this be reviewed?
Hey, thanks for taking the time to send a PR. However, I'd recommend reading our contribution guidelines here to avoid any wasted effort. It's important for us to align on what to work on, otherwise it might not be aligned with the direction of the product.

Now for this PR, it's honestly not clear to me the value it adds compared to the complexity it introduces. All major LLM providers already support the OpenAI API. So if anything, I'm trying to converge more on that rather than split into separate providers. If I'm to go with separate providers, it'll also probably be through Vercel's AI SDK.

So unfortunately, I don't think I'm currently willing to accept the extra complexity introduced by this PR. I hate closing PRs that people have put effort into, that's why it's important for us to align first. Thanks again for taking the time.
Summary
Replace monolithic inference module with provider-specific clients supporting OpenAI, Anthropic Claude, Google Gemini, and Ollama. Each provider now has native SDK integration with structured output support via JSON schema.
New Providers
- OpenAI: Chat Completions + Responses API for GPT-5/o-series reasoning
- Google: Gemini 2.5/3.x with JSON schema, batch embeddings
- Anthropic: Claude 4.5 with structured outputs (beta), vision support
Architecture Changes
- Add `packages/shared/inference/` with modular provider clients
- Add `InferenceClientFactory` and `EmbeddingClientFactory` for provider selection

New Configuration
- `INFERENCE_PROVIDER` - explicit provider selection (auto-detects if unset)
- `ANTHROPIC_API_KEY`, `ANTHROPIC_BASE_URL`
- `GEMINI_API_KEY`, `GEMINI_BASE_URL`
- `EMBEDDING_PROVIDER` - separate embedding configuration
- `OPENAI_USE_RESPONSES_API`, `OPENAI_REASONING_EFFORT` for GPT-5

Breaking Changes
- `INFERENCE_USE_MAX_COMPLETION_TOKENS` (now auto-detected)
- `INFERENCE_SUPPORTS_STRUCTURED_OUTPUT` (use `INFERENCE_OUTPUT_SCHEMA`)

Other
- `ProviderIndicator` component showing active provider/models in settings
- `pnpm inference:live-test` script for manual API testing

Test plan