fix: Add JSON instruction to default text tagging prompt before content insertion #2356
ElectricTea wants to merge 1 commit into karakeep-app:main
Conversation
Added the prompt instruction

```
- You must respond in valid JSON with the key "tags" and the value is list of tags. Don't wrap the response in a markdown code.
```

to the default text tagging instructions before the content insertion. This change significantly improves the success rate of receiving a structured JSON response when the prompt exceeds the model's maximum token limit and is truncated. I kept the original JSON instruction at the end of the prompt because it "reminds" the LLM to use the JSON structure after the content insertion, and it causes no issues.
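The intended placement can be sketched as follows. This is an illustrative sketch only: the function name, constant name, and surrounding prompt text are assumptions, not the actual contents of `packages/shared/prompts.ts`.

```typescript
// Illustrative sketch: names and prompt wording are placeholders,
// not the real karakeep prompt-builder code.
const JSON_INSTRUCTION =
  '- You must respond in valid JSON with the key "tags" and the value is list of tags. ' +
  "Don't wrap the response in a markdown code.";

function buildTextTaggingPrompt(content: string): string {
  return [
    "You are a bot helping the user tag their bookmarks.",
    "- Aim for 10-15 tags.",
    JSON_INSTRUCTION, // early copy: placed before the (potentially huge) content
    `<TEXT_CONTENT>${content}</TEXT_CONTENT>`,
    JSON_INSTRUCTION, // original trailing copy, kept as a reminder
  ].join("\n");
}
```

With both copies present, the early one still reaches the model even if everything after the inserted content is cut off.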
Walkthrough
A duplicate instruction line was added to the prompt in `buildImagePrompt` (`packages/shared/prompts.ts`).

Changes
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes

Pre-merge checks
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Greptile Summary
Added JSON format instruction before content insertion in `buildImagePrompt` (`packages/shared/prompts.ts`).

Confidence Score: 5/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Worker as Tagging Worker
    participant Prompt as buildImagePrompt()
    participant LLM as LLM Service
    participant Parser as parseJsonFromLLMResponse()
    Worker->>Prompt: Request image tagging prompt
    Note over Prompt: Constructs prompt with rules<br/>JSON instruction at line 61<br/>tagStyleInstruction<br/>customPrompts<br/>JSON instruction reminder at line 64
    Prompt-->>Worker: Return complete prompt
    Worker->>LLM: Send prompt + image
    alt Token limit exceeded
        Note over LLM: Truncates prompt from end<br/>Early JSON instruction survives
        LLM-->>Worker: Valid JSON response
    else Normal processing
        Note over LLM: Processes full prompt<br/>Both JSON instructions present
        LLM-->>Worker: Valid JSON response
    end
    Worker->>Parser: Parse LLM response
    alt Valid JSON
        Parser-->>Worker: Parsed tags object
    else Invalid format
        Note over Parser: Attempts extraction from<br/>markdown or text
        Parser-->>Worker: Parsed tags or error
    end
    Worker->>Worker: Connect tags to bookmark
```
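The diagram's final parsing step can be sketched as a lenient extractor. This is an assumed behavior for `parseJsonFromLLMResponse` (strip a stray markdown fence, then find the first JSON object), not its actual implementation:

```typescript
// Hypothetical sketch of lenient tag extraction from an LLM response;
// the real parseJsonFromLLMResponse may behave differently.
function parseTagsFromResponse(raw: string): string[] {
  // Strip a markdown code fence if the model wrapped its output anyway
  const fenced = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : raw;
  // Fall back to the first {...} block in free text
  const braced = candidate.match(/\{[\s\S]*\}/);
  if (!braced) throw new Error("no JSON object found in response");
  const parsed = JSON.parse(braced[0]) as { tags?: unknown };
  if (!Array.isArray(parsed.tags)) throw new Error('missing "tags" array');
  return parsed.tags.map(String);
}
```

A fallback like this is why the diagram shows "Parsed tags or error" even for responses that are not clean JSON.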
Actionable comments posted: 1
🤖 Fix all issues with AI Agents
In @packages/shared/prompts.ts:
- Line 61: The duplicate JSON instruction was added to buildImagePrompt but
truncation happens in buildTextPrompt/constructTextTaggingPrompt; move or add
the duplicated instruction into constructTextTaggingPrompt immediately before
the <TEXT_CONTENT> placeholder so the JSON instruction is present prior to
content truncation performed by buildTextPrompt; keep buildImagePrompt unchanged
and ensure only one JSON instruction remains at the end of
constructTextTaggingPrompt (and/or a duplicate immediately before
<TEXT_CONTENT>) so it survives token truncation.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/shared/prompts.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use TypeScript for type safety in all source files
Files:
packages/shared/prompts.ts
**/*.{ts,tsx,js,jsx,json,css,md}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier according to project standards
Files:
packages/shared/prompts.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Lint code using oxlint and fix issues with `pnpm lint:fix`
Files:
packages/shared/prompts.ts
packages/shared/**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Organize shared code and types in the `packages/shared` directory for use across packages
Files:
packages/shared/prompts.ts
🧠 Learnings (1)
📚 Learning: 2026-01-03T11:36:34.916Z
Learnt from: RobertRosca
Repo: karakeep-app/karakeep PR: 2339
File: packages/shared/config.ts:62-62
Timestamp: 2026-01-03T11:36:34.916Z
Learning: In packages/shared/config.ts, enforce OpenAI SDK version compatibility: service_tier values are limited to ["auto", "default", "flex"]. The "priority" tier requires OpenAI SDK >= v5.8.0. Add a guard or validation in config to prevent using priority tier unless the SDK is upgraded (v5.8.0+). Consider documenting this constraint and adding a unit test or lint rule to ensure only allowed service_tier values are used based on the installed SDK version.
Applied to files:
packages/shared/prompts.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
```
- If the tag is not generic enough, don't include it.
- Aim for 10-15 tags.
- If there are no good tags, don't emit any.
- You must respond in valid JSON with the key "tags" and the value is list of tags. Don't wrap the response in a markdown code.
```
🧩 Analysis chain
🏁 Script executed:

```shell
wc -l packages/shared/prompts.ts
```

Repository: karakeep-app/karakeep
Length of output: 95

🏁 Script executed:

```shell
cat -n packages/shared/prompts.ts
```

Length of output: 6847
Change appears to be in the wrong function given the stated objectives.

The duplicate JSON instruction is correctly identified at lines 61 and 64 in `buildImagePrompt`. However, this change does not align with the stated PR objective of protecting against truncation:

- `buildImagePrompt` (lines 46-65) performs no truncation; it returns a static template string.
- The actual truncation happens in `buildTextPrompt` (lines 131-156), which calculates token budgets and truncates content before passing it to `constructTextTaggingPrompt`.
- `constructTextTaggingPrompt` (lines 70-96), where text content is actually processed, has only a single JSON instruction at the end (line 95), not a duplicate.

If the goal is to protect the JSON instruction from being lost during truncation, the duplicate instruction should be added to `constructTextTaggingPrompt` before the `<TEXT_CONTENT>` block to ensure it survives token truncation in `buildTextPrompt`.
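Why the early copy matters when the prompt is cut from the end can be sketched with a toy character-based truncation (real limits are counted in tokens, and the names below are illustrative, not karakeep's code):

```typescript
// Toy model of tail truncation: everything past the budget is dropped.
function truncateTail(prompt: string, maxChars: number): string {
  return prompt.slice(0, maxChars);
}

const INSTRUCTION = 'Respond in valid JSON with the key "tags".';
const content = "x".repeat(500); // stands in for a long article

// Instruction duplicated before and after the inserted content.
const prompt = [
  INSTRUCTION,
  `<TEXT_CONTENT>${content}</TEXT_CONTENT>`,
  INSTRUCTION,
].join("\n");

// With a 200-character budget the trailing copy is cut off,
// but the leading copy still reaches the model.
const truncated = truncateTail(prompt, 200);
```

The same reasoning applies whether the cut happens in `buildTextPrompt`'s token budgeting or on the model side: only an instruction placed before the content survives.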
Woops, I added it to the incorrect function 🤦 Sorry about that. Will close this PR and open another to keep the commit history clean. |