Skip to content

Feat/add integration deplication fixes: #1664#1881

Open
aaryan359 wants to merge 2 commits intomaximhq:mainfrom
aaryan359:feat/add_integration_deplication
Open

Feat/add integration deplication fixes: #1664#1881
aaryan359 wants to merge 2 commits intomaximhq:mainfrom
aaryan359:feat/add_integration_deplication

Conversation

@aaryan359
Copy link
Contributor

@aaryan359 aaryan359 commented Mar 3, 2026

Summary

This PR extends Bifrost's CLI User-Agent detection and tool deduplication mechanism to support several modern AI integrations. This prevents tools from being attached twice (duplicated) when Bifrost is used as an MCP gateway with these specific CLI tools.

Changes

  • Enhanced User Agent Detection: Updated DetectCLIUserAgent to recognize Gemini CLI, Qwen CLI, Cursor, Codex, and n8n.
  • False Positive Prevention: Implemented isCLIToken with strict word-boundary matching to ensure strings like "precursor" or "vs-codex-plugin" do not trigger incorrect deduplication.
  • Improved Tool Mapping: Extended buildIntegrationDuplicateCheckMap to handle the mcp__{server}__{tool} naming pattern used by these CLIs, mapping them to Bifrost's internal format.
  • Logging Refinement: Added diagnostic logs (now cleaned up for production) to verify detection and mapping during development.

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

1. Run Unit Tests

Verify detection logic and mapping patterns:

go test ./core/mcp/... ./transports/bifrost-http/integrations/... -v

2. Screenrecording

https://drive.google.com/file/d/1A78CCCGy1yS9szQ0ehHsT3KViblZyW90/view?usp=sharing

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 3, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes
    • Improved tool deduplication for CLI-based integrations by properly detecting and normalizing tool names to prevent duplicates in tool listings.
    • Enhanced detection of CLI user agents to ensure correct duplicate avoidance across different client types.

Walkthrough

This PR adds CLI user-agent detection capabilities across the bifrost-http transport layer and enhances MCP tool deduplication to handle integration-specific tool name prefixes. It introduces shared detection utilities, integrates detection checkpoints in HTTP handlers, and updates tool embedding logic to normalize prefixed tool names for specific integrations (claude-cli, gemini-cli, qwen-cli, cursor, codex, n8n).

Changes

Cohort / File(s) Summary
MCP Tool Deduplication
core/mcp/toolmanager.go
Enhances buildIntegrationDuplicateCheckMap to handle CLI-prefixed tool names (mcp__server__*) for specific integrations, reconstructing clean names by removing prefix and replacing double-underscores with hyphens. Adds switch default branch for future cases.
CLI Detection Infrastructure
transports/bifrost-http/integrations/utils.go
Adds DetectCLIUserAgent exported function to identify CLI/integration clients from User-Agent headers and store detection result in BifrostContext. Includes private isCLIToken helper for word-boundary matching. Note: Summary indicates potential duplicate symbol declarations in file.
HTTP Handler Integration Points
transports/bifrost-http/handlers/inference.go
Adds DetectCLIUserAgent calls in three handlers (textCompletion, chatCompletion, responses) after context conversion and before streaming/response handling.
Integration Routing & Passthrough
transports/bifrost-http/integrations/router.go, transports/bifrost-http/integrations/anthropic.go
Router adds DetectCLIUserAgent invocation in GenericRouter.createHandler after integration type assignment. Anthropic integration refactors inline claude-cli detection to use centralized DetectCLIUserAgent call.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Handler as HTTP Handler
    participant Router as Integration Router
    participant Detector as CLI Detector
    participant Context as Bifrost Context
    participant ToolMgr as MCP Tool Manager

    Client->>Handler: Request (with User-Agent)
    Handler->>Detector: DetectCLIUserAgent(ctx, bifrostCtx)
    Detector->>Context: Store detected CLI type in BifrostContext
    Handler->>Router: Route request with integration type set
    Router->>Detector: DetectCLIUserAgent(ctx, bifrostCtx)
    Detector->>Context: Mark CLI user agent if detected
    Router->>ToolMgr: Embed tools with CLI context available
    ToolMgr->>Context: Query CLI type for deduplication rules
    ToolMgr->>ToolMgr: Normalize prefixed tool names (mcp__server__*) for matching CLI integrations
    ToolMgr->>Handler: Return deduplicated tools
    Handler->>Client: Response with tools
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related Issues

Poem

🐰 CLI agents hop through the gate,
Prefixed tools meet their fate,
Double-underscores transform with care,
Duplicates vanish into air!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Feat/add integration deplication fixes' is partially related but contains a typo ('deplication' instead of 'deduplication') and is somewhat vague about the specific improvements. Correct the typo and make the title more specific, such as 'Add CLI tool deduplication support for Gemini, Qwen, Cursor, Codex, and n8n integrations'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The PR description covers Summary, Changes, Type, and Affected areas well, but is incomplete regarding required template sections like How to test (missing go test command output expectations), Breaking changes checkbox, Related issues, and Checklist.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
transports/bifrost-http/integrations/utils.go (1)

342-346: Tighten boundary comments to match the actual predicate.

The description says "non-alphanumeric" boundary behavior, but implementation only accepts start/space/slash on the left side. Narrowing the comment avoids future misreads.

Also applies to: 388-393

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/integrations/utils.go` around lines 342 - 346, The
comment describing the left boundary for matching tokens ("non-alphanumeric") is
inaccurate; update the comment around the "cursor", "codex" and "n8n"
explanation (and the similar block at the later comment) to state the actual
predicate used: tokens are matched only at the start of the header value or when
immediately preceded by a space or '/' (not any non-alphanumeric character).
Edit the comment text near that predicate in
transports/bifrost-http/integrations/utils.go so it precisely documents the
implemented left-boundary rules for the token-matching logic.
core/mcp/toolmanager.go (1)

245-247: Align the examples with the implemented prefix normalization.

Line [245] mentions examples like bifrost__read_file, but Line [265] only normalizes names starting with mcp__. Either normalize {server}__{tool} too, or update the comments to avoid implying unsupported coverage.

Also applies to: 265-276

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/mcp/toolmanager.go` around lines 245 - 247, The comment examples mention
server-prefixed names like "bifrost__read_file" but the normalization code in
core/mcp/toolmanager.go (the block around where tool names are
normalized/deduplicated — the code that currently only strips "mcp__" prefixes)
only handles "mcp__" prefixes; either broaden the normalization to strip any
"{server}__" prefix or change the comment examples to only show the supported
"mcp__" case. To fix: locate the normalization/deduplication logic (the
function/block that processes tool names around lines 265-276), and replace the
hard-coded "mcp__" check with a generic prefix strip (e.g., detect and remove a
leading "<server>__" using a regex like "^[^_]+__" or strings.Index for "__"),
or alternatively update the surrounding comment text to remove or change
examples to "mcp__read_file" so they match the implemented behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@transports/bifrost-http/integrations/utils.go`:
- Around line 383-404: isCLIToken currently uses strings.Index once and returns
false on the first boundary-fail, which misses later valid occurrences; change
isCLIToken to loop over every occurrence of token in ua (repeatedly call
strings.Index starting from the byte after the previous match) and for each
found idx perform the same left-boundary (check ua[idx-1]) and right-boundary
(compute end := idx+len(token) and check ua[end]) checks; return true on the
first occurrence that passes the boundary checks and return false only after no
more occurrences remain. Ensure you keep the existing boundary rules and use the
same variables (ua, token, idx, end) so behavior stays identical for the first
valid match.

---

Nitpick comments:
In `@core/mcp/toolmanager.go`:
- Around line 245-247: The comment examples mention server-prefixed names like
"bifrost__read_file" but the normalization code in core/mcp/toolmanager.go (the
block around where tool names are normalized/deduplicated — the code that
currently only strips "mcp__" prefixes) only handles "mcp__" prefixes; either
broaden the normalization to strip any "{server}__" prefix or change the comment
examples to only show the supported "mcp__" case. To fix: locate the
normalization/deduplication logic (the function/block that processes tool names
around lines 265-276), and replace the hard-coded "mcp__" check with a generic
prefix strip (e.g., detect and remove a leading "<server>__" using a regex like
"^[^_]+__" or strings.Index for "__"), or alternatively update the surrounding
comment text to remove or change examples to "mcp__read_file" so they match the
implemented behavior.

In `@transports/bifrost-http/integrations/utils.go`:
- Around line 342-346: The comment describing the left boundary for matching
tokens ("non-alphanumeric") is inaccurate; update the comment around the
"cursor", "codex" and "n8n" explanation (and the similar block at the later
comment) to state the actual predicate used: tokens are matched only at the
start of the header value or when immediately preceded by a space or '/' (not
any non-alphanumeric character). Edit the comment text near that predicate in
transports/bifrost-http/integrations/utils.go so it precisely documents the
implemented left-boundary rules for the token-matching logic.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b05b453 and 11e16fb.

📒 Files selected for processing (5)
  • core/mcp/toolmanager.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/router.go
  • transports/bifrost-http/integrations/utils.go

Comment on lines +383 to +404
func isCLIToken(ua, token string) bool {
idx := strings.Index(ua, token)
if idx < 0 {
return false
}
// Check left boundary: must be start or preceded by space or '/'
if idx > 0 {
prev := ua[idx-1]
if prev != ' ' && prev != '/' {
return false
}
}
// Check right boundary: must be end, or followed by '/', ' ', or '@'
end := idx + len(token)
if end < len(ua) {
next := ua[end]
if next != '/' && next != ' ' && next != '@' {
return false
}
}
return true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

fd -t f "utils.go" transports/bifrost-http/integrations/

Repository: maximhq/bifrost

Length of output: 104


🏁 Script executed:

if [ -f "transports/bifrost-http/integrations/utils.go" ]; then
  sed -n '383,404p' transports/bifrost-http/integrations/utils.go
fi

Repository: maximhq/bifrost

Length of output: 543


🏁 Script executed:

python3 - <<'PY'
def is_cli_token_current(ua, token):
    idx = ua.find(token)
    if idx < 0:
        return False
    if idx > 0:
        prev = ua[idx - 1]
        if prev not in (' ', '/'):
            return False
    end = idx + len(token)
    if end < len(ua):
        nxt = ua[end]
        if nxt not in ('/', ' ', '@'):
            return False
    return True

tests = [
    ("precursor/2.0 cursor/1.0", "cursor", True),
    ("vs-codex-plugin/1.0 codex/0.1", "codex", True),
    ("plugin-n8n/1.0 n8n/2.0", "n8n", True),
    ("cursor/1.0", "cursor", True),
    ("curl/7.0", "curl", False),
]

print("Current implementation results:")
for ua, token, expected in tests:
    actual = is_cli_token_current(ua, token)
    status = "✓" if actual == expected else "✗ MISMATCH"
    print(f"{status} ua={ua!r}, token={token!r}, actual={actual}, expected={expected}")
PY

Repository: maximhq/bifrost

Length of output: 477


🏁 Script executed:

# Check the DetectCLIUserAgent function that uses isCLIToken
sed -n '330,382p' transports/bifrost-http/integrations/utils.go

Repository: maximhq/bifrost

Length of output: 2561


🏁 Script executed:

# Verify the suggested fix logic works
python3 - <<'PY'
def is_cli_token_fixed(ua, token):
    """Fixed version from suggested diff"""
    start = 0
    while start < len(ua):
        idx = ua.find(token, start)
        if idx < 0:
            return False
        
        left_ok = (idx == 0 or ua[idx - 1] in (' ', '/'))
        end = idx + len(token)
        right_ok = (end == len(ua) or ua[end] in ('/', ' ', '@'))
        
        if left_ok and right_ok:
            return True
        
        start = idx + 1
    return False

tests = [
    ("precursor/2.0 cursor/1.0", "cursor", True),
    ("vs-codex-plugin/1.0 codex/0.1", "codex", True),
    ("plugin-n8n/1.0 n8n/2.0", "n8n", True),
    ("cursor/1.0", "cursor", True),
    ("curl/7.0", "curl", True),  # Corrected expectation
    ("precursor/2.0", "cursor", False),
    ("curl/7.0", "curl", True),
]

print("Fixed implementation results:")
for ua, token, expected in tests:
    actual = is_cli_token_fixed(ua, token)
    status = "✓" if actual == expected else "✗"
    print(f"{status} ua={ua!r}, token={token!r}, actual={actual}, expected={expected}")
PY

Repository: maximhq/bifrost

Length of output: 560


🏁 Script executed:

# Check the usage context - are there real UA patterns that would trigger this?
grep -A 5 -B 5 "isCLIToken" transports/bifrost-http/integrations/utils.go | head -30

Repository: maximhq/bifrost

Length of output: 1237


isCLIToken fails to find valid tokens when they appear after an invalid substring match.

The function uses strings.Index() which finds only the first occurrence. When a token appears as part of a longer product name before appearing standalone (e.g., precursor/2.0 cursor/1.0), the function matches the invalid cursor substring inside precursor, fails the boundary check, and returns false without checking the valid cursor/1.0 token that follows.

This contradicts the documented behavior, which explicitly lists cases like precursor/2.0 and vs-codex-plugin/1.0 as non-matches, implying the function should correctly identify valid tokens elsewhere in the string. The fix loops through all occurrences until a match with valid boundaries is found:

Suggested fix
 func isCLIToken(ua, token string) bool {
-	idx := strings.Index(ua, token)
-	if idx < 0 {
-		return false
-	}
-	// Check left boundary: must be start or preceded by space or '/'
-	if idx > 0 {
-		prev := ua[idx-1]
-		if prev != ' ' && prev != '/' {
-			return false
-		}
-	}
-	// Check right boundary: must be end, or followed by '/', ' ', or '@'
-	end := idx + len(token)
-	if end < len(ua) {
-		next := ua[end]
-		if next != '/' && next != ' ' && next != '@' {
-			return false
-		}
-	}
-	return true
+	for start := 0; start < len(ua); {
+		idx := strings.Index(ua[start:], token)
+		if idx < 0 {
+			return false
+		}
+		idx += start
+
+		leftOK := idx == 0 || ua[idx-1] == ' ' || ua[idx-1] == '/'
+		end := idx + len(token)
+		rightOK := end == len(ua) || ua[end] == '/' || ua[end] == ' ' || ua[end] == '@'
+
+		if leftOK && rightOK {
+			return true
+		}
+
+		start = idx + 1
+	}
+	return false
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func isCLIToken(ua, token string) bool {
idx := strings.Index(ua, token)
if idx < 0 {
return false
}
// Check left boundary: must be start or preceded by space or '/'
if idx > 0 {
prev := ua[idx-1]
if prev != ' ' && prev != '/' {
return false
}
}
// Check right boundary: must be end, or followed by '/', ' ', or '@'
end := idx + len(token)
if end < len(ua) {
next := ua[end]
if next != '/' && next != ' ' && next != '@' {
return false
}
}
return true
}
func isCLIToken(ua, token string) bool {
for start := 0; start < len(ua); {
idx := strings.Index(ua[start:], token)
if idx < 0 {
return false
}
idx += start
leftOK := idx == 0 || ua[idx-1] == ' ' || ua[idx-1] == '/'
end := idx + len(token)
rightOK := end == len(ua) || ua[end] == '/' || ua[end] == ' ' || ua[end] == '@'
if leftOK && rightOK {
return true
}
start = idx + 1
}
return false
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/bifrost-http/integrations/utils.go` around lines 383 - 404,
isCLIToken currently uses strings.Index once and returns false on the first
boundary-fail, which misses later valid occurrences; change isCLIToken to loop
over every occurrence of token in ua (repeatedly call strings.Index starting
from the byte after the previous match) and for each found idx perform the same
left-boundary (check ua[idx-1]) and right-boundary (compute end :=
idx+len(token) and check ua[end]) checks; return true on the first occurrence
that passes the boundary checks and return false only after no more occurrences
remain. Ensure you keep the existing boundary rules and use the same variables
(ua, token, idx, end) so behavior stays identical for the first valid match.

@aaryan359
Copy link
Contributor Author

@akshaydeo sir,
This is a fresh PR for the feature. I decided to start a new one to ensure a clean implementation after running into issues with merge conflicts on the previous branch.

Please let me know if there are any further updates or changes needed from my side. I have included a screen recording demonstrating the feature in action; looking forward to your feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant