Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 01a1cc8f60
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```ts
upstreamResponse = await fetch(getCustomContentFilterUrl(), {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${config.apiKey}`,
    "X-Client-Request-Id": context.requestId,
  },
```
Bypass content filter for internal custom moderation requests
Calling fetch(getCustomContentFilterUrl(), ...) here sends moderation traffic back to /v1/chat/completions with no internal bypass signal, so when GATEWAY_URL resolves to this same gateway (the default path via getGatewayPublicBaseUrl) and LLM_CONTENT_FILTER_MODELS is unset or includes LLM_CONTENT_FILTER_CUSTOM_MODEL, the nested request re-enters chat.ts and executes the contentFilterMethod === "custom" branch again, recursively issuing more moderation calls until timeout. This can make custom moderation effectively unusable and create load amplification; the internal moderation request needs an explicit skip path.
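One way to express the suggested skip path, as a hedged sketch — the header name and function shape below are illustrative assumptions, not the gateway's actual identifiers:

```typescript
// Hypothetical bypass guard; the header name and helper shape are
// assumptions for illustration, not the repo's real identifiers.
const INTERNAL_FILTER_HEADER = "x-llmgateway-internal-content-filter";

// Chat handler side: never run the gateway content filter for requests
// that carry the internal marker header.
function shouldApplyFilter(
  headers: Record<string, string | undefined>,
  filterMode: string,
): boolean {
  if (headers[INTERNAL_FILTER_HEADER] === "1") {
    return false; // internal moderation call: skip filtering entirely
  }
  return filterMode !== "disabled";
}
```

The moderation fetch would then set this header on its own request so the nested call short-circuits instead of recursing.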
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
Walkthrough
Adds a new "custom" content-filter: configuration helpers, a custom moderation checker that POSTs to an upstream /v1/chat/completions endpoint, integrates results into the chat handler flow, and adds comprehensive tests covering success, overrides, parsing, and failure modes.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ChatHandler as Chat Handler
    participant CustomFilter as Custom Content Filter
    participant ModerationAPI as External Moderation API
    participant Logger
    Client->>ChatHandler: POST /v1/chat/completions
    activate ChatHandler
    ChatHandler->>CustomFilter: checkCustomContentFilter(messages, context, signal)
    activate CustomFilter
    CustomFilter->>CustomFilter: Load env config, build system prompt + payload
    CustomFilter->>ModerationAPI: POST <base>/v1/chat/completions (model, response_format=json_schema)
    activate ModerationAPI
    ModerationAPI-->>CustomFilter: HTTP response (verdict in assistant content)
    deactivate ModerationAPI
    CustomFilter->>CustomFilter: Parse/validate verdict JSON, derive flagged + responses
    CustomFilter-->>ChatHandler: CustomContentFilterCheckResult
    deactivate CustomFilter
    ChatHandler->>ChatHandler: Aggregate gatewayContentFilterResponses (OpenAI + Custom)
    alt flagged
        ChatHandler->>ChatHandler: Null message.content, set finish_reason = "content_filter"
    else not flagged
        ChatHandler->>ChatHandler: Preserve completion content (or monitor behavior)
    end
    ChatHandler->>Logger: Persist logs including gatewayContentFilterResponses
    ChatHandler-->>Client: Completion response
    deactivate ChatHandler
```
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Pull request overview
Adds a new “custom” gateway content-filter method that performs moderation by calling back into the LLMGateway /v1/chat/completions endpoint with a dedicated API key/model, and integrates its results into the existing moderation logging shape.
Changes:
- Introduces `custom` as a supported `LLM_CONTENT_FILTER_METHOD`, with env-based config (`LLM_CONTENT_FILTER_CUSTOM_API_KEY`, `LLM_CONTENT_FILTER_CUSTOM_MODEL`).
- Implements custom moderation execution/parsing + fail-open logging in a new tool module.
- Wires custom moderation into the `/v1/chat/completions` flow and adds unit + API-level tests covering block/monitor/missing-config/skip-by-model behavior.
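A minimal sketch of what such an env-driven config helper could look like. The function name and error strings match the tests quoted later in this thread, but the exact shape and the empty-string handling here are assumptions:

```typescript
// Sketch (assumed shape): read the custom-filter env vars, treating
// empty strings the same as missing values. Error messages match the
// strings asserted in the PR's tests.
interface CustomContentFilterConfig {
  apiKey: string;
  model: string;
}

function getCustomContentFilterConfig(): CustomContentFilterConfig {
  const apiKey = process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY;
  const model = process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL;
  if (!apiKey) {
    throw new Error(
      "LLM_CONTENT_FILTER_CUSTOM_API_KEY environment variable is required for custom content filter",
    );
  }
  if (!model) {
    throw new Error(
      "LLM_CONTENT_FILTER_CUSTOM_MODEL environment variable is required for custom content filter",
    );
  }
  return { apiKey, model };
}
```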
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| apps/gateway/src/chat/tools/custom-content-filter.ts | New custom moderation implementation that calls gateway chat completions, parses verdict JSON, and logs results/errors. |
| apps/gateway/src/chat/tools/custom-content-filter.spec.ts | Unit tests for custom moderation request shape, JSON parsing, and fail-open behavior. |
| apps/gateway/src/chat/tools/check-content-filter.ts | Extends content-filter method enum and adds env-driven custom moderation config helper. |
| apps/gateway/src/chat/tools/check-content-filter.spec.ts | Tests for new custom method selection and config validation. |
| apps/gateway/src/chat/chat.ts | Integrates custom moderation into content-filter match logic and log payload aggregation. |
| apps/gateway/src/api.spec.ts | Adds end-to-end API tests for custom method (block/monitor/missing-config/skip model). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
```ts
const customContentFilterResult =
  shouldApplyGatewayContentFilter && contentFilterMethod === "custom"
    ? await checkCustomContentFilter(
        messages as BaseMessage[],
        {
          requestId,
          organizationId: project.organizationId,
          projectId: project.id,
          apiKeyId: apiKey.id,
        },
        c.req.raw.signal,
      )
    : null;
```
Custom moderation calls back into the gateway’s own /v1/chat/completions. If LLM_CONTENT_FILTER_MODELS is unset/empty (meaning “apply to all models”) and LLM_CONTENT_FILTER_METHOD=custom, the moderation request itself will also go through this same gateway content filter path and trigger another custom moderation call, causing an infinite recursion / request storm. Add an explicit internal bypass (e.g., send a dedicated header on the internal moderation fetch and have chat.ts skip gateway content filtering when that header is present), or otherwise ensure the moderation model is always excluded from filtering in a robust, non-config-dependent way.
```ts
upstreamResponse = await fetch(getCustomContentFilterUrl(), {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${config.apiKey}`,
    "X-Client-Request-Id": context.requestId,
  },
  body: JSON.stringify({
    model: config.model,
    temperature: 0,
    max_tokens: CUSTOM_CONTENT_FILTER_MAX_TOKENS,
    messages: [
      {
        role: "system",
        content: CUSTOM_CONTENT_FILTER_SYSTEM_PROMPT,
      },
      {
        role: "user",
        content: buildCustomContentFilterInput(messages),
      },
    ],
  }),
  signal,
});
```
The custom moderation fetch hits the gateway’s own chat-completions endpoint. Without an explicit bypass header/flag, this can recurse when custom content filtering is enabled for the moderation model (default LLM_CONTENT_FILTER_MODELS behavior applies filtering to all models). Consider adding an internal-only header to this request and updating the chat-completions handler to skip gateway content filtering when that header is present.
```ts
}

if (part.type === "image") {
  return `inline-image: media_type=${part.source.media_type}, bytes=${part.source.data.length}`;
```
getImageReference() reports bytes=${part.source.data.length} for inline images, but source.data is typically base64 text (string length), not actual bytes. Either rename the field (e.g., base64Length) or compute real byte size if needed to avoid misleading moderation inputs/logs.
```diff
- return `inline-image: media_type=${part.source.media_type}, bytes=${part.source.data.length}`;
+ return `inline-image: media_type=${part.source.media_type}, base64Length=${part.source.data.length}`;
```
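If the intent is to report real decoded size rather than rename the field, the byte count can be derived from the base64 string without decoding it. A hedged sketch — `base64ByteLength` is a hypothetical helper, assuming standard padded base64 without a data-URL prefix:

```typescript
// Hypothetical helper: decoded byte count of a standard padded base64
// string, computed from its length and trailing '=' padding.
function base64ByteLength(b64: string): number {
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}
```

For example, `base64ByteLength("YWJj")` is 3 (the decoded bytes of "abc"), whereas `"YWJj".length` is 4.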
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/gateway/src/chat/tools/check-content-filter.ts (1)
47-53: ⚠️ Potential issue | 🟡 Minor

Let explicit `LLM_CONTENT_FILTER_METHOD` override legacy mode.
At Line 47, legacy `LLM_CONTENT_FILTER_MODE=openai` takes precedence over `LLM_CONTENT_FILTER_METHOD=custom`, making custom mode unreachable in mixed env configurations.

🔁 Proposed precedence fix

```diff
- if (envValue === "openai" || legacyModeEnvValue === "openai") {
-   return "openai";
- }
-
  if (envValue === "custom") {
    return "custom";
  }
+
+ if (envValue === "openai" || legacyModeEnvValue === "openai") {
+   return "openai";
+ }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/chat/tools/check-content-filter.ts` around lines 47 - 53, The current logic gives legacyModeEnvValue ("LLM_CONTENT_FILTER_MODE") precedence over envValue ("LLM_CONTENT_FILTER_METHOD"), causing "custom" to be unreachable when legacyModeEnvValue === "openai"; change the precedence so envValue is checked first (i.e., evaluate envValue === "custom" or envValue === "openai" before checking legacyModeEnvValue), or explicitly prefer envValue when it is set (use envValue if truthy, otherwise fall back to legacyModeEnvValue) so that envValue="custom" can override legacyModeEnvValue="openai".
🧹 Nitpick comments (1)
apps/gateway/src/chat/tools/custom-content-filter.spec.ts (1)
34-53: Add a regression assertion for the internal moderation bypass header.
Given custom moderation calls back into `/v1/chat/completions`, this test should also assert the internal bypass header once implemented, so recursion cannot regress silently.

🧪 Suggested assertion

```diff
  const headers = new Headers(init?.headers);
  expect(headers.get("authorization")).toBe("Bearer custom-api-key");
  expect(headers.get("x-client-request-id")).toBe("request-id");
+ expect(headers.get("x-llmgateway-internal-content-filter")).toBe("1");
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/chat/tools/custom-content-filter.spec.ts` around lines 34 - 53, The test needs an assertion that the internal moderation-bypass header is sent when custom moderation calls back into /v1/chat/completions; inside the mocked fetch in custom-content-filter.spec.ts (where fetchSpy is created and headers are checked), add an assertion that headers.get("<internal-moderation-bypass-header>") equals the same value used by the implementation (or use the implementation constant, e.g. INTERNAL_MODERATION_BYPASS_HEADER and its expected value such as "1" or "true") so recursion cannot regress silently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 1913-1925: The custom content-filter branch can recurse because
checkCustomContentFilter posts back to the gateway endpoint and re-enters the
same contentFilterMethod === "custom" path; modify the gateway branch that calls
checkCustomContentFilter (the code using
shouldApplyGatewayContentFilter/contentFilterMethod and function
checkCustomContentFilter) to detect and skip moderation when a special internal
header is present, and update the internal fetch inside
apps/gateway/src/chat/tools/custom-content-filter.ts (the request to
GATEWAY_URL/v1/chat/completions) to include a unique bypass header (e.g.,
X-Gateway-Internal-Moderation: 1) so the gateway can short-circuit and avoid
re-invoking the custom content filter.
In `@apps/gateway/src/chat/tools/check-content-filter.spec.ts`:
- Around line 247-263: The tests for getCustomContentFilterConfig only cover
missing (undefined) env vars; also add cases where
LLM_CONTENT_FILTER_CUSTOM_API_KEY or LLM_CONTENT_FILTER_CUSTOM_MODEL are present
but empty strings (""), because empty values should be treated the same as
missing and cause the same error; update the two specs ("throws when the custom
api key is missing" and "throws when the custom model is missing") or add new
tests to set the respective env var to "" before calling
getCustomContentFilterConfig() and expect the same toThrow messages.
In `@apps/gateway/src/chat/tools/custom-content-filter.ts`:
- Around line 395-418: The moderation POST is re-entering the gateway's content
filter causing recursion; mark the internal moderation request and skip gateway
filtering. Add the internal-moderation header (e.g.,
"x-llmgateway-internal-moderation": "true") to the fetch call created by
getCustomContentFilterUrl()/the upstreamResponse POST so the request can be
identified, and ensure the gating logic in checkCustomContentFilter()/chat.ts
uses that header (isInternalModerationRequest) to bypass applying the gateway
content filter to requests with that header set.
- Around line 139-146: The getImageReference function currently copies
part.image_url.url verbatim for part.type === "image_url", which can leak
sensitive query params or hostnames; change it to never include the raw URL
string — instead return a sanitized placeholder (e.g., "remote-image:
[redacted]") or a safe summary that omits the URL/hostname/query (you may
include non-identifying metadata like image size or safe media type if
available), and ensure the same treatment is applied for part.type === "image"
if any source fields could leak identifying info; update getImageReference to
reference part.image_url.url only to extract non-sensitive metadata (if needed)
but do not emit the URL itself.
---
Outside diff comments:
In `@apps/gateway/src/chat/tools/check-content-filter.ts`:
- Around line 47-53: The current logic gives legacyModeEnvValue
("LLM_CONTENT_FILTER_MODE") precedence over envValue
("LLM_CONTENT_FILTER_METHOD"), causing "custom" to be unreachable when
legacyModeEnvValue === "openai"; change the precedence so envValue is checked
first (i.e., evaluate envValue === "custom" or envValue === "openai" before
checking legacyModeEnvValue), or explicitly prefer envValue when it is set (use
envValue if truthy, otherwise fall back to legacyModeEnvValue) so that
envValue="custom" can override legacyModeEnvValue="openai".
---
Nitpick comments:
In `@apps/gateway/src/chat/tools/custom-content-filter.spec.ts`:
- Around line 34-53: The test needs an assertion that the internal
moderation-bypass header is sent when custom moderation calls back into
/v1/chat/completions; inside the mocked fetch in custom-content-filter.spec.ts
(where fetchSpy is created and headers are checked), add an assertion that
headers.get("<internal-moderation-bypass-header>") equals the same value used by
the implementation (or use the implementation constant, e.g.
INTERNAL_MODERATION_BYPASS_HEADER and its expected value such as "1" or "true")
so recursion cannot regress silently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9e28cf75-15b4-4484-a285-efec64d1977c
📒 Files selected for processing (6)
- apps/gateway/src/api.spec.ts
- apps/gateway/src/chat/chat.ts
- apps/gateway/src/chat/tools/check-content-filter.spec.ts
- apps/gateway/src/chat/tools/check-content-filter.ts
- apps/gateway/src/chat/tools/custom-content-filter.spec.ts
- apps/gateway/src/chat/tools/custom-content-filter.ts
```ts
const customContentFilterResult =
  shouldApplyGatewayContentFilter && contentFilterMethod === "custom"
    ? await checkCustomContentFilter(
        messages as BaseMessage[],
        {
          requestId,
          organizationId: project.organizationId,
          projectId: project.id,
          apiKeyId: apiKey.id,
        },
        c.req.raw.signal,
      )
    : null;
```
Prevent self-recursive moderation requests in custom mode.
At Line 1913, this branch invokes checkCustomContentFilter, and that helper posts back to GATEWAY_URL/v1/chat/completions (apps/gateway/src/chat/tools/custom-content-filter.ts, Lines 384-420) without a bypass marker. That internal request can re-enter the same contentFilterMethod === "custom" path and loop indefinitely.
🔧 Proposed guard + internal marker

```diff
+ const isInternalContentFilterRequest =
+   c.req.header("x-llmgateway-internal-content-filter") === "1";
  const shouldApplyGatewayContentFilter =
+   !isInternalContentFilterRequest &&
    contentFilterMode !== "disabled" &&
    shouldApplyContentFilterToModel(requestedModel);
```

Also add this header on the internal moderation fetch in apps/gateway/src/chat/tools/custom-content-filter.ts:
```diff
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${config.apiKey}`,
    "X-Client-Request-Id": context.requestId,
+   "X-LLMGateway-Internal-Content-Filter": "1",
  },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/chat/chat.ts` around lines 1913 - 1925, The custom
content-filter branch can recurse because checkCustomContentFilter posts back to
the gateway endpoint and re-enters the same contentFilterMethod === "custom"
path; modify the gateway branch that calls checkCustomContentFilter (the code
using shouldApplyGatewayContentFilter/contentFilterMethod and function
checkCustomContentFilter) to detect and skip moderation when a special internal
header is present, and update the internal fetch inside
apps/gateway/src/chat/tools/custom-content-filter.ts (the request to
GATEWAY_URL/v1/chat/completions) to include a unique bypass header (e.g.,
X-Gateway-Internal-Moderation: 1) so the gateway can short-circuit and avoid
re-invoking the custom content filter.
```ts
it("throws when the custom api key is missing", () => {
  delete process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY;
  process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL = "openai/gpt-5-mini";

  expect(() => getCustomContentFilterConfig()).toThrow(
    "LLM_CONTENT_FILTER_CUSTOM_API_KEY environment variable is required for custom content filter",
  );
});

it("throws when the custom model is missing", () => {
  process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY = "custom-key";
  delete process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL;

  expect(() => getCustomContentFilterConfig()).toThrow(
    "LLM_CONTENT_FILTER_CUSTOM_MODEL environment variable is required for custom content filter",
  );
});
```
Cover blank env-var values in these new config tests.
These cases only assert undefined, but misconfigured deployments usually fail as "". That leaves the new validation path partially untested.
🧪 Suggested test additions
```diff
+ it("throws when the custom api key is empty", () => {
+   process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY = "";
+   process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL = "openai/gpt-5-mini";
+
+   expect(() => getCustomContentFilterConfig()).toThrow(
+     "LLM_CONTENT_FILTER_CUSTOM_API_KEY environment variable is required for custom content filter",
+   );
+ });
+
+ it("throws when the custom model is empty", () => {
+   process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY = "custom-key";
+   process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL = "";
+
+   expect(() => getCustomContentFilterConfig()).toThrow(
+     "LLM_CONTENT_FILTER_CUSTOM_MODEL environment variable is required for custom content filter",
+   );
+ });
```
+ });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
it("throws when the custom api key is missing", () => {
  delete process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY;
  process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL = "openai/gpt-5-mini";

  expect(() => getCustomContentFilterConfig()).toThrow(
    "LLM_CONTENT_FILTER_CUSTOM_API_KEY environment variable is required for custom content filter",
  );
});

it("throws when the custom model is missing", () => {
  process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY = "custom-key";
  delete process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL;

  expect(() => getCustomContentFilterConfig()).toThrow(
    "LLM_CONTENT_FILTER_CUSTOM_MODEL environment variable is required for custom content filter",
  );
});

it("throws when the custom api key is empty", () => {
  process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY = "";
  process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL = "openai/gpt-5-mini";

  expect(() => getCustomContentFilterConfig()).toThrow(
    "LLM_CONTENT_FILTER_CUSTOM_API_KEY environment variable is required for custom content filter",
  );
});

it("throws when the custom model is empty", () => {
  process.env.LLM_CONTENT_FILTER_CUSTOM_API_KEY = "custom-key";
  process.env.LLM_CONTENT_FILTER_CUSTOM_MODEL = "";

  expect(() => getCustomContentFilterConfig()).toThrow(
    "LLM_CONTENT_FILTER_CUSTOM_MODEL environment variable is required for custom content filter",
  );
});
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/chat/tools/check-content-filter.spec.ts` around lines 247 -
263, The tests for getCustomContentFilterConfig only cover missing (undefined)
env vars; also add cases where LLM_CONTENT_FILTER_CUSTOM_API_KEY or
LLM_CONTENT_FILTER_CUSTOM_MODEL are present but empty strings (""), because
empty values should be treated the same as missing and cause the same error;
update the two specs ("throws when the custom api key is missing" and "throws
when the custom model is missing") or add new tests to set the respective env
var to "" before calling getCustomContentFilterConfig() and expect the same
toThrow messages.
```ts
function getImageReference(part: MessageContent): string | null {
  if (part.type === "image_url") {
    return `remote-image: ${part.image_url.url}`;
  }

  if (part.type === "image") {
    return `inline-image: media_type=${part.source.media_type}, bytes=${part.source.data.length}`;
  }
```
Do not send raw image URLs to the moderator.
part.image_url.url is copied verbatim into the prompt, which can leak presigned query params, internal hostnames, or user identifiers to the moderation model even though it only sees the URL as text.
🔐 Safer handling
```diff
  function getImageReference(part: MessageContent): string | null {
    if (part.type === "image_url") {
-     return `remote-image: ${part.image_url.url}`;
+     try {
+       const url = new URL(part.image_url.url);
+       if (url.protocol !== "http:" && url.protocol !== "https:") {
+         return `remote-image: [${url.protocol.replace(":", "")}]`;
+       }
+       return `remote-image: ${url.origin}${url.pathname}`;
+     } catch {
+       return "remote-image: [redacted]";
+     }
    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/chat/tools/custom-content-filter.ts` around lines 139 - 146,
The getImageReference function currently copies part.image_url.url verbatim for
part.type === "image_url", which can leak sensitive query params or hostnames;
change it to never include the raw URL string — instead return a sanitized
placeholder (e.g., "remote-image: [redacted]") or a safe summary that omits the
URL/hostname/query (you may include non-identifying metadata like image size or
safe media type if available), and ensure the same treatment is applied for
part.type === "image" if any source fields could leak identifying info; update
getImageReference to reference part.image_url.url only to extract non-sensitive
metadata (if needed) but do not emit the URL itself.
```ts
upstreamResponse = await fetch(getCustomContentFilterUrl(), {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${config.apiKey}`,
    "X-Client-Request-Id": context.requestId,
  },
  body: JSON.stringify({
    model: config.model,
    temperature: 0,
    max_tokens: CUSTOM_CONTENT_FILTER_MAX_TOKENS,
    messages: [
      {
        role: "system",
        content: CUSTOM_CONTENT_FILTER_SYSTEM_PROMPT,
      },
      {
        role: "user",
        content: buildCustomContentFilterInput(messages),
      },
    ],
  }),
  signal,
});
```
Skip content filtering on the internal moderation request.
This POST goes through the same /v1/chat/completions handler that invokes checkCustomContentFilter() in apps/gateway/src/chat/chat.ts:1891-1932. With LLM_CONTENT_FILTER_MODELS unset — the default apply-to-all behavior — or configured to include config.model, the moderation request re-enters this code and recursively moderates itself until timeout, so custom filtering fail-opens instead of classifying the original request.
🔁 One way to break the recursion

```diff
  upstreamResponse = await fetch(getCustomContentFilterUrl(), {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${config.apiKey}`,
      "X-Client-Request-Id": context.requestId,
+     "X-LLMGateway-Internal-Moderation": "true",
    },
```

```ts
// apps/gateway/src/chat/chat.ts
const isInternalModerationRequest =
  c.req.header("x-llmgateway-internal-moderation") === "true";
const shouldApplyGatewayContentFilter =
  !isInternalModerationRequest &&
  contentFilterMode !== "disabled" &&
  shouldApplyContentFilterToModel(requestedModel);
```
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/chat/tools/custom-content-filter.ts` around lines 395 - 418,
The moderation POST is re-entering the gateway's content filter causing
recursion; mark the internal moderation request and skip gateway filtering. Add
the internal-moderation header (e.g., "x-llmgateway-internal-moderation":
"true") to the fetch call created by getCustomContentFilterUrl()/the
upstreamResponse POST so the request can be identified, and ensure the gating
logic in checkCustomContentFilter()/chat.ts uses that header
(isInternalModerationRequest) to bypass applying the gateway content filter to
requests with that header set.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2b640048d
```ts
if (part.type === "image") {
  return `inline-image: media_type=${part.source.media_type}, bytes=${part.source.data.length}`;
}
```
Send real inline images to custom moderation
When contentFilterMethod === "custom", inline images are converted to metadata (media_type and byte length) instead of image data, so the moderation model cannot inspect the actual pixels. In enabled mode this allows image-only unsafe content to pass as unflagged whenever accompanying text is benign or empty, which undermines the filter for multimodal requests. The moderation request should include the real image payload (or use a moderation endpoint that accepts image inputs) rather than a textual placeholder.
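One way to forward the real payload, sketched under assumptions: the inline-image part shape follows the snippet above, and the moderation model is assumed to accept OpenAI-style `image_url` content parts carrying data URLs (the helper name is illustrative):

```typescript
// Sketch: convert an inline base64 image part into an image_url content
// part carrying a data URL, so a vision-capable moderation model can
// inspect the actual pixels instead of a textual placeholder.
type InlineImagePart = {
  type: "image";
  source: { media_type: string; data: string }; // base64 payload
};

function toModerationImagePart(part: InlineImagePart) {
  return {
    type: "image_url" as const,
    image_url: {
      url: `data:${part.source.media_type};base64,${part.source.data}`,
    },
  };
}
```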
🧹 Nitpick comments (2)
apps/gateway/src/chat/tools/custom-content-filter.ts (1)
324-330: Consider extracting the hardcoded score threshold.
The `0.5` threshold for category scores appears here and again in `buildCustomModerationPayload` (line 397). Consider extracting it to a named constant for consistency and easier tuning.

♻️ Optional: Extract threshold constant

```diff
+const CUSTOM_CONTENT_FILTER_SCORE_THRESHOLD = 0.5;
+
 function getFlaggedCategories(payload: ModerationApiPayload): string[] {
   // ...
   for (const [category, score] of Object.entries(
     result.category_scores ?? {},
   )) {
-    if (score > 0.5) {
+    if (score > CUSTOM_CONTENT_FILTER_SCORE_THRESHOLD) {
       categories.add(category);
     }
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/chat/tools/custom-content-filter.ts` around lines 324 - 330, Extract the hardcoded 0.5 into a single named constant (e.g., CATEGORY_SCORE_THRESHOLD = 0.5) and replace the literal in the category loop inside custom-content-filter.ts (the for (const [category, score] of Object.entries(...) { if (score > 0.5) ... }) and the other occurrence in buildCustomModerationPayload) so both sites reference the same constant for consistency and easier tuning; ensure the constant is exported or colocated at the top of the module so both functions use it.

apps/gateway/src/chat/tools/custom-content-filter.spec.ts (1)
7-227: Consider additional test coverage.
The current tests cover the happy path and the missing-config scenario well. Consider adding tests for:
- Network timeout handling
- Non-2xx upstream responses
- Invalid/malformed upstream JSON responses
- Request cancellation via `AbortSignal`

These would increase confidence in the fail-open behavior under various failure modes.
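The fail-open contract those cases would pin down can be sketched as a small wrapper (names here are illustrative, not the module's actual API): any moderation failure resolves to "not flagged" with the error recorded, rather than blocking the user's request.

```typescript
// Illustrative fail-open wrapper: network errors, bad status codes,
// malformed JSON, and aborts all collapse to { flagged: false }.
async function moderateFailOpen(
  call: () => Promise<{ flagged: boolean }>,
): Promise<{ flagged: boolean; error?: string }> {
  try {
    return await call();
  } catch (err) {
    // Fail open: log-worthy, but never blocks the original request.
    return { flagged: false, error: String(err) };
  }
}
```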
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/gateway/src/chat/tools/custom-content-filter.spec.ts` around lines 7 - 227, Add tests for failure modes of checkCustomContentFilter: (1) simulate a network timeout by mocking global.fetch to reject (e.g., throw a TypeError or a custom timeout Error) and assert the function returns fail-open (flagged false), does not crash, and logs an error; (2) simulate a non-2xx upstream response by returning a Response with status 500 and optional error body, then assert fail-open behavior and logged error; (3) simulate malformed/invalid JSON by returning a 200 Response whose choices.message.content is not valid JSON and assert the parser falls back safely and returns fail-open while logging; and (4) test request cancellation by creating an AbortController, passing its signal into checkCustomContentFilter (where supported) and mocking fetch to reject with an AbortError or to observe the signal, then assert the function handles cancellation by returning fail-open and logging. Reference the test file and the checkCustomContentFilter function to add these cases, mock logger.error to inspect logs, and ensure expectations mirror existing tests (fetch calls, result.flagged false, result.responses empty or sanitized).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@apps/gateway/src/chat/tools/custom-content-filter.spec.ts`:
- Around line 7-227: Add tests for failure modes of checkCustomContentFilter:
(1) simulate a network timeout by mocking global.fetch to reject (e.g., throw a
TypeError or a custom timeout Error) and assert the function returns fail-open
(flagged false), does not crash, and logs an error; (2) simulate a non-2xx
upstream response by returning a Response with status 500 and optional error
body, then assert fail-open behavior and logged error; (3) simulate
malformed/invalid JSON by returning a 200 Response whose choices.message.content
is not valid JSON and assert the parser falls back safely and returns fail-open
while logging; and (4) test request cancellation by creating an AbortController,
passing its signal into checkCustomContentFilter (where supported) and mocking
fetch to reject with an AbortError or to observe the signal, then assert the
function handles cancellation by returning fail-open and logging. Reference the
test file and the checkCustomContentFilter function to add these cases, mock
logger.error to inspect logs, and ensure expectations mirror existing tests
(fetch calls, result.flagged false, result.responses empty or sanitized).
In `@apps/gateway/src/chat/tools/custom-content-filter.ts`:
- Around line 324-330: Extract the hardcoded 0.5 into a single named constant
(e.g., CATEGORY_SCORE_THRESHOLD = 0.5) and replace the literal in the category
loop inside custom-content-filter.ts (the for (const [category, score] of
Object.entries(...) { if (score > 0.5) ... }) and the other occurrence in
buildCustomModerationPayload) so both sites reference the same constant for
consistency and easier tuning; ensure the constant is exported or colocated at
the top of the module so both functions use it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f8081615-32b3-458e-b58b-08f8ec5996d6
📒 Files selected for processing (3)
- apps/gateway/src/api.spec.ts
- apps/gateway/src/chat/tools/custom-content-filter.spec.ts
- apps/gateway/src/chat/tools/custom-content-filter.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/gateway/src/api.spec.ts
Summary
- `custom` content filter method alongside `keywords` and `openai`
- Moderation via `/v1/chat/completions` using `LLM_CONTENT_FILTER_CUSTOM_API_KEY` and `LLM_CONTENT_FILTER_CUSTOM_MODEL`

Testing
Summary by CodeRabbit
New Features
Tests