
fix: use CHAT_SETTING_LIMITS for max_tokens instead of hardcoded values#2008

Open
KushalLukhi wants to merge 1 commit into mckaywrigley:main from KushalLukhi:fix/use-chat-setting-limits-for-max-tokens

Conversation

@KushalLukhi

Problem

The OpenAI and Azure chat routes contained hardcoded max_tokens logic, flagged by TODO comments as needing a fix. Only gpt-4-vision-preview and gpt-4o were explicitly capped at 4096 tokens; every other model had no limit set (null).

Solution

Updated both routes to use CHAT_SETTING_LIMITS to look up the appropriate MAX_TOKEN_OUTPUT_LENGTH for each model, ensuring consistent token limits across all models.
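The shape of the change can be sketched as follows. This is an illustrative, self-contained mock, not the repository's actual files: the table entries, values, and the `maxTokensFor` helper are assumptions standing in for the real `CHAT_SETTING_LIMITS` export and the inline lookup in the routes.

```typescript
// Illustrative stand-in for the repo's CHAT_SETTING_LIMITS table,
// keyed by model id (values here are assumptions, not the real config).
interface ChatSettingLimits {
  MAX_TOKEN_OUTPUT_LENGTH: number
}

const CHAT_SETTING_LIMITS: Record<string, ChatSettingLimits> = {
  "gpt-4o": { MAX_TOKEN_OUTPUT_LENGTH: 4096 },
  "gpt-4-vision-preview": { MAX_TOKEN_OUTPUT_LENGTH: 4096 },
  "gpt-3.5-turbo": { MAX_TOKEN_OUTPUT_LENGTH: 4096 }
}

// Before: hardcoded, model-specific branching in the route, e.g.
//   max_tokens: model === "gpt-4-vision-preview" || model === "gpt-4o" ? 4096 : null

// After: a single table lookup (hypothetical helper wrapping the inline lookup).
// Unknown models fall back to null, matching the previous default.
function maxTokensFor(model: string): number | null {
  return CHAT_SETTING_LIMITS[model]?.MAX_TOKEN_OUTPUT_LENGTH ?? null
}
```

With this shape, adding a new model to the limits table is enough for both routes to pick up its output cap; no route code changes are needed.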

Changes

  • app/api/chat/openai/route.ts: Import CHAT_SETTING_LIMITS and use it for max_tokens
  • app/api/chat/azure/route.ts: Import CHAT_SETTING_LIMITS and use it for max_tokens
  • Removed TODO comments that were tracking this issue

Benefits

  • All models now use their configured MAX_TOKEN_OUTPUT_LENGTH from CHAT_SETTING_LIMITS
  • No more hardcoded model-specific logic
  • Future model additions automatically use correct token limits
  • Consistent behavior between OpenAI and Azure routes

Testing

  • TypeScript compilation passes
  • All existing models continue to use their configured token limits
  • gpt-4-vision-preview and gpt-4o still use 4096 tokens as before
  • Other models (gpt-4-turbo-preview, gpt-4, gpt-3.5-turbo) now correctly use their limits

Fixes technical debt noted in TODO comments.

- Import CHAT_SETTING_LIMITS in OpenAI and Azure chat routes
- Replace hardcoded max_tokens logic with CHAT_SETTING_LIMITS lookup
- Remove TODO comments that were tracking this issue
- Ensures all models use their configured MAX_TOKEN_OUTPUT_LENGTH
- Fix the inconsistency where only certain models had max_tokens set

This change ensures that:
- gpt-4-vision-preview and gpt-4o use 4096 tokens (as before)
- Other models now correctly use their configured limits from CHAT_SETTING_LIMITS
- Future model additions automatically use correct token limits
