fix: use CHAT_SETTING_LIMITS for max_tokens instead of hardcoded values #2008
Open

KushalLukhi wants to merge 1 commit into mckaywrigley:main
- Import CHAT_SETTING_LIMITS in the OpenAI and Azure chat routes
- Replace the hardcoded max_tokens logic with a CHAT_SETTING_LIMITS lookup
- Remove the TODO comments that were tracking this issue
- Ensure all models use their configured MAX_TOKEN_OUTPUT_LENGTH
- Fix the inconsistency where only certain models had max_tokens set

This change ensures that:

- gpt-4-vision-preview and gpt-4o use 4096 tokens (as before)
- Other models now correctly use their configured limits from CHAT_SETTING_LIMITS
- Future model additions automatically use the correct token limits
Problem
The OpenAI and Azure chat routes had hardcoded max_tokens logic with TODO comments indicating this needed to be fixed. Only gpt-4-vision-preview and gpt-4o were explicitly set to 4096 tokens, while other models had no limit set (null).

Solution
Updated both routes to use CHAT_SETTING_LIMITS to look up the appropriate MAX_TOKEN_OUTPUT_LENGTH for each model, ensuring consistent token limits across all models.

Changes
- OpenAI chat route: import CHAT_SETTING_LIMITS and use it for max_tokens
- Azure chat route: import CHAT_SETTING_LIMITS and use it for max_tokens

Benefits
- All models use their configured MAX_TOKEN_OUTPUT_LENGTH from CHAT_SETTING_LIMITS
- gpt-4-vision-preview and gpt-4o continue to use 4096 tokens (as before)
- Future model additions automatically use the correct token limits

Testing
Fixes technical debt noted in TODO comments.
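The swap this PR describes can be sketched as follows. This is a hedged illustration, not the repo's actual route code: the shape of the CHAT_SETTING_LIMITS table, its MAX_TOKEN_OUTPUT_LENGTH field, and the sample values below are assumed from the PR text, with the table inlined here so the sketch is self-contained.

```typescript
// Assumed shape of a CHAT_SETTING_LIMITS entry (field name taken from the PR text)
interface ChatSettingLimits {
  MAX_TOKEN_OUTPUT_LENGTH: number
}

// Illustrative values only; the real table is the repo's CHAT_SETTING_LIMITS export
const CHAT_SETTING_LIMITS: Record<string, ChatSettingLimits> = {
  "gpt-4-vision-preview": { MAX_TOKEN_OUTPUT_LENGTH: 4096 },
  "gpt-4o": { MAX_TOKEN_OUTPUT_LENGTH: 4096 },
  "gpt-3.5-turbo": { MAX_TOKEN_OUTPUT_LENGTH: 4096 }
}

// Before: hardcoded per-model checks; every other model fell through to null
function maxTokensBefore(model: string): number | null {
  return model === "gpt-4-vision-preview" || model === "gpt-4o" ? 4096 : null
}

// After: a single table lookup, so every configured model gets its limit
// and newly added models pick up their limit automatically
function maxTokensAfter(model: string): number | null {
  return CHAT_SETTING_LIMITS[model]?.MAX_TOKEN_OUTPUT_LENGTH ?? null
}
```

In the routes themselves, this lookup would replace the hardcoded ternary when building the completion request, so max_tokens is always sourced from the settings table rather than duplicated per model.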