
LLM update 202507 #8455


Merged

williamstein merged 13 commits into master from llm-update-202507 on Jul 21, 2025

Conversation

@haraldschilly (Contributor) commented on Jul 17, 2025

LLM Infrastructure Improvements

Key Updates:

  1. Google Gemini 2.5 Thinking Token Support - Added maxReasoningTokens: 1024 for Gemini 2.5 models (see the sketch after this list)
  2. Unified LangChain Implementation - Consolidated 6 LLM provider files into 1 implementation, eliminating ~600 lines of duplicated code
  3. OpenAI o1 Model Support - Fixed streaming and system role issues for o1 and o1-mini models
  4. Enhanced Testing - Added comprehensive user-defined LLM tests and improved the admin testing interface
  5. Code Consistency & Cleanup - Improved token counting accuracy and error handling
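
The thinking-token support in item 1 can be pictured as follows. This is a minimal sketch, not the PR's actual code: the config interface, helper, and provider-side field names are illustrative assumptions, and only the `maxReasoningTokens: 1024` value comes from the PR itself.

```ts
// Hypothetical sketch of declaring a per-model reasoning-token budget
// and forwarding it to the Google client.
interface LLMModelConfig {
  model: string;
  maxTokens: number;
  maxReasoningTokens?: number; // budget for "thinking" tokens
}

const gemini25Flash: LLMModelConfig = {
  model: "gemini-2.5-flash",
  maxTokens: 8192, // illustrative value
  maxReasoningTokens: 1024, // the value this PR sets for Gemini 2.5 models
};

// Map the generic config onto provider request parameters; the
// provider-side field name here is an assumption.
function toGoogleParams(cfg: LLMModelConfig) {
  return {
    model: cfg.model,
    maxOutputTokens: cfg.maxTokens,
    ...(cfg.maxReasoningTokens != null
      ? { thinkingBudget: cfg.maxReasoningTokens }
      : {}),
  };
}
```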

Highlights:

  • Fixed OpenAI o1 models - Resolved stream_options and system role compatibility issues (sketched below)
  • User-defined LLM tests - Complete test coverage for all supported providers (OpenAI, Google, Anthropic, Mistral, Custom OpenAI)
  • Migration framework - Added USE_NEWER_LC_IMPL flag to support gradual migration to unified implementation
  • Better testing tools - New table-based test interface with improved error display

All 27 LLM tests now pass, including the new user-defined LLM functionality. The changes maintain backward compatibility while providing a foundation for future LLM improvements.
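
The o1 fix highlighted above boils down to two small request-building changes. A minimal sketch in TypeScript, with illustrative types and function names; the PR's actual code lives in its unified LangChain implementation:

```ts
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildOpenAIParams(
  model: string,
  messages: ChatMessage[],
  stream: boolean,
) {
  const isO1 = model === "o1" || model.startsWith("o1-");

  // Fix 1: include stream_options only when streaming is enabled;
  // sending it on non-streaming requests is what broke o1 calls.
  const streamOptions = stream
    ? { stream_options: { include_usage: true } }
    : {};

  // Fix 2: o1 models reject the "system" role, so system messages are
  // omitted entirely for them.
  const msgs = isO1 ? messages.filter((m) => m.role !== "system") : messages;

  return { model, messages: msgs, stream, ...streamOptions };
}
```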

@haraldschilly force-pushed the llm-update-202507 branch 2 times, most recently from 2b1a5c4 to 02295d5 on July 17, 2025 09:48
@haraldschilly force-pushed the llm-update-202507 branch 2 times, most recently from 51375f8 to fae9f42 on July 17, 2025 13:00
@haraldschilly force-pushed the llm-update-202507 branch 2 times, most recently from 77c44d7 to 1928b93 on July 17, 2025 14:53
haraldschilly and others added 3 commits on July 18, 2025 10:04
- Fix o1 models stream_options error by only including stream_options when streaming is enabled
- Fix o1 models system role error by omitting system messages entirely (o1 models don't support system roles)
- Update tests to use USE_NEWER_LC_IMPL flag to switch between legacy and unified LangChain implementations
- Export USE_NEWER_LC_IMPL flag for test usage
- All 22 LLM tests now pass including both o1 and o1-mini models

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
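
The USE_NEWER_LC_IMPL flag this commit exports follows a standard migration-gate pattern: one flag selects between the legacy per-provider code paths and the unified LangChain implementation. A minimal sketch under that assumption; the function names and bodies here are stand-ins, not the PR's actual code:

```ts
// Migration flag: flip to false to fall back to the legacy code path.
export const USE_NEWER_LC_IMPL: boolean = true;

async function evaluateLegacy(prompt: string): Promise<string> {
  return `legacy: ${prompt}`; // stands in for the old per-provider code
}

async function evaluateWithLangChain(prompt: string): Promise<string> {
  return `unified: ${prompt}`; // stands in for the unified implementation
}

// Callers go through one entry point, so the gradual migration is a
// single-flag flip rather than a scattered rewrite.
export async function evaluate(prompt: string): Promise<string> {
  return USE_NEWER_LC_IMPL
    ? evaluateWithLangChain(prompt)
    : evaluateLegacy(prompt);
}
```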
- Add comprehensive test suite for user-defined LLMs
- Test OpenAI, Google, Anthropic, Mistral, and custom OpenAI models
- Create ephemeral test database account with proper user-defined LLM config storage
- Use environment variables for API keys (COCALC_TEST_*_KEY)
- Tests validate end-to-end functionality from database storage to LLM evaluation
- Update Anthropic model to use claude-3-5-haiku-latest alias
- All 5 user-defined LLM tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
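
The env-var gating this commit describes can be sketched as follows, assuming Jest and key names expanded from the COCALC_TEST_*_KEY pattern; the evaluateUserDefinedLLM helper is hypothetical:

```ts
const providers = [
  { name: "OpenAI", envKey: "COCALC_TEST_OPENAI_KEY" },
  { name: "Google", envKey: "COCALC_TEST_GOOGLE_KEY" },
  { name: "Anthropic", envKey: "COCALC_TEST_ANTHROPIC_KEY" },
  { name: "Mistral", envKey: "COCALC_TEST_MISTRAL_KEY" },
] as const;

// Hypothetical helper standing in for the end-to-end path from
// database-stored config to LLM evaluation.
declare function evaluateUserDefinedLLM(opts: {
  provider: string;
  apiKey: string;
}): Promise<string>;

for (const { name, envKey } of providers) {
  const apiKey = process.env[envKey];
  // Skip rather than fail when no key is configured, so the suite
  // still passes in environments without provider credentials.
  const run = apiKey ? test : test.skip;
  run(`user-defined ${name} LLM evaluates end-to-end`, async () => {
    const reply = await evaluateUserDefinedLLM({ provider: name, apiKey: apiKey! });
    expect(typeof reply).toBe("string");
  });
}
```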
@haraldschilly force-pushed the llm-update-202507 branch 3 times, most recently from 4b88bb5 to db444e6 on July 21, 2025 12:33
@haraldschilly marked this pull request as ready for review on July 21, 2025 12:58
@williamstein (Contributor) commented:
@haraldschilly what is the release process? E.g., what's the current coupling between frontend and backend for this? You usually do a good job explaining this. E.g., what happens if only the frontend is updated? Only the backend? Do I need to force people to refresh?

@williamstein merged commit 92554d0 into master on Jul 21, 2025 (3 checks passed)
@haraldschilly (Contributor, Author) commented:
Hmm, good question. Since the backend reports to the frontend which LLMs are available, it is fine to update the frontend first and then the backend. There was no change to the actual communication. Overall, both need to be updated.

@williamstein (Contributor) commented:
> Hmm, good question. Since the backend reports to the frontend which LLMs are available, it is fine to update the frontend first and then the backend. There was no change to the actual communication. Overall, both need to be updated.

Thanks - that's optimal! And of course it is all live right now.
