
LLM update 202507 #8455


Merged

williamstein merged 13 commits into master from llm-update-202507 on Jul 21, 2025

Conversation

@haraldschilly (Contributor) commented on Jul 17, 2025

LLM Infrastructure Improvements

Key Updates:

  1. Google Gemini 2.5 Thinking Token Support - Added maxReasoningTokens: 1024 for Gemini 2.5 models (see the sketch after this list)
  2. Unified LangChain Implementation - Consolidated 6 LLM provider files into 1 implementation, eliminating ~600 lines of duplicated code
  3. OpenAI o1 Model Support - Fixed streaming and system role issues for o1 and o1-mini models
  4. Enhanced Testing - Added comprehensive user-defined LLM tests and improved the admin testing interface
  5. Code Consistency & Cleanup - Improved token counting accuracy and error handling
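
The thinking-token support in item 1 can be pictured as follows. This is a minimal sketch, not the PR's actual code: the config interface, helper, and provider-side field names are illustrative assumptions, and only the `maxReasoningTokens: 1024` value comes from the PR itself.

```ts
// Hypothetical sketch of declaring a per-model reasoning-token budget
// and forwarding it to the Google client.
interface LLMModelConfig {
  model: string;
  maxTokens: number;
  maxReasoningTokens?: number; // budget for "thinking" tokens
}

const gemini25Flash: LLMModelConfig = {
  model: "gemini-2.5-flash",
  maxTokens: 8192, // illustrative value
  maxReasoningTokens: 1024, // the value this PR sets for Gemini 2.5 models
};

// Map the generic config onto provider request parameters; the
// provider-side field name here is an assumption.
function toGoogleParams(cfg: LLMModelConfig) {
  return {
    model: cfg.model,
    maxOutputTokens: cfg.maxTokens,
    ...(cfg.maxReasoningTokens != null
      ? { thinkingBudget: cfg.maxReasoningTokens }
      : {}),
  };
}
```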

Highlights:

  • Fixed OpenAI o1 models - Resolved stream_options and system role compatibility issues (sketched below)
  • User-defined LLM tests - Complete test coverage for all supported providers (OpenAI, Google, Anthropic, Mistral, Custom OpenAI)
  • Migration framework - Added USE_NEWER_LC_IMPL flag to support gradual migration to unified implementation
  • Better testing tools - New table-based test interface with improved error display

All 27 LLM tests now pass, including the new user-defined LLM functionality. The changes maintain backward compatibility while providing a foundation for future LLM improvements.
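
The o1 fix highlighted above boils down to two small request-building changes. A minimal sketch in TypeScript, with illustrative types and function names; the PR's actual code lives in its unified LangChain implementation:

```ts
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildOpenAIParams(
  model: string,
  messages: ChatMessage[],
  stream: boolean,
) {
  const isO1 = model === "o1" || model.startsWith("o1-");

  // Fix 1: include stream_options only when streaming is enabled;
  // sending it on non-streaming requests is what broke o1 calls.
  const streamOptions = stream
    ? { stream_options: { include_usage: true } }
    : {};

  // Fix 2: o1 models reject the "system" role, so system messages are
  // omitted entirely for them.
  const msgs = isO1 ? messages.filter((m) => m.role !== "system") : messages;

  return { model, messages: msgs, stream, ...streamOptions };
}
```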

@haraldschilly force-pushed the llm-update-202507 branch 2 times, most recently from 2b1a5c4 to 02295d5 on July 17, 2025 09:48
@haraldschilly force-pushed the llm-update-202507 branch 2 times, most recently from 51375f8 to fae9f42 on July 17, 2025 13:00
@haraldschilly force-pushed the llm-update-202507 branch 2 times, most recently from 77c44d7 to 1928b93 on July 17, 2025 14:53
haraldschilly and others added 3 commits on July 18, 2025 10:04
- Fix o1 models stream_options error by only including stream_options when streaming is enabled
- Fix o1 models system role error by omitting system messages entirely (o1 models don't support system roles)
- Update tests to use USE_NEWER_LC_IMPL flag to switch between legacy and unified LangChain implementations
- Export USE_NEWER_LC_IMPL flag for test usage
- All 22 LLM tests now pass including both o1 and o1-mini models

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
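
The USE_NEWER_LC_IMPL flag this commit exports follows a standard migration-gate pattern: one flag selects between the legacy per-provider code paths and the unified LangChain implementation. A minimal sketch under that assumption; the function names and bodies here are stand-ins, not the PR's actual code:

```ts
// Migration flag: flip to false to fall back to the legacy code path.
export const USE_NEWER_LC_IMPL: boolean = true;

async function evaluateLegacy(prompt: string): Promise<string> {
  return `legacy: ${prompt}`; // stands in for the old per-provider code
}

async function evaluateWithLangChain(prompt: string): Promise<string> {
  return `unified: ${prompt}`; // stands in for the unified implementation
}

// Callers go through one entry point, so the gradual migration is a
// single-flag flip rather than a scattered rewrite.
export async function evaluate(prompt: string): Promise<string> {
  return USE_NEWER_LC_IMPL
    ? evaluateWithLangChain(prompt)
    : evaluateLegacy(prompt);
}
```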
- Add comprehensive test suite for user-defined LLMs
- Test OpenAI, Google, Anthropic, Mistral, and custom OpenAI models
- Create ephemeral test database account with proper user-defined LLM config storage
- Use environment variables for API keys (COCALC_TEST_*_KEY)
- Tests validate end-to-end functionality from database storage to LLM evaluation
- Update Anthropic model to use claude-3-5-haiku-latest alias
- All 5 user-defined LLM tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
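
The env-var gating this commit describes can be sketched as follows, assuming Jest and key names expanded from the COCALC_TEST_*_KEY pattern; the evaluateUserDefinedLLM helper is hypothetical:

```ts
const providers = [
  { name: "OpenAI", envKey: "COCALC_TEST_OPENAI_KEY" },
  { name: "Google", envKey: "COCALC_TEST_GOOGLE_KEY" },
  { name: "Anthropic", envKey: "COCALC_TEST_ANTHROPIC_KEY" },
  { name: "Mistral", envKey: "COCALC_TEST_MISTRAL_KEY" },
] as const;

// Hypothetical helper standing in for the end-to-end path from
// database-stored config to LLM evaluation.
declare function evaluateUserDefinedLLM(opts: {
  provider: string;
  apiKey: string;
}): Promise<string>;

for (const { name, envKey } of providers) {
  const apiKey = process.env[envKey];
  // Skip rather than fail when no key is configured, so the suite
  // still passes in environments without provider credentials.
  const run = apiKey ? test : test.skip;
  run(`user-defined ${name} LLM evaluates end-to-end`, async () => {
    const reply = await evaluateUserDefinedLLM({ provider: name, apiKey: apiKey! });
    expect(typeof reply).toBe("string");
  });
}
```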
@haraldschilly force-pushed the llm-update-202507 branch 3 times, most recently from 4b88bb5 to db444e6 on July 21, 2025 12:33
@haraldschilly marked this pull request as ready for review on July 21, 2025 12:58
@williamstein (Contributor) commented:
@haraldschilly what is the release process? E.g., what's the current coupling between frontend and backend for this? You usually do a good job explaining this. E.g., what happens if only the frontend is updated? Only the backend? Do I need to force people to refresh?

@williamstein merged commit 92554d0 into master on Jul 21, 2025 (3 checks passed)
@haraldschilly (Contributor, Author) commented:
Hmm, good question. Since the backend reports to the frontend which LLMs are available, it is fine to update the frontend first and then the backend. There was no change to the actual communication. Overall, both need to be updated.

@williamstein (Contributor) commented:
> Hmm, good question. Since the backend reports to the frontend which LLMs are available, it is fine to update the frontend first and then the backend. There was no change to the actual communication. Overall, both need to be updated.

Thanks - that's optimal! And of course it is all live right now.
