
fix: Resolve JSONDecodeError in LLM fine-tuning tune() API#2610

Open
Priyanshu-u07 wants to merge 3 commits into kubeflow:master from Priyanshu-u07:fix-llm-training-parameters-json

Conversation

@Priyanshu-u07

Description

This PR fixes a bug in the Katib Python SDK where LLM training parameters were passed as an improperly quoted JSON string when using the tune() API. The extra shell quoting caused the LLM worker pod (PyTorch container) to fail with a JSONDecodeError, preventing fine-tuning experiments from running correctly.

The implementation ensures that:

  • training_parameters and lora_config are serialized as valid JSON without extra quotes before being passed to the container.
  • A validation check is added to ensure trainer_parameters.training_parameters is not empty, raising a helpful error if missing.
  • Unit tests are added to verify both the JSON serialization and the validation behavior.

This resolves the LLM fine-tuning errors in the worker pod, allowing experiments using the tune() API to initialize and run successfully.
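A small sketch of the failure mode, assuming the pre-fix code wrapped the serialized dict in an extra pair of shell quotes before handing it to the container (the variable names here are illustrative, not taken from the SDK):

```python
import json

params = {"learning_rate": 1e-4, "num_train_epochs": 3}

# Pre-fix behavior (simplified): an extra layer of quoting around the JSON.
badly_quoted = f"'{json.dumps(params)}'"

# The PyTorch worker tries to parse the argument back and fails:
try:
    json.loads(badly_quoted)
    decode_failed = False
except json.JSONDecodeError:
    decode_failed = True  # the leading "'" is not valid JSON

# Without the extra quotes, the round trip succeeds:
restored = json.loads(json.dumps(params))
```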

Changes Included

sdk/python/v1beta1/kubeflow/katib/api/katib_client.py:

  • Fixed JSON serialization for training_parameters and lora_config container args by removing extra shell quoting.
  • Added type-safe json.dumps() conversion to ensure Kubernetes args are always strings.
  • Added early validation for missing training_parameters with a user-friendly error message linking to documentation.
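A minimal sketch of the change described above (the helper name and flag names are hypothetical; the real logic lives in `katib_client.py`):

```python
import json

def build_trainer_args(training_parameters, lora_config):
    # Hypothetical helper mirroring the katib_client.py change.
    # Early validation: fail fast with a helpful message.
    if not training_parameters:
        raise ValueError(
            "trainer_parameters.training_parameters must not be empty; "
            "see the Katib LLM fine-tuning documentation."
        )
    # json.dumps() already produces a valid JSON string; wrapping it in
    # extra shell quotes (the old behavior) breaks json.loads() in the pod.
    return [
        "--training_parameters",
        json.dumps(training_parameters),
        "--lora_config",
        json.dumps(lora_config),
    ]
```

Because `json.dumps()` always returns a `str`, this also keeps the Kubernetes container args type-safe without any manual `str()` casting.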

sdk/python/v1beta1/kubeflow/katib/api/katib_client_test.py:

  • Added a test case for missing training_parameters.
  • Added a test case to verify correct JSON serialization format in container arguments.
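The two test cases could look roughly like this (names and the stand-in helper are illustrative; the real tests live in `katib_client_test.py` and exercise the SDK directly):

```python
import json
import unittest

def serialize_training_parameters(training_parameters):
    # Stand-in for the SDK helper under test.
    if not training_parameters:
        raise ValueError("training_parameters must not be empty")
    return json.dumps(training_parameters)

class TrainerArgsTest(unittest.TestCase):
    def test_missing_training_parameters_raises(self):
        # Empty parameters should produce a clear error, not a pod crash.
        with self.assertRaises(ValueError):
            serialize_training_parameters({})

    def test_container_arg_is_valid_json(self):
        arg = serialize_training_parameters({"num_train_epochs": 3})
        # No surrounding shell quotes; the value must round-trip cleanly.
        self.assertFalse(arg.startswith("'"))
        self.assertEqual(json.loads(arg), {"num_train_epochs": 3})
```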

Testing:

  • All 25 existing tests pass.
  • New unit tests verify that the bug is fixed and the container receives correctly formatted JSON.

Fixes #2587

Checklist:

  • Docs included if any changes are user facing

@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gaocegege for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions

🎉 Welcome to the Kubeflow Katib repo! 🎉

Thanks for opening your first PR! We're excited to have you onboard 🚀

Next steps:

Feel free to ask questions in the comments. Thanks again for contributing! 🙏

Signed-off-by: Priyanshu-u07 <connect.priyanshu8271@gmail.com>
Signed-off-by: Priyanshu-u07 <connect.priyanshu8271@gmail.com>
@Priyanshu-u07 Priyanshu-u07 force-pushed the fix-llm-training-parameters-json branch from cefc33f to 39cdc41 Compare January 29, 2026 21:29
@google-oss-prow google-oss-prow bot added size/M and removed size/L labels Jan 29, 2026
Signed-off-by: Priyanshu-u07 <connect.priyanshu8271@gmail.com>

Development

Successfully merging this pull request may close these issues.

LLM fine-tuning errors in LLM worker pod with PyTorch container
