You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The Kubeflow SDK allows users with limited Kubernetes knowledge to use standard Python APIs to interact with the Kubeflow ecosystem. Documentation: https://sdk.kubeflow.org/en/latest/index.html
271
+
272
+
Most of us use LLMs to create/debug code for jobs, models, etc., but currently there is no mechanism for the LLM to see TrainJob status, debug a crash loop, or provide consolidated metrics about previous tasks. We want to extend and improve the Developer Experience (DX) with a Model Context Protocol (MCP) server for the Kubeflow ecosystem.
273
+
274
+
We have a [kubeflow/community#936](https://github.com/kubeflow/community/issues/936) and an existing MVP for this project. The contributor will extend the MCP server to cover additional use cases, improve error handling, add comprehensive documentation, and potentially integrate with other Kubeflow components like Model Registry.
275
+
276
+
**Core Deliverables:**
277
+
278
+
- MCP tools for TrainJob lifecycle (`fine_tune`, `get_training_job`, `list_training_jobs`, `delete_training_job`)
0 commit comments