Skip to content

[BUG] Annotations from engine/router on pods not passed down to pod spec reconcile #417

@beiguo218

Description

@beiguo218

What happened?

With changes in PR: #391, we found the metadatas (annotations/labels) from engine/router on pods not being passed down to the pod spec reconcile logic so causing wrong determinations in pod spec reconcile logic where we need to rely on the values of those annotations/labels.

The wrong determinations we found so far in the pod spec reconcile logic (caused by #391)

  1. wrong env variable added;
  2. wrong volume mount added;
  3. wrong volume added.

What did you expect to happen?

It is okay to make changes to avoid unnecessary annotations/labels added to the unwanted components. The expectation is for the places where the logic needs to rely on those metadatas (annotations/labels from engine/router on pods), we need to pass down the correct metadatas, to make sure the existing reconcile logic won't break.

The expectation for such changes:

  1. Make sure to mention explicitly in PR description what components you target to change to have different metadata;
  2. Make sure you still pass the original metadata when there is a need to rely on its value.

Under the case your goal has conflicts with current reconciling logic, you should open an issue.

How can we reproduce it (as minimally and precisely as possible)?

Under the case where the runtime spec is with an annotation ome.io/inject-model-init: 'true' under engineConfig which we use to inject the model init for model enigma usage, with changes from #391, this annotations from engine won't be passed down to pod spec reconcile logic so causing unwanted env variable (MODEL_PATH) and unwanted volume mount added to our serving container.

Example runtime with ome.io/inject-model-init: 'true' annotation:

apiVersion: ome.io/v1beta1
kind: ClusterServingRuntime
metadata:
  annotations:
    meta.helm.sh/release-name: ome-model-serving
    meta.helm.sh/release-namespace: ome
  labels:
    app.kubernetes.io/managed-by: Helm
  name: sample-runtime
spec:
  engineConfig:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: node.kubernetes.io/instance-type
                  operator: In
                  values:
                    - BM.GPU.H100.8
    annotations:
      ome.io/inject-model-init: 'true'
    labels:
      logging-forward: enabled
    runner:
      image: sample-image:sample-tag
      ports:
        - containerPort: 8080
          name: http1
          protocol: TCP
      resources:
        limits:
          nvidia.com/gpu: '2'
        requests:
          cpu: 16
          memory: 120Gi
          nvidia.com/gpu: '2'
      volumeMounts:
        - mountPath: /dev/shm
          name: shared-memory
        - mountPath: /opt/ml/model
          name: model-empty-dir
    tolerations:
      - effect: NoSchedule
        key: nvidia.com/gpu
        operator: Exists
    volumes:
      - emptyDir:
          medium: Memory
          sizeLimit: 1Gi
        name: shared-memory
      - emptyDir:
          medium: Memory
        name: model-empty-dir
  modelSizeRange:
    max: 120B
    min: 100B
  supportedModelFormats:
    - autoSelect: true
      modelArchitecture: CohereForCausalLM
      modelFormat:
        name: tensorrtllm
        operator: Equal
        version: v0.8.0
        weight: 1
      modelFramework:
        name: tensorrtllm
        operator: Equal
        version: v0.8.0
        weight: 1
      version: '1.0'

Anything else we need to know?

It is related to changes in PR: #391

PR is reverted, please follow this issue to make corresponding changes moving forward.

Environment

  • OME version:
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g., from /etc/os-release):
  • Runtime (SGLang, vLLM, etc.) and version:
  • Model being served (if applicable):
  • Install method (Helm, kubectl, etc.):

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions