
panic on the pod view during initContainerStats #3866

@gberche-orange

Description

Describe the bug

When displaying the pod view under specific conditions, k9s panics with the following stack trace:

panic: runtime error: index out of range [1] with length 1
goroutine 440 [running]:
github.com/derailed/k9s/internal/render.(*Pod).initContainerStats(...)
github.com/derailed/k9s/internal/render/pod.go:430
github.com/derailed/k9s/internal/render.(*Pod).defaultRow(0xc00020a298, 0xc0044d4450, 0xc004298918)
github.com/derailed/k9s/internal/render/pod.go:165 +0x1775
github.com/derailed/k9s/internal/render.(*Pod).Render(0xc00020a298, {0x64a9460?, 0xc0044d4450?}, {0x8?, 0x1?}, 0xc004298918)
github.com/derailed/k9s/internal/render/pod.go:137 +0x69
github.com/derailed/k9s/internal/model1.Hydrate.func1({0x7998be0?, 0xa8e1680?})
github.com/derailed/k9s/internal/model1/helpers.go:31 +0xfe
github.com/derailed/k9s/internal/model1.(*WorkerPool).Add.func1({0x7998be0?, 0xa8e1680?}, 0xc0000bafa0?, 0x73f85b0?, 0xc00050e070)
github.com/derailed/k9s/internal/model1/pool.go:57 +0x7e
created by github.com/derailed/k9s/internal/model1.(*WorkerPool).Add in goroutine 232
github.com/derailed/k9s/internal/model1/pool.go:52 +0x136

The `defaultRow` frame in the trace maps to https://github.com/derailed/k9s/blob/d9b3cea9dd7aa7704905d7264eae4da92210b2ae/internal/render/pod.go#L152-L165 in the source.

The following workaround attempts were unsuccessful:

  • disabling the metrics-server in the k8s API
  • setting `disablePodCounting: true` in the k9s config

To Reproduce

We're still unclear on the exact root cause that triggers the panic. We're reproducing it against a vcluster v0.30.4 k8s server running on top of GKE v1.33.5-gke.2228001.

Steps to reproduce the behavior:

k9s -n my-namespace -c pods

Historical Documents

After enabling k8s API audit logs (as a workaround for the lack, AFAIK, of client-side request tracing in k9s; see #3741), it seems that k9s crashes on the response from the following endpoint:
/api/v1/namespaces/k8saas-xplane-6c9dc9a2-6ab2-468a-98cd-4439f1655d55/pods?allowWatchBookmarks=true&resourceVersionMatch=NotOlderThan&sendInitialEvents=true&timeoutSeconds=404&watch=true
I'm suspecting an unusual pod status response from the k8s API server.

It seems to be occurring with pods in a failing/pending state, such as:

kubectl get pods -n k8saas-xplane-6c9dc9a2-6ab2-468a-98cd-4439f1655d55 --watch 
NAME                                                       READY   STATUS             RESTARTS          AGE
k8s-0826388a-ec0c-4c29-8f1c-ab232022f12b-0                 1/1     Failed             0                 11d
k8s-0826388a-ec0c-4c29-8f1c-ab232022f12b-95f49b84c-b5qf7   0/1     Running            154 (5m25s ago)   10h
vcluster-7b55cf74b7-l7svs                                  0/1     Running            83 (3m35s ago)    10h
vcluster-etcd-0                                            1/1     Failed             0                 11d
vcluster-etcd-1                                            0/1     Completed          0                 11d
vcluster-etcd-2                                            1/1     Failed             0                 11d
vk9s-666c6f8895-cmx7j                                      1/1     Running            0                 10h
vk9s-8fc7fcb6f-drzdw                                       0/1     ImagePullBackOff   0                 63m

We're trying to isolate a simple test case and provide the response payload for easier diagnosis.

Expected behavior
No k9s crash; instead, an error message should be displayed.

Versions (please complete the following information):

  • OS: Linux
  • K9s: 0.50.18
  • K8s: a vcluster v0.30.4 k8s server on top of a gke v1.33.5-gke.2228001
