Commit d5fb05f

Use Vector to collect cluster observability information (#732)
* Use Fluent* to persist cluster logs
* Update time slice format
* Update charts/studio/values.yaml
* Update charts/studio/values.yaml
* Update Chart.yaml
* Update values.yaml
* Update values.yaml
* Update values.yaml
* Helm-Docs update
* Update helm-lint-and-install.yaml
* fixup! Update time slice format
* fixup! fixup! Update time slice format
* Helm-Docs update
* Migrate from Fluent Bit to Vector
* Update helm-lint-and-install.yaml
* Helm-Docs update
* Lint
* Helm-Docs update
1 parent ba8a3c9 commit d5fb05f

5 files changed, +381 -4 lines changed

.github/workflows/helm-lint-and-install.yaml

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ jobs:
       - name: Add external repositories
         run: |
           helm repo add bitnami https://charts.bitnami.com/bitnami
+          helm repo add vector https://helm.vector.dev
 
       - name: Run chart-testing (list-changed)
         id: list-changed

charts/studio/Chart.lock

Lines changed: 8 additions & 2 deletions
@@ -8,5 +8,11 @@ dependencies:
 - name: clickhouse
   repository: https://charts.bitnami.com/bitnami
   version: 9.2.2
-digest: sha256:d86442027fdeecf48ae3e1df3b660cececd2faf2d6ca0e2d36b3e0a9c1ceb2ef
-generated: "2025-05-09T22:14:24.213857777Z"
+- name: vector
+  repository: https://helm.vector.dev
+  version: 0.45.0
+- name: vector
+  repository: https://helm.vector.dev
+  version: 0.45.0
+digest: sha256:971a3d5864e123dad05c065bd32b04f1a3ece329c41aa0c08ddd256c2cbc12f7
+generated: "2025-09-09T22:38:17.455796+02:00"

charts/studio/Chart.yaml

Lines changed: 11 additions & 1 deletion
@@ -2,7 +2,7 @@ apiVersion: v2
 name: studio
 description: A Helm chart for Kubernetes
 type: application
-version: 0.18.102
+version: 0.18.103
 appVersion: "v2.207.3"
 maintainers:
   - name: iterative
@@ -21,3 +21,13 @@ dependencies:
     condition: clickhouse.enabled
     version: "9.2.2"
     repository: "https://charts.bitnami.com/bitnami"
+  - name: vector
+    condition: vector-agent.enabled
+    version: "0.45.0"
+    repository: "https://helm.vector.dev"
+    alias: vector-agent
+  - name: vector
+    condition: vector-aggregator.enabled
+    version: "0.45.0"
+    repository: "https://helm.vector.dev"
+    alias: vector-aggregator
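
Both new dependency entries pull the same `vector` chart and are told apart only by `alias`, so each one is switched on through its own `condition` key. A minimal values override that turns both on might look like the sketch below. The file name is hypothetical, and both flags default to `false` according to the chart README.

```yaml
# values-observability.yaml (hypothetical file name)
# Each top-level key matches an `alias` declared in Chart.yaml; `enabled` is
# the value that the corresponding `condition` points at.
vector-agent:
  enabled: true   # node-level collector (role: Agent)
vector-aggregator:
  enabled: true   # receives agent traffic on port 6000 (role: Aggregator)
```

Enabling only the agent would leave its `vector_aggregator` sink pointing at `studio-vector-aggregator:6000` with nothing listening, so the two flags are probably best flipped together.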

charts/studio/README.md

Lines changed: 24 additions & 1 deletion
@@ -1,6 +1,6 @@
 # studio
 
-![Version: 0.18.102](https://img.shields.io/badge/Version-0.18.102-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v2.207.3](https://img.shields.io/badge/AppVersion-v2.207.3-informational?style=flat-square)
+![Version: 0.18.103](https://img.shields.io/badge/Version-0.18.103-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v2.207.3](https://img.shields.io/badge/AppVersion-v2.207.3-informational?style=flat-square)
 
 A Helm chart for Kubernetes
 
@@ -17,6 +17,8 @@ A Helm chart for Kubernetes
 | https://charts.bitnami.com/bitnami | clickhouse | 9.2.2 |
 | https://charts.bitnami.com/bitnami | postgresql | 16.7.2 |
 | https://charts.bitnami.com/bitnami | redis | 21.0.2 |
+| https://helm.vector.dev | vector-agent(vector) | 0.45.0 |
+| https://helm.vector.dev | vector-aggregator(vector) | 0.45.0 |
 
 ## Values
 
@@ -245,6 +247,27 @@ A Helm chart for Kubernetes
 | studioWorker.strategy | object | `{"rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}` | Worker deployment strategy |
 | studioWorker.terminationGracePeriodSeconds | int | `150` | Worker termination grace period |
 | studioWorker.tolerations | list | `[]` | Worker tolerations |
+| vector-agent | object | `{"customConfig":{"api":{"enabled":false},"data_dir":"/data/vector","expire_metrics_secs":60,"sinks":{"vector_aggregator":{"address":"studio-vector-aggregator:6000","compression":true,"inputs":["kubernetes_logs_filtered","kubernetes_metrics_filtered","kubernetes_metrics_cadvisor_filtered"],"type":"vector"}},"sources":{"kubernetes_logs":{"ignore_older_secs":600,"type":"kubernetes_logs"},"kubernetes_metrics":{"auth":{"strategy":"bearer","token":"${KUBERNETES_SERVICE_ACCOUNT_TOKEN:?}"},"endpoints":["https://${KUBERNETES_NODE_IP}:10250/metrics"],"scrape_interval_secs":30,"tls":{"verify_certificate":false},"type":"prometheus_scrape"},"kubernetes_metrics_cadvisor":{"auth":{"strategy":"bearer","token":"${KUBERNETES_SERVICE_ACCOUNT_TOKEN:?}"},"endpoints":["https://${KUBERNETES_NODE_IP}:10250/metrics/cadvisor"],"scrape_interval_secs":30,"tls":{"verify_certificate":false},"type":"prometheus_scrape"}},"transforms":{"kubernetes_logs_filtered":{"inputs":["kubernetes_logs"],"source":". = {\n \"message\": .message,\n \"source_type\": .source_type,\n \"stream\": .stream,\n \"timestamp\": .timestamp,\n \"kubernetes\": {\n \"pod_name\": .kubernetes.pod_name,\n \"namespace\": .kubernetes.pod_namespace,\n \"container_name\": .kubernetes.container_name\n }\n}\n","type":"remap"},"kubernetes_metrics_cadvisor_filtered":{"condition":".name == \"node_cpu_usage_seconds_total\" || .name == \"node_memory_working_set_bytes\" || .name == \"container_cpu_usage_seconds_total\" || .name == \"container_memory_working_set_bytes\" || .name == \"container_start_time_seconds\"","inputs":["kubernetes_metrics_cadvisor"],"type":"filter"},"kubernetes_metrics_filtered":{"condition":"starts_with!(.name, \"kubelet_volume_stats_\") || .name == \"kubelet_image_pull_duration_seconds\"","inputs":["kubernetes_metrics"],"type":"filter"}}},"enabled":false,"env":[{"name":"KUBERNETES_SERVICE_ACCOUNT_TOKEN","valueFrom":{"secretKeyRef":{"key":"token","name":"studio-vector-agent-token"}}},{"name":"KUBERNETES_NODE_IP","valueFrom":{"fieldRef":{"fieldPath":"status.hostIP"}}}],"extraObjects":[{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"helm.sh/hook":"pre-install","helm.sh/hook-delete-policy":"hook-failed, before-hook-creation"},"name":"studio-vector-agent-extended"},"rules":[{"apiGroups":[""],"resources":["nodes/metrics","nodes/stats"],"verbs":["get"]}]},{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{"helm.sh/hook":"pre-install","helm.sh/hook-delete-policy":"hook-failed, before-hook-creation"},"name":"studio-vector-agent-extended"},"roleRef":{"kind":"ClusterRole","name":"studio-vector-agent-extended"},"subjects":[{"kind":"ServiceAccount","name":"studio-vector-agent","namespace":"default"}]},{"apiVersion":"v1","kind":"Secret","metadata":{"annotations":{"kubernetes.io/service-account.name":"studio-vector-agent"},"name":"studio-vector-agent-token"},"type":"kubernetes.io/service-account-token"}],"fullnameOverride":"studio-vector-agent","image":{"base":"alpine"},"role":"Agent","serviceAccount":{"create":true,"name":"studio-vector-agent"},"tolerations":[{"operator":"Exists"}]}` | Vector Agent configuration for log collection (DaemonSet) |
+| vector-agent.customConfig | object | `{"api":{"enabled":false},"data_dir":"/data/vector","expire_metrics_secs":60,"sinks":{"vector_aggregator":{"address":"studio-vector-aggregator:6000","compression":true,"inputs":["kubernetes_logs_filtered","kubernetes_metrics_filtered","kubernetes_metrics_cadvisor_filtered"],"type":"vector"}},"sources":{"kubernetes_logs":{"ignore_older_secs":600,"type":"kubernetes_logs"},"kubernetes_metrics":{"auth":{"strategy":"bearer","token":"${KUBERNETES_SERVICE_ACCOUNT_TOKEN:?}"},"endpoints":["https://${KUBERNETES_NODE_IP}:10250/metrics"],"scrape_interval_secs":30,"tls":{"verify_certificate":false},"type":"prometheus_scrape"},"kubernetes_metrics_cadvisor":{"auth":{"strategy":"bearer","token":"${KUBERNETES_SERVICE_ACCOUNT_TOKEN:?}"},"endpoints":["https://${KUBERNETES_NODE_IP}:10250/metrics/cadvisor"],"scrape_interval_secs":30,"tls":{"verify_certificate":false},"type":"prometheus_scrape"}},"transforms":{"kubernetes_logs_filtered":{"inputs":["kubernetes_logs"],"source":". = {\n \"message\": .message,\n \"source_type\": .source_type,\n \"stream\": .stream,\n \"timestamp\": .timestamp,\n \"kubernetes\": {\n \"pod_name\": .kubernetes.pod_name,\n \"namespace\": .kubernetes.pod_namespace,\n \"container_name\": .kubernetes.container_name\n }\n}\n","type":"remap"},"kubernetes_metrics_cadvisor_filtered":{"condition":".name == \"node_cpu_usage_seconds_total\" || .name == \"node_memory_working_set_bytes\" || .name == \"container_cpu_usage_seconds_total\" || .name == \"container_memory_working_set_bytes\" || .name == \"container_start_time_seconds\"","inputs":["kubernetes_metrics_cadvisor"],"type":"filter"},"kubernetes_metrics_filtered":{"condition":"starts_with!(.name, \"kubelet_volume_stats_\") || .name == \"kubelet_image_pull_duration_seconds\"","inputs":["kubernetes_metrics"],"type":"filter"}}}` | Vector Agent configuration |
+| vector-agent.enabled | bool | `false` | Vector Agent enabled |
+| vector-agent.fullnameOverride | string | `"studio-vector-agent"` | Vector Agent name override |
+| vector-agent.image.base | string | `"alpine"` | The base to use for Vector's image. |
+| vector-agent.role | string | `"Agent"` | Deploy as DaemonSet for log collection from all nodes |
+| vector-agent.serviceAccount | object | `{"create":true,"name":"studio-vector-agent"}` | Vector Agent service account |
+| vector-agent.tolerations | list | `[{"operator":"Exists"}]` | Vector Agent tolerations |
+| vector-aggregator | object | `{"args":["while sleep 60; do find /data/vector/logs -type f -mtime +7 -delete; done &\nexec /usr/local/bin/vector --config-dir /etc/vector/"],"command":["/bin/sh","-c"],"customConfig":{"api":{"address":"0.0.0.0:8686","enabled":true,"playground":true},"data_dir":"/data/vector","expire_metrics_secs":60,"sinks":{"events_file":{"encoding":{"codec":"json"},"inputs":["kubernetes_events_deduped"],"path":"/data/vector/events/%Y-%m-%d.log","type":"file"},"logs_file":{"encoding":{"codec":"json"},"inputs":["vector_agent_route.logs"],"path":"/data/vector/logs/%Y-%m-%d-{{ \"{{\" }} .kubernetes.pod_name {{ \"}}\" }}.log","type":"file"},"metrics_file":{"encoding":{"codec":"json"},"inputs":["vector_agent_route.metrics"],"path":"/data/vector/metrics/%Y-%m-%d.log","type":"file"}},"sources":{"kubernetes_events":{"auth":{"strategy":"bearer","token":"${KUBERNETES_SERVICE_ACCOUNT_TOKEN:?}"},"decoding":{"codec":"json"},"endpoint":"https://kubernetes.default.svc:443/api/v1/events","headers":{"Accept":["application/json"]},"scrape_interval_secs":30,"tls":{"ca_file":"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"},"type":"http_client"},"vector_agent":{"address":"0.0.0.0:6000","type":"vector","version":"2"}},"transforms":{"kubernetes_events_deduped":{"fields":{"match":["node_name","object_name","message","timestamp"]},"inputs":["kubernetes_events_normalized"],"type":"dedupe"},"kubernetes_events_normalized":{"inputs":["kubernetes_events_unnested"],"source":". = {\n \"run_id\": null,\n \"node_name\": .items.reportingInstance,\n \"object_name\": .items.involvedObject.name,\n \"timestamp\": .items.lastTimestamp,\n \"message\": .items.message,\n}\n","type":"remap"},"kubernetes_events_unnested":{"inputs":["kubernetes_events"],"source":". = unnest!(.items)\n","type":"remap"},"vector_agent_route":{"inputs":["vector_agent"],"route":{"logs":{"type":"is_log"},"metrics":{"type":"is_metric"}},"type":"route"}}},"enabled":false,"env":[{"name":"KUBERNETES_SERVICE_ACCOUNT_TOKEN","valueFrom":{"secretKeyRef":{"key":"token","name":"studio-vector-aggregator-token"}}}],"extraObjects":[{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"helm.sh/hook":"pre-install","helm.sh/hook-delete-policy":"hook-failed, before-hook-creation"},"name":"studio-vector-aggregator-extended"},"rules":[{"apiGroups":[""],"resources":["events"],"verbs":["get","list"]}]},{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{"helm.sh/hook":"pre-install","helm.sh/hook-delete-policy":"hook-failed, before-hook-creation"},"name":"studio-vector-aggregator-extended"},"roleRef":{"kind":"ClusterRole","name":"studio-vector-aggregator-extended"},"subjects":[{"kind":"ServiceAccount","name":"studio-vector-aggregator","namespace":"default"}]},{"apiVersion":"v1","kind":"Secret","metadata":{"annotations":{"kubernetes.io/service-account.name":"studio-vector-aggregator"},"name":"studio-vector-aggregator-token"},"type":"kubernetes.io/service-account-token"}],"fullnameOverride":"studio-vector-aggregator","image":{"base":"alpine"},"persistence":{"accessModes":["ReadWriteOnce"],"enabled":true,"size":"64Gi","storageClass":""},"replicaCount":1,"resources":{"limits":{"memory":"512Mi"},"requests":{"cpu":"200m","memory":"256Mi"}},"role":"Aggregator","service":{"enabled":true,"ports":[{"name":"logs","port":6000,"protocol":"TCP","targetPort":6000},{"name":"api","port":8686,"protocol":"TCP","targetPort":8686}],"type":"ClusterIP"},"serviceAccount":{"create":true,"name":"studio-vector-aggregator"}}` | Vector Aggregator configuration for log aggregation and processing |
+| vector-aggregator.args | list | `["while sleep 60; do find /data/vector/logs -type f -mtime +7 -delete; done &\nexec /usr/local/bin/vector --config-dir /etc/vector/"]` | Vector arguments. |
+| vector-aggregator.command | list | `["/bin/sh","-c"]` | Vector command. |
+| vector-aggregator.customConfig | object | `{"api":{"address":"0.0.0.0:8686","enabled":true,"playground":true},"data_dir":"/data/vector","expire_metrics_secs":60,"sinks":{"events_file":{"encoding":{"codec":"json"},"inputs":["kubernetes_events_deduped"],"path":"/data/vector/events/%Y-%m-%d.log","type":"file"},"logs_file":{"encoding":{"codec":"json"},"inputs":["vector_agent_route.logs"],"path":"/data/vector/logs/%Y-%m-%d-{{ \"{{\" }} .kubernetes.pod_name {{ \"}}\" }}.log","type":"file"},"metrics_file":{"encoding":{"codec":"json"},"inputs":["vector_agent_route.metrics"],"path":"/data/vector/metrics/%Y-%m-%d.log","type":"file"}},"sources":{"kubernetes_events":{"auth":{"strategy":"bearer","token":"${KUBERNETES_SERVICE_ACCOUNT_TOKEN:?}"},"decoding":{"codec":"json"},"endpoint":"https://kubernetes.default.svc:443/api/v1/events","headers":{"Accept":["application/json"]},"scrape_interval_secs":30,"tls":{"ca_file":"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"},"type":"http_client"},"vector_agent":{"address":"0.0.0.0:6000","type":"vector","version":"2"}},"transforms":{"kubernetes_events_deduped":{"fields":{"match":["node_name","object_name","message","timestamp"]},"inputs":["kubernetes_events_normalized"],"type":"dedupe"},"kubernetes_events_normalized":{"inputs":["kubernetes_events_unnested"],"source":". = {\n \"run_id\": null,\n \"node_name\": .items.reportingInstance,\n \"object_name\": .items.involvedObject.name,\n \"timestamp\": .items.lastTimestamp,\n \"message\": .items.message,\n}\n","type":"remap"},"kubernetes_events_unnested":{"inputs":["kubernetes_events"],"source":". = unnest!(.items)\n","type":"remap"},"vector_agent_route":{"inputs":["vector_agent"],"route":{"logs":{"type":"is_log"},"metrics":{"type":"is_metric"}},"type":"route"}}}` | Vector Aggregator configuration |
+| vector-aggregator.enabled | bool | `false` | Vector Aggregator enabled |
+| vector-aggregator.fullnameOverride | string | `"studio-vector-aggregator"` | Vector Aggregator name override |
+| vector-aggregator.image.base | string | `"alpine"` | The base to use for Vector's image. |
+| vector-aggregator.persistence | object | `{"accessModes":["ReadWriteOnce"],"enabled":true,"size":"64Gi","storageClass":""}` | Vector Aggregator persistence configuration |
+| vector-aggregator.replicaCount | int | `1` | Vector Aggregator replica count |
+| vector-aggregator.resources | object | `{"limits":{"memory":"512Mi"},"requests":{"cpu":"200m","memory":"256Mi"}}` | Vector Aggregator resources |
+| vector-aggregator.role | string | `"Aggregator"` | Deploy as StatefulSet for aggregation and persistence |
+| vector-aggregator.service | object | `{"enabled":true,"ports":[{"name":"logs","port":6000,"protocol":"TCP","targetPort":6000},{"name":"api","port":8686,"protocol":"TCP","targetPort":8686}],"type":"ClusterIP"}` | Vector Aggregator service configuration |
+| vector-aggregator.serviceAccount | object | `{"create":true,"name":"studio-vector-aggregator"}` | Vector Aggregator service account |
 
 ----------------------------------------------
 Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
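
The aggregator defaults in the Values table can be tuned from a values file without copying the whole `vector-aggregator` object, because Helm merges map values key by key. The sketch below only touches keys that appear in the table; the file itself and the sizes in it are illustrative assumptions, not recommendations.

```yaml
# Hypothetical override file. Only the listed keys change; every other key
# under vector-aggregator keeps its chart default.
vector-aggregator:
  enabled: true
  persistence:
    size: 128Gi      # assumed value; the documented default is 64Gi
  resources:
    limits:
      memory: 1Gi    # assumed value; the documented default limit is 512Mi
```

List values such as `args`, `env`, and `extraObjects` are replaced wholesale rather than merged, so an override of any of them needs to restate every entry it wants to keep.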
