Commit 8ea2552

Add comprehensive metrics API documentation
Co-authored-by: brendandburns <5751682+brendandburns@users.noreply.github.com>
1 parent 7817229 commit 8ea2552

1 file changed

+302
-0
lines changed

kubernetes/docs/metrics.md

Lines changed: 302 additions & 0 deletions
@@ -0,0 +1,302 @@
# Kubernetes Metrics API Support

This document describes how to use the metrics utilities in the Kubernetes Python client to access resource usage data from the metrics-server.

## Overview

The metrics utilities provide easy access to pod and node resource consumption data (CPU and memory) through the `metrics.k8s.io/v1beta1` API. This enables monitoring and autoscaling workflows directly from Python applications.

## Prerequisites

- A running Kubernetes cluster with [metrics-server](https://github.com/kubernetes-sigs/metrics-server) installed
- Kubernetes Python client library installed
- Appropriate RBAC permissions to access metrics API endpoints

## Installation

The metrics utilities are included in the `kubernetes.utils` module:

```python
from kubernetes import client, config, utils
```

## Quick Start

```python
from kubernetes import client, config, utils

# Load kubernetes configuration
config.load_kube_config()

# Create API client
api_client = client.ApiClient()

# Get node metrics
node_metrics = utils.get_nodes_metrics(api_client)
for node in node_metrics['items']:
    print(f"{node['metadata']['name']}: {node['usage']}")

# Get pod metrics in a namespace
pod_metrics = utils.get_pods_metrics(api_client, 'default')
for pod in pod_metrics['items']:
    print(f"Pod: {pod['metadata']['name']}")
    for container in pod['containers']:
        print(f"  {container['name']}: {container['usage']}")
```

## API Reference

### `get_nodes_metrics(api_client)`

Fetches current resource usage for all nodes in the cluster.

**Parameters:**
- `api_client` (kubernetes.client.ApiClient): Configured API client instance

**Returns:**
- dict: Response containing node metrics with structure:
  ```python
  {
      'kind': 'NodeMetricsList',
      'apiVersion': 'metrics.k8s.io/v1beta1',
      'items': [
          {
              'metadata': {'name': 'node-name', ...},
              'timestamp': '2024-01-01T00:00:00Z',
              'window': '30s',
              'usage': {'cpu': '100m', 'memory': '1Gi'}
          }
      ]
  }
  ```

**Raises:**
- `ApiException`: When the metrics server is unavailable or the request fails

**Example:**
```python
metrics = utils.get_nodes_metrics(api_client)
for node in metrics['items']:
    name = node['metadata']['name']
    cpu = node['usage']['cpu']
    memory = node['usage']['memory']
    print(f"Node {name}: CPU={cpu}, Memory={memory}")
```
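Each item also carries `timestamp` and `window`, which together describe the interval the sample covers (the window ends at the timestamp). The sketch below shows one way to turn those two fields into a start and end time. It assumes `window` is a Go-style duration string such as `'30s'` or `'1m0s'` and that `timestamp` has whole-second precision with a trailing `Z`; `parse_window` and `sample_interval` are hypothetical helpers, not part of the client library.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit for the subset of Go duration units handled in this sketch.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_window(window: str) -> timedelta:
    """Parse a Go-style duration such as '30s' or '1m0s' (hypothetical helper)."""
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", window)
    if not parts:
        raise ValueError(f"unsupported window: {window!r}")
    seconds = sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)
    return timedelta(seconds=seconds)


def sample_interval(item: dict) -> tuple:
    """Return (start, end) of the interval a metrics item covers."""
    end = datetime.strptime(item["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    end = end.replace(tzinfo=timezone.utc)
    return end - parse_window(item["window"]), end


# Sample item shaped like the NodeMetricsList entries documented above.
node = {
    "metadata": {"name": "node-name"},
    "timestamp": "2024-01-01T00:00:00Z",
    "window": "30s",
    "usage": {"cpu": "100m", "memory": "1Gi"},
}
start, end = sample_interval(node)
print(f"{node['metadata']['name']}: {start.isoformat()} to {end.isoformat()}")
```

Comparing `end` with the current time gives a rough staleness check before acting on a sample.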

### `get_pods_metrics(api_client, namespace, label_selector=None)`

Fetches current resource usage for pods in a specific namespace.

**Parameters:**
- `api_client` (kubernetes.client.ApiClient): Configured API client instance
- `namespace` (str): Kubernetes namespace to query (required)
- `label_selector` (str, optional): Label selector to filter pods (e.g., `'app=nginx,env=prod'`)

**Returns:**
- dict: Response containing pod metrics with structure:
  ```python
  {
      'kind': 'PodMetricsList',
      'apiVersion': 'metrics.k8s.io/v1beta1',
      'items': [
          {
              'metadata': {'name': 'pod-name', 'namespace': 'default', ...},
              'timestamp': '2024-01-01T00:00:00Z',
              'window': '30s',
              'containers': [
                  {
                      'name': 'container-name',
                      'usage': {'cpu': '50m', 'memory': '512Mi'}
                  }
              ]
          }
      ]
  }
  ```

**Raises:**
- `ValueError`: When namespace is None or empty
- `ApiException`: When the metrics server is unavailable or the request fails

**Examples:**
```python
# Get all pod metrics in namespace
metrics = utils.get_pods_metrics(api_client, 'default')

# Get pods matching labels
metrics = utils.get_pods_metrics(api_client, 'production', 'app=nginx')
metrics = utils.get_pods_metrics(api_client, 'prod', 'tier=frontend,env=staging')

# Process the results
for pod in metrics['items']:
    pod_name = pod['metadata']['name']
    for container in pod['containers']:
        container_name = container['name']
        cpu = container['usage']['cpu']
        memory = container['usage']['memory']
        print(f"{pod_name}/{container_name}: CPU={cpu}, Memory={memory}")
```
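The `label_selector` strings in these examples are equality-based selectors joined by commas, which Kubernetes treats as a logical AND. When the labels already live in a dict, a small helper can build the string. This is a minimal sketch; `build_label_selector` is a hypothetical name and only covers `key=value` terms, not set-based selectors.

```python
def build_label_selector(labels: dict) -> str:
    """Join key/value pairs into an equality-based selector (hypothetical helper)."""
    # Sorting keeps the output stable regardless of dict insertion order.
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


selector = build_label_selector({"tier": "frontend", "env": "staging"})
print(selector)  # env=staging,tier=frontend
```

The resulting string can be passed as the third argument to `get_pods_metrics`.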

### `get_pods_metrics_in_all_namespaces(api_client, namespaces, label_selector=None)`

Fetches pod metrics across multiple namespaces.

**Parameters:**
- `api_client` (kubernetes.client.ApiClient): Configured API client instance
- `namespaces` (list of str): List of namespace names to query
- `label_selector` (str, optional): Label selector applied to all namespaces

**Returns:**
- dict: Maps namespace names to their metrics or error information:
  ```python
  {
      'namespace-1': {
          'kind': 'PodMetricsList',
          'items': [...]
      },
      'namespace-2': {
          'kind': 'Error',
          'error': 'error message'
      }
  }
  ```

**Example:**
```python
namespaces = ['default', 'kube-system', 'production']
all_metrics = utils.get_pods_metrics_in_all_namespaces(api_client, namespaces)

for ns, result in all_metrics.items():
    if 'error' in result:
        print(f"{ns}: ERROR - {result['error']}")
    else:
        pod_count = len(result.get('items', []))
        print(f"{ns}: {pod_count} pods")
```
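Because one failing namespace does not abort the whole call, callers usually want to separate successful results from errors before processing them. The following sketch does that on a sample dict shaped like the return value documented above; `split_results` is a hypothetical helper and the sample data is made up for illustration.

```python
def split_results(all_metrics: dict) -> tuple:
    """Split per-namespace results into (items by namespace, errors by namespace)."""
    ok, failed = {}, {}
    for namespace, result in all_metrics.items():
        if result.get("kind") == "Error" or "error" in result:
            failed[namespace] = result.get("error", "unknown error")
        else:
            ok[namespace] = result.get("items", [])
    return ok, failed


# Illustrative data only, following the documented return structure.
sample = {
    "namespace-1": {"kind": "PodMetricsList", "items": [{"metadata": {"name": "pod-a"}}]},
    "namespace-2": {"kind": "Error", "error": "error message"},
}
ok, failed = split_results(sample)
print(f"succeeded: {sorted(ok)}, failed: {sorted(failed)}")
```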

## Complete Example

See [examples/metrics_example.py](../examples/metrics_example.py) for a complete working example that demonstrates:
- Fetching node metrics
- Fetching pod metrics in specific namespaces
- Using label selectors to filter pods
- Querying multiple namespaces
- Error handling

## Parsing Resource Values

The metrics API returns resource values as Kubernetes quantity strings (e.g., `"100m"` for CPU, `"1Gi"` for memory). You can parse these using the existing `parse_quantity` utility, which returns a `decimal.Decimal` in base units (cores for CPU, bytes for memory):

```python
from kubernetes import utils

cpu_value = utils.parse_quantity("100m")    # Decimal equal to 0.1 (cores)
memory_value = utils.parse_quantity("1Gi")  # Decimal equal to 1073741824 (bytes)
```
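Once values are parsed into base units, converting them for display is plain `Decimal` arithmetic. The sketch below uses literal `Decimal` values equal to the two results above so that it runs without a cluster; `to_millicores` and `to_mebibytes` are hypothetical helper names.

```python
from decimal import Decimal


def to_millicores(cores: Decimal) -> Decimal:
    """Convert CPU cores to millicores (1 core = 1000m)."""
    return cores * 1000


def to_mebibytes(num_bytes: Decimal) -> Decimal:
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 ** 2)


cpu = Decimal("0.1")            # same value as parsing "100m"
memory = Decimal("1073741824")  # same value as parsing "1Gi"
print(f"CPU: {to_millicores(cpu):.0f}m, Memory: {to_mebibytes(memory):.0f}Mi")
```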

## Common Use Cases

### Monitoring Resource Usage

```python
from kubernetes import utils


def monitor_namespace_resources(api_client, namespace):
    """Monitor total resource usage in a namespace."""
    metrics = utils.get_pods_metrics(api_client, namespace)

    total_cpu = 0
    total_memory = 0

    # Sum usage over every container of every pod (values are Decimals).
    for pod in metrics['items']:
        for container in pod['containers']:
            cpu = utils.parse_quantity(container['usage']['cpu'])
            memory = utils.parse_quantity(container['usage']['memory'])
            total_cpu += cpu
            total_memory += memory

    print(f"Namespace {namespace}:")
    print(f"  Total CPU: {total_cpu} cores")
    print(f"  Total Memory: {total_memory / (1024**3):.2f} GiB")
```

### Finding Resource-Intensive Pods

```python
from kubernetes import utils


def find_high_cpu_pods(api_client, namespace, threshold_millicores=500):
    """Find containers using more CPU than the threshold (in millicores)."""
    metrics = utils.get_pods_metrics(api_client, namespace)
    high_cpu_pods = []

    for pod in metrics['items']:
        pod_name = pod['metadata']['name']
        for container in pod['containers']:
            cpu_str = container['usage']['cpu']
            # parse_quantity returns cores; multiply by 1000 for millicores.
            cpu_millicores = utils.parse_quantity(cpu_str) * 1000

            if cpu_millicores > threshold_millicores:
                high_cpu_pods.append({
                    'pod': pod_name,
                    'container': container['name'],
                    'cpu': cpu_str
                })

    return high_cpu_pods
```

### Comparing Usage Across Namespaces

```python
from kubernetes import utils


def compare_namespace_usage(api_client, namespaces):
    """Compare pod and container counts across namespaces."""
    all_metrics = utils.get_pods_metrics_in_all_namespaces(api_client, namespaces)

    for ns, result in all_metrics.items():
        # Namespaces whose query failed carry an 'error' key and are skipped.
        if 'error' not in result:
            pod_count = len(result['items'])
            container_count = sum(len(pod['containers']) for pod in result['items'])
            print(f"{ns}: {pod_count} pods, {container_count} containers")
```

## Troubleshooting

### Metrics Server Not Available

If you get an error about metrics not being available:

```
ApiException: (404)
Reason: Not Found
```

This usually means metrics-server is not installed or not running. Install it using:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```

### Permission Denied

If you get a 403 Forbidden error, ensure your service account has permissions to access the metrics API:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods", "nodes"]
  verbs: ["get", "list"]
```
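In application code, the two failures above can be told apart by HTTP status. The sketch below maps a status code to the matching hint from this section. It assumes the caller catches `ApiException` and passes its `status` attribute; `explain_metrics_error` is a hypothetical helper, and the hint texts simply restate the notes above.

```python
# Hints keyed by HTTP status, restating the troubleshooting notes in this document.
_STATUS_HINTS = {
    403: "Forbidden: grant get/list on pods and nodes in the metrics.k8s.io API group.",
    404: "Not Found: metrics-server may not be installed or running.",
}


def explain_metrics_error(status: int) -> str:
    """Return a troubleshooting hint for an HTTP status code (hypothetical helper)."""
    return _STATUS_HINTS.get(status, f"Unexpected status {status}: inspect the full error.")


for code in (403, 404, 500):
    print(code, "->", explain_metrics_error(code))
```

A caller would typically wrap the metrics call in `try`/`except ApiException as e` and log `explain_metrics_error(e.status)`.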

### Empty Results

If metrics return empty results, check that:
1. Pods are actually running in the namespace (or, for node metrics, the nodes are up)
2. Metrics-server has had time to collect data (usually 15-60 seconds after pod start)
3. Label selectors are correct if using filtering
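Since metrics can lag behind pod start, a short polling loop is one way to handle the second case. This is a minimal sketch, not part of the library: `wait_for_items` is a hypothetical helper, `fetch` stands for any zero-argument callable that returns a metrics dict (for example a lambda wrapping `utils.get_pods_metrics`), and the attempt count and delay are arbitrary defaults.

```python
import time


def wait_for_items(fetch, attempts=5, delay_seconds=15.0, sleep=time.sleep):
    """Call fetch() until the result has a non-empty 'items' list or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = {}
    for attempt in range(attempts):
        result = fetch()
        if result.get("items"):
            return result
        # Only wait if another attempt will follow.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return result  # last (still empty) response


# Demonstration with a fake fetch that returns data on the third call.
responses = iter([{"items": []}, {"items": []}, {"items": [{"metadata": {"name": "pod-a"}}]}])
result = wait_for_items(lambda: next(responses), sleep=lambda _seconds: None)
print(len(result["items"]), "pod metric(s) found")
```

Injecting `sleep` keeps the helper easy to test without real delays.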

## Additional Resources

- [Kubernetes Metrics Server Documentation](https://github.com/kubernetes-sigs/metrics-server)
- [Metrics API Design](https://github.com/kubernetes/design-proposals-archive/blob/main/instrumentation/resource-metrics-api.md)
- [HorizontalPodAutoscaler using metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
