Commit 71d47e6

remove deviceIndex
Signed-off-by: Alay Patel <[email protected]>
1 parent 6324114 commit 71d47e6

1 file changed: +12 -43 lines changed
  • keps/sig-node/5304-dra-attributes-downward-api

keps/sig-node/5304-dra-attributes-downward-api/README.md

Lines changed: 12 additions & 43 deletions
@@ -193,7 +193,7 @@ This proposal introduces a new Downward API selector (`resourceSliceAttributeRef
193193
4. Maintains a per-Pod cache of `(claimName, requestName) -> {attribute: value}` mappings
194194
5. Resolves `resourceSliceAttributeRef` references when containers start
195195

196-
Downward API references expose one attribute per reference. The kubelet resolves only the attribute explicitly referenced via `resourceSliceAttributeRef`.
196+
Downward API references expose one attribute per reference. The kubelet resolves only the attribute explicitly referenced via `ResourceSliceAttributeSelector`.
197197

198198
### User Stories (Optional)
199199

@@ -251,14 +251,6 @@ type ResourceSliceAttributeSelector struct {
251251
// The attribute name must be present in the ResourceSlice's device attributes.
252252
// +required
253253
Attribute string `json:"attribute"`
254-
255-
// DeviceIndex selects which device's attribute to surface when a request
256-
// is satisfied by multiple devices. When unset, attributes from all
257-
// allocated devices for the request are joined in allocation order.
258-
// Zero-based index into the allocation results for the matching request.
259-
// Must be >= 0 if set. Bounds are validated at resolution time by the kubelet.
260-
// +optional
261-
DeviceIndex *int32 `json:"deviceIndex,omitempty"`
262254
}
263255

264256
// In core/v1 EnvVarSource:
@@ -278,7 +270,7 @@ Validation:
278270
- Enforce exactly one of `fieldRef`, `resourceFieldRef`, or `resourceSliceAttributeRef` in both env and volume items
279271
- Validate `claimName` and `requestName` against DNS label rules
280272
- No API-level enumeration of attribute names; kubelet resolves attributes that exist in the matching `ResourceSlice` at runtime
281-
- If `deviceIndex` is set, validate it is >= 0; if unset, aggregate across all allocated devices
273+
282274
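The validation rules in this hunk can be sketched roughly as follows. This is an illustrative sketch only, not the KEP's implementation: `envVarSource`, `attributeSelector`, `isDNSLabel`, and `validateEnvVarSource` are hypothetical names, the struct is a simplified stand-in for `core/v1.EnvVarSource` that models only the three sources named above, and the DNS label check assumes the usual RFC 1123 label rule (lowercase alphanumerics and `-`, at most 63 characters).

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// dns1123Label matches an RFC 1123 DNS label: lowercase alphanumerics and '-',
// starting and ending with an alphanumeric character.
var dns1123Label = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

// attributeSelector is a hypothetical stand-in carrying the three fields used
// in the YAML example (claimName, requestName, attribute).
type attributeSelector struct {
	ClaimName   string
	RequestName string
	Attribute   string
}

// envVarSource is a hypothetical, simplified stand-in for core/v1 EnvVarSource.
// Only the three sources named in the validation bullet are modeled.
type envVarSource struct {
	FieldRef                  *string
	ResourceFieldRef          *string
	ResourceSliceAttributeRef *attributeSelector
}

// isDNSLabel reports whether s is a valid DNS label of at most 63 characters.
func isDNSLabel(s string) bool {
	return len(s) <= 63 && dns1123Label.MatchString(s)
}

// validateEnvVarSource enforces "exactly one source" and the DNS label rules.
// Attribute names are deliberately not enumerated here; per the text above,
// the kubelet resolves them at runtime.
func validateEnvVarSource(src envVarSource) error {
	set := 0
	if src.FieldRef != nil {
		set++
	}
	if src.ResourceFieldRef != nil {
		set++
	}
	if src.ResourceSliceAttributeRef != nil {
		set++
	}
	if set != 1 {
		return errors.New("exactly one of fieldRef, resourceFieldRef, or resourceSliceAttributeRef must be set")
	}
	ref := src.ResourceSliceAttributeRef
	if ref == nil {
		return nil
	}
	if !isDNSLabel(ref.ClaimName) {
		return fmt.Errorf("claimName %q is not a valid DNS label", ref.ClaimName)
	}
	if !isDNSLabel(ref.RequestName) {
		return fmt.Errorf("requestName %q is not a valid DNS label", ref.RequestName)
	}
	if ref.Attribute == "" {
		return errors.New("attribute is required")
	}
	return nil
}
```

With `deviceIndex` removed, the sketch has no index check: the selector carries only the claim name, request name, and attribute name.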

283275
####
284276

@@ -292,14 +284,14 @@ The kubelet runs a local DRA attributes controller that:
292284
4. Maintains Cache: Keeps a per-Pod map of `(claimName, requestName) -> {attribute: value}` with a readiness flag
293285

294286
Resolution Semantics:
287+
- Prioritized List compatibility:
288+
- Clients do not need to know the number of devices a priori. At container start, kubelet aggregates the attribute across all devices actually allocated for the request and joins the values with "," in allocation order. If any allocated device lacks the attribute, resolution fails and the pod start errors.
295289
- Cache entries are updated on claim/slice changes
296290
- For container environment variables, resolution happens at container start using the latest ready values
297-
- Attributes are not frozen at allocation time; scheduler and controllers are not involved in copying attributes
298-
299-
- Failure on missing data: If the `ResourceSlice` is not found, the attribute is absent, or `deviceIndex` is out of range at container start, kubelet records a warning event and returns an error to the sync loop. The pod start fails per standard semantics (e.g., `restartPolicy` governs restarts; Jobs will fail the pod).
300-
- Multi-device requests:
301-
- When `deviceIndex` is unset, kubelet resolves the attribute across all allocated devices for the request, preserving allocation order, and joins values with a comma (",") into a single string. Devices that do not report the attribute are skipped. If no devices provide the attribute, the value is considered not ready.
302-
- When `deviceIndex` is set, kubelet selects the device at that zero-based index from the allocation results and resolves the attribute for that device only. If the index is out of range or the attribute is missing on that device, the value is considered not ready.
291+
- Attributes are not frozen at allocation time; scheduler and controllers are not involved in copying attributes
292+
293+
- Failure on missing data: If the `ResourceSlice` is not found, or the attribute is absent on any allocated device at container start, kubelet records a warning event and returns an error to the sync loop. The pod start fails per standard semantics (e.g., `restartPolicy` governs restarts; Jobs will fail the pod).
294+
- Multi-device requests: Kubelet resolves the attribute across all allocated devices for the request, preserving allocation order, and joins values with a comma (",") into a single string. If any allocated device does not report the attribute, resolution fails (pod start error).
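
The new multi-device semantics (join in allocation order, fail if any device lacks the attribute) could look roughly like the sketch below. The `device` type and `resolveAttribute` function are hypothetical and exist only for illustration; the real kubelet types are not shown in this diff. Treating an empty allocation as an error is an assumption based on the "missing allocation" negative test case.

```go
package main

import (
	"fmt"
	"strings"
)

// device is a hypothetical stand-in for one allocated device and the
// attributes its ResourceSlice reports.
type device struct {
	name  string
	attrs map[string]string
}

// resolveAttribute returns the attribute values of all allocated devices,
// joined with "," in allocation order. It fails if any device does not
// report the attribute, mirroring the behavior introduced by this commit.
func resolveAttribute(allocated []device, attribute string) (string, error) {
	if len(allocated) == 0 {
		// Assumption: a missing allocation is an error, not an empty value.
		return "", fmt.Errorf("no devices allocated for request")
	}
	values := make([]string, 0, len(allocated))
	for _, d := range allocated {
		v, ok := d.attrs[attribute]
		if !ok {
			return "", fmt.Errorf("device %q does not report attribute %q", d.name, attribute)
		}
		values = append(values, v)
	}
	return strings.Join(values, ","), nil
}
```

Compared with the removed behavior, there is no skip path: a single device without the attribute makes the whole resolution fail rather than producing a shorter joined string.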
303295

304296
Security & RBAC:
305297
- Node kubelet uses NodeAuthorizer to watch/read `ResourceClaim` and `ResourceSlice` objects related to Pods scheduled to the node
@@ -329,34 +321,12 @@ spec:
329321
claimName: pgpu-claim
330322
requestName: pgpu-request
331323
attribute: resource.kubernetes.io/pcieRoot
332-
# deviceIndex omitted -> aggregate across all devices
324+
# If multiple devices are allocated for this request, values are joined with "," in allocation order.
333325
```
334326

335327

336328

337-
Environment Variable Example (SpecificIndex for multi-device request):
338-
339-
```yaml
340-
apiVersion: v1
341-
kind: Pod
342-
metadata:
343-
name: virt-launcher-gpu-index
344-
spec:
345-
resourceClaims:
346-
- name: pgpu-claim
347-
resourceClaimName: my-physical-gpu-claim
348-
containers:
349-
- name: compute
350-
image: virt-launcher:latest
351-
env:
352-
- name: PGPU_CLAIM_PCI_ROOT_INDEX1
353-
valueFrom:
354-
resourceSliceAttributeRef:
355-
claimName: pgpu-claim
356-
requestName: pgpu-request
357-
attribute: resource.kubernetes.io/pcieRoot
358-
deviceIndex: 1
359-
```
329+
360330

361331
### Feature Gate
362332

@@ -406,8 +376,8 @@ Integration tests will cover:
406376

407377
- Feature gate toggling: Verify API rejects `resourceSliceAttributeRef` when feature gate is disabled
408378
- End-to-end resolution: Create Pod with resourceClaims, verify env vars contain correct attribute values
409-
- Negative cases: Missing allocation, missing `ResourceSlice`, missing attribute, invalid `deviceIndex` — expect warning event and pod start failure
410-
- Multi-device semantics: All-mode joining order and delimiter; SpecificIndex with valid/invalid index; missing attribute on selected device
379+
- Negative cases: Missing allocation, missing `ResourceSlice`, missing attribute on any allocated device — expect warning event and pod start failure
380+
- Multi-device semantics: Joining order and delimiter; mixed presence of attributes across allocated devices should cause failure
411381

412382
Tests will be added to `test/integration/kubelet/` and `test/integration/dra/`.
413383

@@ -430,7 +400,6 @@ Tests will be added to `test/e2e/dra/` and `test/e2e_node/downwardapi_test.go`.
430400
- Feature implemented behind `DRADownwardDeviceAttributes` feature gate
431401
- API types added: `resourceSliceAttributeRef` in `core/v1.EnvVarSource`
432402
- Kubelet DRA attributes controller implemented
433-
- Support for `resource.kubernetes.io/pcieRoot` and `dra.kubervirt.io/mdevUUID` attributes
434403
- Unit tests for validation, cache, and resolution logic
435404
- Initial integration and e2e tests completed
436405
- Documentation published for API usage
