Commit b56dc0c

Merge pull request #1881 from ibihim/ibihim/2025-07-30_check-for-user-scc-violation-git-commit-history
CNTRLPLANE-180: check for user-based SCCs causing PSA violations
2 parents d3e707d + 1d98eee commit b56dc0c

33 files changed, +3554 −95 lines
Lines changed: 113 additions & 0 deletions
# Pod Security Readiness Controller

The Pod Security Readiness Controller evaluates namespace compatibility with Pod Security Admission (PSA) enforcement in clusters.

## Purpose

This controller performs dry-run PSA evaluations to determine which namespaces would experience pod creation failures if PSA enforcement labels were applied.

The controller generates telemetry data for `ClusterFleetEvaluation` and helps us understand PSA compatibility before enabling enforcement.

## Implementation

The controller follows this evaluation algorithm:

1. **Namespace Discovery** - Find namespaces without PSA enforcement
2. **PSA Level Determination** - Predict what enforcement level would be applied
3. **Dry-Run Evaluation** - Test the namespace against the predicted PSA level
4. **Violation Classification** - Categorize any violations found for telemetry

### Namespace Discovery

Selects namespaces without PSA enforcement labels:

```go
selector := "!pod-security.kubernetes.io/enforce"
```

### PSA Level Determination

The controller determines the effective PSA enforcement level using this precedence:

1. The `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation
2. The most restrictive of the existing `pod-security.kubernetes.io/warn` and `pod-security.kubernetes.io/audit` labels, if owned by the PSA label syncer
3. The kube-apiserver's future global default: `restricted`
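The precedence above can be sketched as a pure function. The annotation and label keys are taken from this document; the helper name, the level ordering, and the syncer-ownership flag are illustrative assumptions, not the controller's actual code:

```go
package main

import "fmt"

const (
	// Keys as documented above.
	minSufficientAnnotation = "security.openshift.io/MinimallySufficientPodSecurityStandard"
	warnLabel               = "pod-security.kubernetes.io/warn"
	auditLabel              = "pod-security.kubernetes.io/audit"
)

// restrictiveness orders PSA levels from least to most restrictive.
var restrictiveness = map[string]int{"privileged": 0, "baseline": 1, "restricted": 2}

// targetEnforceLevel applies the documented precedence:
// annotation, then syncer-owned warn/audit labels, then the default.
func targetEnforceLevel(annotations, labels map[string]string, syncerOwnsLabels bool) string {
	// 1. An explicit minimally sufficient standard wins.
	if lvl, ok := annotations[minSufficientAnnotation]; ok {
		return lvl
	}
	// 2. Most restrictive of the syncer-owned warn/audit labels.
	if syncerOwnsLabels {
		best, found := "", false
		for _, key := range []string{warnLabel, auditLabel} {
			if lvl, ok := labels[key]; ok && (!found || restrictiveness[lvl] > restrictiveness[best]) {
				best, found = lvl, true
			}
		}
		if found {
			return best
		}
	}
	// 3. Fall back to the future global default.
	return "restricted"
}

func main() {
	fmt.Println(targetEnforceLevel(nil, map[string]string{warnLabel: "baseline", auditLabel: "restricted"}, true)) // restricted
	fmt.Println(targetEnforceLevel(nil, nil, false))                                                               // restricted
}
```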

### Dry-Run Evaluation

The controller performs the equivalent of this `oc` command:

```bash
oc label --dry-run=server --overwrite namespace $NAMESPACE_NAME \
  pod-security.kubernetes.io/enforce=$POD_SECURITY_STANDARD
```

PSA warnings during dry-run indicate the namespace contains violating workloads.
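A minimal sketch of interpreting the dry-run response. The warning substring is an assumption about the PodSecurity admission warning format, not something taken from the controller's code:

```go
package main

import (
	"fmt"
	"strings"
)

// hasPSAViolationWarning reports whether any API warning returned by the
// dry-run label update indicates violating workloads. The matched
// substring is an assumed PodSecurity admission warning format.
func hasPSAViolationWarning(warnings []string) bool {
	for _, w := range warnings {
		if strings.Contains(w, "violate the new PodSecurity enforce level") {
			return true
		}
	}
	return false
}

func main() {
	warnings := []string{
		`existing pods in namespace "my-app" violate the new PodSecurity enforce level "restricted:latest"`,
	}
	fmt.Println(hasPSAViolationWarning(warnings)) // true
}
```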

### Violation Classification

Violating namespaces are categorized for telemetry analysis:

| Classification   | Criteria                                                        | Purpose                           |
|------------------|-----------------------------------------------------------------|-----------------------------------|
| `runLevelZero`   | Core namespaces: `kube-system`, `default`, `kube-public`        | Platform infrastructure tracking  |
| `openshift`      | Namespaces with `openshift-` prefix                             | OpenShift component tracking      |
| `disabledSyncer` | Label `security.openshift.io/scc.podSecurityLabelSync: "false"` | Intentionally excluded namespaces |
| `userSCC`        | Contains user workloads that violate PSA                        | SCC vs. PSA policy conflicts      |
| `unknown`        | All other violating namespaces                                  | We simply don't know              |
| `inconclusive`   | Evaluation failed due to API errors                             | Operational problems              |
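The decision order in the table can be sketched as follows. This is illustrative only: the names and ordering follow the table, and the `userSCC` vs. `unknown` split additionally requires pod-level evaluation, which is elided here:

```go
package main

import (
	"fmt"
	"strings"
)

// classify mirrors the table above for the checks that need only the
// namespace name and labels.
func classify(nsName string, labels map[string]string) string {
	switch {
	case nsName == "kube-system" || nsName == "default" || nsName == "kube-public":
		return "runLevelZero"
	case strings.HasPrefix(nsName, "openshift-"):
		return "openshift"
	case labels["security.openshift.io/scc.podSecurityLabelSync"] == "false":
		return "disabledSyncer"
	default:
		// Deciding between userSCC and unknown requires evaluating the
		// namespace's pods against the target PSA level.
		return "unknown"
	}
}

func main() {
	fmt.Println(classify("openshift-etcd", nil)) // openshift
	fmt.Println(classify("kube-system", nil))    // runLevelZero
	fmt.Println(classify("my-app", map[string]string{
		"security.openshift.io/scc.podSecurityLabelSync": "false",
	})) // disabledSyncer
}
```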

#### User SCC Detection

The PSA label syncer bases its evaluation exclusively on a ServiceAccount's SCCs, ignoring a user's SCCs.
When a pod's SCC assignment comes from user permissions rather than its ServiceAccount, the syncer's predicted PSA level may be incorrect.
Therefore we need to evaluate the affected pods (if any) against the target PSA level.
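A hedged sketch of the user-SCC check: among the already-violating pods, those whose validated SCC came from a user rather than a ServiceAccount point at an SCC-vs-PSA conflict. The annotation key string is an assumption here; the controller reads it via the `securityv1.ValidatedSCCSubjectTypeAnnotation` constant:

```go
package main

import "fmt"

// Assumed value of securityv1.ValidatedSCCSubjectTypeAnnotation.
const validatedSCCSubjectType = "security.openshift.io/validated-scc-subject-type"

// pod is a stand-in for corev1.Pod, carrying only what the check needs.
type pod struct {
	Name        string
	Annotations map[string]string
}

// splitByUserSCC partitions violating pods by whether their SCC was
// granted through user permissions.
func splitByUserSCC(violating []pod) (userSCC, other []pod) {
	for _, p := range violating {
		if p.Annotations[validatedSCCSubjectType] == "user" {
			userSCC = append(userSCC, p)
		} else {
			other = append(other, p)
		}
	}
	return userSCC, other
}

func main() {
	pods := []pod{
		{Name: "a", Annotations: map[string]string{validatedSCCSubjectType: "user"}},
		{Name: "b", Annotations: map[string]string{validatedSCCSubjectType: "serviceaccount"}},
	}
	users, others := splitByUserSCC(pods)
	fmt.Println(len(users), len(others)) // 1 1
}
```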

### Inconclusive Handling

When the evaluation process fails, namespaces are marked as `inconclusive`.

Common causes of inconclusive results:

- **API server unavailable** - Network timeouts, etcd issues
- **Resource conflicts** - Concurrent namespace modifications
- **Invalid PSA levels** - Malformed enforcement level strings
- **Pod listing failures** - RBAC issues or resource pressure

High rates of inconclusive results across the fleet may indicate systemic issues that require investigation.

## Output

The controller updates `OperatorStatus` conditions for each violation type:

```go
type podSecurityOperatorConditions struct {
	violatingOpenShiftNamespaces      []string // PodSecurityOpenshiftEvaluationConditionsDetected
	violatingRunLevelZeroNamespaces   []string // PodSecurityRunLevelZeroEvaluationConditionsDetected
	violatingDisabledSyncerNamespaces []string // PodSecurityDisabledSyncerEvaluationConditionsDetected
	violatingUserSCCNamespaces        []string // PodSecurityUserSCCEvaluationConditionsDetected
	violatingUnclassifiedNamespaces   []string // PodSecurityUnknownEvaluationConditionsDetected
	inconclusiveNamespaces            []string // PodSecurityInconclusiveEvaluationConditionsDetected
}
```

Conditions follow the pattern:

- `PodSecurity{Type}EvaluationConditionsDetected`
- Status: `True` (violations found) / `False` (no violations)
- Message includes the list of violating namespaces
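The naming pattern can be made concrete with a small helper; the helper itself is illustrative and not part of the controller:

```go
package main

import "fmt"

// conditionName expands the documented pattern
// PodSecurity{Type}EvaluationConditionsDetected for a violation type.
func conditionName(violationType string) string {
	return fmt.Sprintf("PodSecurity%sEvaluationConditionsDetected", violationType)
}

func main() {
	fmt.Println(conditionName("UserSCC")) // PodSecurityUserSCCEvaluationConditionsDetected
}
```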

## Configuration

The controller runs at a configurable interval (default: 4 hours) and uses rate limiting to avoid overwhelming the API server:

```go
kubeClientCopy.QPS = 2
kubeClientCopy.Burst = 2
```

## Integration Points

- **PSA Label Syncer**: Reads syncer-managed PSA labels to predict enforcement levels
- **Cluster Operator**: Reports status through standard operator conditions
- **Telemetry**: Violation data feeds into cluster fleet analysis systems
Lines changed: 101 additions & 0 deletions
package podsecurityreadinesscontroller

import (
	"context"
	"errors"
	"strings"

	securityv1 "github.com/openshift/api/security/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/sets"
	"k8s.io/klog/v2"
	psapi "k8s.io/pod-security-admission/api"
	"k8s.io/pod-security-admission/policy"
)

var (
	runLevelZeroNamespaces = sets.New[string](
		"default",
		"kube-system",
		"kube-public",
		"kube-node-lease",
	)
	errNoViolatingPods = errors.New("no violating pods in violating namespace")
)

func (c *PodSecurityReadinessController) classifyViolatingNamespace(
	ctx context.Context,
	conditions *podSecurityOperatorConditions,
	ns *corev1.Namespace,
	enforceLevel psapi.Level,
) error {
	if runLevelZeroNamespaces.Has(ns.Name) {
		conditions.addViolatingRunLevelZero(ns)
		return nil
	}
	if strings.HasPrefix(ns.Name, "openshift") {
		conditions.addViolatingOpenShift(ns)
		return nil
	}
	if ns.Labels[labelSyncControlLabel] == "false" {
		conditions.addViolatingDisabledSyncer(ns)
		return nil
	}

	// Evaluate by individual pod.
	allPods, err := c.kubeClient.CoreV1().Pods(ns.Name).List(ctx, metav1.ListOptions{})
	if err != nil {
		// Will end up in inconclusive as we couldn't diagnose the violation
		// root cause.
		klog.V(2).ErrorS(err, "Failed to list pods in namespace", "namespace", ns.Name)
		return err
	}

	isViolating := createPodViolationEvaluator(c.psaEvaluator, enforceLevel)
	violatingPods := []corev1.Pod{}
	for _, pod := range allPods.Items {
		if isViolating(pod) {
			violatingPods = append(violatingPods, pod)
		}
	}
	if len(violatingPods) == 0 {
		klog.V(2).ErrorS(errNoViolatingPods, "failed to find violating pod", "namespace", ns.Name)
		return errNoViolatingPods
	}

	violatingUserSCCPods := []corev1.Pod{}
	for _, pod := range violatingPods {
		if pod.Annotations[securityv1.ValidatedSCCSubjectTypeAnnotation] == "user" {
			violatingUserSCCPods = append(violatingUserSCCPods, pod)
		}
	}
	if len(violatingUserSCCPods) > 0 {
		conditions.addViolatingUserSCC(ns)
	}
	if len(violatingUserSCCPods) != len(violatingPods) {
		conditions.addUnclassifiedIssue(ns)
	}

	return nil
}

func createPodViolationEvaluator(evaluator policy.Evaluator, enforcement psapi.Level) func(pod corev1.Pod) bool {
	return func(pod corev1.Pod) bool {
		results := evaluator.EvaluatePod(
			psapi.LevelVersion{
				Level:   enforcement,
				Version: psapi.LatestVersion(),
			},
			&pod.ObjectMeta,
			&pod.Spec,
		)

		for _, result := range results {
			if !result.Allowed {
				return true
			}
		}
		return false
	}
}
