| 1 | +# Pod Security Readiness Controller |
| 2 | + |
| 3 | +The Pod Security Readiness Controller evaluates namespace compatibility with Pod Security Admission (PSA) enforcement in clusters. |
| 4 | + |
| 5 | +## Purpose |
| 6 | + |
| 7 | +This controller performs dry-run PSA evaluations to determine which namespaces would experience pod creation failures if PSA enforcement labels were applied. |
| 8 | + |
| 9 | +The controller generates telemetry data for `ClusterFleetEvaluation` and helps us understand PSA compatibility before enforcement is enabled. |
| 10 | + |
| 11 | +## Implementation |
| 12 | + |
| 13 | +The controller follows this evaluation algorithm: |
| 14 | + |
| 15 | +1. **Namespace Discovery** - Find namespaces without PSA enforcement |
| 16 | +2. **PSA Level Determination** - Predict what enforcement level would be applied |
| 17 | +3. **Dry-Run Evaluation** - Test namespace against predicted PSA level |
| 18 | +4. **Violation Classification** - Categorize any violations found for telemetry |
| 19 | + |
| 20 | +### Namespace Discovery |
| 21 | + |
| 22 | +The controller selects namespaces that have no PSA enforcement label: |
| 23 | + |
| 24 | +```go |
| 25 | +selector := "!pod-security.kubernetes.io/enforce" |
| 26 | +``` |
| 27 | + |
| 28 | +### PSA Level Determination |
| 29 | + |
| 30 | +The controller determines the effective PSA enforcement level using this precedence: |
| 31 | + |
| 32 | +1. `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation |
| 33 | +2. Most restrictive of existing `pod-security.kubernetes.io/warn` or `pod-security.kubernetes.io/audit` labels, if owned by the PSA label syncer |
| 34 | +3. Kube API server's future global default: `restricted` |
| 35 | + |
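The precedence above can be sketched as a small pure function. This is a minimal illustration, not the controller's actual code: `predictLevel` and the `syncerOwned` flag are hypothetical names, and how label ownership by the PSA label syncer is detected is left out. The ordering `privileged` < `baseline` < `restricted` is the standard Pod Security Standards ordering.

```go
package main

import "fmt"

const (
	minimallySufficientAnnotation = "security.openshift.io/MinimallySufficientPodSecurityStandard"
	warnLabel                     = "pod-security.kubernetes.io/warn"
	auditLabel                    = "pod-security.kubernetes.io/audit"
)

// levelRank orders the Pod Security Standards from least to most restrictive.
var levelRank = map[string]int{"privileged": 0, "baseline": 1, "restricted": 2}

// predictLevel applies the three-step precedence.
// syncerOwned is a hypothetical input: it stands for "the warn/audit labels
// are owned by the PSA label syncer", however that is determined.
func predictLevel(annotations, labels map[string]string, syncerOwned bool) string {
	// 1. The annotation wins when present (returned unvalidated in this sketch).
	if lvl, ok := annotations[minimallySufficientAnnotation]; ok {
		return lvl
	}
	// 2. Otherwise take the most restrictive syncer-owned warn/audit label.
	if syncerOwned {
		best, found := "", false
		for _, key := range []string{warnLabel, auditLabel} {
			lvl, ok := labels[key]
			if !ok {
				continue
			}
			rank, known := levelRank[lvl]
			if !known {
				continue // skip values that are not a known level
			}
			if !found || rank > levelRank[best] {
				best, found = lvl, true
			}
		}
		if found {
			return best
		}
	}
	// 3. Fall back to the future global default.
	return "restricted"
}

func main() {
	labels := map[string]string{warnLabel: "baseline", auditLabel: "privileged"}
	fmt.Println(predictLevel(nil, labels, true))  // syncer-owned labels are used
	fmt.Println(predictLevel(nil, labels, false)) // labels ignored, default applies
}
```

Keeping the function free of API calls makes the precedence easy to unit test in isolation.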
| 36 | +### Dry-Run Evaluation |
| 37 | + |
| 38 | +The controller performs the equivalent of this `oc` command: |
| 39 | + |
| 40 | +```bash |
| 41 | +oc label --dry-run=server --overwrite namespace $NAMESPACE_NAME \ |
| 42 | + pod-security.kubernetes.io/enforce=$POD_SECURITY_STANDARD |
| 43 | +``` |
| 44 | + |
| 45 | +PSA warnings during dry-run indicate the namespace contains violating workloads. |
| 46 | + |
| 47 | +### Violation Classification |
| 48 | + |
| 49 | +Violating namespaces are categorized for telemetry analysis: |
| 50 | + |
| 51 | +| Classification | Criteria | Purpose | |
| 52 | +|------------------|-----------------------------------------------------------------|----------------------------------------| |
| 53 | +| `runLevelZero` | Core namespaces: `kube-system`, `default`, `kube-public` | Platform infrastructure tracking | |
| 54 | +| `openshift` | Namespaces with `openshift-` prefix | OpenShift component tracking | |
| 55 | +| `disabledSyncer` | Label `security.openshift.io/scc.podSecurityLabelSync: "false"` | Intentionally excluded namespaces | |
| 56 | +| `userSCC` | Contains user workloads that violate PSA | SCC vs PSA policy conflicts | |
| 57 | +| `unknown` | All other violating namespaces | Violations with no identified cause | |
| 58 | +| `inconclusive` | Evaluation failed due to API errors | Operational problems | |
| 59 | + |
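The table above can be read as an ordered series of checks. The sketch below assumes the checks are applied in table order, which the table itself does not state. `classify` and `hasViolatingUserPods` are hypothetical names for illustration. `inconclusive` is not returned here because it results from evaluation errors rather than from namespace properties.

```go
package main

import (
	"fmt"
	"strings"
)

const syncLabel = "security.openshift.io/scc.podSecurityLabelSync"

// runLevelZero lists the core namespaces named in the table.
var runLevelZero = map[string]bool{"kube-system": true, "default": true, "kube-public": true}

// classify returns the telemetry class for a namespace already known to violate.
// hasViolatingUserPods is a hypothetical input standing for the result of the
// user SCC check; the order of the cases is an assumption.
func classify(name string, labels map[string]string, hasViolatingUserPods bool) string {
	switch {
	case runLevelZero[name]:
		return "runLevelZero"
	case strings.HasPrefix(name, "openshift-"):
		return "openshift"
	case labels[syncLabel] == "false":
		return "disabledSyncer"
	case hasViolatingUserPods:
		return "userSCC"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classify("openshift-monitoring", nil, false))
	fmt.Println(classify("team-a", map[string]string{syncLabel: "false"}, false))
	fmt.Println(classify("team-b", nil, false))
}
```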
| 60 | +#### User SCC Detection |
| 61 | + |
| 62 | +The PSA label syncer bases its evaluation exclusively on a ServiceAccount's SCCs, ignoring a user's SCCs. |
| 63 | +When a pod's SCC assignment comes from user permissions rather than its ServiceAccount, the syncer's predicted PSA level may be incorrect. |
| 64 | +The affected pods, if any, therefore need to be evaluated directly against the target PSA level. |
| 65 | + |
| 66 | +### Inconclusive Handling |
| 67 | + |
| 68 | +When the evaluation process fails, namespaces are marked as `inconclusive`. |
| 69 | + |
| 70 | +Common causes for inconclusive results: |
| 71 | + |
| 72 | +- **API server unavailable** - Network timeouts, etcd issues |
| 73 | +- **Resource conflicts** - Concurrent namespace modifications |
| 74 | +- **Invalid PSA levels** - Malformed enforcement level strings |
| 75 | +- **Pod listing failures** - RBAC issues or resource pressure |
| 76 | + |
| 77 | +High rates of inconclusive results across the fleet may indicate systematic issues that require investigation. |
| 78 | + |
| 79 | +## Output |
| 80 | + |
| 81 | +The controller updates `OperatorStatus` conditions for each violation type: |
| 82 | + |
| 83 | +```go |
| 84 | +type podSecurityOperatorConditions struct { |
| 85 | + violatingOpenShiftNamespaces []string // PodSecurityOpenshiftEvaluationConditionsDetected |
| 86 | + violatingRunLevelZeroNamespaces []string // PodSecurityRunLevelZeroEvaluationConditionsDetected |
| 87 | + violatingDisabledSyncerNamespaces []string // PodSecurityDisabledSyncerEvaluationConditionsDetected |
| 88 | + violatingUserSCCNamespaces []string // PodSecurityUserSCCEvaluationConditionsDetected |
| 89 | + violatingUnclassifiedNamespaces []string // PodSecurityUnknownEvaluationConditionsDetected |
| 90 | + inconclusiveNamespaces []string // PodSecurityInconclusiveEvaluationConditionsDetected |
| 91 | +} |
| 92 | +``` |
| 93 | + |
| 94 | +Conditions follow the pattern: |
| 95 | + |
| 96 | +- `PodSecurity{Type}EvaluationConditionsDetected` |
| 97 | +- Status: `True` (violations found) / `False` (no violations) |
| 98 | +- Message includes violating namespace list |
| 99 | + |
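As a rough illustration of the pattern above, a condition could be derived from a namespace list as follows. The `condition` struct, `buildCondition`, and the message wording are assumptions made for this sketch and not the operator's real types or text.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// condition is a simplified stand-in for an operator condition (hypothetical type).
type condition struct {
	Type    string
	Status  string
	Message string
}

// buildCondition fills in the PodSecurity{Type}EvaluationConditionsDetected pattern.
// The message format is assumed for illustration only.
func buildCondition(kind string, namespaces []string) condition {
	c := condition{
		Type:   "PodSecurity" + kind + "EvaluationConditionsDetected",
		Status: "False",
	}
	if len(namespaces) == 0 {
		return c
	}
	// Sort a copy so the message is stable and the caller's slice is untouched.
	sorted := append([]string(nil), namespaces...)
	sort.Strings(sorted)
	c.Status = "True"
	c.Message = "Violations detected in namespaces: " + strings.Join(sorted, ", ")
	return c
}

func main() {
	fmt.Printf("%+v\n", buildCondition("UserSCC", []string{"team-b", "team-a"}))
	fmt.Printf("%+v\n", buildCondition("Openshift", nil))
}
```

Sorting the namespace list keeps the message deterministic, so the condition does not change between runs when the set of namespaces stays the same.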
| 100 | +## Configuration |
| 101 | + |
| 102 | +The controller runs with a configurable interval (default: 4 hours) and uses rate limiting to avoid overwhelming the API server: |
| 103 | + |
| 104 | +```go |
| 105 | +kubeClientCopy.QPS = 2 |
| 106 | +kubeClientCopy.Burst = 2 |
| 107 | +``` |
| 108 | + |
| 109 | +## Integration Points |
| 110 | + |
| 111 | +- **PSA Label Syncer**: Reads syncer-managed PSA labels to predict enforcement levels |
| 112 | +- **Cluster Operator**: Reports status through standard operator conditions |
| 113 | +- **Telemetry**: Violation data feeds into cluster fleet analysis systems |