Commit 8669ca1

Maximilien-R authored and tekton-robot committed
feat: resolve steps referencing StepActions concurrently
Avoids unnecessary DeepCopy operations on steps that do not reference a StepAction.

Introduces concurrent resolution of steps that reference StepActions to improve the performance of TaskRun reconciliation, especially when using remote resolvers like git.

The key changes include:

- `hasStepRefs` function: A new function that quickly checks if a `TaskSpec` contains any steps referencing `StepActions`. This allows for an early exit if no resolution is needed, avoiding unnecessary work.
- `resolveStepRef` function: This new function encapsulates the logic for resolving a single `StepAction` reference. It handles fetching the remote resource, merging the `StepAction` with the step's specification, and returning the resolved step.
- Two-phase resolution: The `GetStepActionsData` function is now split into two distinct phases:
  - Concurrent Resolution: All `StepAction` references are resolved concurrently using an `errgroup`.
  - Sequential Merging: The resolved steps and their provenance are merged into the final step list and the `TaskRun` status sequentially.
- `updateTaskRunProvenance` function: A dedicated function for updating the TaskRun's status with provenance information.

The maximum number of StepActions that can be resolved concurrently is defined by the default config and its `default-step-ref-concurrency-limit` key.
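The two-phase flow described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the code from the commit: the `step` type, `resolveFunc`, and `resolveSteps` are hypothetical stand-ins, the body of `hasStepRefs` is a guess based only on its description, and a buffered-channel semaphore with `sync.WaitGroup` from the standard library is used in place of `errgroup` so the example has no external dependencies. Unlike `errgroup`, this sketch does not cancel sibling fetches when one fails.

```go
package main

import (
	"fmt"
	"sync"
)

// step is a hypothetical, simplified stand-in for a Task step.
type step struct {
	Name  string
	Ref   string // non-empty when the step references a StepAction
	Image string
}

// resolveFunc is a hypothetical fetcher that returns the image for a reference.
type resolveFunc func(ref string) (string, error)

// hasStepRefs reports whether any step references a StepAction,
// so callers can exit early without copying anything.
func hasStepRefs(steps []step) bool {
	for _, s := range steps {
		if s.Ref != "" {
			return true
		}
	}
	return false
}

// resolveSteps resolves referenced steps concurrently (phase 1), then
// merges the results in the original step order (phase 2).
func resolveSteps(steps []step, limit int, resolve resolveFunc) ([]step, error) {
	if !hasStepRefs(steps) {
		return steps, nil // nothing to resolve, no copy needed
	}
	if limit < 1 {
		limit = 1 // assumption: treat non-positive limits as 1
	}

	type result struct {
		image string
		err   error
	}
	results := make([]result, len(steps))
	sem := make(chan struct{}, limit) // at most `limit` fetches in flight
	var wg sync.WaitGroup

	// Phase 1: concurrent resolution. Each goroutine writes only its own index.
	for i, s := range steps {
		if s.Ref == "" {
			continue
		}
		wg.Add(1)
		sem <- struct{}{}
		go func(i int, ref string) {
			defer wg.Done()
			defer func() { <-sem }()
			image, err := resolve(ref)
			results[i] = result{image, err}
		}(i, s.Ref)
	}
	wg.Wait()

	// Phase 2: sequential merging, preserving step order.
	out := make([]step, len(steps))
	for i, s := range steps {
		if s.Ref != "" {
			if results[i].err != nil {
				return nil, fmt.Errorf("resolving step %q: %w", s.Name, results[i].err)
			}
			s.Image = results[i].image
		}
		out[i] = s
	}
	return out, nil
}

func main() {
	steps := []step{
		{Name: "build", Image: "golang"},
		{Name: "scan", Ref: "git://scan"},
	}
	resolved, err := resolveSteps(steps, 5, func(ref string) (string, error) {
		return "image-for-" + ref, nil
	})
	fmt.Println(resolved, err)
}
```

Keeping the merge sequential means the output order and any status updates stay deterministic even though the fetches finish in arbitrary order.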
1 parent e7cc64e commit 8669ca1

10 files changed, +566 -151 lines changed

config/config-defaults.yaml

Lines changed: 8 additions & 0 deletions
@@ -157,3 +157,11 @@ data:
     # This value is used by the sidecar-tekton-log-results container and can be tuned for performance or test scenarios.
     # Example values: "100ms", "500ms", "1s"
     default-sidecar-log-polling-interval: "100ms"
+
+    # default-step-ref-concurrency-limit specifies the concurrency limit for resolving step references.
+    # This setting controls the maximum number of concurrent goroutines used to resolve
+    # step references (`step.ref` fields) simultaneously. This limit acts as a throttle
+    # to prevent overwhelming remote servers (e.g., git providers, OCI registries) or
+    # the Kubernetes API server, especially when a TaskRun contains many steps that
+    # reference StepActions.
+    default-step-ref-concurrency-limit: "5"

docs/additional-configs.md

Lines changed: 28 additions & 0 deletions
@@ -34,6 +34,7 @@ installation.
 - [TaskRuns with `imagePullBackOff` Timeout](#taskruns-with-imagepullbackoff-timeout)
 - [Disabling Inline Spec in TaskRun and PipelineRun](#disabling-inline-spec-in-taskrun-and-pipelinerun)
 - [Exponential Backoff for TaskRun and CustomRun Creation](#exponential-backoff-for-taskrun-and-customrun-creation)
+- [Limiting Step reference concurrency resolution](#limiting-step-reference-concurrency-resolution)
 - [Next steps](#next-steps)
 
 
@@ -782,6 +783,33 @@ If `enable-wait-exponential-backoff` is not set or is set to `"false"`, the cont
 
 **Note:** This feature is especially useful in clusters where webhook services (such as Kyverno, OPA, or custom admission controllers) may be temporarily unavailable or slow to respond.
 
+## Limiting Step reference concurrency resolution
+
+You can control the maximum number of concurrent goroutines that the Tekton controller uses to resolve steps referencing a `StepAction` via the `step.ref` field.
+
+When a `TaskRun` is processed, any step that uses a `ref` to a remote `StepAction` (e.g., one stored in a git repository or an OCI registry) triggers a fetch request. If a `Task` contains many such steps, the controller will attempt to resolve them all in parallel. This can lead to a "thundering herd" problem, potentially overwhelming remote servers, hitting API rate limits, saturating network resources, or placing excessive load on the Kubernetes API server and the Tekton controller itself.
+
+To mitigate this, Tekton Pipelines includes a configurable concurrency limit. By default, a sensible limit is already in place to ensure stability.
+
+#### Default Behavior
+
+If the `default-step-ref-concurrency-limit` key is not set in the `config-defaults` ConfigMap, Tekton Pipelines defaults to a concurrency limit of **5**. This provides a safe, built-in throttle without requiring any initial configuration.
+
+#### Overriding the Default
+
+You can override this default to better suit your environment's capacity (e.g., a high-capacity, self-hosted git server might allow for a higher limit). To change the limit, set the `default-step-ref-concurrency-limit` key in your `config-defaults` `ConfigMap`.
+
+**Example**: To increase the concurrency limit to 20:
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: config-defaults
+  namespace: tekton-pipelines
+data:
+  default-step-ref-concurrency-limit: "20"
+```
+
 ---
 
 ## Next steps
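To illustrate the throttle that the new documentation section describes, here is a small sketch that runs simulated fetches under a limit and records the highest number observed running at once. The function name `peakInFlight`, the 5 ms sleep standing in for a remote fetch, and the semaphore approach are all assumptions made for illustration; they are not taken from the Tekton controller.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// peakInFlight runs n simulated fetches with at most `limit` running at
// once and returns the highest number observed running simultaneously.
func peakInFlight(n, limit int) int64 {
	if limit < 1 {
		limit = 1 // assumption: treat non-positive limits as 1
	}
	var inFlight, peak int64
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		sem <- struct{}{} // blocks once `limit` fetches are running
		go func() {
			defer wg.Done()
			defer func() { <-sem }()

			cur := atomic.AddInt64(&inFlight, 1)
			// Record a new peak if this goroutine observed one.
			for {
				old := atomic.LoadInt64(&peak)
				if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
					break
				}
			}
			time.Sleep(5 * time.Millisecond) // stand-in for a remote fetch
			atomic.AddInt64(&inFlight, -1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	// With 20 fetches and a limit of 5, the observed peak never exceeds 5.
	fmt.Println("peak in flight:", peakInFlight(20, 5))
}
```

Raising the limit shortens total resolution time when the remote side can absorb the load, at the cost of a larger burst of simultaneous requests.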

pkg/apis/config/default.go

Lines changed: 15 additions & 0 deletions
@@ -56,6 +56,9 @@ const (
 
 	DefaultSidecarLogPollingInterval = 100 * time.Millisecond
 
+	// DefaultStepRefConcurrencyLimit is the default concurrency limit for resolving step references.
+	DefaultStepRefConcurrencyLimit = 5
+
 	defaultTimeoutMinutesKey = "default-timeout-minutes"
 	defaultServiceAccountKey = "default-service-account"
 	defaultManagedByLabelValueKey = "default-managed-by-label-value"
@@ -70,6 +73,7 @@ const (
 	defaultImagePullBackOffTimeout = "default-imagepullbackoff-timeout"
 	defaultMaximumResolutionTimeout = "default-maximum-resolution-timeout"
 	defaultSidecarLogPollingIntervalKey = "default-sidecar-log-polling-interval"
+	DefaultStepRefConcurrencyLimitKey = "default-step-ref-concurrency-limit"
 )
 
 // DefaultConfig holds all the default configurations for the config.
@@ -95,6 +99,7 @@ type Defaults struct {
 	// This value is loaded from the 'sidecar-log-polling-interval' key in the config-defaults ConfigMap.
 	// It is used to control the responsiveness and resource usage of the sidecar in both production and test environments.
 	DefaultSidecarLogPollingInterval time.Duration
+	DefaultStepRefConcurrencyLimit int
 }
 
 // GetDefaultsConfigName returns the name of the configmap containing all
@@ -128,6 +133,7 @@ func (cfg *Defaults) Equals(other *Defaults) bool {
 		other.DefaultImagePullBackOffTimeout == cfg.DefaultImagePullBackOffTimeout &&
 		other.DefaultMaximumResolutionTimeout == cfg.DefaultMaximumResolutionTimeout &&
 		other.DefaultSidecarLogPollingInterval == cfg.DefaultSidecarLogPollingInterval &&
+		other.DefaultStepRefConcurrencyLimit == cfg.DefaultStepRefConcurrencyLimit &&
 		reflect.DeepEqual(other.DefaultForbiddenEnv, cfg.DefaultForbiddenEnv)
 }
 
@@ -143,6 +149,7 @@ func NewDefaultsFromMap(cfgMap map[string]string) (*Defaults, error) {
 		DefaultImagePullBackOffTimeout: DefaultImagePullBackOffTimeout,
 		DefaultMaximumResolutionTimeout: DefaultMaximumResolutionTimeout,
 		DefaultSidecarLogPollingInterval: DefaultSidecarLogPollingInterval,
+		DefaultStepRefConcurrencyLimit: DefaultStepRefConcurrencyLimit,
 	}
 
 	if defaultTimeoutMin, ok := cfgMap[defaultTimeoutMinutesKey]; ok {
@@ -237,6 +244,14 @@ func NewDefaultsFromMap(cfgMap map[string]string) (*Defaults, error) {
 		tc.DefaultSidecarLogPollingInterval = interval
 	}
 
+	if DefaultStepRefConcurrencyLimit, ok := cfgMap[DefaultStepRefConcurrencyLimitKey]; ok {
+		stepRefConcurrencyLimit, err := strconv.ParseInt(DefaultStepRefConcurrencyLimit, 10, 0)
+		if err != nil {
+			return nil, fmt.Errorf("failed parsing default config %q", DefaultStepRefConcurrencyLimitKey)
+		}
+		tc.DefaultStepRefConcurrencyLimit = int(stepRefConcurrencyLimit)
+	}
+
 	return &tc, nil
 }
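The parsing added in the last hunk can be tried in isolation with a sketch like the one below. It is not the code from the commit: the function name `stepRefConcurrencyLimit` and the lowercase constants are hypothetical, the rejection of values below 1 is an assumption that the diff does not make, and the underlying parse error is wrapped with `%w` here purely for illustration. `strconv.Atoi` is used as the equivalent of `strconv.ParseInt(s, 10, 0)` followed by a conversion to `int`.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	// defaultLimit mirrors the default of 5 stated in the diff.
	defaultLimit = 5
	// limitKey is the ConfigMap key introduced by the commit.
	limitKey = "default-step-ref-concurrency-limit"
)

// stepRefConcurrencyLimit is a hypothetical helper that reads the limit
// from ConfigMap data, falling back to the default when the key is absent.
func stepRefConcurrencyLimit(cfg map[string]string) (int, error) {
	raw, ok := cfg[limitKey]
	if !ok {
		return defaultLimit, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("failed parsing default config %q: %w", limitKey, err)
	}
	// Assumption, not in the commit: reject zero and negative limits.
	if n < 1 {
		return 0, fmt.Errorf("%q must be a positive integer, got %d", limitKey, n)
	}
	return n, nil
}

func main() {
	for _, cfg := range []map[string]string{
		{},               // key absent: default applies
		{limitKey: "10"}, // valid override
		{limitKey: "abc"}, // invalid value: error
	} {
		n, err := stepRefConcurrencyLimit(cfg)
		fmt.Println(n, err)
	}
}
```

The three inputs correspond to the empty, valid, and invalid ConfigMap cases that the commit's test data exercises.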

pkg/apis/config/default_test.go

Lines changed: 46 additions & 0 deletions
@@ -47,6 +47,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: time.Duration(5) * time.Second,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 			fileName: config.GetDefaultsConfigName(),
 		},
@@ -69,6 +70,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: 0,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 			fileName: "config-defaults-with-pod-template",
 		},
@@ -94,6 +96,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: 0,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 		},
 		{
@@ -108,6 +111,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: 0,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 		},
 		{
@@ -125,6 +129,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: 0,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 		},
 		{
@@ -139,6 +144,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: time.Duration(15) * time.Second,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 		},
 		{
@@ -153,6 +159,7 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 				DefaultImagePullBackOffTimeout: 0,
 				DefaultMaximumResolutionTimeout: 1 * time.Minute,
 				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+				DefaultStepRefConcurrencyLimit: 5,
 			},
 		},
 		{
@@ -203,6 +210,25 @@ func TestNewDefaultsFromConfigMap(t *testing.T) {
 					},
 					"test": {},
 				},
+				DefaultStepRefConcurrencyLimit: 5,
+			},
+		},
+		{
+			expectedError: true,
+			fileName: "config-defaults-step-ref-concurrency-limit-err",
+		},
+		{
+			expectedError: false,
+			fileName: "config-defaults-step-ref-concurrency-limit",
+			expectedConfig: &config.Defaults{
+				DefaultStepRefConcurrencyLimit: 10,
+				DefaultTimeoutMinutes: 60,
+				DefaultServiceAccount: "default",
+				DefaultManagedByLabelValue: config.DefaultManagedByLabelValue,
+				DefaultMaxMatrixCombinationsCount: 256,
+				DefaultImagePullBackOffTimeout: 0,
+				DefaultMaximumResolutionTimeout: 1 * time.Minute,
+				DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
 			},
 		},
 	}
@@ -228,6 +254,7 @@ func TestNewDefaultsFromEmptyConfigMap(t *testing.T) {
 		DefaultImagePullBackOffTimeout: 0,
 		DefaultMaximumResolutionTimeout: 1 * time.Minute,
 		DefaultSidecarLogPollingInterval: 100 * time.Millisecond,
+		DefaultStepRefConcurrencyLimit: 5,
 	}
 	verifyConfigFileWithExpectedConfig(t, DefaultsConfigEmptyName, expectedConfig)
 }
@@ -414,6 +441,25 @@ func TestEquals(t *testing.T) {
 			},
 			expected: true,
 		},
+		{
+			name: "different default step ref concurrency limit",
+			left: &config.Defaults{
+				DefaultStepRefConcurrencyLimit: 5,
+			},
+			right: &config.Defaults{
+				DefaultStepRefConcurrencyLimit: 10,
+			},
+			expected: false,
+		}, {
+			name: "same default step ref concurrency limit",
+			left: &config.Defaults{
+				DefaultStepRefConcurrencyLimit: 5,
+			},
+			right: &config.Defaults{
+				DefaultStepRefConcurrencyLimit: 5,
+			},
+			expected: true,
+		},
 	}
 
 	for _, tc := range testCases {
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# Copyright 2025 The Tekton Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: config-defaults
+  namespace: tekton-pipelines
+data:
+  default-step-ref-concurrency-limit: "abc"
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# Copyright 2025 The Tekton Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: config-defaults
+  namespace: tekton-pipelines
+data:
+  default-step-ref-concurrency-limit: "10"

pkg/apis/config/testdata/config-defaults.yaml

Lines changed: 2 additions & 1 deletion
@@ -22,4 +22,5 @@ data:
   default-service-account: "tekton"
   default-managed-by-label-value: "something-else"
   default-resolver-type: "git"
-  default-imagepullbackoff-timeout: "5s"
+  default-imagepullbackoff-timeout: "5s"
+  default-step-ref-concurrency-limit: "5"
