WIP: Feature/121 apply blueprint continuously #124

alexander-dammeier · 2025-08-14T05:35:19Z

No description provided.

This way, the use cases don't need to load the blueprint every time. If there is a concurrent update, we handle the conflict when we try to update the blueprint ourselves. If we use the same blueprint the whole time, we don't need to recalculate the state diff. We only have to do this if we need to reconcile the CR again, e.g. for non-blocking health checks.
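The conflict handling described above can be illustrated with optimistic locking. This is only a minimal sketch and not the code from this PR: `Blueprint`, `Repo`, `ErrConflict` and the `Note` field are hypothetical names, and the integer `ResourceVersion` merely mimics how a Kubernetes resource version lets a write based on a stale copy be rejected.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict signals that the stored blueprint changed after it was loaded.
var ErrConflict = errors.New("blueprint was modified concurrently")

// Blueprint is a hypothetical, heavily simplified stand-in for the blueprint CR.
type Blueprint struct {
	ID              string
	ResourceVersion int
	Note            string
}

// Repo is a hypothetical in-memory repository with optimistic locking.
type Repo struct {
	mu     sync.Mutex
	stored map[string]Blueprint
}

func NewRepo() *Repo {
	return &Repo{stored: map[string]Blueprint{}}
}

// Create stores a new blueprint with resource version 1 and returns it.
func (r *Repo) Create(id string) Blueprint {
	r.mu.Lock()
	defer r.mu.Unlock()
	bp := Blueprint{ID: id, ResourceVersion: 1}
	r.stored[id] = bp
	return bp
}

// Update succeeds only if the caller still holds the latest resource version.
func (r *Repo) Update(bp Blueprint) (Blueprint, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	current, ok := r.stored[bp.ID]
	if !ok {
		return Blueprint{}, fmt.Errorf("blueprint %q not found", bp.ID)
	}
	if current.ResourceVersion != bp.ResourceVersion {
		return Blueprint{}, ErrConflict
	}
	bp.ResourceVersion++
	r.stored[bp.ID] = bp
	return bp, nil
}

func main() {
	repo := NewRepo()
	loaded := repo.Create("blueprint-1") // loaded once per reconciliation

	// A concurrent writer updates the same blueprint in the meantime.
	concurrent := loaded
	concurrent.Note = "edited by someone else"
	if _, err := repo.Update(concurrent); err != nil {
		fmt.Println("unexpected:", err)
	}

	// Our own update is based on the stale copy and is rejected.
	loaded.Note = "state diff determined"
	_, err := repo.Update(loaded)
	fmt.Println("conflict detected:", errors.Is(err, ErrConflict))
}
```

In this sketch the use cases can pass the same `Blueprint` value around without reloading it. A conflict only surfaces at write time, where returning the error and letting the next reconciliation start from a fresh copy is one plausible way to handle it.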

In previous versions, the state diff was calculated once and then persisted in the blueprint CR, so we had to write all fields there. Now we calculate the state diff at every reconciliation, which lets us decide which fields we want to publish in the blueprint CR.

The post-processing existed to deactivate the maintenance mode and to mark the blueprint as completed or failed. Now it is only called if all preceding steps were successful, so we no longer need to handle error states there. Therefore, I renamed it to completeBlueprint. The last status phases could also be removed because, after the refactoring of the apply process for dogus and components, they are no longer needed.

pkg/application/blueprintSpecChangeUseCase.go

The blocking health checks were no longer in use. As the operator now works continuously, we should not log as much per reconciliation. It is usually enough to log the errors and the reasons why the blueprint cannot finish yet (often health).

pkg/adapter/kubernetes/componentcr/componentInstallationRepo_test.go

pkg/adapter/reconciler/blueprint_controller.go

pkg/application/applyComponentsUseCase_test.go

pkg/application/applyDogusUseCase_test.go

pkg/application/blueprintSpecChangeUseCase.go

pkg/application/completeBlueprintUseCase_test.go

pkg/application/ecosystemConfigUseCase_test.go

pkg/application/ecosystemHealthUseCase.go

pkg/adapter/doguregistry/cache.go

pkg/domain/blueprintSpec.go

pkg/domain/blueprintSpec_test.go

alexander-dammeier added 30 commits August 5, 2025 09:19

#119 add standard labels to dogus and components

bb8af47

#121 create state diff without loading the blueprint again

aa7bea4

#119 update changelog

a5de572

#119 fix duplicate error if config key is present and absent

ae79469

#121 WIP: do not load blueprint in every use case

2085982

Merge branch 'v2/develop' into feature/121-apply-blueprint-continuously

9973c7b

Merge branch 'feature/121-non-blocking-health-checks' into feature/12…

f346f9e

…1-apply-blueprint-continuously

Merge branch 'feature/121-remove-maintenance-mode' into feature/121-a…

07cf18a

…pply-blueprint-continuously # Conflicts: # pkg/application/blueprintSpecChangeUseCase.go # pkg/bootstrap.go

#121 fix reconciler tests

7fe4543

#121 remove StatusPhaseStaticallyValidated

4e32076

#121 fix tests

82ea608

#121 remove BlueprintSpecStaticallyValidated event

ad52028

Otherwise, we would publish dozens of these events, because we now validate the blueprint at every reconciliation.

#121 remove StatusPhaseValidated status

1e42af1

#121 remove BlueprintSpecValidated event

dcd9618

We have a condition instead. We also want to prevent event duplication over multiple reconciles.
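Why a condition avoids the duplication problem can be sketched as follows. This is a simplified illustration with a hypothetical `Condition` type and `SetCondition` helper, not the operator's actual code: a condition is upserted by type, so repeating the same result on every reconciliation changes nothing, whereas an event would be emitted each time. The apimachinery helper `meta.SetStatusCondition` follows a similar upsert pattern for `metav1.Condition`; whether this PR uses it is not stated here.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a Kubernetes status condition.
type Condition struct {
	Type    string
	Status  bool
	Reason  string
	Message string
}

// SetCondition inserts or updates a condition by type and reports whether
// anything changed. An unchanged condition means no status write is needed.
func SetCondition(conditions []Condition, c Condition) ([]Condition, bool) {
	for i, existing := range conditions {
		if existing.Type != c.Type {
			continue
		}
		if existing == c {
			return conditions, false
		}
		conditions[i] = c
		return conditions, true
	}
	return append(conditions, c), true
}

func main() {
	var conditions []Condition
	writes := 0
	// Three reconciliations produce the same validation result.
	for i := 0; i < 3; i++ {
		valid := Condition{Type: "Valid", Status: true, Reason: "Valid", Message: "blueprint is valid"}
		var changed bool
		conditions, changed = SetCondition(conditions, valid)
		if changed {
			writes++
		}
	}
	fmt.Println("status writes:", writes)
}
```

The condition type and reason strings are made up for the example. The point is only that the number of status changes depends on state transitions, not on how often the operator reconciles.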

#121 remove StatusPhaseInvalid

19a8262

#121 remove StatusPhaseEffectiveBlueprintGenerated

b37cfbf

#121 remove EffectiveBlueprintCalculatedEvent

b97f726

We show the effective blueprint in the status. There could be thousands of these events if we recalculate the effective blueprint on every reconciliation.

#121 remove StatusPhaseStateDiffDetermined

bce1e0c

#121 remove StatusPhaseEcosystemHealthyUpfront

1aed68a

add condition for healthy ecosystem

#121 remove StatusPhaseEcosystemUnhealthyUpfront

7ba9da7

#121 remove blueprint pre-processing

211cee1

We removed the maintenance mode. Therefore, the only thing this use case still did was check the dry run flag.

#121 remove StatusPhaseEcosystemHealthyAfterwards

ba285ae

#121 do not write sensitive data in blueprint CR

9d46832

#121 remove unused censoring functions

0aeff2a

#121 remove StatusPhaseEcosystemUnhealthyAfterwards

7eb4861

If the ecosystem is unhealthy, an error is thrown and another reconciliation gets triggered. This way, we repeat the health check until everything is healthy. If the ecosystem is healthy afterward, the post-processing will always run.
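The retry-by-reconciliation idea can be sketched like this. All names (`reconcileOnce`, `runUntilDone`, `ErrEcosystemUnhealthy`) are hypothetical, and the loop is only a stand-in for the controller's work queue, which in a real operator would re-invoke the reconciler after an error, typically with backoff.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEcosystemUnhealthy is a hypothetical sentinel error for a failed health check.
var ErrEcosystemUnhealthy = errors.New("ecosystem is unhealthy")

// reconcileOnce runs a single non-blocking pass: it checks health exactly once
// and returns an error instead of waiting for the ecosystem to recover.
func reconcileOnce(isHealthy func() bool, complete func()) error {
	if !isHealthy() {
		return fmt.Errorf("cannot complete blueprint yet: %w", ErrEcosystemUnhealthy)
	}
	complete()
	return nil
}

// runUntilDone stands in for the controller's work queue: it calls reconcile
// again while it returns an error, up to maxAttempts times.
func runUntilDone(reconcile func() error, maxAttempts int) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = reconcile(); err == nil {
			return attempt, nil
		}
	}
	return maxAttempts, err
}

func main() {
	checks := 0
	isHealthy := func() bool {
		checks++
		return checks >= 3 // becomes healthy on the third check
	}
	completed := false
	attempts, err := runUntilDone(func() error {
		return reconcileOnce(isHealthy, func() { completed = true })
	}, 5)
	fmt.Println(attempts, completed, err)
}
```

Because each pass returns immediately, the completion step only ever runs in a pass where the health check succeeded, which matches the statement that the post-processing always runs once the ecosystem is healthy.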

#121 check if changes need to be made or early exit

486d591

We need this later to skip some steps while applying the blueprint.

#121 make self upgrade non-blocking

807cd04

We now throw an error to trigger a reconciliation, and it is now safe to simply retry. We can see the exact state of the self-upgrade via the newly generated state diff and the component CR install versions.

#121 remove self upgrade statuses

217e008

alexander-dammeier added 5 commits August 20, 2025 10:36

#121 update changelog

cdcdce0

#121 load and persist conditions via repository

a30adaa

#121 fix event test

10940cf

#121 always write a reason in conditions

85c9620

cesmarvin reviewed Aug 20, 2025

pkg/application/blueprintSpecChangeUseCase.go

meiserloh reviewed Aug 22, 2025

pkg/adapter/kubernetes/componentcr/componentInstallationRepo_test.go

meiserloh reviewed Aug 22, 2025

pkg/adapter/reconciler/blueprint_controller.go

#121 handle DogusApplied conditions after determining StateDiff

a5df1fc

This is very complex because we set this condition both when determining the state diff and while applying dogus. We don't want to overwrite the error message given by applyDogus with the state diff message.

meiserloh reviewed Aug 27, 2025

alexander-dammeier added 3 commits August 27, 2025 16:04

#121 remove retry while loading dogu.jsons

8673e62

We want retries only via reconciliation to keep the code simpler and non-blocking. We can also reduce errors via a cache.

#121 add cache for dogu descriptors

46f390f

We still need to write tests for this. If the operator reconciles very often, this cache helps a lot to reduce network pressure on the registry and can help with network problems.
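A cache of this kind could look roughly like the following. This is a hedged sketch, not the contents of `pkg/adapter/doguregistry/cache.go`: `Descriptor`, `Fetcher`, `CachingRegistry` and `ErrRegistryUnavailable` are hypothetical names, and the sketch assumes that a descriptor for a given name and version never changes, so entries are never evicted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrRegistryUnavailable is a hypothetical error for a failed remote call.
var ErrRegistryUnavailable = errors.New("registry unavailable")

// Descriptor is a hypothetical, minimal stand-in for a parsed dogu.json.
type Descriptor struct {
	Name    string
	Version string
}

// Fetcher loads a descriptor from the remote registry.
type Fetcher func(name, version string) (Descriptor, error)

// CachingRegistry memoizes descriptors by name and version.
// Assumption: a released name/version pair is immutable, so there is no eviction.
type CachingRegistry struct {
	mu    sync.Mutex
	fetch Fetcher
	cache map[string]Descriptor
}

func NewCachingRegistry(fetch Fetcher) *CachingRegistry {
	return &CachingRegistry{fetch: fetch, cache: map[string]Descriptor{}}
}

// Get returns the cached descriptor or fetches it once from the registry.
// The lock is held during the fetch, which serializes lookups but keeps the sketch simple.
func (c *CachingRegistry) Get(name, version string) (Descriptor, error) {
	key := name + ":" + version
	c.mu.Lock()
	defer c.mu.Unlock()
	if d, ok := c.cache[key]; ok {
		return d, nil
	}
	d, err := c.fetch(name, version)
	if err != nil {
		// Errors are not cached, so the next reconciliation retries the fetch.
		return Descriptor{}, err
	}
	c.cache[key] = d
	return d, nil
}

func main() {
	fetches := 0
	registry := NewCachingRegistry(func(name, version string) (Descriptor, error) {
		fetches++
		return Descriptor{Name: name, Version: version}, nil
	})
	// Three reconciliations ask for the same (illustrative) dogu version.
	for i := 0; i < 3; i++ {
		if _, err := registry.Get("official/postgresql", "1.0.0"); err != nil {
			fmt.Println("fetch failed:", err)
		}
	}
	fmt.Println("remote fetches:", fetches)
}
```

Not caching errors fits the decision to retry only via reconciliation: a failed lookup is simply attempted again on the next pass, while descriptors that were already fetched stay available even if the registry is temporarily unreachable.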

#121 fix tests

fd4dd64

cesmarvin reviewed Aug 27, 2025

pkg/adapter/doguregistry/cache.go

meiserloh reviewed Aug 29, 2025

pkg/adapter/doguregistry/cache.go