Skip to content

fix: preserve pending sync operations during managed status updates#842

Open
drewbailey wants to merge 3 commits intoargoproj-labs:mainfrom
drewbailey:db--preserve-pending-sync-op
Open

fix: preserve pending sync operations during managed status updates#842
drewbailey wants to merge 3 commits intoargoproj-labs:mainfrom
drewbailey:db--preserve-pending-sync-op

Conversation

@drewbailey
Copy link
Copy Markdown
Collaborator

@drewbailey drewbailey commented Mar 26, 2026

Managed-agent status updates can arrive on the principal with a nil .operation even while a principal-initiated sync request is still pending. The principal UpdateStatus path was previously copying incoming.Operation verbatim, which meant an early status update could clear the sync operation before the Argo CD application controller ever observed it.

That race leaves newly created applications stuck OutOfSync/Missing with no .operation or .status.operationState on either side, even though the deploy flow requested a sync.

Fix this by reusing operationToUse() in the principal UpdateStatus path so an existing operation is preserved when the incoming status payload does not carry one. The operation is still updated when the agent explicitly reports one, and terminating operations keep their existing behavior.

Also add a regression test covering the nil-operation status update case so this remains safe to cherry-pick independently.

Fixes #841

How to test changes / Special notes to the reviewer:

Checklist

  • Documentation update is required by this PR (and has been updated) OR no documentation update is required.

Summary by CodeRabbit

  • Bug Fixes

    • Preserve existing application operations when incoming status updates omit an operation, preventing unintended clearing of principal-initiated operations. When an incoming status indicates a completed operation, the operation is cleared. Status updates also record a last-updated annotation to make operation state stable and traceable.
  • Tests

    • Added coverage verifying preservation/clearing of operations and presence of the last-updated annotation after status updates.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

UpdateStatus now preserves an existing Application.Operation when an incoming managed status update omits .operation, selects the operation via statusOperationToUse(existing, incoming) for patch target and assignment, and adds/upserts the argocd-agent.argoproj.io/last-updated annotation on status updates.

Changes

Cohort / File(s) Summary
Core Implementation
internal/manager/application/application.go
Compute statusOperationToUse(existing, incoming) and assign it to existing.Operation and the patch target so an omitted incoming .operation does not clear an existing operation; add/replace argocd-agent.argoproj.io~1last-updated under /metadata/annotations (create annotations map if nil).
Test Coverage
internal/manager/application/application_test.go
Add subtests verifying that when incoming status omits Operation, the existing Operation.InitiatedBy.Username is preserved, that a terminal incoming operation clears the operation, and that LastUpdatedAnnotation is set on the returned application.

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent (managed)
    participant Principal as Principal.Manager
    participant Controller as ArgoCD ApplicationController

    Agent->>Principal: send status update (incoming.Operation may be nil)
    Principal->>Principal: read existing.Application and incoming
    Principal->>Principal: compute statusOperationToUse(existing, incoming)
    alt incoming.Operation is nil
        Principal-->>Principal: preserve existing.Operation
    else incoming.Operation present
        Principal-->>Principal: use incoming.Operation
    end
    Principal->>Principal: add/replace last-updated annotation in patch
    Principal->>Controller: apply patched status (operation retained or updated)
    Controller->>Principal: observe/advance operation state
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • chetan-rns
  • jgwest
  • jannfis
  • mikeshng

Poem

🐇
I nibbled at patches with nimble paw,
Held operations safe from an accidental thaw.
When agents whisper "nil" in the night,
I tuck intent in place till controllers light.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: preserve pending sync operations during managed status updates' accurately summarizes the main change, which prevents nil-operation status updates from clearing pending sync operations.
Linked Issues check ✅ Passed The PR implementation preserves existing operations when incoming status lacks one, honors explicit updates, maintains termination behavior, and adds regression tests—fully addressing all objectives from issue #841.
Out of Scope Changes check ✅ Passed All changes are scoped to operation preservation logic in UpdateStatus and related test coverage, directly addressing the race condition described in issue #841 without introducing unrelated modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Managed-agent status updates can arrive on the principal with a nil .operation even while a principal-initiated sync request is still pending. The principal UpdateStatus path was previously copying incoming.Operation verbatim, which meant an early status update could clear the sync operation before the Argo CD application controller ever observed it.

That race leaves newly created applications stuck OutOfSync/Missing with no .operation or .status.operationState on either side, even though the deploy flow requested a sync.

Fix this by reusing operationToUse() in the principal UpdateStatus path so an existing operation is preserved when the incoming status payload does not carry one. The operation is still updated when the agent explicitly reports one, and terminating operations keep their existing behavior.

Also add a regression test covering the nil-operation status update case so this remains safe to cherry-pick independently.

Signed-off-by: Drew Bailey <drew.bailey@airbnb.com>
@drewbailey drewbailey force-pushed the db--preserve-pending-sync-op branch from cfd7ef2 to ed1efbe Compare March 26, 2026 18:41
stampLastUpdated() wrote to incoming.Annotations but the patchFn in
UpdateStatus only patched Status and Operation fields, so the annotation
was never persisted when SupportsPatch() returned true.

Add an explicit patch operation for LastUpdatedAnnotation, handling the
nil-annotations edge case by creating the map if needed.

Signed-off-by: Drew Bailey <drew.bailey@airbnb.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 68.75000% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.34%. Comparing base (7e2d8ed) to head (0ce5b3b).
⚠️ Report is 27 commits behind head on main.

Files with missing lines Patch % Lines
internal/manager/application/application.go 68.75% 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #842      +/-   ##
==========================================
+ Coverage   45.28%   46.34%   +1.06%     
==========================================
  Files         118      122       +4     
  Lines       16631    17422     +791     
==========================================
+ Hits         7532     8075     +543     
- Misses       8404     8604     +200     
- Partials      695      743      +48     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

The preserve-pending-sync-op change kept principal spec.operation whenever a managed status update arrived without a top-level operation payload. That fixed the in-flight sync race, but it also left stale operations behind after the agent reported a terminal operationState.\n\nNarrow the principal UpdateStatus path to use a status-specific helper. Preserve the existing operation when the agent still has no operation state or is still in flight, but clear the top-level operation when the incoming status reports a completed phase.\n\nAdd a regression test covering the terminal Succeeded case so the principal app drops spec.operation once the managed agent finishes the sync.

Signed-off-by: Drew Bailey <drew.bailey@airbnb.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/manager/application/application.go (1)

513-519: Annotation patch ordering has a latent edge-case when existing.Annotations is nil and a refresh annotation must be added.

The new last-updated stamping is correct by itself. However, the refresh annotation "add" operation (line 510) is appended before the annotations-map creation (line 516). If existing.Annotations == nil and incomingRefresh is true, the patch will attempt to add a key to a non-existent map, causing the patch to fail.

This is a pre-existing concern (the refresh handling predates this PR), but now that the code explicitly creates the annotations map when nil, it's a good opportunity to consolidate:

♻️ Optional: consolidate annotation-map creation to avoid ordering issues
-		// If the incoming app doesn't have the refresh annotation set, we need
-		// to make sure that we remove it from the version stored on the server
-		// as well.
-		if existingRefresh && !incomingRefresh {
-			patch = append(patch, jsondiff.Operation{Type: "remove", Path: "/metadata/annotations/argocd.argoproj.io~1refresh"})
-		} else if !existingRefresh && incomingRefresh {
-			patch = append(patch, jsondiff.Operation{Type: "add", Path: "/metadata/annotations/argocd.argoproj.io~1refresh", Value: refresh})
-		}
-
-		// Stamp the last-updated annotation. If the existing app has no
-		// annotations map yet we must create it; otherwise add/replace the key.
-		if existing.Annotations == nil {
-			patch = append(patch, jsondiff.Operation{Type: "add", Path: "/metadata/annotations", Value: map[string]string{LastUpdatedAnnotation: incoming.Annotations[LastUpdatedAnnotation]}})
-		} else {
-			patch = append(patch, jsondiff.Operation{Type: "add", Path: "/metadata/annotations/argocd-agent.argoproj.io~1last-updated", Value: incoming.Annotations[LastUpdatedAnnotation]})
-		}
+		// Stamp the last-updated annotation. If the existing app has no
+		// annotations map yet, create it first (including any refresh value).
+		if existing.Annotations == nil {
+			newAnnotations := map[string]string{
+				LastUpdatedAnnotation: incoming.Annotations[LastUpdatedAnnotation],
+			}
+			if incomingRefresh {
+				newAnnotations["argocd.argoproj.io/refresh"] = refresh
+			}
+			patch = append(patch, jsondiff.Operation{Type: "add", Path: "/metadata/annotations", Value: newAnnotations})
+		} else {
+			// Handle refresh annotation on existing map
+			if existingRefresh && !incomingRefresh {
+				patch = append(patch, jsondiff.Operation{Type: "remove", Path: "/metadata/annotations/argocd.argoproj.io~1refresh"})
+			} else if !existingRefresh && incomingRefresh {
+				patch = append(patch, jsondiff.Operation{Type: "add", Path: "/metadata/annotations/argocd.argoproj.io~1refresh", Value: refresh})
+			}
+			patch = append(patch, jsondiff.Operation{Type: "add", Path: "/metadata/annotations/argocd-agent.argoproj.io~1last-updated", Value: incoming.Annotations[LastUpdatedAnnotation]})
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/manager/application/application.go` around lines 513 - 519, The
patch appends the refresh "add" operation (incomingRefresh) before creating the
annotations map when existing.Annotations == nil, causing a failure; update the
logic in application.go so that when existing.Annotations is nil you first
append a single "add" operation that creates the /metadata/annotations map and
includes both LastUpdatedAnnotation and, if incomingRefresh is true, the
argocd-agent.argoproj.io~1last-updated key (use incoming.Annotations values),
otherwise if the map exists keep the current behavior of adding the specific
key; reference the symbols existing.Annotations, incoming.Annotations,
LastUpdatedAnnotation, incomingRefresh, patch, and jsondiff.Operation to locate
and reorder/consolidate the operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/manager/application/application.go`:
- Around line 513-519: The patch appends the refresh "add" operation
(incomingRefresh) before creating the annotations map when existing.Annotations
== nil, causing a failure; update the logic in application.go so that when
existing.Annotations is nil you first append a single "add" operation that
creates the /metadata/annotations map and includes both LastUpdatedAnnotation
and, if incomingRefresh is true, the argocd-agent.argoproj.io~1last-updated key
(use incoming.Annotations values), otherwise if the map exists keep the current
behavior of adding the specific key; reference the symbols existing.Annotations,
incoming.Annotations, LastUpdatedAnnotation, incomingRefresh, patch, and
jsondiff.Operation to locate and reorder/consolidate the operations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 32e65a22-ee68-4379-b822-825ff0d7f86c

📥 Commits

Reviewing files that changed from the base of the PR and between fc18e23 and 0ce5b3b.

📒 Files selected for processing (2)
  • internal/manager/application/application.go
  • internal/manager/application/application_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/manager/application/application_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Managed status updates on principal can clear a pending sync operation before application controller observes it

2 participants