lucasmrod
Member

@lucasmrod lucasmrod commented Oct 13, 2025

Resolves #34055

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.

  • Input data is properly validated, SELECT * is avoided, SQL injection is prevented (using placeholders for values in statements)

Testing

  • QA'd all new/changed functionality manually

Summary by CodeRabbit

  • Refactor
    • Optimized software title reconciliation used during vulnerability processing, improving scan performance and reducing database load. More efficient cleanup of orphaned titles and updates to title names.
  • Tests
    • Corrected a test name typo for clarity.
    • Streamlined MDM integration test by removing redundant title recreation steps.


codecov bot commented Oct 13, 2025

Codecov Report

❌ Patch coverage is 77.77778% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.21%. Comparing base (742631a) to head (d3fbe7e).
⚠️ Report is 35 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| server/datastore/mysql/software.go | 85.00% | 2 Missing and 1 partial ⚠️ |
| cmd/fleet/cron.go | 50.00% | 1 Missing ⚠️ |
| cmd/fleet/serve.go | 0.00% | 1 Missing ⚠️ |
| cmd/fleet/vuln_process.go | 66.66% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #34146      +/-   ##
==========================================
+ Coverage   63.81%   64.21%   +0.40%     
==========================================
  Files        1876     2058     +182     
  Lines      185510   206819   +21309     
  Branches     6767     6767              
==========================================
+ Hits       118383   132814   +14431     
- Misses      57465    63577    +6112     
- Partials     9662    10428     +766     
Flag Coverage Δ
backend 65.31% <77.77%> (+0.31%) ⬆️

Comment on lines -2284 to -2326
INSERT INTO software_titles (name, source, extension_for, bundle_identifier)
SELECT
name,
source,
extension_for,
bundle_identifier
FROM (
SELECT DISTINCT
name,
source,
extension_for,
bundle_identifier
FROM
software s
WHERE
NOT EXISTS (
SELECT 1 FROM software_titles st
WHERE s.bundle_identifier = st.bundle_identifier AND
IF(s.source IN ('apps', 'ios_apps', 'ipados_apps'), s.source = st.source, 1)
)
AND COALESCE(bundle_identifier, '') != ''
UNION ALL
SELECT DISTINCT
name,
source,
extension_for,
NULL as bundle_identifier
FROM
software s
WHERE
NOT EXISTS (
SELECT 1 FROM software_titles st
WHERE (s.name, s.source, s.extension_for) = (st.name, st.source, st.extension_for)
)
AND COALESCE(s.bundle_identifier, '') = ''
) as combined_results
ON DUPLICATE KEY UPDATE
software_titles.name = software_titles.name,
software_titles.source = software_titles.source,
software_titles.extension_for = software_titles.extension_for,
software_titles.bundle_identifier = software_titles.bundle_identifier
Member Author

This is not necessary with recent changes in software ingestion, which pre-inserts software_titles before inserting software (and makes sure all software entries have a title_id set).

Contributor

makes sense!

Comment on lines 2337 to 2359
UPDATE software s
JOIN software_titles st
ON COALESCE(s.bundle_identifier, '') = '' AND s.name = st.name AND s.source = st.source AND s.extension_for = st.extension_for
SET s.title_id = st.id
WHERE (s.title_id IS NULL OR s.title_id != st.id)
AND COALESCE(s.bundle_identifier, '') = '';
`
func (ds *Datastore) cleanupOrphanedSoftwareTitles(ctx context.Context) error {
var n int64
defer func(start time.Time) {
level.Debug(ds.logger).Log(
"msg", "cleanup orphaned software titles",
"rows_affected", n,
"took", time.Since(start),
)
}(time.Now())

res, err = tx.ExecContext(ctx, updateSoftwareWithoutIdentifierStmt)
if err != nil {
return ctxerr.Wrap(ctx, err, "update software title_id without bundle identifier")
}
n, _ = res.RowsAffected()
level.Debug(ds.logger).Log("msg", "update software title_id without bundle identifier", "rows_affected", n)

updateSoftwareWithIdentifierStmt := `
UPDATE software s
JOIN software_titles st
ON s.bundle_identifier = st.bundle_identifier AND
IF(s.source IN ('apps', 'ios_apps', 'ipados_apps'), s.source = st.source, 1)
SET s.title_id = st.id
WHERE s.title_id IS NULL
OR s.title_id != st.id;
Member Author

This is not necessary with recent changes in software ingestion, which pre-inserts software_titles before inserting software (and makes sure all software entries have a title_id set).

@lucasmrod lucasmrod marked this pull request as ready for review October 13, 2025 11:39
@lucasmrod lucasmrod requested a review from a team as a code owner October 13, 2025 11:39
@lucasmrod
Member Author

@coderabbitai review

Contributor

coderabbitai bot commented Oct 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

coderabbitai bot commented Oct 13, 2025

Walkthrough

Refactors software title reconciliation to a two-step process (cleanup orphaned titles, then update names), adjusts vulnerability cron function factory to not capture context, and updates the usage statistics schedule signature and its call site. Tests updated to reflect new reconciliation and fix a test name typo. Adds a change note.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changelog note: changes/34055-optimize-reconcile-software-titles | Adds note about optimizing software title reconciliation in the vulnerabilities cron job. |
| Vuln scheduling and signatures: cmd/fleet/cron.go, cmd/fleet/serve.go, cmd/fleet/vuln_process.go | Removes context parameter from getVulnFuncs and updates call sites; newUsageStatisticsSchedule signature drops license parameter; corresponding call in serve.go updated. |
| Datastore: software reconciliation refactor: server/datastore/mysql/software.go | Replaces in-transaction upsert/updates with two steps: cleanupOrphanedSoftwareTitles (batched deletes) and updateSoftwareTitleNames (joined update using latest software per bundle identifier). Updates ReconcileSoftwareTitles to call these. |
| Tests updates: server/datastore/mysql/software_test.go, server/service/integration_mdm_test.go | Renames a test to fix a typo; removes a sequence that dropped and re-synced software_titles, adjusting assertions accordingly. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Cron as Cron Scheduler
  participant VF as getVulnFuncs
  participant VJ as VulnFunc(ctx)
  participant VP as cronVulnerabilities
  participant DS as Datastore (MySQL)

  Cron->>VF: Build vuln funcs (no ctx)
  Cron->>VJ: Invoke with ctx
  VJ->>VP: Run vulnerability processing
  VP->>DS: ReconcileSoftwareTitles(ctx)
  activate DS
  DS->>DS: cleanupOrphanedSoftwareTitles(ctx)
  DS->>DS: updateSoftwareTitleNames(ctx)
  deactivate DS
  VP-->>Cron: Complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • getvictor
  • jahzielv
  • sgress454

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes Check | ⚠️ Warning | The pull request includes modifications to the usage statistics schedule function signature and its invocation, which are unrelated to the software title reconciliation optimization described in issue #34055. These changes introduce API-breaking signature removals for newUsageStatisticsSchedule and its callers that lie outside the scope of the vulnerability processing fix. Including these refactors could complicate review and risk unintended side effects in telemetry scheduling. | Extract the usage statistics schedule signature and invocation modifications into a separate pull request dedicated to refactoring telemetry scheduling. This will maintain the current PR’s focus on vulnerability job optimizations and avoid coupling unrelated API changes. If needed, reintroduce those changes once the reconciliation improvements are merged cleanly. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Description Check | ⚠️ Warning | The pull request description does not adhere to the repository’s template: it uses a standalone “Resolves #34055” line instead of the required “Related issue: Resolves #34055” heading, omits the “# Checklist for submitter” header, and does not include or explicitly remove the other checklist and section headings as specified in the template. | Please update the description to match the template exactly by adding a “Related issue: Resolves #34055” section, including the “# Checklist for submitter” header, and ensuring all relevant checklist items and section headings (or their intentional removals) appear as specified. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title concisely describes the main change, focusing on optimizing the software title reconciliation process within the vulnerabilities job. It highlights the primary area of improvement without including unrelated details or noise. A reader scanning the history will understand that the pull request aims to improve performance of the vulnerability cron. |
| Linked Issues Check | ✅ Passed | The changes to the ReconcileSoftwareTitles function directly address the deadlock-causing queries identified in issue #34055 by replacing in-transaction upserts with batched cleanup and update steps. Both problematic UPDATE operations are eliminated or refactored into safer batch operations, preserving context usage within individual vulnerability tasks. The pull request meets the primary objectives of optimizing reconciliation to prevent long-running locks and deadlocks under load. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 742631a and 8b83f17.

📒 Files selected for processing (7)
  • changes/34055-optimize-reconcile-software-titles (1 hunks)
  • cmd/fleet/cron.go (2 hunks)
  • cmd/fleet/serve.go (1 hunks)
  • cmd/fleet/vuln_process.go (2 hunks)
  • server/datastore/mysql/software.go (1 hunks)
  • server/datastore/mysql/software_test.go (3 hunks)
  • server/service/integration_mdm_test.go (0 hunks)
💤 Files with no reviewable changes (1)
  • server/service/integration_mdm_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

⚙️ CodeRabbit configuration file

When reviewing SQL queries that are added or modified, ensure that appropriate filtering criteria are applied—especially when a query is intended to return data for a specific entity (e.g., a single host). Check for missing WHERE clauses or incorrect filtering that could lead to incorrect or non-deterministic results (e.g., returning the first row instead of the correct one). Flag any queries that may return unintended results due to lack of precise scoping.

Files:

  • cmd/fleet/serve.go
  • cmd/fleet/cron.go
  • cmd/fleet/vuln_process.go
  • server/datastore/mysql/software.go
  • server/datastore/mysql/software_test.go
🔇 Additional comments (5)
cmd/fleet/serve.go (1)

982-982: LGTM! Call site correctly updated.

The license parameter removal aligns with the updated signature in cmd/fleet/cron.go (line 1146), where the license is now retrieved from the context instead of being passed as a parameter.

cmd/fleet/vuln_process.go (2)

115-115: LGTM! Invocation updated to match new signature.

The context parameter removal aligns with the factory signature change at line 162. The VulnFunc closures still receive context when invoked, preserving functionality.


162-203: LGTM! Clean refactoring to decouple factory from context.

Removing the context parameter from the factory function signature is a good design choice. The VulnFunc closures (lines 166, 172, 178, 184, 190, 196) still receive context when invoked, so there's no loss of functionality. This decouples the factory from requiring an active context while preserving context usage inside the vulnerability functions.

cmd/fleet/cron.go (2)

68-68: LGTM! Invocation updated to match new signature.

The context parameter removal aligns with the factory signature change in cmd/fleet/vuln_process.go (line 162). All call sites have been consistently updated.


1146-1165: LGTM! Removed redundant license parameter.

The license parameter removal is appropriate since the license is available in the context and retrieved where needed (line 1174 in trySendStatistics). The call site in cmd/fleet/serve.go (line 982) has been correctly updated to match this new signature.

Contributor

@ksykulev ksykulev left a comment

I think this makes sense. 👍 The only thing is that we need to merge it after #34097, because that PR stops adding more software records with name_source=bundle_v4.67.

@ksykulev ksykulev merged commit 15518d2 into main Oct 14, 2025
42 checks passed
@ksykulev ksykulev deleted the 34055-optimize-reconcile-software-titles branch October 14, 2025 22:36
ksykulev pushed a commit that referenced this pull request Oct 15, 2025
…s job

Back ported: #34146 to be compatible with the 4.75.0 release.
Original ticket #34055
ksykulev added a commit that referenced this pull request Oct 15, 2025
…s job

Back ported: #34146 to be compatible with the 4.75.0 release.
Original ticket #34055

---------

Co-authored-by: Lucas Manuel Rodriguez <[email protected]>

Development

Successfully merging this pull request may close these issues.

Vulnerability processing blocks software ingestion

2 participants