Skip to content

Conversation

hugocasa
Copy link
Collaborator

@hugocasa hugocasa commented Aug 28, 2025

Important

Enhances job handling by adding a query for root_job retrieval in MiniPulledJob and fixing root job handling for flow jobs.

  • Behavior:
    • Adds new query to retrieve root_job for MiniPulledJob in query-02c945b5f18a56a826721f6884846d79167747742de236ce57f395561685adc0.json.
    • Fixes root job handling for flow jobs by updating compute_root_job_for_flow() in jobs.rs to use COALESCE for determining root_job.
  • Database:
    • Deletes query-b2e4dfaaee713604d0700ea4675ed2d4534e08471a60005bff85b952874c54c2.json and query-fc243af1bc70f04e28c006364d6dfdfc49981568e8c496fb6a163c499c3e4ad1.json.
    • Renames query-4aaab98ebdaa90f1edf49ac96fba6c391c4d0054a618b861464ee37239f1f1e0.json to query-f3f96e066716e61042519a645d487b578bc63792cdb0f7ddaeb82e9771287c22.json and updates it to include root_job.
  • Code Refactoring:
    • Updates get_root_job_id() in common.rs to use root_job directly.
    • Modifies MiniPulledJob struct in jobs.rs to include root_job.
    • Updates various functions across worker.rs, apps.rs, flows.rs, jobs.rs, resources.rs, scripts.rs, trigger_helpers.rs, variables.rs, worker.rs, worker_flow.rs, worker_lockfiles.rs to handle root_job and flow_innermost_root_job.

This description was created by Ellipsis for 5b75652. You can customize this summary. It will automatically update as commits are pushed.

Copy link

cloudflare-workers-and-pages bot commented Aug 28, 2025

Deploying windmill with  Cloudflare Pages  Cloudflare Pages

Latest commit: 5b75652
Status: ✅  Deploy successful!
Preview URL: https://8c13671e.windmill.pages.dev
Branch Preview URL: https://hc-pass-root-job-to-minipull.windmill.pages.dev

View logs

@hugocasa hugocasa changed the title feat(backend): retrieve root_job for MiniPulledJob + fix root job for flow jobs [merge ee first] feat(backend): retrieve root_job for MiniPulledJob + fix root job for flow jobs Aug 28, 2025
@hugocasa hugocasa changed the title [merge ee first] feat(backend): retrieve root_job for MiniPulledJob + fix root job for flow jobs feat(backend): retrieve root_job for MiniPulledJob + fix root job for flow jobs Aug 29, 2025
@hugocasa hugocasa marked this pull request as ready for review September 1, 2025 17:58
Copy link
Contributor

claude bot commented Sep 1, 2025

Claude finished @hugocasa's task —— View job


Code Review: feat(backend): retrieve root_job for MiniPulledJob + fix root job for flow jobs

I've conducted a comprehensive review of this PR. The changes implement a mechanism to properly track and expose root job IDs throughout the job execution system.


Task List:

  • Read repository guidelines and context
  • Review database/SQL changes
  • Review job processing logic changes
  • Review API endpoint modifications
  • Check for security implications
  • Check for performance considerations
  • Identify potential bugs or issues
  • Provide comprehensive feedback and suggestions
  • Create testing/reproduction instructions

🎯 Overview

This PR adds a root_job field to the MiniPulledJob structure and implements logic to properly track the root job ID throughout the job execution pipeline. The main changes include:

  1. Database schema updates to retrieve root_job field
  2. New get_root_job_id() utility function with fallback logic
  3. Environment variable WM_ROOT_JOB_ID exposure to job scripts
  4. API endpoint for retrieving root job IDs

Positive Aspects

Strong Implementation Design:

  • Clean abstraction: The get_root_job_id() function provides a clear API with sensible fallback logic
  • Consistent pattern: Follows established Windmill conventions for job processing
  • Proper optimization: Database storage optimization (line 4696-4704 in jobs.rs) - only stores root_job when it differs from innermost root job, parent job, or job itself

Security & Performance:

  • SQL injection prevention: Uses parameterized queries consistently
  • Performance optimization: Minimal database schema impact with smart storage decisions
  • Encryption key derivation: Proper use of root job ID for encryption contexts

⚠️ Areas of Concern

1. API Endpoint Implementation (Critical)

Location: backend/windmill-api/src/jobs.rs:335-342

async fn compute_root_job_for_flow(db: &DB, w_id: &str, mut job_id: Uuid) -> error::Result<String> {
    // TODO: use `root_job` ?
    loop {
        job_id = match sqlx::query_scalar!(
            "SELECT parent_job FROM v2_job WHERE id = $1 AND workspace_id = $2",
            job_id,
            w_id
        )

Issues:

  • Inefficient implementation: The compute_root_job_for_flow() function still uses a recursive loop approach instead of leveraging the new root_job column
  • TODO comment indicates incomplete migration: The comment suggests the new root_job field should be used here
  • ⚠️ Performance impact: Multiple database queries vs. single lookup

Recommendation: Update this function to use the new root_job field:

async fn compute_root_job_for_flow(db: &DB, w_id: &str, job_id: Uuid) -> error::Result<String> {
    let root_job = sqlx::query_scalar!(
        "SELECT COALESCE(root_job, flow_innermost_root_job, parent_job, id) FROM v2_job WHERE id = $1 AND workspace_id = $2",
        job_id,
        w_id
    )
    .fetch_one(db)
    .await?;
    Ok(root_job.to_string())
}

2. Inconsistent Naming Convention

Location: backend/windmill-queue/src/jobs.rs:2379

flow_innermost_root_job: job.root_job.clone(), // QueuedJob is taken from v2_as_queue, where root_job corresponds to flow_innermost_root_job in v2_job

Issues:

  • ⚠️ Confusing field mapping: root_job from one structure maps to flow_innermost_root_job in another
  • ⚠️ Comment indicates architectural complexity: The relationship between these fields is not intuitive

Recommendation: Consider refactoring to use consistent field names or provide clearer documentation about the mapping rationale.

3. SQL Query Pattern Inconsistency

Location: New SQL query files and existing queries

Issues:

  • ⚠️ Mixed patterns: Some queries now include root_job while others still rely on flow_innermost_root_job
  • ⚠️ Potential data inconsistency: Different code paths may return different "root job" values

🔧 Minor Issues

1. Import Organization

Several files could benefit from better import grouping according to Windmill's Rust best practices.

2. Error Handling

The new get_root_job_id() function handles None cases gracefully, but consider adding tracing for debugging purposes.


🚀 Suggestions for Improvement

  1. Complete the migration: Update compute_root_job_for_flow() to use the new root_job column
  2. Add integration tests: Test the root job ID propagation through complex flow scenarios
  3. Documentation: Add inline documentation for the get_root_job_id() function explaining the fallback logic
  4. Consider caching: For frequently accessed root job lookups, consider adding caching

🧪 Testing Instructions

To verify the changes work correctly:

Navigate to the Windmill app interface and test the root job functionality by creating a multi-step flow with nested jobs. Create a flow with at least 3 levels of nesting (parent flow → child flow → individual script step). Run this flow and observe that scripts in nested flows now have access to the WM_ROOT_JOB_ID environment variable that correctly points to the original triggering job ID rather than immediate parent job ID. Additionally, test the new API endpoint by making a GET request to /api/w/{workspace}/jobs_u/get_root_job_id/{job_id} for any job in a flow to verify it returns the correct root job identifier.


📊 Overall Assessment

Score: 7.5/10

This is a solid implementation that addresses an important architectural need. The main implementation is correct and follows good practices, but the incomplete migration of the API endpoint and some consistency issues prevent it from being excellent. The core functionality is sound and the performance optimizations are well thought out.

Recommendation: Approve with requested changes - Particularly fixing the compute_root_job_for_flow() function to use the new database column.


Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 464711f in 1 minute and 37 seconds. Click for details.
  • Reviewed 919 lines of code in 23 files
  • Skipped 0 files when reviewing.
  • Skipped posting 7 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. backend/windmill-worker/src/worker_lockfiles.rs:205
  • Draft comment:
    Consider using proper path manipulation (e.g. Path::join or calling .parent()) instead of string formatting to compute relative paths in parse_ts_relative_imports. This avoids potential path resolution errors.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. backend/windmill-worker/src/worker_lockfiles.rs:210
  • Draft comment:
    Avoid using unwrap() on to_str() when converting a normalized path to a string. Consider handling potential non-UTF8 paths gracefully.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. backend/windmill-worker/src/worker_lockfiles.rs:250
  • Draft comment:
    Parsing booleans from job arguments by converting them to strings and comparing with "true" is brittle. Consider explicitly parsing these values as booleans.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. backend/windmill-worker/src/worker_lockfiles.rs:1294
  • Draft comment:
    Recreating the job_dir via remove_dir_all() followed by create_dir_all() can be destructive if job_dir is shared. Consider isolating lock file work in a dedicated subdirectory.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. backend/windmill-worker/src/worker_lockfiles.rs:1050
  • Draft comment:
    The async recursive functions (e.g. lock_modules and lock_modules_app) use Box::pin and deep nesting. Consider refactoring to simplify the structure for better maintainability and potential performance improvements.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. backend/windmill-worker/src/worker_lockfiles.rs:2258
  • Draft comment:
    The capture_dependency_job function contains extensive language‑specific logic. Consider splitting this logic into separate modules or functions to simplify maintenance and enhance testability for each supported language.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. backend/windmill-worker/src/worker_flow.rs:2839
  • Draft comment:
    There seems to be an inconsistency in the variable naming: the debug log references job_root, but the root job is assigned to flow_root_job on line 2831. Please confirm whether job_root should be updated to flow_root_job.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_Du1lYwx3P1Yr5pGm

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 759ee88 in 1 minute and 15 seconds. Click for details.
  • Reviewed 36 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. backend/windmill-api/src/jobs.rs:1
  • Draft comment:
    This file exceeds 7000 lines and contains many responsibilities. Consider splitting it into smaller modules for improved maintainability.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. backend/windmill-api/src/jobs.rs:6680
  • Draft comment:
    Many SQL query strings are repeated. Consider refactoring common query fragments or helper functions/constants to avoid duplication.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. backend/windmill-api/src/jobs.rs:7010
  • Draft comment:
    The loop traversing the parent_job chain (in get_completed_job_result) lacks a safety limit. Introduce a maximum iteration count to avoid potential infinite loops.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. backend/windmill-api/src/jobs.rs:6260
  • Draft comment:
    The SSE polling loop in get_job_update_sse_stream spawns a long‐lived task. Ensure proper cancellation (e.g. when client disconnects) to avoid resource leaks.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. backend/windmill-api/src/jobs.rs:6950
  • Draft comment:
    Avoid using the literal string 'anonymous' repeatedly; define and use a constant to reduce typos and ease future changes.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. backend/windmill-api/src/jobs.rs:2500
  • Draft comment:
    Avoid using unwrap/unwrap_or in SQL query result handling; prefer explicit error handling to prevent panics in production.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. backend/windmill-api/src/jobs.rs:1
  • Draft comment:
    Consider separating business logic (SQL queries, job scheduling) from HTTP response construction to improve clarity and testability.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. backend/windmill-api/src/jobs.rs:6340
  • Draft comment:
    The polling intervals in the SSE stream are hard-coded. Consider externalizing these durations into configuration to allow tuning for load and performance.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_sfOiVFIkq3cTttQF

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 5b75652 in 45 seconds. Click for details.
  • Reviewed 56 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. backend/.sqlx/query-0e14ab95a08572f0672db266187335f578c622eb335cfc7cd0969633d85c9f73.json:3
  • Draft comment:
    Verify that the alias 'root_job!' (with an exclamation mark) is intentional and consistent with naming conventions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_oCTpJj5FNPgIHqL1

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@rubenfiszel rubenfiszel merged commit fbd942f into main Sep 2, 2025
11 checks passed
@rubenfiszel rubenfiszel deleted the hc/pass-root-job-to-minipulledjob branch September 2, 2025 09:44
@github-actions github-actions bot locked and limited conversation to collaborators Sep 2, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants