Add tracking of submitted transforms #634

Draft
wants to merge 3 commits into
base: master
from

Conversation

gordonwatts
Collaborator

@gordonwatts gordonwatts commented Jul 22, 2025

Cleans up how we record submitted items and adds them to the cache list command.

Summary

  • record submitted transform metadata with cache_submitted_transform
  • return submitted queries from submitted_queries
  • use new cache API when submitting transforms
  • list pending requests in the CLI cache table
  • document that cache list shows submitted queries
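
As a rough illustration of the recording and listing bullets, the sketch below records a submitted transform and turns the submitted records into table rows. It is a self-contained stand-in, not the servicex implementation: `TransformStub`, `InMemoryCache`, `pending_rows` and the list-backed storage are assumptions made for the example. Only the two method names and the record keys (`codegen`, `result_format`, `request_id`, `status`, `submit_time`) are taken from this PR.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Tuple


@dataclass
class TransformStub:
    """Hypothetical stand-in for the real transform request object."""

    codegen: str
    result_format: str


class InMemoryCache:
    """Illustrative cache backed by a plain list (an assumption, not the real store)."""

    def __init__(self) -> None:
        self._records: List[Dict[str, str]] = []

    def cache_submitted_transform(self, transform: TransformStub, request_id: str) -> None:
        """Record the metadata known at submission time."""
        self._records.append(
            {
                "codegen": transform.codegen,
                "result_format": transform.result_format,
                "request_id": request_id,
                "status": "SUBMITTED",
                "submit_time": datetime.now(timezone.utc).isoformat(),
            }
        )

    def submitted_queries(self) -> List[Dict[str, str]]:
        """Return only the records that are still marked as submitted."""
        return [r for r in self._records if r.get("status") == "SUBMITTED"]


def pending_rows(cache: InMemoryCache) -> List[Tuple[str, ...]]:
    """Build one table row per submitted query.

    Assumed column order: Title, Codegen, Transform ID, Run Date, Files, Format.
    Leaving Title and Format blank is an assumption of this sketch.
    """
    return [
        ("", r.get("codegen", ""), r.get("request_id", ""), "Pending", "Pending", "")
        for r in cache.submitted_queries()
    ]
```

Each tuple could then be passed to a table's row-adding call in the CLI; the run date and file count are not known until the transform finishes, hence the placeholder cells.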

Testing

  • pytest -q

Comments

In its current form there is a potential UX issue: Pending queries can complete, but the local cache won't be updated. This means the listing can be out of date (i.e. it lies to the user). Options:

  1. We drop this MR.
  2. We make it clearer that this is the local status.
  3. We have a separate command that dumps this information.
  4. We have a command that "updates" the local cache status by scanning all pending items.
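
Item 4 in the list above could look roughly like the following. This is only a sketch under stated assumptions: `refresh_submitted` and the `fetch_status` callable are hypothetical (the latter stands in for whatever call asks the server about a request), and any status string other than `SUBMITTED` is made up for the example.

```python
from typing import Callable, Dict, List


def refresh_submitted(
    records: List[Dict[str, str]],
    fetch_status: Callable[[str], str],
) -> int:
    """Update locally cached records that are still marked SUBMITTED.

    fetch_status is a hypothetical callable that maps a request_id to the
    status currently reported by the server. Returns the number of records
    whose local status changed.
    """
    updated = 0
    for record in records:
        if record.get("status") != "SUBMITTED":
            continue  # only pending items need a remote check
        remote_status = fetch_status(record["request_id"])
        if remote_status != record["status"]:
            record["status"] = remote_status
            updated += 1
    return updated
```

Passing the status lookup in as a callable keeps the scan testable without a live server, at the cost of one remote call per pending item.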

https://chatgpt.com/codex/tasks/task_e_68791d01353483208e71471198f5c3ca

codecov bot commented Jul 22, 2025

Codecov Report

❌ Patch coverage is 78.57143% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.68%. Comparing base (929c2aa) to head (84bf781).
⚠️ Report is 1 commit behind head on master.

Files with missing lines    Patch %    Lines
servicex/app/cache.py       0.00%      3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #634      +/-   ##
==========================================
- Coverage   96.97%   96.68%   -0.29%     
==========================================
  Files          29       29              
  Lines        1948     1960      +12     
==========================================
+ Hits         1889     1895       +6     
- Misses         59       65       +6     
Flag         Coverage Δ
unittests    96.68% <78.57%> (-0.29%) ⬇️

Collaborator Author

@gordonwatts gordonwatts left a comment

Here is what it looks like right now:

                                                 Cached Queries                                                  
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ Title         ┃ Codegen  ┃ Transform ID                         ┃ Run Date              ┃ Files   ┃ Format    ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩
│ ttbar_trijet  │ atlasr25 │ 14b9d860-06de-47d1-bb52-34b73451ca7b │ Sat, 2025-07-12 09:32 │ 1       │ root-file │
│ object_counts │ atlasr25 │ 169d78f0-b56a-4444-a070-eb130dacab61 │ Sat, 2025-07-12 19:18 │ 1       │ root-file │
│ ttbar_trijet  │ atlasr25 │ c10294a6-ae9a-4c07-b58f-31500ee40280 │ Sun, 2025-07-13 02:01 │ 1       │ root-file │
│ ttbar_trijet  │ atlasr25 │ 7230cfd5-f445-4049-9b0d-2a874700859f │ Sun, 2025-07-13 02:08 │ 1       │ root-file │
│ ttbar_trijet  │ atlasr25 │ 5605d79e-1110-4784-b5a3-157a4aeaa248 │ Sun, 2025-07-13 02:16 │ 1       │ root-file │
│ object_counts │ atlasr25 │ 9fb515d0-f4ba-43db-8302-052bfb31c840 │ Tue, 2025-07-15 10:02 │ 100     │ root-file │
│ object_counts │ atlasr25 │ 8cae73e8-1b59-4528-a6cb-0f647aad641c │ Tue, 2025-07-15 14:31 │ 2       │ root-file │
│ object_counts │ atlasr25 │ 413692bd-70f4-496f-ba51-4f1500f7e742 │ Sun, 2025-07-20 11:59 │ 3       │ root-file │
│ object_counts │ atlasr25 │ 3a13ea62-1850-40a3-8509-93ee33854d68 │ Mon, 2025-07-21 15:05 │ 5300    │ root-file │
│ object_counts │ atlasr25 │ 08c0fca8-9163-41b1-b90b-60b912c61186 │ Tue, 2025-07-22 14:37 │ 1       │ root-file │
│               │          │ e8c18537-65d5-42f0-b68e-3f54471b0ad6 │ Pending               │ Pending │           │
└───────────────┴──────────┴──────────────────────────────────────┴───────────────────────┴─────────┴───────────┘

Currently, the db entry for a pending request contains no more information than this.

@gordonwatts gordonwatts self-assigned this Jul 22, 2025
@gordonwatts gordonwatts marked this pull request as draft July 22, 2025 20:43
@gordonwatts gordonwatts added the enhancement New feature or request label Jul 22, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces tracking of submitted transforms in the cache system and improves the cache list command to display pending submissions. The changes consolidate the transform submission recording process and provide better visibility into pending requests.

  • Consolidates submitted transform metadata recording with a new cache_submitted_transform method
  • Adds functionality to retrieve and display submitted (pending) queries in the CLI
  • Updates documentation to clarify that the cache list command shows both completed and pending submissions

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

File                         Description
servicex/query_cache.py      Adds cache_submitted_transform method and submitted_queries method for better tracking
servicex/query_core.py       Updates transform submission to use new consolidated cache API
servicex/app/cache.py        Enhances cache list command to display pending submissions alongside completed ones
tests/test_query_cache.py    Updates tests to use new cache API and adds test for submitted queries functionality
tests/test_dataset.py        Updates mock method calls to reflect new cache API
docs/command_line.rst        Documents that cache list shows pending submissions
Comments suppressed due to low confidence (1)

servicex/app/cache.py:75

  • [nitpick] The variable name 'r' is ambiguous and inconsistent with the loop above which uses the same name. Consider using 'pending_query' or 'pending_record' for clarity.
    for r in pending:

"result_format": transform.result_format,
"request_id": request_id,
"status": "SUBMITTED",
"submit_time": datetime.now(timezone.utc).isoformat(),

Copilot AI Jul 22, 2025

[nitpick] Consider accepting submit_time as a parameter to make the method more testable and flexible, rather than always using the current time.

Suggested change
- "submit_time": datetime.now(timezone.utc).isoformat(),
+ "submit_time": submit_time,
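
A minimal sketch of that suggestion, assuming a free function rather than the actual method: `build_submitted_record` is a hypothetical name, and `submit_time` is made optional so that callers who omit it still get the current UTC time.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def build_submitted_record(
    request_id: str,
    result_format: str,
    submit_time: Optional[datetime] = None,
) -> Dict[str, str]:
    """Build the record stored for a submitted transform.

    Hypothetical helper: passing submit_time makes the result deterministic
    in tests; omitting it falls back to the current UTC time.
    """
    when = submit_time if submit_time is not None else datetime.now(timezone.utc)
    return {
        "result_format": result_format,
        "request_id": request_id,
        "status": "SUBMITTED",
        "submit_time": when.isoformat(),
    }
```

A test can then pin the timestamp, for example `build_submitted_record("abc", "root-file", datetime(2025, 7, 22, tzinfo=timezone.utc))`, instead of patching the clock.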

Collaborator

@ponyisi ponyisi left a comment

Thanks for the PR. Please take a look at the comments, but on the whole I like the motivation for the changes.

@@ -62,6 +62,7 @@ def list():
table.add_column("Files")
table.add_column("Format")
runs = cache.cached_queries()
pending = cache.submitted_queries()
Collaborator

Please change "pending" to "submitted".

r.get("codegen", ""),
r.get("request_id", ""),
"Pending",
"Pending",
Collaborator

In particular, this applies to what is shown to users.

@@ -148,6 +149,24 @@ def update_transform_request_id(self, hash_value: str, request_id: str) -> None:
transform.hash == hash_value,
)

def cache_submitted_transform(
Collaborator

I would prefer not to have a submitted-only code path.

@@ -203,6 +222,17 @@ def cached_queries(self) -> List[TransformedResults]:
]
return result

def submitted_queries(self) -> List[dict]:
Collaborator

Can this be generalized to return queries in a specified state (not just submitted)?
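
One possible shape for that generalization, sketched over plain dict records rather than the real cache: `queries_in_state` is a hypothetical name, and the only state string taken from this PR is `SUBMITTED`.

```python
from typing import Dict, List


def queries_in_state(records: List[Dict[str, str]], status: str) -> List[Dict[str, str]]:
    """Return the records whose "status" field equals the requested state."""
    return [r for r in records if r.get("status") == status]


def submitted_queries(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the narrower call available as a thin wrapper."""
    return queries_in_state(records, "SUBMITTED")
```

Keeping `submitted_queries` as a wrapper would let existing callers stay unchanged while new callers ask for any state.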

Labels: codex, enhancement (New feature or request)
2 participants