Add Profiler Extract Ingestion Job for Local Dashboards #2101
base: main
Conversation
)
],
"tasks": [
NotebookTask(
@sundarshankar89's comment from PR#2000: "There are 2 ways we can implement this: have the ingestion job as a python package and use a wheel task, or have the notebook upload and then run the jobs. I prefer option 1."
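For reference, a minimal sketch of the wheel-task option preferred in the quoted comment, written against the Databricks SDK jobs API; the task key, package name, entry point, and parameters below are illustrative assumptions rather than the PR's actual values:

```python
from databricks.sdk.service import compute, jobs

# Wheel-based task: run the ingestion entry point from the lakebridge wheel,
# instead of uploading and running a notebook.
ingestion_task = jobs.Task(
    task_key="profiler_ingestion",
    description="Ingest profiler extracts into Unity Catalog tables",
    python_wheel_task=jobs.PythonWheelTask(
        package_name="databricks_labs_lakebridge",      # assumed package name
        entry_point="dashboards",                        # would match the pyproject entry point
        parameters=["<extract-path>", "<catalog>", "<schema>"],  # illustrative
    ),
    libraries=[
        compute.Library(whl="<lakebridge wheel path>"),
        compute.Library(pypi=compute.PythonPyPiLibrary(package="duckdb")),
    ],
)
```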
✅ 46/46 passed, 6 flaky, 3m13s total
Running from acceptance #2737
Codecov Report: ❌ Patch coverage is

@@            Coverage Diff             @@
##             main    #2101      +/-   ##
==========================================
+ Coverage   64.78%   64.93%   +0.15%
==========================================
  Files          96       96
  Lines        7891     7929      +38
  Branches      820      822       +2
==========================================
+ Hits         5112     5149      +37
- Misses       2599     2600       +1
  Partials      180      180

☔ View full report in Codecov by Sentry.
def _job_profiler_ingestion_task(self, task_key: str, description: str, lakebridge_wheel_path: str) -> Task:
    libraries = [
        compute.Library(whl=lakebridge_wheel_path),
        compute.PythonPyPiLibrary(package="duckdb")
The ingestion job depends on the duckdb library to read the profiler extract tables.
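For context on that dependency, here is a minimal sketch, assuming the profiler extract is a local DuckDB database file, of how an ingestion task could read it with duckdb and land each table as a Unity Catalog managed table. The function name, arguments, and write mode are illustrative, not the PR's code:

```python
import duckdb
from pyspark.sql import SparkSession


def ingest_extract(extract_path: str, catalog: str, schema: str) -> None:
    """Copy every table from a profiler extract (DuckDB file) into Unity Catalog."""
    spark = SparkSession.builder.getOrCreate()
    con = duckdb.connect(extract_path, read_only=True)
    try:
        for (table_name,) in con.execute("SHOW TABLES").fetchall():
            # Pull the extract table into pandas, then write it as a managed table.
            pdf = con.execute(f"SELECT * FROM {table_name}").fetch_df()
            (
                spark.createDataFrame(pdf)
                .write.mode("overwrite")
                .saveAsTable(f"{catalog}.{schema}.{table_name}")
            )
    finally:
        con.close()
```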
def main(*argv) -> None:
    logger.debug(f"Arguments received: {argv}")
    assert len(sys.argv) == 4, f"Invalid number of arguments: {len(sys.argv)}"
"Manually" testing this main() function outside of the wheel file, there appeared to be 3 additional arguments pertaining to the Python notebook session: 1) Interpreter, 2) -f flag, 3) env settings as a JSON file. Please review that the assumption that they will not be present in a wheel based job task is correct.
[project.entry-points.databricks]
reconcile = "databricks.labs.lakebridge.reconcile.execute:main"
dashboards = "databricks.labs.lakebridge.assessments.dashboards.execute:main"
Should this be profiler_dashboards for clarity?
Changes
What does this PR do?
Adds a new function to the common job deployer to install the local ingestion job. The job transforms profiler extracts into Unity Catalog–managed tables in the user’s local Databricks workspace, enabling the profiler summary (“local”) dashboards.
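As context only, here is a rough sketch, not the actual implementation, of what the new deployer function might look like, including the point under the implementation details below about saving the install_state even if an exception is raised. The function, job, and attribute names are assumptions:

```python
from databricks.labs.blueprint.installer import InstallState
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs


def deploy_profiler_ingestion_job(
    ws: WorkspaceClient, install_state: InstallState, name: str, task: jobs.Task
) -> int:
    """Create or update the local ingestion job and record its id in the install state."""
    settings = jobs.JobSettings(name=name, tasks=[task])
    try:
        existing_job_id = install_state.jobs.get(name)
        if existing_job_id is not None:
            ws.jobs.reset(job_id=int(existing_job_id), new_settings=settings)
            return int(existing_job_id)
        job_id = ws.jobs.create(name=name, tasks=[task]).job_id
        install_state.jobs[name] = str(job_id)
        return job_id
    finally:
        # save unconditionally so a freshly created job id is not lost if a later step raises
        install_state.save()
```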
Relevant implementation details
install_stateisn’t lost between create/update and save, especially if an exception is raised before the save.Caveats/things to watch out for when reviewing:
Linked issues
This PR complements PR#2000.
Functionality
databricks labs lakebridge ...
Tests