
Conversation

@goodwillpunning (Contributor)

Changes

What does this PR do?

Adds a new function to the common job deployer to install the local ingestion job. The job transforms profiler extracts into Unity Catalog–managed tables in the user’s local Databricks workspace, enabling the profiler summary (“local”) dashboards.

Relevant implementation details

  • The implementation closely follows the existing reconcile job deployment. Please verify that the install_state isn’t lost between create/update and save, especially if an exception is raised before the save.
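The create/update-then-save concern above can be sketched as follows. This is a minimal illustration, not the actual lakebridge code: `InstallState` here is a stand-in for the persisted installation state, and `deploy_profiler_ingestion_job` is a hypothetical name.

```python
# Sketch of the review concern: state must be persisted even when
# create/update raises partway through. All names are illustrative.
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class InstallState:
    """Stand-in for the persisted installation state."""
    jobs: dict = field(default_factory=dict)
    saved: bool = False

    def save(self) -> None:
        self.saved = True


def deploy_profiler_ingestion_job(state: InstallState, create_or_update: Callable[[], int]) -> None:
    try:
        job_id = create_or_update()
        state.jobs["profiler-ingestion"] = job_id
    finally:
        # Persist whatever state we have even if create/update raised,
        # so a partially recorded deployment is not lost on retry.
        state.save()
```

The `try`/`finally` guarantees `save()` runs on the exception path, which is exactly the property the reviewer is asked to verify in the real deployer.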

Caveats/things to watch out for when reviewing:

Linked issues

This PR complements PR#2000.

Functionality

  • added relevant user documentation
  • added new CLI command
  • modified existing command: databricks labs lakebridge ...
  • installed as part of the CLI command (see PR#2000)

Tests

  • manually tested
  • added unit tests
  • added integration tests

)
],
"tasks": [
NotebookTask(
@goodwillpunning (Contributor Author) commented:

@sundarshankar89's comment from PR#2000: "There are 2 ways we can implement this, have the ingestion job as python package and use a wheel task Or have the notebook upload and then run the jobs. I prefer option 1."
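For context, the two options in the quoted comment map onto different task shapes in the Databricks Jobs API: a `python_wheel_task` runs an entry point from an attached wheel, while a `notebook_task` runs an uploaded notebook. A plain-dict sketch of the contrast; the package name, wheel path, notebook path, and parameters are illustrative assumptions, not the PR's actual values (the `dashboards` entry point is the one declared later in this PR's pyproject excerpt):

```python
# Option 1 (preferred in the quoted comment): package the ingestion logic
# as a wheel and run it as a python_wheel_task.
wheel_task = {
    "task_key": "profiler_ingestion",
    "python_wheel_task": {
        "package_name": "databricks_labs_lakebridge",   # assumed package name
        "entry_point": "dashboards",                    # from [project.entry-points.databricks]
        "parameters": ["<catalog>", "<schema>", "<extract-path>"],  # placeholders
    },
    "libraries": [
        {"whl": "/Workspace/path/to/lakebridge.whl"},   # illustrative path
        {"pypi": {"package": "duckdb"}},
    ],
}

# Option 2: upload a notebook and run it as a notebook_task.
notebook_task = {
    "task_key": "profiler_ingestion",
    "notebook_task": {"notebook_path": "/Workspace/path/to/ingestion"},  # illustrative
}
```

The wheel route keeps the ingestion logic versioned and testable with the package itself, which is likely why option 1 is preferred.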

@github-actions

github-actions bot commented Oct 17, 2025

✅ 46/46 passed, 6 flaky, 3m13s total

Flaky tests:

  • 🤪 test_validate_non_empty_tables (24ms)
  • 🤪 test_transpiles_informatica_to_sparksql (14.369s)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (15.781s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[False] (3.611s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (19.249s)
  • 🤪 test_transpile_teradata_sql (11.75s)

Running from acceptance #2737

@codecov

codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 97.43590% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 64.93%. Comparing base (3d54bc0) to head (b951b8b).

Files with missing lines Patch % Lines
.../labs/lakebridge/assessments/profiler_validator.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2101      +/-   ##
==========================================
+ Coverage   64.78%   64.93%   +0.15%     
==========================================
  Files          96       96              
  Lines        7891     7929      +38     
  Branches      820      822       +2     
==========================================
+ Hits         5112     5149      +37     
- Misses       2599     2600       +1     
  Partials      180      180              


def _job_profiler_ingestion_task(self, task_key: str, description: str, lakebridge_wheel_path: str) -> Task:
libraries = [
compute.Library(whl=lakebridge_wheel_path),
compute.PythonPyPiLibrary(package="duckdb")
@goodwillpunning (Contributor Author) commented:

The ingestion job depends on the duckdb library to read the profiler extract tables.


def main(*argv) -> None:
logger.debug(f"Arguments received: {argv}")
assert len(sys.argv) == 4, f"Invalid number of arguments: {len(sys.argv)}"
@goodwillpunning (Contributor Author) commented:

"Manually" testing this main() function outside of the wheel file, there appeared to be 3 additional arguments pertaining to the Python notebook session: 1) Interpreter, 2) -f flag, 3) env settings as a JSON file. Please review that the assumption that they will not be present in a wheel based job task is correct.


[project.entry-points.databricks]
reconcile = "databricks.labs.lakebridge.reconcile.execute:main"
dashboards = "databricks.labs.lakebridge.assessments.dashboards.execute:main"
Collaborator commented:

Should this be profiler_dashboards for clarity?
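If that rename were adopted, the entry-point table would read as follows (a sketch of the suggested change, not a committed decision):

```toml
[project.entry-points.databricks]
reconcile = "databricks.labs.lakebridge.reconcile.execute:main"
profiler_dashboards = "databricks.labs.lakebridge.assessments.dashboards.execute:main"
```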


Labels

feat/profiler (Issues related to profilers)
