@harishkesavarao harishkesavarao commented Aug 17, 2025

Part of the series of PRs for #13357.

@hsheth2 I'm opening this to get early feedback from you and others on the general approach, especially the extractor design.

  • Metadata extraction, the key piece of the issue, depends on the extractors.
  • Rewriting the DataHub extractors to inherit from the Airflow plugin extractors (this looks like a major refactor, and I want to confirm that going ahead with it is the right path from a design standpoint).

If you all think that the overall design looks good, I can expand this to on_task_instance_failed and the DAG methods.

(Some of the integration tests are currently failing; I am fixing them.)

@github-actions github-actions bot added the ingestion (PR or Issue related to the ingestion of metadata) and community-contribution (PR or Issue raised by member(s) of DataHub Community) labels Aug 17, 2025
dagrun: "DagRun" = task_instance.dag_run  # type: ignore[attr-defined]
task = task_instance.task
if TYPE_CHECKING:
    assert task

@aikido-pr-checks aikido-pr-checks bot Aug 17, 2025

Dangerous use of assert - low severity
When Python runs in optimized mode, assert statements are not executed. This mode is enabled by passing the -O command-line flag or by setting the PYTHONOPTIMIZE environment variable, and it may be enabled in production. Any safety check done using assert will then be skipped.

Remediation: Raise an exception instead of using assert.
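
As a rough illustration of the suggested remediation, the check could be written as an explicit exception. This is only a sketch: `require_task` is a hypothetical helper name, `ValueError` is an assumed choice of exception, and a stand-in object is used in place of a real Airflow `TaskInstance`. None of this is taken from the PR itself.

```python
from types import SimpleNamespace
from typing import Any


def require_task(task_instance: Any) -> Any:
    """Return task_instance.task, raising if it is missing.

    Hypothetical helper for illustration; not part of the PR or of Airflow.
    """
    task = getattr(task_instance, "task", None)
    if task is None:
        # Unlike `assert task`, this check still runs under `python -O`.
        raise ValueError("task_instance has no task attached")
    return task


# Usage with a stand-in object instead of a real Airflow TaskInstance.
fake_ti = SimpleNamespace(task="example_task")
print(require_task(fake_ti))
```

Whether a runtime check is wanted here at all is a design choice: if the assert exists only to narrow the type for the type checker, an explicit exception changes runtime behavior when the task is missing.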

codecov bot commented Aug 17, 2025

Bundle Report

Changes will decrease total bundle size by 6.24MB (-21.9%) ⬇️. This is within the configured threshold ✅

Detailed changes
| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 22.24MB | -6.24MB (-21.9%) ⬇️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | -220.46kB | 18.61MB | -1.17% |
| assets/index-*.css | -627 bytes | 608.85kB | -0.1% |
| assets/FTE-*.mp4 (Deleted) | -1.19MB | 0 bytes | -100.0% 🗑️ |
| assets/FTE-*.mp4 (Deleted) | -1.21MB | 0 bytes | -100.0% 🗑️ |
| assets/FTE-*.mp4 (Deleted) | -1.22MB | 0 bytes | -100.0% 🗑️ |
| assets/FTE-*.mp4 (Deleted) | -1.23MB | 0 bytes | -100.0% 🗑️ |
| assets/welcome-*.png (Deleted) | -1.14MB | 0 bytes | -100.0% 🗑️ |
| assets/grafana-*.png (Deleted) | -31.01kB | 0 bytes | -100.0% 🗑️ |
| assets/FTE-*.js (Deleted) | -64 bytes | 0 bytes | -100.0% 🗑️ |
| assets/FTE-*.js (Deleted) | -65 bytes | 0 bytes | -100.0% 🗑️ |
| assets/FTE-*.js (Deleted) | -64 bytes | 0 bytes | -100.0% 🗑️ |

@datahub-cyborg datahub-cyborg bot added the needs-review (Label for PRs that need review from a maintainer) label Aug 17, 2025
@harishkesavarao harishkesavarao changed the title from "Make Airflow plugin fully compatible with Airflow (partial changes)" to "Make Airflow plugin fully compatible with Airflow (WIP: partial changes)" Aug 17, 2025

codecov bot commented Aug 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
