Skip to content

Conversation

@pmbrull
Copy link
Collaborator

@pmbrull pmbrull commented Nov 23, 2025

Describe your changes:

If we try to execute the deploy-pipelines directly in a hybrid setup, the deploy-pipelines pod in ArgoCD won't have the websocket connection to the hybrid runner to propagate the deployment operations

Fixes #2386, Fixes #2398

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Deploy-pipelines CLI refactor:
    • Replaced PipelineServiceClient websocket-based deployment with HTTP API calls to /api/v1/services/ingestionPipelines/bulk/deploy
    • Implemented batching strategy (20 pipelines per chunk) to optimize bulk deployment performance
  • New authentication flow:
    • getIngestionBotToken() in OpenMetadataOperations.java retrieves and decrypts ingestion-bot JWT from secrets manager
    • Uses HttpClient with 30s connect timeout and 2min request timeout for API calls
  • New DELETE endpoint:
    • DELETE /{id}/pipelineStatus/{runId} in IngestionPipelineResource.java removes specific pipeline status records
  • Database layer enhancement:
    • deleteExtensionByKey() in EntityTimeSeriesDAO.java supports MySQL and PostgreSQL JSON queries for time-series data deletion

@pmbrull pmbrull requested a review from a team as a code owner November 23, 2025 15:34
@gitar-bot
Copy link

gitar-bot bot commented Nov 23, 2025

Gitar analyzes/fixes CI failures, runs prompts as workflows and addresses comments starting with Gitar (docs)

Rules 1 🎸

Gitar Rules

🎸 PR Summary Enhancement: Comprehensive technical summary already in PR description

Flaky Test Auto-Retry: Multiple flaky tests detected but all unrelated to PR changes

⚙️ Options:

  • Auto-apply suggestions - Allow Gitar to commit updates to this branch

@gitar-bot
Copy link

gitar-bot bot commented Nov 23, 2025

CI failed for e379143: Python 3.12 CI failure is unrelated to PR changes - pyodbc dependency compilation issue with deprecated Python C API

Issue

The py-run-tests (3.12) CI job failed during dependency installation, not during test execution. The failure occurs when building the pyodbc wheel from source.

Root Cause

The pyodbc package (version 4.0.39) is attempting to use deprecated Python C API functions that were removed in Python 3.12:

  • PyUnicode_FromUnicode() - removed in Python 3.12
  • PyUnicode_AS_UNICODE() - removed in Python 3.12
  • PyUnicode_GET_SIZE() - removed in Python 3.12

These deprecated APIs were removed as part of PEP 623.

Details

Error excerpt from the build logs:

src/pyodbccompat.h:75:12: error: 'PyUnicode_FromUnicode' was not declared in this scope
src/pyodbccompat.h:86:12: error: 'PyUnicode_AS_UNICODE' was not declared in this scope
src/pyodbccompat.h:126:40: error: 'PyUnicode_GET_SIZE' was not declared in this scope
error: command '/usr/bin/g++' failed with exit code 1
ERROR: Failed to build installable wheels for some pyproject.toml based projects (pyodbc)

Why This is Unrelated to PR #24508

This PR only modifies Java backend code:

  • EntityTimeSeriesDAO.java
  • IngestionPipelineRepository.java
  • IngestionPipelineResource.java
  • OpenMetadataOperations.java
  • IngestionPipelineResourceTest.java

No Python dependencies or ingestion code was modified. This is a pre-existing infrastructure issue with Python 3.12 compatibility.

Suggested Fix

The solution is to upgrade pyodbc to version 5.0.0 or later in the project's Python dependencies, as these versions are compatible with Python 3.12. The fix should be applied in ingestion/setup.py or ingestion/requirements.txt by updating the pyodbc version constraint.

Reply with Gitar please fix to apply these suggestions.
Auto-apply is off - enable it here to let Gitar commit future updates automatically.
Did we get this right? 👍 / 👎 to help improve future analysis.

if (serverUrl == null) {
throw new RuntimeException("SERVER_HOST_API_URL not configured");
}
LOG.info("Deploying pipelines to server URL: {}", serverUrl);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add details which runner we are deploying to? log it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have some debugs on HybridServiceClient that we can turn into info to have control on where different ops are landing. that would be on collate repo directly, let me do that on an other ongoing PR

harshach
harshach previously approved these changes Nov 24, 2025
@gitar-bot
Copy link

gitar-bot bot commented Nov 25, 2025

CI failed for dd625fc: Multiple Playwright E2E test failures in AutoPilot feature tests - all unrelated to PR changes which modify Java ingestion pipeline backend code only.

Issue

Playwright CI job playwright-ci-postgresql (6, 6) failed with 2 test failures in the AutoPilot feature:

  1. Mysql AutoPilot test (AutoPilot.spec.ts:95:11): Expected 3 successful pipelines, but received 4 (first run) and 5 (retry)
  2. Airflow AutoPilot test (AutoPilot.spec.ts:95:11): Expected success message 'AutoPilot agents run completed successfully' never appeared

Root Cause

These failures are unrelated to the PR changes. The PR modifies:

  • OpenMetadataOperations.java - CLI operations for bulk pipeline deployment via API
  • IngestionPipelineResource.java - REST API endpoint for deleting pipeline status
  • EntityTimeSeriesDAO.java - Database DAO layer
  • IngestionPipelineRepository.java - Repository layer and minor bug fix
  • Java test files

The failures are in frontend E2E tests testing the AutoPilot UI feature, which is completely separate from the backend Java changes in this PR. The test expectations are hardcoded to expect specific pipeline counts (3 successful pipelines), but the actual UI is showing different counts (4 and 5), indicating either:

  1. Test environment state pollution from previous tests
  2. Race conditions in pipeline creation/completion timing
  3. Flaky test expectations that don't account for dynamic test data

Details

The test at playwright/e2e/Features/AutoPilot.spec.ts:161 expects exactly 3 successful pipelines:

await expect(
  page.getByTestId('agent-status-summary-item-Successful')
      .getByTestId('pipeline-count')
).toHaveText('3');

But the actual UI shows 4 pipelines on first attempt and 5 on retry, suggesting the test environment has leftover data from other test runs.

The Airflow test expects a specific success message that never appears, likely due to timing issues or the AutoPilot agent run not completing within the expected timeframe.

Additionally, 3 other tests were marked as flaky (Login, Tag, Users tests), further indicating test infrastructure instability rather than code issues.

Suggested Fix

The fix should be applied to the test suite, not the PR code:

  1. Improve test isolation: Ensure AutoPilot tests properly clean up created pipelines before and after each test run
  2. Use dynamic assertions: Instead of hardcoded counts, assert that the count is >= expected minimum or use more flexible matchers
  3. Add retry logic: Increase timeout for AutoPilot agent completion messages
  4. Fix test setup/teardown: Ensure entity-data.teardown.ts properly removes all AutoPilot-created entities between test runs

The PR changes are backend-only and cannot cause frontend E2E test failures in the AutoPilot feature tests.

Reply with Gitar please fix to apply these suggestions.
Auto-apply is off - enable it here to let Gitar commit future updates automatically.
Did we get this right? 👍 / 👎 to help improve future analysis.

@sonarqubecloud
Copy link

@pmbrull pmbrull merged commit 8df39ad into main Nov 25, 2025
24 of 28 checks passed
@pmbrull pmbrull deleted the fix-hybrid-bulk branch November 25, 2025 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants