Skip to content

Conversation

poojanilangekar
Copy link
Contributor

This change adds a PR to test the end-to-end catalog federation functionality.

Steps involved:

  1. Create a new-user account.
  2. Create an INTERNAL catalog, called test-catalog-local.
  3. Grant the following privileges on test-catalog-local :TABLE_WRITE_DATA -> catalog_admin -> service_admin -> new-user.
  4. Create an EXTERNAL catalog called test-catalog-external that points to test-catalog-local and uses the client-id and client-secret credentials for the new_user account.
  5. Grant the following privileges on test-catalog-external :TABLE_WRITE_DATA -> catalog_admin -> service_admin.
  6. Connect to test-catalog-local and use it to create namespaces/tables and insert some data.
  7. Connect to test-catalog-external and select the existing data and insert new rows.
  8. Re-connect to test-catalog-local and verify that the changes made in test-catalog-external are visible.

This change disables URL overlap check since we are federating to a single Polaris instance.

@poojanilangekar
Copy link
Contributor Author

CC @dennishuo @eric-maynard

EOF

echo ""
echo "=== Verifying federation via LOCAL catalog ==="
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do some verification of RBAC as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update this test once #2223 is merged. For now, we can only do catalog-level RBACs for federated catalogs.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 11, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same test seems possible to do under the JUnit5 framework (as an intTest).

That approach is preferable from my POV because it is more easily discoverable in a java IDE and would show regressions at part of the usual gradle validation tasks.

WDYT?

@poojanilangekar
Copy link
Contributor Author

This is a regression test (which is still run by GitHub CI). Can I send out a separate PR for an integration test if that works for you? At that point we can remove this test if necessary. Currently we have no tests for catalog federation, so it might be a good idea to keep this around while I work on the IT test as you suggested.

For some context, I won't be able to dedicate as much time to the project going forward so I expect that to take some time, in the meantime it is a good idea to have some test that check that federation still works and flag PRs that potentially break federation.

Please let me know what you think.

@dimas-b
Copy link
Contributor

dimas-b commented Aug 11, 2025

From my POV running docker-based tests is more cumbersome than running JUnit5-based tests. So, if the same functionality can be tested under the JUnit framework, I'd prefer that approach.

I do not oppose merging this PR, but I believe it too heavy for the job it performs.

@eric-maynard
Copy link
Contributor

@dimas-b I agree that should add more integration tests on top of this , but we'll need docker-based regression tests for federation in general (imagine testing federation to Hive, you need to actually run HMS somewhere). Accordingly I think it's okay or even prudent to have a regression test for IRC federation for the sake of parity.

@snazy
Copy link
Member

snazy commented Aug 12, 2025

Having tests for more complex setups is totally legit. I just don't think that containers are are a requirement for that.

This is where for example apache/polaris-tools#18 comes into play - it runs a Polaris server (actually "anything Quarkus") from the "local" code base or a specific distributed/released version, which you can then use in an integration test to test the federation use case.

From a machine resource usage it's even more lightweight as only the processes are run, but no container infra.

And you'd be able to run it from your IDE and, with some option tweaking, can debug the processes.

Copy link
Contributor

@dennishuo dennishuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I agree we'll want both flavors longer-term (java tests and black-box regtests) because long-term they'll be able to test different scopes of functionality. In general having this black-box end-to-end is still needed for building confidence that it really works from the perspective of an external caller without JVM-local "tricks" that might creep into a JUnit test, especially when the nature of the feature (such as Federation) innately deals with how things interact beyond the single JVM boundary.

Re: heavyweight containers, it doesn't look like this PR actually requires anything related to Docker -- the sh test just extends the pre-existing patterns we already have for other sh tests and AFAICT is equally happy to run against a direct ./gradlew run or IDE-run or java -jar local deployment vs having Polaris run in the Docker container. So the Docker parts are just making sure the pre-existing Docker-based runs such as for precommit/CI are also able to run the new test successfully.

@snazy
Copy link
Member

snazy commented Aug 13, 2025

I'm not against this PR in principle, however it was mentioned that "we'll need docker-based regression tests". I do think that the state of the regression tests as of today. do nothing that could not be done in a Java based integration test.

OTOH, playing devil's advocate here, do we have good enough JUnit (integration) test coverage for federation?
Another devil: are the regression tests actually really testing a "full black box end to end test" including all the things that end users do (real database, external IdP, secured Polaris setup, external x/y/z)?

@dennishuo
Copy link
Contributor

Yeah, I agree the "Docker" discussion is worth expanding in another thread, even though it's orthogonal to this PR. I'm definitely also in favor of supporting the non-Docker scenarios like running the server in an IDE or anywhere else and still being able to run tests against those, and we can make sure the structure of regtests continues supporting that pattern.

Also +1 to expanding the black-box testing scope to really exercise the things a real user will do; having this PR as a starting point will hopefully reduce the barrier of entry for folks interested in exercising those complex scenarios.

Since it sounds like no one is opposed to merging this PR, I'll go ahead and merge it to unblock expanding test coverage.

@dennishuo dennishuo merged commit 22e4c68 into apache:main Aug 14, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Aug 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants