Remove CallContext.copy() #2294

Open
wants to merge 1 commit into
base: main

snazy
Member

@snazy snazy commented Aug 7, 2025

The function is only needed to create a "safe clone" of `PolarisCallContext` in `TaskExecutorImpl`, i.e. a clone that holds no references to CDI request-scoped beans.

To still let `TaskExecutorImpl` make "safe clones", a way to get (fresh) instances of `RealmContext` is required. To enable this, the `RealmContextResolver` has been enhanced with "`RealmContext` lookups" by realm ID. That in turn led to splitting the HTTP/REST-to-realm-context resolution into two parts: HTTP/REST-to-realm-ID and realm-ID-to-context.

This change is part of the effort to allow tasks to run in different JVMs.
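
As a rough illustration of the split described above (a sketch only, not the code in this PR): the interface and method names below, the `Polaris-Realm` header, and the `POLARIS` default realm are all assumptions made for the example. Step one derives a realm ID from request data; step two builds a fresh `RealmContext` from that ID, which is the only step a task executor would need.

```java
import java.util.Map;

// Hypothetical, simplified stand-ins; not the actual Polaris interfaces.
interface RealmContext {
  String getRealmIdentifier();
}

// Plain value holder so that every lookup returns a new instance.
record SimpleRealmContext(String realmId) implements RealmContext {
  @Override
  public String getRealmIdentifier() {
    return realmId;
  }
}

interface RealmContextResolver {
  // Step 1: HTTP/REST request data -> realm ID (here: a map of headers).
  String resolveRealmId(Map<String, String> headers);

  // Step 2: realm ID -> fresh RealmContext instance.
  RealmContext lookupRealmContext(String realmId);

  // The former one-step resolution becomes a composition of both steps.
  default RealmContext resolve(Map<String, String> headers) {
    return lookupRealmContext(resolveRealmId(headers));
  }
}

final class HeaderRealmContextResolver implements RealmContextResolver {
  static final String REALM_HEADER = "Polaris-Realm"; // assumed header name
  static final String DEFAULT_REALM = "POLARIS"; // assumed default realm ID

  @Override
  public String resolveRealmId(Map<String, String> headers) {
    // Fall back to the default realm when the header is absent.
    return headers.getOrDefault(REALM_HEADER, DEFAULT_REALM);
  }

  @Override
  public RealmContext lookupRealmContext(String realmId) {
    return new SimpleRealmContext(realmId);
  }
}
```

In this sketch, a REST filter would call `resolve(headers)`, while a task executor that only stored the realm ID would call `lookupRealmContext(realmId)` directly and never touch request-scoped state.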
Contributor

@dimas-b dimas-b left a comment

LGTM 👍

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 12, 2025
tryHandleTask(taskEntityId, new TaskContext(callContext), null, 1);
}

record TaskContext(String realmId, PolarisDiagnostics diagnostics) {
Contributor

Have we done some testing (at least manually) to verify that the task still executes correctly? As far as I know, we currently do not have a good integration test that actually helps verify the background task. The regtests t_pyspark/test_spark_sql_s3_with_privileges.py contains a test that helps verify the background purge task, but it is not running in the CI today.

Can we follow the README here https://github.com/apache/polaris/blob/main/regtests/README.md to run the test against AWS and verify things are still working?

Contributor

Actually, @gh-yzou's comment made me think that we probably need to start a new request context for the task via @ActivateRequestContext, as in #1817, and also block context propagation on the task thread pool. That should give us proper CDI context isolation. WDYT?

Member Author

For the tasks proposals, CDI context propagation does not work across different JVMs.
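
To make the constraint concrete, here is a minimal sketch (an illustration, not Polaris code) of what can cross a JVM boundary: plain data such as a realm ID, serialized on one side and rebuilt on the other. `PortableTaskContext` and `TaskContextCodec` are hypothetical names, and Java serialization is used only because it needs no extra libraries. Request-scoped beans have no such representation, so the receiving JVM would have to look up a fresh `RealmContext` from the realm ID.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical task context that carries only plain, serializable data.
record PortableTaskContext(String realmId, long taskEntityId) implements Serializable {}

final class TaskContextCodec {
  private TaskContextCodec() {}

  // Encode the context into bytes that could be handed to another JVM.
  static byte[] encode(PortableTaskContext context) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(context);
    } catch (IOException e) {
      throw new IllegalStateException("Could not encode task context", e);
    }
    return bytes.toByteArray();
  }

  // Rebuild the context on the receiving side; no CDI scope is involved.
  static PortableTaskContext decode(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (PortableTaskContext) in.readObject();
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Could not decode task context", e);
    }
  }
}
```

A round trip such as `TaskContextCodec.decode(TaskContextCodec.encode(ctx))` yields a record equal to `ctx`, since records compare by component values.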

Member Author

> have we done some testing

I mean, sure? There are tests in the code base for this.

> The regtests t_pyspark/test_spark_sql_s3_with_privileges.py ... not running in the CI today.

I'd suggest fixing that independently. Mind taking a stab at that, @gh-yzou? MinIO, Azurite, and the Google emulator can help there.

Contributor

> I mean, sure? There are tests in the code base for this.

Our task tests are not very complete right now, so this doesn't give me much confidence.

Contributor

t_pyspark/test_spark_sql_s3_with_privileges.py can only be run locally with your own available S3 account; @jbonofre is working on getting an S3 account for CLI testing.

However, I am not sure what extra benefit this approach would give us. The copy() method is used to propagate the whole CallContext and RealmContext to the task execution, while the ID lookup also seems to be used only by task execution and requires an extra reconstruction step. Furthermore, this approach doesn't seem scalable if we add more information to CallContext or RealmContext in the future (such as user-specific information). For RealmContext in particular, it may require us to implement a RealmContextManager for the lookup.

If the concern is that this function is only used by the task executor, I recall @dimas-b was working on a POC that leverages a CDI feature to propagate the CallContext to a background thread, which seems a cleaner way to do that. @dimas-b, maybe we can resume that work?
