-
Notifications
You must be signed in to change notification settings - Fork 7
APP-8642 : Add OAuth to pyatlan #767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Unclosed OAuth Client Leaks Resources
The aclose() method doesn't close self._async_oauth_token_manager. If OAuth is configured, the AsyncOAuthTokenManager has its own httpx.AsyncClient that needs cleanup via await self._async_oauth_token_manager.aclose(). Without this, the HTTP client remains open, potentially preventing proper resource cleanup and causing connection leaks.
pyatlan/client/aio/client.py#L866-L897
atlan-python/pyatlan/client/aio/client.py
Lines 866 to 897 in ffee571
| Upload an image to Atlan (async version). | |
| :param file: local file to upload | |
| :param filename: name of the file to be uploaded | |
| :returns: details of the uploaded image | |
| :raises AtlanError: on any API communication issue | |
| """ | |
| raw_json = await self._upload_file(UPLOAD_IMAGE, file=file, filename=filename) | |
| return AtlanImage(**raw_json) | |
| async def _upload_file(self, api, file=None, filename=None): | |
| """Async version of _upload_file (matches sync exactly)""" | |
| generator = MultipartDataGenerator() | |
| generator.add_file(file=file, filename=filename) | |
| post_data = generator.get_post_data() | |
| api.produces = f"multipart/form-data; boundary={generator.boundary}" | |
| path = self._create_path(api) | |
| params = await self._create_params( | |
| api, query_params=None, request_obj=None, exclude_unset=True | |
| ) | |
| if LOGGER.isEnabledFor(logging.DEBUG): | |
| self._api_logger(api, path) | |
| return await self._call_api_internal(api, path, params, binary_data=post_data) | |
| def update_headers(self, header: dict[str, str]): | |
| """Update headers for the async session.""" | |
| if self._async_session: | |
| self._async_session.headers.update(header) | |
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: `aclose()` leaks async resources.
The aclose() method doesn't close _async_oauth_token_manager, causing a resource leak. When OAuth is used, the token manager owns an httpx.AsyncClient that needs cleanup via await self._async_oauth_token_manager.aclose(), but this is never called. This leaves HTTP connections and other async resources open, preventing proper cleanup when using context managers or manually calling aclose().
pyatlan/client/aio/client.py#L918-L942
atlan-python/pyatlan/client/aio/client.py
Lines 918 to 942 in eaf6dd7
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: | |
| await self._async_session.aclose() | |
| self._async_session = None | |
| # Clean up all client references | |
| self._async_admin_client = None | |
| self._async_asset_client = None | |
| self._async_audit_client = None | |
| self._async_contract_client = None | |
| self._async_credential_client = None | |
| self._async_file_client = None | |
| self._async_group_client = None | |
| self._async_impersonate_client = None | |
| self._async_open_lineage_client = None | |
| self._async_query_client = None | |
| self._async_role_client = None | |
| self._async_search_log_client = None | |
| self._async_sso_client = None | |
| self._async_task_client = None | |
| self._async_token_client = None | |
| self._async_typedef_client = None | |
| self._async_user_client = None |
| {"headers": {"authorization": f"Bearer {self.api_key}"}} | ||
| if self.api_key and self.api_key.strip() | ||
| else {"headers": {}} | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Sync Client OAuth Leaks HTTP Resources
The sync AtlanClient creates _oauth_token_manager when OAuth credentials are provided but has no cleanup mechanism (no close() method or destructor). The token manager owns an httpx.Client that needs to be closed via self._oauth_token_manager.close(), but this never happens, causing a resource leak with unclosed HTTP connections whenever OAuth authentication is used with the sync client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: `aclose()` Leaks OAuth Resources
The aclose() method doesn't clean up _async_oauth_token_manager, causing a resource leak. When OAuth authentication is used, the AsyncOAuthTokenManager creates an httpx.AsyncClient that needs to be properly closed. The async client's cleanup should call await self._async_oauth_token_manager.aclose() when the manager exists, similar to how _async_session is closed. This becomes especially problematic when using the async context manager pattern since __aexit__ calls aclose().
pyatlan/client/aio/client.py#L927-L948
atlan-python/pyatlan/client/aio/client.py
Lines 927 to 948 in c7903dc
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: | |
| await self._async_session.aclose() | |
| self._async_session = None | |
| # Clean up all client references | |
| self._async_admin_client = None | |
| self._async_asset_client = None | |
| self._async_audit_client = None | |
| self._async_contract_client = None | |
| self._async_credential_client = None | |
| self._async_file_client = None | |
| self._async_group_client = None | |
| self._async_impersonate_client = None | |
| self._async_open_lineage_client = None | |
| self._async_query_client = None | |
| self._async_role_client = None | |
| self._async_search_log_client = None | |
| self._async_sso_client = None | |
| self._async_task_client = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Async cleanup fails, leaking HTTP connections.
The aclose method fails to close _async_oauth_token_manager which owns an HTTP client that needs cleanup. The AsyncOAuthTokenManager has an aclose method that should be called to properly close its _http_client when it owns it. This creates a resource leak where the HTTP connection remains open after the async client is closed.
pyatlan/client/aio/client.py#L927-L951
atlan-python/pyatlan/client/aio/client.py
Lines 927 to 951 in c383729
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: | |
| await self._async_session.aclose() | |
| self._async_session = None | |
| # Clean up all client references | |
| self._async_admin_client = None | |
| self._async_asset_client = None | |
| self._async_audit_client = None | |
| self._async_contract_client = None | |
| self._async_credential_client = None | |
| self._async_file_client = None | |
| self._async_group_client = None | |
| self._async_impersonate_client = None | |
| self._async_open_lineage_client = None | |
| self._async_query_client = None | |
| self._async_role_client = None | |
| self._async_search_log_client = None | |
| self._async_sso_client = None | |
| self._async_task_client = None | |
| self._async_token_client = None | |
| self._async_typedef_client = None | |
| self._async_user_client = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Async client leaks OAuth resources.
The aclose() method in AsyncAtlanClient closes the async session but doesn't close the _async_oauth_token_manager if it exists. When OAuth authentication is used, the AsyncOAuthTokenManager owns an httpx.AsyncClient instance that should be properly closed via await self._async_oauth_token_manager.aclose() to prevent resource leaks. The method should check if _async_oauth_token_manager exists and close it before setting client references to None.
pyatlan/client/aio/client.py#L927-L932
atlan-python/pyatlan/client/aio/client.py
Lines 927 to 932 in ec8c0ec
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: | |
| await self._async_session.aclose() | |
| self._async_session = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Async OAuth leaks resources on close.
The aclose method fails to close the _async_oauth_token_manager resource when it exists. The AsyncOAuthTokenManager owns an httpx.AsyncClient instance (when _owns_client is True) that needs to be properly closed via aclose(). Without this cleanup, the async HTTP client connection pool and resources remain open, causing a resource leak when using OAuth authentication with the async client.
pyatlan/client/aio/client.py#L927-L952
atlan-python/pyatlan/client/aio/client.py
Lines 927 to 952 in 88c909a
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: | |
| await self._async_session.aclose() | |
| self._async_session = None | |
| # Clean up all client references | |
| self._async_admin_client = None | |
| self._async_asset_client = None | |
| self._async_audit_client = None | |
| self._async_contract_client = None | |
| self._async_credential_client = None | |
| self._async_file_client = None | |
| self._async_group_client = None | |
| self._async_impersonate_client = None | |
| self._async_open_lineage_client = None | |
| self._async_query_client = None | |
| self._async_role_client = None | |
| self._async_search_log_client = None | |
| self._async_sso_client = None | |
| self._async_task_client = None | |
| self._async_token_client = None | |
| self._async_typedef_client = None | |
| self._async_user_client = None | |
| self._async_workflow_client = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Async Cleanup Fails, Leaking HTTP Resources
The aclose method fails to clean up _async_oauth_token_manager. When AsyncAtlanClient is used with OAuth authentication, an AsyncOAuthTokenManager instance is created that owns an httpx.AsyncClient. The aclose method properly closes _async_session and nullifies all client references, but never calls await self._async_oauth_token_manager.aclose() or sets it to None. This creates a resource leak where the OAuth manager's HTTP client remains open, preventing proper cleanup when using the async context manager pattern (async with AsyncAtlanClient() as client).
pyatlan/client/aio/client.py#L927-L952
atlan-python/pyatlan/client/aio/client.py
Lines 927 to 952 in 0ff1ca6
| async def aclose(self): | |
| """Close async resources""" | |
| if self._async_session: | |
| await self._async_session.aclose() | |
| self._async_session = None | |
| # Clean up all client references | |
| self._async_admin_client = None | |
| self._async_asset_client = None | |
| self._async_audit_client = None | |
| self._async_contract_client = None | |
| self._async_credential_client = None | |
| self._async_file_client = None | |
| self._async_group_client = None | |
| self._async_impersonate_client = None | |
| self._async_open_lineage_client = None | |
| self._async_query_client = None | |
| self._async_role_client = None | |
| self._async_search_log_client = None | |
| self._async_sso_client = None | |
| self._async_task_client = None | |
| self._async_token_client = None | |
| self._async_typedef_client = None | |
| self._async_user_client = None | |
| self._async_workflow_client = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Falsy String Disrupts Authentication Fallback
The call to get_client(run_time_config.user_id or "") passes an empty string when user_id is None, but the function signature was changed to accept Optional[str] = None. This causes the function to receive an empty string instead of None, which breaks the OAuth fallback logic at line 86 where it checks elif user_id: - an empty string is falsy but won't match the intended None check pattern, potentially causing unexpected authentication behavior.
pyatlan/pkg/utils.py#L187-L188
atlan-python/pyatlan/pkg/utils.py
Lines 187 to 188 in 9c08e1f
| """ | |
| client = get_client(run_time_config.user_id or "") |
| print(oauth_client_id) | ||
| print(oauth_client_secret) | ||
| print(type(oauth_client_id)) | ||
| print(type(oauth_client_secret)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Sensitive Data Leaking from Debug Prints
Debug print() statements were left in the get_client function that output OAuth credentials (oauth_client_id, oauth_client_secret) and their types. These debugging statements should be removed before production as they expose sensitive credentials and create unnecessary output noise.
| oauth_client_id = os.environ.get("ATLAN_OAUTH_CLIENT_ID", "") | ||
| oauth_client_secret = os.environ.get("ATLAN_OAUTH_CLIENT_SECRET", "") | ||
|
|
||
| print(oauth_client_id) |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
sensitive data (password)
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 4 days ago
To fix this issue, remove all direct printing of sensitive environment variable values, specifically oauth_client_id and oauth_client_secret. These variables should never be printed to stdout or logs in clear text. If you need to confirm their presence for debugging, log only masked indicators (e.g., log if values are present, but not their contents), and ensure no sensitive values are exposed.
- In
pyatlan/pkg/utils.py, remove or replace lines 89-92 that print the sensitive values and their types. - It is best to remove these lines entirely unless absolutely necessary for debugging; if some indication is required, log only non-sensitive status (e.g., "OAuth client ID is set" or "OAuth client secret is set").
- No additional imports are required.
- No replacement needed for existing functionality—the sensitive variables should no longer be printed.
| @@ -86,10 +86,6 @@ | ||
| oauth_client_id = os.environ.get("ATLAN_OAUTH_CLIENT_ID", "") | ||
| oauth_client_secret = os.environ.get("ATLAN_OAUTH_CLIENT_SECRET", "") | ||
|
|
||
| print(oauth_client_id) | ||
| print(oauth_client_secret) | ||
| print(type(oauth_client_id)) | ||
| print(type(oauth_client_secret)) | ||
| if oauth_client_id and oauth_client_secret: | ||
| LOGGER.info("Using OAuth client credentials for authentication.") | ||
| client = AtlanClient( |
| oauth_client_secret = os.environ.get("ATLAN_OAUTH_CLIENT_SECRET", "") | ||
|
|
||
| print(oauth_client_id) | ||
| print(oauth_client_secret) |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
sensitive data (password)
This expression logs
sensitive data (secret)
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 4 days ago
To fix the issue, we must ensure that sensitive information like the OAuth client secret is never printed or logged. The lines that print (or could log) the client secret should be entirely removed. If debugging is necessary, a redacted/obfuscated string could be logged for confirmation, but never the secret itself. The same applies (though less urgently) to the OAuth client ID, since while it's less sensitive than the secret, it's often preferable to avoid printing both ID and secret. However, only the client secret is definitely highly sensitive. The print(type(...)) calls should also be removed unless there's a very strong need to check types, but such checks should never include the value.
Therefore:
- Remove lines 89 to 92 (
print(oauth_client_id),print(oauth_client_secret),print(type(oauth_client_id)),print(type(oauth_client_secret))) from pyatlan/pkg/utils.py. - No code needs to be added in place—just these lines should be removed.
| @@ -86,10 +86,6 @@ | ||
| oauth_client_id = os.environ.get("ATLAN_OAUTH_CLIENT_ID", "") | ||
| oauth_client_secret = os.environ.get("ATLAN_OAUTH_CLIENT_SECRET", "") | ||
|
|
||
| print(oauth_client_id) | ||
| print(oauth_client_secret) | ||
| print(type(oauth_client_id)) | ||
| print(type(oauth_client_secret)) | ||
| if oauth_client_id and oauth_client_secret: | ||
| LOGGER.info("Using OAuth client credentials for authentication.") | ||
| client = AtlanClient( |
✨ Description
Briefly explain the purpose of this PR and what it covers. Mention any related issues or Jira tickets.
Jira link: [Insert link here]
🧩 Type of change
Select all that apply:
✅ How has this been tested? (e.g. screenshots, logs, workflow links)
Describe how the change was tested. Include:
📋 Checklist
Note
Adds OAuth client-credentials authentication to Atlan clients (sync/async), including token acquisition, caching, and 401-driven refresh, with env var support and tests.
OAuthTokenManager(sync) andAsyncOAuthTokenManager(async) for token fetch, cache, expiry handling, and invalidation.AtlanClientandAsyncAtlanClientto prefer API key, else OAuth viaATLAN_OAUTH_CLIENT_ID/SECRET.authorizationheader from OAuth token in_create_params; refresh on 401 in_handle_401_token_refresh.GET_OAUTH_CLIENTAPI constant and updateEndPoint.HERACLESto HTTPS.get_client/get_client_asyncto support OAuth via env vars; preserve API key and impersonation fallback..env.examplewith OAuth variables.authlib(and types for dev).tests/unit/test_oauth_client.py) and async (tests/unit/aio/test_oauth_client.py) OAuth flows.Written by Cursor Bugbot for commit 899363b. This will update automatically on new commits. Configure here.