Skip to content

Refactor FilesExt: centralize hostname routing and simplify session management#1294

Merged
parthban-db merged 3 commits intomainfrom
parthban-db/stack/refactor-files-client-2
Mar 10, 2026
Merged

Refactor FilesExt: centralize hostname routing and simplify session management#1294
parthban-db merged 3 commits intomainfrom
parthban-db/stack/refactor-files-client-2

Conversation

@parthban-db
Copy link
Contributor

@parthban-db parthban-db commented Feb 24, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Why

Today the FilesExt class builds API URLs using hardcoded relative paths (e.g. "/api/2.0/fs/create-upload-part-urls") and creates a new unauthenticated requests.Session on every upload/download operation. This works, but has two limitations:

  • No single point of control for request routing. When we add storage-proxy support, uploads need to be routed to a different hostname. With paths scattered across a dozen methods, everyone needs to change. A central _get_hostname() makes the future change a one-liner.
  • Unnecessary session churn. A cloud_provider_session was created by a high-level method and threaded through 5+ layers of calls as a parameter. Some methods received it only to pass it further down. This made signatures noisy, and the data flow hard to follow.

What changed

Interface changes

None. All changes are to private methods.

Behavioral changes

None. This is a pure refactor — the cloud provider session is now cached and reused instead of re-created per operation, but requests.Session is safe to share across calls.

Internal changes

  • _get_hostname() — new method returning the workspace hostname. All self._api.do() calls now use url=f"{hostname}/api/..." instead of relative paths. This is the future extension point for storage-proxy routing.
  • _cloud_provider_session() — renamed from _create_cloud_provider_session(). Caches the session on first call. Callers invoke self._cloud_provider_session() directly instead of storing a local variable.

How is this tested?

  • All existing unit tests.
  • One new test case was added ("Create resumable URL: node exists but URL is empty") to cover the build_resumable_upload_url validation path, where the response node exists but contains no URL.

NO_CHANGELOG=true

return session
The session is created on first call and cached for reuse.
"""
if self._cached_cloud_provider_session is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The self._cached_cloud_provider_session is cached forever once it's created. I am a bit concerned about this as in rare cases the Session might enter an unrecoverable state and all of the following requests will fail and the entire WorkspaceClient has to be re-created to recover.

Example of such case (though this particular case has already been fixed): link

Copy link
Collaborator

@yuanjieding-db yuanjieding-db Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few possible ways to avoid this scenario:

  • Automatic discard of cache: Maybe we can clear this cache after certain time (say 1h)
  • Active discard of cache by caller: We can also add this logic to the retry function to make sure the cache is cleared if all retry has failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_BaseClient._session follows the exact same pattern — it's created once in __init__ and never invalidated or refreshed. So, I would not like to fix this for this specific scenario, given that we don't have much evidence that it is a recurring problem.

def _create_cloud_provider_session(self) -> requests.Session:
"""Creates a separate session which does not inherit auth headers from BaseClient session."""
session = requests.Session()
def _cloud_provider_session(self) -> requests.Session:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this function atomic? Multiple threads may be calling the function at the same time and it's possible to create multiple new Sessions while only one Session is cached and returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

@github-actions
Copy link

github-actions bot commented Mar 9, 2026

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1294
  • Commit SHA: bb9261544e3ae337462fe1cf1f2ae725c015f7c2

Checks will be approved automatically on success.

Copy link
Collaborator

@yuanjieding-db yuanjieding-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the PR!

@parthban-db parthban-db added this pull request to the merge queue Mar 10, 2026
Merged via the queue into main with commit 127b960 Mar 10, 2026
17 checks passed
@parthban-db parthban-db deleted the parthban-db/stack/refactor-files-client-2 branch March 10, 2026 17:16
@parthban-db parthban-db restored the parthban-db/stack/refactor-files-client-2 branch March 10, 2026 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants