Add storage proxy support for FilesExt uploads #1319

Merged
parthban-db merged 4 commits into main from parthban-db/stack/refactor-files-client-4 on Mar 13, 2026

@parthban-db parthban-db commented Mar 10, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Summary

Adds optional storage-proxy routing for file uploads, gated behind the experimental_files_ext_enable_storage_proxy config flag. When enabled and the proxy is reachable, uploads bypass the presigned URL coordination APIs and go directly to the storage proxy.

Why

When the SDK runs inside a Databricks cluster or notebook, a storage proxy is available on the data plane and can handle file upload operations directly.

The feature is disabled by default and marked experimental, so there is no impact on existing users.

What changed

Interface changes

One new config field:

  • experimental_files_ext_enable_storage_proxy: bool = False — enables the storage proxy path.

Behavioral changes

  • When the flag is enabled and the probe succeeds, multipart uploads (AWS/Azure) and resumable uploads (GCP) go directly to the proxy instead of fetching presigned URLs from coordination APIs.
  • The proxy requires SDK authentication on every request, so an authenticated session (session.auth callback) is used instead of the unauthenticated session used for presigned URL uploads.
  • build_abort_url was added to _PresignedUrlRequestBuilder, extracting inline abort URL logic from _abort_multipart_upload.

Internal changes

  • _StorageProxyRequestBuilder — new class with the same method signatures as _PresignedUrlRequestBuilder (build_upload_part_urls, build_resumable_upload_url, build_abort_url). Constructs URLs directly to the proxy endpoint instead of calling coordination APIs.
  • _create_request_builder() — factory method that returns the proxy builder when the proxy is available, otherwise the presigned URL builder.
  • _probe_storage_proxy() — GET to the proxy ping endpoint with SDK auth. Result is cached.
  • _create_storage_proxy_session() — cached requests.Session with session.auth callback for SDK credentials. Protected by a lock for thread safety.
  • _cloud_provider_session() — updated to return the authenticated proxy session when the proxy is active. Session creation protected by a lock for thread safety.
  • _get_hostname() — updated to return the proxy hostname when the flag is enabled and the probe succeeds.
  • _abort_multipart_upload — now uses the builder instead of inline create-abort-upload-url API call.
  • Test infrastructure: UploadTestCase gains use_storage_proxy parameter. Storage proxy test cases added to existing MultipartUploadTestCase and ResumableUploadTestCase parametrized lists. Presigned URL coordination handlers assert they are never called when storage proxy is active.
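
The builder selection and cached probe described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the SDK's code: `UploadClient`, `ProxyBuilder`, `PresignedBuilder`, and the injected `probe` callable are hypothetical stand-ins. Guarding the probe cache with a lock and treating a failed probe as "proxy unavailable" are assumptions made for the sketch.

```python
import threading
from typing import Callable, Optional


class PresignedBuilder:
    """Hypothetical stand-in for the presigned URL builder."""


class ProxyBuilder:
    """Hypothetical stand-in for the storage proxy builder."""


class UploadClient:
    """Illustrative client that probes the proxy once and picks a builder."""

    def __init__(self, flag_enabled: bool, probe: Callable[[], bool]):
        self._flag_enabled = flag_enabled
        # Hypothetical probe: returns True if the proxy ping succeeds.
        self._probe = probe
        self._probe_result: Optional[bool] = None
        self._lock = threading.Lock()

    def proxy_available(self) -> bool:
        # The flag gates everything: no probe is sent when it is off.
        if not self._flag_enabled:
            return False
        with self._lock:
            if self._probe_result is None:
                try:
                    self._probe_result = bool(self._probe())
                except Exception:
                    # Assumption: any probe failure means "use presigned URLs".
                    self._probe_result = False
            return self._probe_result

    def create_request_builder(self):
        # Factory: proxy builder when available, presigned builder otherwise.
        if self.proxy_available():
            return ProxyBuilder()
        return PresignedBuilder()
```

Because both builders expose the same method signatures, the upload code can call whichever one the factory returns without branching on the transport.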

How is this tested?

  • Unit tests.
  • E2E validated on two cloud environments with 200 MB file uploads:
    • AWS (AWS staging): multipart upload via proxy, 20 parallel parts, hash verified.
    • GCP (GCP staging): resumable upload via proxy, 20 sequential chunks, hash verified.
  • Both environments confirmed: _StorageProxyRequestBuilder used, no create-upload-part-urls/create-resumable-upload-url calls, session.auth callback active, uploaded and downloaded hashes match.

NO_CHANGELOG=true

@parthban-db parthban-db changed the title from "update" to "Add storage proxy support for FilesExt uploads" Mar 10, 2026
@parthban-db parthban-db force-pushed the parthban-db/stack/refactor-files-client-4 branch from 2fdc0b2 to 92878a6 Compare March 10, 2026 17:20
@parthban-db parthban-db marked this pull request as ready for review March 10, 2026 21:34
github-merge-queue bot pushed a commit that referenced this pull request Mar 11, 2026
…1295)

## 🥞 Stacked PR
Use this [link](https://github.com/databricks/databricks-sdk-py/pull/1295/files) to review incremental changes.
- [**stack/refactor-files-client-3**](#1295) [[Files changed](https://github.com/databricks/databricks-sdk-py/pull/1295/files)]
- [stack/refactor-files-client-4](#1319) [[Files changed](https://github.com/databricks/databricks-sdk-py/pull/1319/files/9e504b191ad065bf8adbeb003fa493ba51bcc165..8218fc4ae84ff4b9ec2f838b26faa0c3a957f2e5)]
- [stack/refactor-files-client-5](#1324) [[Files changed](https://github.com/databricks/databricks-sdk-py/pull/1324/files/8218fc4ae84ff4b9ec2f838b26faa0c3a957f2e5..2c4f3e63287e86b0984a4dfb323547cdaacfa058)]

---------
## Summary

Extracts the presigned URL coordination logic (`create-upload-part-urls`
and `create-resumable-upload-url` API calls and response parsing) into a
dedicated `_PresignedUrlRequestBuilder` class, replacing inline code in
three upload methods.

## Why

The `create-upload-part-urls` API call, response validation, and header
parsing were duplicated across `_do_upload_one_part`,
`_perform_multipart_upload`, and `_perform_resumable_upload` (with a
similar pattern for `create-resumable-upload-url`). Each copy built the
request body, called `_api.do()`, validated the response structure, and
converted the headers list to a dict — all inline. This made the upload
methods longer than necessary and meant any change to the coordination
logic required updating multiple places.
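
As a rough illustration of the duplicated step described above (validate the response, then turn the headers list into a dict), here is a minimal sketch. The response shape (`upload_part_urls` entries carrying `url` and a `headers` list of `name`/`value` pairs) is an assumption for the example, and `PresignedUrl` and `parse_upload_part_urls` are hypothetical names, not the SDK's private API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PresignedUrl:
    """Hypothetical holder for a resolved URL and its headers."""

    url: str
    headers: Dict[str, str] = field(default_factory=dict)


def parse_upload_part_urls(response: Dict[str, Any]) -> List[PresignedUrl]:
    """Validate an (assumed) coordination response and normalize headers."""
    entries = response.get("upload_part_urls")
    if not entries:
        # Missing or empty list: treat as a malformed response.
        raise ValueError(f"Unexpected server response: {response!r}")
    result = []
    for entry in entries:
        # Convert [{"name": ..., "value": ...}, ...] into a plain dict.
        headers = {h["name"]: h["value"] for h in entry.get("headers", [])}
        # A missing "url" key raises KeyError here.
        result.append(PresignedUrl(url=entry["url"], headers=headers))
    return result
```

Keeping this in one function means a change to the response format is handled in a single place rather than in each upload method.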

`_PresignedUrlRequestBuilder` consolidates this into a single class. It
also prepares for storage-proxy routing (#1278), where a different
builder implementation can construct URLs directly instead of calling
the presigned URL APIs.

## What changed

### Interface changes

None. All changes are to private methods.

### Behavioral changes

- Malformed presigned URL responses (`ValueError`/`KeyError` from
response parsing) now trigger fallback to single-shot upload on the
first part, instead of propagating as hard errors. Previously the
`_api.do()` call was wrapped in `try/except` but the response parsing
was outside it. Now both are encapsulated in the builder, so parsing
errors are also caught. This is more resilient — a broken coordination
response should not abort the upload when a simpler path exists.
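
The fallback behavior described above can be sketched as below. This is an illustration under stated assumptions, not the SDK's implementation: `upload_with_fallback` and its three callables are hypothetical, and only the parsing errors named above (`ValueError`/`KeyError`) are shown; API-call errors are left out of the sketch.

```python
from typing import Callable, List


def upload_with_fallback(
    build_first_urls: Callable[[], List[str]],
    multipart: Callable[[List[str]], str],
    single_shot: Callable[[], str],
) -> str:
    """Try the multipart path; fall back if the first URL request is malformed."""
    try:
        # Hypothetical builder call: fetches and parses URLs for the first part.
        urls = build_first_urls()
    except (ValueError, KeyError):
        # Parsing failed before any part was uploaded, so the simpler
        # single-shot path is still safe to use.
        return single_shot()
    return multipart(urls)
```

The fallback only makes sense on the first part: once some parts have been uploaded, switching to a single-shot upload would discard that work.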

### Internal changes

- **`_PresignedUrl`** — new dataclass holding a resolved presigned URL
and its associated headers.
- **`_PresignedUrlRequestBuilder`** — new class with
`build_upload_part_urls()` and `build_resumable_upload_url()`.
Encapsulates the coordination API calls and response parsing.
- **`_do_upload_one_part`** — replaced inline `create-upload-part-urls`
call and response parsing with `builder.build_upload_part_urls(...,
count=1)[0]`.
- **`_perform_multipart_upload`** — replaced inline
`create-upload-part-urls` batch call and response parsing with
`builder.build_upload_part_urls(..., count=batch_size)`.
- **`_perform_resumable_upload`** — replaced inline
`create-resumable-upload-url` call and response parsing with
`builder.build_resumable_upload_url()`.

## How is this tested?

Unit tests.

NO_CHANGELOG=true
@parthban-db parthban-db force-pushed the parthban-db/stack/refactor-files-client-4 branch from 8218fc4 to c7928c1 Compare March 11, 2026 22:38
@parthban-db

Range-diff: main (8218fc4 -> c7928c1)
databricks/sdk/mixins/files.py
@@ -45,25 +45,31 @@
 +        results = []
 +        for i in range(count):
 +            part_number = start_part_number + i
-+            query = parse.urlencode({
-+                "session_token": session_token,
-+                "upload_type": "multipart",
-+                "part_number": part_number,
-+            })
-+            results.append(_PresignedUrl(
-+                url=f"{base}?{query}",
-+                headers={"Content-Type": "application/octet-stream"},
-+            ))
++            query = parse.urlencode(
++                {
++                    "session_token": session_token,
++                    "upload_type": "multipart",
++                    "part_number": part_number,
++                }
++            )
++            results.append(
++                _PresignedUrl(
++                    url=f"{base}?{query}",
++                    headers={"Content-Type": "application/octet-stream"},
++                )
++            )
 +        return results
 +
 +    def build_resumable_upload_url(self, path: str, session_token: str) -> _PresignedUrl:
 +        """Builds a URL for resumable upload directly to the storage proxy."""
 +        escaped = _escape_multi_segment_path_parameter(path)
 +        base = f"{self._hostname}/api/2.0/fs/files{escaped}"
-+        query = parse.urlencode({
-+            "session_token": session_token,
-+            "upload_type": "resumable",
-+        })
++        query = parse.urlencode(
++            {
++                "session_token": session_token,
++                "upload_type": "resumable",
++            }
++        )
 +        return _PresignedUrl(
 +            url=f"{base}?{query}",
 +            headers={"Content-Type": "application/octet-stream"},
@@ -73,10 +79,12 @@
 +        """Builds a URL for aborting an upload directly on the storage proxy."""
 +        escaped = _escape_multi_segment_path_parameter(path)
 +        base = f"{self._hostname}/api/2.0/fs/files{escaped}"
-+        query = parse.urlencode({
-+            "action": "abort-upload",
-+            "session_token": session_token,
-+        })
++        query = parse.urlencode(
++            {
++                "action": "abort-upload",
++                "session_token": session_token,
++            }
++        )
 +        return _PresignedUrl(
 +            url=f"{base}?{query}",
 +            headers={"Content-Type": "application/json"},
tests/test_files.py
@@ -346,5 +346,4 @@
      chunk = os.urandom(chunk_size)
 -    # Repeat it until we reach n bytes
 +    # Repeat it until we reach n bytes.
-     return (chunk * (n // chunk_size + 1))[:n]
-+
\ No newline at end of file
+     return (chunk * (n // chunk_size + 1))[:n]
\ No newline at end of file

Reproduce locally: git range-diff 9e504b1..8218fc4 ff61559..c7928c1 | Disable: git config gitstack.push-range-diff false
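
For readers following the `databricks/sdk/mixins/files.py` hunks above, the URL construction can be tried in isolation with the sketch below. It mirrors the path and query parameters shown in the diff, but `build_part_url` is a hypothetical free function, and `urllib.parse.quote` is used as an assumed stand-in for the SDK's `_escape_multi_segment_path_parameter` helper, whose exact escaping rules are not shown here.

```python
from urllib import parse


def build_part_url(hostname: str, path: str, session_token: str, part_number: int) -> str:
    """Build a multipart upload-part URL in the style shown in the diff."""
    # Assumption: quote() approximates the SDK's path-escaping helper.
    escaped = parse.quote(path)
    base = f"{hostname}/api/2.0/fs/files{escaped}"
    query = parse.urlencode(
        {
            "session_token": session_token,
            "upload_type": "multipart",
            "part_number": part_number,
        }
    )
    return f"{base}?{query}"


# Example with placeholder values (hostname and token are made up).
print(build_part_url("https://proxy.example", "/Volumes/a/b c.bin", "tok", 3))
```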

@github-actions

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1319
  • Commit SHA: 85824ec07a796f348263077a05c5339a190f1131

Checks will be approved automatically on success.

@parthban-db parthban-db added this pull request to the merge queue Mar 13, 2026
Merged via the queue into main with commit af8f88c Mar 13, 2026
17 checks passed
@parthban-db parthban-db deleted the parthban-db/stack/refactor-files-client-4 branch March 13, 2026 09:56