sandbox-image: complete with dataplane-signed URI #698
Open
sgirones wants to merge 16 commits into
Adds a CLI bridge so `build_sandbox_image` works against both the legacy platform-api response (embedded pre-signed `upload` block) and the new versioned-response shape (`snapshotRelPath` only). On the new path the CLI calls the sandbox-proxy `POST /api/v1/blob/sign` endpoint and splices the returned upload spec into the raw prepared spec before handing spec.json to the in-sandbox rootfs builder. The branch key (`snapshot_rel_path`) is the only field added to the typed `PreparedSandboxTemplateBuild`. Everything else — including the `upload` block from either path — stays opaque inside the raw passthrough `Value`, preserving the property that future fields added to the platform-api ↔ in-sandbox-builder contract don't require an SDK release. Always multipart on the new path with 100 MB parts, clamped to ≥ 1 and saturated at u32::MAX; size hint reuses the existing `rootfs_disk_bytes` precedence (explicit --disk_mb → parent's rootfsDiskBytes for diff builds → default). Bindings (Python, Node) are unchanged — they only see the final registered-template JSON. Co-authored-by: Cursor <cursoragent@cursor.com>
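The size-hint precedence described above (explicit `--disk_mb`, then the parent's `rootfsDiskBytes` for diff builds, then a default) can be sketched as below. The function name, the `BYTES_PER_MB` constant, and the binary-megabyte conversion are assumptions for illustration, not the CLI's actual code.

```rust
/// Bytes per megabyte used for the `--disk_mb` conversion.
/// Assumption: binary megabytes; the real CLI may use a different factor.
const BYTES_PER_MB: u64 = 1024 * 1024;

/// Hypothetical helper mirroring the precedence in the commit message:
/// explicit `--disk_mb`, then the parent's `rootfsDiskBytes` (only present
/// for diff builds), then a default.
fn resolve_rootfs_disk_bytes(
    explicit_disk_mb: Option<u64>,
    parent_rootfs_disk_bytes: Option<u64>,
    default_bytes: u64,
) -> u64 {
    explicit_disk_mb
        // Saturate rather than overflow on absurdly large flag values.
        .map(|mb| mb.saturating_mul(BYTES_PER_MB))
        .or(parent_rootfs_disk_bytes)
        .unwrap_or(default_bytes)
}
```

Resolving the value once and reusing it for both builder sizing and the upload size hint keeps the two from drifting apart.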
Platform-api is moving the snapshot location off `snapshotUri` and onto `snapshotRelPath` (the rel-path then gets resolved client-side via `SandboxProxyClient::sign_blob`). Stop requiring `snapshotUri` on the prepared-spec response so the CLI keeps deserializing once platform-api drops the field. The completion path now prefers the in-sandbox builder's metadata.json for the final URI (it always knows where it landed the upload), falls back to the prepared value for the legacy path, and errors clearly if neither source provides one — instead of POSTing an empty string to platform-api's complete endpoint. Co-authored-by: Cursor <cursoragent@cursor.com>
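A minimal sketch of the completion-path fallback described above: prefer the builder's metadata URI, fall back to the prepared value, and fail if neither is present. The function name and the choice to treat an empty string as missing are assumptions; the real code may differ.

```rust
/// Hypothetical helper: pick the URI to send to the complete endpoint.
/// Assumption: an empty string counts as "not provided", so it is never
/// forwarded to platform-api.
fn resolve_final_snapshot_uri(
    metadata_uri: Option<&str>,
    prepared_uri: Option<&str>,
) -> Result<String, String> {
    metadata_uri
        .filter(|u| !u.is_empty())
        // Legacy path: fall back to the URI from the prepared spec.
        .or(prepared_uri.filter(|u| !u.is_empty()))
        .map(str::to_owned)
        .ok_or_else(|| {
            "no snapshot URI in builder metadata or prepared spec".to_string()
        })
}
```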
`pick_upload_op` always returned `MultipartPut` — it "picked" nothing. The whole helper, plus `disk_mb_for_upload`, plus the four boundary tests, were just wrapping a one-line part-count computation around the sole call site in `build_sandbox_image`. Inline it. The splice now reuses the `rootfs_disk_bytes` value already computed just upstream for builder sizing, so we don't recompute the same precedence (explicit --disk_mb → parent rootfsDiskBytes for diff → default). `MULTIPART_PART_SIZE_MB` stays as the one tunable, and the clamp / saturation rationale moves into the comment at the call site. Net -42 lines. Co-authored-by: Cursor <cursoragent@cursor.com>
Drop `#[serde(rename_all = "camelCase")]` so `rel_path` goes on the wire as `rel_path` to match the sandbox-proxy's expected payload shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the part size from 100 MiB to 64 MiB and cap the requested part count at S3's 10,000-part limit so absurd disk budgets don't ask the proxy to mint an invalid multipart op. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
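The part-count computation with the new bounds might look roughly like the following. `MULTIPART_PART_SIZE_MB` and `MAX_MULTIPART_PARTS` are named in the commit messages; the function name and the byte-based input are assumptions for illustration.

```rust
/// Part size from the commit message: 64 MiB.
const MULTIPART_PART_SIZE_MB: u64 = 64;
/// S3 allows at most 10,000 parts in a single multipart upload.
const MAX_MULTIPART_PARTS: u64 = 10_000;

/// Hypothetical helper: number of parts to request from the proxy for a
/// rootfs of the given size. Always at least 1, never above the S3 cap.
fn requested_part_count(rootfs_disk_bytes: u64) -> u32 {
    let part_size_bytes = MULTIPART_PART_SIZE_MB * 1024 * 1024;
    // Round up so a trailing partial part still gets its own slot.
    let parts = rootfs_disk_bytes.div_ceil(part_size_bytes);
    // The clamp keeps the value within 1..=10_000, so the cast cannot truncate.
    parts.clamp(1, MAX_MULTIPART_PARTS) as u32
}
```

With these constants the largest upload the requested parts can cover is 10,000 × 64 MiB, or 625 GiB.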
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extend SignBlobRequest to accept either a rel_path or a full uri and add a SingleGet BlobOp so the proxy can presign downloads. When a prepared spec includes a parent, fetch a signed download for the parent manifest URI and inject it into the prepared spec.
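One way to model "either a `rel_path` or a full `uri`" is an enum for the target alongside the operation. This is a sketch only: `BlobTarget`, the `parts` field, and both constructors are hypothetical, and the actual `SignBlobRequest` wire shape is not shown in this PR text.

```rust
/// Hypothetical: what the proxy is asked to sign.
#[derive(Debug, Clone, PartialEq)]
enum BlobTarget {
    /// Resolved to a full URI by the dataplane.
    RelPath(String),
    /// Already a full stored URI, e.g. a parent manifest.
    Uri(String),
}

/// Operations named in the commit messages; the `parts` field is assumed.
#[derive(Debug, Clone, PartialEq)]
enum BlobOp {
    MultipartPut { parts: u32 },
    SingleGet,
}

#[derive(Debug, Clone, PartialEq)]
struct SignBlobRequest {
    target: BlobTarget,
    op: BlobOp,
}

impl SignBlobRequest {
    /// Upload of a new snapshot, addressed by relative path.
    fn upload(rel_path: &str, parts: u32) -> Self {
        Self {
            target: BlobTarget::RelPath(rel_path.to_string()),
            op: BlobOp::MultipartPut { parts },
        }
    }

    /// Download of a parent manifest, addressed by its stored URI.
    fn parent_download(uri: &str) -> Self {
        Self {
            target: BlobTarget::Uri(uri.to_string()),
            op: BlobOp::SingleGet,
        }
    }
}
```

Using an enum makes it impossible to build a request with both or neither of the two addressing modes.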
Cross-reference MAX_MULTIPART_PARTS in the dataplane's sign_blob endpoint so a future change to either side flags the other.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets pyproject.toml, Cargo.toml, and crates/rust-cloud-sdk-py/pyproject.toml to 0.5.28 and regenerates Cargo.lock. Skips 0.5.27, which main partially claimed by hand-bumping only the root pyproject.toml. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…pending `wait_for_sandbox_status` propagated the transient 502 / PROXY_ERROR that the lifecycle gateway returns while a sandbox is still `pending` (not yet routable), failing the build before it ever reached the builder. In slower environments the pending window is a minute or two, which broke `sbx image create` outright. Retry transient proxy errors the same way `wait_for_proxy_ready` already does, reusing `is_transient_proxy_error`; the existing deadline still bounds the wait, and non-transient errors still fail immediately. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
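The retry behavior described above, reduced to a synchronous sketch: transient errors are treated like "not ready yet", other errors fail immediately, and the deadline bounds the whole wait. `PollError` and `wait_until_ready` are hypothetical names; the real `wait_for_sandbox_status` is likely async and classifies errors through `is_transient_proxy_error`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
enum PollError {
    /// e.g. a 502 / PROXY_ERROR while the sandbox is still pending.
    Transient(String),
    /// Anything that should fail the build right away.
    Fatal(String),
    /// The deadline passed before the sandbox became ready.
    TimedOut,
}

/// Poll until `poll` returns `Ok(true)`, retrying on transient errors.
fn wait_until_ready<F>(
    mut poll: F,
    timeout: Duration,
    interval: Duration,
) -> Result<(), PollError>
where
    F: FnMut() -> Result<bool, PollError>,
{
    let deadline = Instant::now() + timeout;
    loop {
        match poll() {
            Ok(true) => return Ok(()),
            // Not ready yet, or a transient proxy error: keep waiting.
            Ok(false) | Err(PollError::Transient(_)) => {}
            // Non-transient errors still fail immediately.
            Err(other) => return Err(other),
        }
        if Instant::now() >= deadline {
            return Err(PollError::TimedOut);
        }
        std::thread::sleep(interval);
    }
}
```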
Summary
Sandbox image builds now get their rootfs upload URLs from the dataplane instead of from platform-api. When `/prepare` returns `snapshotRelPath`, the CLI asks the builder sandbox's proxy to call dataplane `POST /api/v1/blob/sign`; dataplane composes the final object URI from its regional config, signs the multipart upload, and returns that canonical `uri` with the upload spec.
The CLI treats dataplane's signed `uri` as the source of truth. It writes that URI into `spec.snapshotUri` so the in-sandbox builder keeps the existing input contract, keeps the same URI in CLI state, and sends it directly to `/complete`. Builder metadata still supplies computed fields like `snapshot_size_bytes`, `snapshot_format_version`, and `rootfs_disk_bytes`; if metadata includes `snapshot_uri`, the CLI checks that it matches the dataplane URI.
The legacy `/prepare` shape is unchanged. If `snapshotRelPath` is absent, the prepared spec already contains `upload`, and completion continues to use builder metadata with the prepared `snapshotUri` fallback.
Notes For Review
- `snapshotRelPath` is only used to request dataplane signing. Completion is keyed off the saved signed URI, not the rel path.
- … `uri`; missing `uri` fails before the builder runs.
- … `uri`.
- … `SingleGet` requests.
Test Plan
- `cargo +nightly fmt --check`
- `git diff --check`
- `cargo test -p tensorlake sandbox_images`
- … `upload`.
- … `sign_blob`.
Related PRs
- platform-api: https://github.com/tensorlakeai/platform-api/pull/530
- compute-engine-internal: https://github.com/tensorlakeai/compute-engine-internal/pull/1039
- tensorlake: sandbox-image: complete with dataplane-signed URI #698
Flow
For parent snapshot reads, the CLI uses `{ uri, op: SingleGet }` because the parent manifest already has a full stored URI.
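On the new path, the Summary says the signed URI is the source of truth and that a `snapshot_uri` in builder metadata is only checked against it. A minimal sketch of that check, with a hypothetical function name and error message:

```rust
/// Hypothetical helper: return the URI to send to `/complete` on the new
/// path. The dataplane-signed URI always wins; metadata is only validated.
fn completion_uri<'a>(
    signed_uri: &'a str,
    metadata_snapshot_uri: Option<&str>,
) -> Result<&'a str, String> {
    match metadata_snapshot_uri {
        // The builder reported a different location than the one signed.
        Some(reported) if reported != signed_uri => Err(format!(
            "builder reported snapshot_uri {reported:?}, expected {signed_uri:?}"
        )),
        // Metadata omitted the URI or agrees with the signed one.
        _ => Ok(signed_uri),
    }
}
```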