OCDBT support for graphene supervoxel splitting#631

Open
akhileshh wants to merge 10 commits into spelunker from spelunker-ocdbt

@akhileshh

Adds support for reading graphene segmentation backed by an OCDBT database with a kvstack overlay: an immutable base layer holds source watershed data shared across graphs, and per-graph "fork" layers hold edits. Readers must route through the full stack to see the fork's view: reading the base alone misses edits, and reading the fork alone misses everything that hasn't been edited.

  • Add kvstack kvstore driver (base/exact/prefix matchers, last-match-wins, lazy layer resolution). URL form kvstack:<percent-encoded-JSON>[/<path>].
  • Graphene reads graph.ocdbt_kvstore_spec from the server and builds a kvstack:...|ocdbt: pipeline URL; per-scale routing sends OCDBT-backed scales through the fork and hides non-OCDBT scales so stale precomputed data can't leak in.
  • After a multicut, fire an RPC that clears the ocdbt:manifest/btree/version caches so the next read resolves a fresh B+tree root without a manual reload.
  • Optionally skip the multicut "supervoxel already selected" guard when OCDBT is active, to support graphene backends whose supervoxel splits happen within a single supervoxel (user selects the same supervoxel on both sides of the cut).
  • Add SimpleAsyncCache.invalidateAll() and validator/encoding hardening on the kvstack URL; surface OCDBT scale-list failures with the offending URL instead of an opaque kvstore error.

Adds invalidateOcdbtCaches() so consumers can clear the three
metadata caches after a server-side OCDBT mutation, forcing the
next read to resolve a fresh root.

Also removes the per-instance root memoization on OcdbtKvStore --
it was redundant with the ocdbt:version SimpleAsyncCache and
prevented invalidation from taking effect.
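
A minimal sketch of why a per-instance memo defeats invalidation. `VersionCache`, `MemoizingStore`, and `DirectStore` are hypothetical stand-ins written for this illustration; they are not the neuroglancer classes and only mimic the shape described above.

```typescript
// Hypothetical stand-ins; none of these are neuroglancer classes.
type Root = { generation: number };

class VersionCache {
  private entries = new Map<string, Promise<Root>>();
  private load: (url: string) => Promise<Root>;

  constructor(load: (url: string) => Promise<Root>) {
    this.load = load;
  }

  get(url: string): Promise<Root> {
    let entry = this.entries.get(url);
    if (entry === undefined) {
      entry = this.load(url);
      this.entries.set(url, entry);
    }
    return entry;
  }

  invalidateAll(): void {
    this.entries.clear();
  }
}

// Old shape: the store keeps its own copy of the root promise.
class MemoizingStore {
  private root: Promise<Root> | undefined;
  private cache: VersionCache;
  private url: string;

  constructor(cache: VersionCache, url: string) {
    this.cache = cache;
    this.url = url;
  }

  getRoot(): Promise<Root> {
    if (this.root === undefined) this.root = this.cache.get(this.url);
    return this.root; // the shared cache is never consulted again
  }
}

// New shape: every call goes through the shared cache.
class DirectStore {
  private cache: VersionCache;
  private url: string;

  constructor(cache: VersionCache, url: string) {
    this.cache = cache;
    this.url = url;
  }

  getRoot(): Promise<Root> {
    return this.cache.get(this.url);
  }
}
```

After `invalidateAll()`, `MemoizingStore.getRoot()` still returns the promise it captured earlier, while `DirectStore.getRoot()` triggers a fresh load.
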

Adds ocdbt_seg / ocdbt_path graph-info fields. When ocdbt_seg is
set, segmentation volume reads route through a per-scale OCDBT
pipeline URL (scales auto-discovered via list()). Non-OCDBT
scales are filtered from getSources() so the graphene layer only
shows data available in the fork.
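
The scale filtering can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `ScaleInfo` mirrors only the `key` and `resolution` fields of a precomputed scale entry, `scalesBackedByFork` is a hypothetical helper, and the listing is assumed to yield one directory name per scale, possibly with a trailing slash.

```typescript
// Hypothetical shape: only the fields needed for the illustration.
interface ScaleInfo {
  key: string;
  resolution: number[];
}

// `listedDirectories` stands in for the names returned by list() on the
// OCDBT URL (assumed to look like "8_8_40/" or "8_8_40").
function scalesBackedByFork(
  scales: ScaleInfo[],
  listedDirectories: string[],
): ScaleInfo[] {
  const available = new Set(
    listedDirectories.map((name) => name.replace(/\/+$/, "")),
  );
  // Scales missing from the fork are dropped so they are never offered
  // as sources.
  return scales.filter((scale) => available.has(scale.key));
}
```
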

After a multicut, invalidates the OCDBT metadata caches via RPC
so split supervoxels become visible without a manual reload.

Also skips the "supervoxel already selected" guard in the
multicut tool when ocdbt_seg is active, since SV splits require
selecting the same supervoxel on both sides of the cut.

Adds a base kvstore driver that routes reads/stats to different
backing stores based on per-layer matchers (base / exact / prefix),
matching the semantics of tensorstore's kvstack driver. Last-match
wins on overlaps.
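
A minimal sketch of last-match-wins routing over base / exact / prefix matchers. The `KvStackLayer` shape and the `resolveLayer` helper are assumptions made for this illustration; `base` is reduced to a plain URL string, and mapping the key onto the backing store (for example, prefix stripping) is left out.

```typescript
// Illustrative layer shape; `base` is a URL string here for brevity.
interface KvStackLayer {
  base: string;
  exact?: string;
  prefix?: string;
}

function layerMatches(layer: KvStackLayer, key: string): boolean {
  if (layer.exact !== undefined) return key === layer.exact;
  if (layer.prefix !== undefined) return key.startsWith(layer.prefix);
  return true; // a base-only layer matches every key
}

// Walk the layers from last to first so that later layers win on overlap.
function resolveLayer(
  layers: KvStackLayer[],
  key: string,
): KvStackLayer | undefined {
  for (const layer of layers.slice().reverse()) {
    if (layerMatches(layer, key)) return layer;
  }
  return undefined;
}
```

With a catch-all base layer listed first and fork layers after it, any key the fork layers do not claim falls through to the shared base.
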

URL form is `kvstack:<percent-encoded-JSON-spec>[/<path>]`, with
the JSON matching tensorstore's `{"layers":[{base,exact|prefix}]}`
shape so specs are portable.

Used as the base under OCDBT for pcg v3 fork layouts (next commit).

Replaces the client-side construction of the OCDBT URL from
ocdbt_seg + ocdbt_path with a single new info-JSON field
`graph.ocdbt_kvstore_spec` that carries the full tensorstore
kvstore spec (ocdbt wrapping kvstack) verbatim from pcg.

The client unwraps the spec, URL-encodes its `.base` (the kvstack
layers) as `kvstack:<percent-encoded-json>`, and appends `|ocdbt:`
to get the neuroglancer pipeline URL. OCDBT-level `config` and
`*_data_prefix` fields in the spec are ignored on reads per
tensorstore docs.
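
The unwrapping step can be sketched as below. The interfaces and `pipelineUrlFromSpec` are hypothetical names for this illustration, and the sketch assumes that only the `layers` array of `.base` is carried into the kvstack JSON.

```typescript
// Hypothetical spec shapes, trimmed to the fields used here.
interface KvStackBaseSpec {
  driver: "kvstack";
  layers: unknown[];
}

interface OcdbtKvstoreSpec {
  driver: "ocdbt";
  base: KvStackBaseSpec;
  config?: unknown; // ignored on reads
}

function pipelineUrlFromSpec(spec: OcdbtKvstoreSpec): string {
  // Assumption: the kvstack URL carries only {"layers": [...]}.
  const kvstackJson = JSON.stringify({ layers: spec.base.layers });
  // encodeURIComponent also escapes "|" and "/", so the adapter
  // separator appended below stays unambiguous.
  return `kvstack:${encodeURIComponent(kvstackJson)}|ocdbt:`;
}
```
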

Presence of ocdbt_kvstore_spec is now the OCDBT-enabled signal;
absent spec bypasses the kvstack path entirely so legacy v2
graphene layers are unaffected. All remaining `ocdbtSeg` checks now
test whether `ocdbtKvstoreSpec` is defined.

…ncCache.invalidateAll

Adds a new SimpleAsyncCache.invalidateAll() helper and uses it to
collapse the three hand-rolled invalidation loops in
invalidateOcdbtCaches into three one-liners.

Drops the unused baseUrl parameter: invalidation was already
whole-context broad (every OCDBT database in the viewer is flushed),
and the param was never consulted. RPC payload and handler shrink
accordingly.

Adds a comment explaining why the inline stub factories throw:
real factories are registered by prior getManifest/getBtreeNode/getRoot
calls, so memoize.get returns the existing SimpleAsyncCache without
touching the stubs.
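
A minimal sketch of the stub-factory pattern, assuming a get-or-create registry. `TinyAsyncCache`, `CacheRegistry`, and `invalidateAllOcdbtCaches` are hypothetical stand-ins, not the real `SimpleAsyncCache` or memoize API; the three cache names are taken from the PR description.

```typescript
// Hypothetical stand-in for a promise cache with bulk invalidation.
class TinyAsyncCache<V> {
  private entries = new Map<string, Promise<V>>();
  private load: (key: string) => Promise<V>;

  constructor(load: (key: string) => Promise<V>) {
    this.load = load;
  }

  get(key: string): Promise<V> {
    let entry = this.entries.get(key);
    if (entry === undefined) {
      entry = this.load(key);
      this.entries.set(key, entry);
    }
    return entry;
  }

  invalidateAll(): void {
    this.entries.clear();
  }
}

// Get-or-create registry: the factory runs only on the first request.
class CacheRegistry {
  private caches = new Map<string, TinyAsyncCache<unknown>>();

  get(
    name: string,
    factory: () => TinyAsyncCache<unknown>,
  ): TinyAsyncCache<unknown> {
    let cache = this.caches.get(name);
    if (cache === undefined) {
      cache = factory();
      this.caches.set(name, cache);
    }
    return cache;
  }
}

const OCDBT_CACHE_NAMES = ["ocdbt:manifest", "ocdbt:btree", "ocdbt:version"];

function invalidateAllOcdbtCaches(registry: CacheRegistry): void {
  for (const name of OCDBT_CACHE_NAMES) {
    // The stub is reached only if no reader created the cache first.
    const stub = (): TinyAsyncCache<unknown> => {
      throw new Error(`${name} cache was never created`);
    };
    registry.get(name, stub).invalidateAll();
  }
}
```

In this sketch, invalidating before any read has happened hits the stub and throws; the real code may handle that case differently.
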

No changes to upstream OCDBT functions (getManifest, getBtreeNode,
getRoot); invalidateOcdbtCaches was added by us and stays the only
OCDBT-side entry point.

Two small fixes to avoid footguns:

- formatKvStackUrl now percent-encodes the key portion, not just the
  JSON. Keys containing `?`, `#`, or `%` previously produced URLs that
  either failed to parse or round-tripped to a different value, since
  parseKvStackUrl already decodes the path via decodeURIComponent.

- validateKvStackSpec now rejects empty `layers`, empty `base`, and
  empty `prefix`. Empty layers silently routed nothing; empty prefix
  degenerated to a catch-all (`key.startsWith("")` is true for every key), shadowing
  any preceding `base` layer in unobvious ways. Callers should use an
  explicit base-only layer for catch-all routing.
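
Both fixes can be sketched as follows. The `*Sketch` functions are illustrative re-implementations written for this example, not the PR's `formatKvStackUrl`, `parseKvStackUrl`, or `validateKvStackSpec`; per-segment encoding of the key and the exact definition of an "empty" base are assumptions.

```typescript
interface KvStackLayerSpec {
  base: unknown;
  exact?: string;
  prefix?: string;
}

interface KvStackSpec {
  layers: KvStackLayerSpec[];
}

function validateKvStackSpecSketch(spec: KvStackSpec): void {
  if (spec.layers.length === 0) {
    throw new Error("kvstack: `layers` must not be empty");
  }
  for (const layer of spec.layers) {
    if (layer.base === undefined || layer.base === null || layer.base === "") {
      throw new Error("kvstack: layer `base` must not be empty");
    }
    if (layer.prefix === "") {
      throw new Error("kvstack: use a base-only layer instead of an empty `prefix`");
    }
  }
}

function formatKvStackUrlSketch(spec: KvStackSpec, key = ""): string {
  const json = encodeURIComponent(JSON.stringify(spec));
  if (key === "") return `kvstack:${json}`;
  // Encode each path segment so "?", "#", and "%" survive a round trip
  // while "/" stays a readable separator.
  const path = key
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return `kvstack:${json}/${path}`;
}

function parseKvStackUrlSketch(url: string): { spec: KvStackSpec; key: string } {
  const scheme = "kvstack:";
  if (!url.startsWith(scheme)) throw new Error("not a kvstack URL");
  const rest = url.slice(scheme.length);
  // The encoded JSON contains no "/", so the first "/" starts the key.
  const slash = rest.indexOf("/");
  const encodedJson = slash === -1 ? rest : rest.slice(0, slash);
  const key = slash === -1 ? "" : decodeURIComponent(rest.slice(slash + 1));
  const spec = JSON.parse(decodeURIComponent(encodedJson)) as KvStackSpec;
  validateKvStackSpecSketch(spec);
  return { spec, key };
}
```
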

parseGrapheneMultiscaleVolumeInfo calls list() on the OCDBT URL to
discover which scales are backed by the fork. Previously any failure
(transient network, auth, misconfigured spec) propagated as an opaque
error from deep in kvstore; the user saw "read failed" with no hint
that OCDBT setup was the root cause.

Wrap the list() call in a try/catch that rethrows with the OCDBT URL
and a note that the graphene layer cannot render without the scale
list. No behavior change on the happy path.
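
A minimal sketch of the rethrow, assuming the listing is injected as a function. `listScalesOrExplain` and its `list` parameter are hypothetical; the real call site and message wording may differ.

```typescript
async function listScalesOrExplain(
  ocdbtUrl: string,
  list: (url: string) => Promise<string[]>,
): Promise<string[]> {
  try {
    return await list(ocdbtUrl);
  } catch (error) {
    const detail = error instanceof Error ? error.message : String(error);
    // Keep the original error reachable via `cause` for debugging.
    throw Object.assign(
      new Error(
        `Failed to list OCDBT scales at ${ocdbtUrl}: ${detail}. ` +
          "The graphene layer cannot render without the scale list.",
      ),
      { cause: error },
    );
  }
}
```

On success the result passes through unchanged, so only the failure path gains the extra context.
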
@akhileshh akhileshh requested a review from chrisj April 23, 2026 22:01
@github-actions github-actions Bot temporarily deployed to spelunker-ocdbt April 24, 2026 16:34 Destroyed
@github-actions github-actions Bot temporarily deployed to spelunker-ocdbt April 24, 2026 16:37 Destroyed
…outing context

Wraps each kvstack read/stat in a bounded retry loop so that
transient failures don't get latched into the OCDBT metadata
caches (`asyncMemoizeWithProgress` keeps its results for the
page lifetime). The retry handles HttpError status 0
(network/CORS) and 502; 429/503/504 are already retried inside
fetchOk, so they are not covered a second time.

Backoff reuses the existing pickDelay helper from
util/http_request.ts (jittered exponential, no new magic numbers).
Sleeps are abort-aware so navigating away cancels them cleanly.

After retries are exhausted (or on a non-retryable error), wraps
with a message naming the matched layer (base / exact:KEY /
prefix:PREFIX) and the backing URL, with the original error in
`cause`. Makes "the fork manifest 404'd" obvious instead of "some
random GCS read failed".
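
The retry-then-wrap flow can be sketched as below. Everything here is an assumption made for illustration: `readWithRetry`, `MatchedLayer`, and `backoffDelayMs` are hypothetical (the latter stands in for `pickDelay`, whose signature is not shown in this PR text), the attempt count and delay constants are placeholders, and abort handling is omitted by injecting `sleep`.

```typescript
interface MatchedLayer {
  kind: "base" | "exact" | "prefix";
  match?: string; // the exact key or prefix, absent for base layers
  url: string; // backing store URL
}

function describeLayer(layer: MatchedLayer): string {
  return layer.kind === "base" ? "base" : `${layer.kind}:${layer.match ?? ""}`;
}

// Retry only status 0 (network/CORS) and 502, as described above.
function isRetryable(error: unknown): boolean {
  const status = (error as { status?: unknown } | null)?.status;
  return status === 0 || status === 502;
}

// Placeholder jittered exponential backoff; constants are illustrative.
function backoffDelayMs(attempt: number): number {
  const ceiling = Math.min(100 * Math.pow(2, attempt), 5000);
  return ceiling * (0.5 + Math.random() * 0.5);
}

async function readWithRetry<T>(
  layer: MatchedLayer,
  attemptRead: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 4,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; ++attempt) {
    try {
      return await attemptRead();
    } catch (error) {
      lastError = error;
      if (!isRetryable(error) || attempt === maxAttempts - 1) break;
      await sleep(backoffDelayMs(attempt));
    }
  }
  const detail =
    lastError instanceof Error ? lastError.message : String(lastError);
  // Name the matched layer and backing URL; keep the original in `cause`.
  throw Object.assign(
    new Error(`kvstack layer ${describeLayer(layer)} (${layer.url}): ${detail}`),
    { cause: lastError },
  );
}
```
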
@github-actions github-actions Bot temporarily deployed to spelunker-ocdbt April 30, 2026 15:08 Destroyed
