Commit 0a0bbda

(chore): clearer handling of consolidated metadata (#2015)
1 parent 7821a27 commit 0a0bbda

6 files changed: +51 -6 lines changed

docs/release-notes/0.12.0rc3.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 (v0.12.0rc3)=
 ### 0.12.0rc3 {small}`2025-05-20`
 
-### Bug fixes
+#### Bug fixes
 
 - Update zarr v3 bound to >3.0.8 to prevent corrupted data {issue}`zarr-developers/zarr-python#3061` {user}`ilan-gold` ({pr}`1993`)

docs/release-notes/0.12.0rc4.md

Lines changed: 4 additions & 0 deletions
@@ -13,3 +13,7 @@
 ### Performance
 
 - Improve {func}`~anndata.experimental.read_elem_lazy` performance for `h5ad` files by not caching `indptr`. {user}`ilan-gold` ({pr}`2005`)
+
+#### Development
+
+- Bound {mod}`zarr` to `<3.1` until {pr}`1995` is merged to handle the new data type structure. {user}`ilan-gold` ({pr}`2013`)

docs/release-notes/2013.development.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/tutorials/zarr-v3.md

Lines changed: 17 additions & 0 deletions
@@ -4,6 +4,23 @@
 Users should notice a significant performance improvement, especially for cloud data, but also likely for local data as well.
 Here is a quick guide on some of our learnings so far:
 
+## Consolidated Metadata
+
+All `zarr` stores are now consolidated by default when written via {func}`anndata.io.write_zarr` or {meth}`anndata.AnnData.write_zarr`. For more information on this topic, please see {ref}`the zarr docs <zarr:user-guide-consolidated-metadata>`. Practically, this change means that once a store has been written, it should be treated as immutable **unless you remove the consolidated metadata and/or rewrite it after the mutating operation**. This matters if you wish to use `anndata.io.write_elem` to add a column to `obs`, a `layer`, etc. to an existing store. To mutate an existing store on disk, you may do:
+
+```python
+g = zarr.open_group(orig_path, mode="a", use_consolidated=False)
+ad.io.write_elem(
+    g,
+    "obs",
+    obs,
+    dataset_kwargs=dict(chunks=(250,)),
+)
+zarr.consolidate_metadata(g.store)
+```
+
+In this example, the store was opened unconsolidated (writing to it through a consolidated handle would raise a `ValueError`), edited, and then reconsolidated. Alternatively, one could simply delete the file containing the consolidated metadata at the root, `.zmetadata`, before editing.
+
 ## Remote data
 
 We now provide the {func}`anndata.experimental.read_lazy` feature for reading as much of the {class}`~anndata.AnnData` object as lazily as possible, using `dask` and {mod}`xarray`.
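
The alternative mentioned in the tutorial text, deleting the consolidated metadata file before mutating a store, can be sketched with the standard library alone. This is a minimal sketch, not part of the commit: it assumes a local directory store whose consolidated metadata lives in the root `.zmetadata` file named above, and the helper name `drop_consolidated_metadata` is hypothetical (it does not exist in `anndata` or `zarr`).

```python
from __future__ import annotations

from pathlib import Path


def drop_consolidated_metadata(store_path: str | Path) -> bool:
    """Remove the root `.zmetadata` file of a local directory store, if present.

    Returns True if a file was removed and False if there was nothing to remove.
    Assumption: the store is a plain directory and keeps its consolidated
    metadata in `.zmetadata` at the root, as the tutorial text describes.
    """
    meta_file = Path(store_path) / ".zmetadata"
    if meta_file.is_file():
        meta_file.unlink()
        return True
    return False
```

After the file is removed, the store no longer advertises consolidated metadata, so it can be edited and, if desired, reconsolidated with `zarr.consolidate_metadata` as in the tutorial example.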

src/anndata/_io/specs/registry.py

Lines changed: 15 additions & 4 deletions
@@ -360,11 +360,22 @@ def write_elem(
         dest_type = type(store)
 
         # Normalize k to absolute path
-        if (isinstance(store, ZarrGroup) and is_zarr_v2()) or (
-            isinstance(store, h5py.Group) and not PurePosixPath(k).is_absolute()
-        ):
+        if (
+            is_zarr_v2_store := (
+                (is_zarr_store := isinstance(store, ZarrGroup)) and is_zarr_v2()
+            )
+        ) or (isinstance(store, h5py.Group) and not PurePosixPath(k).is_absolute()):
             k = str(PurePosixPath(store.name) / k)
-
+        is_consolidated = False
+        if is_zarr_v2_store:
+            from zarr.storage import ConsolidatedMetadataStore
+
+            is_consolidated = isinstance(store.store, ConsolidatedMetadataStore)
+        elif is_zarr_store:
+            is_consolidated = store.metadata.consolidated_metadata is not None
+        if is_consolidated:
+            msg = "Cannot overwrite/edit a store with consolidated metadata"
+            raise ValueError(msg)
         if k == "/":
             if isinstance(store, ZarrGroup) and not is_zarr_v2():
                 from zarr.core.sync import sync
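
The two ideas in this hunk, normalizing `k` with `PurePosixPath` and refusing to write when consolidated metadata is present, can be illustrated without `zarr` or `h5py`. The following is a simplified sketch under stated assumptions: `FakeMetadata`, `FakeGroup`, `normalize_key`, and `check_writable` are hypothetical stand-ins invented for illustration, and only the non-v2 branch of the check (`metadata.consolidated_metadata is not None`) is modeled.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class FakeMetadata:
    """Stand-in for a group's metadata object (hypothetical, not zarr's class)."""

    consolidated_metadata: dict | None = None


@dataclass
class FakeGroup:
    """Stand-in for a store group exposing `name` and `metadata` (hypothetical)."""

    name: str
    metadata: FakeMetadata


def normalize_key(group_name: str, k: str) -> str:
    """Join `k` onto the group's name; an absolute `k` replaces the prefix."""
    return str(PurePosixPath(group_name) / k)


def check_writable(group: FakeGroup) -> None:
    """Raise ValueError if the group carries consolidated metadata."""
    if group.metadata.consolidated_metadata is not None:
        msg = "Cannot overwrite/edit a store with consolidated metadata"
        raise ValueError(msg)
```

With these stand-ins, `normalize_key("/obs", "foo")` gives `"/obs/foo"`, while an absolute key such as `"/foo"` is returned unchanged because `PurePosixPath` discards the left operand when the right one is absolute. `check_writable` passes silently for `FakeGroup("/", FakeMetadata())` and raises for a group whose `consolidated_metadata` is set.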

tests/test_readwrite.py

Lines changed: 14 additions & 0 deletions
@@ -14,6 +14,7 @@
 import pandas as pd
 import pytest
 import zarr
+import zarr.convenience
 from numba.core.errors import NumbaDeprecationWarning
 from scipy.sparse import csc_array, csc_matrix, csr_array, csr_matrix
 
@@ -966,3 +967,16 @@ def test_read_lazy_import_error(func, tmp_path):
             tmp_path if func is ad.experimental.read_lazy else tmp_path / "obs"
         )
     )
+
+
+def test_write_elem_consolidated(tmp_path: Path):
+    ad.AnnData(np.ones((10, 10))).write_zarr(tmp_path)
+    g = (
+        zarr.convenience.open_consolidated(tmp_path)
+        if is_zarr_v2()
+        else zarr.open(tmp_path)
+    )
+    with pytest.raises(
+        ValueError, match="Cannot overwrite/edit a store with consolidated metadata"
+    ):
+        ad.io.write_elem(g["obs"], "foo", np.arange(10))
