From 5f8c6fb9295be77c57e995a07a5a419e1e3200fe Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Fri, 22 Aug 2025 10:12:37 +0200 Subject: [PATCH 1/9] refactor terminology and structure in documentation --- .../clause_4_terms_and_definitions.adoc | 11 ++-- .../sections/clause_7_unified_data_model.adoc | 58 +++++++++---------- .../sections/clause_9_zarr_encoding_core.adoc | 7 +-- .../clause_9_zarr_encoding_overviews.adoc | 17 +++--- 4 files changed, 45 insertions(+), 48 deletions(-) diff --git a/standard/template/sections/clause_4_terms_and_definitions.adoc b/standard/template/sections/clause_4_terms_and_definitions.adoc index 007f320..fa81a97 100644 --- a/standard/template/sections/clause_4_terms_and_definitions.adoc +++ b/standard/template/sections/clause_4_terms_and_definitions.adoc @@ -2,6 +2,9 @@ === Terms and definitions +The GeoZarr specification inherits https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html#concepts-and-terminology[concepts and terminology from the Zarr core specification]. +The following terms add GeoZarr-specific meaning to the existing Zarr terminology. + ==== array A multidimensional, regularly spaced collection of values (e.g., raster data or gridded measurements), typically indexed by dimensions such as time, latitude, longitude, or spectral band. @@ -22,17 +25,17 @@ An array containing the primary geospatial or scientific measurements of interes An index axis along which arrays are organised. Dimensions provide a naming and ordering scheme for accessing data in multidimensional arrays (e.g., `time`, `x`, `y`, `band`). -==== group +==== dataset -A container for datasets, variables, dimensions, and metadata in Zarr. Groups may be nested to represent a logical hierarchy (e.g., for resolutions or collections). +A group that contains one or more data variables along with their associated coordinate variables, having a consistent relationship between these components. 
A dataset represents a coherent set of related data arrays and follows the unified data model. ==== metadata Structured information describing the content, context, and semantics of datasets, variables, and attributes. GeoZarr metadata includes CF attributes, geotransform definitions, and links to STAC metadata where applicable. -==== multiscale dataset +==== multiscale group -A dataset that includes multiple representations of the same data variable at varying spatial resolutions. Each resolution level is associated with a tile matrix from an OGC Tile Matrix Set. +A group that contains 2 or more child groups representing the same data at different resolutions, where each child group is a <>. The multiscale group includes metadata describing the relationship between resolution levels. ==== tile matrix set diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index 8af7598..64c073a 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -87,11 +87,11 @@ To enable discovery of resources within the hierarchical structure of the data m A STAC extension consists of embedding or referencing STAC Collection and Item metadata within the data model: -* Each dataset resource MAY reference a corresponding STAC `Collection` or `Item` using an identifier or embedded object. +* Each store resource MAY reference a corresponding STAC `Collection` or `Item` using an identifier or embedded object. * STAC properties such as `datetime`, `bbox`, and `eo:bands` MAY be included in the metadata to enable spatial, temporal, and spectral filtering. * The structure is compatible with external STAC APIs and metadata harvesting systems. -STAC integration is non-intrusive and modular. 
It does not impose changes on the internal organisation of datasets and MAY be adopted incrementally by implementations requiring catalogue-based discovery capabilities. +STAC integration is non-intrusive and modular. It does not impose changes on the internal organisation of the store and MAY be adopted incrementally by implementations requiring catalogue-based discovery capabilities. ==== Modularity and Interoperability @@ -101,22 +101,22 @@ Each extension point is specified independently. Implementations may advertise s === Unified Model Structure -This clause defines the structural organisation of datasets conforming to the unified data model (UDM). It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. +This clause defines the structural organisation of stores conforming to the unified data model (UDM). It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. The model represents datasets as abstract compositions of dimensions, coordinate variables, data variables, and associated metadata. This abstraction ensures that applications and services can reason about the content and semantics of a dataset without reliance on storage layout or specific serialisation. -==== Dataset Structure +==== Store Structure -A dataset conforming to the Unified Data Model (UDM) is structured as a hierarchy rooted at a top-level dataset entity. This design enables modularity and facilitates the representation of complex, multi-resolution, or thematically partitioned data collections. 
+A store conforming to the Unified Data Model (UDM) is structured as a hierarchy rooted at a top-level group. This design enables modularity and facilitates the representation of complex, multi-resolution, or thematically partitioned data collections. -Each dataset node comprises the following core components, aligned with the Unidata Common Data Model (CDM) and Climate and Forecast (CF) Conventions: +Each <> comprises the following core components, aligned with the Unidata Common Data Model (CDM) and Climate and Forecast (CF) Conventions: - **Dimensions** – Named, integer-valued axes defining the extent of data variables. Examples include `time`, `x`, `y`, and `band`. - **Coordinate Variables** – Arrays that supply coordinate values along dimensions, providing spatial, temporal, or contextual referencing. These may be scalar or higher-dimensional, depending on the referencing scheme. - **Data Variables** – Multidimensional arrays representing physical measurements or derived products. Defined over one or more dimensions, these variables are associated with coordinate variables and annotated with metadata. - **Attributes** – Key-value pairs attached to variables or dataset components. Attributes convey semantic information such as units, standard names, and geospatial metadata. -The hierarchy is implemented through **groups**, which function as containers for variables, dimensions, and metadata. Groups may define local context while inheriting attributes from parent nodes. This supports the logical subdivision of datasets by theme, resolution, or processing stage, and enhances the clarity and reusability of complex geospatial structures. +A Zarr hierarchy is a tree structure, where each node in the tree is either a group or an array. Group nodes may have children but array nodes may not. This supports the logical subdivision by theme, resolution, or processing stage, and enhances the clarity and reusability of complex geospatial structures. 
The diagram below represents the structural layer of the unified data model, derived from the Unidata Common Data Model, which serves as the foundational framework for supporting all overlaying model layer. @@ -129,7 +129,7 @@ The diagram below represents the structural layer of the unified data model, der .... @startuml CDM_DAL_Object_Model -class Dataset { +class Store { + String location + open() + close() @@ -137,10 +137,9 @@ class Dataset { class Group { + String name - + List subgroups - + List variables - + List dimensions - + List attributes +} + +class Dataset { } class Dimension { @@ -152,9 +151,6 @@ class Dimension { class Variable { + String name - + DataType dataType - + List shape - + List attributes + read() } @@ -169,19 +165,20 @@ class Attribute { + List values } -Dataset --> Group : rootGroup -Group --> Group : contains > -Group --> Variable : contains > -Group --> Dimension : defines > -Group --> Attribute : has > -Variable --> Dimension : uses > -Variable --> DataType : has > -Variable --> Attribute : has > +Store "1" --> "*" Group : rootGroup +Group "1" --> "*" Group : contains +Dataset -up-|> Group +Dataset --> "*" Variable : contains +Dataset --> "*" Dimension : defines +Group --> "*" Attribute : has +Variable --> "*" Dimension : uses +Variable --> "1" DataType : has +Variable --> "*" Attribute : has @enduml .... //endif::never-shown[] -Note that, conceptually, node within this hierarchy might be treated as a self-contained dataset. +Note that, conceptually, any node within this hierarchy may be treated as a self-contained store. ==== Coordinate Referencing @@ -196,7 +193,7 @@ The model accommodates both standard CF-compatible definitions and extended refe Metadata may be declared at various levels within the model structure: -- **Global Metadata** – Attributes describing the dataset as a whole, including elements such as `title`, `summary`, and `license`. 
+- **Global Metadata** – Attributes describing the store as a whole, including elements such as `title`, `summary`, and `license`. - **Variable Metadata** – Attributes associated with individual data or coordinate variables, conveying descriptive or semantic information. - **Extension Metadata** – Structured metadata linked to optional model extensions (e.g., multiscale tiling, catalogue references, geotransform properties). @@ -218,15 +215,15 @@ Overviews enable: ===== Conceptual Structure -An *Overviews* construct is defined as a *hierarchical set of multiscale representations* of one or more data variables. It comprises the following components: +A <> contains child groups representing the data at different resolutions, where each child group is a <> following the unified data model. It comprises the following components: [horizontal] -*Base Variable*:: The original, highest-resolution variable to which the overview hierarchy is anchored. It is defined using the standard `DataVariable` structure in the model. -*Overview Levels*:: A sequence of variables representing the same logical quantity as the base variable, but sampled at coarser spatial resolutions. +*Base Dataset*:: The original, highest-resolution dataset to which the multiscale hierarchy is anchored. +*Zoom Level Datasets*:: A sequence of datasets representing the same data as the base dataset, but sampled at coarser spatial resolutions. *Zoom Level Identifier*:: A unique identifier associated with each level, ordered from finest (e.g. `"0"`) to coarsest resolution (e.g. `"N"`). *Tile Grid Definition*:: A mapping that associates each zoom level with a spatial tiling layout, defined in alignment with a `TileMatrixSet`. -*Spatial Alignment*:: Each overview variable MUST be spatially aligned with the base variable using a consistent coordinate reference system and compatible axis orientation. 
-*Resampling Method*:: A declared method indicating the technique used to derive coarser levels from the base variable (e.g. `nearest`, `average`, `cubic`). +*Spatial Alignment*:: Each zoom-level dataset MUST be spatially aligned with the base dataset using a consistent coordinate reference system and compatible axis orientation. +*Resampling Method*:: A declared method indicating the technique used to derive coarser levels from the base dataset (e.g. `nearest`, `average`, `cubic`). ===== Model Components @@ -351,4 +348,3 @@ The unified data model facilitates interoperability with tools and libraries acr - *Cloud-native infrastructure*: support for parallel access, chunked storage, and hierarchical grouping compatible with object storage. Tooling support is expected to grow via standard-conformant implementations, easing adoption across domains and infrastructures. - diff --git a/standard/template/sections/clause_9_zarr_encoding_core.adoc b/standard/template/sections/clause_9_zarr_encoding_core.adoc index a2d6a2e..eedb689 100644 --- a/standard/template/sections/clause_9_zarr_encoding_core.adoc +++ b/standard/template/sections/clause_9_zarr_encoding_core.adoc @@ -1,7 +1,7 @@ === Hierarchical Structure -A dataset conforming to the unified data model is represented as a hierarchical structure of groups, variables (arrays), dimensions, and metadata. The dataset is rooted in a *top-level group*, which may contain: +A store conforming to the unified data model is structured as a hierarchy of groups, variables (arrays), dimensions, and metadata. Following Zarr conventions, this hierarchy is rooted in a group, which may contain: - Arrays representing coordinate or data variables - Child groups for modular organisation, including logical sub-collections or resolution levels @@ -14,7 +14,7 @@ Each group adheres to a consistent structure, allowing recursive composition. 
Th |=== |Model Element |Zarr v2 Encoding |Zarr v3 Encoding -|Root Dataset | Directory with `.zgroup` and `.zattrs` | Directory with `zarr.json`, with `node_type: group` +|Root Group | Directory with `.zgroup` and `.zattrs` | Directory with `zarr.json`, with `node_type: group` |Child Group | Subdirectory with `.zgroup` and `.zattrs` | Subdirectory with `zarr.json`, with `node_type: group` @@ -115,7 +115,7 @@ Example: === Global Metadata -Metadata associated with the dataset as a whole is stored at the root group level. +Metadata associated with the store is stored at the root group level. [cols="1,2,2"] @@ -157,4 +157,3 @@ In all cases: - Attribute names are case-sensitive and encoded as UTF-8 strings - Values shall conform to JSON-compatible types (string, number, boolean, array) - diff --git a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc index b20092e..abf6832 100644 --- a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc +++ b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc @@ -1,30 +1,30 @@ === Encoding of Multiscale Overviews in Zarr -This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr-based datasets conforming to the unified data model. The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. +This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr stores conforming to the unified data model. The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. -Multiscale datasets are composed of a set of Zarr groups representing multiple zoom levels. Each level stores coarser-resolution resampled versions of the original data variables. 
+A multiscale group contains child groups, where each child group is a <> representing a zoom level that stores a coarser-resolution resampled version of the original data variables. ==== Hierarchical Layout -Each zoom level SHALL be represented as a Zarr group, identified by the Tile Matrix identifier (e.g., `"0"`, `"1"`, `"2"`). These groups SHALL be organised hierarchically under a common multiscale root group. Each zoom-level group SHALL contain the complete set of variables (Zarr arrays) corresponding to that resolution. +Each zoom level SHALL be represented as a child group, identified by the Tile Matrix identifier (e.g., `"0"`, `"1"`, `"2"`). These child groups SHALL be organized hierarchically under a common multiscale group and each SHALL be a <> containing the complete set of variables (arrays) corresponding to that resolution. All zoom-level datasets MUST maintain consistent structure. [cols="1,2,2"] |=== |Structure |Zarr v2 |Zarr v3 -|Zoom level groups | Subdirectories with `.zgroup` and `.zattrs` | Subdirectories with `zarr.json`, `node_type: group` +|Zoom level datasets | Subdirectories with `.zgroup` and `.zattrs` | Subdirectories with `zarr.json`, `node_type: group` -|Variables at each level | Zarr arrays (`.zarray`, `.zattrs`) in each group | Zarr arrays (`zarr.json`, `node_type: array`) in each group +|Variables at each level | Arrays (`.zarray`, `.zattrs`) in each dataset | Arrays (`zarr.json`, `node_type: array`) in each dataset -|Global metadata | `multiscales` defined in parent `.zattrs` | `multiscales` defined in parent group `zarr.json` under `attributes` +|Multiscale metadata | `multiscales` defined in multiscale group `.zattrs` | `multiscales` defined in multiscale group `zarr.json` under `attributes` |=== -Each multiscale group MUST define chunking (tiling) along the spatial dimensions (`X`, `Y`, or `lon`, `lat`). Recommended chunk sizes are 256×256 or 512×512. 
+Each zoom-level dataset MUST define chunking (tiling) along the spatial dimensions (`X`, `Y`, or `lon`, `lat`). Recommended chunk sizes are 256×256 or 512×512. ==== Metadata Encoding -Multiscale metadata SHALL be defined using a `multiscales` attribute located in the parent group of the zoom levels. This attribute SHALL be a JSON object with the following members: +Multiscale metadata SHALL be defined using a `multiscales` attribute located in the multiscale group. This attribute SHALL be a JSON object with the following members: - `tile_matrix_set` – Identifier, URI, or inline JSON object compliant with OGC TileMatrixSet v2 - `resampling_method` – One of the standard string values (e.g., `"nearest"`, `"average"`) @@ -98,4 +98,3 @@ The `resampling_method` MUST indicate the method used for downsampling across zo `nearest`, `average`, `bilinear`, `cubic`, `cubic_spline`, `lanczos`, `mode`, `max`, `min`, `med`, `sum`, `q1`, `q3`, `rms`, `gauss` The same method MUST apply across all levels. - From bb05f3fe597083de3bb1d7caf7e4f65a5ac67808 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Tue, 2 Sep 2025 23:08:16 +0200 Subject: [PATCH 2/9] Enhance documentation clarity and detail in the GeoZarr Unified Data Model, addressing semantic constructs and use cases for geospatial data workflows. 
--- .../sections/clause_0_front_material.adoc | 8 +++---- .../template/sections/clause_1_scope.adoc | 24 +++++++++++++++++-- .../sections/clause_7_unified_data_model.adoc | 2 +- 3 files changed, 27 insertions(+), 7 deletions(-) diff --git a/standard/template/sections/clause_0_front_material.adoc b/standard/template/sections/clause_0_front_material.adoc index b9f7975..a9f2ba6 100644 --- a/standard/template/sections/clause_0_front_material.adoc +++ b/standard/template/sections/clause_0_front_material.adoc @@ -11,11 +11,11 @@ This Standard has been developed in collaboration with contributors from Earth o [abstract] == Abstract -The GeoZarr Unified Data Model and Encoding Standard specifies a conceptual and implementation framework for representing multidimensional, geospatial datasets using the Zarr format. This Standard builds upon the Unidata Common Data Model (CDM) and the Climate and Forecast (CF) Conventions, and introduces interoperable constructs for tiling, georeferencing, and metadata integration. +Zarr provides efficient chunked storage for n-dimensional arrays but does not provide the semantic constructs required for geospatial and scientific data workflows. The GeoZarr Unified Data Model and Encoding Standard addresses this gap by adding essential concepts—coordinate systems, grid mappings, temporal semantics, and CF-compliant metadata—on top of Zarr's storage foundation. -The model defines core elements—dimensions, coordinate variables, data variables, attributes—and optional extensions for multi-resolution overviews, affine geotransforms, and STAC metadata. Encoding guidance is provided for Zarr Version 2 and Zarr Version 3, including chunking, group hierarchy, and metadata conventions. 
+The Standard builds upon proven concepts from the Common Data Model (CDM) and Climate and Forecast (CF) Conventions to define core elements—dimensions, coordinate variables, data variables, and attributes—along with extensions for multi-resolution overviews, affine geotransforms, and STAC metadata. This layered approach ensures applications can work with semantically rich geospatial data while leveraging Zarr's cloud-optimized storage capabilities. -GeoZarr aims to bridge scientific and geospatial communities by enabling round-trip transformations with formats such as NetCDF and GeoTIFF, and supporting compatibility with tools in the scientific Python and geospatial ecosystems. This Standard enables scalable, standards-compliant, and semantically rich data structures for cloud-native Earth observation applications. +By providing a standardized framework for geospatial semantics, GeoZarr enables scientific and geospatial applications to fully utilize cloud-native storage architectures while maintaining the rich metadata and coordinate referencing required for Earth observation workflows. The result is a modern, scalable approach to storing and accessing geospatial data that meets the needs of both data providers and consumers. == Submitters @@ -29,4 +29,4 @@ All questions regarding this submission should be directed to the editor or the |Brianna Pagán _(editor)_ | DevSeed |Ryan Abernathey| EarthMover | TBD | TBD -|=== \ No newline at end of file +|=== diff --git a/standard/template/sections/clause_1_scope.adoc b/standard/template/sections/clause_1_scope.adoc index 93a5d91..1275ba4 100644 --- a/standard/template/sections/clause_1_scope.adoc +++ b/standard/template/sections/clause_1_scope.adoc @@ -2,6 +2,26 @@ The GeoZarr Unified Data Model and Encoding Standard defines a conceptual and implementation framework for representing and encoding geospatial and scientific datasets using the Zarr format. 
The scope of this Standard includes the definition of a format-agnostic unified data model, the specification of its encoding into Zarr Version 2 and Version 3, and the establishment of extension points to support interoperability with external metadata and tiling standards. -This Standard addresses the needs of Earth observation, environmental monitoring, and geospatial analysis applications that require efficient, scalable access to multidimensional datasets. It enables the harmonisation of existing data models, such as the Unidata Common Data Model (CDM) and the Climate and Forecast (CF) Conventions, with operational encoding formats suitable for cloud-native storage and analysis. +These capabilities are necessary because Zarr does not provide semantic constructs for geospatial data interpretation. Applications need to understand not just array shapes and values, but coordinate meanings, projection parameters, and scientific metadata. GeoZarr fills this gap without compromising Zarr's performance characteristics. -Typical use cases include the storage, transformation, discovery, and processing of raster and gridded data, data cubes with temporal or vertical dimensions, and catalogue-enabled datasets integrated with metadata standards such as STAC and OGC Tile Matrix Sets. +=== Why GeoZarr Exists + +Zarr, by design, is a low-level container for storing n-dimensional arrays and metadata. 
While this simplicity is a strength for performance and interoperability, it means Zarr lacks higher-level concepts that geospatial applications require: + +* *Coordinate Systems:* No native way to associate spatial or temporal meaning with array dimensions +* *Grid Mappings:* No standard mechanism for projection and coordinate reference system metadata +* *Semantic Metadata:* No conventions for units, standard names, or scientific attributes +* *Variable Relationships:* No formal distinction between coordinate variables and data variables + +These concepts are essential for geospatial workflows but must be layered on top of Zarr's array storage. GeoZarr provides this semantic layer through proven standards (Common Data Model and CF conventions) while preserving Zarr's cloud-native advantages. + +=== Use Cases and Applications + +This Standard addresses the needs of Earth observation, environmental monitoring, and geospatial analysis applications that require efficient, scalable access to multidimensional datasets. It enables the harmonisation of existing data models with operational encoding formats suitable for cloud-native storage and analysis. 
+ +Typical use cases include: +* Storage and processing of raster and gridded data +* Management of data cubes with temporal or vertical dimensions +* Integration with catalogue systems through standardized metadata +* Multi-resolution tiling for efficient visualization and analysis +* Cloud-optimized access to large geospatial datasets diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index 64c073a..ff740ae 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -21,7 +21,7 @@ This clause specifies the logical composition of the unified model, the external === Foundational Model and Standards Reuse -The unified data model described in this Standard is derived from established community specifications to maximise interoperability and to enable the reuse of mature tools and practices. The model is grounded in the Unidata Common Data Model (CDM) and the Climate and Forecast (CF) Conventions, which together provide a robust framework for representing scientific and geospatial datasets. +GeoZarr adopts established data model concepts because Zarr itself provides only array storage without semantic interpretation. The Unidata Common Data Model (CDM) provides the conceptual framework for understanding dimensions, variables, and attributes, while CF Conventions provide standardized metadata semantics. This reuse ensures compatibility with existing scientific software while avoiding reinvention of proven concepts. 
==== Common Data Model (CDM) From 561edd94e1e18d0b93268d0c06efe3cf4deb84c4 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Wed, 3 Sep 2025 14:33:07 +0200 Subject: [PATCH 3/9] Refine descriptions of multiscale groups in documentation for clarity and completeness --- standard/template/sections/clause_4_terms_and_definitions.adoc | 2 +- .../template/sections/clause_9_zarr_encoding_overviews.adoc | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/standard/template/sections/clause_4_terms_and_definitions.adoc b/standard/template/sections/clause_4_terms_and_definitions.adoc index fa81a97..2ba8d1e 100644 --- a/standard/template/sections/clause_4_terms_and_definitions.adoc +++ b/standard/template/sections/clause_4_terms_and_definitions.adoc @@ -35,7 +35,7 @@ Structured information describing the content, context, and semantics of dataset ==== multiscale group -A group that contains 2 or more child groups representing the same data at different resolutions, where each child group is a <>. The multiscale group includes metadata describing the relationship between resolution levels. +A group that contains child groups representing the same data at different resolutions, where each child group is a <>. The multiscale group includes metadata describing the relationship between resolution levels. A multiscale group can be initialized with a single dataset and expanded with additional resolution levels over time. ==== tile matrix set diff --git a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc index abf6832..bd3a69f 100644 --- a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc +++ b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc @@ -3,7 +3,7 @@ This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr stores conforming to the unified data model. 
The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. -A multiscale group contains child groups, where each child group is a <> representing a zoom level that stores a coarser-resolution resampled version of the original data variables. +A <> contains one or more child groups, where each child group is a <> representing a zoom level of the data. Additional resolution levels can be added over time, with each new level storing a coarser-resolution resampled version of the original data variables. ==== Hierarchical Layout From f99d7427f4e3a64eafbbe4e0684f23709eb04e72 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Wed, 3 Sep 2025 20:15:38 +0200 Subject: [PATCH 4/9] Capitalize "Unified Data Model" for consistency across documentation sections --- .../clause_4_terms_and_definitions.adoc | 6 ++-- .../sections/clause_7_unified_data_model.adoc | 28 +++++++++---------- .../clause_9_zarr_encoding_overviews.adoc | 2 +- 3 files changed, 18 insertions(+), 18 deletions(-) diff --git a/standard/template/sections/clause_4_terms_and_definitions.adoc b/standard/template/sections/clause_4_terms_and_definitions.adoc index 2ba8d1e..4aae161 100644 --- a/standard/template/sections/clause_4_terms_and_definitions.adoc +++ b/standard/template/sections/clause_4_terms_and_definitions.adoc @@ -27,7 +27,7 @@ An index axis along which arrays are organised. Dimensions provide a naming and ==== dataset -A group that contains one or more data variables along with their associated coordinate variables, having a consistent relationship between these components. A dataset represents a coherent set of related data arrays and follows the unified data model. +A group that contains one or more data variables along with their associated coordinate variables, having a consistent relationship between these components. A dataset represents a coherent set of related data arrays and follows the Unified Data Model. 
 ==== metadata @@ -45,9 +45,9 @@ A spatial tiling scheme defined by a hierarchy of zoom levels and consistent gri An affine transformation used to convert between grid coordinates and geospatial coordinates, typically defined using the GDAL GeoTransform convention. -==== unified data model (UDM) +==== Unified Data Model (UDM) -A conceptual model that defines how to structure geospatial data in Zarr using CDM-based constructs, including support for coordinate referencing, metadata integration, and multiscale representations. +A conceptual model that defines how to structure geospatial data in Zarr using CDM-based constructs, including support for coordinate referencing, metadata integration, and multiscale representations. The Unified Data Model provides a standardized framework for expressing spatial relationships, coordinate systems, and scientific metadata. === Abbreviated Terms diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index ff740ae..48cdaa9 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -4,9 +4,9 @@ === Scope and Purpose -This Standard defines a unified data model (UDM) that provides a conceptual framework for representing geospatial and scientific data in Zarr. The purpose of this model is to support standards-based interoperability across Earth observation systems and analytical environments, while preserving compatibility with existing data models and software ecosystems.. +This Standard defines the Unified Data Model (UDM) that provides a conceptual framework for representing geospatial and scientific data in Zarr. The purpose of this model is to support standards-based interoperability across Earth observation systems and analytical environments, while preserving compatibility with existing data models and software ecosystems. 
-The unified data model incorporates and extends the following established specifications and community standards: +The Unified Data Model incorporates and extends the following established specifications and community standards: - **Unidata Common Data Model (CDM)** – Provides the foundational resource structure for scientific datasets, encompassing dimensions, coordinate systems, variables, and associated metadata elements. - **CF (Climate and Forecast) Conventions** – Defines a widely adopted metadata profile for describing spatiotemporal semantics in CDM-based datasets. @@ -15,9 +15,9 @@ The unified data model incorporates and extends the following established specif - **GDAL geotransform metadata**, used to express affine transformations and interpolation characteristics. - **SpatioTemporal Asset Catalog (STAC)** metadata elements for resource discovery and cataloguing (Collection and Item constructs). -The unified model is format-agnostic and describes the abstract structure of resources independently of the physical encoding. It does not redefine the semantics of the CDM or CF conventions, but introduces integration and extension points required to support tiled multiscale data, geospatial referencing, and metadata for discovery. +The Unified Data Model is format-agnostic and describes the abstract structure of resources independently of the physical encoding. It does not redefine the semantics of the CDM or CF conventions, but introduces integration and extension points required to support tiled multiscale data, geospatial referencing, and metadata for discovery. -This clause specifies the logical composition of the unified model, the external standards it leverages, and the conformance points that facilitate harmonised implementation within the GeoZarr framework. 
+This clause specifies the logical composition of the Unified Data Model, the external standards it leverages, and the conformance points that facilitate harmonised implementation within the GeoZarr framework. === Foundational Model and Standards Reuse @@ -25,7 +25,7 @@ GeoZarr adopts established data model concepts because Zarr itself provides only ==== Common Data Model (CDM) -The CDM defines a generalised schema for representing array-based scientific datasets. The following constructs are reused directly within the unified model: +The CDM defines a generalised schema for representing array-based scientific datasets. The following constructs are reused directly within the Unified Data Model: - **Dimensions** – Integer-valued, named axes that define the extents of data variables. - **Coordinate Variables** – Variables that supply coordinate values along dimensions, establishing spatial or temporal context. @@ -33,7 +33,7 @@ The CDM defines a generalised schema for representing array-based scientific dat - **Attributes** – Key-value metadata elements used to describe variables and datasets semantically. - **Groups** – Optional hierarchical containers enabling logical organisation of resources and metadata. -The unified data model adopts these CDM components without modification excluding the user-defined types. Semantic interpretation remains consistent with the original CDM specification. GeoZarr structures are mapped to CDM constructs to ensure compatibility and clarity. +The Unified Data Model adopts these CDM components without modification excluding the user-defined types. Semantic interpretation remains consistent with the original CDM specification. GeoZarr structures are mapped to CDM constructs to ensure compatibility and clarity. 
==== CF Conventions @@ -44,7 +44,7 @@ The CF Conventions specify standardised metadata attributes and practices to des - Physical units - Standard variable naming -The unified data model supports CF-compliant metadata, including attributes such as `standard_name`, `units`, and `grid_mapping`. The unified data model does not prescribe CF compliance but enables it through permissive design. Partial adoption of CF attributes is supported, and non-compliant datasets may selectively adopt CF metadata as needed. +The Unified Data Model supports CF-compliant metadata, including attributes such as `standard_name`, `units`, and `grid_mapping`. The Unified Data Model does not prescribe CF compliance but enables it through permissive design. Partial adoption of CF attributes is supported, and non-compliant datasets may selectively adopt CF metadata as needed. ==== Standards-Based Extensions @@ -58,7 +58,7 @@ These extensions are integrated in a modular fashion and do not alter the core s === Model Extension Points -The unified data model specifies a series of optional, standards-aligned extension points to support functionality beyond the base CDM and CF constructs. These extensions enhance applicability to Earth observation and spatial analysis use cases without imposing additional mandatory requirements. +The Unified Data Model specifies a series of optional, standards-aligned extension points to support functionality beyond the base CDM and CF constructs. These extensions enhance applicability to Earth observation and spatial analysis use cases without imposing additional mandatory requirements. Each extension is defined as an independent module. Implementation of any given extension does not necessitate support for others. @@ -101,7 +101,7 @@ Each extension point is specified independently. Implementations may advertise s === Unified Model Structure -This clause defines the structural organisation of stores conforming to the unified data model (UDM). 
It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. +This clause defines the structural organisation of stores conforming to the Unified Data Model (UDM). It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. The model represents datasets as abstract compositions of dimensions, coordinate variables, data variables, and associated metadata. This abstraction ensures that applications and services can reason about the content and semantics of a dataset without reliance on storage layout or specific serialisation. @@ -118,7 +118,7 @@ Each <> comprises the following core components, aligned A Zarr hierarchy is a tree structure, where each node in the tree is either a group or an array. Group nodes may have children but array nodes may not. This supports the logical subdivision by theme, resolution, or processing stage, and enhances the clarity and reusability of complex geospatial structures. -The diagram below represents the structural layer of the unified data model, derived from the Unidata Common Data Model, which serves as the foundational framework for supporting all overlaying model layer. +The diagram below represents the structural layer of the Unified Data Model, derived from the Unidata Common Data Model, which serves as the foundational framework supporting all overlying model layers.
//image::udm-core.png[] @@ -215,7 +215,7 @@ Overviews enable: ===== Conceptual Structure -A <> contains child groups representing the data at different resolutions, where each child group is a <> following the unified data model. It comprises the following components: +A <> contains child groups representing the data at different resolutions, where each child group is a <> following the Unified Data Model. It comprises the following components: [horizontal] *Base Dataset*:: The original, highest-resolution dataset to which the multiscale hierarchy is anchored. @@ -227,7 +227,7 @@ A <> contains child groups representing ===== Model Components -The *Overviews* construct is represented in the unified data model using the following logical elements: +The *Overviews* construct is represented in the Unified Data Model using the following logical elements: [cols="1,3"] |=== @@ -307,7 +307,7 @@ This extensibility framework supports both minimum-viable use and high-fidelity === Interoperability Considerations -Interoperability is a core objective of the GeoZarr unified data model. The model is designed to bridge diverse Earth observation and scientific data ecosystems by enabling structural and semantic compatibility with established formats and standards, while providing a forward-looking foundation for scalable, cloud-native workflows. +Interoperability is a core objective of the GeoZarr Unified Data Model. The model is designed to bridge diverse Earth observation and scientific data ecosystems by enabling structural and semantic compatibility with established formats and standards, while providing a forward-looking foundation for scalable, cloud-native workflows. This section outlines the principles and mechanisms supporting interoperability across formats, tools, and communities. 
@@ -341,7 +341,7 @@ This approach enables seamless integration into modern data catalogues and platf ==== Tool and Ecosystem Support -The unified data model facilitates interoperability with tools and libraries across the following domains: +The Unified Data Model facilitates interoperability with tools and libraries across the following domains: - *Scientific computing*: NetCDF-based libraries (e.g., xarray, netCDF4), Zarr-compatible clients. - *Geospatial processing*: GDAL, rasterio, QGIS (via Zarr driver extensions or translations). diff --git a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc index bd3a69f..e91920e 100644 --- a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc +++ b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc @@ -1,7 +1,7 @@ === Encoding of Multiscale Overviews in Zarr -This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr stores conforming to the unified data model. The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. +This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr stores conforming to the Unified Data Model. The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. A <> contains one or more child groups, where each child group is a <> representing a zoom level of the data. Additional resolution levels can be added over time, with each new level storing a coarser-resolution resampled version of the original data variables. 
From b8c988b28c3bb842325d003587c8376e020c7c49 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Wed, 3 Sep 2025 20:16:14 +0200 Subject: [PATCH 5/9] Update section title to "Unified Data Model Structure" for consistency in documentation --- standard/template/sections/clause_7_unified_data_model.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index 48cdaa9..0f3cf4b 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -99,7 +99,7 @@ STAC integration is non-intrusive and modular. It does not impose changes on the Each extension point is specified independently. Implementations may advertise support for one or more extensions by declaring conformance to corresponding extension modules. This modularity facilitates incremental adoption, promotes reuse, and enhances interoperability across varied implementation environments. -=== Unified Model Structure +=== Unified Data Model Structure This clause defines the structural organisation of stores conforming to the Unified Data Model (UDM). It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. From 8cac80c42d3a1356323ee04a81b998389f3e37b4 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Thu, 4 Sep 2025 15:24:04 +0200 Subject: [PATCH 6/9] Enhance documentation clarity by defining relationships to Zarr core concepts, refining terminology, and ensuring consistent references to hierarchies and stores across multiple sections. 
--- .../template/sections/clause_1_scope.adoc | 4 ++++ .../clause_4_terms_and_definitions.adoc | 4 ++++ .../sections/clause_7_unified_data_model.adoc | 23 ++++++++++++------- .../sections/clause_9_zarr_encoding_core.adoc | 4 ++-- .../clause_9_zarr_encoding_overviews.adoc | 2 +- 5 files changed, 26 insertions(+), 11 deletions(-) diff --git a/standard/template/sections/clause_1_scope.adoc b/standard/template/sections/clause_1_scope.adoc index 1275ba4..3aa9b72 100644 --- a/standard/template/sections/clause_1_scope.adoc +++ b/standard/template/sections/clause_1_scope.adoc @@ -15,6 +15,10 @@ Zarr, by design, is a low-level container for storing n-dimensional arrays and m These concepts are essential for geospatial workflows but must be layered on top of Zarr's array storage. GeoZarr provides this semantic layer through proven standards (Common Data Model and CF conventions) while preserving Zarr's cloud-native advantages. +=== Relationship to Zarr Core Concepts + +GeoZarr builds upon Zarr's foundational concepts of <> and <>. A Zarr store provides the storage and retrieval interface (e.g., filesystem, cloud object storage), while a hierarchy defines the logical tree structure of groups and arrays within that store. GeoZarr specifies how to organize and structure hierarchies to support geospatial semantics, without modifying the underlying store interface. + === Use Cases and Applications This Standard addresses the needs of Earth observation, environmental monitoring, and geospatial analysis applications that require efficient, scalable access to multidimensional datasets. It enables the harmonisation of existing data models with operational encoding formats suitable for cloud-native storage and analysis. 
diff --git a/standard/template/sections/clause_4_terms_and_definitions.adoc b/standard/template/sections/clause_4_terms_and_definitions.adoc index 4aae161..cee4bbd 100644 --- a/standard/template/sections/clause_4_terms_and_definitions.adoc +++ b/standard/template/sections/clause_4_terms_and_definitions.adoc @@ -37,6 +37,10 @@ Structured information describing the content, context, and semantics of dataset A group that contains child groups representing the same data at different resolutions, where each child group is a <>. The multiscale group includes metadata describing the relationship between resolution levels. A multiscale group can be initialized with a single dataset and expanded with additional resolution levels over time. +==== store + +A system that provides storage and retrieval operations for Zarr hierarchies, as defined in the https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html#stores[Zarr core specification]. A store implements the abstract store interface and can be backed by various storage technologies such as filesystems, cloud object storage, or databases. GeoZarr hierarchies are stored within and accessed through Zarr stores. + ==== tile matrix set A spatial tiling scheme defined by a hierarchy of zoom levels and consistent grid parameters (e.g., scale, CRS). Tile Matrix Sets enable spatial indexing and tiling of gridded data. diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index 0f3cf4b..0afc80f 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -4,7 +4,9 @@ === Scope and Purpose -This Standard defines the Unified Data Model (UDM) that provides a conceptual framework for representing geospatial and scientific data in Zarr. 
The purpose of this model is to support standards-based interoperability across Earth observation systems and analytical environments, while preserving compatibility with existing data models and software ecosystems.. +This Standard defines the Unified Data Model (UDM) that provides a conceptual framework for representing geospatial and scientific data in Zarr. The purpose of this model is to support standards-based interoperability across Earth observation systems and analytical environments, while preserving compatibility with existing data models and software ecosystems. + +The Unified Data Model operates within the Zarr framework, where a <> provides the storage and retrieval interface, and a hierarchy defines the logical organization of groups and arrays within that store. GeoZarr hierarchies are stored in and accessed through Zarr stores, which can be implemented using various storage technologies such as filesystems, cloud object storage, or databases. The Unified Data Model incorporates and extends the following established specifications and community standards: @@ -87,11 +89,11 @@ To enable discovery of resources within the hierarchical structure of the data m A STAC extension consists of embedding or referencing STAC Collection and Item metadata within the data model: -* Each store resource MAY reference a corresponding STAC `Collection` or `Item` using an identifier or embedded object. +* Each hierarchy MAY reference a corresponding STAC `Collection` or `Item` using an identifier or embedded object. * STAC properties such as `datetime`, `bbox`, and `eo:bands` MAY be included in the metadata to enable spatial, temporal, and spectral filtering. * The structure is compatible with external STAC APIs and metadata harvesting systems. -STAC integration is non-intrusive and modular. It does not impose changes on the internal organisation of the store and MAY be adopted incrementally by implementations requiring catalogue-based discovery capabilities. 
+STAC integration is non-intrusive and modular. It does not impose changes on the internal organisation of the hierarchy and MAY be adopted incrementally by implementations requiring catalogue-based discovery capabilities. ==== Modularity and Interoperability @@ -101,13 +103,13 @@ Each extension point is specified independently. Implementations may advertise s === Unified Data Model Structure -This clause defines the structural organisation of stores conforming to the Unified Data Model (UDM). It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. +This clause defines the structural organisation of Zarr hierarchies conforming to the Unified Data Model (UDM). It consolidates the foundational elements and optional extensions into a coherent architecture suitable for Zarr encoding, while remaining format-agnostic. The model establishes a modular and extensible framework that supports structured representation of multidimensional, geospatially-referenced resources. The model represents datasets as abstract compositions of dimensions, coordinate variables, data variables, and associated metadata. This abstraction ensures that applications and services can reason about the content and semantics of a dataset without reliance on storage layout or specific serialisation. -==== Store Structure +==== Hierarchy Structure -A store conforming to the Unified Data Model (UDM) is structured as a hierarchy rooted at a top-level group. This design enables modularity and facilitates the representation of complex, multi-resolution, or thematically partitioned data collections. +A Zarr hierarchy conforming to the Unified Data Model (UDM) is structured as a tree rooted at a top-level group. 
This design enables modularity and facilitates the representation of complex, multi-resolution, or thematically partitioned data collections. Each <> comprises the following core components, aligned with the Unidata Common Data Model (CDM) and Climate and Forecast (CF) Conventions: @@ -135,6 +137,10 @@ class Store { + close() } +class Hierarchy { + + String name +} + class Group { + String name } @@ -165,7 +171,8 @@ class Attribute { + List values } -Store "1" --> "*" Group : rootGroup +Store "1" --> "*" Hierarchy : implements +Hierarchy "1" *-- "1" Group : has root Group "1" --> "*" Group : contains Dataset -up-|> Group Dataset --> "*" Variable : contains @@ -193,7 +200,7 @@ The model accommodates both standard CF-compatible definitions and extended refe Metadata may be declared at various levels within the model structure: -- **Global Metadata** – Attributes describing the store as a whole, including elements such as `title`, `summary`, and `license`. +- **Global Metadata** – Attributes describing the hierarchy as a whole, including elements such as `title`, `summary`, and `license`. - **Variable Metadata** – Attributes associated with individual data or coordinate variables, conveying descriptive or semantic information. - **Extension Metadata** – Structured metadata linked to optional model extensions (e.g., multiscale tiling, catalogue references, geotransform properties). diff --git a/standard/template/sections/clause_9_zarr_encoding_core.adoc b/standard/template/sections/clause_9_zarr_encoding_core.adoc index eedb689..8a1972c 100644 --- a/standard/template/sections/clause_9_zarr_encoding_core.adoc +++ b/standard/template/sections/clause_9_zarr_encoding_core.adoc @@ -1,7 +1,7 @@ === Hierarchical Structure -A store conforming to the unified data model is structured as a hierarchy of groups, variables (arrays), dimensions, and metadata. 
Following Zarr conventions, this hierarchy is rooted in a group, which may contain: +A hierarchy conforming to the Unified Data Model is structured as a tree of groups, variables (arrays), dimensions, and metadata. Following Zarr conventions, this hierarchy is rooted in a group, which may contain: - Arrays representing coordinate or data variables - Child groups for modular organisation, including logical sub-collections or resolution levels @@ -115,7 +115,7 @@ Example: === Global Metadata -Metadata associated with the store is stored at the root group level. +Metadata associated with the hierarchy is stored at the root group level. [cols="1,2,2"] diff --git a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc index e91920e..c367b6d 100644 --- a/standard/template/sections/clause_9_zarr_encoding_overviews.adoc +++ b/standard/template/sections/clause_9_zarr_encoding_overviews.adoc @@ -1,7 +1,7 @@ === Encoding of Multiscale Overviews in Zarr -This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr stores conforming to the Unified Data Model. The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. +This clause specifies how multiscale tiling (also known as overviews or pyramids) is encoded in Zarr hierarchies conforming to the Unified Data Model. The encoding supports both Zarr Version 2 and Version 3 and is aligned with the OGC Two Dimensional Tile Matrix Set Standard. A <> contains one or more child groups, where each child group is a <> representing a zoom level of the data. Additional resolution levels can be added over time, with each new level storing a coarser-resolution resampled version of the original data variables. 
From 08caa63294f58a8b73260d168641acf11d6be272 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Thu, 4 Sep 2025 15:32:10 +0200 Subject: [PATCH 7/9] Refine Unified Data Model description to clarify adaptations for Zarr's type system and ensure compatibility with CDM semantics. --- standard/template/sections/clause_7_unified_data_model.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index 0afc80f..f5937a9 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -35,7 +35,7 @@ The CDM defines a generalised schema for representing array-based scientific dat - **Attributes** – Key-value metadata elements used to describe variables and datasets semantically. - **Groups** – Optional hierarchical containers enabling logical organisation of resources and metadata. -The Unified Data Model adopts these CDM components without modification excluding the user-defined types. Semantic interpretation remains consistent with the original CDM specification. GeoZarr structures are mapped to CDM constructs to ensure compatibility and clarity. +The Unified Data Model adopts these CDM components with adaptations for Zarr's type system. While the conceptual structure remains consistent with the original CDM specification, attribute types are mapped to Zarr's JSON-compatible type system. GeoZarr structures preserve CDM semantics while conforming to Zarr's encoding constraints. 
==== CF Conventions From 4db26fbeda15ec215f2066fb758fb4949cbce134 Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Thu, 4 Sep 2025 16:07:07 +0200 Subject: [PATCH 8/9] Fix relationship notation for Store and Hierarchy in Unified Data Model diagram --- standard/template/sections/clause_7_unified_data_model.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index f5937a9..bb61701 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -171,7 +171,7 @@ class Attribute { + List values } -Store "1" --> "*" Hierarchy : implements +Store "1" ..|> "*" Hierarchy : implements Hierarchy "1" *-- "1" Group : has root Group "1" --> "*" Group : contains Dataset -up-|> Group From 1500a6d4ec1fe87e43eafbeaa73f0263ead4d1ce Mon Sep 17 00:00:00 2001 From: Emmanuel Mathot Date: Thu, 4 Sep 2025 16:25:16 +0200 Subject: [PATCH 9/9] Refine Unified Data Model relationships by correcting notation and enhancing clarity in class diagram --- .../sections/clause_7_unified_data_model.adoc | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/standard/template/sections/clause_7_unified_data_model.adoc b/standard/template/sections/clause_7_unified_data_model.adoc index bb61701..a1317ac 100644 --- a/standard/template/sections/clause_7_unified_data_model.adoc +++ b/standard/template/sections/clause_7_unified_data_model.adoc @@ -171,16 +171,16 @@ class Attribute { + List values } -Store "1" ..|> "*" Hierarchy : implements +Store "1" ..|> Hierarchy : implements Hierarchy "1" *-- "1" Group : has root -Group "1" --> "*" Group : contains -Dataset -up-|> Group +Group --* Group : is part of +Dataset -up-|> Group : is a Dataset --> "*" Variable : contains -Dataset --> "*" Dimension : defines -Group --> "*" Attribute : has -Variable --> "*" Dimension : 
uses -Variable --> "1" DataType : has -Variable --> "*" Attribute : has +Dimension --* "*" Dataset : is shared in +Group *-- "*" Attribute +Dimension --o "*" Variable : defines the shape of +Variable --> "1" DataType +Variable *-- "*" Attribute @enduml .... //endif::never-shown[]
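Reviewer note: for readers who prefer code to UML, the relationships in the last diagram revision (a hierarchy has one root group, groups nest, a dataset is a group that owns variables and shares dimensions among them, and dimensions define a variable's shape) can be sketched in Python roughly as follows. This is an illustrative sketch only, not part of the patch and not an API of Zarr or GeoZarr. The `size` field on `Dimension`, the `name` field on `Attribute`, the string-valued `data_type`, the `shape` helper, and the example names (`r10m`, `b04`) are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str      # assumed field; the diagram excerpt only shows `values`
    values: list


@dataclass
class Dimension:
    name: str
    size: int      # assumed field, used here to derive a variable's shape


@dataclass
class Variable:
    name: str
    data_type: str                                   # stands in for the DataType class
    dimensions: list[Dimension] = field(default_factory=list)
    attributes: list[Attribute] = field(default_factory=list)

    @property
    def shape(self) -> tuple[int, ...]:
        # "Dimension defines the shape of Variable"
        return tuple(d.size for d in self.dimensions)


@dataclass
class Group:
    name: str
    children: list[Group] = field(default_factory=list)       # "Group is part of Group"
    attributes: list[Attribute] = field(default_factory=list)


@dataclass
class Dataset(Group):                                          # "Dataset is a Group"
    variables: list[Variable] = field(default_factory=list)
    dimensions: list[Dimension] = field(default_factory=list)  # shared by its variables


@dataclass
class Hierarchy:
    name: str
    root: Group                                                # "Hierarchy has root Group"


# Hypothetical example: one dataset holding one 2-D variable.
y = Dimension("y", 2)
x = Dimension("x", 3)
band = Variable("b04", "uint16", dimensions=[y, x])
dataset = Dataset("r10m", variables=[band], dimensions=[y, x])
hierarchy = Hierarchy("example", root=Group("/", children=[dataset]))
```

Composition is modelled here with plain list ownership; the store is left out because the diagram treats it as the storage interface rather than as part of the logical tree.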