Amsterdam Schema aims to describe and validate open data published by the City of Amsterdam, in order to make the storing and publishing of different datasets more structured, simpler and better documented.
This repository contains:
- JSON documents that describe the structure and metadata of datasets (i.e. dataset schemas, not to be confused with JSON-schemas);
- JSON documents that describe the structure and metadata of tables (i.e. table schemas, not to be confused with JSON-schemas);
- A JSON-Schema metaschema to validate the dataset and table schemas listed above.
More specifically, metaschemas are JSON-Schemas that can make sure every dataset published by the City of Amsterdam always contains the right metadata and is of the right form.
This is done by running structural and semantic validation.
The structural part is handled by the metaschema defined in this repository. The logic for semantic validation is defined in the schematools repository.
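The split between structural and semantic validation can be sketched as two passes over a dataset document. This is a minimal illustration only: the required keys, the uniqueness rule and all function names are assumptions made for the example, not the actual metaschema or the logic in schematools.

```python
# Hypothetical, heavily simplified stand-in for a metaschema:
# the required keys and the Python type each value must have.
REQUIRED_DATASET_KEYS = {"id": str, "title": str, "tables": list}


def structural_errors(dataset):
    """Check shape only: are the required keys present with the right type?"""
    errors = []
    for key, expected in REQUIRED_DATASET_KEYS.items():
        if key not in dataset:
            errors.append(f"missing required key: {key}")
        elif not isinstance(dataset[key], expected):
            errors.append(f"{key} must be of type {expected.__name__}")
    return errors


def semantic_errors(dataset):
    """Check a rule about meaning rather than shape.

    The rule used here (table ids are unique within a dataset) is an
    illustrative assumption, not necessarily one schematools enforces.
    """
    seen = set()
    errors = []
    for table in dataset.get("tables", []):
        if not isinstance(table, dict):
            continue  # shape problems are outside the scope of this pass
        table_id = table.get("id")
        if table_id in seen:
            errors.append(f"duplicate table id: {table_id}")
        seen.add(table_id)
    return errors


def validate(dataset):
    """Run the structural pass first; only then run the semantic pass."""
    errors = structural_errors(dataset)
    if errors:
        return errors
    return semantic_errors(dataset)
```

Running the structural pass first means the semantic pass can rely on the basic shape of the document already being correct.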
Apart from the technical description, an in-depth textual specification of the Amsterdam Schema can be found at https://schemas.data.amsterdam.nl/docs/ams-schema-spec.html.
The scope of the Amsterdam Schema is deliberately limited so that it can interoperate with as many systems as possible. The results of the analysis behind this choice can be found on the Grootst Gemene Deler ("greatest common divisor") page.
Each instance of Amsterdam Schema consists of:
- Metadata about a single dataset;
- Metadata about each table in this single dataset;
- For each table, a table-schema to describe and validate the data in these tables.
An overview of the current schemas can be found at https://github.com/Amsterdam/amsterdam-schema/tree/master/datasets.
In Amsterdam Schema, we're using the following concepts:
| Type | Description |
|---|---|
| Dataset | A single dataset, with contents and metadata |
| Table | A single table with objects of a single class/type |
| Row | A row in such a table (a single object, a row in a source CSV file or feature in a source Shapefile, for example) |
| Field | A property of a single object |
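How these four concepts nest can be sketched with a few Python dataclasses. The class and attribute names below are chosen for illustration and are not part of Amsterdam Schema itself.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Field:
    """A property of a single object: a name and a type."""
    name: str
    type: str


# A row is one object: a mapping from field name to value.
Row = Dict[str, Any]


@dataclass
class Table:
    """A single table holding objects of one class/type."""
    id: str
    fields: List[Field] = field(default_factory=list)
    rows: List[Row] = field(default_factory=list)


@dataclass
class Dataset:
    """A single dataset: metadata plus its tables (the contents)."""
    id: str
    title: str
    tables: List[Table] = field(default_factory=list)


# Hypothetical example values.
buildings = Table(
    id="buildings",
    fields=[Field(name="id", type="string")],
    rows=[{"id": "b-1"}],
)
bag = Dataset(id="bag", title="Buildings and addresses", tables=[buildings])
```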
For example:
- The dataset `bag` contains data for each building and address in the city;
- This dataset contains two tables: `buildings` and `addresses`;
- To describe this dataset according to Amsterdam Schema, we first describe the metadata of the dataset (such as its identifier, title, description and DCAT fields) in a dataset.json file;
- For each table in this dataset, we describe the table metadata in a separate JSON file. We can also choose to combine the dataset and table JSON data in a single JSON file;
- For each table, we create a table-schema to describe its contents. This JSON Schema describes all the fields in a single table row, and the types of these fields;
- Amsterdam Schema is used to validate the dataset and table JSON data;
- Amsterdam Schema is used to validate the table row JSON Schema, with a meta-schema (a JSON Schema to verify a JSON Schema).
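The steps above can be sketched with a combined dataset and table document for `bag`, plus a toy row check. The key names (`id`, `title`, `tables`, `schema`) and the field names are assumptions made for this example and may differ from the real Amsterdam Schema documents. The checker understands only a tiny subset of JSON Schema (`required` and a few `type` values); a real setup would use a full JSON Schema validator.

```python
import json

# Hypothetical combined dataset + table document for the `bag` example.
DATASET_JSON = """
{
  "id": "bag",
  "title": "Buildings and addresses",
  "tables": [
    {
      "id": "buildings",
      "schema": {
        "type": "object",
        "required": ["id"],
        "properties": {
          "id": {"type": "string"},
          "construction_year": {"type": "integer"}
        }
      }
    },
    {
      "id": "addresses",
      "schema": {
        "type": "object",
        "required": ["id"],
        "properties": {
          "id": {"type": "string"},
          "street": {"type": "string"}
        }
      }
    }
  ]
}
"""

# Mapping from the JSON Schema type names used above to Python types.
JSON_TYPES = {"string": str, "integer": int, "number": (int, float)}


def row_errors(row, schema):
    """Check one table row against a very small subset of JSON Schema."""
    errors = []
    for name in schema.get("required", []):
        if name not in row:
            errors.append(f"missing field: {name}")
    for name, value in row.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # unknown fields are ignored in this sketch
        expected = JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"field {name} is not of type {spec['type']}")
    return errors


dataset = json.loads(DATASET_JSON)
table_schemas = {table["id"]: table["schema"] for table in dataset["tables"]}
```

Calling `row_errors(row, table_schemas["buildings"])` then returns a list of problems for a single building row, or an empty list when the row matches the table schema.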
You can find all historical versions of the Amsterdam Schema definition in this repository. Version numbers are shown as '@1.0.0' where we follow SchemaVer for versioning. This will allow for a gradual evolution of capabilities.
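As a small sketch of how such a version suffix could be read, the snippet below splits a reference like `name@1.0.0` into its three numeric parts. The component names follow SchemaVer's MODEL, REVISION and ADDITION; the function name and the `dataset` prefix in the usage note are hypothetical.

```python
import re
from typing import NamedTuple


class SchemaVersion(NamedTuple):
    model: int      # SchemaVer MODEL: incompatible with existing data
    revision: int   # SchemaVer REVISION: may be incompatible with some data
    addition: int   # SchemaVer ADDITION: compatible with all existing data


def parse_version(reference):
    """Extract the version from a reference such as 'name@1.0.0'."""
    _, separator, version = reference.rpartition("@")
    if not separator:
        raise ValueError(f"no '@' version suffix in {reference!r}")
    match = re.fullmatch(r"([0-9]+)\.([0-9]+)\.([0-9]+)", version)
    if match is None:
        raise ValueError(f"not a three-part version: {version!r}")
    return SchemaVersion(*(int(part) for part in match.groups()))
```

Because `SchemaVersion` is a tuple, versions compare component by component, so `parse_version("dataset@1.2.0") > parse_version("dataset@1.0.3")` holds.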
For more information, see (some of these pages are in Dutch):