This repository was archived by the owner on Jul 14, 2025. It is now read-only.

deprecated in favor of MLM extension #16

Merged 11 commits on Jul 14, 2025
68 changes: 68 additions & 0 deletions MIGRATION_TO_MLM.md
@@ -0,0 +1,68 @@
# Migration Guide: ML Model Extension to MLM Extension

<!-- lint disable no-undefined-references -->

> [!IMPORTANT]
> For specific field migration details from [ML-Model](README.md) to [Machine Learning Model (MLM)][mlm], please refer
> to the [MLM Migration Document](https://github.com/stac-extensions/mlm/blob/main/docs/legacy/ml-model.md).

<!-- lint enable no-undefined-references -->

## Context

The ML Model Extension was started at Radiant Earth on October 4th, 2021.
It was possibly the first STAC extension dedicated to describing machine learning models.
The extension incorporated inputs from 9 different organizations and was used to describe models
in Radiant Earth's MLHub API. The announcement of this extension and its use in Radiant Earth's MLHub
is described [here](https://medium.com/radiant-earth-insights/geospatial-models-now-available-in-radiant-mlhub-a41eb795d7d7).
Radiant Earth's MLHub API and Python SDK are now [deprecated](https://mlhub.earth/).
To support current users of the ML Model extension, this document lays out a migration path for converting
metadata to the [Machine Learning Model Extension (MLM)][mlm].

## Shared Goals

Both the ML Model Extension and the [Machine Learning Model (MLM)][mlm] extension aim to provide a standard way to
catalog machine learning (ML) models that work with, but are not limited to, Earth observation (EO) data.

Their main goals are:

1. **Search and Discovery**: Helping users find and use ML models.
2. **Describing Inference and Training Requirements**: Making it easier to run these models by describing input requirements and outputs.
3. **Reproducibility**: Providing runtime information and links to assets so that model inference is reproducible.

## Schema Changes

### ML Model Extension
- **Scope**: Item, Collection
- **Field Name Prefix**: `ml-model`
- **Key Sections**:
- Item Properties
- Asset Objects
- Inference/Training Runtimes
- Relation Types
- Interpretation of STAC Fields

### MLM Extension
- **Scope**: Collection, Item, Asset, Links
- **Field Name Prefix**: `mlm`
- **Key Sections**:
- Item Properties and Collection Fields
- Asset Objects
- Relation Types
- Model Input/Output Objects
- Best Practices

### Notable Differences

- The MLM extension captures more detail at both the Item and Asset levels, making model metadata easier to describe and consume.
- The MLM extension describes more of the runtime requirements through distinct asset roles.
- The MLM extension integrates more closely with other STAC extensions and with the Python ecosystem; a minimal migration sketch follows below.
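
To make the prefix change concrete, here is a minimal sketch (in Python, assuming a STAC Item loaded as a plain dictionary) of how a migration could begin: it swaps the declared `ml-model` schema for the MLM schema and collects the legacy `ml-model:*` properties so they can be re-expressed as `mlm:*` fields. The MLM schema version shown is an assumption, and no field-by-field mapping is attempted here; the authoritative mapping is in the [MLM Migration Document](https://github.com/stac-extensions/mlm/blob/main/docs/legacy/ml-model.md).

```python
# Illustrative sketch only, not official tooling.  The MLM schema version
# below is an assumption; check the MLM repository for the current release.
from typing import Any

ML_MODEL_SCHEMA = "https://stac-extensions.github.io/ml-model/v1.0.0/schema.json"
MLM_SCHEMA = "https://stac-extensions.github.io/mlm/v1.4.0/schema.json"  # assumed version


def start_migration(item: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (item with updated stac_extensions, leftover ml-model properties)."""
    migrated = dict(item)
    migrated["stac_extensions"] = sorted(
        {MLM_SCHEMA if ext == ML_MODEL_SCHEMA else ext
         for ext in item.get("stac_extensions", [])}
    )
    properties = dict(item.get("properties", {}))
    # Pull out every ml-model-prefixed property; each one still needs to be
    # translated to its mlm:* equivalent using the MLM migration document.
    legacy = {key: properties.pop(key) for key in list(properties)
              if key.startswith("ml-model:")}
    migrated["properties"] = properties
    return migrated, legacy
```

Fields such as `ml-model:learning_approach` and `ml-model:prediction_type` are collected rather than renamed here because their MLM counterparts are not necessarily one-to-one; consult the migration document before mapping them.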

## Getting Help

If you have any questions about a migration, feel free to contact the maintainers by opening a discussion or issue
on the [MLM repository][mlm].

If you see a feature missing in the MLM, feel free to open an issue describing your feature request.

[mlm]: https://github.com/stac-extensions/mlm
36 changes: 28 additions & 8 deletions README.md
@@ -1,14 +1,34 @@
# ML Model Extension Specification

<!-- lint disable no-undefined-references -->

> [!WARNING]
> This repository is deprecated in favor of
> [https://github.com/stac-extensions/mlm](https://github.com/stac-extensions/mlm). <br>
> The corresponding schemas are made available on
> [https://stac-extensions.github.io/mlm/](https://stac-extensions.github.io/mlm/).
> Documentation on migrating from the ML-Model extension to the Machine Learning Model (MLM) extension
> is [here](./MIGRATION_TO_MLM.md). Further details are also available in the
> [MLM Migration Document](https://github.com/stac-extensions/mlm/blob/main/docs/legacy/ml-model.md).
>
> It is **STRONGLY** recommended to migrate `ml-model` definitions to the `mlm` extension.
> The `mlm` extension improves the model metadata definition and properties with added support
> for use cases not directly supported by `ml-model`.
> It also provides increased interoperability with other STAC extensions, adds best-practices recommendations,
> provides tooling for creating STAC attributes, and works toward alignment efforts from both the geospatial and
> machine learning communities.

<!-- lint enable no-undefined-references -->

- **Title:** ML Model
- **Identifier:** <https://stac-extensions.github.io/ml-model/v1.0.0/schema.json>
- **Field Name Prefix:** ml-model
- **Scope:** Item, Collection
- **Extension [Maturity Classification](https://github.com/radiantearth/stac-spec/tree/master/extensions/README.md#extension-maturity):** Proposal
- **Extension [Maturity Classification](https://github.com/radiantearth/stac-spec/tree/master/extensions/README.md#extension-maturity):** Deprecated
- **Owner**: @duckontheweb

This document explains the ML Model Extension to the [SpatioTemporal Asset
Catalog](https://github.com/radiantearth/stac-spec) (STAC) specification.

- Examples:
- [Item example](examples/dummy/item.json): Shows the basic usage of the extension in a STAC Item
@@ -49,7 +69,7 @@ these models for the following types of use-cases:
institutions are making an effort to publish code and examples along with academic publications to enable this kind of reproducibility. However,
the quality and usability of this code and related documentation can vary widely and there are currently no standards that ensure that a new
researcher could reproduce a given set of published results from the documentation. The STAC ML Model Extension aims to address this issue by
providing a detailed description of the training data and environment used in an ML model experiment.

## Item Properties

@@ -66,7 +86,7 @@ these models for the following types of use-cases:

#### ml-model:learning_approach

Describes the learning approach used to train the model. It is STRONGLY RECOMMENDED that you use one of the
following values, but other values are allowed.

- `"supervised"`
@@ -76,7 +96,7 @@ following values, but other values are allowed.

#### ml-model:prediction_type

Describes the type of predictions made by the model. It is STRONGLY RECOMMENDED that you use one of the
following values, but other values are allowed. Note that not all Prediction Type values are valid
for a given [Learning Approach](#ml-modellearning_approach).
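
As a small illustration of how these two properties sit alongside ordinary STAC fields, an Item's `properties` object might look like the following, expressed here as a Python dictionary. The `"classification"` prediction type is an assumed example value, since the full list of recommended values is not shown in this excerpt.

```python
# Illustrative fragment of a STAC Item's "properties" object.
# "supervised" is one of the recommended learning_approach values listed above;
# "classification" is an assumed prediction_type value for illustration only.
item_properties = {
    "datetime": "2021-10-04T00:00:00Z",            # ordinary STAC Item field
    "ml-model:learning_approach": "supervised",
    "ml-model:prediction_type": "classification",  # assumed allowed value
}
```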

@@ -120,7 +140,7 @@ While the Compose file defines nearly all of the parameters required to run the
directory containing input data should be mounted to the container and to which host directory the output predictions should be written. The Compose
file MUST define volume mounts for input and output data using the Compose
[Interpolation syntax](https://github.com/compose-spec/compose-spec/blob/master/spec.md#interpolation). The input data volume MUST be defined by an
`INPUT_DATA` variable and the output data volume MUST be defined by an `OUTPUT_DATA` variable.

For example, the following Compose file snippet would mount the host input directory to `/var/data/input` in the container and would mount the host
output data directory to `/var/data/output` in the container. In this contrived example, the script to run the model takes 2 arguments: the
@@ -208,10 +228,10 @@ extension, please open a PR to include it in the `examples` directory. Here are

### Running tests

The same checks that run on PRs are part of the repository and can be run locally to verify that changes are valid.
To run tests locally, you'll need `npm`, which is a standard part of any [node.js installation](https://nodejs.org/en/download/).

First you'll need to install everything with npm once. Just navigate to the root of this repository and on
your command line run:
```bash
npm install
```