Create repository of extractors #8

@vsoch

Right now these are living in the dockerfiles repo (a full example), but we should also provide simple examples in a separate repo, with the goal of being able to plug easily into other tools (e.g., datalad, @yarikoptic).

These extractors (in progress!) will be here: https://github.com/openschemas/extractors

@yarikoptic I'm done with the schemaorg Python tooling, and I'm waiting to hear from the library about use cases for the first implementations with datalad. I'll also have "ImageDefinition" examples finished soon; I'm just waiting on a few PRs into container-diff to get all the metadata that I want. A full "dockerfiles" example with embedded schemaorg metadata is coming soon as well (it's parsing now).

The general goal is that a datalad user whose dataset fits a schema.org definition can grab one of these extractors and use it with datalad (and schemaorg) to generate the metadata (web view) for that dataset.
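To make that goal concrete, here is a rough sketch of what a very small extractor could look like. It uses only the Python standard library and does not use the schemaorg package or any datalad API; the function names `extract_dataset_metadata` and `to_jsonld_script`, the chosen fields, and the example values are all assumptions for illustration, not the actual extractors in the repo.

```python
import json


def extract_dataset_metadata(name, description, url):
    """Hypothetical extractor: build a schema.org Dataset record as a dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }


def to_jsonld_script(metadata):
    """Wrap the record in a JSON-LD script tag for embedding in a web page."""
    body = json.dumps(metadata, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values, not a real dataset.
record = extract_dataset_metadata(
    name="example-dataset",
    description="A placeholder dataset used only for illustration.",
    url="https://example.org/datasets/example-dataset",
)
print(to_jsonld_script(record))
```

A real extractor would presumably read these fields from the dataset itself rather than take them as arguments, and would validate them against the schema.org definition.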

Another question for you: do you have any datasets or community needs that would do well with a Python extractor for datalad? Since these are ready to go and I'd really like to get started (and I'm not sure how long the library will take), it might be faster to find another use case too.
