Add search API compatible simprint format #21

@titusz

Add support for outputting simprints in a format compatible with the search API schema.

Current behavior:

  • Simprints are identified by separate maintype, subtype, and version fields
  • Features are organized in a FeatureSet, serialized in either Index-Format or Object-Format

Required changes:

  • Add to_api_format() method to convert metadata to API-compatible structure
  • Organize simprints by type key (e.g., SEMANTIC_TEXT_V0) instead of using separate maintype/subtype/version fields
  • Keep all existing fields including optional content, offset, size

Target API Schema:

{
  "simprints": {
    "SEMANTIC_TEXT_V0": [
      {
        "simprint": "XZjeSfdyVi0",
        "offset": 0,
        "size": 512,
        "content": "optional text chunk"
      }
    ]
  }
}

Implementation Notes:

  • Add to_api_format() method similar to existing to_index_format() and to_object_format()
  • Consider adding CLI flag (e.g., --api-format) to output in this format
  • Type key format: {MAINTYPE}_{SUBTYPE}_V{version} (e.g., SEMANTIC_TEXT_V0)
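
A minimal sketch of the conversion described above, not the project's actual implementation. It assumes each feature set is available as a plain dict in Object-Format, with `maintype`, `subtype`, `version`, and a `simprints` list whose entries carry `simprint` plus optional `offset`, `size`, and `content`. The standalone `type_key()` helper and the dict-based input are hypothetical; in the library, `to_api_format()` would presumably be a method next to `to_index_format()` and `to_object_format()`.

```python
from collections import defaultdict
from typing import Any


def type_key(maintype: str, subtype: str, version: int) -> str:
    """Build the API type key, e.g. SEMANTIC_TEXT_V0 (hypothetical helper)."""
    return f"{maintype.upper()}_{subtype.upper()}_V{version}"


def to_api_format(feature_sets: list[dict[str, Any]]) -> dict[str, Any]:
    """Group Object-Format simprints by type key (assumed input shape)."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for fs in feature_sets:
        key = type_key(fs["maintype"], fs["subtype"], fs.get("version", 0))
        for feature in fs.get("simprints", []):
            entry = {"simprint": feature["simprint"]}
            # Carry over optional fields only when they are present.
            for field in ("offset", "size", "content"):
                if feature.get(field) is not None:
                    entry[field] = feature[field]
            grouped[key].append(entry)
    return {"simprints": dict(grouped)}


if __name__ == "__main__":
    import json

    features = [
        {
            "maintype": "semantic",
            "subtype": "text",
            "version": 0,
            "simprints": [
                {
                    "simprint": "XZjeSfdyVi0",
                    "offset": 0,
                    "size": 512,
                    "content": "optional text chunk",
                }
            ],
        }
    ]
    print(json.dumps(to_api_format(features), indent=2))
```

Running the module prints a structure matching the target schema. Feature sets that share a type key are merged into one list, and optional fields that are missing or `None` are omitted rather than emitted as null; whether the search API prefers explicit nulls is an open question.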
