
Files at raw.githubusercontent.com return 404 #414

@peterdesmet

Description


We are currently experiencing an issue where raw files (served from raw.githubusercontent.com) return a 404. For example:

This seems to be a repo-wide issue, affecting all files, branches and versions. It does not affect other tdwg repositories (e.g. https://raw.githubusercontent.com/tdwg/dwc/refs/heads/master/build/termlist-dictionary.en.json).

This affects reading of Camtrap DP datasets

This is problematic because Camtrap DP datapackage.json files typically refer to raw.githubusercontent.com. E.g.:

"schema": "https://raw.githubusercontent.com/tdwg/camtrap-dp/1.0.2/deployments-table-schema.json"

When the schema URL returns a 404, the camtrapdp R package, for example, won't be able to read the data.

Note: this does not affect Camtrap DP files created with the GBIF IPT, which refer to rs.gbif.org (e.g. http://rs.gbif.org/data-packages/camtrap-dp/1.0/table-schemas/deployments.json)
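To see which datasets are exposed, one could list the schema URLs in a datapackage.json and flag those hosted on raw.githubusercontent.com. The following is a minimal Python sketch, not part of camtrapdp or any existing tool: the function names and the inline descriptor (including the resource name `ipt-deployments`) are made up for illustration, and it assumes each resource's `schema` is given as a URL string.

```python
import json
from urllib.parse import urlparse


def schema_urls(descriptor):
    """Return (resource name, schema URL) pairs for resources whose schema is a URL."""
    pairs = []
    for resource in descriptor.get("resources", []):
        schema = resource.get("schema")
        # Inline (dict) schemas are skipped: they do not depend on a remote host.
        if isinstance(schema, str) and schema.startswith(("http://", "https://")):
            pairs.append((resource.get("name"), schema))
    return pairs


def hosted_on(pairs, host="raw.githubusercontent.com"):
    """Keep only the pairs whose schema URL points at the given host."""
    return [(name, url) for name, url in pairs if urlparse(url).hostname == host]


# Illustrative descriptor reusing the two schema URLs quoted in this issue.
descriptor = json.loads("""
{
  "resources": [
    {"name": "deployments",
     "schema": "https://raw.githubusercontent.com/tdwg/camtrap-dp/1.0.2/deployments-table-schema.json"},
    {"name": "ipt-deployments",
     "schema": "http://rs.gbif.org/data-packages/camtrap-dp/1.0/table-schemas/deployments.json"}
  ]
}
""")

at_risk = hosted_on(schema_urls(descriptor))
for name, url in at_risk:
    print(f"{name}: {url}")
```

For a real dataset, the descriptor would be loaded from the datapackage.json file instead of the inline string.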

Cause

The cause might be rate limiting of raw.githubusercontent.com, see e.g. https://stackoverflow.com/a/74960542. The fact that the issue was (temporarily) resolved when I wrote the above (files were accessible again) seems to support that.
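If the 404s are indeed transient, a client could work around them by retrying with an increasing delay. The sketch below is only that: it assumes rate limiting shows up as HTTP 404 or 429 (unconfirmed), the names `fetch_once` and `fetch_with_retry` are hypothetical, and a file that is genuinely missing would be retried just the same before the error is raised.

```python
import time
import urllib.error
import urllib.request


def fetch_once(url, timeout=10):
    """Fetch a URL and return its body as bytes (raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retry(url, fetch_once=fetch_once, attempts=4,
                     base_delay=1.0, sleep=time.sleep):
    """Retry on HTTP 404/429, doubling the wait each time; re-raise other errors."""
    for attempt in range(attempts):
        try:
            return fetch_once(url)
        except urllib.error.HTTPError as error:
            # Assumption: 404 and 429 may be transient signs of rate limiting.
            retryable = error.code in (404, 429)
            if not retryable or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1 s, 2 s, 4 s, ...
```

The `fetch_once` and `sleep` parameters are injectable so the retry logic can be exercised without network access or real waiting.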

Solution

For current datasets, this is beyond our control. In the medium term, we should probably serve our source files from elsewhere. Options:

  • rs.tdwg.org (would need a deployment method)
  • rs.gbif.org (already used by the IPT)
  • camtrap-dp.tdwg.org (the camtrap-dp website; version directories would need to be created manually)

Metadata


Labels: bug (Something isn't working)
