Description
We are currently experiencing an issue where raw files (served from raw.githubusercontent.com) return a 404. For example:
- File in repo (fine): https://github.com/tdwg/camtrap-dp/blob/1.0.2/deployments-table-schema.json
- Raw file (errors): https://raw.githubusercontent.com/tdwg/camtrap-dp/refs/tags/1.0.2/deployments-table-schema.json
This seems to be a repo-wide issue, affecting all files, branches and versions. It does not affect other tdwg repositories (e.g. https://raw.githubusercontent.com/tdwg/dwc/refs/heads/master/build/termlist-dictionary.en.json).
This affects reading of Camtrap DP datasets
This is problematic because Camtrap DP datapackage.json files typically refer to raw.githubusercontent.com. E.g.:
camtrap-dp/example/datapackage.json, line 10 in 43aba7d:
"schema": "https://raw.githubusercontent.com/tdwg/camtrap-dp/1.0.2/deployments-table-schema.json"
When the schema URL returns a 404, tools such as the camtrapdp R package won't be able to read the data.
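To see whether a given dataset is affected, the schema references in its datapackage.json can be checked directly. The following is a minimal sketch, not part of any Camtrap DP tooling: the function names `schema_urls` and `url_status` are made up for illustration, and it assumes the package has already been parsed into a dictionary (e.g. with `json.load`) and lists its tables under `resources`.

```python
import urllib.error
import urllib.request


def schema_urls(package: dict) -> list[str]:
    """Collect schema references that are URLs (inline schema objects are skipped)."""
    urls = []
    for resource in package.get("resources", []):
        schema = resource.get("schema")
        if isinstance(schema, str) and schema.startswith(("http://", "https://")):
            urls.append(schema)
    return urls


def url_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a URL, using a HEAD request.

    HTTP error responses (such as 404) are returned as their status code;
    other network failures (DNS, timeout) still raise an exception.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Calling `url_status(url)` for each entry of `schema_urls(package)` shows which schema references currently resolve and which return a 404.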
Note: this does not affect Camtrap DP files created with the GBIF IPT, which refer to rs.gbif.org (e.g. http://rs.gbif.org/data-packages/camtrap-dp/1.0/table-schemas/deployments.json).
Cause
The cause might be rate limiting of raw.githubusercontent.com, see e.g. https://stackoverflow.com/a/74960542. The fact that the issue was (temporarily) resolved while I was writing the above, with files accessible again, seems to support that.
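If rate limiting is indeed the cause, a client could work around intermittent failures by retrying with increasing delays. The sketch below is only an illustration under that assumption: `fetch_with_retry` is a hypothetical helper, `fetch` stands for any callable that returns an HTTP status code, and treating 404 as retryable is a choice specific to the behaviour described here, not general practice.

```python
import time
from typing import Callable


def fetch_with_retry(
    fetch: Callable[[], int],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch() until it returns a non-retryable status or attempts run out.

    Delays double after each failed attempt (1s, 2s, 4s, ... by default).
    The status of the last attempt is returned.
    """
    status = fetch()
    for attempt in range(1, attempts):
        # Assumption: 404 and 429 may both be symptoms of rate limiting here.
        if status not in (404, 429):
            break
        sleep(base_delay * 2 ** (attempt - 1))
        status = fetch()
    return status
```

This does not help when the files stay unavailable for longer periods, so it is at best a mitigation on the reading side.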
Solution
For current datasets, this is beyond our control. In the medium term, we should probably serve our source files from elsewhere. Options:
- rs.tdwg.org (would need a deployment method)
- rs.gbif.org (already used by the IPT)
- camtrap-dp.tdwg.org (the camtrap-dp website; would need version directories to be created manually)
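Whichever host is chosen, datasets would then point their schema references at the new location. As a rough sketch of what that change looks like for a parsed datapackage.json: `repoint_schemas` is a hypothetical helper, and `new_prefix` is a placeholder, since the final URL layout on any of the hosts above has not been decided.

```python
import copy

OLD_PREFIX = "https://raw.githubusercontent.com/tdwg/camtrap-dp/"


def repoint_schemas(package: dict, new_prefix: str, old_prefix: str = OLD_PREFIX) -> dict:
    """Return a copy of the package with schema URLs moved from old_prefix to new_prefix.

    Only string schema references starting with old_prefix are changed;
    the input package is left unmodified.
    """
    result = copy.deepcopy(package)
    for resource in result.get("resources", []):
        schema = resource.get("schema")
        if isinstance(schema, str) and schema.startswith(old_prefix):
            resource["schema"] = new_prefix + schema[len(old_prefix):]
    return result
```

The rest of the path (version and file name) is kept as-is, which assumes the new host mirrors the repository's directory structure.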