From 9741a6b2ba4ee80ded6094a315ca27a0dee829a2 Mon Sep 17 00:00:00 2001
From: Yaroslav Halchenko
Date: Wed, 21 Aug 2024 21:43:19 -0400
Subject: [PATCH 01/24] Initial draft of a design document for the Zenodo like DOI per dandiset

---
 doc/design/doi-generation-2.md | 66 ++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 doc/design/doi-generation-2.md

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
new file mode 100644
index 000000000..54ec508ba
--- /dev/null
+++ b/doc/design/doi-generation-2.md
@@ -0,0 +1,66 @@
+# DOI for Draft Dandisets
+
+Author: Yaroslav O. Halchenko & Dorota Jarecka
+
+The current approach:
+
+- [initial design doc](./doi-generation-1.md)
+- overall:
+  - inject fake DOI upon dandiset creation
+  - mint proper DOI only upon dandiset publication
+
+## Issues with the Existing Approach
+
+- [Stop injecting "fake" DOIs into draft dandisets](https://github.com/dandi/dandi-archive/issues/1709)
+- [Unpublished Dandisets display a DOI under `Cite As`](https://github.com/dandi/dandi-archive/issues/1932)
+
+## Proposed Solution
+
+Initially proposed/discussed in
+
+- [Create and maintain a "Findable" DOI for the Dandiset as a whole](https://github.com/dandi/dandi-archive/issues/1319)
+
+and boils down to the adoption of approach of Zenodo of having a DOI which always points to the latest version of the record.
+
+DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org/docs/what-does-the-state-of-the-doi-mean)):
+
+- `Draft`. We do not use those.
+  *Can be deleted, and they require only the identifier itself in order to be created or saved. They can be updated to either Registered or Findable DOIs. Registered and Findable DOIs may not be returned to the Draft state, which means that changing the state of a Draft is final.*
+- `Registered`. Like `Findable` but not indexed for search, so we do not use them.
+- `Findable`. Is the type we use for published dandisets.
+  Requires to be valid (pass validation to fit the datacite schema) to be created.
+
+We propose to:
+
+- Instead of a fake DOI, upon creation of dandiset, mint and use a legit `Draft` DOI `10.48324/dandi.{dandiset.id}` with
+    - metadata entered during creation request (title, description, license)
+    - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}`
+    - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
+- Upon changes to dandiset metadata record, for public (non-embargoed dandisets), try to update datacite metadata record while keeping the same target URL.
+  - For Draft DOI (dandiset was not published yet), there is no validation.
+    - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? Does it fail to update altogether? does it update only the fields it knows about?
+  - For Findable DOI (dandiset was published), metadata record must pass validation, so we might fail to update.
+    But that should be ok.
+- Upon unembargoing dandiset: update (Draft) DOI metadata record with current metadata.
+- Upon publication of the dandiset:
+  - (already done currently) mint a proper `Findable` version DOI `10.48324/dandi.{dandiset.id}/{version}`
+  - update Dandiset-wide DOI `10.48324/dandi.{dandiset.id}` with metadata provided for version DOI, while keeping URL pointing to DLP instead of the released version.
+    - if Dandiset-wide DOI was in Draft state, it would be updated to Findable state (should work since we know metadata record passed validation).
+
+## Concerns to keep in mind/address
+
+- Draft dandiset might not have sufficient metadata to mint a proper DOI, or metadata might not be "proper" (fail validation) thus causing issues with minting a DOI
+  - Solution: start with Draft (not findable) DOI, and then upon publication mint a "findable" DOI
+  - **Follow up concern**: after dandiset and DOI publish, metadata of the Draft version of the dandiset could still be changed.
+    This potentially making changed record again "invalid".then when people change metadata
+- test site of datacite had different result of validation that the primary one
+
+- `Findable` DOI cannot be deleted, but in principle we allow for deletion of dandisets.
+  - We might want a dedicated 404 page for deleted dandisets, or at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.)
+  - Then we adjust DOI record to point to that page.
+
+- Should we do anything at dandischema level?
+
+- Should we do anything at DLP level?
+
+- Should we somehow reflect interactions with DataCite in Audit log?

From 3cf9b1dabefbf2c74ae4e3a7cc56f763fea9e3da Mon Sep 17 00:00:00 2001
From: Yaroslav Halchenko
Date: Fri, 6 Dec 2024 13:03:39 -0500
Subject: [PATCH 02/24] Elaborate plan a little more and add TODO for a test script across dandisets

---
 doc/design/doi-generation-2.md | 44 +++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index 54ec508ba..a2fff0d06 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -2,14 +2,14 @@
 
 Author: Yaroslav O. Halchenko & Dorota Jarecka
 
-The current approach:
+## The current approach
 
 - [initial design doc](./doi-generation-1.md)
 - overall:
   - inject fake DOI upon dandiset creation
-  - mint proper DOI only upon dandiset publication
+  - mint proper DOI only upon dandiset publication (function `create_doi`)
 
-## Issues with the Existing Approach
+### Issues with the Existing Approach
 
 - [Stop injecting "fake" DOIs into draft dandisets](https://github.com/dandi/dandi-archive/issues/1709)
 - [Unpublished Dandisets display a DOI under `Cite As`](https://github.com/dandi/dandi-archive/issues/1932)
@@ -32,28 +32,34 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org
 
 We propose to:
 
-- Instead of a fake DOI, upon creation of dandiset, mint and use a legit `Draft` DOI `10.48324/dandi.{dandiset.id}` with
-    - metadata entered during creation request (title, description, license)
+- Instead of a fake DOI, upon creation of a **public** dandiset, mint and use a legit `Draft DOI` `10.48324/dandi.{dandiset.id}` with
+    - *minimal metadata* entered during creation request (title, description, license)
     - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}`
     - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
+    - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset.
 - Upon changes to dandiset metadata record, for public (non-embargoed dandisets), try to update datacite metadata record while keeping the same target URL.
-  - For Draft DOI (dandiset was not published yet), there is no validation.
+  - For `Draft DOI` (dandiset was not published yet), there is no validation.
     - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? Does it fail to update altogether? does it update only the fields it knows about?
-  - For Findable DOI (dandiset was published), metadata record must pass validation, so we might fail to update.
-    But that should be ok.
-- Upon unembargoing dandiset: update (Draft) DOI metadata record with current metadata.
+  - For `Findable DOI` (dandiset was published), metadata record must pass validation, so we might fail to update.
+    - That should be ok. Alternatively we could try to update only some most important metadata fields from the last released version of the dandiset (title, authors, ...).
+  - **TODO: figure out how to annotate Draft version, so it always says that it is a draft version and thus potentially not used for citation if that could be avoided**
+- Upon changes to dandiset metadata record, for embargoed dandisets don't do anything.
+- Upon unembargoing dandiset: update `Draft DOI` metadata record with current metadata **after** unembargoing.
 - Upon publication of the dandiset:
   - (already done currently) mint a proper `Findable` version DOI `10.48324/dandi.{dandiset.id}/{version}`
-  - update Dandiset-wide DOI `10.48324/dandi.{dandiset.id}` with metadata provided for version DOI, while keeping URL pointing to DLP instead of the released version.
-    - if Dandiset-wide DOI was in Draft state, it would be updated to Findable state (should work since we know metadata record passed validation).
+  - update Dandiset-wide DOI (`Draft` or `Findable`) `10.48324/dandi.{dandiset.id}` with metadata provided for the version DOI, while keeping URL pointing to DLP instead of the released version.
+    - if Dandiset-wide DOI was in `Draft` state, it would be updated to `Findable` state (should work since we know metadata record passed validation).
+    - **Question to clear up**:how to do that in API
+    - **Question to clear up**: behavior on what happens if metadata record is invalid?

 ## Concerns to keep in mind/address
 
 - Draft dandiset might not have sufficient metadata to mint a proper DOI, or metadata might not be "proper" (fail validation) thus causing issues with minting a DOI
-  - Solution: start with Draft (not findable) DOI, and then upon publication mint a "findable" DOI
+  - **Solution**: start with Draft (not findable) DOI, and then upon publication mint a "findable" DOI
   - **Follow up concern**: after dandiset and DOI publish, metadata of the Draft version of the dandiset could still be changed.
-    This potentially making changed record again "invalid".then when people change metadata
-- test site of datacite had different result of validation that the primary one
+    This potentially making changed record again "invalid".
+    Should be Ok'ish
+- Test site of datacite had different result of validation that the primary one
 
 - `Findable` DOI cannot be deleted, but in principle we allow for deletion of dandisets.
   - We might want a dedicated 404 page for deleted dandisets, or at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.)
@@ -64,3 +70,13 @@ We propose to:
 - Should we do anything at DLP level?
 
 - Should we somehow reflect interactions with DataCite in Audit log?
+
+
+# Targets TODO before implementation
+
+- develop a script, which tests on test fabric of datacite changes as introduced to all dandisets in the archive by
+  - for each dandiset
+    - generate a record for overall *dandiset DOI* corresponding to metadata of the first release if any exists, otherwise corresponding to metadata of the draft version
+    - for each release: mint a new *version DOI* for that release + possibly update *dandiset DOI* to correspond to potential changes in metadata
+    - update *dandiset DOI* to metadata of draft version
+

From d317fb68d39d0cb2dc84cd86a55ead97e9e3d3aa Mon Sep 17 00:00:00 2001
From: Yaroslav Halchenko
Date: Fri, 17 Jan 2025 09:18:48 -0500
Subject: [PATCH 03/24] fix indentation

---
 doc/design/doi-generation-2.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index a2fff0d06..36b37f2d8 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -33,10 +33,10 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org
 We propose to:
 
 - Instead of a fake DOI, upon creation of a **public** dandiset, mint and use a legit `Draft DOI` `10.48324/dandi.{dandiset.id}` with
-    - *minimal metadata* entered during creation request (title, description, license)
-    - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}`
-    - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
-    - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset.
+  - *minimal metadata* entered during creation request (title, description, license)
+  - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}`
+  - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
+  - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset.
 - Upon changes to dandiset metadata record, for public (non-embargoed dandisets), try to update datacite metadata record while keeping the same target URL.
   - For `Draft DOI` (dandiset was not published yet), there is no validation.
     - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? Does it fail to update altogether? does it update only the fields it knows about?

From 9aa3e013578b97e81908c5783a04a2fc839b206d Mon Sep 17 00:00:00 2001
From: Yaroslav Halchenko
Date: Fri, 17 Jan 2025 10:01:51 -0500
Subject: [PATCH 04/24] Simplify operation -- no changes to dandiset wide DOI record for a "Published" draft version of dandiset

---
 doc/design/doi-generation-2.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index 36b37f2d8..4f4082e0c 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -37,11 +37,10 @@ We propose to:
   - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}`
   - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
   - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset.
-- Upon changes to dandiset metadata record, for public (non-embargoed dandisets), try to update datacite metadata record while keeping the same target URL.
-  - For `Draft DOI` (dandiset was not published yet), there is no validation.
+- Upon changes to dandiset metadata record (so, of a draft version of dandiset), for public (non-embargoed dandisets):
+  - For `Draft DOI` (dandiset was not published yet), there is no validation, try to update datacite metadata record while keeping the same target URL
     - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? Does it fail to update altogether? does it update only the fields it knows about?
-  - For `Findable DOI` (dandiset was published), metadata record must pass validation, so we might fail to update.
-    - That should be ok. Alternatively we could try to update only some most important metadata fields from the last released version of the dandiset (title, authors, ...).
+  - For `Findable DOI` (dandiset was published at least once), we do not update anything since DLP points to that published version.
   - **TODO: figure out how to annotate Draft version, so it always says that it is a draft version and thus potentially not used for citation if that could be avoided**
 - Upon changes to dandiset metadata record, for embargoed dandisets don't do anything.
 - Upon unembargoing dandiset: update `Draft DOI` metadata record with current metadata **after** unembargoing.
@@ -55,17 +54,19 @@ We propose to:
 ## Concerns to keep in mind/address
 
 - Draft dandiset might not have sufficient metadata to mint a proper DOI, or metadata might not be "proper" (fail validation) thus causing issues with minting a DOI
-  - **Solution**: start with Draft (not findable) DOI, and then upon publication mint a "findable" DOI
+  - **Solution**: start with Draft (not findable) DOI, and then upon publication mint a "findable" DOI.
   - **Follow up concern**: after dandiset and DOI publish, metadata of the Draft version of the dandiset could still be changed.
     This potentially making changed record again "invalid".
     Should be Ok'ish
 - Test site of datacite had different result of validation that the primary one
+- `Draft` DOI is not visible/usable by users. We might want to switch it to `Findable` as soon ASAP (when datacite validates record ok).
 
 - `Findable` DOI cannot be deleted, but in principle we allow for deletion of dandisets.
   - We might want a dedicated 404 page for deleted dandisets, or at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.)
   - Then we adjust DOI record to point to that page.
 
 - Should we do anything at dandischema level?
+  - yes, `to_datacite` and potentially model might need to change to accomodate for needed changes.
 
 - Should we do anything at DLP level?

From 3f39295a34c29ea5f6d03098bc5a122b01bdc8c5 Mon Sep 17 00:00:00 2001
From: Yaroslav Halchenko
Date: Fri, 7 Mar 2025 09:36:47 -0500
Subject: [PATCH 05/24] Update doc/design/doi-generation-2.md

---
 doc/design/doi-generation-2.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index 4f4082e0c..2aa7d2187 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -38,9 +38,13 @@ We propose to:
   - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
   - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset.
 - Upon changes to dandiset metadata record (so, of a draft version of dandiset), for public (non-embargoed dandisets):
-  - For `Draft DOI` (dandiset was not published yet), there is no validation, try to update datacite metadata record while keeping the same target URL
-    - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? Does it fail to update altogether? does it update only the fields it knows about?
-  - For `Findable DOI` (dandiset was published at least once), we do not update anything since DLP points to that published version.
+  - For `Draft DOI` (dandiset was not published yet): try to update/make it `Findable`.
+    - If fails - keep Draft since there is no validation, try to update datacite metadata record while keeping the same target URL.
+    - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? It seems to create one with no metadata, but does it update only the fields it knows about?
+  - For `Findable DOI`
+    - if it is still a draft version but which had legit metadata, we try to update metadata. If fails, we either ignore or just add a comment somewhere that "record might not reflect the most recent changes to draft version".
+      - I think we need to add to validation procedures, validation against datacite metadata record, and reporting errors to the user so that users address them before trying to publish. May be we should validate only if no other errors (our schema validation) were detected to reduce noise, or just give a summary that "Metadata is not satisfying datacite model, fix known metadata errors first."
+    - if dandiset was published at least once (has version) -- we do not update anything since DLP points to that published version.
   - **TODO: figure out how to annotate Draft version, so it always says that it is a draft version and thus potentially not used for citation if that could be avoided**
 - Upon changes to dandiset metadata record, for embargoed dandisets don't do anything.
 - Upon unembargoing dandiset: update `Draft DOI` metadata record with current metadata **after** unembargoing.

From 4e833ccf0531d7b1896042ff8226037b6430dcdd Mon Sep 17 00:00:00 2001
From: Austin Macdonald
Date: Tue, 8 Apr 2025 12:09:28 -0500
Subject: [PATCH 06/24] add overview

---
 doc/design/doi-generation-2.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index 2aa7d2187..33480a693 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -1,6 +1,19 @@
 # DOI for Draft Dandisets
 
-Author: Yaroslav O. Halchenko & Dorota Jarecka
+Authors: Yaroslav O. Halchenko, Dorota Jarecka, Austin Macdonald
+
+## Overview
+
+This document describes an updated strategy for DOI management within the Dandi Archive.
+Upon creation, every public Dandiset will receive a **Dandiset DOI** that will represent the current draft and all future versions.
+Every public published version of a Dandiset will recieve a **Version DOI**.
+
+For example:
+Dandiset DOI: `https://doi.org/10.48324/dandi.000027/`
+Version DOI: `https://doi.org/10.48324/dandi.000027/0.210831.2033`
+
+Prior to publication, the Dandiset DOI will refer to the latest draft version.
+Following publication, the Dandiset DOI will refer to the latest published version.
 
 ## The current approach

From ffaccae9e1c7abfce39bc9357f1466b5895b0a25 Mon Sep 17 00:00:00 2001
From: Austin Macdonald
Date: Wed, 9 Apr 2025 12:22:22 -0500
Subject: [PATCH 07/24] Clarify and correct existing approach

---
 doc/design/doi-generation-2.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index 33480a693..3931730ea 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -19,8 +19,11 @@ Following publication, the Dandiset DOI will refer to the latest published versi
 
 - [initial design doc](./doi-generation-1.md)
 - overall:
-  - inject fake DOI upon dandiset creation
-  - mint proper DOI only upon dandiset publication (function `create_doi`)
+  - leave DOI absent upon dandiset creation
+  - upon publication
+    - inject fake DOI (but do not save) and validate
+    - after validation, create a new `Version DOI` (function `create_doi`)
+    - publish dandiset
 
 ### Issues with the Existing Approach

From d2d45ad92a1de654cb1b8b4da534943bc8cc9b63 Mon Sep 17 00:00:00 2001
From: Austin Macdonald
Date: Wed, 9 Apr 2025 12:52:35 -0500
Subject: [PATCH 08/24] Clarify Dandiset DOI vs Version DOI, and update based on asmacdo/yarikoptic discussion

---
 doc/design/doi-generation-2.md | 61 ++++++++++++++++++++++------------
 1 file changed, 40 insertions(+), 21 deletions(-)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index 3931730ea..d54ab8076 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -15,6 +15,9 @@ Version DOI: `https://doi.org/10.48324/dandi.000027/0.210831.2033`
 Prior to publication, the Dandiset DOI will refer to the latest draft version.
 Following publication, the Dandiset DOI will refer to the latest published version.
 
+At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "promoted" to a DataCite `Findable DOI` as soon as possible.
+`Version DOI` will always be a `Findable DOI`.
+
 ## The current approach
 
 - [initial design doc](./doi-generation-1.md)
 - overall:
@@ -37,6 +40,7 @@ Initially proposed/discussed in
 - [Create and maintain a "Findable" DOI for the Dandiset as a whole](https://github.com/dandi/dandi-archive/issues/1319)
 
 and boils down to the adoption of approach of Zenodo of having a DOI which always points to the latest version of the record.
+[Zenodo uses the language](https://support.zenodo.org/help/en-gb/1-upload-deposit/97-what-is-doi-versioning) `Concept DOI` to mean a top-level DOI that references all versions, which we will refer to as `Dandiset DOI`.
 
 DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org/docs/what-does-the-state-of-the-doi-mean)):
 
@@ -48,33 +52,40 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org
 
 We propose to:
 
-- Instead of a fake DOI, upon creation of a **public** dandiset, mint and use a legit `Draft DOI` `10.48324/dandi.{dandiset.id}` with
+- Upon creation of a **public** dandiset, mint a `Dandiset DOI` (a DataCite `Draft DOI`) `10.48324/dandi.{dandiset.id}` with
   - *minimal metadata* entered during creation request (title, description, license)
   - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}`
   - For embargoed dandiset, **do not** specify any metadata besides the DLP URL.
   - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset.
-- Upon changes to dandiset metadata record (so, of a draft version of dandiset), for public (non-embargoed dandisets):
-  - For `Draft DOI` (dandiset was not published yet): try to update/make it `Findable`.
-    - If fails - keep Draft since there is no validation, try to update datacite metadata record while keeping the same target URL.
-    - **Question to clear up**: what happens to Draft DOI if metadata record is invalid? It seems to create one with no metadata, but does it update only the fields it knows about?
-  - For `Findable DOI`
-    - if it is still a draft version but which had legit metadata, we try to update metadata. If fails, we either ignore or just add a comment somewhere that "record might not reflect the most recent changes to draft version".
-      - I think we need to add to validation procedures, validation against datacite metadata record, and reporting errors to the user so that users address them before trying to publish. May be we should validate only if no other errors (our schema validation) were detected to reduce noise, or just give a summary that "Metadata is not satisfying datacite model, fix known metadata errors first."
+- Upon changes to a non-embargoed, draft dandiset metadata record:
+  - If `Draft DOI`, attempt to "promote" it to `Findable`.
+    - If validation fails - keep `Draft DOI` (very limited validation), attempt to update datacite metadata record while keeping the same target URL.
+    - **Question to clear up**: what happens to `Draft DOI` if metadata record is invalid? It seems to create one with no metadata, but does it update only the fields it knows about?
+  - If `Findable DOI`
+    - if draft version with no prior publications, but which had legit metadata, we try to update metadata.
+      - **Question to clear up** If fails, we either ignore or just add a comment somewhere that "record might not reflect the most recent changes to draft version".
+      - **Question to clear up** If we add to validation procedures to dandiset updates, (validation against datacite metadata record), we can report errors to the user so they can be addressed prior to attempted publication. May be we should validate only if no other errors (our schema validation) were detected to reduce noise, or just give a summary that "Metadata is not satisfying datacite model, fix known metadata errors first."
     - if dandiset was published at least once (has version) -- we do not update anything since DLP points to that published version.
   - **TODO: figure out how to annotate Draft version, so it always says that it is a draft version and thus potentially not used for citation if that could be avoided**
-- Upon changes to dandiset metadata record, for embargoed dandisets don't do anything.
-- Upon unembargoing dandiset: update `Draft DOI` metadata record with current metadata **after** unembargoing.
+- Upon changes to embargoed dandiset metadata record, don't do anything.
+- Upon unembargoing dandiset: update `Draft DOI` metadata record with current metadata and attempt to promote to a `Findable DOI` **after** unembargoing.
 - Upon publication of the dandiset:
-  - (already done currently) mint a proper `Findable` version DOI `10.48324/dandi.{dandiset.id}/{version}`
-  - update Dandiset-wide DOI (`Draft` or `Findable`) `10.48324/dandi.{dandiset.id}` with metadata provided for the version DOI, while keeping URL pointing to DLP instead of the released version.
-    - if Dandiset-wide DOI was in `Draft` state, it would be updated to `Findable` state (should work since we know metadata record passed validation).
+  - (already done currently) mint a proper `Findable` `Version DOI`, ie `10.48324/dandi.{dandiset.id}/{version}`
+  - update `Dandiset DOI` `10.48324/dandi.{dandiset.id}` with metadata provided for the `Version DOI`, **while keeping URL pointing to DLP instead of the released version**.
+    - if `Dandiset DOI` was a `Draft DOI` state, promote to `Findable DOI` (should work since we know metadata record passed validation).
     - **Question to clear up**:how to do that in API
     - **Question to clear up**: behavior on what happens if metadata record is invalid?
 
+### Migration
+
+A django-admin script should be created and executed to create a `Dandiset DOI` for all existing dandisets.
+
+**Question to address**: Will adding a `Dandiset DOI` in addition to `Version DOI` require a db migration?
+
 ## Concerns to keep in mind/address
 
-- Draft dandiset might not have sufficient metadata to mint a proper DOI, or metadata might not be "proper" (fail validation) thus causing issues with minting a DOI
-  - **Solution**: start with Draft (not findable) DOI, and then upon publication mint a "findable" DOI.
+- Draft dandiset might not have sufficient (or valid) metadata to promote to a `Findable` DOI, thus causing issues with minting a DOI
+  - **Solution**: start with Draft (not findable) DOI, and then upon publication promote to a "findable" DOI.
   - **Follow up concern**: after dandiset and DOI publish, metadata of the Draft version of the dandiset could still be changed.
     This potentially making changed record again "invalid".
     Should be Ok'ish
@@ -82,16 +93,24 @@ We propose to:
 - `Draft` DOI is not visible/usable by users. We might want to switch it to `Findable` as soon ASAP (when datacite validates record ok).
 
 - `Findable` DOI cannot be deleted, but in principle we allow for deletion of dandisets.
-  - We might want a dedicated 404 page for deleted dandisets, or at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.)
-  - Then we adjust DOI record to point to that page.
+  - Option 1: We might want a dedicated 404 page for deleted dandisets, then we adjust DOI record to point to that page.
+  - Option 2: at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.)
 
 - Should we do anything at dandischema level?
-  - yes, `to_datacite` and potentially model might need to change to accomodate for needed changes.
+  - yes
+    - Needs to be able to mint `Draft DOI`
+    - Needs to be able to promote `Draft DOI` to `Findable DOI`
+    - potentially model might need to change
 
 - Should we do anything at DLP level?
-
-- Should we somehow reflect interactions with DataCite in Audit log?
-
+  - We may want to include `Dandiset DOI` somewhere, in addition to the `Version DOI` which we currently use.
+
+- Should we somehow reflect interactions with DataCite in Audit log? Possible things to log:
+  - `Dandiset DOI`
+    - Success/Fail creation of `Draft DOI`
+    - Success/Fail promotion of `Draft DOI` to `Findable DOI` (Expected to fail if metadata is incomplete)
+  - `Version DOI`
+    - Success/Fail creation of `Findable DOI`
 
 # Targets TODO before implementation

From 981355a5163a952228848043a3037787f1e795b3 Mon Sep 17 00:00:00 2001
From: Austin Macdonald
Date: Wed, 9 Apr 2025 15:41:04 -0500
Subject: [PATCH 09/24] add mermaid diagram (Yarik + Claude draft)

---
 doc/design/doi-generation-2.md | 70 ++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md
index d54ab8076..cfef584f0 100644
--- a/doc/design/doi-generation-2.md
+++ b/doc/design/doi-generation-2.md
@@ -76,6 +76,76 @@ We propose to:
     - **Question to clear up**:how to do that in API
     - **Question to clear up**: behavior on what happens if metadata record is invalid?

+### Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant DandiArchive as Dandi Archive
+    participant DataCite
+
+    Note over User,DataCite: Public Dandiset Creation
+    User->>DandiArchive: Create public dandiset
+    DandiArchive->>DataCite: Request Draft DOI (10.48324/dandi.{id})
+    DataCite-->>DandiArchive: Return Draft DOI
+    Note right of DataCite: Minimal metadata + DLP URL
+    DandiArchive-->>User: Return dandiset with Draft DOI
+
+    Note over User,DataCite: Metadata Updates (Public Dandiset)
+    User->>DandiArchive: Update dandiset metadata
+    alt Draft DOI (not published yet)
+        DandiArchive->>DataCite: Try to update & make DOI Findable
+        alt Validation successful
+            DataCite-->>DandiArchive: Update to Findable DOI
+        else Validation fails
+            DataCite-->>DandiArchive: Keep as Draft DOI
+            DandiArchive->>DataCite: Update metadata record only
+        end
+    else Findable DOI (previous draft with valid metadata)
+        DandiArchive->>DataCite: Try to update metadata
+        alt Validation successful
+            DataCite-->>DandiArchive: Update metadata
+        else Validation fails
+            DandiArchive-->>DandiArchive: Log error, continue
+        end
+    else Published dandiset (has version)
+        DandiArchive-->>DandiArchive: No DOI update (points to published version)
+    end
+    DandiArchive-->>User: Return updated dandiset
+
+    Note over User,DataCite: Embargoed Dandiset Handling
+    User->>DandiArchive: Create embargoed dandiset
+    DandiArchive->>DataCite: Request Draft DOI (minimal info)
+    Note right of DataCite: Only DLP URL, no metadata
+    DataCite-->>DandiArchive: Return Draft DOI
+    DandiArchive-->>User: Return dandiset with Draft DOI
+
+    User->>DandiArchive: Unembargo dandiset
+    DandiArchive->>DataCite: Update Draft DOI with current metadata
+    DataCite-->>DandiArchive: Update Draft DOI
+    DandiArchive-->>User: Return unembaroed dandiset
+
+    Note over User,DataCite: Dandiset Publication
+    User->>DandiArchive: Publish dandiset
+    DandiArchive->>DataCite: Mint version DOI (10.48324/dandi.{id}/{version})
+ DataCite-->>DandiArchive: Return Findable version DOI + + DandiArchive->>DataCite: Update dandiset-wide DOI with version metadata + alt Draft DOI + DataCite-->>DandiArchive: Update to Findable state + else Already Findable + DataCite-->>DandiArchive: Update metadata only + end + + DandiArchive-->>User: Return published dandiset with both DOIs + + Note over User,DataCite: Dandiset Deletion + User->>DandiArchive: Delete dandiset + DandiArchive->>DataCite: Update DOI to point to deletion page + DataCite-->>DandiArchive: Update DOI target URL + DandiArchive-->>User: Confirm deletion +``` + ### Migration A django-admin script should be created and executed to create a `Dandiset DOI` for all existing dandisets. From 028ca6753b99f42e77171faf5be2d69773caed96 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Wed, 9 Apr 2025 16:04:12 -0500 Subject: [PATCH 10/24] Add recovery flow for updating dandisets that don't have DOIs due to previous failures Clarify no DOIs for dandisets while embargoed --- doc/design/doi-generation-2.md | 70 ++++++++++++++++++++++------------ 1 file changed, 45 insertions(+), 25 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index cfef584f0..173baffd4 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -86,60 +86,80 @@ sequenceDiagram Note over User,DataCite: Public Dandiset Creation User->>DandiArchive: Create public dandiset - DandiArchive->>DataCite: Request Draft DOI (10.48324/dandi.{id}) - DataCite-->>DandiArchive: Return Draft DOI + DandiArchive->>DataCite: Mint Dandiset DOI (Draft) (10.48324/dandi.{id}) Note right of DataCite: Minimal metadata + DLP URL - DandiArchive-->>User: Return dandiset with Draft DOI + alt DOI minting successful + DataCite-->>DandiArchive: Return Dandiset DOI (Draft) + else DOI minting fails + DataCite-->>DandiArchive: Error + Note right of DandiArchive: Log error but continue dandiset creation + end + DandiArchive-->>User: Return dandiset 
with or without Dandiset DOI - Note over User,DataCite: Metadata Updates (Public Dandiset) + Note over User,DataCite: Metadata Updates (Non-embargoed Draft) User->>DandiArchive: Update dandiset metadata - alt Draft DOI (not published yet) - DandiArchive->>DataCite: Try to update & make DOI Findable + alt No Dandiset DOI exists (previous mint failed) + DandiArchive->>DataCite: Mint Dandiset DOI (Draft) + DataCite-->>DandiArchive: Return Dandiset DOI (Draft) + DandiArchive->>DataCite: Try to promote to Dandiset DOI (Findable) + alt Validation successful + DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) + else Validation fails + DataCite-->>DandiArchive: Keep as Dandiset DOI (Draft) + end + else Dandiset DOI (Draft) exists + DandiArchive->>DataCite: Try to promote to Dandiset DOI (Findable) alt Validation successful - DataCite-->>DandiArchive: Update to Findable DOI + DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) else Validation fails - DataCite-->>DandiArchive: Keep as Draft DOI + DataCite-->>DandiArchive: Keep as Dandiset DOI (Draft) DandiArchive->>DataCite: Update metadata record only end - else Findable DOI (previous draft with valid metadata) - DandiArchive->>DataCite: Try to update metadata + else Dandiset DOI (Findable) exists (no prior publications) + DandiArchive->>DataCite: Update metadata alt Validation successful DataCite-->>DandiArchive: Update metadata else Validation fails DandiArchive-->>DandiArchive: Log error, continue end - else Published dandiset (has version) + else Dandiset has published versions DandiArchive-->>DandiArchive: No DOI update (points to published version) end DandiArchive-->>User: Return updated dandiset Note over User,DataCite: Embargoed Dandiset Handling User->>DandiArchive: Create embargoed dandiset + DandiArchive-->>DandiArchive: No DOI updates for embargoed dandisets + User->>DandiArchive: Update embargoed dandiset + DandiArchive-->>DandiArchive: No DOI updates for embargoed dandisets + + 
User->>DandiArchive: Unembargo dandiset DandiArchive->>DataCite: Request Draft DOI (minimal info) Note right of DataCite: Only DLP URL, no metadata DataCite-->>DandiArchive: Return Draft DOI - DandiArchive-->>User: Return dandiset with Draft DOI - - User->>DandiArchive: Unembargo dandiset - DandiArchive->>DataCite: Update Draft DOI with current metadata - DataCite-->>DandiArchive: Update Draft DOI - DandiArchive-->>User: Return unembaroed dandiset + DandiArchive->>DataCite: Try to promote to Dandiset DOI (Findable) + alt Validation successful + DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) + else Validation fails + DataCite-->>DandiArchive: Keep as Dandiset DOI (Draft) + end + DandiArchive-->>User: Return unembargoed dandiset Note over User,DataCite: Dandiset Publication User->>DandiArchive: Publish dandiset - DandiArchive->>DataCite: Mint version DOI (10.48324/dandi.{id}/{version}) - DataCite-->>DandiArchive: Return Findable version DOI - - DandiArchive->>DataCite: Update dandiset-wide DOI with version metadata - alt Draft DOI - DataCite-->>DandiArchive: Update to Findable state + alt Dandiset DOI is Draft + DandiArchive->>DataCite: Promote to Dandiset DOI (Findable) + DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) else Already Findable DataCite-->>DandiArchive: Update metadata only end - + DandiArchive->>DataCite: Mint Version DOI (Findable) (10.48324/dandi.{id}/{version}) + DataCite-->>DandiArchive: Return Version DOI (Findable) + DandiArchive->>DataCite: Update Dandiset DOI with version metadata + Note right of DandiArchive: Keep URL pointing to DLP DandiArchive-->>User: Return published dandiset with both DOIs - Note over User,DataCite: Dandiset Deletion + Note over User,DataCite: Dandiset Deletion (Optional) User->>DandiArchive: Delete dandiset DandiArchive->>DataCite: Update DOI to point to deletion page DataCite-->>DandiArchive: Update DOI target URL From ac7ce09a8ac6ff51f713b97fe2f57a811dea8f92 Mon Sep 17 00:00:00 2001 From: 
Yaroslav Halchenko Date: Mon, 14 Apr 2025 15:53:39 -0400 Subject: [PATCH 11/24] Add Option 3 for delete to turn into Registered Co-authored-by: Austin Macdonald --- doc/design/doi-generation-2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 173baffd4..a91264b51 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -185,7 +185,7 @@ A django-admin script should be created and executed to create a `Dandiset DOI` - `Findable` DOI cannot be deleted, but in principle we allow for deletion of dandisets. - Option 1: We might want a dedicated 404 page for deleted dandisets, then we adjust DOI record to point to that page. - Option 2: at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.) - + - Option 3: We can "hide" the DOI by changing it from Findable to Registered - Should we do anything at dandischema level? - yes - Needs to be able to mint `Draft DOI` From edf75e64e4655ef61924c4b6476d560f3fda04ad Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Wed, 23 Apr 2025 16:05:46 -0400 Subject: [PATCH 12/24] Apply suggestions from @asmacdo code review to streamline logic/diagram Co-authored-by: Austin Macdonald --- doc/design/doi-generation-2.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index a91264b51..ce82f9304 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -6,14 +6,13 @@ Authors: Yaroslav O. Halchenko, Dorota Jarecka, Austin Macdonald This document describes an updated strategy for DOI management within the Dandi Archive. Upon creation, every public Dandiset will receive a **Dandiset DOI** that will represent the current draft and all future versions. 
-Every public published version of a Dandiset will recieve a **Version DOI**. +Every public published version of a Dandiset will receive a **Version DOI**. For example: Dandiset DOI: `https://doi.org/10.48324/dandi.000027/` Version DOI: `https://doi.org/10.48324/dandi.000027/0.210831.2033` -Prior to publication, the Dandiset DOI will refer to the latest draft version. -Following publication, the Dandiset DOI will refer to the latest published version. +The Dandiset DOI will always refer to the DLP At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "promoted" to a DataCite `Findable DOI` as soon as possible. `Version DOI` will always be a `Findable DOI`. @@ -33,7 +32,7 @@ At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "prom - [Stop injecting "fake" DOIs into draft dandisets](https://github.com/dandi/dandi-archive/issues/1709) - [Unpublished Dandisets display a DOI under `Cite As`](https://github.com/dandi/dandi-archive/issues/1932) -## Proposed Solution +## Background Initially proposed/discussed in @@ -50,7 +49,7 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org - `Findable`. Is the type we use for published dandisets. Requires to be valid (pass validation to fit the datacite schema) to be created. 
-We propose to: +## Proposed Solution - Upon creation of a **public** dandiset, mint a `Dandiset DOI` (a DataCite `Draft DOI`) `10.48324/dandi.{dandiset.id}` with - *minimal metadata* entered during creation request (title, description, license) @@ -161,7 +160,10 @@ sequenceDiagram Note over User,DataCite: Dandiset Deletion (Optional) User->>DandiArchive: Delete dandiset - DandiArchive->>DataCite: Update DOI to point to deletion page + alt Dandiset DOI is Draft + DandiArchive->>DataCite: Delete Draft DOI + else Dandiset DOI is Findable + DandiArchive->>DataCite: "hide" DOI (Convert to "Registered") and point DOI to tombstone page DataCite-->>DandiArchive: Update DOI target URL DandiArchive-->>User: Confirm deletion ``` From 51dc9d339ec05c7626608bd00834d3ad2aa7a617 Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Fri, 2 May 2025 16:07:34 -0400 Subject: [PATCH 13/24] Decide upon deletion of dandiset and DOI --- doc/design/doi-generation-2.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index ce82f9304..4bd6f5e36 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -184,10 +184,11 @@ A django-admin script should be created and executed to create a `Dandiset DOI` - Test site of datacite had different result of validation that the primary one - `Draft` DOI is not visible/usable by users. We might want to switch it to `Findable` as soon ASAP (when datacite validates record ok). -- `Findable` DOI cannot be deleted, but in principle we allow for deletion of dandisets. - - Option 1: We might want a dedicated 404 page for deleted dandisets, then we adjust DOI record to point to that page. - - Option 2: at least a message that the dandiset was deleted, and ideally describe the reason why it was deleted ("Upon request of maintainer", "Due to violation of terms of service", etc.) 
- - Option 3: We can "hide" the DOI by changing it from Findable to Registered +- Dandisets can be deleted. In such cases, upon deletion of a dandiset: + - if DOI was a `Draft` DOI - just delete it as well. + - if DOI was a `Findable` DOI - convert to `Registered` DOI (follows [datacite best practices](https://support.datacite.org/docs/tombstone-pages)) + - Also at the level of the DANDI archive itself we should provide tombstone page so URL is still "working" (#3211) + - If no tombstone page support added, just adjusted URL in datacite record to point to https://www.datacite.org/invalid.html - Should we do anything at dandischema level? - yes - Needs to be able to mint `Draft DOI` From f8b560ff60b7fba31590edf832f685281ddf17ab Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 5 May 2025 12:30:43 -0500 Subject: [PATCH 14/24] Alter text after discussion with @yarikoptic and @djarecka --- doc/design/doi-generation-2.md | 127 ++++++++++++++++++++------------- 1 file changed, 77 insertions(+), 50 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 4bd6f5e36..e55843bc7 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -14,8 +14,8 @@ Version DOI: `https://doi.org/10.48324/dandi.000027/0.210831.2033` The Dandiset DOI will always refer to the DLP -At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "promoted" to a DataCite `Findable DOI` as soon as possible. -`Version DOI` will always be a `Findable DOI`. +At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "promoted" to a DataCite `Findable DOI` as soon as there is a published version. +A `Version DOI` will be created as a `Findable DOI`. 
## The current approach @@ -51,29 +51,39 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org ## Proposed Solution -- Upon creation of a **public** dandiset, mint a `Dandiset DOI` (a DataCite `Draft DOI`) `10.48324/dandi.{dandiset.id}` with - - *minimal metadata* entered during creation request (title, description, license) - - DLP URL `https://dandiarchive.org/dandiset/{dandiset.id}` - - For embargoed dandiset, **do not** specify any metadata besides the DLP URL. - - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset. -- Upon changes to a non-embargoed, draft dandiset metadata record: - - If `Draft DOI`, attempt to "promote" it to `Findable`. - - If validation fails - keep `Draft DOI` (very limited validation), attempt to update datacite metadata record while keeping the same target URL. - - **Question to clear up**: what happens to `Draft DOI` if metadata record is invalid? It seems to create one with no metadata, but does it update only the fields it knows about? - - If `Findable DOI` - - if draft version with no prior publications, but which had legit metadata, we try to update metadata. - - **Question to clear up** If fails, we either ignore or just add a comment somewhere that "record might not reflect the most recent changes to draft version". - - **Question to clear up** If we add to validation procedures to dandiset updates, (validation against datacite metadata record), we can report errors to the user so they can be addressed prior to attempted publication. May be we should validate only if no other errors (our schema validation) were detected to reduce noise, or just give a summary that "Metadata is not satisfying datacite model, fix known metadata errors first." - - if dandiset was published at least once (has version) -- we do not update anything since DLP points to that published version. 
- - **TODO: figure out how to annotate Draft version, so it always says that it is a draft version and thus potentially not used for citation if that could be avoided** -- Upon changes to embargoed dandiset metadata record, don't do anything. -- Upon unembargoing dandiset: update `Draft DOI` metadata record with current metadata and attempt to promote to a `Findable DOI` **after** unembargoing. -- Upon publication of the dandiset: - - (already done currently) mint a proper `Findable` `Version DOI`, ie `10.48324/dandi.{dandiset.id}/{version}` - - update `Dandiset DOI` `10.48324/dandi.{dandiset.id}` with metadata provided for the `Version DOI`, **while keeping URL pointing to DLP instead of the released version**. - - if `Dandiset DOI` was a `Draft DOI` state, promote to `Findable DOI` (should work since we know metadata record passed validation). - - **Question to clear up**:how to do that in API - - **Question to clear up**: behavior on what happens if metadata record is invalid? +- For **Public dandisets**: + - Upon creation: + - mint a `Dandiset DOI` (a DataCite `Draft DOI`) `10.48324/dandi.{dandiset.id}` with *minimal metadata* entered during creation request (title, description, license) + - URL should point be DLP `https://dandiarchive.org/dandiset/{dandiset.id}` + - If minting a DOI fails, we need to raise exception to inform developers about the issue but proceed with the creation of the dandiset. 
+ - Upon updates to a draft dandiset metadata **prior to first publication**: + - Update the datacite metadata of the `Draft DOI`, (leave as draft) + - If validation fails, log error and continue + - Upon deletion of a draft dandiset metadata **prior to first publication**: + - Delete the `Dandiset DOI` (Draft) from Datacite + - Upon **first publication** of a dandiset: + - Mint a new `Version DOI` (Findable) (already done currently), ie `10.48324/dandi.{dandiset.id}/{version}` + - Update `Dandiset DOI` metadata to match published version + - promote `Dandiset DOI` (Draft) to `Findable DOI` + - Upon updates to draft dandiset metadata **after the first publication**" + - no-op. The `Dandiset DOI` metadata will match the most recent publication. + - Upon deletion of a draft dandiset metadata **after the first publication**: + - "hide" the `Dandiset DOI` (Findable) to `Registered DOI` + - Upon **subsequent publications** of a dandiset: + - Mint a new `Version DOI` + - Update `Dandiset DOI` metadata to match published version +- For **embargoed dandiset**: + - Upon creation, no DOI is created. + - Upon changes to embargoed dandiset metadata record, don't do anything. + - Upon deletion of an embargoed dandiset: + - Delete the `Dandiset DOI` (Draft) from Datacite + - Upon unembargoing dandiset: + - If there are published versions: + - Mint `Dandiset DOI` (Findable) with latest published version of metadata, + - Mint `Version DOI` for each published version. + - If there are no published versions: + - Mint `Dandiset DOI` (Draft) with latest metadata, + ### Sequence Diagram @@ -172,44 +182,61 @@ sequenceDiagram A django-admin script should be created and executed to create a `Dandiset DOI` for all existing dandisets. -**Question to address**: Will adding a `Dandiset DOI` in addition to `Version DOI` require a db migration? +No new field will be added for `Dandiset DOI`. +Instead, the `Draft Dandiset` DOI field will be where the `Dandiset DOI` is stored. 
-## Concerns to keep in mind/address +### Dandi Schema Changes -- Draft dandiset might not have sufficient (or valid) metadata to promote to a `Findable` DOI, thus causing issues with minting a DOI - - **Solution**: start with Draft (not findable) DOI, and then upon publication promote to a "findable" DOI. - - **Follow up concern**: after dandiset and DOI publish, metadata of the Draft version of the dandiset could still be changed. - This potentially making changed record again "invalid". - Should be Ok'ish -- Test site of datacite had different result of validation that the primary one +`dandi-schema` function `to_datacite` is currently only able to create a `Draft DOI` (`publish=True`) or create a `Findable DOI` (`publish=False`) + +It will need to be extended to: + - "publish" `Draft DOI` to `Findable DOI` + - "hide" `Findable DOI` to `Registered DOI` + +We will keep (and deprecate) the `publish` parameter, and add a new parameter `event` which is either: + - (None): Draft DOI + - `publish`: Findable DOI + - `hide`: Registered DOI + +## Alternatives Explored + +### Creating DOIs for Embargoed Dandisets + +We opted not to create DOIs for embargoed Dandisets because: + - We own the prefix, and so there is no need to "reserve" + - We should avoid sending any potentially secret metadata to a 3rd party, even if it is not publicly searchable. + - If we were to create a DOI with fake metadata that probably would not have any value at all. + - What the DOIs will eventually be upon publication is semantically determined, so the value can be used even prior to being "real". + +### Promoting Draft DOIs to Findable for Draft Dandisets + +There might be some value in having a `Findable DOI` (Version DOI and/or Dandiset DOI) that points to the draft version of a Dandiset. +This is because `Draft` DOI is not visible/usable by users. 
+ +However, if we promote the `Draft DOI` to `Findable` as soon as it is valid, and the user then change it to be invalid again, the DOI metadata will be wrong. +We discussed annotating the DOI, ie "potentially incorrect metadata", but we ultimately decided that the messiness is not worth the value. -- `Draft` DOI is not visible/usable by users. We might want to switch it to `Findable` as soon ASAP (when datacite validates record ok). - Dandisets can be deleted. In such cases, upon deletion of a dandiset: - if DOI was a `Draft` DOI - just delete it as well. - if DOI was a `Findable` DOI - convert to `Registered` DOI (follows [datacite best practices](https://support.datacite.org/docs/tombstone-pages)) - Also at the level of the DANDI archive itself we should provide tombstone page so URL is still "working" (#3211) - If no tombstone page support added, just adjusted URL in datacite record to point to https://www.datacite.org/invalid.html -- Should we do anything at dandischema level? - - yes - - Needs to be able to mint `Draft DOI` - - Needs to be able to promote `Draft DOI` to `Findable DOI` - - potentially model might need to change -- Should we do anything at DLP level? - - We may want to include `Dandiset DOI` somewhere, in addition to the `Version DOI` which we currently use. +## Concerns to keep in mind/address +- **Question to clear up**: what happens to `Draft DOI` if metadata record is invalid? + - It seems to create one with no metadata, but does it update only the fields it knows about? +- **Question to clear up** If we add to validation procedures to dandiset updates, (validation against datacite metadata record), we can report errors to the user so they can be addressed prior to attempted publication. May be we should validate only if no other errors (our schema validation) were detected to reduce noise, or just give a summary that "Metadata is not satisfying datacite model, fix known metadata errors first." 
+- **TODO: figure out how to annotate Draft version, so it always says that it is a draft version and thus potentially not used for citation if that could be avoided** + - We do not need to annotate `Draft DOI` metadata since it is not visible. + - If the `Dandiset DOI` is visible on the Draft Dandiset page, we should consider changing the "Cite As" or add an additional field. + - Zenodo's "Concept DOIs" are presented as "Cite all versions" but we didn't think this was clear enough. +- Test site of datacite had different result of validation that the primary one +- We may want to include `Dandiset DOI` somewhere on published versions too, in addition to the `Version DOI` which we currently use. + - The "Draft Dandiset" Version will be populated with `Dandiset DOI`, so this may not be necessary. - Should we somehow reflect interactions with DataCite in Audit log? Possible things to log: - `Dandiset DOI` - Success/Fail creation of `Draft DOI` - Success/Fail promotion of `Draft DOI` to `Findable DOI` (Expected to fail if metadata is incomplete) - `Version DOI` - Success/Fail creation of `Findable DOI` - -# Targets TODO before implementation - -- develop a script, which tests on test fabric of datacite changes as introduced to all dandisets in the archive by - - for each dandiset - - generate a record for overall *dandiset DOI* corresponding to metadata of the first release if any exists, otherwise corresponding to metadata of the draft version - - for each release: mint a new *version DOI* for that release + possibly update *dandiset DOI* to correspond to potential changes in metadata - - update *dandiset DOI* to metadata of draft version - From 83718ee3857c5baee6880065132595462f1d3391 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 5 May 2025 12:35:21 -0500 Subject: [PATCH 15/24] Chart Remove: Findable DOI prior to publication --- doc/design/doi-generation-2.md | 35 +++------------------------------- 1 file changed, 3 insertions(+), 32 deletions(-) diff 
--git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index e55843bc7..3df9ac1b9 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -110,29 +110,6 @@ sequenceDiagram alt No Dandiset DOI exists (previous mint failed) DandiArchive->>DataCite: Mint Dandiset DOI (Draft) DataCite-->>DandiArchive: Return Dandiset DOI (Draft) - DandiArchive->>DataCite: Try to promote to Dandiset DOI (Findable) - alt Validation successful - DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) - else Validation fails - DataCite-->>DandiArchive: Keep as Dandiset DOI (Draft) - end - else Dandiset DOI (Draft) exists - DandiArchive->>DataCite: Try to promote to Dandiset DOI (Findable) - alt Validation successful - DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) - else Validation fails - DataCite-->>DandiArchive: Keep as Dandiset DOI (Draft) - DandiArchive->>DataCite: Update metadata record only - end - else Dandiset DOI (Findable) exists (no prior publications) - DandiArchive->>DataCite: Update metadata - alt Validation successful - DataCite-->>DandiArchive: Update metadata - else Validation fails - DandiArchive-->>DandiArchive: Log error, continue - end - else Dandiset has published versions - DandiArchive-->>DandiArchive: No DOI update (points to published version) end DandiArchive-->>User: Return updated dandiset @@ -146,20 +123,14 @@ sequenceDiagram DandiArchive->>DataCite: Request Draft DOI (minimal info) Note right of DataCite: Only DLP URL, no metadata DataCite-->>DandiArchive: Return Draft DOI - DandiArchive->>DataCite: Try to promote to Dandiset DOI (Findable) - alt Validation successful - DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) - else Validation fails - DataCite-->>DandiArchive: Keep as Dandiset DOI (Draft) - end DandiArchive-->>User: Return unembargoed dandiset Note over User,DataCite: Dandiset Publication User->>DandiArchive: Publish dandiset - alt Dandiset DOI is Draft + alt Dandiset DOI is 
Draft (first publication) DandiArchive->>DataCite: Promote to Dandiset DOI (Findable) DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) - else Already Findable + else Already Findable (already published at least once) DataCite-->>DandiArchive: Update metadata only end DandiArchive->>DataCite: Mint Version DOI (Findable) (10.48324/dandi.{id}/{version}) @@ -168,7 +139,7 @@ sequenceDiagram Note right of DandiArchive: Keep URL pointing to DLP DandiArchive-->>User: Return published dandiset with both DOIs - Note over User,DataCite: Dandiset Deletion (Optional) + Note over User,DataCite: Dandiset Deletion User->>DandiArchive: Delete dandiset alt Dandiset DOI is Draft DandiArchive->>DataCite: Delete Draft DOI From a3f13e25e19d44202e9d48231299363530dd2d07 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 5 May 2025 12:43:01 -0500 Subject: [PATCH 16/24] remove concern about test vs prod validation could not reproduce --- doc/design/doi-generation-2.md | 1 - 1 file changed, 1 deletion(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 3df9ac1b9..941cab655 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -202,7 +202,6 @@ We discussed annotating the DOI, ie "potentially incorrect metadata", but we ult - We do not need to annotate `Draft DOI` metadata since it is not visible. - If the `Dandiset DOI` is visible on the Draft Dandiset page, we should consider changing the "Cite As" or add an additional field. - Zenodo's "Concept DOIs" are presented as "Cite all versions" but we didn't think this was clear enough. -- Test site of datacite had different result of validation that the primary one - We may want to include `Dandiset DOI` somewhere on published versions too, in addition to the `Version DOI` which we currently use. - The "Draft Dandiset" Version will be populated with `Dandiset DOI`, so this may not be necessary. - Should we somehow reflect interactions with DataCite in Audit log? 
Possible things to log: From 20bf562f1d7cd401a366c58bbae0aad3125316a2 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 5 May 2025 12:44:17 -0500 Subject: [PATCH 17/24] Add Dandiset DOI to dandi-schema TODO --- doc/design/doi-generation-2.md | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 941cab655..1ab8c0032 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -163,6 +163,7 @@ Instead, the `Draft Dandiset` DOI field will be where the `Dandiset DOI` is stor It will need to be extended to: - "publish" `Draft DOI` to `Findable DOI` - "hide" `Findable DOI` to `Registered DOI` + - Produce `Dandiset DOI` and `Version DOI` (only does version DOI currently) We will keep (and deprecate) the `publish` parameter, and add a new parameter `event` which is either: - (None): Draft DOI From 6600849d5093ee406ea7cef9454d0547de4e200b Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 5 May 2025 12:57:56 -0500 Subject: [PATCH 18/24] fixup: mermaid syntax and language --- doc/design/doi-generation-2.md | 48 ++++++++++++++++++++++------------ 1 file changed, 32 insertions(+), 16 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 1ab8c0032..dc33caac0 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -110,42 +110,58 @@ sequenceDiagram alt No Dandiset DOI exists (previous mint failed) DandiArchive->>DataCite: Mint Dandiset DOI (Draft) DataCite-->>DandiArchive: Return Dandiset DOI (Draft) + else Dandiset DOI exists + DandiArchive->>DataCite: Update metadata of Dandiset DOI (Draft) + alt Update successful + DataCite-->>DandiArchive: Confirm update + else Update fails + DataCite-->>DandiArchive: Error + Note right of DandiArchive: Log error but continue + end end DandiArchive-->>User: Return updated dandiset Note over User,DataCite: Embargoed Dandiset Handling User->>DandiArchive: Create 
embargoed dandiset - DandiArchive-->>DandiArchive: No DOI updates for embargoed dandisets + DandiArchive-->>DandiArchive: No DOI created for embargoed dandisets + DandiArchive-->>User: Return dandiset without DOI + User->>DandiArchive: Update embargoed dandiset DandiArchive-->>DandiArchive: No DOI updates for embargoed dandisets + DandiArchive-->>User: Return updated dandiset User->>DandiArchive: Unembargo dandiset - DandiArchive->>DataCite: Request Draft DOI (minimal info) - Note right of DataCite: Only DLP URL, no metadata - DataCite-->>DandiArchive: Return Draft DOI - DandiArchive-->>User: Return unembargoed dandiset + DandiArchive->>DataCite: Mint Dandiset DOI (Draft) + Note right of DataCite: DLP URL + current metadata + DataCite-->>DandiArchive: Return Dandiset DOI (Draft) + DandiArchive-->>User: Return unembargoed dandiset with DOI Note over User,DataCite: Dandiset Publication User->>DandiArchive: Publish dandiset + DataCite-->>DandiArchive: Return Version DOI (Findable) + alt Dandiset DOI is Draft (first publication) - DandiArchive->>DataCite: Promote to Dandiset DOI (Findable) - DataCite-->>DandiArchive: Update to Dandiset DOI (Findable) + DandiArchive->>DataCite: Mint Version DOI (Findable) (10.48324/dandi.{id}/{version}) + DandiArchive->>DataCite: Update Dandiset DOI with version metadata + DandiArchive->>DataCite: Promote Dandiset DOI to Findable + DataCite-->>DandiArchive: Confirm update else Already Findable (already published at least once) - DataCite-->>DandiArchive: Update metadata only + DandiArchive->>DataCite: Update Dandiset DOI metadata + DataCite-->>DandiArchive: Confirm update end - DandiArchive->>DataCite: Mint Version DOI (Findable) (10.48324/dandi.{id}/{version}) - DataCite-->>DandiArchive: Return Version DOI (Findable) - DandiArchive->>DataCite: Update Dandiset DOI with version metadata - Note right of DandiArchive: Keep URL pointing to DLP + Note right of DandiArchive: Dandiset DOI keeps URL pointing to DLP DandiArchive-->>User: Return 
published dandiset with both DOIs Note over User,DataCite: Dandiset Deletion User->>DandiArchive: Delete dandiset alt Dandiset DOI is Draft - DandiArchive->>DataCite: Delete Draft DOI - else Dandiset DOI is Findable - DandiArchive->>DataCite: "hide" DOI (Convert to "Registered") and point DOI to tombstone page - DataCite-->>DandiArchive: Update DOI target URL + DandiArchive->>DataCite: Delete Draft DOI + DataCite-->>DandiArchive: Confirm deletion + else Dandiset DOI is Findable + DandiArchive->>DataCite: "Hide" DOI (Convert to "Registered") + DandiArchive->>DataCite: Point DOI to tombstone page + DataCite-->>DandiArchive: Confirm update + end DandiArchive-->>User: Confirm deletion ``` From 8562dbe63bfe09f58835d7860cff32fcd1740008 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Tue, 13 May 2025 11:04:41 -0500 Subject: [PATCH 19/24] Apply suggestions from code review Co-authored-by: Yaroslav Halchenko --- doc/design/doi-generation-2.md | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index dc33caac0..5ae99952c 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -67,22 +67,19 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org - promote `Dandiset DOI` (Draft) to `Findable DOI` - Upon updates to draft dandiset metadata **after the first publication**" - no-op. The `Dandiset DOI` metadata will match the most recent publication. 
- - Upon deletion of a draft dandiset metadata **after the first publication**: - - "hide" the `Dandiset DOI` (Findable) to `Registered DOI` + - Upon deletion of a published dandiset version (`VersionViewSet.destroy`) : + - "hide" the `Version DOI` (Findable) to `Registered DOI` + - Upon deletion of a dandiset (`DandisetViewSet.destroy`): + - "hide" the `Dandiset DOI` if `Findable` and delete if `Draft` - Upon **subsequent publications** of a dandiset: - Mint a new `Version DOI` - Update `Dandiset DOI` metadata to match published version - For **embargoed dandiset**: - Upon creation, no DOI is created. - Upon changes to embargoed dandiset metadata record, don't do anything. - - Upon deletion of an embargoed dandiset: - - Delete the `Dandiset DOI` (Draft) from Datacite + - Upon deletion of an embargoed dandiset: don't do anything. - Upon unembargoing dandiset: - - If there are published versions: - - Mint `Dandiset DOI` (Findable) with latest published version of metadata, - - Mint `Version DOI` for each published version. - - If there are no published versions: - - Mint `Dandiset DOI` (Draft) with latest metadata, + - Mint `Dandiset DOI` (Draft) with latest metadata, ### Sequence Diagram @@ -169,8 +166,8 @@ sequenceDiagram A django-admin script should be created and executed to create a `Dandiset DOI` for all existing dandisets. -No new field will be added for `Dandiset DOI`. -Instead, the `Draft Dandiset` DOI field will be where the `Dandiset DOI` is stored. +No DB migration will be needed, as no new field will be added to `Dandiset` model, and +instead, the `Dandiset DOI` will be stored in the "draft" `Version`. ### Dandi Schema Changes @@ -195,6 +192,8 @@ We opted not to create DOIs for embargoed Dandisets because: - We should avoid sending any potentially secret metadata to a 3rd party, even if it is not publicly searchable. - If we were to create a DOI with fake metadata that probably would not have any value at all. 
- What the DOIs will eventually be upon publication is semantically determined, so the value can be used even prior to being "real". + + We might reconsider, if decision would be made to expose metadata of Embargoed Dandisets for the purpose of discovery. ### Promoting Draft DOIs to Findable for Draft Dandisets From c499cc1ec0f042209047d69d7b9b14e7b67f7a4a Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Thu, 15 May 2025 13:37:11 -0500 Subject: [PATCH 20/24] Update Schema changes based on discussion with @yarikoptic --- doc/design/doi-generation-2.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 5ae99952c..f78eabaf8 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -183,6 +183,15 @@ We will keep (and deprecate) the `publish` parameter, and add a new parameter `e - `publish`: Findable DOI - `hide`: Registered DOI +In the current implementation, only published dandisets are given a DOI, so we are using the pydantic validation for `PublishedDandiset`. +This is too restrictive for our case. +Instead, we'll try `PublishedDandiset` first, then fallback to `Dandiset`, then fall back to unvalidated. + +#### (Option considered but rejected) Prevent Findable DOIs when not validated + +If we fallback to unvalidated, we could prevent the DOI from becoming findable. +Instead though, we've opted to just try to update the DOI via Datacite anyway and handle the API failure if it happens. + ## Alternatives Explored ### Creating DOIs for Embargoed Dandisets @@ -192,7 +201,7 @@ We opted not to create DOIs for embargoed Dandisets because: - We should avoid sending any potentially secret metadata to a 3rd party, even if it is not publicly searchable. - If we were to create a DOI with fake metadata that probably would not have any value at all. 
- What the DOIs will eventually be upon publication is semantically determined, so the value can be used even prior to being "real". - + We might reconsider, if decision would be made to expose metadata of Embargoed Dandisets for the purpose of discovery. ### Promoting Draft DOIs to Findable for Draft Dandisets From 04092bc33c71fff9cf2fb603a4a59509c537c555 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Thu, 15 May 2025 13:47:28 -0500 Subject: [PATCH 21/24] Add cautions (gotchas) to be explicit about settings behavior --- doc/design/doi-generation-2.md | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index f78eabaf8..ec18e53e5 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -81,6 +81,21 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org - Upon unembargoing dandiset: - Mint `Dandiset DOI` (Draft) with latest metadata, +### Cautions + +If DANDI_DOI_PUBLISH is false (default) + - creation as `Findable` should be disabled + - update to `Findable` and `Registered` should be disabled + +If all DOI configuration options are not set: + - all required options: + - `DANDI_DOI_API_URL` + - `DANDI_DOI_API_URL` + - `DANDI_DOI_API_PASSWORD` + - `DANDI_DOI_API_PASSWORD` + - DOIs CRUD through Datacite API should be entirely disabled + - DOI (the string) should not be added to the version + ### Sequence Diagram @@ -187,12 +202,13 @@ In the current implementation, only published dandisets are given a DOI, so we a This is too restrictive for our case. Instead, we'll try `PublishedDandiset` first, then fallback to `Dandiset`, then fall back to unvalidated. -#### (Option considered but rejected) Prevent Findable DOIs when not validated +## Alternatives Explored + +#### Prevent Findable DOIs when not validated If we fallback to unvalidated, we could prevent the DOI from becoming findable. 
Instead though, we've opted to just try to update the DOI via Datacite anyway and handle the API failure if it happens. -## Alternatives Explored ### Creating DOIs for Embargoed Dandisets From b6b4338cb927b6e73778f77bf3d4651103ebfdfa Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 2 Jun 2025 10:08:07 -0500 Subject: [PATCH 22/24] fixup formatting --- doc/design/doi-generation-2.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index ec18e53e5..8d9a4051e 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -9,10 +9,10 @@ Upon creation, every public Dandiset will receive a **Dandiset DOI** that will r Every public published version of a Dandiset will receive a **Version DOI**. For example: -Dandiset DOI: `https://doi.org/10.48324/dandi.000027/` -Version DOI: `https://doi.org/10.48324/dandi.000027/0.210831.2033` + - Dandiset DOI: `https://doi.org/10.48324/dandi.000027/` + - Version DOI: `https://doi.org/10.48324/dandi.000027/0.210831.2033` -The Dandiset DOI will always refer to the DLP +Dandiset DOI redirect will always refer to the DLP. At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "promoted" to a DataCite `Findable DOI` as soon as there is a published version. A `Version DOI` will be created as a `Findable DOI`. From c54bd38a38d2a749fce042474010b292f3eda007 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 2 Jun 2025 10:36:08 -0500 Subject: [PATCH 23/24] chore: cleanup format and clarify language --- doc/design/doi-generation-2.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 8d9a4051e..693432fc3 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -14,8 +14,9 @@ For example: Dandiset DOI redirect will always refer to the DLP. 
-At creation the `Dandiset DOI` will be a DataCite `Draft DOI`, but will be "promoted" to a DataCite `Findable DOI` as soon as there is a published version. -A `Version DOI` will be created as a `Findable DOI`. +At creation the `Dandiset DOI` will be a DataCite `Draft DOI`. +`Dandiset DOI` will remain a `Draft DOI` until there is a published version, at which point we will "promote" it to a DataCite `Findable DOI`. +For each published version there will be a `Version DOI` created as a `Findable DOI`. ## The current approach @@ -79,7 +80,7 @@ DataCite allows for three types of DOIs ([DataCite](https://support.datacite.org - Upon changes to embargoed dandiset metadata record, don't do anything. - Upon deletion of an embargoed dandiset: don't do anything. - Upon unembargoing dandiset: - - Mint `Dandiset DOI` (Draft) with latest metadata, + - Mint `Dandiset DOI` (Draft) with latest metadata ### Cautions @@ -90,9 +91,9 @@ If DANDI_DOI_PUBLISH is false (default) If all DOI configuration options are not set: - all required options: - `DANDI_DOI_API_URL` - - `DANDI_DOI_API_URL` - - `DANDI_DOI_API_PASSWORD` + - `DANDI_DOI_API_USER` - `DANDI_DOI_API_PASSWORD` + - `DANDI_DOI_API_PREFIX` - DOIs CRUD through Datacite API should be entirely disabled - DOI (the string) should not be added to the version @@ -228,7 +229,7 @@ This is because `Draft` DOI is not visible/usable by users. However, if we promote the `Draft DOI` to `Findable` as soon as it is valid, and the user then change it to be invalid again, the DOI metadata will be wrong. We discussed annotating the DOI, ie "potentially incorrect metadata", but we ultimately decided that the messiness is not worth the value. -- Dandisets can be deleted. In such cases, upon deletion of a dandiset: +How Findable DOIs for Draft Dandisets would work upon deletion of a dandiset: - if DOI was a `Draft` DOI - just delete it as well.
- if DOI was a `Findable` DOI - convert to `Registered` DOI (follows [datacite best practices](https://support.datacite.org/docs/tombstone-pages)) - Also at the level of the DANDI archive itself we should provide tombstone page so URL is still "working" (#3211) From 24ea6eb26146f31cbabbe6a4bc2f969ebefcb229 Mon Sep 17 00:00:00 2001 From: Austin Macdonald Date: Mon, 2 Jun 2025 10:40:24 -0500 Subject: [PATCH 24/24] We opted to fallback to unvalidated fallback to Dandiset pydantic model would require changes to that model. --- doc/design/doi-generation-2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/design/doi-generation-2.md b/doc/design/doi-generation-2.md index 693432fc3..e2e4d9fcd 100644 --- a/doc/design/doi-generation-2.md +++ b/doc/design/doi-generation-2.md @@ -201,7 +201,7 @@ We will keep (and deprecate) the `publish` parameter, and add a new parameter `e In the current implementation, only published dandisets are given a DOI, so we are using the pydantic validation for `PublishedDandiset`. This is too restrictive for our case. -Instead, we'll try `PublishedDandiset` first, then fallback to `Dandiset`, then fall back to unvalidated. +Instead, we'll try `PublishedDandiset` first, then fallback to unvalidated. ## Alternatives Explored
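Reviewer note on the final state of the series: the rule that ends patch 24 (try `PublishedDandiset` validation first, then fall back to unvalidated) together with the `event` values (`publish`, `hide`) and the `DANDI_DOI_PUBLISH` caution could be sketched roughly as below. This is only an illustration under stated assumptions. `DoiRequest`, `build_doi_request`, `require_name`, and the `publish_enabled` flag are hypothetical names, not part of dandi-archive or dandischema, and the stand-in validator replaces the real pydantic model.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# All names below are hypothetical illustrations, not the dandi-archive API.
# DataCite `event` values named in the design doc; None leaves a Draft DOI as Draft.
ALLOWED_EVENTS = (None, "publish", "hide")


@dataclass
class DoiRequest:
    doi: str
    url: str
    event: Optional[str]
    validated: bool


def build_doi_request(
    metadata: dict,
    dandiset_id: str,
    event: Optional[str],
    validate_published: Callable[[dict], None],
    publish_enabled: bool = False,
) -> DoiRequest:
    """Prepare a Dandiset DOI request, falling back to unvalidated metadata."""
    if event not in ALLOWED_EVENTS:
        raise ValueError(f"unknown DOI event: {event!r}")
    # Assumed reading of the DANDI_DOI_PUBLISH caution: when it is false,
    # transitions to Findable ("publish") or Registered ("hide") are skipped.
    if not publish_enabled:
        event = None
    # Try the strict (PublishedDandiset-like) validation first ...
    try:
        validate_published(metadata)
        validated = True
    except ValueError:
        # ... then fall back to unvalidated; a DataCite API failure would be
        # handled by the caller when the request is actually sent.
        validated = False
    return DoiRequest(
        doi=f"10.48324/dandi.{dandiset_id}",
        url=f"https://dandiarchive.org/dandiset/{dandiset_id}",
        event=event,
        validated=validated,
    )


def require_name(metadata: dict) -> None:
    """Stand-in validator; the real check would be the pydantic model."""
    if "name" not in metadata:
        raise ValueError("missing name")


request = build_doi_request({}, "000027", "publish", require_name, publish_enabled=True)
print(request)
```

Keeping the validation outcome as a flag on the request, rather than blocking the call, mirrors the decision recorded in the series to attempt the DataCite update anyway and handle the API failure if it happens.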