Description
What steps does it take to reproduce the issue?
- Upload a file to a dataset via API that has the same checksum as an existing file (duplicate file)
- The API returns success with a duplicate warning, but the message field is malformed
Example:
curl -X POST "https://demo.dataverse.org/api/datasets/:persistentId/add?persistentId=doi:..." \
-H "X-Dataverse-key: $API_TOKEN" \
-F "[email protected]"
Expected response format:
{
"status": "OK",
"message": "This file has the same content as existing-file.txt that is in the dataset.",
"data": { ... }
}
Actual response format:
{
"status": "OK",
"message": {
"message": "This file has the same content as existing-file.txt that is in the dataset."
},
"data": { ... }
}
- When does this issue occur?
This occurs whenever the ok(String msg, JsonObjectBuilder bld) method in AbstractApiBean.java is called. The bug was introduced in commit f311312c34 on May 12, 2020, as part of issue #4813 ("File Upload - allow files with same MD5 in a dataset").
- Which page(s) does it occur on?
This affects 7 API endpoints that use the buggy ok(String, JsonObjectBuilder) method:
| File | Line | API Endpoint |
|---|---|---|
| src/main/java/edu/harvard/iq/dataverse/api/Datasets.java | 3087 | POST /api/datasets/{id}/add (duplicate file warning) |
| src/main/java/edu/harvard/iq/dataverse/api/Admin.java | 233 | PUT /api/admin/settings |
| src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java | 754 | PUT /api/dataverses/{id} |
| src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java | 788 | PUT /api/dataverses/{id}/inputLevels |
| src/main/java/edu/harvard/iq/dataverse/api/SavedSearches.java | 161 | POST /api/admin/savedsearches |
| src/main/java/edu/harvard/iq/dataverse/api/HarvestingClients.java | 309 | PUT /api/harvest/clients/{nickName} |
| src/main/java/edu/harvard/iq/dataverse/api/HarvestingServer.java | 228 | PUT /api/harvest/server/oaisets/{specname} |
- What happens?
The root cause is in src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java lines 997-1003:
protected Response ok( String msg, JsonObjectBuilder bld ) {
return Response.ok().entity(Json.createObjectBuilder()
.add("status", ApiConstants.STATUS_OK)
.add("message", Json.createObjectBuilder().add("message",msg)) // BUG: wraps in extra object
.add("data", bld).build())
.type(MediaType.APPLICATION_JSON)
.build();
}
Line 1000 should be:
.add("message", msg)
- To whom does it occur (all users, curators, superusers)?
All API users who call the affected endpoints and parse the message field as a string (as documented). Most commonly encountered when uploading duplicate files via API, which triggers the duplicate warning message path in Datasets.java.
Impact:
- API clients that strictly type the message field as a string will fail to parse responses from affected endpoints
- This violates the documented API response format where message is expected to be a string
- The bug has existed since Dataverse 5.0 (May 2020) but is rarely encountered because:
  - Most endpoints use other ok() overloads that work correctly
  - The duplicate file warning path is only triggered when uploading files with matching checksums
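Until a fix is released, a client could tolerate both shapes of the message field. The following is only a minimal sketch of that idea, not part of Dataverse: the class name MessageNormalizer and the method extractMessage are hypothetical, and it assumes the response has already been parsed by some JSON library into plain Java values (a String for a JSON string, a Map for a JSON object).

```java
import java.util.Map;

// Hypothetical client-side helper; not part of the Dataverse code base.
final class MessageNormalizer {

    /**
     * Returns the message text whether the server sent a plain string
     * (the documented format) or the nested {"message": "..."} object
     * produced by the affected ok(String, JsonObjectBuilder) overload.
     *
     * Assumption: rawMessage is the already-parsed value of the
     * top-level "message" key (String or Map), or null if absent.
     */
    static String extractMessage(Object rawMessage) {
        if (rawMessage == null) {
            return null; // no message in the response
        }
        if (rawMessage instanceof String) {
            return (String) rawMessage; // documented shape
        }
        if (rawMessage instanceof Map) {
            // Malformed shape: unwrap the inner "message" entry.
            Object inner = ((Map<?, ?>) rawMessage).get("message");
            if (inner instanceof String) {
                return (String) inner;
            }
        }
        throw new IllegalArgumentException(
                "Unexpected message shape: " + rawMessage.getClass().getName());
    }

    public static void main(String[] args) {
        String text = "This file has the same content as existing-file.txt that is in the dataset.";
        // Both calls print the same text.
        System.out.println(extractMessage(text));
        System.out.println(extractMessage(Map.of("message", text)));
    }
}
```

Accepting both shapes means the same client code keeps working before and after the server-side fix, at the cost of hiding the malformed response from stricter validation.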
Related issue: #4813