Conversation

@ErykKul
Collaborator

@ErykKul ErykKul commented Nov 4, 2025

What this PR does / why we need it:
Bug fix.

Which issue(s) this PR closes:

Special notes for your reviewer:

  • see also the test; it illustrates how the bug can happen

Suggestions on how to test this:

  • The bug is hard to reproduce manually; the easiest way was to write a test for it, which also ensures that no regression can happen.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

  • no

Is there a release notes update needed for this change?:

  • Probably not. Maybe a one-liner: "bugfix: controlled vocab values validation when empty"?

Additional documentation:

@github-actions github-actions bot added the Type: Bug a defect label Nov 4, 2025
@coveralls

coveralls commented Nov 4, 2025

Coverage Status

coverage: 24.318% (+0.001%) from 24.317%
when pulling 08bfa30 on 11900-improved-cvoc-value-validation
into c4b4e82 on develop.

@pdurbin pdurbin moved this to Ready for Triage in IQSS Dataverse Project Nov 4, 2025
@qqmyers qqmyers added the Size: 3 A percentage of a sprint. 2.1 hours. label Nov 5, 2025
@scolapasta scolapasta moved this from Ready for Triage to Ready for Review ⏩ in IQSS Dataverse Project Nov 12, 2025
@qqmyers qqmyers moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Nov 12, 2025
@qqmyers qqmyers self-assigned this Nov 12, 2025
i.remove();
continue;
}
if (isControlledVocabulary && dsf.getDatasetFieldType().getControlledVocabularyValue(v) == null) {

Member

Do you know if the changes in this class are needed? (I can see why the change in DatasetPage is important.) At a minimum, I think that if isControlledVocabulary is true, there shouldn't ever be a regular value unless it's temporary, so you could always remove it in that case (possibly just adding || isControlledVocabulary to line 1795 instead of a second if statement).

Beyond that, though: we are hopefully never adding a temporary value that is not blank or NA_VALUE, so I'd think the original code should catch and remove the temp value without changes in this class (given your change to DatasetPage). If not, that is, if temp values that are other strings are getting added, maybe we should find and fix those? Or catch them for normal fields too, and not just for controlled vocab?
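
For readers following the suggestion above, here is a minimal, self-contained sketch of what a single combined removal condition could look like. It is not the Dataverse code: the class name, the method `removeTemporaryValues`, the use of plain strings instead of `DatasetFieldValue`, and the literal value of `NA_VALUE` are all assumptions made for illustration, and the real condition on line 1795 is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TempValueCleanupSketch {

    // Assumed marker for "not applicable"; the thread only names it NA_VALUE.
    static final String NA_VALUE = "N/A";

    /**
     * Removes temporary plain values in one pass (hypothetical helper).
     * Blank and NA values are always removed; for a controlled-vocabulary
     * field every plain value is treated as temporary and removed as well.
     */
    static void removeTemporaryValues(List<String> plainValues, boolean isControlledVocabulary) {
        Iterator<String> i = plainValues.iterator();
        while (i.hasNext()) {
            String v = i.next();
            boolean blankOrNa = v == null || v.trim().isEmpty() || NA_VALUE.equals(v);
            // One combined condition instead of a second if statement.
            if (blankOrNa || isControlledVocabulary) {
                i.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> cvValues = new ArrayList<>(Arrays.asList("__placeholder__", "N/A"));
        removeTemporaryValues(cvValues, true);
        System.out.println(cvValues);   // prints []

        List<String> textValues = new ArrayList<>(Arrays.asList("", "N/A", "real title"));
        removeTemporaryValues(textValues, false);
        System.out.println(textValues); // prints [real title]
    }
}
```

With this shape, a normal field keeps its real values, while a controlled-vocabulary field never retains a plain value of any kind.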

Member

@ErykKul - just a ping - this is waiting on you for a response

Collaborator Author

I was mixing two things here. The code in this class was for removing vocab values that no longer exist, but I am not sure that was a good idea, and it definitely should not be mixed into this PR with the fix for "valid" empty values. I first thought that was the reason for fields being empty yet valid: an old controlled vocabulary value that was removed from the metadata block shows as empty, but it is neither empty nor the NA value. However, that was not the cause of the bug; I was able to reproduce it the way the issue describes.

Collaborator Author

I reverted the file.

Collaborator Author

I also removed the unit test; it was testing a not-in-vocabulary value in a CV field:

        DatasetField subjectField = new DatasetField();
        subjectField.setDatasetFieldType(subjectType);
        subjectField.setDatasetFieldValues(new ArrayList<>());
        subjectField.setControlledVocabularyValues(new ArrayList<>());
        DatasetFieldValue orphanValue = new DatasetFieldValue(subjectField);
        orphanValue.setValue("__placeholder__");
        subjectField.getDatasetFieldValues().add(orphanValue);

That is not the cause of the bug, so the test was testing the wrong thing (and it started failing after the cleanup of the file).

Collaborator Author

I also re-added workingVersion.validate(); in DatasetPage; it has side effects we still want (it sets validation messages on dataset fields, which are then shown in the UI).

@qqmyers
Member

qqmyers commented Nov 12, 2025

FWIW: The previous build had a failure in the Harvest from DataCite test, but just rerunning the job fixed it, so presumably random/transient and not related to this PR.

@cmbz cmbz added FY26 Sprint 10 FY26 Sprint 10 (2025-11-05 - 2025-11-19) FY26 Sprint 11 FY26 Sprint 11 (2025-11-20 - 2025-12-03) labels Nov 20, 2025

Member

@qqmyers qqmyers left a comment

Looks good. isValid() checks that the version is valid after removal of temporary N/A values, which makes sense. I'm not sure the underlying code is very efficient (it clones the version), but we use it elsewhere.
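
As a rough illustration of the check described in this review, the sketch below validates a copy of each field after stripping temporary values, so the original data is left untouched. Everything in it is hypothetical: the `Field` class, the `isValid` helper, and the literal `NA_VALUE` are stand-ins for illustration, not the Dataverse `DatasetVersion` API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ValidAfterCleanupSketch {

    // Assumed marker for temporary "not applicable" values.
    static final String NA_VALUE = "N/A";

    /** Hypothetical stand-in for a dataset field with plain string values. */
    static final class Field {
        final String name;
        final boolean required;
        final List<String> values;

        Field(String name, boolean required, List<String> values) {
            this.name = name;
            this.required = required;
            this.values = new ArrayList<>(values);
        }

        Field copy() {
            return new Field(name, required, values);
        }
    }

    /** Returns true if every required field still has a value after cleanup. */
    static boolean isValid(List<Field> version) {
        for (Field original : version) {
            Field copy = original.copy(); // work on a copy, keep the original intact
            copy.values.removeIf(v -> v == null || v.trim().isEmpty() || NA_VALUE.equals(v));
            if (copy.required && copy.values.isEmpty()) {
                return false; // a required field that is effectively empty is invalid
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Field subject = new Field("subject", true, Arrays.asList("N/A"));
        Field title = new Field("title", true, Arrays.asList("My dataset"));

        System.out.println(isValid(Arrays.asList(subject, title))); // prints false
        System.out.println(isValid(Arrays.asList(title)));          // prints true
        System.out.println(subject.values);                         // prints [N/A]
    }
}
```

Copying before cleanup costs extra allocations, which matches the efficiency concern raised above, but it keeps the check free of side effects on the version being edited.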

@github-project-automation github-project-automation bot moved this from In Review 🔎 to Ready for QA ⏩ in IQSS Dataverse Project Dec 1, 2025
@cmbz cmbz added the FY26 Sprint 12 FY26 Sprint 12 (2025-12-03 - 2025-12-17) label Dec 3, 2025
@stevenwinship stevenwinship moved this from Ready for QA ⏩ to QA ✅ in IQSS Dataverse Project Dec 4, 2025
@stevenwinship stevenwinship self-assigned this Dec 4, 2025
@stevenwinship
Contributor

stevenwinship commented Dec 4, 2025

@stevenferey @ErykKul
Unable to reproduce this bug. More info is in the main issue.

@pdurbin pdurbin moved this from QA ✅ to In Review 🔎 in IQSS Dataverse Project Dec 5, 2025
@cmbz cmbz added the FY26 Sprint 13 FY26 Sprint 13 (2025-12-17 - 2025-12-31) label Dec 17, 2025
@cmbz cmbz added the FY26 Sprint 14 FY26 Sprint 14 (2025-12-31 - 2026-01-14) label Dec 31, 2025
@cmbz cmbz added the FY26 Sprint 15 FY26 Sprint 15 (2026-01-14 - 2026-01-28) label Jan 15, 2026
@github-actions

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:11900-improved-cvoc-value-validation
ghcr.io/gdcc/configbaker:11900-improved-cvoc-value-validation

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

Labels

FY26 Sprint 10 FY26 Sprint 10 (2025-11-05 - 2025-11-19) FY26 Sprint 11 FY26 Sprint 11 (2025-11-20 - 2025-12-03) FY26 Sprint 12 FY26 Sprint 12 (2025-12-03 - 2025-12-17) FY26 Sprint 13 FY26 Sprint 13 (2025-12-17 - 2025-12-31) FY26 Sprint 14 FY26 Sprint 14 (2025-12-31 - 2026-01-14) FY26 Sprint 15 FY26 Sprint 15 (2026-01-14 - 2026-01-28) Size: 3 A percentage of a sprint. 2.1 hours. Type: Bug a defect Waiting

Projects

Status: In Review 🔎

Development

Successfully merging this pull request may close these issues.

Required controlledVocabulary metadata marked as valid while empty

6 participants