Conversation

@crkaz
Collaborator

@crkaz crkaz commented Sep 25, 2025

I've tried to automate this and make it repeatable with a set of scripts and documentation, but maybe some of this doesn't belong in the repo?

Either way, I've added the blacklist mechanism and some test coverage. The final list, and the rules in the scripts that generate it, would benefit from a second opinion.

Please let me know if you have any concerns.

@crkaz crkaz requested a review from bebbi September 25, 2025 22:34
@crkaz crkaz linked an issue Sep 25, 2025 that may be closed by this pull request
Owner

@bebbi bebbi left a comment

These are interesting scripts. We can keep them, as they are well described.
I couldn't see the exact difference between two of them, but I haven't looked deeply.
Do the scripts only list attributes that are not already excluded/blacklisted? I think that would be useful, see individual comments.

I think the list is a great start but needs a manual review, as per the detailed comments.
I think we're handling attribute type correctly, right?
An optional improvement would be to have everything in scripts/ be TypeScript, unless that's difficult.
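
A minimal sketch of the "only list attributes not already excluded/blacklisted" idea, assuming the scripts produce a flat list of attribute keywords and that the already-handled names can be collected into a set. `filterUnhandled`, `candidateKeywords`, and `alreadyHandled` are hypothetical names for illustration, not identifiers from this PR.

```typescript
// Hypothetical helper: keep only candidate keywords that no existing rule covers.
// The inputs are assumptions; the real scripts may structure their data differently.
function filterUnhandled(
  candidateKeywords: readonly string[],
  alreadyHandled: ReadonlySet<string>,
): string[] {
  // Deduplicate first, then drop anything already excluded or blacklisted.
  const unique = Array.from(new Set(candidateKeywords))
  return unique.filter((keyword) => !alreadyHandled.has(keyword))
}

// Example usage with keywords taken from this thread.
const handledExample = new Set<string>(['ContentDescription'])
const remainingExample = filterUnhandled(
  ['ContentDescription', 'ReceivingPresentationAddress', 'ReceivingPresentationAddress'],
  handledExample,
)
console.log(remainingExample) // names that still need a manual decision
```

With output like this, the manual review only has to look at names that no other rule already decides.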

'ConfigurationInformationDescription',
'ContainerComponentDescription',
'ContentDescription',
'ContextGroupExtensionCreatorUID',
Owner

I think all UIDs are safe? Can we exclude them from the manual blacklist?

Collaborator Author

They should be accepted internally (despite being in the blacklist) as per the comments, but yes it's cleaner for me to remove them here :)

'PatientEyeMovementCommandCodeSequence',
'PatientEyeMovementCommanded',
'PatientGantryRelationshipCodeSequence',
'PatientIdentityRemoved',
Owner

This one breaks things if blacklisted (see other logic in code).
Could you do a manual review of the list to avoid entries like this?

'PatientEyeMovementCommanded',
'PatientGantryRelationshipCodeSequence',
'PatientIdentityRemoved',
'PatientMotionCorrected',
Owner

Isn't this a boolean? Can you review all Patient flags for false positives?

'ReceivingPresentationAddress',
'SendingPresentationAddress',
'SourcePresentationAddress',
'WedgeInContactLength',
Owner

Not a "contact" in a personal sense here?

// ========================================================================

'ASLContext',
'ASLCrusherDescription',
Owner

I suggest removing all "Description" and "Comment" fields from this manual blacklist, as we're already handling descriptors in a consistent way (per an actual option laid out in the PS3.15 standard).
See deidentifyPS315E.ts:370
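
One way the generation scripts could drop these names before they reach the manual list is a suffix filter. This is a sketch under assumptions: the suffix list and the function names (`isDescriptorKeyword`, `dropDescriptors`) are hypothetical, and the actual descriptor handling in deidentifyPS315E.ts may match on different criteria.

```typescript
// Assumed suffixes for descriptor-like keywords. The real rule in
// deidentifyPS315E.ts may differ; adjust this list to match it.
const DESCRIPTOR_SUFFIXES: readonly string[] = ['Description', 'Comment', 'Comments']

// True when the keyword ends with one of the assumed descriptor suffixes.
function isDescriptorKeyword(keyword: string): boolean {
  return DESCRIPTOR_SUFFIXES.some((suffix) => keyword.endsWith(suffix))
}

// Remove descriptor-like keywords from a generated candidate list.
function dropDescriptors(keywords: readonly string[]): string[] {
  return keywords.filter((keyword) => !isDescriptorKeyword(keyword))
}

// Example with two entries from the hunk above: only 'ASLContext' is kept.
const withoutDescriptors = dropDescriptors(['ASLContext', 'ASLCrusherDescription'])
console.log(withoutDescriptors)
```

A suffix match is deliberately narrow here: it would not catch keywords where "Description" appears mid-name, so those would still show up for manual review.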

]
} else if (
manualBlacklistSet.has(normalName) &&
vr !== 'UI' && // Don't blacklist UIDs - they have special handling
Owner

Reminder to remove this, as we shouldn't put UIs in the blacklist. Instead, ensure UIs don't get flagged by the scripts or blacklisted.
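
A sketch of moving the UI check from runtime to generation time, assuming the scripts know each candidate's VR. The `CandidateAttribute` shape and both function names are hypothetical; `findBlacklistedUids` is the kind of guard a unit test could use so that the `vr !== 'UI'` condition above can be dropped safely.

```typescript
// Hypothetical shape for a candidate produced by the generation scripts.
interface CandidateAttribute {
  keyword: string
  vr: string // DICOM Value Representation, e.g. 'UI' or 'CS'
}

// Drop UID-valued attributes before the blacklist is written.
function excludeUids(candidates: readonly CandidateAttribute[]): CandidateAttribute[] {
  return candidates.filter((candidate) => candidate.vr !== 'UI')
}

// Test guard: return every blacklisted keyword whose VR is UI.
// An empty result means the blacklist contains no UIDs.
function findBlacklistedUids(
  blacklist: readonly string[],
  vrByKeyword: ReadonlyMap<string, string>,
): string[] {
  return blacklist.filter((keyword) => vrByKeyword.get(keyword) === 'UI')
}

// Example usage. The VR for the second entry is an illustrative assumption;
// take real values from the DICOM data dictionary.
const generated: CandidateAttribute[] = [
  { keyword: 'ContextGroupExtensionCreatorUID', vr: 'UI' },
  { keyword: 'PatientMotionCorrected', vr: 'CS' },
]
const nonUidKeywords = excludeUids(generated).map((candidate) => candidate.keyword)
console.log(nonUidKeywords)
```

Filtering in the script keeps the blacklist itself free of UIDs, which matches the intent of the comment above better than a runtime exception.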

@bebbi
Owner

bebbi commented Oct 20, 2025

@crkaz I'm asking Lubos to jump in on this one.
Good second issue, @LubosD. Could you review the PR and add your flavor? Either on the PR branch or on a new one.

@bebbi bebbi requested review from LubosD and removed request for LubosD October 20, 2025 10:16

Development

Successfully merging this pull request may close these issues.

Review derived DICOM attributes whitelist for data leaks

4 participants