Handle duplicate BAM user IDs when parsing AntragstellerIn #9

@TomCharlesRousseau

Description:

Currently, the parser guesses the BAM user ID by taking the first letter of the first name plus up to the first 7 letters of the family name. This works in most cases but fails when two people map to the same ID:

Example:
John Doe → jdoe
Jack Doe → jdoe (parser guess), but the actual ID in BAM is jdoe1

The suffix BAM assigns cannot be predicted from the name alone, so the parser may assign the wrong user.

Proposed solution:

Add a new column (e.g., Resolved BAM User ID) in the DataFrame or Excel export to store the final user ID.
Parse the AntragstellerIn column (Familyname, Firstname) to get the full name.
Use the function get_userid_from_fullname(name, openbis) from bam_users.py to resolve the correct BAM user ID by checking once against all existing OpenBIS users.
Populate the new column with the resolved userId, ensuring duplicates are correctly handled.
Use this column downstream wherever the “Responsible person” is needed.
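
The steps above could look roughly like the sketch below. It is not the actual implementation: rows are plain dicts standing in for DataFrame rows, `split_antragsteller` and `add_resolved_userids` are hypothetical helper names, and the resolver is passed in as a callable so the sketch runs without an OpenBIS connection. It also assumes the resolver expects "Firstname Familyname"; the format that get_userid_from_fullname actually expects should be checked in bam_users.py.

```python
from typing import Callable, Dict, List, Optional


def split_antragsteller(raw: str) -> str:
    """Turn 'Familyname, Firstname' into 'Firstname Familyname'.
    (Assumed target format; adjust to what the resolver expects.)"""
    family, _, first = raw.partition(",")
    return f"{first.strip()} {family.strip()}".strip()


def add_resolved_userids(
    rows: List[Dict[str, Optional[str]]],
    resolve: Callable[[str], Optional[str]],
    column: str = "Resolved BAM User ID",
) -> List[Dict[str, Optional[str]]]:
    """Fill `column` for every row, resolving each distinct name only once."""
    cache: Dict[str, Optional[str]] = {}
    for row in rows:
        full_name = split_antragsteller(row["AntragstellerIn"] or "")
        if full_name not in cache:
            cache[full_name] = resolve(full_name)
        row[column] = cache[full_name]
    return rows


# Stand-in for the OpenBIS lookup, using the example from this issue.
known_users = {"John Doe": "jdoe", "Jack Doe": "jdoe1"}
rows = [{"AntragstellerIn": "Doe, John"}, {"AntragstellerIn": "Doe, Jack"}]
resolved = add_resolved_userids(rows, known_users.get)
```

In the parser, the resolver would presumably be something like `lambda name: get_userid_from_fullname(name, openbis)`. Names that cannot be resolved stay `None` in this sketch, so they can be flagged for manual review rather than silently guessed.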

Benefits:

Avoids guessing or manual handling of duplicate BAM IDs.
Ensures consistency with OpenBIS.
Cleanly separates resolved user IDs from raw input data.

References:

bam_users.py → get_userid_from_fullname
Current parsing logic for AntragstellerIn → Responsible person
