@palkerecsenyi commented Aug 21, 2025

Closes #552


  • The new OAuth remote for GitLab will use the same system as all other remotes (i.e. it is registered via invenio-oauthclient). This will allow us to plug in custom handlers at various points. Notably, we can override the GitLab-specific account_info_serializer function and perform our own validation on the raw userinfo object, before it is reduced to the generic account info object, which carries significantly less information.

  • GitLab returns an identities field in the userinfo, containing a list of external authentication providers the user has linked to their account, and the ID each provider uses to represent the user.

    For example:

    {
        "id": 1234,
        "username": "johnsmith",
        "identities": [
            {
                "provider": "openid_connect",
                "extern_uid": "johnsmith",
                "saml_provider_id": null
            },
            {
                "provider": "ldapmain",
                "extern_uid": "cn=johnsmith,ou=users,ou=organic units,dc=cern,dc=ch",
                "saml_provider_id": null
            },
            {
                "provider": "kerberos",
                "extern_uid": "[email protected]",
                "saml_provider_id": null
            }
        ],
        ...
    }

    In this case, we can see the GitLab user johnsmith has a CERN username of johnsmith. It is possible for the GitLab username to be different to the CERN username, so we cannot simply rely on the GitLab username.

  • We already store the user's CERN username in the database as part of the extra_data field for the CERN/keycloak RemoteAccount. We just have to look this up (based on the OAuth client ID of the CERN remote app) and compare the two values.

  • This way, we ensure the user has logged in with the GitLab account associated with the same CERN account as their CDS account, which avoids some security issues and bugs.

  • The function gitlab_account_info_serializer overrides the account_info_serializer, but still calls the original once validation has passed. This override can be enabled in invenio.cfg, which I will also update soon.
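The validation flow described in the points above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in this PR: the serializer signature is simplified to a single userinfo argument, the choice of openid_connect as the provider that carries the CERN username is an assumption based on the example payload, and make_validating_serializer, cern_username_from_userinfo and the stored_cern_username callback are hypothetical names standing in for the real lookup against the CERN RemoteAccount's extra_data.

```python
from typing import Callable, Optional

# Assumption: the identity whose extern_uid holds the CERN SSO username.
CERN_IDENTITY_PROVIDER = "openid_connect"


class KeycloakGitLabMismatchError(Exception):
    """Local stand-in for cds_rdm.errors.KeycloakGitLabMismatchError."""


def cern_username_from_userinfo(userinfo: dict) -> Optional[str]:
    """Return the extern_uid of the CERN identity, or None if none is linked."""
    for identity in userinfo.get("identities") or []:
        if identity.get("provider") == CERN_IDENTITY_PROVIDER:
            return identity.get("extern_uid")
    return None


def make_validating_serializer(
    original_serializer: Callable[[dict], dict],
    stored_cern_username: Callable[[], Optional[str]],
) -> Callable[[dict], dict]:
    """Wrap a serializer so it only runs once the identities match.

    Both parameters are hypothetical: `original_serializer` stands in for the
    stock GitLab serializer, `stored_cern_username` for the database lookup
    of the signed-in user's CERN username.
    """

    def gitlab_account_info_serializer(userinfo: dict) -> dict:
        expected = stored_cern_username()
        actual = cern_username_from_userinfo(userinfo)
        # A missing identity is treated the same as a mismatching one.
        if actual is None or actual != expected:
            raise KeycloakGitLabMismatchError(
                f"GitLab user {userinfo.get('id')} has a different CERN SSO "
                f"identity ({actual}) to the signed-in CDS user ({expected})"
            )
        # Validation passed: defer to the original serializer.
        return original_serializer(userinfo)

    return gitlab_account_info_serializer


userinfo = {
    "id": 1234,
    "username": "johnsmith",
    "identities": [
        {"provider": "openid_connect", "extern_uid": "johnsmith"},
        {"provider": "ldapmain", "extern_uid": "cn=johnsmith,ou=users"},
    ],
}

serializer = make_validating_serializer(
    original_serializer=lambda info: {"external_id": str(info["id"])},
    stored_cern_username=lambda: "johnsmith",
)
account_info = serializer(userinfo)
```

Validating on the raw userinfo, before delegating, is what gives access to the identities list; the generic object produced by the original serializer no longer contains it.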

Error handling

Right now, the error handling means a 500 Internal Server Error is returned when there's a mismatch between the user IDs. This could be an acceptable outcome, since such behaviour is not likely to happen without careful and deliberate interference with the authentication process. The following error is logged on the backend:

cds_rdm.errors.KeycloakGitLabMismatchError: GitLab user 1234 has a different CERN SSO identity (cdsgltest) to currently signed-in CDS user 1 (pkerecse)

Since we don't have direct control of the view handler (that's in invenio-oauthclient) it's a little tricky to customise the error that's returned to the end user.
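One possible direction, sketched here under assumptions rather than taken from this PR, is to keep the mismatch details as attributes on the exception so that whatever layer eventually handles it can build a user-facing message without parsing the log line. The constructor arguments, the user_facing_error helper and the choice of a 403 status are all hypothetical; how such a helper would be wired into the invenio-oauthclient view is not shown.

```python
from typing import Tuple


class KeycloakGitLabMismatchError(Exception):
    """Local stand-in for cds_rdm.errors.KeycloakGitLabMismatchError.

    The constructor signature is an assumption made for this sketch.
    """

    def __init__(self, gitlab_user_id, gitlab_cern_username,
                 cds_user_id, cds_cern_username):
        self.gitlab_user_id = gitlab_user_id
        self.gitlab_cern_username = gitlab_cern_username
        self.cds_user_id = cds_user_id
        self.cds_cern_username = cds_cern_username
        # Detailed message intended for the backend log only.
        super().__init__(
            f"GitLab user {gitlab_user_id} has a different CERN SSO identity "
            f"({gitlab_cern_username}) to currently signed-in CDS user "
            f"{cds_user_id} ({cds_cern_username})"
        )


def user_facing_error(exc: KeycloakGitLabMismatchError) -> Tuple[int, dict]:
    """Map the exception to a status code and a body that leaks no usernames."""
    return 403, {
        "message": (
            "The GitLab account you signed in with is not linked to the same "
            "CERN account as your CDS account."
        )
    }


error = KeycloakGitLabMismatchError(1234, "cdsgltest", 1, "pkerecse")
status, body = user_facing_error(error)
```

The detailed identifiers stay in the logged exception text, while the response body only tells the user what went wrong.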

Testing

Running this locally is a little complicated due to the need to connect to various services.

Here's what you need:

@palkerecsenyi force-pushed the gitlab branch 2 times, most recently from f52e4c1 to 3d42ad5 (October 13, 2025 13:04)
@palkerecsenyi changed the title from "WIP: User ID validator for GitLab" to "feat(vcs): support for new VCS integration" (Oct 31, 2025)
@palkerecsenyi force-pushed the gitlab branch 2 times, most recently from 93f0c16 to cf12ab5 (October 31, 2025 10:31)
@palkerecsenyi marked this pull request as ready for review (October 31, 2025 13:06)
@palkerecsenyi force-pushed the gitlab branch 6 times, most recently from 4fbe832 to 89ae57b (October 31, 2025 15:08)
@palkerecsenyi force-pushed the gitlab branch 2 times, most recently from 2e80702 to 1c8c4ee (November 12, 2025 10:58)

Development

Successfully merging this pull request may close these issues.

Implement GitLab integration for CDS

2 participants