CAS 7.1.X Entra Delegation OIDC Issue


jbstowe22

Oct 7, 2024, 11:35:34 AM
to CAS Community
Hi,

We have been testing CAS 7.1.0 for the past few weeks and believe we have found an issue with delegating logins to Entra (and potentially other OIDC IdPs). The issue started after upgrading from 7.0.4 to 7.1.0, with sporadic reports from users of an "Unauthorized Access" CAS error page. We were able to replicate the issue by logging in and out of applications delegated to Entra. Initially, we got the error on roughly 1 out of every 10 delegated logins; the other ~9 times it worked perfectly. We turned up the logging levels and captured these logs when the error occurs:

2024-09-24 15:24:16,132 DEBUG [org.pac4j.oidc.profile.azuread.AzureAdProfile] - <adding => key: access_token / value: [redacted]
2024-09-24 15:24:16,132 DEBUG [org.pac4j.oidc.profile.azuread.AzureAdProfile] - <adding => key: expiration / value: 1727214663132 / class java.lang.Long>
2024-09-24 15:24:16,132 DEBUG [org.pac4j.oidc.profile.azuread.AzureAdProfile] - <adding => key: id_token / value: [redacted]
2024-09-24 15:24:16,132 WARN [org.apereo.cas.util.function.FunctionUtils] - <Cannot invoke "org.pac4j.oidc.profile.creator.TokenValidator.validate(com.nimbusds.jwt.JWT, com.nimbusds.openid.connect.sdk.Nonce)" because the return value of "org.pac4j.oidc.metadata.OidcOpMetadataResolver.getTokenValidator()" is null
        OidcProfileCreator.java:create:115
        BaseClient.java:getUserProfile:146
        DelegatedClientAuthenticationHandler.java:lambda$doAuthentication$2:89
        FunctionUtils.java:lambda$doAndHandle$12:425
>
2024-09-24 15:24:16,132 ERROR [org.apereo.cas.authentication.DefaultAuthenticationManager] - <Authentication has failed. Credentials may be incorrect or CAS cannot find authentication handler that supports [ClientCredential(credentials=OidcCredentials
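
For anyone who wants to put a number on the failure rate from their own logs, here is a rough Python sketch that counts the null TokenValidator warning against delegated login attempts. The failure marker is copied from the WARN line above. Treating each "adding => key: id_token" DEBUG line as one login attempt is an assumption, and the log path passed on the command line is whatever your log4j2.xml writes to.

```python
import sys

# Substring copied from the WARN line in the log excerpt.
FAILURE_MARKER = 'OidcOpMetadataResolver.getTokenValidator()" is null'
# Assumption: one of these DEBUG lines is logged per delegated login attempt.
ATTEMPT_MARKER = "adding => key: id_token"


def count_failures(lines):
    """Return (attempts, failures) counted from an iterable of log lines."""
    attempts = 0
    failures = 0
    for line in lines:
        if ATTEMPT_MARKER in line:
            attempts += 1
        if FAILURE_MARKER in line:
            failures += 1
    return attempts, failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_failures.py <path to CAS log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        attempts, failures = count_failures(log)
    rate = failures / attempts if attempts else 0.0
    print(f"attempts={attempts} failures={failures} rate={rate:.1%}")
```

This only works with DEBUG enabled for org.pac4j.oidc, since the attempt marker is a DEBUG line.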

I'll also note, we have read through the CAS 7.1 release notes and made all the appropriate changes in the overlay regarding the consolidation of the pac4j libraries.



At this point we assumed it was related to something in our config or the other libraries we were pulling in. We pulled the latest 7.1 overlay and added the absolute bare minimum to try to replicate the issue. The only change we made to the overlay was adding the minimum libraries:

implementation "org.apereo.cas:cas-server-support-json-service-registry"
implementation "org.apereo.cas:cas-server-webapp"
implementation "org.apereo.cas:cas-server-support-pac4j-oidc"

The bare minimum in cas.properties:

cas.server.tomcat.http[0].enabled=true
cas.server.name=http://localhost:8080
cas.server.prefix=http://localhost:8080/cas
logging.config=file:/etc/cas/config/log4j2.xml
cas.service-registry.json.location=file:/etc/cas/services
cas.authn.pac4j.oidc[0].azure.tenant=[redacted]
cas.authn.pac4j.oidc[0].azure.id=[redacted]
cas.authn.pac4j.oidc[0].azure.secret=[redacted]
cas.authn.pac4j.oidc[0].azure.client-name=AADAuth
cas.authn.pac4j.oidc[0].generic.discovery-uri=https://login.microsoftonline.com/2a00728e-f0d0-40b4-a4e8-ce433f3fbca7/v2.0/.well-known/openid-configuration

A simple service definition:

{
    "@class": "org.apereo.cas.services.CasRegisteredService",
    "serviceId": ".*",
    "name": "local",
    "id": 1,
    "description": "Login"
}

We then ran CAS locally on the embedded Tomcat to test. Initially we could not replicate the error; delegation worked every time. On a whim, we decided to put a little load on the locally running instance of CAS using Locust (25 users, ramping up 3 at a time). We immediately started getting the same error locally. The error rate seems to be correlated with the number of requests CAS is handling: if we increase the number of users in Locust, we get the error much more frequently, to the point where it occurs constantly once there are enough requests to CAS. I should also point out that I don't think it is a resource issue. Any amount of load will eventually produce the error. Also, the CAS instance on the server has a 4-core CPU and 16 GB of RAM available to the JVM, and hardly any of that is being consumed when the error occurs. It almost seems there is some sort of concurrency issue when CAS handles the response from Entra (which we confirmed contains a valid access/ID token) while multiple requests are in flight. We have replicated the error on CAS 7.1.1 and CAS 7.2.0-SNAPSHOT as well.
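
The Locust file itself is not included here. As a stand-in that needs only the Python standard library, background load of the same shape (25 users, started 3 at a time) could be sketched roughly as below. The function names, the request count per user, and the one-second ramp delay are all hypothetical, and the script only issues plain GETs against whatever URL it is given; it does not perform a delegated login itself.

```python
import sys
import time
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def ramp_schedule(total_users, spawn_rate):
    """Split total_users into batches of at most spawn_rate users."""
    if spawn_rate < 1:
        raise ValueError("spawn_rate must be at least 1")
    batches = []
    remaining = total_users
    while remaining > 0:
        batch = min(spawn_rate, remaining)
        batches.append(batch)
        remaining -= batch
    return batches


def fetch_status(url):
    """GET url and return the HTTP status code (4xx/5xx included)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def run_load(fetch, total_users=25, spawn_rate=3, requests_per_user=20, ramp_delay=1.0):
    """Start users in batches; each user calls fetch() repeatedly.

    Returns a Counter keyed by whatever fetch() returns (e.g. status codes).
    """
    def user():
        return [fetch() for _ in range(requests_per_user)]

    results = Counter()
    with ThreadPoolExecutor(max_workers=total_users) as pool:
        futures = []
        for batch in ramp_schedule(total_users, spawn_rate):
            futures.extend(pool.submit(user) for _ in range(batch))
            time.sleep(ramp_delay)  # pause before starting the next batch
        for future in futures:
            results.update(future.result())
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python load.py http://localhost:8080/cas/login
    target = sys.argv[1]
    print(run_load(lambda: fetch_status(target)))
```

Run it in one terminal while doing delegated logins by hand in a browser, then check the CAS log for the TokenValidator warning.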

It could still be that we have something misconfigured, but we are unsure where to go from here. Any help would be greatly appreciated!

Thanks!





