I have been running SimpleSAMLphp 1.16.2 for about two years and it was working well.
- Hosted on AWS Elastic Beanstalk
- We are acting as a SAML SP for several IDPs, using SAML 2.0
- We have at least two instances, and are using SQL as a SAML session store.
- Our users were able to log in on their respective IDPs, get redirected back to our site, and be logged in and out without issues.
- Each IDP uses its own domain or subdomain of our site.
It stopped working. We are not sure when it stopped, which makes this much harder to trace. I suspected it might have something to do with the move to Secure cookies with SameSite=None, so I upgraded to SimpleSAMLphp 1.18.7, but the problem remains.
Tracing: When the browser visits our site, we detect that the user is not authenticated. SimpleSAML generates a new state ID, which is saved into SQL before redirecting to the IDP:
May 22 11:58:19 simplesamlphp DEBUG [723e632f68] Session: 'xxx-config-name' not valid because we are not authenticated.
May 22 11:58:19 simplesamlphp DEBUG [723e632f68] Saved state: '_00ff72471339804eac8960f921b29e80db54dd0369'
I can see the new session in the SQL database:
session ee3e5a5b30f11acb5333c405eaefd49e C%3A18%3A%22SimpleSAML%5CSession%22... 2020-05-22 19:58:19
... and the user authenticates at the IDP.
When the redirect comes back to our server, the request is decrypted and the same state ID is retrieved from it:
May 22 11:58:20 simplesamlphp DEBUG [55c1cedb8b] Loading state: '_00ff72471339804eac8960f921b29e80db54dd0369'
May 22 11:58:20 simplesamlphp WARNING [55c1cedb8b] Could not load state specified by InResponseTo: NOSTATE Processing response as unsolicited.
Because SimpleSAMLphp does not return an authenticated response to my code, my code requests a new authentication from SimpleSAMLphp. A new state ID is created and saved in SQL, and the code redirects back to the IDP. This forms an infinite redirect loop.
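One quick check on the two log excerpts above is whether the state was saved and loaded under the same bracketed ID. The sketch below is only a log-parsing helper in Python, not part of SimpleSAMLphp. It assumes the bracketed value (`723e632f68`, `55c1cedb8b`) is a per-session tracking ID, so a mismatch would suggest the return request is not running in the session that saved the state. The function names are hypothetical.

```python
import re

# Log lines copied from the trace above.
LOG_LINES = [
    "May 22 11:58:19 simplesamlphp DEBUG [723e632f68] Saved state: '_00ff72471339804eac8960f921b29e80db54dd0369'",
    "May 22 11:58:20 simplesamlphp DEBUG [55c1cedb8b] Loading state: '_00ff72471339804eac8960f921b29e80db54dd0369'",
]

# Matches "[<id>] Saved state: '<state>'" or "[<id>] Loading state: '<state>'".
PATTERN = re.compile(
    r"\[(?P<track>[0-9a-f]+)\] (?P<action>Saved|Loading) state: '(?P<state>[^']+)'"
)


def track_ids_by_state(lines):
    """Map each state ID to {action: bracketed ID} as seen in the log."""
    result = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            result.setdefault(match["state"], {})[match["action"]] = match["track"]
    return result


def mismatched_states(lines):
    """Return state IDs that were saved and loaded under different bracketed IDs."""
    return [
        state
        for state, ids in track_ids_by_state(lines).items()
        if "Saved" in ids and "Loading" in ids and ids["Saved"] != ids["Loading"]
    ]


# Assumption: the bracketed value identifies the session handling the request.
print(mismatched_states(LOG_LINES))
```

For the trace above, the state is saved under `723e632f68` and loaded under `55c1cedb8b`, so the state ID is reported as mismatched.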
- We have no Varnish cache running
- We tried using memcached for the SAML session storage - and yes, the memcache server works - exactly the same issue
- We tried using PHP sessions with only a single instance - same issue.
- We switched back to SQL SAML session storage, which was stable for two years - same issue
- We are not using different domains within one SAML authentication. We do have different domains, but each uses its own IDP and runs separately.
- We do not jump between HTTP and HTTPS - all the failing requests are on HTTPS.
We have recently made quite a few changes to enhance security, and one of them could be the cause of this issue:
- We added an AWS WAF to our load balancers - but the return redirect contains the correct state ID - so this does not seem to be related.
- We changed our site cookies to SameSite=Strict and Secure.
- We added OAuth as an SSO option - but that is for other clients, using other domains on our servers. It should not have any influence on this.
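To make the cookie change in the list above concrete, here is a deliberately simplified Python model of how browsers treat the `SameSite` attribute, as I understand the rules. It ignores browser-specific exceptions and is not SimpleSAMLphp code. It also assumes, which I have not confirmed for our IDPs, that the IDP sends the response back as a cross-site POST (HTTP-POST binding). The function name and parameters are hypothetical.

```python
# Methods treated as "safe" for SameSite=Lax top-level navigations.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def cookie_is_sent(same_site, secure, cross_site, method="GET",
                   top_level=True, https=True):
    """Simplified model: would the browser attach this cookie to the request?"""
    if secure and not https:
        return False  # Secure cookies are only sent over HTTPS
    policy = same_site.lower()
    if policy == "none":
        return secure  # modern browsers reject SameSite=None without Secure
    if not cross_site:
        return True  # same-site requests always carry the cookie
    if policy == "lax":
        # Cross-site: only top-level navigations with a safe method.
        return top_level and method.upper() in SAFE_METHODS
    return False  # Strict: never sent on cross-site requests


# Assumption: the IDP returns to our site with a cross-site POST.
for policy in ("Strict", "Lax", "None"):
    sent = cookie_is_sent(policy, secure=True, cross_site=True, method="POST")
    print(policy, sent)
```

Under this model only a `SameSite=None; Secure` cookie would accompany a cross-site POST back from the IDP, which is why the cookie change is on my list of suspects.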
So the issue is this:
- Clearly the SQL session storage works - the session is being saved before redirecting to the IDP - as confirmed by the timestamp retrieved independently from SQL.
- This was traced into Session->setData(), which sets up the session data in an array dataStore, and calls markDirty();
- This sets up a callback to save(), and the session is saved into SQL.
- Upon the return call, the data is decrypted and the same state id is retrieved from the call.
- SimpleSAMLphp tries to retrieve the state using the state ID:
- Traced into Session->getData(), but when getData() is called, the $this->dataStore array is empty.
- As a result, the state for this state ID cannot be retrieved from the SQL database.
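The save and load path traced above can be sketched as a small Python model. This is not SimpleSAMLphp code. It assumes the state lives inside the session's data store and that the session itself is found through the session cookie, so a request arriving without the original cookie gets a fresh, empty session even though the state ID is correct. All class and function names are hypothetical.

```python
import uuid


class SessionStore:
    """Stands in for the SQL session table: session ID -> per-session data."""

    def __init__(self):
        self.rows = {}

    def load_or_create(self, session_id):
        # An unknown or missing cookie results in a brand-new, empty session.
        if session_id is None or session_id not in self.rows:
            session_id = uuid.uuid4().hex
            self.rows[session_id] = {}
        return session_id, self.rows[session_id]


def save_state(store, cookie, state_id, state):
    """Save state into the session selected by the cookie; return the session ID."""
    session_id, data_store = store.load_or_create(cookie)
    data_store[state_id] = state
    return session_id


def load_state(store, cookie, state_id):
    """Look up state in the session selected by the cookie, or None if absent."""
    _, data_store = store.load_or_create(cookie)
    return data_store.get(state_id)


store = SessionStore()
state_id = "_00ff72471339804eac8960f921b29e80db54dd0369"

# First request: no cookie yet, state is saved before redirecting to the IDP.
cookie = save_state(store, None, state_id, {"return_to": "/"})

# Return request with the same cookie: the state is found.
print(load_state(store, cookie, state_id))

# Return request without the cookie: correct state ID, but an empty data store.
print(load_state(store, None, state_id))
```

In this model the second lookup fails in the same way as the trace: the state row exists in storage, but the request is looking inside a different session.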
I have been banging my head on this issue for two days and would really appreciate some help.