> and I have two installations of simplesamlphp, one for sp and one for idp. I could have done it on the same install, but I decided to go this route to get a clear idea of what needs to be done on each side.
Another vague hunch is that requests are getting routed to the wrong instance, but as mentioned, it’s hard to guess from here.
> The weird thing is that I used simplesamlphp to set up an adfs SSO and it was pretty straightforward. Worked easily after I got the claim rules set up correctly. But I've never tried to set up a sp/idp on the same machine. Didn't think it would be a problem until I read that "note" in the docs (which had no explanation of how to do it).
I’ve tried to explain it 17 different ways to myself, but I just end up more confused than when I started, with sprinkles. ADFS and other implementations can sometimes make simplifying assumptions about their deployment environment or pour a lot of resources into configuration tools, luxuries that simpleSAMLphp doesn’t have.
If I could go back and rewrite the Internet, I’d start with the OSI model, a crowbar, and some superglue… but, until that time, secure messaging between arbitrary systems through a third-party vector (e.g. idp <-> sp via browser) will always take some jiggling or assumptions.
Anyway, to debug each leg specifically:
A) Compare the SP metadata as loaded by the IdP to the AuthnRequest content. The IdP already makes this check, and it is succeeding in your case, so skip it; I’m just adding it for the archives
B) Compare the Audience of the decrypted assertion to the SP’s configuration
My guess is that you’ll either find the SP’s entityID in the Audience, while the SP thinks its real name is the IdP’s entityID, or you’ll find the IdP’s entityID. I would suspect the former. That said, your currently planned approach makes sense to me, other than the futility of ending up with session persistence in plaintext cookies over http, which I presume is a shoelace you’ll go back for.
If you’re not allergic to code — and this sort of mutilation is the extent of my development talent, so forgive me for mooning the list — you might also just print out or log the two pieces being compared to see why the equality match is failing. A grep should point to the specific processing points. Adding a little more verbosity to the error message in the distribution wouldn’t be a bad feature request, IMO.
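For the archives, here is a rough sketch of that "print both sides" idea. It is in Python rather than PHP and is not taken from simpleSAMLphp; the function names, the sample assertion, and the example.org entityIDs are all made up for illustration. It also assumes the comparison is a plain exact string match, which I have not verified against the distribution. It pulls every Audience value out of an already-decrypted assertion and shows it next to the entityID the SP believes it has, using repr() so that stray whitespace or a trailing slash becomes visible.

```python
import xml.etree.ElementTree as ET

# Namespace of SAML 2.0 assertions (from the SAML 2.0 core schema).
SAML_ASSERTION_NS = "urn:oasis:names:tc:SAML:2.0:assertion"

# Hypothetical decrypted assertion, trimmed to the part we care about.
SAMPLE = """<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Conditions>
    <saml:AudienceRestriction>
      <saml:Audience>https://sp.example.org/metadata</saml:Audience>
    </saml:AudienceRestriction>
  </saml:Conditions>
</saml:Assertion>"""


def audiences(assertion_xml):
    """Return the text of every <Audience> element, whitespace untouched."""
    root = ET.fromstring(assertion_xml)
    tag = "{%s}Audience" % SAML_ASSERTION_NS
    return [(el.text or "") for el in root.iter(tag)]


def explain_audience_check(assertion_xml, sp_entity_id):
    """Build one report line per Audience, comparing it to the SP's entityID.

    Assumes an exact string comparison; adjust if the real check differs.
    """
    lines = ["SP entityID: %r" % sp_entity_id]
    found = audiences(assertion_xml)
    if not found:
        lines.append("no <Audience> element found")
    for aud in found:
        verdict = "match" if aud == sp_entity_id else "MISMATCH"
        lines.append("Audience: %r -> %s" % (aud, verdict))
    return lines


if __name__ == "__main__":
    # Pretend the SP is configured with the IdP's entityID by mistake.
    for line in explain_audience_check(SAMPLE, "https://idp.example.org/metadata"):
        print(line)
```

If the two values really are swapped, the MISMATCH line should make it obvious which side is carrying the wrong name.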