Slow User Entity Queries using `where email like '%ed@example.com%`

635 views
Skip to first unread message

Ed Lepedus

unread,
Jul 30, 2021, 4:06:41 AM7/30/21
to Keycloak Dev
Hi all,

We have a large KeyCloak deployment with about 1.7M users, and we're seeing terrible performance under load with queries of this form doing a full table scan and taking many tens of seconds each. (see example below)

We're on KeyCloak 11.0.2 and the database system is MySQL 5.7 on CloudSQL.

Is this expected behaviour? Why is it necessary to use a Like clause instead of simple equality, which would be much more index-friendly. Is it simply an artefact of our our KC/MySQL versions, etc? 

Example query that took 82s on a 48-core, 90GB RAM MySQL 5.7 instance:

select userentity0_.ID as ID1_73_,
userentity0_.CREATED_TIMESTAMP as CREATED_2_73_,
userentity0_.EMAIL as EMAIL3_73_,
userentity0_.EMAIL_CONSTRAINT as EMAIL_CO4_73_,
userentity0_.EMAIL_VERIFIED as EMAIL_VE5_73_,
userentity0_.ENABLED as ENABLED6_73_,
userentity0_.FEDERATION_LINK as FEDERATI7_73_,
userentity0_.FIRST_NAME as FIRST_NA8_73_,
userentity0_.LAST_NAME as LAST_NAM9_73_,
userentity0_.NOT_BEFORE as NOT_BEF10_73_,
userentity0_.REALM_ID as REALM_I11_73_,
userentity0_.SERVICE_ACCOUNT_CLIENT_LINK as SERVICE12_73_,
userentity0_.USERNAME as USERNAM13_73_
from USER_ENTITY userentity0_
where userentity0_.REALM_ID = 'realm'
and (lower(userentity0_.EMAIL) like '%e...@example.com%')
order by userentity0_.USERNAME asc
limit 100


I'd be very grateful for any insights as to whether this is expected behaviour, and/or whether updating KeyCloak and/or MySQL, or even switching to PostgreSQL would make a meaningful difference.

Thanks in advance,
Ed

Ed Lepedus

unread,
Jul 30, 2021, 4:54:04 PM7/30/21
to Keycloak Dev
Mystery solved. We have a system that handles registrations through the admin API. The original developer of that system didn't notice that the /register endpoint returns the url of the newly created user in the location header, so issued a subsequent call to /users?email=just,register...@example.com. This search endpoint caused the %like% query, and it's default limit of 100 meant that, when queried with an exact email address, it would always scan the entire table. Removing this call dropped our DB CPU usage from 100% to 1% and our 99 %ile latency from >30s to ~0.5s.

Fesenmeyer Daniel (IOC/PAU2)

unread,
Aug 2, 2021, 3:26:43 AM8/2/21
to Ed Lepedus, Keycloak Dev

Hi Ed,

 

Does the shown SQL query result from a search query via the REST API?

If yes, you can use the “exact” query parameter: https://github.com/keycloak/keycloak/blob/ef3a0ee06c1d0b43d35eba234615c03df14b613e/services/src/main/java/org/keycloak/services/resources/admin/UsersResource.java#L250

 

 

Mit freundlichen Grüßen / Best regards

Daniel Fesenmeyer


Bosch IoT Permissions - Product Area User Management (IOC/PAU-PM)
Bosch.IO GmbH | Ziegelei 7 | 88090 Immenstaad | GERMANY |
www.bosch.io
Tel. +49 7545 202-360 | Telefax +49 7545 202-301 |
Daniel.F...@bosch.io

Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung: Dr. Stefan Ferber, Dr. Aleksandar Mitrovic, Yvonne Reckling

--
You received this message because you are subscribed to the Google Groups "Keycloak Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to keycloak-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/keycloak-dev/3d5ee9e4-b344-4843-8e0b-1413b717fecbn%40googlegroups.com.

Ed Lepedus

unread,
Aug 2, 2021, 4:38:33 AM8/2/21
to Keycloak Dev
Hi Daniel,

That's a great find -- thank you! We have already re-worked the logic in question to call the URL from the location header, but it's very useful to know about the exact query param. It would have saved us from having to do some very delicate surgery! :D

Best,
Ed

Reply all
Reply to author
Forward
0 new messages