What could cause 700 ms in "checking query cache for query" on a server with zero load and only a single record in the table?

Paul Warner

Nov 19, 2016, 7:38:14 AM

to codership, Lalitha Rachamalla

Hello all. I need help understanding why my query is randomly slow.

I have mariadb-galera-server-5.5 5.5.49+maria-1~trusty running with Galera enabled, but as a single instance. I have a few databases and tables; the total mysqldump is about 750 KB.

I killed all clients except a single mysql client instance. I am running a simple SELECT query on a table with a single row. About 4 out of 5 times the query returns in under 10 ms, but at random it spikes to hundreds of milliseconds, close to 1 second. The query is not on the primary key, but with one row I am not sure that matters.
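
One way to put numbers on "about 4 out of 5" is to time the same statement repeatedly and count the outliers. A minimal Python sketch, assuming a hypothetical `run_query` callable that stands in for whatever actually executes the SELECT (it is not part of any database driver); the 10 ms threshold is simply the figure quoted above:

```python
import time

def measure_latencies(run_query, repetitions=50):
    """Call run_query repeatedly and return each call's wall-clock time in ms."""
    latencies_ms = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()  # hypothetical stand-in for executing the SELECT
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms

def summarize(latencies_ms, slow_threshold_ms=10.0):
    """Count calls slower than the threshold and report the worst case."""
    slow = [ms for ms in latencies_ms if ms > slow_threshold_ms]
    return {
        "total": len(latencies_ms),
        "slow": len(slow),
        "slow_fraction": len(slow) / len(latencies_ms) if latencies_ms else 0.0,
        "max_ms": max(latencies_ms, default=0.0),
    }
```

Passing the result of `measure_latencies(...)` to `summarize(...)` gives the share of slow runs and the worst observed latency, which makes it easier to compare before and after a configuration change.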

Profiling with `SET SESSION profiling = 1;` shows that basically all the time for the slow queries is spent in "checking query cache for query". Full profile at:

https://paste.ubuntu.com/23498077/

SHOW VARIABLES: https://paste.ubuntu.com/23498090/

I think that based on the profile the time is spent between these lines: https://github.com/MariaDB/server/blob/5.5-galera/sql/sql_cache.cc#L1855-L1931
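
To see at a glance which stage dominates a profile like this, the (Status, Duration) rows that `SHOW PROFILE` prints can be ranked by their share of the total. A minimal Python sketch; the function name is hypothetical and the sample rows are made-up illustrative values, not the numbers from the linked paste:

```python
def stage_shares(profile_rows):
    """Return (stage, seconds, share_of_total) tuples, largest first.

    profile_rows is an iterable of (status, duration_seconds) pairs,
    i.e. the two columns printed by SHOW PROFILE.
    """
    rows = [(status, float(duration)) for status, duration in profile_rows]
    total = sum(duration for _, duration in rows)
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(s, d, d / total if total else 0.0) for s, d in ranked]

# Hypothetical durations, for illustration only.
sample_profile = [
    ("starting", 0.000050),
    ("checking query cache for query", 0.700000),
    ("Opening tables", 0.000020),
    ("executing", 0.000010),
    ("cleaning up", 0.000005),
]

for stage, seconds, share in stage_shares(sample_profile):
    print(f"{stage:35s} {seconds:10.6f} s {share:6.1%}")
```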

Any ideas or hints?

Thank you in advance!

Paul

hunter86bg

Nov 22, 2016, 4:26:59 PM

to codership, nrac...@ciena.com

As your version is 5.5.40+, it shouldn't be a problem to use the query cache.

Have you tried running the server without the query cache? If so, is the result the same?

Paul Warner

Dec 27, 2016, 4:18:50 PM

to codership, nrac...@ciena.com

Thanks for the suggestion on the query cache.

I have tracked it down to the call to wsrep_sync_wait:

https://github.com/MariaDB/server/blob/10.1/sql/sql_cache.cc#L1960

It makes sense to me that this call could be slow on a real cluster with a write load, but I am still confused about why it would be slow on a single-node cluster with zero write load.

I am digging into the Galera code a bit and will try to get `perf trace` working in the Docker container to see if I can get insight into where the time is going.

If anyone has ideas, let me know. For now the workaround we are using is to set `wsrep_on=off` for cluster size one, which may be the right answer anyway, but I still consider it a sign of a problem with either Galera or, more likely, my system that having replication on makes this run too slow.
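
The workaround amounts to choosing the `wsrep_on` value from the intended cluster size. A minimal Python sketch of that decision; the function name, the `cluster_size` parameter, and the idea of rendering the value as a line for the server's option file are assumptions made for illustration, not something taken from this thread or from Galera tooling:

```python
def wsrep_on_option(cluster_size):
    """Return the wsrep_on option line for a deployment of cluster_size nodes."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    # A lone node has no peers to replicate to, so replication is switched off.
    value = "OFF" if cluster_size == 1 else "ON"
    return f"wsrep_on={value}"
```

Keeping the rule in one place makes it harder to leave replication off by accident once a second node is added.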

Paul

Philip Stoev

Dec 28, 2016, 9:07:01 AM

to Paul Warner, codership, nrac...@ciena.com

Hello,

Yes, the fact that wsrep_sync_wait can hang in single-node clusters is a
known problem in Galera. We have seen it in our internal testing as well.

Philip Stoev

Warner, Paul

Dec 28, 2016, 12:40:15 PM

to Philip Stoev, codership, Rachamalla, Naga

Philip,

Thanks. Not urgent, but is there an issue tracking this? I have the obvious workaround of disabling replication for a single node (which makes sense), but I might be interested in diving into the Galera code and trying to work out a fix eventually. I can't commit to it, since I don't really have experience debugging multi-threaded apps and haven't written C++ in years, but it might be fun.

Thanks!

Paul

Philip Stoev

Jan 3, 2017, 3:59:40 AM

to Warner, Paul, codership, Rachamalla, Naga

Hello,

Apologies for the late reply. We do not have an existing issue at this time,
so please feel free to open a new one here
https://github.com/codership/mysql-wsrep/issues

Philip Stoev
