This morning I was trying to fill Galera's ring buffer cache by flooding a cluster with transactions while doing an SST. I was pleasantly surprised to find that Galera started to page off more cache files when it realized the first one was full. When it started to replay all of its transaction cache, I got these errors on the Joiner that I don't understand, and hopefully someone can fill me in.
They only seem to occur for the first and second cache pages (out of 5 total pages), and there are about 380 errors from a total transaction set of about 100,000:
The joining node apparently caught up, but I'm not sure whether these errors mean some transactions could have been missed. The test I was using to flood the database is a simple IOPS insert test: it continually inserts the same data into the same table with the same keys, then truncates those tables and starts all over again. This is Galera 2.1, which comes bundled with Percona XtraDB Cluster 5.5.27.
Thanks,
-Luke
-- Luke Bigum Senior Systems Engineer
Information Systems luke.bi...@lmax.com | http://www.lmax.com LMAX, Yellow Building, 1A Nicholas Road, London W11 4AN
This warning is a false positive (a bug in reporting), which should be fixed in later revisions. What is happening is that writesets that were already contained in the snapshot are discarded, and that causes this message. As you can see, it started right after SST completed and cache replay had begun. Later pages didn't produce such warnings because they contained writesets accumulated after the state snapshot was made (during catch-up).
The warnings belong to truncates, which are executed (in this case skipped) like DDL in TO-isolation.
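To make the above concrete, here is a toy Python model of the replay step. It is only a sketch of the idea, not Galera's actual code: the names `Writeset`, `replay` and `snapshot_seqno` are invented for illustration, and the rule that only skipped TO-isolation writesets produce a warning is an assumption taken from the behaviour described above.

```python
from dataclasses import dataclass


@dataclass
class Writeset:
    seqno: int     # global sequence number of the writeset
    is_toi: bool   # True if it runs in TO-isolation (DDL, TRUNCATE)
    label: str     # free-form description, for illustration only


def replay(cache, snapshot_seqno):
    """Split cached writesets into applied and skipped ones.

    Anything with seqno <= snapshot_seqno is already reflected in the
    state snapshot, so the joiner discards it instead of applying it.
    Returns (applied, skipped, warned), where `warned` holds the skipped
    TO-isolation writesets, i.e. the ones assumed to log the warning.
    """
    applied, skipped = [], []
    for ws in cache:
        if ws.seqno <= snapshot_seqno:
            skipped.append(ws)
        else:
            applied.append(ws)
    warned = [ws for ws in skipped if ws.is_toi]
    return applied, skipped, warned


# Hypothetical cache contents: the snapshot was taken at seqno 3.
cache = [
    Writeset(1, False, "insert"),
    Writeset(2, True, "truncate"),
    Writeset(3, False, "insert"),
    Writeset(4, False, "insert"),
    Writeset(5, True, "truncate"),
]
applied, skipped, warned = replay(cache, snapshot_seqno=3)
print([ws.seqno for ws in applied])   # writesets newer than the snapshot
print([ws.seqno for ws in warned])    # skipped truncates
```

In this model nothing is lost: every skipped writeset is one whose effect is already in the snapshot, and only the writesets accumulated after the snapshot are applied.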
As a side note, I guess you'd have much more luck (much faster trx rates and saturation) if you used updates ;)
-- Alexey Yurchenko,
Codership Oy, www.codership.com Skype: alexey.yurchenko, Phone: +358-400-516-011