Hello everyone,
I posted here a long time ago:
https://groups.google.com/forum/embed/?place=forum/codership-team&showsearch=true&showpopout=true&hl=en&parenturl=http%3A%2F%2Fgaleracluster.com%2Fcommunity%2F#!searchin/codership-team/geo$20distributed/codership-team/BAFId1pzXl0/6iKJ1v0kJgAJ
Those settings dramatically helped in stabilizing the cluster.
However, we are now seeing hourly spikes, which either we had not noticed before or which have only recently appeared.
Here is what happens: every hour the DB gets very slow. We are on an SSD SAN, so IO stays around 10-20%.
But everything slows down. Queries start taking up to 80 seconds to finish.
Granted, this might be a problem with the application's own DB operations, but is there anything we can do or optimize on the Galera side?
This happens once an hour and sometimes lasts up to 15-20 minutes or even longer.
Same application with an even higher load is running perfectly fine on a Postgres DB.
Could this have anything to do with the replication or replication settings?
my.cnf file attached.
As per Philip's recommendation, sysctl.conf was updated with these settings on all nodes:
$net_core_rmem_max = "16777216",
$net_core_wmem_max = "16777216",
$net_core_rmem_default = "16777216",
$net_core_wmem_default = "16777216",
$net_ipv4_tcp_rmem = "4096 87380 16777216",
$net_ipv4_tcp_wmem = "4096 65536 16777216",
$net_ipv4_tcp_slow_start_after_idle = "0",
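To rule out a node where these values were written to sysctl.conf but never actually loaded, a check along the following lines could be run on each node. This is only a minimal sketch: it assumes Linux with /proc/sys available, and it assumes the Puppet-style names above map to the usual dotted sysctl keys (for example net_core_rmem_max to net.core.rmem_max). The function names are illustrative and not part of any tool.

```python
from pathlib import Path

# Expected values, copied from the settings listed above
# (dotted key names are an assumed mapping from the Puppet variables).
EXPECTED = {
    "net.core.rmem_max": "16777216",
    "net.core.wmem_max": "16777216",
    "net.core.rmem_default": "16777216",
    "net.core.wmem_default": "16777216",
    "net.ipv4.tcp_rmem": "4096 87380 16777216",
    "net.ipv4.tcp_wmem": "4096 65536 16777216",
    "net.ipv4.tcp_slow_start_after_idle": "0",
}


def proc_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.core.rmem_max to its file under /proc/sys."""
    return Path(root).joinpath(*key.split("."))


def normalize(value):
    """Collapse tabs, newlines and repeated spaces into single spaces."""
    return " ".join(value.split())


def check(expected=EXPECTED, root="/proc/sys"):
    """Return {key: (expected, actual)} for each setting that differs.

    A setting that cannot be read is reported with an actual value of None.
    """
    mismatches = {}
    for key, want in expected.items():
        try:
            got = normalize(proc_path(key, root).read_text())
        except OSError:
            got = None
        if got != normalize(want):
            mismatches[key] = (want, got)
    return mismatches


if __name__ == "__main__":
    for key, (want, got) in check().items():
        print(f"{key}: expected {want!r}, found {got!r}")
```

If the script prints nothing on a node, the live kernel values match the list above on that node.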
Any help is much appreciated.
Thanks
John