sync delay

Apoorva Gaurav

unread,

Jul 2, 2014, 3:13:10 PM7/2/14

to codersh...@googlegroups.com

Hello All,

I'm just started evaluating Galera so just a novice here. I've spawned a 3 node Galera cluster on AWS EC2 instances and accessing it via HAProxy. These are c3.large instances running CentOS 5.10, mysql rpm I've used is MySQL-server-5.5.34_wsrep_25.9-1.rhel5.x86_64.rpm and Galera rpm is galera-25.2.9-1.rhel5.x86_64.rpm haven't installed xtrabackup as yet. Running some tests on this setup, one of them being :-

Created a table having auto generated primary key and three integer columns.

CREATE TABLE `table1` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`col1` int(11) DEFAULT NULL,

`col2` int(11) DEFAULT NULL,

`col3` int(11) DEFAULT NULL,

PRIMARY KEY (`id`)

)

10 threads are running the following in loop

-Connect

-Insert a new row with random numbers rand1, rand2 and rand3 for col1, col2 and col3. get the auto generated id for the same.

-Disconnect

-Connect

-Update col3 for same id with another random number rand3b

-Disconnect

-Connect

-Select row for same id

-Disconnect

-Check values of col1, col2, col3

Close to 1% updates are failing because the data is not synced and around 1% times value of col3 comes to rand3 instead of rand3b which indicates that there is some lag. If I check wsrep% status on mysql instances its always in Synced. Is such behaviour expected or there might be some errors in my configuration. /etc/my.cnf look like this :-

[mysqld]

datadir=/var/lib/mysql

socket=/var/lib/mysql/mysql.sock

user=mysql

# Default to using old password format for compatibility with mysql 3.x

# clients (those using the mysqlclient10 compatibility package).

old_passwords=1

query_cache_size=0

binlog_format=ROW

default_storage_engine=innodb

innodb_autoinc_lock_mode=2

wsrep_provider=/usr/lib64/galera/libgalera_smm.so

wsrep_provider_options="gcache.size=2G; gcache.page_size=1G"

wsrep_cluster_address=gcomm://IP2,IP3

wsrep_cluster_name='my_galera_cluster'

wsrep_node_address='IP1'

wsrep_node_name='node1'

# Disabling symbolic-links is recommended to prevent assorted security risks;

# to do so, uncomment this line:

# symbolic-links=0

[mysqld_safe]

log-error=/var/log/mysqld.log

pid-file=/var/run/mysqld/mysqld.pid

I understand that this is not a real world problem as one won't be updating the immediately written data using a new connection. If the behaviour is expected how can these be mitigated.

Thanks in advance

-Apoorva

Daniel Black

unread,

Jul 2, 2014, 7:03:47 PM7/2/14

to Apoorva Gaurav, codersh...@googlegroups.com

> Close to 1% updates are failing because the data is not synced and

> around 1% times value of col3 comes to r a n d 3 instead of rand3b

> which indicates that there is some lag. If I check wsrep% status on
> mysql instances its always in Synced. Is such behaviour expected or
> there might be some errors in my configuration.

Its expected but admittedly not widely documented.

http://www.percona.com/files/presentations/percona-live/nyc-2012/PLNY12-synchronous-replication-use-cases.pdf slide 61+

And thanks for asking because I wouldn't have known about wsrep_causal_reads unless you had asked.

--
Daniel Black, Engineer @ Open Query (http://openquery.com.au)
Remote expertise & maintenance for MySQL/MariaDB server environments.

Apoorva Gaurav

unread,

Jul 2, 2014, 10:41:40 PM7/2/14

to codersh...@googlegroups.com, apoorva...@myntra.com, daniel...@openquery.com

Thanks Daniel. I've now setup the wsrep_causal_read=ON on all three mysql nodes. Percentage of stale reads has decreased to around .5% but still not fully eliminated.

Daniel Black

unread,

Jul 2, 2014, 11:06:26 PM7/2/14

to Apoorva Gaurav, codersh...@googlegroups.com

----- Original Message -----
> Thanks Daniel. I've now setup the wsrep_causal_read=ON on all three
> mysql nodes.

Its a per session variable only. Is this being set on all 3 connections?

> Percentage of stale reads has decreased to around .5% but
> still not fully eliminated.

If the above is the case, I can only imagine this with wsrep_provider_options=repl.commit_order=1

What SQL statements are you using for the insert/update?

Is is possibly that an update is being processed on a node before the insert?

Suggest keeping binlogs on each node and seeing what order statements are in those for the % that fail.

Apoorva Gaurav

unread,

Jul 3, 2014, 12:27:16 AM7/3/14

to Daniel Black, codersh...@googlegroups.com

Got it. Running the test with each connection having wsrep_causal_read=ON. Results are much better but sill a few (~.1%) inconsistent ones are there. Will dive deep into these using binlogs. So this setting should be used when consistency is utmost important else we can leave this OFF, isn't it?

--
Thanks & Regards,
Apoorva

Daniel Black

unread,

Jul 3, 2014, 12:54:41 AM7/3/14

to Apoorva Gaurav, codersh...@googlegroups.com

----- Original Message -----
> Got it. Running the test with each connection having
> wsrep_causal_read=ON. Results are much better but sill a few (~.1%)
> inconsistent ones are there. Will dive deep into these using binlogs.

Interested to know the results. It may actually constitute a bug. I suggest you publish your code in a bug report.

> So this setting should be used when consistency is utmost important
> else we can leave this OFF, isn't it?

correct. Having it ON incurred delay in table read statements.

Daniel Black

unread,

Jul 4, 2014, 8:00:13 AM7/4/14

to Apoorva Gaurav, codersh...@googlegroups.com

----- Original Message -----
> ----- Original Message -----
> > Got it. Running the test with each connection having
> > wsrep_causal_read=ON. Results are much better but sill a few (~.1%)
> > inconsistent ones are there. Will dive deep into these using
> > binlogs.
>
> Interested to know the results. It may actually constitute a bug. I
> suggest you publish your code in a bug report.

Is it possible that...
* A node that received an update statement had received
* the certification from all other cluster members report this to A node
* Node A has sent the apply message
* The Apply message hadn't been got received by node B
* Node B received a select statement related to the previous update statement, and even with wsrep_casual_read=ON, the certified but not in the apply queue write set, was not considered before the result of the select statement was generated

?

Apoorva Gaurav

unread,

Jul 4, 2014, 8:33:11 AM7/4/14

to Daniel Black, codersh...@googlegroups.com

On Fri, Jul 4, 2014 at 5:30 PM, Daniel Black <daniel...@openquery.com> wrote:

----- Original Message -----
> ----- Original Message -----
> > Got it. Running the test with each connection having
> > wsrep_causal_read=ON. Results are much better but sill a few (~.1%)
> > inconsistent ones are there. Will dive deep into these using
> > binlogs.
>
> Interested to know the results. It may actually constitute a bug. I
> suggest you publish your code in a bug report.

Is it possible that...

* A node that received an update statement had received
* the certification from all other cluster members report this to A node
* Node A has sent the apply message
* The Apply message hadn't been got received by node B
* Node B received a select statement related to the previous update statement, and even with wsrep_casual_read=ON, the certified but not in the apply queue write set, was not considered before the result of the select statement was generated

?

Anyway to check it out.

--
Daniel Black, Engineer @ Open Query (http://openquery.com.au)
Remote expertise & maintenance for MySQL/MariaDB server environments.

Daniel Black

unread,

Jul 4, 2014, 7:43:56 PM7/4/14

to Apoorva Gaurav, codersh...@googlegroups.com

----- Original Message -----
> On Fri, Jul 4, 2014 at 5:30 PM, Daniel Black <
> daniel...@openquery.com > wrote:
>
>
>
>
>
> ----- Original Message -----
> > ----- Original Message -----
> > > Got it. Running the test with each connection having
> > > wsrep_causal_read=ON. Results are much better but sill a few
> > > (~.1%)
> > > inconsistent ones are there. Will dive deep into these using
> > > binlogs.
> >
> > Interested to know the results. It may actually constitute a bug. I
> > suggest you publish your code in a bug report.
>
> Is it possible that...
>
> * A node that received an update statement had received
> * the certification from all other cluster members report this to A
> node
> * Node A has sent the apply message
> * The Apply message hadn't been got received by node B
> * Node B received a select statement related to the previous update
> statement, and even with wsrep_casual_read=ON, the certified but not
> in the apply queue write set, was not considered before the result of
> the select statement was generated
>
>
>
> ?
>
> Anyway to check it out.

Look at the code. I was more asking the galera devs here if the above could be the case.

Alternately to determine its a bug, at the same time (i.e. same select statement or following one) you select the updated value, select the the hostname and binary log position or gtid. When the non-updated result is retrieved look at the binlog and see if your select came before the update.

You're on your own now. I think I've helped enough.

alexey.y...@galeracluster.com

unread,

Jul 5, 2014, 6:10:43 AM7/5/14

to codersh...@googlegroups.com

On 2014-07-02 22:13, Apoorva Gaurav wrote:
> Hello All,
>
> I'm just started evaluating Galera so just a novice here. I've spawned
> a 3
> node Galera cluster on AWS EC2 instances and accessing it via HAProxy.

> These are *c3.large* instances running *CentOS 5.10*, mysql rpm I've
> used
> is *MySQL-server-5.5.34_wsrep_25.9-1.rhel5.x86_64.rpm *and Galera rpm
> is
> *galera-25.2.9-1.rhel5.x86_64.rpm* haven't installed xtrabackup as

> yet. Running some tests on this setup, one of them being :-
> Created a table having auto generated primary key and three integer
> columns.
>

> *CREATE TABLE `table1` (*
> * `id` int(11) NOT NULL AUTO_INCREMENT,*
> * `col1` int(11) DEFAULT NULL,*
> * `col2` int(11) DEFAULT NULL,*
> * `col3` int(11) DEFAULT NULL,*
> * PRIMARY KEY (`id`)*
> *)*

>
> 10 threads are running the following in loop
> -Connect

> -Insert a new row with random numbers *rand1, rand2 *and
> *r**a**n**d**3* for
> *col1, col2 *and* col3*. get the auto generated id for the same.
> -Disconnect
> -Connect
> -Update *col3* for same *id *with another random number *rand3b*

At this point you're hitting the issue
https://bugs.launchpad.net/codership-mysql/+bug/1277053.
wsrep_causal_reads does not affect autocommit writes.

However if you wrap it in BEGIN/COMMIT clauses, you should see 100%
synchronous results.

> -Disconnect
> -Connect
> -Select row for same *id*
> -Disconnect
> -Check values of *col1, col2, **col3*

>
> Close to 1% updates are failing because the data is not synced and

> around 1% times value of *col3 *comes to *r**a**n**d**3* instead of
> *rand3b
> *which indicates that there is some lag. If I check wsrep% status on

> mysql
> instances its always in Synced. Is such behaviour expected or there
> might
> be some errors in my configuration. /etc/my.cnf look like this :-

> *[mysqld]*
> *datadir=/var/lib/mysql*
> *socket=/var/lib/mysql/mysql.sock*
> *user=mysql*
> *# Default to using old password format for compatibility with mysql
> 3.x*
> *# clients (those using the mysqlclient10 compatibility package).*
> *old_passwords=1*
>
> *query_cache_size=0*
> *binlog_format=ROW*
> *default_storage_engine=innodb*
> *innodb_autoinc_lock_mode=2*
>
> *wsrep_provider=/usr/lib64/galera/libgalera_smm.so*
> *wsrep_provider_options="gcache.size=2G; gcache.page_size=1G"*
> *wsrep_cluster_address=gcomm://IP2,IP3*
> *wsrep_cluster_name='my_galera_cluster'*
> *wsrep_node_address='IP1'*
> *wsrep_node_name='node1'*
>
> *# Disabling symbolic-links is recommended to prevent assorted security
> risks;*
> *# to do so, uncomment this line:*
> *# symbolic-links=0*
>
> *[mysqld_safe]*
> *log-error=/var/log/mysqld.log*
> *pid-file=/var/run/mysqld/mysqld.pid*

Apoorva Gaurav

unread,

Jul 5, 2014, 8:57:18 AM7/5/14

to alexey.y...@galeracluster.com, codersh...@googlegroups.com

Thank Alexy, trying with start transaction / commit (since I'm using JDBC its conn.setAutoCommit(true)). Results are even better but still not 100% consistent. I think I can live with it.

--
You received this message because you are subscribed to the Google Groups "codership" group.
To unsubscribe from this group and stop receiving emails from it, send an email to codership-team+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

alexey.y...@galeracluster.com

unread,

Jul 5, 2014, 9:23:51 AM7/5/14

to codersh...@googlegroups.com

On 2014-07-05 15:56, Apoorva Gaurav wrote:
> Thank Alexy, trying with start transaction / commit (since I'm using
> JDBC
> its conn.setAutoCommit(true)). Results are even better but still not
> 100%
> consistent. I think I can live with it.

Either you are not setting wsrep_causal_reads for all updating and
selecting connections or we have a bug there. Perhaps if you posted the
actual SQL you're using there we could try to reproduce it.

Also, does your application correctly handle all errors, including
connection errors. What I'm hinting at is that some inserts or updates
may fail for whatever reason and select should also predictably fail,
but that won't constitute synchronization failure.

>> email to codership-tea...@googlegroups.com.

Apoorva Gaurav

unread,

Jul 7, 2014, 1:46:49 AM7/7/14

to Alexey Yurchenko, codersh...@googlegroups.com

I can post the Java code I've written if that can help

On Sat, Jul 5, 2014 at 6:53 PM, <alexey.y...@galeracluster.com> wrote:

On 2014-07-05 15:56, Apoorva Gaurav wrote:

Thank Alexy, trying with start transaction / commit (since I'm using JDBC
its conn.setAutoCommit(true)). Results are even better but still not 100%
consistent. I think I can live with it.

Either you are not setting wsrep_causal_reads for all updating and selecting connections or we have a bug there. Perhaps if you posted the actual SQL you're using there we could try to reproduce it.

Also, does your application correctly handle all errors, including connection errors. What I'm hinting at is that some inserts or updates may fail for whatever reason and select should also predictably fail, but that won't constitute synchronization failure.

email to codership-team+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
Thanks & Regards,
Apoorva

--
You received this message because you are subscribed to the Google Groups "codership" group.

To unsubscribe from this group and stop receiving emails from it, send an email to codership-team+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Apoorva Gaurav

unread,

Jul 7, 2014, 3:46:47 AM7/7/14

to Alexey Yurchenko, codersh...@googlegroups.com

Here it is, README.txt mentions the setup, can describe in detail if needed.

galeratest.zip

Okan Civelek

unread,

Feb 13, 2015, 8:23:48 AM2/13/15

to codersh...@googlegroups.com, alexey.y...@galeracluster.com

Hi Apoorva,

Have you solved your problem? I'm having same problem right now...

email to codership-tea...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
Thanks & Regards,
Apoorva

--
You received this message because you are subscribed to the Google Groups "codership" group.

To unsubscribe from this group and stop receiving emails from it, send an email to codership-tea...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
Thanks & Regards,
Apoorva

Apoorva Gaurav

unread,

Feb 22, 2015, 8:18:32 AM2/22/15

to Okan Civelek, codersh...@googlegroups.com, Alexey Yurchenko

setting wsrep_causal_reads helped

Reply all

Reply to author

Forward