Node unable to join after updating because of CVE-2015-0235

Akulatraxas Drak

Jan 29, 2015, 5:05:26 AM

to codersh...@googlegroups.com

Hey!

Yesterday we started updating our RHEL Percona XtraDB Cluster because of CVE-2015-0235.

We are running MySQL 5.5.39-36.0-55-log / wsrep_25.11.r4023 2.11(r318911d).

After updating the OS, including glibc to 2.12-1.149, the updated node was unable to join the cluster.

-------------------------------------------------------------------------------

150128 16:42:15 [Note] WSREP: wsrep_sst_grab()

150128 16:42:15 [Note] WSREP: Start replication

150128 16:42:15 [Note] WSREP: Setting initial position to 5100e81e-96d7-11e2-0800-e237b5e6b647:1664231

150128 16:42:15 [Note] WSREP: protonet asio version 0

150128 16:42:15 [Note] WSREP: backend: asio

150128 16:42:15 [Note] WSREP: GMCast version 0

150128 16:42:15 [Note] WSREP: (3babb45c, 'tcp://192.168.28.22:4587') listening at tcp://192.168.28.22:4587

150128 16:42:15 [Note] WSREP: (3babb45c, 'tcp://192.168.28.22:4587') multicast: , ttl: 1

150128 16:42:15 [Note] WSREP: EVS version 0

150128 16:42:15 [Note] WSREP: PC version 0

150128 16:42:15 [Note] WSREP: gcomm: connecting to group 'Galera_VM_Play', peer 'robert.sql:4587'

150128 16:42:18 [Warning] WSREP: no nodes coming from prim view, prim not possible

150128 16:42:18 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50345S), skipping check

150128 16:42:48 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)

at gcomm/src/pc.cpp:connect():141

150128 16:42:48 [ERROR] WSREP: gcs/src/gcs_core.cpp:long int gcs_core_open(gcs_core_t*, const char*, const char*, bool)():204: Failed to open backend connection: -110 (Connection timed out)

150128 16:42:48 [ERROR] WSREP: gcs/src/gcs.cpp:long int gcs_open(gcs_conn_t*, const char*, const char*, bool)():1303: Failed to open channel 'Galera_VM_Play' at 'gcomm://robert.sql:4587': -110 (Connection timed out)

150128 16:42:48 [ERROR] WSREP: gcs connect failed: Connection timed out

150128 16:42:48 [ERROR] WSREP: wsrep::connect() failed: 7

150128 16:42:48 [ERROR] Aborting

-------------------------------------------------------------------------------

No connection is made from the joining node to the cluster at all. Not a single packet leaves the node.

We searched for quite some time. After commenting out the provider options line below, it suddenly worked:

wsrep_provider_options = "ist.recv_addr=urmond.sql:4588;gmcast.listen_addr=tcp://urmond.sql:4587;gcache.size=2G; gcache.page_size=1G;"

The option that makes the cluster-connect fail is:

gmcast.listen_addr=tcp://urmond.sql:4587

It works if we change this to:

gmcast.listen_addr=tcp://<ip address of urmond.sql>:4587
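If the hostname has to stay in the configuration source, the substitution above could be generated at deploy time. The following is only a rough sketch of that idea: `first_ipv4` and `pin_listen_addr` are hypothetical helper names, not part of Galera or Percona tooling, and the sketch assumes the options string uses the `key=value;` layout shown above.

```python
import socket
from urllib.parse import urlsplit


def first_ipv4(host):
    """Return the first IPv4 address the system resolver gives for host."""
    infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    return infos[0][4][0]


def pin_listen_addr(options, resolve=first_ipv4):
    """Return options with the gmcast.listen_addr host replaced by an IP.

    Every other option (including ist.recv_addr) is passed through as-is.
    """
    out = []
    for item in options.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, _, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if key == "gmcast.listen_addr":
            url = urlsplit(value)  # e.g. tcp://urmond.sql:4587
            port = f":{url.port}" if url.port else ""
            value = f"{url.scheme}://{resolve(url.hostname)}{port}"
        out.append(f"{key}={value}")
    return ";".join(out) + ";"
```

Calling `pin_listen_addr(...)` on the `wsrep_provider_options` value on the node itself would produce the IP-based form; the `resolve` parameter exists only so the lookup can be replaced, for example in a test.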

My first thought is that this is directly related to the glibc fix. Can somebody confirm this?

I would like to keep using the hostname in this option.

Greetings

Aku

alexey.y...@galeracluster.com

Jan 29, 2015, 6:19:54 AM

to Akulatraxas Drak, codersh...@googlegroups.com

Hi,

It does look like it may be glibc related, but what about the name service
configuration? Could it have changed? Does ping urmond.sql work? What does it
resolve to?
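Besides ping, the lookup can be checked through the same system resolver path (getaddrinfo) that applications use. A minimal sketch, assuming Python is available on the node; `resolve_all` is a hypothetical helper name, and the `resolver` parameter is there only so the lookup can be swapped out:

```python
import socket


def resolve_all(host, port, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses that host resolves to."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Host and port taken from the gmcast.listen_addr setting in question.
    print(resolve_all("urmond.sql", 4587))
```

A `socket.gaierror`, or an address other than the one the node should listen on, would point at the name service rather than at Galera.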

Akulatraxas Drak

Jan 29, 2015, 10:08:21 AM

to codersh...@googlegroups.com, akulatr...@googlemail.com

Hi,

The first thing I tried was telnetting from urmond to robert.sql:4587. It works and I am seeing an answer from the other side.

Name resolution works in the whole cluster, forward and reverse. So no, no problem there.
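The telnet test can also be scripted so it is easy to repeat on every node. This is only a sketch of a plain TCP connect check (`can_connect` is a hypothetical helper name); it shows that the port accepts connections and does not exercise the Galera group communication handshake itself:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False


if __name__ == "__main__":
    # Peer address taken from the log above.
    print(can_connect("robert.sql", 4587))
```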

Greetz
