Node unable to join after updating because of CVE-2015-0235

Akulatraxas Drak

Jan 29, 2015, 5:05:26 AM
to codersh...@googlegroups.com
Hey!

Yesterday we started updating our RHEL Percona XtraDB Cluster because of CVE-2015-0235.
We are running MySQL 5.5.39-36.0-55-log / wsrep_25.11.r4023 2.11(r318911d).

After updating the OS, including glibc to 2.12-1.149, the updated node was unable to join the cluster.

-------------------------------------------------------------------------------
150128 16:42:15 [Note] WSREP: wsrep_sst_grab()
150128 16:42:15 [Note] WSREP: Start replication
150128 16:42:15 [Note] WSREP: Setting initial position to 5100e81e-96d7-11e2-0800-e237b5e6b647:1664231
150128 16:42:15 [Note] WSREP: protonet asio version 0
150128 16:42:15 [Note] WSREP: backend: asio
150128 16:42:15 [Note] WSREP: GMCast version 0
150128 16:42:15 [Note] WSREP: (3babb45c, 'tcp://192.168.28.22:4587') listening at tcp://192.168.28.22:4587
150128 16:42:15 [Note] WSREP: (3babb45c, 'tcp://192.168.28.22:4587') multicast: , ttl: 1
150128 16:42:15 [Note] WSREP: EVS version 0
150128 16:42:15 [Note] WSREP: PC version 0
150128 16:42:15 [Note] WSREP: gcomm: connecting to group 'Galera_VM_Play', peer 'robert.sql:4587'
150128 16:42:18 [Warning] WSREP: no nodes coming from prim view, prim not possible
150128 16:42:18 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50345S), skipping check
150128 16:42:48 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():141
150128 16:42:48 [ERROR] WSREP: gcs/src/gcs_core.cpp:long int gcs_core_open(gcs_core_t*, const char*, const char*, bool)():204: Failed to open backend connection: -110 (Connection timed out)
150128 16:42:48 [ERROR] WSREP: gcs/src/gcs.cpp:long int gcs_open(gcs_conn_t*, const char*, const char*, bool)():1303: Failed to open channel 'Galera_VM_Play' at 'gcomm://robert.sql:4587': -110 (Connection timed out)
150128 16:42:48 [ERROR] WSREP: gcs connect failed: Connection timed out
150128 16:42:48 [ERROR] WSREP: wsrep::connect() failed: 7
150128 16:42:48 [ERROR] Aborting
-------------------------------------------------------------------------------

The joining node makes no connection to the cluster at all. Not a single packet leaves the node.
We searched for quite some time. After commenting out the provider options, it suddenly worked:

wsrep_provider_options          = "ist.recv_addr=urmond.sql:4588;gmcast.listen_addr=tcp://urmond.sql:4587;gcache.size=2G; gcache.page_size=1G;"

The option that makes the cluster connect fail is:
gmcast.listen_addr=tcp://urmond.sql:4587
If we change it to the following, the node joins:
gmcast.listen_addr=tcp://<ip address of urmond.sql>:4587
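To see at a glance which provider options depend on name resolution, the option string can be split and each address checked for an IP literal. This is only a minimal sketch: the helper names are hypothetical, and treating every key that ends in `_addr` as an address option is an assumption, not something Galera defines.

```python
import ipaddress
from urllib.parse import urlsplit


def parse_provider_options(value):
    """Split a wsrep_provider_options string into a {key: value} dict."""
    options = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, _, val = item.partition("=")
        options[key.strip()] = val.strip()
    return options


def host_of(address):
    """Return the host part of 'tcp://host:port' or plain 'host:port'."""
    if "://" not in address:
        address = "tcp://" + address  # give urlsplit a scheme to work with
    return urlsplit(address).hostname


def is_ip_literal(host):
    """True if host is an IPv4/IPv6 literal, False if it is a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def hostname_options(value):
    """Names of '*_addr' options (assumed naming) whose host is not an IP."""
    opts = parse_provider_options(value)
    return sorted(
        key for key, val in opts.items()
        if key.endswith("_addr") and not is_ip_literal(host_of(val))
    )


opts = ("ist.recv_addr=urmond.sql:4588;gmcast.listen_addr=tcp://urmond.sql:4587;"
        "gcache.size=2G; gcache.page_size=1G;")
print(hostname_options(opts))
```

For the option string above, both `gmcast.listen_addr` and `ist.recv_addr` are reported, since each names `urmond.sql` rather than an address.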

My first thought is that this is directly related to the fix in glibc. Can somebody confirm this?
I would like to use the hostname in this option.

Greetings
Aku

alexey.y...@galeracluster.com

Jan 29, 2015, 6:19:54 AM
to Akulatraxas Drak, codersh...@googlegroups.com
Hi,

It does look like it may be glibc related, but what about the name service
configuration? Could it have changed? Does ping urmond.sql work? What does it
resolve to?
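
If it helps, here is a rough sketch of how those questions could be checked from the node itself through the system resolver, forward and reverse. The function names are made up for illustration, and the script only shows what the resolver returns to Python. It does not prove what the Galera provider sees.

```python
import socket


def resolve(name, port=None):
    """Return the sorted addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def reverse(address):
    """Return the primary name for an address, or None if lookup fails."""
    try:
        return socket.gethostbyaddr(address)[0]
    except OSError:
        return None


def report(names, port):
    """Print forward and reverse resolution for each name."""
    for name in names:
        try:
            addresses = resolve(name, port)
        except socket.gaierror as exc:
            print(name, "does not resolve:", exc)
            continue
        for address in addresses:
            print(name, "->", address, "->", reverse(address))
```

Calling `report(["urmond.sql", "robert.sql"], 4587)` on the joining node would show whether `urmond.sql` maps to the address the node is expected to listen on.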

Akulatraxas Drak

Jan 29, 2015, 10:08:21 AM
to codersh...@googlegroups.com, akulatr...@googlemail.com
Hi,

The first thing I tried was telnetting from urmond to robert.sql:4587. It works and I am seeing an answer from the other side.
Name resolution works in the whole cluster, forward and reverse. So no, no problem here.

Greetz