case $OS in
FreeBSD)
    local port_info="$(sockstat -46lp ${rsync_port} 2>/dev/null | \
        grep ":${rsync_port}")"
    local is_rsync="$(echo $port_info | \
        grep -w '[[:space:]]\+rsync[[:space:]]\+'"$rsync_pid" 2>/dev/null)"
    ;;
*)
    if ! which lsof > /dev/null; then
        wsrep_log_error "lsof tool not found in PATH! Make sure you have it installed."
        exit 2 # ENOENT
    fi
    local port_info="$(lsof -i :$rsync_port -Pn 2>/dev/null | \
        grep "(LISTEN)")"
    local is_rsync="$(echo $port_info | \
        grep -w '^rsync[[:space:]]\+'"$rsync_pid" 2>/dev/null)"
    ;;
esac
local is_listening_all="$(echo $port_info | \
    grep "*:$rsync_port" 2>/dev/null)"
local is_listening_addr="$(echo $port_info | \
    grep "$rsync_addr:$rsync_port" 2>/dev/null)"
if [ ! -z "$is_listening_all" -o ! -z "$is_listening_addr" ]; then
    if [ -z "$is_rsync" ]; then
        wsrep_log_error "rsync daemon port '$rsync_port' has been taken"
        exit 16 # EBUSY
    fi
fi
# sockstat -46lp 4444
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
# netstat -an
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 36 172.16.15.21.22 192.168.120.224.44212 ESTABLISHED
tcp4 0 0 *.22 *.* LISTEN
udp4 0 0 127.0.0.1.123 *.*
udp6 0 0 fe80::1%lo0.123 *.*
udp6 0 0 ::1.123 *.*
udp4 0 0 172.16.15.21.123 *.*
udp4 0 0 *.123 *.*
udp6 0 0 *.123 *.*
udp4 0 0 *.514 *.*
udp6 0 0 *.514 *.*
Active UNIX domain sockets
Address Type Recv-Q Send-Q Inode Conn Refs Nextref Addr
fffff8001a6b0870 stream 0 0 0 fffff8001a6b0960 0 0
fffff8001a6b0960 stream 0 0 0 fffff8001a6b0870 0 0
fffff8001a6b0b40 stream 0 0 fffff8001a6f8588 0 0 0 /var/run/devd.pipe
fffff8001a6b05a0 dgram 0 0 0 fffff8001a6b5780 0 fffff8001a6b5690
fffff8001a6b5690 dgram 0 0 0 fffff8001a6b5780 0 0
fffff8001a6b5780 dgram 0 0 fffff8001a664ce8 0 fffff8001a6b05a0 0 /var/run/logpriv
fffff8001a6b5870 dgram 0 0 fffff8003acaa000 0 0 0 /var/run/log
fffff8001a6b0a50 seqpac 0 0 fffff8001a6f83b0 0 0 0 /var/run/devd.seqpacket.pipe
>
> I found the source script that generates this error here:
> /usr/local/bin/wsrep_sst_rsync and the code that is relevant is below:
> [...]
Just to be sure, could you modify the script so that the contents of all
variables in question are logged (adding "set>/tmp/wsrep_sst_rsync.log"
right in front of the call to "wsrep_log_error" should be sufficient)
and provide us with the filtered/sanitised output?
(This, e.g., will show whether $OS really is "FreeBSD" as expected.)
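
For illustration, the suggested logging line could be placed roughly as in the sketch below. This is not the actual wsrep_sst_rsync code: the wsrep_log_error stand-in, the check_port function and the dump file name are assumptions made only so the fragment runs on its own, and it uses return where the real script calls exit.

```shell
# Stand-in for the real helper (assumption: it simply writes to stderr).
wsrep_log_error() { echo "ERROR: $*" >&2; }

# Hypothetical dump location; the thread itself uses /tmp/wsrep_sst_rsync.log.
dump_file="${TMPDIR:-/tmp}/wsrep_sst_rsync_demo.$$.log"

# Hypothetical wrapper around the relevant branch of the script.
check_port() {
    rsync_port=4444
    is_rsync=''
    if [ -z "$is_rsync" ]; then
        # Dump all shell variables right before the error is reported.
        set > "$dump_file"
        wsrep_log_error "rsync daemon port '$rsync_port' has been taken"
        return 16 # EBUSY (the real script exits here)
    fi
}

status=0
check_port || status=$?

# Show only the variables of interest from the dump.
grep -E '^(rsync_port|is_rsync)=' "$dump_file"
```

The exact quoting of the dumped values depends on the shell that runs the script, so the dump is best filtered by variable name rather than by value.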
On Fri, Sep 8, 2017 at 3:56 PM, Markus Ueberall <uebe...@projektzentrisch.de> wrote:
>
> I found the source script that generates this error here:
> /usr/local/bin/wsrep_sst_rsync and the code that is relevant is below:
> [...]
Just to be sure, could you modify the script so that the contents of all
variables in question are logged (adding "set>/tmp/wsrep_sst_rsync.log"
right in front of the call to "wsrep_log_error" should be sufficient)
and provide us with the filtered/sanitised output?
(not sure if I should remove anything from this output...)
(This, e.g., will show whether $OS really is "FreeBSD" as expected.)
[...]
# cat /tmp/wsrep_sst_rsync.log
is_listening_addr='mysql rsync 4517 5 tcp4 172.16.15.21:4444 *:*'
is_listening_all=''
is_rsync=''
pid_file=/var/db/mysql//rsync_sst.pid
port_info='mysql rsync 4517 5 tcp4 172.16.15.21:4444 *:*'
rsync_addr=172.16.15.21
rsync_pid=4517
rsync_port=4444
[...]
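
The dumped values can be replayed against the final check of the script to see which branch is taken. The sketch below hard-codes the values from the dump above; the verdict variable and the "|| true" guards are additions for illustration only, and the asterisk in the first pattern is escaped explicitly, which the original script does not do.

```shell
# Values copied from the variable dump in this thread.
port_info='mysql rsync 4517 5 tcp4 172.16.15.21:4444 *:*'
rsync_addr=172.16.15.21
rsync_port=4444
is_rsync=''

# Same two checks as in the script; "|| true" only keeps the sketch
# usable when the shell runs with "set -e".
is_listening_all="$(echo "$port_info" | grep "\*:$rsync_port" 2>/dev/null || true)"
is_listening_addr="$(echo "$port_info" | grep "$rsync_addr:$rsync_port" 2>/dev/null || true)"

# Hypothetical variable recording which branch the script would take.
if [ -n "$is_listening_all" ] || [ -n "$is_listening_addr" ]; then
    if [ -z "$is_rsync" ]; then
        verdict="port taken"      # branch that logs the error and exits 16
    else
        verdict="ok"
    fi
else
    verdict="not listening"
fi

echo "is_listening_all='$is_listening_all'"
echo "is_listening_addr='$is_listening_addr'"
echo "verdict=$verdict"
```

With these inputs the address check matches while is_rsync is empty, which is the combination that leads to the "has been taken" error.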
(I'm using the browser-based Google Groups UI for this because neither my initial post nor your answer below landed in my mailbox.)

After looking at the variable contents and playing with a test script consisting only of the relevant fragments, I conclude that the problem is somehow related to the way $is_rsync is initialised. When I execute the command in question on the command line, the output is as expected; however, when executing the same command in a subshell, the 'grep part' fails.

As a quick fix, please replace the following lines in the wsrep_sst_rsync script reading

local is_rsync="$(echo $port_info | \
grep -w '[[:space:]]\+rsync[[:space:]]\+'"$rsync_pid" 2>/dev/null)"

with

local is_rsync="$(echo "$port_info" | grep -w rsync.*"$rsync_pid" 2>/dev/null)"

and rerun your test. (This is not exactly the same, but a good enough regular expression which should not lead to false positives.)
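
The replacement expression can be tried outside the script against the port_info value from the dump. The sketch below is only a quick check under stated assumptions: match_rsync is a hypothetical helper, the whole pattern is quoted (a small deviation from the line above, so the shell cannot glob-expand rsync.*), and the padded string is made-up column spacing used solely to show what the added quotes around $port_info change. It does not claim that the quoting is the root cause.

```shell
# port_info as captured in the variable dump earlier in this thread.
port_info='mysql rsync 4517 5 tcp4 172.16.15.21:4444 *:*'

# Hypothetical helper: $1 is the PID to look for.
match_rsync() {
    echo "$port_info" | grep -w "rsync.*$1" 2>/dev/null || true
}

is_rsync="$(match_rsync 4517)"   # the PID from the dump
other="$(match_rsync 451)"       # a different PID that is a prefix of 4517

echo "is_rsync='$is_rsync'"
echo "other='$other'"

# Assumed column padding, for illustration only.
padded='mysql    rsync      4517  5  tcp4   172.16.15.21:4444'
collapsed="$(echo $padded)"      # unquoted: runs of blanks become one space
preserved="$(echo "$padded")"    # quoted: spacing is kept as-is

echo "collapsed='$collapsed'"
echo "preserved='$preserved'"
```

Because of -w, the PID has to end at a word boundary, so a shorter PID such as 451 is not accepted as a match for 4517.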
[N]ow I get a new error:
[...]
2017-09-08 22:24:39 34820008448 [Note] WSREP: Prepared SST request: rsync|172.16.15.21:4444/rsync_sst
2017-09-08 22:24:39 34820008448 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-09-08 22:24:39 34820008448 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-09-08 22:24:39 34820008448 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2017-09-08 22:24:39 34424858112 [Note] WSREP: Service thread queue flushed.
2017-09-08 22:24:39 34820008448 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (2749a781-9248-11e7-83a2-0262448701d2): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2017-09-08 22:24:39 34820005888 [Note] WSREP: Member 0.0 (hout3) requested state transfer from '*any*'. Selected 1.0 (hout2)(SYNCED) as donor.
2017-09-08 22:24:39 34820005888 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
2017-09-08 22:24:39 34820008448 [Note] WSREP: Requesting state transfer: success, donor: 1
2017-09-08 22:24:39 34820008448 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 2749a781-9248-11e7-83a2-0262448701d2:0
2017-09-08 22:24:42 34424859392 [Note] WSREP: (bd2fcb2e, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-09-08 22:25:14 34820005888 [Warning] WSREP: 1.0 (hout2): State transfer to 0.0 (hout3) failed: -255 (Unknown error: 255)
2017-09-08 22:25:14 34820005888 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():736: Will never receive state. Need to abort.
2017-09-08 22:25:14 34820005888 [Note] WSREP: gcomm: terminating thread
2017-09-08 22:25:14 34820005888 [Note] WSREP: gcomm: joining thread
2017-09-08 22:25:14 34820005888 [Note] WSREP: gcomm: closing backend
There's no firewall between any of these machines; they're on their own network segment.
I'll check the hout2 logs tonight.
regards,
Roland
(sent from my phone)
I cannot, however, find any reference to /test in the config files. Where does this come from?