Thanks. Some more information from my diagnosis.
In summary:
* The -A option is causing grief for VIPs on another server. Removing it
gives me a partial solution.
An arping of an IP on the current server/interface fails. I have yet to
confirm whether you do this, or whether it works, on RH.
* I was considering Heartbeat; Monty T tells me they have written their
own ARP implementation.
$ sudo arping -V
arping: invalid option -- V
ARPing 2.05, by Thomas Habets <tho...@habets.pp.se>
usage: arping [ -0aAbdFpqrRuv ] [ -w <us> ] [ -S <host/ip> ] [ -T <host/ip ]
[ -s <MAC> ] [ -t <MAC> ] [ -c <count> ] [ -i <interface> ]
<host/ip/MAC | -B>
There is no version argument, but the version appears in the output.
$ ./flipper developer status 2>/dev/null
MASTERPAIR: developer
NODE: beta181 has read IP, is read-only, replication running, 0s delay
NODE: alpha187 has write IP, is writable, replication running, 0s delay
$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.187 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: 2002:44a1:ed31:1234:216:3eff:fe58:f864/64 Scope:Global
inet6 addr: fe80::216:3eff:fe58:f864/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7255 errors:0 dropped:0 overruns:0 frame:0
TX packets:1705 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:700657 (684.2 KB) TX bytes:257208 (251.1 KB)
eth0:1 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.81 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
The other server:
$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3e:a8:c8:41
inet addr:192.168.2.181 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: 2002:44a1:ed31:1234:216:3eff:fea8:c841/64 Scope:Global
inet6 addr: fe80::216:3eff:fea8:c841/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1694898 errors:0 dropped:0 overruns:0 frame:0
TX packets:52793 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:344595511 (328.6 MB) TX bytes:6034185 (5.7 MB)
eth0:0 Link encap:Ethernet HWaddr 00:16:3e:a8:c8:41
inet addr:192.168.2.71 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
$ sudo /usr/sbin/arping -I eth0 -c 5 192.168.2.71
ARPING 192.168.2.71
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=0 time=107.765 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=1 time=241.041 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=2 time=214.100 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=3 time=256.062 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=4 time=237.226 usec
--- 192.168.2.71 statistics ---
5 packets transmitted, 5 packets received, 0% unanswered
$ sudo /usr/sbin/arping -I eth0 -c 5 -A 192.168.2.71
ARPING 192.168.2.71
--- 192.168.2.71 statistics ---
5 packets transmitted, 0 packets received, 100% unanswered
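The two runs above differ only in their statistics line. As a minimal sketch (not part of flipper; the function name arping_unanswered_pct is hypothetical), a wrapper could pull the unanswered percentage out of arping's summary and treat anything other than 0 as a failure, assuming the summary format shown in the output above:

```shell
#!/bin/sh
# Hypothetical helper: read arping output on stdin and print the
# "unanswered" percentage from the statistics line, e.g.
#   5 packets transmitted, 0 packets received, 100% unanswered
arping_unanswered_pct() {
    sed -n 's/.* \([0-9][0-9]*\)% unanswered.*/\1/p'
}

# The two summary lines captured above, without and with -A.
echo "5 packets transmitted, 5 packets received, 0% unanswered" | arping_unanswered_pct
echo "5 packets transmitted, 0 packets received, 100% unanswered" | arping_unanswered_pct
```

In a real check the function would be fed from the arping command itself rather than from echo.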
arping(8) arping(8)
NAME
arping - sends arp and/or ip pings to a given host
SYNOPSIS
arping [-abhpqrRd0uv] [-S host/ip] [-T host/ip] [-s MAC] [-t MAC]
[-c count] [-i interface] [ -w us ] <host | -B>
DESCRIPTION
The arping utility sends ARP and/or ICMP requests to the specified host
and displays the replies. The host may be specified by its hostname,
its IP address, or its MAC address.
One request is sent each second.
When pinging an IP an ARP who-has query is sent. When pinging a MAC
address a directed broadcast ICMP Echo request is sent. For more tech-
nical explaination and an FAQ, see the README file.
Important note on timing
ARP packets are usually replied to (on a LAN) so fast that the OS task
scheduler can't keep up to get exact enough timing. On an idle system
the roundtrip times will be pretty much accurate, but with more load
the timing gets less exact.
To get more exact timing on a non-idle system, re-nice arping to -15 or
so.
# nice -n -15 arping foobar
This is not just an issue with arping, it is with normal ping also (at
least it is on my system). But it doesn't show up as much with ping
since arping packets (when pinging IP) doesn't traverse the IP stack
when received and are therefore replied to faster.
OPTIONS
-0 Use this option to ping with source IP address 0.0.0.0. Use this
when you haven't configured your interface yet. Note that this
may get the MAC-ping unanswered. This is an alias for -S
0.0.0.0.
-a Audiable ping.
-A Only count addresses matching requested address (This *WILL*
break most things you do. Only useful if you are arpinging many
hosts at once. See arping-scan-net.sh for an example).
-b Like -0 but source broadcast source address (255.255.255.255).
Note that this may get the arping unanswered since it's not nor-
mal behavior for a host.
-B Use instead of host if you want to address 255.255.255.255.
-c count
Only send count requests.
-d Find duplicate replies.
-F Don't try to be smart about the interface name. (even if this
switch is not given, -i overrides smartness.
-h Displays a help message and exits.
-i interface
Use the specified interface.
-q Does not display messages, except error messages.
-r Raw output: only the MAC/IP address is displayed for each reply.
-R Raw output: Like -r but shows "the other one", can be combined
with -r.
-s MAC Set source MAC address. You may need to use -p with this.
-S IP Like -b and -0 but with set source address. Note that this may
get the arping unanswered if the target does not have routing to
the IP. If you don't own the IP you are using, you may need to
turn on promiscious mode on the interface (with -p). With this
switch you can find out what IP-address a host has without tak-
ing an IP-address yourself.
-t MAC Set target MAC address to use when pinging IP address.
-T IP Use -T as target address when pinging MACs that won't respond to
a broadcast ping but perhaps to a directed broadcast.
Example:
To check the address of MAC-A, use knowledge of MAC-B and IP-B.
$ arping -S <IP-B> -s <MAC-B> -p <MAC-A>
-p Turn on promiscious mode on interface, use this if you don't
"own" the MAC address you are using.
-u Show index=received/sent instead of just index=received when
pinging MACs.
-v Verbose output. Use twice for more messages.
-w (arping 2.x only) Time to wait between pings, in microseconds.
BUGS
You have to use -B instead of arpinging 255.255.255.255, and -b instead
of -S 255.255.255.255. This is libnets fault.
SEE ALSO
ping(8), arp(8), rarp(8)
AUTHOR
Arping was written by Thomas Habets <tho...@habets.pp.se>.
arping 21th June, 2003 arping(8)
I know nothing about ARP, so I'm not sure: is this necessary, or can
this part be skipped?
However, obviously, if the -U option (== "Unsolicited ARP mode, update
your neighbours") is necessary, that's a bigger problem.
Ronald
$ ./flipper developer status 2> /dev/null
MASTERPAIR: developer
NODE: beta181 is read-only, replication running, 0s delay
NODE: alpha187 is writable, replication running, 0s delay
WARNING: No node has the read IP
WARNING: No node has the write IP
$ ./flipper developer set write beta181
Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.
WARNING: write IP is not up on alpha187 node.
WARNING: Won't attempt to take down write IP on alpha187 node.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
ERROR: Error occurred when executing ssh -l flipper 192.168.2.181
sudo /usr/sbin/arping -I eth0 -c 5 192.168.2.81
ERROR: Couldn't execute send ARP commmand /usr/sbin/arping -I eth0 -c
5 192.168.2.81
$ sudo /usr/sbin/arping -I eth0 -c 5 192.168.2.81
ARPING 192.168.2.81
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.81): index=0 time=161.886 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.81): index=1 time=237.942 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.81): index=2 time=236.034 usec
--- 192.168.2.81 statistics ---
3 packets transmitted, 3 packets received, 0% unanswered
$ sudo apt-get install heartbeat
Various manual tests across the nodes, the read/write IPs, and the other server's VIP. All good.
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.16 auto 192.168.2.255 255.255.255.0; echo $?
0
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.81 auto 192.168.2.255 255.255.255.0; echo $?
0
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.181 auto 192.168.2.255 255.255.255.0; echo $?
0
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.71 auto 192.168.2.255 255.255.255.0; echo $?
0
Do it via Flipper.
$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.
NODE: beta181 has write IP, is writable, replication running, 0s delay
Connection to 192.168.2.187 closed.
NODE: alpha187 is read-only, replication running, 0s delay
WARNING: No node has the read IP
$ ./flipper developer set write alpha187
Connection to 192.168.2.187 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
ERROR: Error occurred when executing ssh -l flipper 192.168.2.187
sudo /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0
192.168.2.81 auto 255.255.255.0
ERROR: Couldn't execute send ARP commmand /usr/lib/heartbeat/send_arp
-p /tmp/send_arp.pid -i 100 -r 5 eth0 192.168.2.81 auto 255.255.255.0
It takes about 1-2 minutes to time out.
First obvious observation: the flipper entry in /etc/sudoers will need
/usr/lib/heartbeat/send_arp.
I did that and repeated the test; no luck.
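For reference, a sudoers entry along these lines would cover both ARP commands that flipper runs over ssh. This is only a sketch: the flipper user name comes from the "ssh -l flipper" calls in the errors above, while NOPASSWD and the ALL host field are assumptions about how the account is set up.

```
# /etc/sudoers fragment (assumed layout; edit with visudo)
flipper ALL=(root) NOPASSWD: /usr/sbin/arping, /usr/lib/heartbeat/send_arp
```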
$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.
NODE: beta181 is read-only, replication running, 0s delay
Connection to 192.168.2.187 closed.
NODE: alpha187 has write IP, is writable, replication running, 0s delay
WARNING: No node has the read IP
rbradfor@ronald:~/svn/trunk/upco/exploration/ronald/database/external/flipper/bin$
./flipper developer set read beta181
Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.
WARNING: read IP is not up on alpha187 node.
WARNING: Won't attempt to take down read IP on alpha187 node.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
ERROR: Error occurred when executing ssh -l flipper 192.168.2.181
sudo /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0
192.168.2.71 auto 255.255.255.0
ERROR: Couldn't execute send ARP commmand /usr/lib/heartbeat/send_arp
-p /tmp/send_arp.pid -i 100 -r 5 eth0 192.168.2.71 auto 255.255.255.0
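From analysis, the issue is the missing broadcast address, i.e.
192.168.2.255. The failing command passes only "auto 255.255.255.0"
after the IP address. The corrected command: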
sudo /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0
192.168.2.81 auto 192.168.2.255 255.255.255.0
It's been a while since I sent this, so this is where I'm up to with
testing now; I assume it is just a matter of setting the value
correctly. I'm working on three things at once, so I will get back to
it in a few minutes.
Ronald
I would recommend you validate that variables are non-empty before
execution.
In this case, $sendarp_broadcast evaluated to nil; you should raise an
error, or even just a warning, before executing the command. That would
make diagnosis easier.
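As an illustration of that kind of check (a shell sketch, not flipper's actual code; the function name and argument order are hypothetical, and the positional layout follows the working send_arp command earlier in this thread), the command builder could refuse to run when any value is empty:

```shell
#!/bin/sh
# Hypothetical sketch: build the send_arp command line only if every
# required value is non-empty; otherwise report which one is missing.
build_send_arp_cmd() {
    interface=$1 ip=$2 broadcast=$3 netmask=$4
    for name in interface ip broadcast netmask; do
        eval "value=\$$name"
        if [ -z "$value" ]; then
            echo "ERROR: send_arp $name is empty, refusing to run" >&2
            return 1
        fi
    done
    echo "/usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5" \
         "$interface $ip auto $broadcast $netmask"
}

# Complete configuration: prints the command line.
build_send_arp_cmd eth0 192.168.2.81 192.168.2.255 255.255.255.0

# Broadcast unset (as in the failing run): prints an error instead.
build_send_arp_cmd eth0 192.168.2.81 "" 255.255.255.0 || echo "not executed"
```

Failing early like this would replace the 1-2 minute timeout with an immediate message naming the empty setting.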
Ronald
INSERT INTO masterpair (masterpair, name, value) VALUES
('developer', 'broadcast', '192.168.2.255');
$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.
NODE: beta181 has read IP, is read-only, replication running, 0s delay
Connection to 192.168.2.187 closed.
NODE: alpha187 has write IP, is writable, replication running, 0s delay
$ ./flipper developer set read alpha187
Connection to 192.168.2.187 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.
NODE: beta181 is read-only, replication running, 0s delay
Connection to 192.168.2.187 closed.
NODE: alpha187 has read IP, has write IP, is writable, replication running, 0s delay
WARNING: MySQL server on read IP is writable