Error with arping when using swap/set write for example


Ronald

Nov 25, 2008, 1:38:07 PM
to flipper-devel
I believe I have a correctly working Flipper environment. Status
shows:

$ ./flipper developer status 2>/dev/null
MASTERPAIR: developer
NODE: beta181 has write IP, is writable, replication running, 0s delay
NODE: alpha187 has read IP, is read-only, replication running, 0s delay

When I try to swap, I get an arping error. I can reproduce this with
the set write command, for example.

$ ./flipper developer set write beta181
Connection to 192.168.2.181 closed.
INFO: The write IP is already up on the beta181 node.
$ ./flipper developer set write alpha187
Connection to 192.168.2.187 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
ERROR: Error occurred when executing ssh -l flipper 192.168.2.187
sudo /usr/sbin/arping -I eth0 -c 5 -A 192.168.2.81
ERROR: Couldn't execute send ARP commmand /usr/sbin/arping -I eth0 -c
5 -A 192.168.2.81


Testing the components manually confirms that SSH and ping work, but
arping does not.

$ ssh -l flipper 192.168.2.187
$ ping -c 5 -A 192.168.2.81
PING 192.168.2.81 (192.168.2.81) 56(84) bytes of data.
64 bytes from 192.168.2.81: icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from 192.168.2.81: icmp_seq=2 ttl=64 time=0.033 ms
64 bytes from 192.168.2.81: icmp_seq=3 ttl=64 time=0.038 ms
64 bytes from 192.168.2.81: icmp_seq=4 ttl=64 time=0.022 ms
64 bytes from 192.168.2.81: icmp_seq=5 ttl=64 time=0.032 ms

--- 192.168.2.81 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 800ms
rtt min/avg/max/mdev = 0.020/0.029/0.038/0.006 ms, ipg/ewma
200.003/0.024 ms

$ sudo /usr/sbin/arping -I eth0 -c 5 -A 192.168.2.81
ARPING 192.168.2.81

--- 192.168.2.81 statistics ---
5 packets transmitted, 0 packets received, 100% unanswered
$ echo $?
1
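
The error Flipper reports appears to follow from this exit status: the command did run, but it returned 1 because no replies were counted. Below is a minimal sketch of that behaviour, assuming Flipper treats any non-zero exit as a failed step. The `run_step` helper and its message format are illustrative only and are not taken from Flipper's code.

```shell
# Hypothetical wrapper: run a command and report failure on a non-zero exit
# status, similar in spirit to the errors Flipper prints above.
run_step() {
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "ERROR: command failed with exit status $status: $*" >&2
    return "$status"
  fi
  return 0
}

# 'false' stands in for an arping run that receives no replies (exit 1).
run_step true && echo "step ok"
run_step false || echo "step failed"
```

Under this assumption, a command that sends its packets but counts no replies is indistinguishable from one that could not be executed at all, which would explain the "Couldn't execute" wording.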


I really have no idea about arping, so any help is appreciated.

Running under Ubuntu 8.04

$ uname -a
Linux ronald 2.6.24-19-xen #1 SMP Wed Aug 20 21:08:51 UTC 2008 x86_64
GNU/Linux

Current IP configuration:

$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.187 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: 2002:44a1:ed31:1234:216:3eff:fe58:f864/64 Scope:Global
inet6 addr: fe80::216:3eff:fe58:f864/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:960591 errors:0 dropped:0 overruns:0 frame:0
TX packets:79649 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:127365433 (121.4 MB) TX bytes:132820102 (126.6 MB)

eth0:0 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.71 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth0:1 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.81 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:7578 errors:0 dropped:0 overruns:0 frame:0
TX packets:7578 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1021493 (997.5 KB) TX bytes:1021493 (997.5 KB)



Regards

Ronald




Jeremy Cole

Nov 25, 2008, 1:50:20 PM
to flippe...@googlegroups.com, soft...@provenscaling.com
Hi Ronald,

Good to see you're trying out Flipper!

Could you provide arping -V (or equivalent 'version' command)? It
sounds like Ubuntu has a different/broken arping than what we expect
from CentOS. Given your comment in your other mail:

> When using Ubuntu 8.04 the arping command for example is
> '/usr/sbin/arping -I $sendarp_interface -c 5 -A $sendarp_ip'
>
> /usr/sbin rather than /sbin
> -U is not supported.

The -U (== "Unsolicited ARP mode, update your neighbours") mode for
arping is required for what we're using arping for in the first place.
That's why your IP takeover is not working.

I think you will need to provide a working send_arp_command. You could
try installing the one from CentOS et al., or as an alternative, we can
also support the send_arp executable from 'heartbeat' (we avoid this
because it is a very heavy install). In any case, to use send_arp on
CentOS, we do the following:

Install these RPMs (or equivalent):
http://mirror.centos.org/centos/4/os/i386/CentOS/RPMS/gnutls-1.0.20-4.el4_6.i386.rpm
http://mirror.centos.org/centos/4/extras/i386/RPMS/heartbeat-stonith-2.1.3-3.el4.centos.i386.rpm
http://mirror.centos.org/centos/4/extras/i386/RPMS/heartbeat-pils-2.1.3-3.el4.centos.i386.rpm
http://mirror.centos.org/centos/4/extras/i386/RPMS/heartbeat-2.1.3-3.el4.centos.i386.rpm

UPDATE masterpair
SET value="/usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 $sendarp_interface $sendarp_ip auto $sendarp_broadcast $sendarp_netmask"
WHERE masterpair="masterpair_name" AND name="send_arp_command";

I assume you can derive something useful for Ubuntu from the above. If
you find a working set of options for Ubuntu just let me know and I'll
update the documentation!

Regards,

Jeremy
--
high performance mysql consulting
www.provenscaling.com

Ronald Bradford

Nov 25, 2008, 2:58:45 PM
to flippe...@googlegroups.com
Hi Jeremy,

Thanks. Some more diagnostic information.

In summary:
* -A is causing grief for VIPs on another server. Removing it gives
me a partial solution.
* arping of an IP on the current server/interface fails. I have yet to
confirm whether you do this, and whether it works, on RH.
* I was considering Heartbeat; Monty T tells me they have written their
own ARP implementation.
$ sudo arping -V
arping: invalid option -- V
ARPing 2.05, by Thomas Habets <tho...@habets.pp.se>
usage: arping [ -0aAbdFpqrRuv ] [ -w <us> ] [ -S <host/ip> ] [ -T <host/ip ]
[ -s <MAC> ] [ -t <MAC> ] [ -c <count> ] [ -i <interface> ]
<host/ip/MAC | -B>

There is no version argument, but the version appears in the output.


$ ./flipper developer status 2>/dev/null
MASTERPAIR: developer

NODE: beta181 has read IP, is read-only, replication running, 0s delay
NODE: alpha187 has write IP, is writable, replication running, 0s delay

$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.187 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: 2002:44a1:ed31:1234:216:3eff:fe58:f864/64 Scope:Global
inet6 addr: fe80::216:3eff:fe58:f864/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

RX packets:7255 errors:0 dropped:0 overruns:0 frame:0
TX packets:1705 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:700657 (684.2 KB) TX bytes:257208 (251.1 KB)

eth0:1 Link encap:Ethernet HWaddr 00:16:3e:58:f8:64
inet addr:192.168.2.81 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

The other server

$ ifconfig
eth0 Link encap:Ethernet HWaddr 00:16:3e:a8:c8:41
inet addr:192.168.2.181 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: 2002:44a1:ed31:1234:216:3eff:fea8:c841/64 Scope:Global
inet6 addr: fe80::216:3eff:fea8:c841/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

RX packets:1694898 errors:0 dropped:0 overruns:0 frame:0
TX packets:52793 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:344595511 (328.6 MB) TX bytes:6034185 (5.7 MB)

eth0:0 Link encap:Ethernet HWaddr 00:16:3e:a8:c8:41
inet addr:192.168.2.71 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

$ sudo /usr/sbin/arping -I eth0 -c 5 192.168.2.71
ARPING 192.168.2.71
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=0 time=107.765 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=1 time=241.041 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=2 time=214.100 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=3 time=256.062 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.71): index=4 time=237.226 usec

--- 192.168.2.71 statistics ---
5 packets transmitted, 5 packets received, 0% unanswered

$ sudo /usr/sbin/arping -I eth0 -c 5 -A 192.168.2.71
ARPING 192.168.2.71

--- 192.168.2.71 statistics ---
5 packets transmitted, 0 packets received, 100% unanswered


arping(8) arping(8)

NAME
arping - sends arp and/or ip pings to a given host


SYNOPSIS
arping [-abhpqrRd0uv] [-S host/ip] [-T host/ip] [-s MAC] [-t MAC]
[-c count] [-i interface] [ -w us ] <host | -B>


DESCRIPTION
The arping utility sends ARP and/or ICMP requests to the specified host
and displays the replies. The host may be specified by its hostname,
its IP address, or its MAC address.

One request is sent each second.

When pinging an IP an ARP who-has query is sent. When pinging a MAC
address a directed broadcast ICMP Echo request is sent. For more tech-
nical explaination and an FAQ, see the README file.

Important note on timing

ARP packets are usually replied to (on a LAN) so fast that the OS task
scheduler can't keep up to get exact enough timing. On an idle system
the roundtrip times will be pretty much accurate, but with more load
the timing gets less exact.

To get more exact timing on a non-idle system, re-nice arping to -15 or
so.

# nice -n -15 arping foobar

This is not just an issue with arping, it is with normal ping also (at
least it is on my system). But it doesn't show up as much with ping
since arping packets (when pinging IP) doesn't traverse the IP stack
when received and are therefore replied to faster.


OPTIONS
-0 Use this option to ping with source IP address 0.0.0.0. Use this
when you haven't configured your interface yet. Note that this
may get the MAC-ping unanswered. This is an alias for -S
0.0.0.0.

-a Audiable ping.

-A Only count addresses matching requested address (This *WILL*
break most things you do. Only useful if you are arpinging many
hosts at once. See arping-scan-net.sh for an example).

-b Like -0 but source broadcast source address (255.255.255.255).
Note that this may get the arping unanswered since it's not nor-
mal behavior for a host.

-B Use instead of host if you want to address 255.255.255.255.

-c count
Only send count requests.

-d Find duplicate replies.

-F Don't try to be smart about the interface name. (even if this
switch is not given, -i overrides smartness.

-h Displays a help message and exits.

-i interface
Use the specified interface.

-q Does not display messages, except error messages.

-r Raw output: only the MAC/IP address is displayed for each reply.

-R Raw output: Like -r but shows "the other one", can be combined
with -r.

-s MAC Set source MAC address. You may need to use -p with this.

-S IP Like -b and -0 but with set source address. Note that this may
get the arping unanswered if the target does not have routing to
the IP. If you don't own the IP you are using, you may need to
turn on promiscious mode on the interface (with -p). With this
switch you can find out what IP-address a host has without tak-
ing an IP-address yourself.

-t MAC Set target MAC address to use when pinging IP address.

-T IP Use -T as target address when pinging MACs that won't respond to
a broadcast ping but perhaps to a directed broadcast.

Example:
To check the address of MAC-A, use knowledge of MAC-B and IP-B.

$ arping -S <IP-B> -s <MAC-B> -p <MAC-A>

-p Turn on promiscious mode on interface, use this if you don't
"own" the MAC address you are using.

-u Show index=received/sent instead of just index=received when
pinging MACs.

-v Verbose output. Use twice for more messages.

-w (arping 2.x only) Time to wait between pings, in microseconds.


BUGS
You have to use -B instead of arpinging 255.255.255.255, and -b instead
of -S 255.255.255.255. This is libnets fault.


SEE ALSO
ping(8), arp(8), rarp(8)


AUTHOR
Arping was written by Thomas Habets <tho...@habets.pp.se>.

arping 21th June, 2003 arping(8)

Ronald Bradford

Nov 25, 2008, 3:12:14 PM
to flippe...@googlegroups.com
So, just to confirm: the failure occurs when you set a read/write IP on
a server and then run arping ON THAT SERVER against the read/write IP.
It works fine from the other server.

I know nothing about ARP, so I'm not sure: is this step necessary, or
can it be skipped?
However, if -U (== "Unsolicited ARP mode, update your neighbours") is
necessary, that's obviously a bigger problem.

Ronald


$ ./flipper developer status 2> /dev/null
MASTERPAIR: developer
NODE: beta181 is read-only, replication running, 0s delay
NODE: alpha187 is writable, replication running, 0s delay
WARNING: No node has the read IP
WARNING: No node has the write IP


$ ./flipper developer set write beta181
Connection to 192.168.2.181 closed.

Connection to 192.168.2.187 closed.
WARNING: write IP is not up on alpha187 node.
WARNING: Won't attempt to take down write IP on alpha187 node.


Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.

ERROR: Error occurred when executing ssh -l flipper 192.168.2.181
sudo /usr/sbin/arping -I eth0 -c 5 192.168.2.81
ERROR: Couldn't execute send ARP commmand /usr/sbin/arping -I eth0 -c
5 192.168.2.81


$ sudo /usr/sbin/arping -I eth0 -c 5 192.168.2.81
ARPING 192.168.2.81
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.81): index=0 time=161.886 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.81): index=1 time=237.942 usec
42 bytes from 00:16:3e:a8:c8:41 (192.168.2.81): index=2 time=236.034 usec

--- 192.168.2.81 statistics ---
3 packets transmitted, 3 packets received, 0% unanswered

Jeremy Cole

Nov 25, 2008, 3:43:04 PM
to flippe...@googlegroups.com
Hi Ronald,

It looks like -A means something totally different on your arping. It's
probable that your arping is completely unworkable.

On mine, -A means:

-A The same as -U, but ARP REPLY packets used instead of ARP
REQUEST.

So that's yet another option that is really quite required.

On RHEL5, arping comes from:

>>>>>
(jcole@etna) [~]$ rpm -qf /sbin/arping
iputils-20020927-43.el5
<<<<<

It looks like this "iputils-arping" package is the same, for debian:

http://packages.debian.org/etch/iputils-arping

It provides /usr/bin/arping:

http://packages.debian.org/etch/amd64/iputils-arping/filelist

Can you install and try that one?
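
Since both packages install a binary named arping, it may help to check which one a node actually has before pointing Flipper at it. The sketch below classifies an implementation from its banner text. The patterns are assumptions: "Thomas Habets" comes from the usage output quoted earlier in this thread, and the iputils match relies on that package naming itself in its version output. `classify_arping` is a hypothetical helper, not part of Flipper.

```shell
# Classify an arping implementation from its banner/usage text (read on stdin).
# Patterns are assumptions: the Habets banner is quoted in this thread, and
# iputils is assumed to mention "iputils" in its version output.
classify_arping() {
  banner=$(cat)
  case $banner in
    *"Thomas Habets"*) echo "habets" ;;
    *iputils*)         echo "iputils" ;;
    *)                 echo "unknown" ;;
  esac
}

# On a live system (commented out so the sketch has no side effects):
#   /usr/sbin/arping -V 2>&1 | classify_arping
printf 'ARPing 2.05, by Thomas Habets\n' | classify_arping
```

Only the iputils variant documents -U and -A with the unsolicited-ARP meanings described above, so a result of "habets" suggests the send_arp_command needs to change.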

Regards,

Jeremy

Jeremy Cole

Nov 25, 2008, 3:49:36 PM
to flippe...@googlegroups.com
Hi,

Just a quick update:

It looks like Ubuntu has packages for it:

https://launchpad.net/ubuntu/+source/iputils

(Newer version, but likely OK.)

Regards,

Jeremy


Ronald Bradford

Nov 25, 2008, 4:17:10 PM
to flippe...@googlegroups.com
Reconfigured for heartbeat:

$ sudo apt-get install heartbeat

Various manual tests across the node IP, the read/write IPs, and the other server's VIP. All good.


$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.16 auto 192.168.2.255 255.255.255.0; echo $?
0
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.81 auto 192.168.2.255 255.255.255.0; echo $?
0
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.181 auto 192.168.2.255 255.255.255.0; echo $?
0
$ /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0 \
    192.168.2.71 auto 192.168.2.255 255.255.255.0; echo $?
0

Now do it via Flipper:

$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.
NODE: beta181 has write IP, is writable, replication running, 0s delay
Connection to 192.168.2.187 closed.
NODE: alpha187 is read-only, replication running, 0s delay


WARNING: No node has the read IP


$ ./flipper developer set write alpha187
Connection to 192.168.2.187 closed.


Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.

Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
ERROR: Error occurred when executing ssh -l flipper 192.168.2.187
sudo /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0
192.168.2.81 auto 255.255.255.0
ERROR: Couldn't execute send ARP commmand /usr/lib/heartbeat/send_arp
-p /tmp/send_arp.pid -i 100 -r 5 eth0 192.168.2.81 auto 255.255.255.0

It takes 1-2 minutes to time out.

The first obvious observation: the flipper entry in /etc/sudoers will
need /usr/lib/heartbeat/send_arp.
I added that and repeated the test, with no luck.


$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.


NODE: beta181 is read-only, replication running, 0s delay

Connection to 192.168.2.187 closed.
NODE: alpha187 has write IP, is writable, replication running, 0s delay


WARNING: No node has the read IP

rbradfor@ronald:~/svn/trunk/upco/exploration/ronald/database/external/flipper/bin$
./flipper developer set read beta181


Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.

WARNING: read IP is not up on alpha187 node.
WARNING: Won't attempt to take down read IP on alpha187 node.


Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
ERROR: Error occurred when executing ssh -l flipper 192.168.2.181

sudo /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0
192.168.2.71 auto 255.255.255.0
ERROR: Couldn't execute send ARP commmand /usr/lib/heartbeat/send_arp
-p /tmp/send_arp.pid -i 100 -r 5 eth0 192.168.2.71 auto 255.255.255.0

From analysis, the issue is the missing broadcast address, i.e. 192.168.2.255:

sudo /usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 eth0
192.168.2.81 auto 192.168.2.255 255.255.255.0

It's been a while since I started writing this, so this is where I'm
up to in testing now; I assume it is just a matter of getting the
setting right. I'm working on three things at once, so I will get
back to it in a few minutes.

Ronald

Ronald Bradford

Nov 25, 2008, 4:21:41 PM
to flippe...@googlegroups.com
After adding a broadcast record, flipper returns immediately. Now to
verify that it works correctly.


I would recommend validating that variables are non-empty before
execution.
In this case, $sendarp_broadcast evaluated to nil; you should raise
an error, or at least a warning, before executing the command. That
would make diagnosis easier.
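
To illustrate the kind of check meant here, below is a minimal sketch in shell rather than in Flipper's own code. It expands the `$sendarp_*` placeholders in a command template and refuses to continue if a value is empty or a placeholder is left unexpanded. `expand_template` and its `name=value` argument convention are hypothetical; only the template text comes from this thread.

```shell
# Hypothetical helper: expand $name placeholders in a command template and
# refuse to continue if any value is empty or any placeholder is left over.
# Assumes values contain no '|' or '&' characters (true for IPs and interface names).
expand_template() {
  template=$1
  shift
  result=$template
  for pair in "$@"; do
    name=${pair%%=*}
    value=${pair#*=}
    if [ -z "$value" ]; then
      echo "ERROR: \$$name is empty in: $template" >&2
      return 1
    fi
    # Replace every literal "$name" in the template with its value.
    result=$(printf '%s\n' "$result" | sed "s|\\\$$name|$value|g")
  done
  case $result in
    *'$'*)
      echo "ERROR: unexpanded variable left in: $result" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$result"
}

template='/usr/lib/heartbeat/send_arp -p /tmp/send_arp.pid -i 100 -r 5 $sendarp_interface $sendarp_ip auto $sendarp_broadcast $sendarp_netmask'
# The broadcast value is deliberately empty, reproducing the case above.
expand_template "$template" sendarp_interface=eth0 sendarp_ip=192.168.2.81 \
  sendarp_broadcast= sendarp_netmask=255.255.255.0 || echo "refused to run"
```

With a check like this, the missing broadcast record would have surfaced as an immediate, named error instead of a 1-2 minute timeout.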

Ronald

INSERT INTO masterpair (masterpair, name, value) VALUES
('developer', 'broadcast', '192.168.2.255');

$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.

NODE: beta181 has read IP, is read-only, replication running, 0s delay


Connection to 192.168.2.187 closed.
NODE: alpha187 has write IP, is writable, replication running, 0s delay

$ ./flipper developer set read alpha187


Connection to 192.168.2.187 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.181 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.
Connection to 192.168.2.187 closed.

$ ./flipper developer status
MASTERPAIR: developer
Connection to 192.168.2.181 closed.
NODE: beta181 is read-only, replication running, 0s delay
Connection to 192.168.2.187 closed.

NODE: alpha187 has read IP, has write IP, is writable, replication
running, 0s delay
WARNING: MySQL server on read IP is writable
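
After this move alpha187 holds both the read and the write IP while being writable, which is what the warning reflects. A rough sketch for spotting that state from the status text follows, assuming the NODE line format shown in this thread with each node on a single line. `count_ip_holders` and `nodes_with_both_ips` are hypothetical helpers, not Flipper commands.

```shell
# Count how many NODE lines in `flipper <masterpair> status` output (stdin)
# claim the given IP role ("read" or "write"). Exactly one is the healthy case.
count_ip_holders() {
  role=$1
  # grep -c still prints 0 when nothing matches; '|| true' hides its exit status.
  grep -c "^NODE: .*has $role IP" || true
}

# Print the names of nodes that hold both the read and the write IP.
nodes_with_both_ips() {
  grep '^NODE: .*has read IP' | grep 'has write IP' | awk '{print $2}'
}

status_output='NODE: beta181 is read-only, replication running, 0s delay
NODE: alpha187 has read IP, has write IP, is writable, replication running, 0s delay'

printf '%s\n' "$status_output" | count_ip_holders read
printf '%s\n' "$status_output" | nodes_with_both_ips
```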
