[Lustre-discuss] Problem mounting volumes inside vmware NAT (1.4.10)


Jim McCusker

Jun 11, 2007, 12:32:25 PM
to lustre-...@clusterfs.com
I have a vmware server running several vms that are stored on a lustre
volume. When I use bridged networking, I am able to mount the volume
inside the vm with no problem. I would like to have some vms on a
private NAT'ed network as database servers (with the data files stored
on lustre), but when I change the NIC to be on the NAT network, the
mount fails with the following:

[root@localhost ~]# /etc/init.d/lustrefs start
Mounting Lustre filesystems: mount.lustre:
mount(chai.med.yale.edu:mds1/client, /vol) failed: Input/output error
mds nid 0: 128.36.115.13@tcp
mds name: mds1
profile: client
options: rw,flock
retry: 0
[FAILED]

/var/log/messages shows:

Jun 11 03:13:06 localhost kernel: LustreError:
2525:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading
HELLO from 128.36.115.13
Jun 11 03:13:06 localhost kernel: LustreError: Connection to
128.36.115.13@tcp at host 128.36.115.13 on port 988 was reset: is it
running a compatible version of Lustre and is 128.36.115.13@tcp one of
its NIDs?
Jun 11 03:13:11 localhost kernel: LustreError:
3931:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at
1181545986, 5s ago) req@00000100176d1600 x9/t0
o38->md...@128.36.115.13@tcp:12 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
Jun 11 03:13:11 localhost kernel: LustreError: mdc_dev: The
configuration 'client' could not be read from the MDS 'mds1'. This may
be the result of communication errors between the client and the MDS, or
if the MDS is not running.
Jun 11 03:13:11 localhost kernel: LustreError:
3928:0:(llite_lib.c:962:lustre_fill_super()) Unable to process log: client
Jun 11 03:13:11 localhost mount: mount.lustre:
mount(chai.med.yale.edu:mds1/client, /vol) failed: Input/output error
Jun 11 03:13:11 localhost mount: mds nid 0: 128.36.115.13@tcp
Jun 11 03:13:11 localhost mount: mds name: mds1
Jun 11 03:13:11 localhost mount: profile: client
Jun 11 03:13:11 localhost mount: options: rw,flock
Jun 11 03:13:11 localhost mount: retry: 0
Jun 11 03:13:11 localhost lustrefs: Mounting Lustre filesystems: failed

Only the VMWare server itself has more than one NIC enabled, and it
isn't having any trouble connecting, so "options lnet networks=tcp(eth0)"
doesn't seem like the right option.

Jim

_______________________________________________
Lustre-discuss mailing list
Lustre-...@clusterfs.com
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss

Felix, Evan J

Jun 12, 2007, 11:22:00 AM
to Jim McCusker, lustre-...@clusterfs.com
Do you have a NID on the MDS for the NAT'ed network? The MDS should
have NIDs 128.36.115.13@tcp and, say, 192.168.1.4@tcp (I made that up);
then you could connect to the MDS on the other (NAT'ed) network interface.
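
(A minimal sketch of what that might look like in the MDS's
/etc/modprobe.conf, assuming eth0 is its public interface and eth1 sits
on the NAT'ed network; the interface names are hypothetical, and each
LNET network gets its own NID:

options lnet networks="tcp0(eth0),tcp1(eth1)"

Clients on the NAT'ed network would then be configured for tcp1 and
connect to the MDS's tcp1 NID.)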

Evan

Jim McCusker

Jun 12, 2007, 11:28:44 AM
to lustre-...@clusterfs.com
The mds is outside of the NAT'ed network and I would have to jump
through many hoops to get it inside that network, because it's a virtual
network that doesn't exist physically.

Jim

Cliff White

Jun 12, 2007, 12:43:53 PM
to Jim McCusker, lustre-...@clusterfs.com
Jim McCusker wrote:
> The mds is outside of the NAT'ed network and I would have to jump
> through many hoops to get it inside that network, because it's a virtual
> network that doesn't exist physically.

If your client cannot reach the MDS, your client will not be able to
mount. The MDS connection is necessary.
cliffw

Felix, Evan J

Jun 12, 2007, 2:21:10 PM
to Jim McCusker, lustre-...@clusterfs.com
Can you put an LNET router on both the NAT'ed network and the MDS network?

Jim McCusker

Jun 12, 2007, 2:37:37 PM
to Felix, Evan J, lustre-...@clusterfs.com
Maybe. I'm not finding any documentation on how to do that, only that it
exists (does it exist for 1.4.x?).
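
(For reference, a rough sketch of an LNET routing setup, pieced together
from later LNET documentation; the addresses and interface names here are
hypothetical, and this is untested on 1.4.x:

# on the router node, with eth0 on the MDS network and eth1 on the
# NAT'ed network:
options lnet networks="tcp0(eth0),tcp1(eth1)" forwarding="enabled"

# on the NAT'ed clients, reach tcp0 via the router's tcp1 NID:
options lnet networks="tcp1(eth0)" routes="tcp0 192.168.88.3@tcp1"

# on the MDS/OSTs, the return route via the router's tcp0 NID:
options lnet routes="tcp1 128.36.115.20@tcp0"

The router needs a native interface on each network, so inside a purely
virtual NAT it would have to be a VM with one NIC bridged and one NIC
on the NAT network.)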

Brian J. Murrell

Jun 12, 2007, 2:51:44 PM
to lustre-...@clusterfs.com
On Tue, 2007-06-12 at 14:37 -0400, Jim McCusker wrote:
> Maybe. I'm not finding any documentation on how to do that, only that it
> exists (does it exist for 1.4.x?).

Jim,

Just to make sure I understand your configuration you have something
like:

+------------------+     +------------------------------+
| MDS 128.36.115.13=-----=???.???.???.???  VMWare Host  |
+------------------+     |                              |
                         |  +----------------+          |
                         |  | VMWare Guest   |          |
                         |  | Lustre Client  |          |
                         +===???.???.???.??? |          |
                         |  +----------------+          |
                         +------------------------------+

Where you are NATting from the (Lustre client) VMWare
guest's ???.???.???.??? to the VMWare host's ???.???.???.???, yes?

What are the values of the ???.???.???.????

Can you ping the MDS from the VMWare host? And the VMWare guest? Can
you give us the output of "ping -c 5 <mds_ip>" from both the VMWare host
and guest?

Thanx,
b.

Jim McCusker

Jun 12, 2007, 4:38:45 PM
to lustre-...@clusterfs.com

Brian J. Murrell wrote:
> On Tue, 2007-06-12 at 14:37 -0400, Jim McCusker wrote:
>
>> Maybe. I'm not finding any documentation on how to do that, only that it
>> exists (does it exist for 1.4.x?).
>>
>
> Jim,
>
> Just to make sure I understand your configuration you have something
> like:
>
> +------------------+     +------------------------------+
> | MDS 128.36.115.13=-----=128.36.115.10   VMWare Host   |
> +------------------+     |                              |
>                          |  +----------------+          |
>                          |  | VMWare Guest   |          |
>                          |  | Lustre Client  |          |
>                          +===192.168.88.129  |          |
>                          |  +----------------+          |
>                          +------------------------------+
>
> Where you are NATting from the (Lustre client) VMWare
> guest's ???.???.???.??? to the VMWare host's ???.???.???.???, yes?
>

The VMWare host NATs for the VMWare guest.


> What are the values of the ???.???.???.????
>

Changed above. VMWare host is 128.36.115.10 as well as 192.168.88.2.
VMWare does the NAT-ing for me.


> Can you ping the MDS from the VMWare host? And the VMWare guest? Can
> you give us the output of "ping -c 5 <mds_ip>" from both the VMWare host
> and guest?
>

VMWare host:

[root@espresso vm]# ping 128.36.115.13
PING 128.36.115.13 (128.36.115.13) 56(84) bytes of data.
64 bytes from 128.36.115.13: icmp_seq=0 ttl=64 time=0.100 ms
64 bytes from 128.36.115.13: icmp_seq=1 ttl=64 time=0.109 ms
64 bytes from 128.36.115.13: icmp_seq=2 ttl=64 time=0.107 ms

--- 128.36.115.13 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.100/0.105/0.109/0.009 ms, pipe 2


VMWare guest:

[root@espresso vm]# ssh 192.168.88.129
ro...@192.168.88.129's password:
Last login: Tue Jun 12 05:12:34 2007
[root@localhost ~]# ping 128.36.115.13
PING 128.36.115.13 (128.36.115.13) 56(84) bytes of data.
64 bytes from 128.36.115.13: icmp_seq=0 ttl=128 time=4.05 ms
64 bytes from 128.36.115.13: icmp_seq=1 ttl=128 time=0.382 ms
64 bytes from 128.36.115.13: icmp_seq=2 ttl=128 time=0.360 ms
64 bytes from 128.36.115.13: icmp_seq=3 ttl=128 time=0.272 ms
64 bytes from 128.36.115.13: icmp_seq=4 ttl=128 time=0.320 ms
64 bytes from 128.36.115.13: icmp_seq=5 ttl=128 time=0.912 ms
64 bytes from 128.36.115.13: icmp_seq=6 ttl=128 time=0.266 ms
64 bytes from 128.36.115.13: icmp_seq=7 ttl=128 time=0.296 ms

--- 128.36.115.13 ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7004ms
rtt min/avg/max/mdev = 0.266/0.858/4.056/1.224 ms, pipe 2

And here's the traceroute:

[root@localhost ~]# traceroute 128.36.115.13
traceroute to 128.36.115.13 (128.36.115.13), 30 hops max, 46 byte packets
1 192.168.88.2 (192.168.88.2) 0.312 ms 0.169 ms 0.114 ms
2 chai.med.yale.edu (128.36.115.13) 0.373 ms 0.259 ms 0.221 ms

Thanks,
Jim

Brian J. Murrell

Jun 12, 2007, 5:08:22 PM
to lustre-...@clusterfs.com
On Tue, 2007-06-12 at 16:38 -0400, Jim McCusker wrote:
> > +------------------+     +------------------------------+
> > | MDS 128.36.115.13=-----=128.36.115.10   VMWare Host   |
> > +------------------+     |                              |
> >                          |  +----------------+          |
> >                          |  | VMWare Guest   |          |
> >                          |  | Lustre Client  |          |
> >                          +===192.168.88.129  |          |
> >                          |  +----------------+          |
> >                          +------------------------------+
> >

> The VMWare host NATs for the VMWare guest.

Right. This is exactly as I understood it but wanted to make sure with
some small tests.

> Changed above. VMWare host is 128.36.115.10 as well as 192.168.88.2.

Hrm. I've never actually used a NAT interface in VMWare, but isn't
the 192.168.88.2 actually on the Guest and not the Host? The host takes
the packets from the guest and rewrites the source address from
192.168.88.2 to 128.36.115.10, yes?

> VMWare does the NAT-ing for me.

Right.

> VMWare host:
>
> [root@espresso vm]# ping 128.36.115.13
> PING 128.36.115.13 (128.36.115.13) 56(84) bytes of data.
> 64 bytes from 128.36.115.13: icmp_seq=0 ttl=64 time=0.100 ms
> 64 bytes from 128.36.115.13: icmp_seq=1 ttl=64 time=0.109 ms
> 64 bytes from 128.36.115.13: icmp_seq=2 ttl=64 time=0.107 ms

Good.

> VMWare guest:
>
> [root@espresso vm]# ssh 192.168.88.129
> ro...@192.168.88.129's password:
> Last login: Tue Jun 12 05:12:34 2007
> [root@localhost ~]# ping 128.36.115.13
> PING 128.36.115.13 (128.36.115.13) 56(84) bytes of data.
> 64 bytes from 128.36.115.13: icmp_seq=0 ttl=128 time=4.05 ms
> 64 bytes from 128.36.115.13: icmp_seq=1 ttl=128 time=0.382 ms
> 64 bytes from 128.36.115.13: icmp_seq=2 ttl=128 time=0.360 ms
> 64 bytes from 128.36.115.13: icmp_seq=3 ttl=128 time=0.272 ms

Again, good. You have basic IP (albeit NATted) connectivity to the MDS.

What does the kernel log on the MDS show when the client is trying to
mount and fails?

It seems pretty clear that the NATting is confusing the MDS, but why,
I'm not sure. I thought the protocol was pretty NAT friendly.

b.

Jim McCusker

Jun 12, 2007, 5:16:55 PM
to Brian J. Murrell, lustre-...@clusterfs.com
Brian J. Murrell wrote:
> Hrm. I've never actually used a NAT interface in VMWare, but isn't
> the 192.168.88.2 actually on the Guest and not the Host? The host takes
> the packets from the guest and rewrites the source address from
> 192.168.88.2 to 128.36.115.10, yes?
>

No, it's definitely on the host. The host (192.168.88.2) rewrites the
packets from 192.168.88.129 to 128.36.115.10.

> Again, good. You have basic IP (albeit NATted) connectivity to the MDS.
> What does the kernel log on the MDS show when the client is trying to
> mount and fails?
>
> It seems pretty clear that the NATting is confusing the MDS, but why,
> I'm not sure. I thought the protocol was pretty NAT friendly.
>

This seems to be the culprit:

Jun 12 17:00:13 chai kernel: LustreError:
12198:0:(acceptor.c:422:lnet_acceptor()) Refusing connection from
128.36.115.10: insecure port 35203

We seem to be remapping to high ports, a common strategy when using
NAT. Is there a way of disabling this check?
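
(For context: LNET's acceptor by default requires incoming connections
to originate from a privileged source port, i.e. one below 1024; the
NAT has rewritten the source port to 35203, which fails that test.)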

Jim

Brian J. Murrell

Jun 12, 2007, 5:27:34 PM
to lustre-...@clusterfs.com
On Tue, 2007-06-12 at 17:16 -0400, Jim McCusker wrote:
>
> No, it's definitely on the host.

Hrm. Yes. This is as I expected. The VMWare interface of the host is
doing the NATting. I meant that the 192.168.88.2 address is on the
guest side of the NAT, not the host side.

> This seems to be the culprit:
>
> Jun 12 17:00:13 chai kernel: LustreError:
> 12198:0:(acceptor.c:422:lnet_acceptor()) Refusing connection from
> 128.36.115.10: insecure port 35203

This is as I was expecting, but hoping it was not. One strategy in
NATting is to only reassign a source port if the port the NATted
connection wants is already being used. If it's not, don't NAT it.
That helps NATted connections look more natural where they can.
Unfortunately it seems VMWare is not employing this technique and is
always NATting ports (assuming you are not using 988 on the host for
anything... do you have a lustre client running there too?)
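
(As an illustration of that strategy, independent of VMWare: plain Linux
iptables SNAT keeps the original source port whenever it is free, only
remaps on a collision, and maps privileged source ports to other
privileged ports rather than to high ones, e.g.:

iptables -t nat -A POSTROUTING -s 192.168.88.0/24 -o eth0 \
    -j SNAT --to-source 128.36.115.10

A NAT built that way would not have tripped Lustre's source-port check.)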

> We seem to be remapping to high ports, a common strategy when using
> NAT.

Right.

> Is there a way of disabling this check?

Good question. One which I don't know the answer to I'm afraid.

b.

Jim McCusker

Jun 12, 2007, 5:38:12 PM
to lustre-...@clusterfs.com
Brian J. Murrell wrote:
> On Tue, 2007-06-12 at 17:16 -0400, Jim McCusker wrote:
>
>> This seems to be the culprit:
>>
>> Jun 12 17:00:13 chai kernel: LustreError:
>> 12198:0:(acceptor.c:422:lnet_acceptor()) Refusing connection from
>> 128.36.115.10: insecure port 35203
>>
>
> This is as I was expecting, but hoping it was not. One strategy in
> NATting is to only reassign a source port if the port the NATted
> connection wants is already being used. If it's not, don't NAT it.
> That helps NATted connections look more natural where they can.
> Unfortunately it seems VMWare is not employing this technique and is
> always NATting ports (assuming you are not using 988 on the host for
> anything... do you have a lustre client running there too?)
>
Yes, I do have a lustre client running on the host, which is needed for
getting to the vm files in the first place. Also, I plan on running a
number of lustre clients as guests, which would trigger this situation
again.

>> Is there a way of disabling this check?
>>
>
> Good question. One which I don't know the answer to I'm afraid.
>

Thank you for your help so far. Does anyone else know if this high port
check can be disabled?

Thanks,
Jim

Jim McCusker

Jun 12, 2007, 6:14:26 PM
to lustre-...@clusterfs.com
So I was able to resolve this by adding the following to the network's
nat.conf file:

[privilegedTCP]
autodetect = 1
port = 988

That allows traffic bound for port 988 to keep a privileged (low)
source port. For some reason this (autodetect = 1) is the default
configuration on Windows, but autodetect = 0 (false) is the default on
Linux.
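
(For anyone finding this later: on a Linux host this file typically
lives at /etc/vmware/vmnet8/nat/nat.conf, and the change takes effect
after restarting VMWare's networking, e.g. via "/etc/init.d/vmware
restart"; the exact path and command may vary with the VMWare version.)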

Brian, thanks for getting me within jumping distance.

Jim

Andreas Dilger

Jun 13, 2007, 12:03:58 AM
to Jim McCusker, Brian J. Murrell, lustre-...@clusterfs.com
On Jun 12, 2007 17:16 -0400, Jim McCusker wrote:

> Brian J. Murrell wrote:
> >It seems pretty clear that the NATting is confusing the MDS, but why,
> >I'm not sure. I thought the protocol was pretty NAT friendly.
>
> This seems to be the culprit:
>
> Jun 12 17:00:13 chai kernel: LustreError:
> 12198:0:(acceptor.c:422:lnet_acceptor()) Refusing connection from
> 128.36.115.10: insecure port 35203
>
> We seem to be remapping to high ports, a common strategy when using
> NAT. Is there a way of disabling this check?

On the MDS (and the OSTs for that matter) you need to add to modprobe.conf

options lnet accept=all [other networks config if any]

This disables the secure source-port check on the server.
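
For example, on a server whose only LNET network is tcp on eth0, the
complete line might read (a sketch; merge with whatever networks
setting you already have):

options lnet networks="tcp0(eth0)" accept=all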

There will still be a problem connecting from the MDS/OSTs to the clients,
which can happen on occasion if the network fails and then the server needs
to contact the client for some reason (e.g. lock cancellation). This is
not very common, and at worst the client will get an error after the network
failure, and then continue on once it establishes a client->server connection.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
