[Lustre-discuss] multihomed clients ignoring lnet options

Joe Little

Feb 9, 2008, 11:16:50 PM
to lustre-...@lists.lustre.org
All of my servers and clients use eth1 for the Lustre tcp LNET network.

All have modprobe.conf entries of:

options lnet networks="tcp0(eth1)"

and "lctl list_nids" on each of them reports the IP address associated
with that interface (a 192.168.200.x address).

However, when my client connects, it ignores the above and goes out
over eth0, even though the MDS/MGS is on the 192.168.200.x network:

client dmesg:

Lustre: 4756:0:(module.c:382:init_libcfs_module()) maximum lustre stack 8192
Lustre: Added LNI 192.168.200.100@tcp [8/256]
Lustre: Accept secure, port 988
Lustre: OBD class driver, in...@clusterfs.com
Lustre Version: 1.6.4.2
Build Version:
1.6.4.2-19691231190000-PRISTINE-.cache.build.BUILD.lustre-kernel-2.6.9.lustre.linux-2.6.9-55.0.9.EL_lustre.1.6.4.2smp
Lustre: Lustre Client File System; in...@clusterfs.com
LustreError: 4799:0:(socklnd_cb.c:2167:ksocknal_recv_hello()) Error
-104 reading HELLO from 192.168.2.201
LustreError: 11b-b: Connection to 192.168.2.201@tcp at host
192.168.2.201 on port 988 was reset: is it running a compatible
version of Lustre and is 192.168.2.201@tcp one of its NIDs?

server dmesg:
LustreError: 120-3: Refusing connection from 192.168.2.192 for
192.168.2.201@tcp: No matching NI
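
The two logs above tell the same story: the client is dialing 192.168.2.201@tcp, and the server refuses because that NID is not one of its configured network interfaces (NIs). Below is a minimal sketch of that check, assuming a POSIX shell and grep. The helper name nid_in_list is made up for illustration.

```shell
# nid_in_list NID
# Reads NIDs on stdin, one per line (the format "lctl list_nids" prints),
# and returns 0 only if NID matches one of them exactly.
nid_in_list() {
    grep -Fxq -- "$1"
}

# Usage on the server (needs Lustre's lctl, so it is left commented out):
#   lctl list_nids | nid_in_list 192.168.2.201@tcp \
#       || echo "192.168.2.201@tcp is not a NID of this server"
```

If the NID the client is trying is not in the server's list, the client got that address from somewhere other than the server's current lnet options.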
_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

Joe Little

Feb 10, 2008, 12:58:02 AM
to lustre-...@lists.lustre.org
Never mind. The problem was resolved by recreating the MGS and the
OSTs using the same parameters on the server. I had been able to
change the parameters and still have the servers working, but my guess
is that those options are permanently etched into the filesystem.

Aaron Knister

Feb 10, 2008, 10:43:16 AM
to Joe Little, lustre-...@lists.lustre.org
I believe that's correct. The nids of the various server components
are stored on the filesystem itself.

Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies

(301) 595-7000
aa...@iges.org

Cliff White

Feb 11, 2008, 11:00:10 PM
to Aaron Knister, lustre-...@lists.lustre.org
Aaron Knister wrote:
> I believe that's correct. The nids of the various server components
> are stored on the filesystem itself.

Yes, and you can always see them with
tunefs.lustre --print <device>

cliffw
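
For comparing what is stored on disk against what "lctl list_nids" reports, something like the following may help. It is only a sketch: it assumes the --print output carries a "Parameters:" line with space-separated key=value pairs such as mgsnode=..., which is an assumption about the output format and not something shown in this thread. stored_mgsnodes is a hypothetical helper name.

```shell
# stored_mgsnodes
# Reads "tunefs.lustre --print" output on stdin and prints each mgsnode= value
# found on the "Parameters:" line, one per line.
# ASSUMPTION: parameters appear as space-separated key=value pairs.
stored_mgsnodes() {
    sed -n 's/^Parameters:[[:space:]]*//p' | tr ' ' '\n' | sed -n 's/^mgsnode=//p'
}

# Usage (needs root and a Lustre target; <device> is left as in the thread):
#   tunefs.lustre --print <device> | stored_mgsnodes
```

A stored value on the wrong subnet would be consistent with the client being pointed at the eth0 address.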

Joe Little

Feb 11, 2008, 11:51:20 PM
to Cliff White, lustre-...@lists.lustre.org
On Feb 11, 2008 8:00 PM, Cliff White <Cliff...@sun.com> wrote:
> Aaron Knister wrote:
> > I believe that's correct. The nids of the various server components
> > are stored on the filesystem itself.
>
> Yes, and you can always see them with
> tunefs.lustre --print <device>
>
> cliffw

Any way to change them after the fact?

Steden Klaus

Feb 11, 2008, 11:53:41 PM
to jmli...@gmail.com, Cliff...@sun.com, lustre-...@lists.lustre.org

If you have root, you can change them using tunefs.lustre after the file system has been shut down.

I've done this a number of times to test various lnet configs.

Klaus
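
A sketch of what that can look like, with the caveats stated up front. It assumes the tunefs.lustre options --erase-params, --mgsnode and --writeconf behave as in the Lustre manual's procedure for changing server NIDs, and that every target is unmounted first. It also assumes --erase-params is acceptable: it drops all stored parameters, so anything else that was set has to be given again. The helper only prints the command for review rather than running it. print_renid_cmd, the NID and the device path are hypothetical.

```shell
# print_renid_cmd NEW_MGS_NID DEVICE
# Prints (does not run) a tunefs.lustre command that would replace the stored
# parameters on DEVICE with a single mgsnode and regenerate the config logs.
print_renid_cmd() {
    printf 'tunefs.lustre --erase-params --mgsnode=%s --writeconf %s\n' "$1" "$2"
}

# Example with placeholder values:
#   print_renid_cmd 192.168.200.201@tcp /dev/sdb1
```

Review the printed line before running it as root on a stopped target. Afterwards, "tunefs.lustre --print <device>" shows what was actually stored.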
