[Lustre-discuss] More failover issues


Robert LeBlanc

Nov 12, 2007, 11:41:54 AM
to lustre
In 1.6.0, when creating an MDT, you could specify multiple --mgsnode options
and it would fail over between them. 1.6.3 only seems to take the last one,
and --mgsnode=192.168.1.252@o2ib:192.168.1.253@o2ib doesn't seem to fail over
to the other node either. Any ideas how to get around this?

Robert

Robert LeBlanc
College of Life Sciences Computer Support
Brigham Young University
leb...@byu.edu
(801)422-1882


_______________________________________________
Lustre-discuss mailing list
Lustre-...@clusterfs.com
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss

Nathan Rutman

Nov 12, 2007, 3:51:07 PM
to Robert LeBlanc, lustre
Robert LeBlanc wrote:
> In 1.6.0, when creating a MDT, you could specify multiple --mgsnode options
> and it would failover between them. 1.6.3 only seems to take the last one
> and --mgsnode=192.168.1.252@o2ib:192.168.1.253@o2ib doesn't seem to failover
> to the other node. Any ideas how to get around this?
>
Multiple --mgsnode parameters should work:
mkfs.lustre --mkfsoptions="-O dir_index" --reformat --mdt \
    --mgsnode=192.168.1.253@o2ib --mgsnode=1@elan --device-size=10000 /tmp/foo

Permanent disk data:
Target: lustre-MDTffff
Index: unassigned
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x71
(MDT needs_index first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: mgsnode=192.168.1.253@o2ib mgsnode=1@elan

Robert LeBlanc

Nov 12, 2007, 4:18:17 PM
to Nathan Rutman, lustre

This is what I'm getting:

head2-2:~# mkfs.lustre --mkfsoptions="-O dir_index" --reformat --mdt --fsname=home --mgsnode=192.168.1.252@o2ib --mgsnode=192.168.1.253@o2ib --failnode=192.168.1.252@o2ib /dev/mapper/ldiskd-part1

   Permanent disk data:
Target:     home-MDTffff
Index:      unassigned
Lustre FS:  home
Mount type: ldiskfs
Flags:      0x71
              (MDT needs_index first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters:  mgsnode=192.168.1.253@o2ib failover.node=192.168.1.252@o2ib mdt.group_upcall=/usr/sbin/l_getgroups

device size = 972MB
formatting backing filesystem ldiskfs on /dev/mapper/ldiskd-part1
        target name  home-MDTffff
        4k blocks     0
        options       -O dir_index -i 4096 -I 512 -q -F
mkfs_cmd = mkfs.ext2 -j -b 4096 -L home-MDTffff -O dir_index -i 4096 -I 512 -q -F /dev/mapper/ldiskd-part1
Writing CONFIGS/mountdata


For some reason, only the last --mgsnode option is being kept.
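A quick way to confirm how many mgsnode entries actually ended up in the target's parameters is to count them on the "Parameters:" line of saved mkfs.lustre or tunefs.lustre output. The sketch below is only an illustration: `count_mgsnodes` is a hypothetical helper, not part of the Lustre utilities, and it assumes the output is fed on stdin with the Parameters line unwrapped. When several Parameters lines are present (as in tunefs.lustre output), it looks at the last one, which is the "Permanent disk data" section.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): count the mgsnode= entries on the
# last "Parameters:" line read from stdin.
# Assumes the Parameters line is on a single, unwrapped line.
count_mgsnodes() {
    grep '^Parameters:' |     # keep only the Parameters lines
        tail -n 1 |           # the last one describes what is written to disk
        tr -s ' ' '\n' |      # one parameter per line, squeezing repeated spaces
        grep -c '^mgsnode='   # count the entries whose key is mgsnode
}

# Parameters line from the failing run quoted in this message
printf '%s\n' 'Parameters:  mgsnode=192.168.1.253@o2ib failover.node=192.168.1.252@o2ib mdt.group_upcall=/usr/sbin/l_getgroups' | count_mgsnodes

# Parameters line from the working example earlier in the thread
printf '%s\n' 'Parameters: mgsnode=192.168.1.253@o2ib mgsnode=1@elan' | count_mgsnodes
```

A count of 1 after passing two --mgsnode options matches the symptom described here; a count of 2 is what the earlier working example shows.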

Robert

Wojciech Turek

Nov 12, 2007, 4:48:50 PM
to Robert LeBlanc, Nathan Rutman, lustre
Hi,

I think this is because there can be only one MGS per Lustre installation (this is what the manual says).

Wojciech Turek
Mr Wojciech Turek
Assistant System Manager
University of Cambridge
High Performance Computing service 



Robert LeBlanc

Nov 12, 2007, 4:56:07 PM
to Wojciech Turek, Nathan Rutman, lustre
Yes, there is only one MGS per site, but you should be able to specify multiple MGS nodes. We have done it before with 1.6.0. See http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-05-1.html section 2.2.2.1.

Robert

Robert LeBlanc

Nov 12, 2007, 5:11:35 PM
to Robert LeBlanc, Nathan Rutman, lustre

Moreover, tunefs returns:

head2-2:~# tunefs.lustre --mgsnode=192.168.1.253@o2ib --mgsnode=192.168.1.252@o2ib --writeconf /dev/mapper/ldiskd-part1
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

   Read previous values:
Target:     home-MDT0000
Index:      0
Lustre FS:  home
Mount type: ldiskfs
Flags:      0x101
              (MDT writeconf )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters:  failover.node=192.168.1.252@o2ib mdt.group_upcall=/usr/sbin/l_getgroups mgsnode=192.168.1.253@o2ib


   Permanent disk data:
Target:     home-MDT0000
Index:      0
Lustre FS:  home
Mount type: ldiskfs
Flags:      0x101
              (MDT writeconf )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters:  failover.node=192.168.1.252@o2ib mdt.group_upcall=/usr/sbin/l_getgroups   mgsnode=192.168.1.252@o2ib

Writing CONFIGS/mountdata


Notice how there are two spaces between the mdt.group_upcall and the mgsnode parameters. If you only specify one mgsnode, then there is only one space. I wonder if there is something buggy with the parser.
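The spacing symptom can also be checked mechanically rather than by eye. The following is a minimal sketch, assuming the tool output has been saved and is fed on stdin; `has_param_gap` is a hypothetical name for illustration and says nothing about how the Lustre parser itself works. It only reports whether any Parameters line has a run of two or more spaces between entries.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): succeed (exit 0) if any
# "Parameters:" line on stdin has two or more consecutive spaces between
# entries, which is the symptom described above.
has_param_gap() {
    sed -n 's/^Parameters: *//p' |   # drop the label and its padding
        grep -Eq '[^ ] {2,}[^ ]'     # entry, 2+ spaces, next entry
}

# "Permanent disk data" line from the two-mgsnode tunefs.lustre run above
if printf '%s\n' 'Parameters:  failover.node=192.168.1.252@o2ib mdt.group_upcall=/usr/sbin/l_getgroups   mgsnode=192.168.1.252@o2ib' | has_param_gap
then
    echo "extra spaces between parameters"
else
    echo "single spaces only"
fi
```

The padding directly after "Parameters:" is stripped first so that it does not count as a gap.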

Robert

Wojciech Turek

Nov 12, 2007, 5:23:34 PM
to Robert LeBlanc, Nathan Rutman, lustre
Yes, but in the example in section 2.2.2.1 the two mgsnodes are specified for --ost, whereas you are specifying them for --mdt. Maybe that is the problem? Do you have a combined MGS and MDT? Do you have one file system or more?

Wojciech

Robert LeBlanc

Nov 12, 2007, 5:30:16 PM
to Wojciech Turek, Nathan Rutman, lustre
My MDTs and MGS are separate. Since I have two MDTs, I didn't want the MGS tied to just one of them, so I separated it. It seems that you should be able to specify more than one MGS node for the MDT, because there is really no other way to tell it what the MGS failover partner is.

Robert

Nathan Rutman

Nov 13, 2007, 3:09:05 PM
to Robert LeBlanc, lustre
Strangely, this works with my version of the 1.6.3 release branch, but
doesn't work on my 1.6.4 prerelease (same problem as you).
Anyhow, try grabbing an older version of mkfs.lustre, from 1.6.0.1,
1.6.1, or 1.6.2. There have been no major changes.

cfs21:~/cfs/b_release_1_6_3/lustre/utils# ./mkfs.lustre
mkfs.lustre v1.6.3
cfs21:~/cfs/b_release_1_6_3/lustre/utils# ./mkfs.lustre \
    --mkfsoptions="-O dir_index" --reformat --mdt --fsname=home \
    --mgsnode=192.168.1.252@o2ib --mgsnode=192.168.1.253@o2ib \
    --failnode=192.168.1.252@o2ib --device-size=10000 /tmp/foo

Permanent disk data:
Target: home-MDTffff
Index: unassigned
Lustre FS: home
Mount type: ldiskfs
Flags: 0x71
(MDT needs_index first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: mgsnode=192.168.1.252@o2ib mgsnode=192.168.1.253@o2ib failover.node=192.168.1.252@o2ib
