[Lustre-discuss] mount mdt/mgs - file exists -17


Dan

May 26, 2011, 1:27:26 PM
to lustre-...@lists.lustre.org
Hi,

After my MDS crashed, I was unable to mount the MDT/MGS. The dmesg output is below. I can't unload the Lustre modules (lustre_rmmod), and a device is still listed under /proc/fs/lustre/devices even though nothing is mounted. Rebooting the system to try again results in a kernel panic. After a reset I ran fsck, which found no problems, so I tried a --writeconf and deleted CATALOGS, but I still received -17 and was unable to reboot cleanly or unload the modules.

Fortunately, this is my test system, but I'd like to understand what happened! I'm running Lustre 1.8.5 on RHEL 5.5.

cat /proc/fs/lustre/devices
7 AT osc test-OST0000-osc test-mdtlov_UUID 1

Lustre: MGS MGS started
Lustre: MGC192.168.5.100@o2ib: Reactivating import
Lustre: MGC192.168.5.100@o2ib: Reactivating import
Lustre: Enabling user_xattr
Lustre: test-MDT0000: Now serving test-MDT0000 on /dev/sda1 with recovery enabled
Lustre: 5590:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) test-MDT0000: group upcall set to /usr/sbin/l_getgroups
Lustre: test-MDT0000.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
LustreError: 5590:0:(ldlm_lib.c:331:client_obd_setup()) can't add initial connection
LustreError: 5590:0:(obd_config.c:372:class_setup()) setup test-OST0000-osc failed (-2)
LustreError: 5590:0:(obd_config.c:1199:class_config_llog_handler()) Err -2 on cfg command:
Lustre:    cmd=cf003 0:test-OST0000-osc  1:test-OST0000_UUID  2:128.174.5.100@tcp 
LustreError: 15c-8: MGC192.168.5.100@o2ib: The configuration from log 'test-MDT0000' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 5453:0:(obd_mount.c:1126:server_start_targets()) failed to start server test-MDT0000: -2
LustreError: 5453:0:(obd_mount.c:1655:server_fill_super()) Unable to start targets: -2
Lustre: Failing over test-MDT0000
Lustre: Failing over test-mdtlov
Lustre: test-MDT0000: shutting down for failover; client state will be preserved.
Lustre: MDT test-MDT0000 has stopped.
Lustre: MGS has stopped.
Lustre: server umount test-MDT0000 complete
LustreError: 5453:0:(obd_mount.c:2050:lustre_fill_super()) Unable to mount  (-2)
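For readers following the log: the negative numbers in these messages (-2 here, -17 in the subject line) are kernel-style return codes, i.e. negated errno values. A minimal Python sketch for decoding them is below; the helper name `describe_kernel_rc` is made up for illustration and is not part of Lustre.

```python
import errno
import os


def describe_kernel_rc(rc: int) -> str:
    """Turn a negated errno such as -2 into its symbolic name and message."""
    code = abs(rc)
    # errno.errorcode maps numeric values to names like 'ENOENT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{rc}: {name} ({os.strerror(code)})"


if __name__ == "__main__":
    # -2 appears throughout the dmesg output; -17 is the code in the subject.
    for rc in (-2, -17):
        print(describe_kernel_rc(rc))
```

On Linux this reports ENOENT for -2 and EEXIST for -17, which matches the "file exists" wording in the subject.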

Thanks,

Dan

Johann Lombardi

May 26, 2011, 4:59:38 PM
to Dan, lustre-...@lists.lustre.org
On Thu, May 26, 2011 at 10:27:26AM -0700, Dan wrote:
> Lustre: MGS MGS started
> Lustre: MGC192.168.5.100@o2ib: Reactivating import
> Lustre: MGC192.168.5.100@o2ib: Reactivating import

So you use InfiniBand ...

[...]


> LustreError: 5590:0:(ldlm_lib.c:331:client_obd_setup()) can't add
> initial connection
> LustreError: 5590:0:(obd_config.c:372:class_setup()) setup
> test-OST0000-osc failed (-2)
> LustreError: 5590:0:(obd_config.c:1199:class_config_llog_handler()) Err
> -2 on cfg command:
> Lustre: cmd=cf003 0:test-OST0000-osc 1:test-OST0000_UUID
> 2:128.174.5.100@tcp

But a tcp nid is registered for OST0000. Is this intended?
If so, have you configured lnet on the MDS to use tcp?
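The mismatch pointed out above (an o2ib NID for the MGC, a tcp NID in the OST0000 config record) can be spotted mechanically. The following is only a rough sketch in Python: the function names `nid_networks` and `foreign_nids` are hypothetical, the sample lines are copied from the dmesg output earlier in the thread, and the set of networks configured on the MDS (`{"o2ib"}`) is an assumption that would need to be checked against the actual lnet configuration.

```python
import re
from typing import Dict, List, Set

# Matches IPv4-style LNet NIDs such as "192.168.5.100@o2ib".
NID_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3})@([a-z0-9]+)")


def nid_networks(log_text: str) -> Dict[str, Set[str]]:
    """Group the addresses found in log_text by LNet network name."""
    networks: Dict[str, Set[str]] = {}
    for addr, net in NID_RE.findall(log_text):
        networks.setdefault(net, set()).add(addr)
    return networks


def foreign_nids(log_text: str, local_nets: Set[str]) -> List[str]:
    """Return NIDs whose network is not among the locally configured ones."""
    found = nid_networks(log_text)
    return sorted(
        f"{addr}@{net}"
        for net, addrs in found.items()
        if net not in local_nets
        for addr in addrs
    )


SAMPLE = """\
Lustre: MGC192.168.5.100@o2ib: Reactivating import
Lustre:    cmd=cf003 0:test-OST0000-osc  1:test-OST0000_UUID  2:128.174.5.100@tcp
"""

if __name__ == "__main__":
    # Assumption: the MDS has only o2ib configured in lnet.
    print(foreign_nids(SAMPLE, {"o2ib"}))
```

Any NID reported this way is one the node could not reach under the assumed configuration, which would be consistent with the "can't add initial connection" error.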

Cheers,
Johann

--
Johann Lombardi
Whamcloud, Inc.
www.whamcloud.com
_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
