Hi,
After my MDS crashed I was unable to mount the MDT/MGS; the dmesg output is below. I also cannot unload the Lustre modules (lustre_rmmod), and the device is still listed under /proc/fs/lustre/devices even though it is not mounted. Rebooting the system to try again results in a kernel panic. After a reset I ran fsck, which found no problems, so I tried a --writeconf and deleted CATALOGS, but I still received -17 and was unable to reboot cleanly or unload the modules.
Fortunately this is my test system, but I'd like to understand what happened. I'm running Lustre 1.8.5 on RHEL 5.5.
cat /proc/fs/lustre/devices
7 AT osc test-OST0000-osc test-mdtlov_UUID 1
Lustre: MGS MGS started
Lustre: MGC192.168.5.100@o2ib: Reactivating import
Lustre: MGC192.168.5.100@o2ib: Reactivating import
Lustre: Enabling user_xattr
Lustre: test-MDT0000: Now serving test-MDT0000 on /dev/sda1 with recovery enabled
Lustre: 5590:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) test-MDT0000: group upcall set to /usr/sbin/l_getgroups
Lustre: test-MDT0000.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
LustreError: 5590:0:(ldlm_lib.c:331:client_obd_setup()) can't add initial connection
LustreError: 5590:0:(obd_config.c:372:class_setup()) setup test-OST0000-osc failed (-2)
LustreError: 5590:0:(obd_config.c:1199:class_config_llog_handler()) Err -2 on cfg command:
Lustre: cmd=cf003 0:test-OST0000-osc 1:test-OST0000_UUID 2:128.174.5.100@tcp
LustreError: 15c-8: MGC192.168.5.100@o2ib: The configuration from log 'test-MDT0000' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 5453:0:(obd_mount.c:1126:server_start_targets()) failed to start server test-MDT0000: -2
LustreError: 5453:0:(obd_mount.c:1655:server_fill_super()) Unable to start targets: -2
Lustre: Failing over test-MDT0000
Lustre: Failing over test-mdtlov
Lustre: test-MDT0000: shutting down for failover; client state will be preserved.
Lustre: MDT test-MDT0000 has stopped.
Lustre: MGS has stopped.
Lustre: server umount test-MDT0000 complete
LustreError: 5453:0:(obd_mount.c:2050:lustre_fill_super()) Unable to mount (-2)
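For reference, the negative return codes above can be mapped to errno names with a small Python sketch. This is not part of Lustre; the function name decode_codes and the regular expression are my own, and it assumes the codes are standard Linux errno values and that the script is run on Linux.

```python
import errno
import os
import re

# Hypothetical helper, not a Lustre tool: finds negative return codes such as
# "(-2)" or ": -2" in pasted log text. The lookbehind skips hyphens that sit
# inside identifiers like "test-OST0000-osc" or message IDs like "15c-8".
CODE_RE = re.compile(r"(?<![\w-])-(\d+)\b")


def decode_codes(log_text):
    """Return {negative_code: (errno_name, description)} for codes in log_text."""
    found = sorted({int(n) for n in CODE_RE.findall(log_text)})
    return {
        -n: (errno.errorcode.get(n, "UNKNOWN"), os.strerror(n))
        for n in found
    }


if __name__ == "__main__":
    sample = (
        "LustreError: 5590:0:(obd_config.c:372:class_setup()) "
        "setup test-OST0000-osc failed (-2)\n"
        "still received -17\n"
    )
    for code, (name, text) in decode_codes(sample).items():
        print(code, name, text)
```

On Linux this reports -2 as ENOENT and -17 as EEXIST.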
Thanks,
Dan