[Lustre-discuss] quotacheck fails on filesystem with a permanently inactivated OST

Sam Aparicio

Mar 12, 2011, 7:15:02 PM
to lustre-...@lists.lustre.org
I have run into a problem: lfs quotacheck fails on a system where an OST has failed and been permanently removed.
I noticed a similar report dated 2008, with a comment that this was likely a bug. Has it been fixed in 2.1?

Any ideas on how to get quotas going, short of reformatting all the OSTs again?

We are running Lustre 1.8.4.

Also, on the MGS/MDT server I can see quota_ parameters assigned to the MDT (ug3),
but I don't see any for the OSTs, although all were formatted (mkfs) with quota_type=ug in the parameter list.
Is that expected?

thanks

sam aparicio

-00-


LustreError: 19615:0:(quota_check.c:253:lov_quota_check()) lov idx 3 inactive
LustreError: 19627:0:(quota_ctl.c:373:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -16
LustreError: 29915:0:(obd_class.h:1435:obd_find_cbdata()) obd_find_cbdata: NULL export
LustreError: 29915:0:(obd_class.h:1435:obd_find_cbdata()) obd_find_cbdata: NULL export
LustreError: 30028:0:(quota_check.c:253:lov_quota_check()) lov idx 3 inactive
_______________________________________________
Lustre-discuss mailing list
Lustre-...@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss

Sam Aparicio

Mar 14, 2011, 12:28:02 AM
to lustre-...@lists.lustre.org
I have run into a problem: lfs quotacheck fails on a system where an OST has failed and been permanently removed.
I noticed a similar report dated 2008, with a comment that this was likely a bug. Has it been fixed in 2.1?

Any ideas on how to get quotas going, short of reformatting all the OSTs again?

We are running Lustre 1.8.4.

Also, on the MGS/MDT server I can see quota_ parameters assigned to the MDT (ug3),
but I don't see any for the OSTs, although all were formatted (mkfs) with quota_type=ug in the parameter list.
Is that expected?

I notice that quota_check thinks only 3 OSTs are up and aborts quotacheck. It seems that it cannot correctly read the list of active and permanently inactivated OSTs?

thanks in advance for any help.

Johann Lombardi

Mar 14, 2011, 3:53:02 AM
to Sam Aparicio, lustre-...@lists.lustre.org
On Sun, Mar 13, 2011 at 09:28:02PM -0700, Sam Aparicio wrote:
> I have run into a problem that lfs quotacheck fails on a system where an OST has failed and been permanently removed.
> I noticed a report similar to this dated 2008 with a comment that this was likely a bug - has it been fixed in 2.1?
>
> any idea on how to proceed with getting quotas going, short of reformatting all the OSTs again.
>
> we are running lustre1.8.4

This problem was fixed in 1.8.5; see Bugzilla ticket 21174 for more information.

Cheers,
Johann

--
Johann Lombardi
Whamcloud, Inc.
www.whamcloud.com

Samuel Aparicio

Mar 14, 2011, 12:38:49 PM
to Johann Lombardi, lustre-...@lists.lustre.org
thanks!
s.

Samuel Aparicio

Mar 14, 2011, 1:36:01 PM
to Johann Lombardi, lustre-...@lists.lustre.org
Well, I upgraded the OSTs, MGS/MDT and clients to 1.8.5, then rebooted and remounted everything on the OST/MGS, but the issue seems to persist.

In the MDS kernel log I notice
Lustre: 7178:0:(quota_master.c:1716:mds_quota_recovery()) Only 0/7 OSTs are active, abort quota recovery

but all the OSTs are active and the filesystem is operational ...

In the client log I notice
LustreError: 6940:0:(quota_check.c:253:lov_quota_check()) lov idx 3 inactive
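For anyone triaging similar logs, here is a minimal Python sketch that pulls the inactive OST index and the active/total OST counts out of messages like the two quoted above. It is not a Lustre tool: the function name summarize_quota_log is hypothetical, and the regular expressions assume the message wording is exactly as it appears in this thread.

```python
import re

# Patterns modeled on the two log messages quoted in this thread.
# Assumption: other Lustre versions may word these messages differently.
INACTIVE_RE = re.compile(r"lov idx (\d+) inactive")
ACTIVE_RE = re.compile(r"Only (\d+)/(\d+) OSTs are active")


def summarize_quota_log(lines):
    """Return (sorted inactive OST indexes, list of (active, total) counts)."""
    inactive = set()
    active_counts = []
    for line in lines:
        match = INACTIVE_RE.search(line)
        if match:
            inactive.add(int(match.group(1)))
        match = ACTIVE_RE.search(line)
        if match:
            active_counts.append((int(match.group(1)), int(match.group(2))))
    return sorted(inactive), active_counts


# The two lines below are copied from the MDS and client logs in this message.
sample = [
    "Lustre: 7178:0:(quota_master.c:1716:mds_quota_recovery()) "
    "Only 0/7 OSTs are active, abort quota recovery",
    "LustreError: 6940:0:(quota_check.c:253:lov_quota_check()) "
    "lov idx 3 inactive",
]
inactive, counts = summarize_quota_log(sample)
print("inactive OST indexes:", inactive)
print("active/total OST counts:", counts)
```

Running this against saved kernel log lines from the MDS and a client makes it easy to see whether the index reported as inactive matches the OST that was deliberately deactivated.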

The steps followed here were:
1. Create a Lustre filesystem with 7 OSTs, with ug quotas enabled at mkfs time.
2. Permanently inactivate one of the OSTs, having removed all the files first.
3. Run lfs quotacheck, which fails the first time around with the errors above.

any ideas?

thanks

sam aparicio

Johann Lombardi

Mar 15, 2011, 8:26:44 AM
to Samuel Aparicio, lustre-...@lists.lustre.org
On Mon, Mar 14, 2011 at 10:36:01AM -0700, Samuel Aparicio wrote:
> Well I upgraded the OSTs, MGS/MDT and clients to 1.8.5, rebooted and remounted everything on the OST/MGS, the issue seems to persist though.
>
> In the MDS kernel log I notice
> Lustre: 7178:0:(quota_master.c:1716:mds_quota_recovery()) Only 0/7 OSTs are active, abort quota recovery
>
> but all the OSTs are active and the filesystem operational ...
>
> In the client log I notice
> LustreError: 6940:0:(quota_check.c:253:lov_quota_check()) lov idx 3 inactive

Arr, the fix works well with sparse OST indexes, but not with deactivated OSTs. I'm sorry about that. I will have this fixed.

Johann Lombardi

Mar 15, 2011, 9:31:56 AM
to Samuel Aparicio, lustre-...@lists.lustre.org
On Tue, Mar 15, 2011 at 01:26:44PM +0100, Johann Lombardi wrote:
> Arr, the fix works well with sparse OST indexes, but not with deactivated OSTs. I'm sorry about that. I will have this fixed.

FYI, I have filed a bug for this issue:
http://jira.whamcloud.com/browse/LU-129

It should not take long to have a patch ready for testing.

Samuel Aparicio

Mar 15, 2011, 12:15:36 PM
to Johann Lombardi, lustre-...@lists.lustre.org
cheers.