SDS says "Needs maintenance" when nothing's wrong. How to clear?

mkir...@rochester.rr.com

Feb 25, 2006, 8:25:38 AM

Last night, we had some connectivity problems between a V210 (Solaris
8) and an EMC Clariion. The long and short of it is that I ended up
unmounting, fscking, and remounting all 8 filesystems.

I did not notice this last night, but SDS is now complaining that all 8
metadevices "need maintenance." They mounted okay, the data is fine,
and I've confirmed that there is nothing physically wrong with any of
the components.

How do I clear the "Needs maintenance" messages from my metadevices
without destroying data?

This is an example of one of the devices. Yes, it's a "one-sided"
mirror, and I set it up that way on purpose: if I ever need to migrate
to another type of storage, I can simply add the new storage as a
second submirror and let it sync.

d16: Mirror
    Submirror 0: d106
      State: Needs maintenance
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 17694080 blocks

d106: Submirror of d16
    State: Needs maintenance
    Invoke: after replacing "Maintenance" components:
                metareplace d16 c3t61d6s0 <new device>
    Size: 17694080 blocks
    Stripe 0:
        Device      Start Block  Dbase  State       Hot Spare
        c3t61d6s0   0            No     Last Erred

As I said, there's nothing wrong with the disk or data, yet SDS is
complaining.
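
With 8 metadevices to check, it can help to pull just the affected names out of the metastat output. A minimal sketch, assuming the output looks like the sample above (a "dNN:" header line followed by a "State:" line); the function name is made up for illustration, and on Solaris 8 the awk would need to be nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk:

```shell
#!/bin/sh
# List metadevices whose State line reads "Needs maintenance".
# Reads metastat output on stdin, so it can be tried on saved output first.
needs_maintenance() {
    awk '
        # Header line such as "d16: Mirror": remember the metadevice name.
        /^d[0-9]+:/ { split($1, part, ":"); dev = part[1] }
        # Print each affected metadevice once.
        /State: Needs maintenance/ && !seen[dev]++ { print dev }
    '
}

# On a live system this would be used as:
#   metastat | needs_maintenance
```

Fed the output above, this should list d16 and d106.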

ld kelley (larry)

Feb 25, 2006, 8:53:30 AM

try metastat -i

mkir...@rochester.rr.com

Feb 25, 2006, 8:55:27 AM

I get the exact same output:

Julian Jacobs

Feb 25, 2006, 9:07:17 AM

<mkir...@rochester.rr.com> wrote in message
news:1140875727.0...@u72g2000cwu.googlegroups.com...

You could try replacing the device with itself:
metareplace -e d16 c3t61d6s0
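
A rough sketch of that sequence, checking state before and after. The wrapper and function names are only for illustration, and by default it just prints the commands so they can be reviewed first (setting DRY_RUN=0 would actually run them):

```shell
#!/bin/sh
# Sketch: re-enable a component in place and check the result.
# DRY_RUN=1 (the default here) only prints each command.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reenable_component() {
    mirror=$1       # e.g. d16
    component=$2    # e.g. c3t61d6s0
    run metastat "$mirror" &&                       # state before
    run metareplace -e "$mirror" "$component" &&    # replace with itself
    run metastat "$mirror"                          # state after
}

reenable_component d16 c3t61d6s0
```

Back up first if at all possible; the man page is explicit about that for Last Erred components.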

JulianJ

mkir...@rochester.rr.com

Feb 25, 2006, 9:42:24 AM

Yes, but since this mirror has only one copy, will it just clear things
up or will it try to resync from nothing?

mkir...@rochester.rr.com

Feb 25, 2006, 10:19:03 AM

I have a solution: Give it what it wants!

I'll allocate some temporary storage to create a two-copy mirror, sync
up, do the metareplace, and voilà!

Julian Jacobs

Feb 25, 2006, 1:01:17 PM

<mkir...@rochester.rr.com> wrote in message
news:1140880743.8...@v46g2000cwv.googlegroups.com...

> I have a solution: Give it what it wants!
>
> I'll allocate some temporary storage to create a two-copy mirror, sync
> up, do the metareplace, and voilà!
>

You have to replace the Last Erred component first.
A metareplace -e should just put that metadevice back into a good state.
If the LUN is on a Clariion, it should already be in a good state.
Anyway, why create a metadevice for this LUN?

JulianJ

slackware guy

Feb 25, 2006, 5:25:55 PM

I think metareplace only works if the data is mirrored. Going by your
output, d16 should have 2 submirrors that are mirrors of each other, but
it has only one. You do not have mirrored data. If the disk is not
totally dead, BACK IT UP IMMEDIATELY! If you do a metareplace, there will
be nothing to restore your data from. Your alternative (assuming you have
enough life left in the disk) is to metainit a second submirror to pair
with d106 (say, metainit d206 1 1 cXtXdXs2) and attach it to the mirror
using metattach d16 d206.

Good Luck!

slackware guy

Feb 25, 2006, 5:32:37 PM

Then do the metareplace
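
Put together, the order would look roughly like the sketch below. It only prints a plan and runs nothing. d206 and cXtXdXs2 are the placeholders from the previous post, the function name is made up, and the detach and clear steps at the end are an assumption about how the temporary storage would be removed afterwards. Whether SDS accepts the attach while d106 is still Last Erred is not something this thread confirms.

```shell
#!/bin/sh
# Print (not run) the temporary two-way mirror sequence.
plan_temp_mirror() {
    mirror=$1       # existing one-way mirror, e.g. d16
    component=$2    # component flagged Last Erred, e.g. c3t61d6s0
    tmp_md=$3       # new submirror name, e.g. d206
    tmp_slice=$4    # slice on the temporary storage, e.g. cXtXdXs2

    echo "metainit $tmp_md 1 1 $tmp_slice"
    echo "metattach $mirror $tmp_md"
    echo "metastat $mirror    # repeat until the resync has finished"
    echo "metareplace -e $mirror $component"
    # Assumed cleanup once the mirror reports a good state:
    echo "metadetach $mirror $tmp_md"
    echo "metaclear $tmp_md"
}

plan_temp_mirror d16 c3t61d6s0 d206 cXtXdXs2
```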

Julian Jacobs

Feb 26, 2006, 11:29:09 AM

"slackware guy" <dgbr...@cox.net> wrote in message
news:1140906355.4...@i39g2000cwa.googlegroups.com...

The following is a link to a metareplace man page:
http://www.cse.msu.edu/cgi-bin/man2html?metareplace?1m?/usr/man
metareplace - enable or replace components of submirrors or RAID5
metadevices

A component may be in one of several states. The Last Erred and the
Maintenance states require action. Always replace components in the
Maintenance state first, followed by a resync and validation of data. After
components requiring maintenance are fixed, validated, and resynced,
components in the Last Erred state should be replaced. To avoid data loss,
it is always best to back up all data before replacing Last Erred devices

-e Transitions the state of component to the available state and
resyncs the failed component. If the failed component has been hot spare
replaced, the hot spare is placed in the available state and made available
for other hot spare replacements. This command is useful when a component
fails due to human error (for example, accidentally turning off a disk), or
because the component was physically replaced. In this case, the replacement
component must be partitioned to match the disk being replaced before
running the metareplace command.

As this mirror is a single-sided mirror, we have two options:
1. Delete and recreate the mirror.
2. Replace the errored component with itself.
The second option is the fastest, as only a metareplace -e d16 c3t61d6s0
needs to be done.
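
For comparison, option 1 would be something along the lines of the sketch below, which only prints the commands. The mount point /export/data is hypothetical (the thread never names one) and so is the function name. It also assumes that recreating a simple one-slice concat on the same slice leaves the data in place, which is worth confirming against a backup before relying on it.

```shell
#!/bin/sh
# Print (not run) the delete-and-recreate sequence for a one-way mirror.
plan_recreate() {
    mirror=$1       # e.g. d16
    submirror=$2    # e.g. d106
    component=$3    # e.g. c3t61d6s0
    mountpoint=$4   # hypothetical, e.g. /export/data

    echo "umount $mountpoint"                   # metadevice must not be in use
    echo "metaclear -r $mirror"                 # removes the mirror and its submirror
    echo "metainit $submirror 1 1 $component"   # same slice as before
    echo "metainit $mirror -m $submirror"       # one-way mirror again
    echo "mount $mountpoint"                    # relies on the existing vfstab entry
}

plan_recreate d16 d106 c3t61d6s0 /export/data
```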

Is it worth having these mirrors? All you are doing is putting a volume
manager layer (LVM, or whatever you want to call it) in the way.
The chances are that if you do migrate to new storage, you will also move
to larger LUNs. That adds a growfs step to your migration plan, and
growfs write-locks the file system while it runs.

JulianJ