ZFS hangs on pulled disk

Tobias Mueller

Sep 19, 2006, 8:46:35 AM

Hi there!

I am currently testing ZFS with the latest Solaris 10 SPARC release (from June).
I have created a RAID-Z pool from three external SCSI disks.

I can bring one disk offline using zpool, pull it and replace it with
another disk. No problem, so it should really be a RaidZ.
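
For reference, the planned swap described above corresponds roughly to the command sequence below. This is only a sketch in Python that builds the command lines without running them. The pool name `tank` and the device names `c1t2d0` and `c1t3d0` are placeholders rather than the ones from this setup, and the helper name is made up for illustration.

```python
def planned_swap_commands(pool, old_dev, new_dev=None):
    """Build (but do not run) the zpool commands for a planned disk swap.

    All names passed in are placeholders chosen by the caller.
    """
    # Step 1: take the disk offline before touching the hardware.
    commands = [["zpool", "offline", pool, old_dev]]

    # Step 2 (after physically swapping the disk): start the replacement.
    replace = ["zpool", "replace", pool, old_dev]
    if new_dev is not None:
        # Only needed when the new disk appears under a different device name.
        replace.append(new_dev)
    commands.append(replace)

    # Step 3: check the pool state and resilver progress.
    commands.append(["zpool", "status", pool])
    return commands


if __name__ == "__main__":
    for cmd in planned_swap_commands("tank", "c1t2d0", "c1t3d0"):
        print(" ".join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without ZFS.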

But when I simply pull one disk physically, the whole ZFS is nearly dead.
`zpool status` hangs, I can't unmount the whole tree (so even init 6
hangs), and an "ls" or "df" hangs as well. The only actions that still
worked were `echo something > fileinzfspool` and touch.

Has anyone experienced the same problem? Is there any known way to
avoid a hard reset of the whole server when one device fails?

Thanks in advance,

Tobias

toby

Sep 19, 2006, 11:14:24 AM

Tobias Mueller wrote:
> Hi there!
>
> I am just on testing ZFS with the latest Solaris 10 sparc (from june).
> I have created an RaidZ with three external SCSI-Disks.
>
> I can bring one disk offline using zpool, pull it and replace it with
> another disk. No problem, so it should really be a RaidZ.
>
> But when I pull one disk physically the whole ZFS is nearly dead.
> ZPool status hangs, i can't unmount the whole tree (so even init 6
> hangs) and also an "ls" or "df" hangs. The only possible action which
> works was an `echo something > fileinzfspool`or touch.
>
> Has anyone experienced the same problems?

Yes, I've posted about this; I see the same behaviour on an X2100. Nobody
has replied. I have this problem with SVM also. In my tests it makes no
difference whether the mirrors are offlined first: the machine hangs
until the disk is replaced.

Daniel Rock

Sep 19, 2006, 11:38:09 AM

toby <to...@telegraphics.com.au> wrote:
> Yes, I've posted about this, same behaviour on an X2100. Nobody has
> replied. I have this problem with SVM also. It doesn't make a
> difference in my tests if mirrors are offline'd first, machine hangs
> until disk is replaced.

X2100 (ATA) is a different thing. Hotplugging is not supported by the
ata driver[1], so this is the expected behaviour. SCSI should behave
better, though.

[1] no native SATA support for the NForce chipsets yet.

--
Daniel

toby

Sep 19, 2006, 12:51:09 PM

Daniel Rock wrote:
> toby <to...@telegraphics.com.au> wrote:
> > Yes, I've posted about this, same behaviour on an X2100. Nobody has
> > replied. I have this problem with SVM also. It doesn't make a
> > difference in my tests if mirrors are offline'd first, machine hangs
> > until disk is replaced.
>
> X2100 (ATA) is a different thing. Hotplugging is not supported by the
> ata driver[1], so this is the expected behaviour.

Colour me NAIVE. The bays are advertised as hot swappable. Well, well -
false claim? "Hot swappable with future software upgrade"?

Rich Teer

Sep 19, 2006, 1:05:25 PM

On Tue, 19 Sep 2006, toby wrote:

> Colour me NAIVE. The bays are advertised as hot swappable. Well, well -
> false claim? "Hot swappable with future software upgrade"?

No, if you look at the fine print, you'll see that hot swapping is only
supported under Windoze at present. I hear Solaris support for hot
swappability is in the works.

--
Rich Teer, SCNA, SCSA, OpenSolaris CAB member

President,
Rite Online Inc.

Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich

toby

Sep 19, 2006, 1:38:42 PM

Rich Teer wrote:
> On Tue, 19 Sep 2006, toby wrote:
>
> > Colour me NAIVE. The bays are advertised as hot swappable. Well, well -
> > false claim? "Hot swappable with future software upgrade"?
>
> No, if you look at the fine print, you'll see that hot swapping is only
> supported under Windoze at present. I hear Solaris support for hot
> swapability is in the works.

Thanks, glad to hear.

toby

Sep 20, 2006, 7:33:40 PM

Rich Teer wrote:
> On Tue, 19 Sep 2006, toby wrote:
>
> > Colour me NAIVE. The bays are advertised as hot swappable. Well, well -
> > false claim? "Hot swappable with future software upgrade"?
>
> No, if you look at the fine print,

The fine print referred to is, I believe,
http://www.sun.com/products-n-solutions/hardware/docs/html/819-3721-12/Chap4.html#pgfId-999846
which says that unless using "integrated mirroring" (NVRAID - supported
under Windows only), the server must be powered down before replacing
disks.

> you'll see that hot swapping is only
> supported under Windoze at present. I hear Solaris support for hot
> swapability is in the works.

I expect this is support for NVRAID only... those of us using SVM and
ZFS are SOL. :/

I like my X2100, but I'd have preferred that Sun had emphasised that
the "hotswappable" SATA bays are only such if running Windows AND in
NVRAID configuration. Is this limitation common to IBM, HP and Dell
small server lines?

Out of interest, what is the minimum spec Sun box with full support for
online drive replacement?

Rich Teer

Sep 21, 2006, 12:34:15 PM

On Wed, 20 Sep 2006, toby wrote:

> > supported under Windoze at present. I hear Solaris support for hot
> > swapability is in the works.
>
> I expect this is support for NVRAID only... those of us using SVM and
> ZFS are SOL. :/

To be honest, I'm not entirely sure. I did look into this a while ago,
but the details are lost in the mists of time. :-)

> I like my X2100, but I'd have preferred that Sun had emphasised that
> the "hotswappable" SATA bays are only such if running Windows AND in
> NVRAID configuration. Is this limitation common to IBM, HP and Dell
> small server lines?

No idea, as I don't even consider models from other vendors.

> Out of interest, what is the minimum spec Sun box with full support for
> online drive replacement?

I *think* at the moment that would be the X4100.

toby

Sep 21, 2006, 3:21:48 PM

Rich Teer wrote:
> On Wed, 20 Sep 2006, toby wrote:
>
> > > supported under Windoze at present. I hear Solaris support for hot
> > > swapability is in the works.
> >
> > I expect this is support for NVRAID only... those of us using SVM and
> > ZFS are SOL. :/
>
> To be honest, I'm not entirely sure. I did look into this a while ago,
> but the details are lost in the mists of time. :-)
>
> > I like my X2100, but I'd have preferred that Sun had emphasised that
> > the "hotswappable" SATA bays are only such if running Windows AND in
> > NVRAID configuration. Is this limitation common to IBM, HP and Dell
> > small server lines?
>
> No idea, as I don't even consider models from other vendors.

Well, I'd only consider models certified for Solaris 10. :) But online
disk replacement is important to our application, so in hindsight we
should perhaps have reviewed what the other entry-level brands can do.
Not that this would be easy -- such simple answers are often rather
obscure. :/

>
> > Out of interest, what is the minimum spec Sun box with full support for
> > online drive replacement?
>
> I *think* at the moment that would be the X4100.

According to
http://www.sun.com/products-n-solutions/hardware/docs/html/819-1157-15/x4100maint.html#pgfId-1002165
the SAS disks are hotswappable in the context of LSISAS1064 controller
RAID, "a highly integrated, low-cost RAID solution. It is designed for
systems requiring redundancy and high availability, but not requiring a
full-featured RAID implementation."

My question, not immediately answerable in these manuals, is what can
the "full-featured" SVM and ZFS mirror users do? Is offlining mirrors
and pulling the disk going to work on the X4100?

At least the X4100 has enough bays for a hot spare configuration, and
presumably *adding* a disk in a previously empty slot is OK.

Frank Cusack

Sep 21, 2006, 6:00:38 PM

On 21 Sep 2006 12:21:48 -0700 "toby" <to...@telegraphics.com.au> wrote:
> According to
> http://www.sun.com/products-n-solutions/hardware/docs/html/819-1157-15/x4100maint.html#pgfId-1002165
> the SAS disks are hotswappable in the context of LSISAS1064 controller
> RAID, "a highly integrated, low-cost RAID solution. It is designed for
> systems requiring redundancy and high availability, but not requiring a
> full-featured RAID implementation."
>
> My question, not immediately answerable in these manuals, is what can
> the "full-featured" SVM and ZFS mirror users do? Is offlining mirrors
> and pulling the disk going to work on the X4100?

Yes. The "full-featured RAID" refers to the HW RAID.

> At least the X4100 has enough bays for a hot spare configuration, and
> presumably *adding* a disk in a previously empty slot is OK.

Yes.

You could just wait for a Solaris update which works for the x2100.

-frank

toby

Sep 21, 2006, 6:16:35 PM

Frank Cusack wrote:
> On 21 Sep 2006 12:21:48 -0700 "toby" <to...@telegraphics.com.au> wrote:
> > According to
> > http://www.sun.com/products-n-solutions/hardware/docs/html/819-1157-15/x4100maint.html#pgfId-1002165
> > the SAS disks are hotswappable in the context of LSISAS1064 controller
> > RAID, "a highly integrated, low-cost RAID solution. It is designed for
> > systems requiring redundancy and high availability, but not requiring a
> > full-featured RAID implementation."
> >
> > My question, not immediately answerable in these manuals, is what can
> > the "full-featured" SVM and ZFS mirror users do? Is offlining mirrors
> > and pulling the disk going to work on the X4100?
>
> Yes. The "full-featured RAID" refers to the HW RAID.

My understanding from the quote above is exactly the opposite.

>
> > At least the X4100 has enough bays for a hot spare configuration, and
> > presumably *adding* a disk in a previously empty slot is OK.
>
> Yes.
>
> You could just wait for a Solaris update which works for the x2100.

As detailed above, I do not know whether such an update would support
only the "integrated" NVRAID - as currently supported by Windows only -
or whether one will be able to replace a disk configured with only SVM
and/or ZFS mirrors. I strongly suspect, if/when this update arrives,
that only the hw RAID will allow replacement without reboot on the
X2100. If anyone has better information, please provide.

>
> -frank

Frank Cusack

Sep 21, 2006, 9:06:46 PM

On 21 Sep 2006 15:16:35 -0700 "toby" <to...@telegraphics.com.au> wrote:
> Frank Cusack wrote:
>> On 21 Sep 2006 12:21:48 -0700 "toby" <to...@telegraphics.com.au> wrote:
>> > According to
>> > http://www.sun.com/products-n-solutions/hardware/docs/html/819-1157-15/x4100maint.html#pgfId-1002165
>> > the SAS disks are hotswappable in the context of LSISAS1064 controller
>> > RAID, "a highly integrated, low-cost RAID solution. It is designed for
>> > systems requiring redundancy and high availability, but not requiring a
>> > full-featured RAID implementation."
>> >
>> > My question, not immediately answerable in these manuals, is what can
>> > the "full-featured" SVM and ZFS mirror users do? Is offlining mirrors
>> > and pulling the disk going to work on the X4100?
>>
>> Yes. The "full-featured RAID" refers to the HW RAID.
>
> My understanding from the quote above is exactly the opposite.

I can't find that quote in the URL above. Looking at
<http://docs.sun.com/source/819-1157-15/app-biosraid.html>, E.4.1,
it is QUITE clear they are talking about the HW RAID.

I can assure you that drives on the x4100 are hot swappable under
software RAID.

>> > At least the X4100 has enough bays for a hot spare configuration, and
>> > presumably *adding* a disk in a previously empty slot is OK.
>>
>> Yes.
>>
>> You could just wait for a Solaris update which works for the x2100.
>
> As detailed above, I do not know whether such an update would support
> only the "integrated" NVRAID - as currently supported by Windows only -
> or whether one will be able to replace a disk configured with only SVM
> and/or ZFS mirrors. I strongly suspect, if/when this update arrives,
> that only the hw RAID will allow replacement without reboot on the
> X2100. If anyone has better information, please provide.

All the info I've been reading strongly indicates the hot swap support
has nothing to do with the hardware RAID functionality. There's not
some single piece of info I can quote you though, it's info that comes
in fits and spurts on Usenet. I can't imagine it would be hw-raid
though. The reason hot swap doesn't work on the x2100 is that the
SATA controller is not supported as a first-class SATA device, yet.

-frank

toby

Sep 21, 2006, 9:48:53 PM

Frank Cusack wrote:
> On 21 Sep 2006 15:16:35 -0700 "toby" <to...@telegraphics.com.au> wrote:
> > Frank Cusack wrote:
> >> On 21 Sep 2006 12:21:48 -0700 "toby" <to...@telegraphics.com.au> wrote:
> >> > According to
> >> > http://www.sun.com/products-n-solutions/hardware/docs/html/819-1157-15/x4100maint.html#pgfId-1002165
> >> > The SAS disks are hotswappable in the context of LSISAS1064 controller
> >> > RAID, "a highly integrated, low-cost RAID solution. It is designed for
> >> > systems requiring redundancy and high availability, but not requiring a
> >> > full-featured RAID implementation."
> >> >
> >> > My question, not immediately answerable in these manuals, is what can
> >> > the "full-featured" SVM and ZFS mirror users do? Is offlining mirrors
> >> > and pulling the disk going to work on the X4100?
> >>
> >> Yes. The "full-featured RAID" refers to the HW RAID.
> >
> > My understanding from the quote above is exactly the opposite.
>
> I can't find that quote in the URL above. Looking at
> <http://docs.sun.com/source/819-1157-15/app-biosraid.html>, E.4.1,
> it is QUITE clear they are talking about the HW RAID.

Yes, my quote is also discussing the hw RAID, and says: "designed for
systems ... NOT requiring a full-featured RAID implementation"
(emphasis mine).

Unless the management software, if any - which I have not used, and am
not likely to use, since the subsystem is not supported by Solaris - is
as sophisticated as SVM or ZFS, I'm not sure how the NVRAID can be
described as "full-featured". For instance, does it do hot sparing?

>
> I can assure you that drives on the x4100 are hot swappable under
> software RAID.

That is good to know.

>
> >> > At least the X4100 has enough bays for a hot spare configuration, and
> >> > presumably *adding* a disk in a previously empty slot is OK.
> >>
> >> Yes.
> >>
> >> You could just wait for a Solaris update which works for the x2100.
> >
> > As detailed above, I do not know whether such an update would support
> > only the "integrated" NVRAID - as currently supported by Windows only -
> > or whether one will be able to replace a disk configured with only SVM
> > and/or ZFS mirrors. I strongly suspect, if/when this update arrives,
> > that only the hw RAID will allow replacement without reboot on the
> > X2100. If anyone has better information, please provide.
>
> All the info I've been reading strongly indicates the hot swap support
> has nothing to do with the hardware RAID functionality.

Sure, it may be nothing more than coincidence that the only working
hotswap config involves hw RAID at this time.

> There's not
> some single piece of info I can quote you though, it's info that comes
> in fits and spurts on Usenet. I can't imagine it would be hw-raid
> though. The reason hot swap doesn't work on the x2100 is that the
> SATA controller is not supported as a first-class SATA device. yet.

I'm sure X2100 owners will be very happy if so.

>
> -frank

Frank Cusack

Sep 22, 2006, 1:49:18 AM

On 21 Sep 2006 18:48:53 -0700 "toby" <to...@telegraphics.com.au> wrote:
> Frank Cusack wrote:
>> On 21 Sep 2006 15:16:35 -0700 "toby" <to...@telegraphics.com.au> wrote:
>> > Frank Cusack wrote:
>> >> On 21 Sep 2006 12:21:48 -0700 "toby" <to...@telegraphics.com.au> wrote:
>> >> > According to
>> >> > http://www.sun.com/products-n-solutions/hardware/docs/html/819-1157-15/x4100maint.html#pgfId-1002165
>> >> > The SAS disks are hotswappable in the context of LSISAS1064 controller
>> >> > RAID, "a highly integrated, low-cost RAID solution. It is designed for
>> >> > systems requiring redundancy and high availability, but not requiring a
>> >> > full-featured RAID implementation."
>> >> >
>> >> > My question, not immediately answerable in these manuals, is what can
>> >> > the "full-featured" SVM and ZFS mirror users do? Is offlining mirrors
>> >> > and pulling the disk going to work on the X4100?
>> >>
>> >> Yes. The "full-featured RAID" refers to the HW RAID.
>> >
>> > My understanding from the quote above is exactly the opposite.
>>
>> I can't find that quote in the URL above. Looking at
>> <http://docs.sun.com/source/819-1157-15/app-biosraid.html>, E.4.1,
>> it is QUITE clear they are talking about the HW RAID.
>
> Yes, my quote is also discussing the hw RAID, and says: "designed for
> systems ... NOT requiring a full-featured RAID implementation"
> (emphasis mine).

OK, we violently agree then. :-) When I quoted "full-featured RAID" I
probably should have said "not a full-featured RAID". And what they
mean is "not a full-featured *hardware* RAID".

> Unless the management software, if any - which I have not used, and am
> not likely to use, since the subsystem is not supported by Solaris - is
> as sophisticated as SVM or ZFS, I'm not sure how the NVRAID can be
> described as "full-featured". For instance, does it do hot sparing?

Well, NVRAID is not described as full-featured. I think we agree again,
you're just arguing about my misquote.

NVRAID is not full-featured because it is not really a RAID. It is
like a winmodem. It does some drive mgmt but you need a software
driver to make it go. (So now it should make sense that it only works
under Windows.)

But you can skip the HW RAID features (such as they are) entirely, and
then you have whatever features your software RAID has. In Solaris'
case that means no hot plug, because it uses the generic ATA driver.

>> All the info I've been reading strongly indicates the hot swap support
>> has nothing to do with the hardware RAID functionality.
>
> Sure, it may be nothing more than coincidence that the only working
> hotswap config involves hw RAID at this time.

I'd bet that hot swap works under Windows even without RAID.

-frank

Daniel Rock

Sep 22, 2006, 4:31:58 AM

In comp.sys.sun.hardware toby <to...@telegraphics.com.au> wrote:
> I expect this is support for NVRAID only... those of us using SVM and
> ZFS are SOL. :/

I don't think so. NVRAID is not hardware RAID. It is just software RAID with
some boot support from the BIOS. All the logic is in the device driver.
The device driver is the same (also in Windows), regardless of whether you
enable or disable NVRAID in the BIOS. I'd expect the Solaris driver to behave
similarly.

--
Daniel

toby

Sep 22, 2006, 11:55:47 AM

Daniel Rock wrote:
> In comp.sys.sun.hardware toby <to...@telegraphics.com.au> wrote:
> > I expect this is support for NVRAID only... those of us using SVM and
> > ZFS are SOL. :/
>
> I don't think so. NVRAID is no hardware RAID. It is just software RAID with
> some boot support from the BIOS.

I see. This "support" includes things like probing as one drive and not
two? (Just speculating here, I haven't used NVRAID.)

> All the logic is in the device driver.
> The device driver is the same (also in Windows), regardless if you enable
> or disable NVRAID in BIOS.

Wouldn't that also be the case if the RAID functionality is in
hardware, too?

> I'd expect to behave the Solaris driver similar.

The documentation is very skimpy on this whole issue. The only definite
statement it makes, quoted above, is that you may use NVRAID with
Windows only, and you may then replace drives online. I made the
following inferences:
- the "integrated" RAID is not supported under Solaris (I don't care, I
want SVM/ZFS)
- online replacement is not supported by Solaris' SATA drivers (as
Frank says, and tested by experiment)

>
> --
> Daniel

Daniel Rock

Sep 22, 2006, 4:37:36 PM

In comp.sys.sun.hardware toby <to...@telegraphics.com.au> wrote:
> Daniel Rock wrote:
>> In comp.sys.sun.hardware toby <to...@telegraphics.com.au> wrote:
>> > I expect this is support for NVRAID only... those of us using SVM and
>> > ZFS are SOL. :/
>>
>> I don't think so. NVRAID is no hardware RAID. It is just software RAID with
>> some boot support from the BIOS.
>
> I see. This "support" includes things like probing as one drive and not
> two? (Just speculating here, I haven't used NVRAID.)

Not directly. By support I mean being able to boot from a degraded mirror or
RAID-5. The BIOS only registers one logical drive (for int 0x13 calls) and
will handle the redirection to the right drive or recalculation of parity
in the case of a failed RAID-5 drive.
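
As a rough illustration of the parity recalculation mentioned above, here is a generic single-parity (RAID-5 style) XOR sketch in Python. It is not NVIDIA's BIOS code, the function names are invented for this example, and it ignores striping layout entirely: it only shows that a missing block equals the XOR of the surviving blocks in the same stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing_block(surviving_blocks):
    """Recover the one missing block of a stripe from the survivors.

    The survivors are the remaining data blocks plus the parity block.
    """
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    data0 = b"\x01\x02\x03"
    data1 = b"\x10\x20\x30"
    parity = xor_blocks([data0, data1])
    # Pretend the drive holding data1 has failed.
    print(rebuild_missing_block([data0, parity]) == data1)
```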


>> All the logic is in the device driver.
>> The device driver is the same (also in Windows), regardless if you enable
>> or disable NVRAID in BIOS.
>
> Wouldn't that also be the case if the RAID functionality is in
> hardware, too?

No. With real hardware RAID the OS only sees the logical drive and therefore
just has to "talk" to one drive. Creating/modifying/deleting RAID devices
is done by the chipset in the hardware RAID and not by the OS.

Hardware RAID detects fault conditions in drives by itself and notifies the
OS. Rebuilding is just initiated by the OS but performed in hardware. etc.
etc. etc.

NVRAID is very primitive: the configuration (RAID-0/1/5; state) is stored in
the second-to-last sector of each configured disk. The OS builds pseudo
disks out of this configuration.
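
To make the on-disk location concrete, the small Python sketch below computes where the second-to-last sector sits on a disk of a given size. The 512-byte sector size is an assumption (the description above does not state one), and the function name is made up for this example.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes; not stated in the thread


def second_to_last_sector(disk_size_bytes, sector_size=SECTOR_SIZE):
    """Return (sector_number, byte_offset) of the second-to-last sector.

    Sectors are numbered from 0, so the last sector is total - 1 and the
    second-to-last is total - 2.
    """
    total_sectors = disk_size_bytes // sector_size
    if total_sectors < 2:
        raise ValueError("disk too small to have a second-to-last sector")
    sector_number = total_sectors - 2
    return sector_number, sector_number * sector_size


if __name__ == "__main__":
    # Hypothetical disk of exactly 1000 sectors.
    print(second_to_last_sector(1000 * SECTOR_SIZE))
```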

You can get an impression how primitive NVRAID really is by looking at the
FreeBSD sources:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/ata-raid.c?rev=HEAD

Search for functions containing the string "_nvidia_"


>> I'd expect to behave the Solaris driver similar.
>
> The documentation is very skimpy on this whole issue. The only definite
> statement it makes, quoted above, is that you may use NVRAID with
> Windows only, and you may then replace drives online. I made the
> following inferences:
> - the "integrated" RAID is not supported under Solaris (I don't care, I
> want SVM/ZFS)
> - online replacement is not supported by Solaris' SATA drivers (as
> Frank says, and tested by experiment)

Solaris doesn't provide a SATA driver for NForce chipsets yet. Currently
the drives are driven by the old (P)ATA driver. If SATA drivers are
released (I don't know when), they will most definitely support hot plugging.

--
Daniel

toby

Sep 22, 2006, 5:02:46 PM

Daniel Rock wrote:
> In comp.sys.sun.hardware toby <to...@telegraphics.com.au> wrote:
> > ...

> > The documentation is very skimpy on this whole issue. The only definite
> > statement it makes, quoted above, is that you may use NVRAID with
> > Windows only, and you may then replace drives online. I made the
> > following inferences:
> > - the "integrated" RAID is not supported under Solaris (I don't care, I
> > want SVM/ZFS)
> > - online replacement is not supported by Solaris' SATA drivers (as
> > Frank says, and tested by experiment)
>
> Solaris doesn't provide a SATA driver for NForce chipsets yet. Currently
> the drives are driven by the old (P)ATA driver. If SATA drivers will be
> released (don't know when) they will most definitely support hot plugging.

Thanks for all the background, Daniel. I guess we'll wait and hope. :)

>
> --
> Daniel
