
MegaRaid won't do passthrough


Ian Prideaux
Oct 15, 2015, 12:00:42 PM
Hi All,

I've got an X4270 which I'm going to put Solaris 11u2 on. It's got an LSI
MegaRAID 9261-8i RAID controller. I've got seven 1TB disks. I've read
that the best thing to do with ZFS & RAID controllers is to put the RAID
controller in passthrough/JBOD mode, and let ZFS do everything. However,
this RAID controller won't do passthrough. It can only create seven
RAID0 arrays, but if I do that, then I lose the ability to hot-swap
disks; I'd need to reboot after a disk change.

My original plan was a mirrored pair for root, a mirrored pair for
/var/log, a mirrored pair for /var/spool, and one hotspare.

Which is the best scenario?

1. As my original plan, with separate physical (pairs of) spindles for
root, log & spool.

2. Get ZFS to raid5 the three (virtual) disks. I know that I'll end up
with only 2TB of usable space, but will this be any faster or slower?
Will I get better data integrity?

3. It's possible that I can get another drive; would I be better off
with four (mirrored pairs of) disks in a ZFS raid5, and no RAID
controller hotspares?

4. Some other layout that I've not thought of yet?

Thanks.

John D Groenveld
Oct 15, 2015, 12:07:00 PM
In article <mvoiiv$vg4$1...@speranza.aioe.org>,
Ian Prideaux <i...@reversegatesnet.co.uk> wrote:
>I've got a X4270 which I'm going to put Solaris11u2 on. It's got a LSI
>Megaraid 9261-8i raid controller. I've got seven 1TB disks. I've read

Swap the RAID controller for a SAS HBA.

John
groe...@acm.org

Ian Prideaux
Oct 15, 2015, 12:42:08 PM
Nice idea, but we've got no money, and anyway, buying things takes
months :-(

John D Groenveld
Oct 15, 2015, 1:35:06 PM
In article <mvol0n$5re$1...@speranza.aioe.org>,
Ian Prideaux <i...@reversegatesnet.co.uk> wrote:
>Nice idea, but we've got no money, and anyway, buying things takes
>months :-(

BTW the Oracle SAS 3Gb and SAS 6Gb HBA parts:
SG-XPCIE8SAS-I-Z
SGX-SAS6-INT-Z

John
groe...@acm.org

Chris Ridd
Oct 15, 2015, 2:32:22 PM
On 2015-10-15 16:00:33 +0000, Ian Prideaux said:

> Hi All,
>
> I've got a X4270 which I'm going to put Solaris11u2 on. It's got a LSI
> Megaraid 9261-8i raid controller. I've got seven 1TB disks. I've read
> that the best thing to do with ZFS & raid controllers is to put the raid
> controller in passthrough/JBOD mode, and let ZFS do everything. However,
> this raid controller won't do passthrough. It can only create seven
> raid0 arrays, but if I do that, then I lose the ability to hot-swap
> disks, I'd need to reboot after a disk change.

Are there any firmware upgrades for the card which will let you use it
in JBOD mode?

--
Chris

Ian Collins
Oct 15, 2015, 4:24:45 PM
Ian Prideaux wrote:
> Hi All,
>
> I've got a X4270 which I'm going to put Solaris11u2 on. It's got a LSI
> Megaraid 9261-8i raid controller. I've got seven 1TB disks. I've read
> that the best thing to do with ZFS & raid controllers is to put the raid
> controller in passthrough/JBOD mode, and let ZFS do everything. However,
> this raid controller won't do passthrough. It can only create seven
> raid0 arrays, but if I do that, then I lose the ability to hot-swap
> disks, I'd need to reboot after a disk change.
>
> My original plan was a mirrored pair for root, a mirrored pair for
> /var/log, a mirrored pair for /var/spool, and one hotspare.
>
> Which is the best scenario?

You're out of luck: the firmware in that card lies when it says it
supports JBOD. I've been down that bumpy road. You may be able to
re-flash the card; I didn't have the time budget to fully investigate
that option.

Just do the sucky RAID0 option and live with it...

As for layout, use a ZFS mirror for the root pool (don't bother with
multiple pools) and use the remaining drives for a "tank" pool you can
use for everything else.
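For the root pool half of that suggestion, Solaris's installer normally creates rpool on one disk, so the mirror is formed by attaching the second volume; a minimal sketch, where the device names are placeholders for whatever the single-disk RAID0 volumes enumerate as:

```shell
# Attach a second device to the existing root pool to form a mirror.
# c0t0d0s0 / c0t1d0s0 are hypothetical names -- check with `format`
# or `zpool status rpool` on the actual box.
zpool attach rpool c0t0d0s0 c0t1d0s0

# Watch the resilver complete, and make the new half bootable too.
zpool status rpool
bootadm install-bootloader
```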

> 1. As my original plan, with separate physical (pairs of) spindles for
> root, log & spool.

Unless you are likely to fill them, keep log & spool in the root pool.

> 2. Get ZFS to raid5 the three (virtual) disks, I know that I'll end up
> with only 2TB usable space, will this be any faster or slower? Will I
> get better data integrity?

How you build the second pool depends on what you intend to do with it.
If you want performance, use a stripe of mirrors. If you want
capacity, use raidz or raidz2.
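As a sketch of those two geometries across four of the RAID0 volumes (device names again placeholders, and the two commands are alternatives, not a sequence):

```shell
# Performance: a stripe of two mirrors -- two disks' worth of space,
# survives one failure in each mirror.
zpool create tank mirror c0t2d0 c0t3d0 mirror c0t4d0 c0t5d0

# Capacity: raidz gives three disks' worth with single-disk fault
# tolerance; swap "raidz" for "raidz2" to survive any two failures
# at the cost of one more disk of parity.
zpool create tank raidz c0t2d0 c0t3d0 c0t4d0 c0t5d0
```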

> 3. It's possible that I can get another drive, would I be better to have
> four (mirrored pairs of) disks in a ZFS raid5, and no raid controller
> hotspares?

If you have quick and easy access to the box, I wouldn't bother with hot
spares.

--
Ian Collins

Cydrome Leader
Oct 16, 2015, 3:53:15 PM
Ian Collins <ian-...@hotmail.com> wrote:
> Ian Prideaux wrote:
>> Hi All,
>>
>> I've got a X4270 which I'm going to put Solaris11u2 on. It's got a LSI
>> Megaraid 9261-8i raid controller. I've got seven 1TB disks. I've read
>> that the best thing to do with ZFS & raid controllers is to put the raid
>> controller in passthrough/JBOD mode, and let ZFS do everything. However,
>> this raid controller won't do passthrough. It can only create seven
>> raid0 arrays, but if I do that, then I lose the ability to hot-swap
>> disks, I'd need to reboot after a disk change.
>>
>> My original plan was a mirrored pair for root, a mirrored pair for
>> /var/log, a mirrored pair for /var/spool, and one hotspare.
>>
>> Which is the best scenario?
>
> You're out of luck, the firmware in that card lies when it says it
> supports JBOD. I've been down that bumpy road. You may be able to
> re-flash the card, I didn't have the time budget to fully investigate
> that option.

LSI/MegaRAID junk is hands down the worst controller family ever made.

> Just do the sucky RAID0 option and live with it...

Well, it won't explode if you hot-swap a RAID0 drive. The question is
whether it will let you delete the "dead" volume and create a new one to
replace it without rebooting 15 times. If you think about it, hot-swapping
RAID0 doesn't really make sense in the first place. It's always good to
experiment with and document the disk failure/recovery procedures on RAID
controllers. They sometimes have unexpected behaviour, and operator error
can trash an array that should have been just a simple disk swap.
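On the ZFS side, such a rehearsal might look roughly like this (pool and device names are placeholders; the controller-side steps depend on what the card actually allows):

```shell
# Rehearse a disk failure before it happens for real.
zpool offline tank c0t3d0        # simulate losing the disk
zpool status tank                # pool should now show DEGRADED
# ...swap the physical drive, and delete/recreate the RAID0 volume on
# the controller if that turns out to be required...
zpool replace tank c0t3d0        # resilver onto the replacement
zpool status -x                  # confirm everything is healthy again
```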

Ian Prideaux
Oct 18, 2015, 4:34:43 AM
I asked that on their contact form. The useless sods abdicated:

> Thank You for contacting LSI-Avago Support
>
> Please accept my apologies, but your information describes an OEM
> Chip / Controller, purchased from SUN. Unfortunately, we are unable
> to provide direct support for OEM products. Please engage SUN for
> support on this issue. Your Vendor is using a LSI Chip / Controller,
> but the functionality is set to their System. We have no information
> on how the Chip / Controller works on their System. That is why we
> cannot provide any Support.
>
1. They don't seem to know that Sun doesn't exist any more; it's now Oracle.
2. "functionality is set to their System" is gibberish.

:-(



Ian Prideaux
Oct 18, 2015, 4:48:15 AM
I do need hotspares because the machines live in datacentres anything up
to 30 miles away. The last time a machine failed, it took a week to get
the paperwork done and to go through two layers of change management
before they'd allow me access to the building and the data floor :-(

ISTM that the smallest chunk I can use is a mirrored pair. I'm now
toying with the idea of 4 disks in a raid6 for /, a mirrored pair for
/var/spool/MyApplication, and one hotspare. Or possibly just raid6 all
six disks, plus one hotspare. If I have two separate vdisks, can I use
each to improve the reliability and/or speed of the other, or will ZFS
do something like that automatically? Two separate vdisks will give me
3 disks' worth of space, while one big raid6 vdisk will give me 4 disks'
worth, although I'm not short of space; the machines are currently
running with two disks.


Ian Collins
Oct 18, 2015, 4:58:42 AM
Ian Prideaux wrote:
>>
> I do need hotspares because the machines live in datacentres, anything
> up to 30 miles away. The last time that a machine failed, it took them a
> week to get the paperwork done, to go through two layers of change
> management to get them to allow me access to the building and the data
> floor :-(
>
> ISTM that the smallest chunk that I can use is a mirrored pair.

Why?

Use single drive RAID0s and let ZFS take care of the redundancy and hot
spare.
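With MegaCLI, creating the per-disk RAID0 volumes might look roughly like this. The enclosure ID (252 here) and slot numbers are assumptions; read the real addresses off -PDList first:

```shell
# Find the enclosure:slot address of each physical drive.
MegaCli64 -PDList -a0 | egrep 'Enclosure Device ID|Slot Number'

# One single-drive RAID0 per disk. WT (write-through), NORA (no
# read-ahead) and Direct keep the controller's caching out of ZFS's
# way as much as the card allows.
for slot in 0 1 2 3 4 5 6; do
  MegaCli64 -CfgLdAdd -r0 [252:$slot] WT NORA Direct -a0
done
```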

> I'm now
> toying with the idea of 4 disks in a raid6 for /, and a mirrored pair
> for /var/spool/MyApplication, and one hotspare. Or possibly just raid6
> all six disks, and one hotspare.

I haven't used legacy RAID for so long I can barely remember what RAID6
is...

I'm pretty sure Solaris 11.2 can't boot off a stripe, so my original
suggestion stands: a mirrored root pool and four drives for an
everything-else pool, plus the hot spare.

--
Ian Collins

John D Groenveld
Oct 18, 2015, 4:15:36 PM
In article <mvvlip$713$1...@speranza.aioe.org>,
Ian Prideaux <i...@reversegatesnet.co.uk> wrote:
>I asked that on their contact form. The useless sods abdicated:

Does LSI's MegaCLI recognize the RAID controller?
<URL:http://www.avagotech.com/support/download-search>
If so, you might be able to flash it with LSI's firmware.
But you might also end up with an unsupported doorstop.

Good luck,
John
groe...@acm.org

Ian Collins
Oct 18, 2015, 4:34:11 PM
John D Groenveld wrote:
> In article <mvvlip$713$1...@speranza.aioe.org>,
> Ian Prideaux <i...@reversegatesnet.co.uk> wrote:
>> I asked that on their contact form. The useless sods abdicated:
>
> Does LSI MegaCLI for recognize the RAID controller?

It does, that's what I use to manage these annoying systems.

--
Ian Collins

Cydrome Leader
Oct 19, 2015, 6:34:21 PM
Hot spares are not the same as hot swap. If you try the single-disk RAID0
route with ZFS, that's great, but you may not be able to physically
replace a failed disk without a reboot and a reconfig of the array itself,
even if you've remirrored to another spare disk acting as a "hotspare".

Electrically, your controller and chassis almost certainly support
hot-plugging drives, unless they're from 100 years ago. Any "no hotswap
under X configuration" limitation is down to garbage controller design.

> ISTM that the smallest chunk that I can use is a mirrored pair. I'm now
> toying with the idea of 4 disks in a raid6 for /, and a mirrored pair

Performance of RAID6 is in general poor, and a 4-disk RAID6 really doesn't
make sense anyway. Your best bet for 4 drives is RAID 1+0: each set of 2
drives mirrored, and the mirrors striped. You can lose 1 drive with no
problem, and may be able to lose a second drive part of the time. Real
controllers can do this properly; I would not trust LSI on this, though.


> for /var/spool/MyApplication, and one hotspare. Or possibly just raid6
> all six disks, and one hotspare. If I have two separate vdisks, can I
> use each to, or will zfs automatically do something to, improve the
> reliability and/or speed of the other? Two separate vdisks will give me
> 3 disks worth of space, one big raid6 vdisk will give me 4 disks worth,
> although I'm not short of space, the machines are currently running with
> two disks.

ZFS can't make raid6, or raidz, or whatever Oracle tries to call it,
faster than it is. You may not need much write speed; I don't know your
application or needs, so I can't help with that. You may need to
experiment to see what does or doesn't work.


Andrew Gabriel
Oct 20, 2015, 2:18:20 PM
In article <n03r56$fml$1...@reader1.panix.com>,
Cydrome Leader <pres...@MUNGEpanix.com> writes:
> Performance of RAID6 is in general poor, and a 4 disk RAID6 really doesn't
> make sense anyways. Your best bet for 4 drives is raid 1+0 with each set
> of 2 drives mirrored, and the mirrors striped. You can lose 1 drive with
> no problem, and may be able to lose a second drive part of the time. Real
> controllers can do this properly, I would not trust LSI on this though.

4-disk RAIDZ2 may look pointless at first glance, but a couple of points:
for the use cases where RAIDZ2 is poor, it's at the best end of the scale,
and its mean time to data loss is over 3 orders of magnitude better than
a 4-disk RAID10's. That means there are situations where you can deploy
it where the alternative RAID10 configuration fails to meet the
customer's SLA.

> ZFS can't make raid6 or raidz or whatever oracle tries to call it faster
> than what it is. You may not need any write speed, I don't know your
> application or needs, so I can't help with that. You may need to
> experiment to see what does or doesn't work.

RAIDZ1/2 has significant differences from RAID5/6. In particular RAIDZ
doesn't have a fixed stripe size - it makes the stripe size match the
data blocksize it's writing, so it never needs to do the stripe-wide
read-modify-write cycles you see in RAID5/6.

--
Andrew Gabriel
[email address is not usable -- followup in the newsgroup]

Cydrome Leader
Oct 22, 2015, 11:45:03 AM
Andrew Gabriel <and...@cucumber.demon.co.uk> wrote:
> In article <n03r56$fml$1...@reader1.panix.com>,
> Cydrome Leader <pres...@MUNGEpanix.com> writes:
>> Performance of RAID6 is in general poor, and a 4 disk RAID6 really doesn't
>> make sense anyways. Your best bet for 4 drives is raid 1+0 with each set
>> of 2 drives mirrored, and the mirrors striped. You can lose 1 drive with
>> no problem, and may be able to lose a second drive part of the time. Real
>> controllers can do this properly, I would not trust LSI on this though.
>
> 4-disk RAIDZ2 may look pointless at a first glance, but a couple of points:
> For the use cases where RAIDZ2 is poor, it's at the best end of the scale.
> Its mean time to data loss is over 3 orders of magnitude better than a
> 4-disk RAID10, which means there are some situations where you can deploy
> it, where the alternative RAID10 configuration fails to meet the customer's
> SLA.

I can't say I've ever seen an SLA that ruled out RAID 1+0 as not reliable
enough. Has this scenario really happened for you?

>> ZFS can't make raid6 or raidz or whatever oracle tries to call it faster
>> than what it is. You may not need any write speed, I don't know your
>> application or needs, so I can't help with that. You may need to
>> experiment to see what does or doesn't work.
>
> RAIDZ1/2 has significant differences from RAID5/6. In particular RAIDZ
> doesn't have a fixed stripe size - it makes the stripe size match the
> data blocksize it's writing, so it never needs to do the stripe-wide
> read-modify-write cycles you see in RAID5/6.

I guess this is true. It would not be true if the OP enabled some sort
of LSI/MegaRAID RAID 5 or 6 at the controller level. Over here, we only
use RAID 6 if the access time has to be faster than recalling tape from a
vault and restoring. In that respect, it's blazing fast, and as amazing as
the sales people might make you think it is.

Ian Prideaux
Oct 22, 2015, 12:26:48 PM
I did a quick test today. I created a file:

dd if=/dev/urandom of=file1 count=$((1<<20)); sync
cp file1 file2; sync

and I did it on a MegaRAID raid6 array and on a MegaRAID raid1 array.

dd'ing the file took approximately the same time on both arrays.

Copying the file took about twice as long on the mirror as it did on the
raid6, i.e. the raid6 was twice the speed of the mirror. I was expecting
the mirror to be faster, but no.


Cydrome Leader
Oct 22, 2015, 1:08:02 PM
You may need to try this with more than a dozen files to get some real
numbers. Who knows what was being cached where. sync doesn't even
guarantee a real sync to disk these days anyway. How fast is your
/dev/urandom?

Ian Prideaux
Oct 22, 2015, 1:31:17 PM
/dev/urandom was much slower than either of the file copies, and the
same speed on both arrays, so I'm guessing that it was limited by
urandom, not the disk arrays.

I'll try some bigger tests, with more files, tomorrow.
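For what it's worth, generating the random data once and timing only the copies keeps /dev/urandom out of the measurement entirely; a rough sketch (the target directory and sizes are assumptions, and you'd point it at each array in turn):

```shell
#!/bin/sh
# Pre-generate one random file, then time synced copies of it, so the
# copy phase measures the array rather than /dev/urandom.
DIR="${1:-${TMPDIR:-/tmp}/arraytest}"   # directory on the array under test
SIZE_MB=64
COPIES=8
mkdir -p "$DIR"
# Generate the source data once, outside the timed section.
dd if=/dev/urandom of="$DIR/seed" bs=1024k count="$SIZE_MB" 2>/dev/null
sync
start=$(date +%s)
i=1
while [ "$i" -le "$COPIES" ]; do
  cp "$DIR/seed" "$DIR/copy$i"
  i=$((i + 1))
done
sync   # make the writes actually hit the disks before we stop the clock
end=$(date +%s)
echo "copied $COPIES x ${SIZE_MB}MB in $((end - start))s"
```

Running it once per array and comparing the reported times avoids comparing one cached run against one uncached one.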

Ian Collins
Oct 22, 2015, 1:54:50 PM
Ian Prideaux wrote:

> I did a quick test today, created a file:
> dd if=/dev/urandom of=file1 count=$((1<<20));sync
> cp file1 file2;sync
>
> and I did it on a megaraid raid6 array and a megaraid raid1 array.

Don't use hardware raid, leave that job to ZFS.

--
Ian Collins

invalid
Oct 22, 2015, 2:50:50 PM
We've said that repeatedly but he has ignored us. It seems like a waste of
time to read his posts so he went into my killfile.


Ian Collins
Oct 23, 2015, 5:25:15 PM
You have a point! It's incredible how people can ask for help on a
forum frequented by people with decades of experience and then
completely ignore the advice offered.

--
Ian Collins

John D Groenveld
Oct 23, 2015, 6:04:00 PM
In article <d8vmtl...@mid.individual.net>,
Ian Collins <ian-...@hotmail.com> wrote:
>You have a point! It's incredible how people can ask for help on a
>forum frequented by people with decades of experience and then
>completely ignore the advice offered.

The OP didn't ignore the advice; he wrote back that he couldn't
afford to purchase an HBA.
Message-ID: <mvol0n$5re$1...@speranza.aioe.org>
<URL:http://al.howardknight.net/msgid.cgi?STYPE=msgid&A=0&MSGI=%3Cmvol0n$5re$1...@speranza.aioe.org%3E>

A quick glance at eBay suggests he can't afford not to.
<URL:http://www.ebay.com/sch/i.html?_odkw=1068&_osacat=0&_from=R40&_trksid=p2045573.m570.l1313.TR0.TRC0.H0.X1068+lsi.TRS0&_nkw=1068+lsi&_sacat=0>

John
groe...@acm.org

Ian Collins
Oct 23, 2015, 8:03:21 PM
John D Groenveld wrote:
> In article <d8vmtl...@mid.individual.net>,
> Ian Collins <ian-...@hotmail.com> wrote:
>> You have a point! It's incredible how people can ask for help on a
>> forum frequented be people with decades of experience and then
>> completely ignore the advice offered.
>
> The OP didn't ignore the advice, he wrote back that he couldn't
> afford to purchase a HBA.

He wouldn't have to. I have a number of Sun/Oracle boxes with those
pesky cards (and even more Dells with the same controller), and they are
all configured with multiple single-drive RAID0s passed through to ZFS.

--
Ian Collins

Doug McIntyre
Oct 24, 2015, 5:38:06 PM

Ian Collins
Oct 24, 2015, 5:42:48 PM
Indeed I did...

--
Ian Collins