Hi all,

I may be getting a new Dell PE1850 soon, to replace our ancient CVS server
(still running 4-STABLE). The new machine will ideally run 6.0 and have a
PERC4e/DC RAID card - the one with battery-backed cache. This is listed as
supported by amr(4), but I'm wondering how well it actually works in the
case of a disk failure. Will the driver tell me that a disk has failed (a
syslog message would be enough) or will I have to make a daily trip into
the server room to check the front panel lights? Presumably it handles
hot-swapping a replacement drive OK?
I found some posts mentioning some management/monitoring tools for these
controllers that were allegedly available from the www.lsilogic.com
website, but I can't find anything on there for FreeBSD. Do the Linux
tools work?
Cheers,
Scott
--
===========================================================================
Scott Mitchell | PGP Key ID | "Eagles may soar, but weasels
Cambridge, England | 0x54B171B9 | don't get sucked into jet engines"
scott at fishballoon.org | 0xAA775B8B | -- Anon
> Hi all,
>
> I may be getting a new Dell PE1850 soon, to replace our ancient CVS server
> (still running 4-STABLE). The new machine will ideally run 6.0 and have a
> PERC4e/DC RAID card - the one with battery-backed cache. This is listed as
> supported by amr(4), but I'm wondering how well it actually works in the
> case of a disk failure. Will the driver tell me that a disk has failed (a
> syslog message would be enough) or will I have to make a daily trip into
> the server room to check the front panel lights? Presumably it handles
> hot-swapping a replacement drive OK?
From what I remember, you will receive status-change kernel messages when
disks disappear, rebuilds start, and so forth. So for most day-to-day
manipulation you should be fine.
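Something like this in a nightly cron job should catch them -- an untested
sketch, and in practice you'd want to remember what was already reported
the night before:

    #!/bin/sh
    # mail any amr kernel messages to root (crude sketch)
    msgs=`grep 'amr[0-9]' /var/log/messages`
    [ -n "$msgs" ] && echo "$msgs" | mail -s "amr events on `hostname`" root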
You may want to make sure the auto rebuild option is enabled in the
controller's BIOS since no working control programs from userland are
generally available at this time. That also means you can't create new
volumes at runtime, but that's not so horrible...
--
Doug White | FreeBSD: The Power to Serve
dwh...@gumbysoft.com | www.FreeBSD.org
That would be fine - as long as there's some notification of important
events.
> You may want to make sure the auto rebuild option is enabled in the
> controller's BIOS since no working control programs from userland are
> generally available at this time. That also means you can't create new
> volumes at runtime, but that's not so horrible...
I expect there will only ever be one volume, so that's unlikely to be a
problem :)
Many thanks,
The sysutils/megarc port appears to work for both status change polling
and runtime configuration (at least on a PE800 and a PE2850 that I tested
on).
Cool, I'll check that out when the hardware arrives.
Many thanks,
>Hi all,
>
>I may be getting a new Dell PE1850 soon, to replace our ancient CVS server
>(still running 4-STABLE). The new machine will ideally run 6.0 and have a
>PERC4e/DC RAID card - the one with battery-backed cache. This is listed as
>supported by amr(4), but I'm wondering how well it actually works in the
>case of a disk failure. Will the driver tell me that a disk has failed (a
>syslog message would be enough) or will I have to make a daily trip into
>the server room to check the front panel lights? Presumably it handles
>hot-swapping a replacement drive OK?
>
>I found some posts mentioning some management/monitoring tools for these
>controllers that were allegedly available from the www.lsilogic.com
>website, but I can't find anything on there for FreeBSD. Do the Linux
>tools work?
>
>
FYI, there has also been a big update to the amr driver which claims to
dramatically increase performance, among other things. Interestingly
enough, it was augmented by Yahoo; I can only assume they are moving to
Dell. Yahoo for me (and now you :).
The updates are still in -current, but they will be MFC'ed into stable
sooner or later.
http://lists.freebsd.org/pipermail/cvs-src/2005-December/056814.html
Log:
Mega update to the LSI MegaRAID driver:
1. Implement a large set of ioctl shims so that the Linux management apps
from LSI will work. This includes infrastructure to support adding, deleting
and rescanning arrays at runtime. This is based on work from Doug Ambrosko,
heavily augmented by LSI and Yahoo.
2. Implement full 64-bit DMA support. Systems with more than 4GB of RAM
can now operate without the cost of bounce buffers. Cards that cannot do
64-bit DMA will automatically revert to using bounce buffers. This option
can be forced off by setting the 'hw.amr.force_sg32' tunable in the loader.
It should only be turned off for debugging purposes. This work was sponsored
by Yahoo.
3. Streamline the command delivery and interrupt handler paths after
much discussion with Dell and LSI. The logic now closely matches the
intended design, making it both more robust and much faster. Certain
i/o failures under heavy load should be fixed with this.
4. Optimize the locking. In the interrupt handler, the card can be checked
for completed commands without any locks held, due to the handler being
implicitly serialized and there being no need to look at any shared data.
Only grab the lock to return the command structure to the free pool. A
small optimization can still be made to collect all of the completions
together and then free them together under a single lock.
Items 3 and 4 significantly increase the performance of the driver. On an
LSI 320-2X card, transactions per second went from 13,000 to 31,000 in my
testing with these changes. However, these changes are still fairly
experimental and shouldn't be merged to 6.x until there is more testing.
Thanks to Doug Ambrosko, LSI, Dell, and Yahoo for contributing towards
this.
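Side note: the hw.amr.force_sg32 tunable from item 2 would presumably be
set in /boot/loader.conf like any other loader tunable:

    # per the commit log above, only for debugging
    hw.amr.force_sg32="1"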
Yeah, I saw that, and it sounds most excellent. Good to see some real
support from the likes of Dell and LSI, too.
I might be able to get away with running -stable on this machine, but
-current will be right out. Hopefully these changes can be MFCed in time
for 6.1.
Scott
> I may be getting a new Dell PE1850 soon, to replace our ancient CVS server
> (still running 4-STABLE). The new machine will ideally run 6.0 and have a
> PERC4e/DC RAID card - the one with battery-backed cache. This is listed as
I have an 1850 with the built-in PERC 4e/Si since all I needed was the
RAID1 mirror of the internal drives. It works extremely well, and
the speed is quite good.
As for notices of when the drives go bad, under 4.x I've had disk
failures with the amr driver (different PERC cards) and not gotten
any such notices in the syslog that I recall. I did find a program
posted to one of the freebsd lists called 'amrstat' that I run
nightly. It produces this kind of output:
Drive 0: 68.24 GB, RAID1 <writeback,no-read-ahead,no-adaptative-io> optimal
If it says "degraded" it is time to fix a drive. You just fire up
the lsi megaraid tools and find out which drive it is.
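For example, something like this out of cron does the trick -- a rough
sketch, and the amrstat path is a guess; adjust it to wherever you
installed the program:

    #!/bin/sh
    # nightly RAID health check; mails root only when something is wrong
    status=`/usr/local/sbin/amrstat`
    case "$status" in
    *degraded*)
        echo "$status" | mail -s "RAID degraded on `hostname`" root
        ;;
    esac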
If you go to the LSI download area, they have one file for FreeBSD,
which is labeled the driver. In that zip file is also the management
software for FreeBSD. You'll want that. Personally, I like the
"MEGAMGR" software which was released for FreeBSD 4.x and mimics the
BIOS' interface in a terminal window.
The rebuild on LSI controllers is set to automatic on the Dells by
default. It just works as expected.
Overall, I'm a big fan of the LSI cards and the amr driver...
Unfortunately for me, the latest equipment I just got only takes low-
profile cards, and LSI doesn't offer a dual-channel RAID card in a low-
profile configuration... so I need to look at Adaptec.
> Items 3 and 4 significantly increase the performance of the driver. On an
> LSI 320-2X card, transactions per second went from 13,000 to 31,000 in my
> testing with these changes. However, these changes are still fairly
> experimental and shouldn't be merged to 6.x until there is more testing.
> Thanks to Doug Ambrosko, LSI, Dell, and Yahoo for contributing towards
> this.
Damn that's awesome! Thanks to all who helped with this... This
will be great for some of my servers.
Now, does anyone have any numbers to compare this with other RAID
cards? Particularly the 2230SLP? :-)
/me wishes LSI made low-profile dual-channel cards...
We'll only be mirroring the internal drives too for now - the 4e/DC seems
to be the only RAID option on the 1850 with battery-backed cache, and
doesn't cost much more for the extra peace-of-mind.
> As for notices of when the drives go bad, under 4.x I've had disk
> failures with the amr driver (different PERC cards) and not gotten
> any such notices in the syslog that I recall.
That's a pity. Maybe Doug was thinking of one of the aac(4)-based PERC
cards? Still, something I can run out of cron to check the array status
should be fine.
> I did find a program
> posted to one of the freebsd lists called 'amrstat' that I run
> nightly. It produces this kind of output:
>
> Drive 0: 68.24 GB, RAID1 <writeback,no-read-ahead,no-adaptative-io> optimal
>
> If it says "degraded" it is time to fix a drive. You just fire up
> the lsi megaraid tools and find out which drive it is.
>
> If you go to the LSI download area, they have one file for FreeBSD,
> which is labeled the driver. In that zip file is also the management
> software for FreeBSD. You'll want that. Personally, I like the
> "MEGAMGR" software which was released for FreeBSD 4.x and mimics the
> BIOS' interface in a terminal window.
There's a port of the management software now: sysutils/megarc
> The rebuild on LSI controllers is set to automatic on the Dells by
> default. It just works as expected.
Cool.
> Overall, I'm a big fan of the LSI cards and the amr driver...
>
> Unfortunately for me, the latest equipment I just got only takes low-
> profile cards, and LSI doesn't offer a dual-channel RAID card in a low-
> profile configuration... so I need to look at Adaptec.
This is on your x4100? Nice machine. We have a v20z with dual Opteron
270s that I totally love. Looking at getting an x4100 too... sadly these
are product development machines so they'll be running RedHat and Solaris.
Doesn't the x4100 have h/w RAID built in? Or does that not work with
FreeBSD?
> We'll only be mirroring the internal drives too for now - the 4e/DC seems
> to be the only RAID option on the 1850 with battery-backed cache, and
> doesn't cost much more for the extra peace-of-mind.
Then you'll be pleasantly surprised to know that the 4e/Si has a
battery too. I certainly was... and it even has 256MB of cache RAM.
Quite the bargain! I'll send you screen shots of the config menus in
private email.
>> Unfortunately for me, the latest equipment I just got only takes low-
>> profile cards, and LSI doesn't offer a dual-channel RAID card in a low-
>> profile configuration... so I need to look at Adaptec.
>
> This is on your x4100? Nice machine. We have a v20z with dual Opteron
> 270s that I totally love. Looking at getting an x4100 too... sadly these
> are product development machines so they'll be running RedHat and
> Solaris.
> Doesn't the x4100 have h/w RAID built in? Or does that not work with
> FreeBSD?
Yes, this is the X4100. It only has room for two low-profile PCI-X
cards, which the 320-2X certainly is not. Curiously, LSI has on
their web site some big announcements about some deals with Sun to
use their products, so one would hope they would have a low-profile
high-end card. Currently they only have a low-end card that is low
profile.
I'm biting the bullet and getting an Adaptec 2230 low-profile card.
I hope it is fast. If not, then back to the drawing board... sigh.
Are you referring to this, Doug? The Linux ioctl shim requires one file
that hasn't been committed yet. Scott L. & ps have it. I may commit
it now that I'm back. This lets all of the Dell/LSI Linux tools
run on FreeBSD, including the firmware update tool. The caveat is
that with the driver re-do, it seems certain things in the ioctl
path cause the firmware to lock up. I haven't been around enough
to help with that problem. I have a binary that locks it up pretty
quickly.
Most of the existing monitoring tools have bugs. The Linux tools
tend to be better but the last copy of MegaMon leaked shared memory
then quit. We have a tool at work but it is encumbered so we can't
give it out.
| > I did find a program
| > posted to one of the freebsd lists called 'amrstat' that I run
| > nightly. It produces this kind of output:
| >
| > Drive 0: 68.24 GB, RAID1 <writeback,no-read-ahead,no-adaptative-io> optimal
| >
| > If it says "degraded" it is time to fix a drive. You just fire up
| > the lsi megaraid tools and find out which drive it is.
This is probably a fairly good scheme. Caveat is that you can have
an "optimal" RAID that is broken :-(
On another note, IPMI is pretty good for remotely monitoring these boxes.
You can run the Dell SOL proxy tool for Linux on FreeBSD, then set up BIOS
console redirection on the serial port and connect the serial port to the
BMC/LAN.
FWIW, I've been working on an openipmi-compatible driver. It basically
works for a bunch of programs that I've tested with, as long as they
are compiled with a correct ioctl file.
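For example, the stock ipmitool subcommands work against it here
(assuming an ipmitool built with the right ioctl header, as above):

    ipmitool sel list    # dump the system event log; drive events land here
    ipmitool sdr         # read the sensor data repository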
Doug A.
That's lame. Under what condition does it happen, do you know?
Thanks,
Jung-uk Kim
Running RAID 10, a drive was swapped and the rebuild started on the
replacement drive. The rebuild complained about the source drive
for the mirror rebuild having read errors that couldn't be recovered.
It continued on and finished re-creating the mirror. Then the RAID
proceeded to a background init, which it normally did, and started
failing that and restarting the background init over and over again.
The box changed the RAID from degraded to optimal when the rebuild
completed (with errors). Doing a dd of the entire RAID logical device
returned an error at the bad sector, since it couldn't recover that.
The RAID controller reported an I/O error and still left the RAID as
optimal.
We reported this and were told that's the way it is designed :-(
Probably the spec is defined by whatever the RAID controller happens
to do versus what makes sense :-(
So far this has only happened once. Changing firmware did not help.
Doug A.
PS. sorry for the null email before this. Hit the wrong key.
Similar thing happened to me once or twice (with RAID5) and I thought
it was just a broken controller. If the culprit was the design, it IS
really lame. :-(
> Doug A.
>
> PS. sorry for the null email before this. Hit the wrong key.
No need to be sorry. I made the same mistake again. ;-)
Thanks for the info,
Jung-uk Kim
Interesting timing, as I ran into this sort of situation over the
weekend with a 3ware RAID1. The card had complained for a week
about read errors on drive 1. We thought we would wait until the
weekend maintenance window to swap it out. Sadly, before that
window, drive zero totally died a horrible death. We popped in a new
drive on port zero, started the rebuild, and it crapped out saying
there was a read error on drive 1. However, there is a check box
that says continue the build, even with errors on the source drive.
This setup seems to give you the best of both worlds. We did a quick
check of the resultant files compared to backups and only a couple
were toasted. (The box is going to be retired in a month, so if there
is other hidden fs corruption, as long as it holds out for another 3
weeks we don't care too much). The correct approach would be to do a
total restore of course, but this was good enough for us in this
situation. I guess the question is, is this RAID1 a proper mirror
given that there are hard errors on the drive on port 1?
---Mike
With Adaptec we used to do a verify of each disk before a swap
to increase our chances of a successful disk swap. Adaptec was
a little heavy-handed in that if you are running on the last disk of
the mirror and it has a read error, it will fail the drive. If you have
a RAID 10, then you lose 1/2 the file system :-( I'd rather just
get the read error back to the OS than lose the entire drive.
| This setup seems to give you the best of both worlds. We did a quick
| check of the resultant files compared to backups and only a couple
| were toasted. (The box is going to be retired in a month, so if there
| is other hidden fs corruption, as long as it holds out for another 3
| weeks we don't care too much). The correct approach would be to do a
| total restore of course, but this was good enough for us in this
| situation. I guess the question is, is this RAID1 a proper mirror
| given that there are hard errors on the drive on port 1?
That sounds like a good controller, assuming it says the RAID is still
degraded and not optimal. I assume "optimal" means everything
is fine and it is safe to read the entire volume.
Doug A.
I'd suggest whining to them. To me "optimal" means "as far as I know
there are no problems with the RAID". If enough customers whine they
might change their view!
Doug A.
heh. When we've told Dell that some of our 1750s and 1850s were
locking up randomly, with various errors (most commonly, either the mpt0
driver complains and never recovers, or the server locks up entirely
with no error), we're told a) that FreeBSD isn't supported and b) to
run the diagnostics disk (which never finds anything except that the
CD-ROM drive is empty), which basically leads to the implied c) to piss
up a rope.
Dell cares not about us FreeBSD users.
Better to just go with known-working hardware like 3ware cards. I do
wish they had a SCSI RAID controller. Seems like all major SCSI RAID
cards have various problems: Adaptec 2100S (and the rest in the line)
would not rebuild transparently -- the OS would get various timeout
errors while rebuilds or verifies were ongoing; Mylex's FreeBSD driver
has a 2.5-year-old bug (i386/55603); and then these Dell cards (LSI?) have
obvious problems. I'm sure there are others I've missed.
3ware cards in Supermicro servers (not sure which exact models,
Silicon Mechanics sells them as their R200, R204, Q500, and others)
are rock solid for us even during rebuilds and with degraded arrays.
And the best part is they're not Dells.
Hi Doug,
I was actually referring to Doug White, who said:
>From what I remember, you will receive status-change kernel messages when
>disks disappear, rebuilds start, and so forth. So for most day-to-day
>manipulation you should be fine.
It wasn't clear if this applied to the amr(4)-based PERC cards or just the
aac(4) ones.
Sounds like the re-worked amr driver will be very much better, at least
once a few more bugs have been ironed out of it.
> Most of the existing monitoring tools have bugs. The Linux tools
> tend to be better but the last copy of MegaMon leaked shared memory
> then quit. We have a tool at work but it is encumbered so we can't
> give it out.
>
> | > I did find a program
> | > posted to one of the freebsd lists called 'amrstat' that I run
> | > nightly. It produces this kind of output:
> | >
> | > Drive 0: 68.24 GB, RAID1 <writeback,no-read-ahead,no-adaptative-io> optimal
> | >
> | > If it says "degraded" it is time to fix a drive. You just fire up
> | > the lsi megaraid tools and find out which drive it is.
>
> This is probably a fairly good scheme. Caveat is that you can have
> an "optimal" RAID that is broken :-(
That's pretty sucky, but presumably not a FreeBSD-specific problem?
Despite that, I'm reasonably hopeful that a scheme like this along with
good backups (which we have) will be enough to avoid any major disasters.
Is Dell's support any better if you tell them you're running RedHat?
Regards,
Yes, that only applies to the aac-based machines and not amr-based machines
(i.e. Adaptec versus LSI). With LSI you have to poll the controller
for RAID events, and that interface is not public.
| Sounds like the re-worked amr driver will be very much better, at least
| once a few more bugs have been ironed out of it.
Yes.
| > Most of the existing monitoring tools have bugs. The Linux tools
| > tend to be better but the last copy of MegaMon leaked shared memory
| > then quit. We have a tool at work but it is encumbered so we can't
| > give it out.
| >
| > | > I did find a program
| > | > posted to one of the freebsd lists called 'amrstat' that I run
| > | > nightly. It produces this kind of output:
| > | >
| > | > Drive 0: 68.24 GB, RAID1 <writeback,no-read-ahead,no-adaptative-io> optimal
| > | >
| > | > If it says "degraded" it is time to fix a drive. You just fire up
| > | > the lsi megaraid tools and find out which drive it is.
| >
| > This is probably a fairly good scheme. Caveat is that you can have
| > an "optimal" RAID that is broken :-(
|
| That's pretty sucky, but presumably not a FreeBSD-specific problem?
| Despite that, I'm reasonably hopeful that a scheme like this along with
| good backups (which we have) will be enough to avoid any major disasters.
It's not a FreeBSD-specific problem.
| Is Dell's support any better if you tell them you're running RedHat?
We can sort-of run RedHat. That is, we ran the Linux RAID binaries
from LSI & Dell with the Linux ioctl emulation layer I did on FreeBSD.
I netboot Linux sometimes to verify some things.
Doug A.
> has a 2.5-year-old bug (i386/55603); and then these Dell cards (LSI?) have
> obvious problems. I'm sure there are others I've missed.
I've never had a rebuild error on a Dell LSI card. I've never had a
failure on a box with an Adaptec-based card, so I can't say about that.
> I was actually referring to Doug White, who said:
>
>> From what I remember, you will receive status-change kernel messages when
>> disks disappear, rebuilds start, and so forth. So for most day-to-day
>> manipulation you should be fine.
>
> It wasn't clear if this applied to the amr(4)-based PERC cards or just the
> aac(4) ones.
>
> Sounds like the re-worked amr driver will be very much better, at least
> once a few more bugs have been ironed out of it.
From my experience, the amr driver does not issue warnings of any
sort that show up on the console or in log files. The aac driver is
more chatty -- I see log file lines about the battery being
recharged, etc.
I've never had a drive failure on any box in which I have an
aac-driven card, so I can't speak to that, but I'd bet $1 that it would
log it. The amr driver doesn't log drive failures -- one must run some
utility to probe it.
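If you'd rather hook that utility into the daily periodic(8) run than
plain cron, something like this should do -- an untested sketch, and the
amrstat path is a guess:

    #!/bin/sh
    # /usr/local/etc/periodic/daily/405.raid-status (name is arbitrary)
    # periodic(8) picks up scripts in this directory during the daily
    # run and mails their output to root with the rest of the report.
    if /usr/local/sbin/amrstat | grep degraded; then
        echo "RAID degraded -- run the LSI tools to find the bad drive"
    fi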
Steve
Following up to myself for the benefit of the archives - I can confirm that
the PERC4e in the PE1850 works perfectly with amr(4) under 6.0. I've been
using the sysutils/megarc port for managing the adapter from FreeBSD. It
has a truly awful user interface but allows you to do everything that the
BIOS setup program does, so far as I can tell.
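For reference, the invocation I use to eyeball the array state is roughly
this, from memory -- megarc's own usage text is the authority:

    # show the logical drive configuration and state on adapter 0
    megarc -dispCfg -a0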
For monitoring we're relying on the email alerts from the DRAC/4 management
card also in the machine, which turn out to work very well. We actually
had a disk failure on the machine already (one of the drives had apparently
worked itself a bit loose in transit and decided to power itself off a few
days after I put the machine in the rack). The DRAC sent out an email when
the drive "died"; the array auto-rebuilt when the drive was shoved back into
the slot properly, and the DRAC sent another email when the rebuild was
complete.
I'm looking forward to the amr(4) performance improvements in 6.1 and being
able to run the Linux megamgr tool (I think this is the one with the same
user interface as the BIOS setup program).
Cheers,