
Re: hardware for home use large storage


Daniel O'Connor

Feb 8, 2010, 12:26:46 AM
to freebsd...@freebsd.org, Dan Langille
On Mon, 8 Feb 2010, Dan Langille wrote:
> Given that, what motherboard and RAM configuration would you
> recommend to work with FreeBSD [and probably ZFS].  The lists seems
> to indicate that more RAM is better with ZFS.

I have something similar (5x1Tb): a Gigabyte GA-MA785GM-US2H
with an Athlon X2 and 4Gb of RAM (only half filled - 2x2Gb).

The board has 5 SATA ports + 1 eSATA (I looped that back into the case
to connect to the DVD drive :).

I boot it off a 4Gb CF card in an IDE adapter. I think you could boot
off ZFS but it seemed a bit unreliable when I installed it so I opted
for a more straightforward method.

The CPU fan is fairly quiet (although a 3rd party one would probably be
quieter) and the rest of the motherboard is fanless.

The onboard video works great with radeonhd (it's a workstation for
someone as well as a file server).

Note that it doesn't support ECC, I don't know if that is a problem.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C


Dan Langille

Feb 8, 2010, 12:01:01 AM
to FreeBSD Stable
Hi,

I'm looking at creating a large home use storage machine. Budget is a
concern, but size and reliability are also a priority. Noise is also a
concern, since this will be at home, in the basement. That, and cost,
pretty much rules out a commercial case, such as a 3U case. It would be
nice, but it greatly inflates the budget. This pretty much restricts me
to a tower case.

The primary use of this machine will be as a backup server[1]. Secondary
uses will include minor tasks such as samba, CIFS, cvsup, etc.

I'm thinking of 8x1TB (or larger) SATA drives. I've found a case[2]
with hot-swap bays[3], that seems interesting. I haven't looked at
power supplies, but given that number of drives, I expect something
beefy with a decent reputation is called for.

Whether I use hardware or software RAID is undecided.

I am leaning towards software RAID, probably ZFS under FreeBSD 8.x. I'm
open to hardware RAID, but I don't think the cost would be justified
given ZFS.

Given that, what motherboard and RAM configuration would you recommend
to work with FreeBSD [and probably ZFS]? The lists seem to indicate
that more RAM is better with ZFS.

Thanks.


[1] - FYI running Bacula, but that's out of scope for this question

[2] - http://www.newegg.com/Product/Product.aspx?Item=N82E16811192058

[3] - nice to have, especially when a drive fails.

Svein Skogen (Listmail Account)

Feb 8, 2010, 4:52:05 AM
to freebsd...@freebsd.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08.02.2010 06:01, Dan Langille wrote:
> Hi,
>
> I'm looking at creating a large home use storage machine. Budget is a
> concern, but size and reliability are also a priority. Noise is also a
> concern, since this will be at home, in the basement. That, and cost,
> pretty much rules out a commercial case, such as a 3U case. It would be
> nice, but it greatly inflates the budget. This pretty much restricts me
> to a tower case.
>
> The primary use of this machine will be a backup server[1]. It will do
> other secondary use will include minor tasks such as samba, CIFS, cvsup,
> etc.
>
> I'm thinking of 8x1TB (or larger) SATA drives. I've found a case[2]
> with hot-swap bays[3], that seems interesting. I haven't looked at
> power supplies, but given that number of drives, I expect something
> beefy with a decent reputation is called for.
>
> Whether I use hardware or software RAID is undecided. I
>
> I think I am leaning towards software RAID, probably ZFS under FreeBSD
> 8.x but I'm open to hardware RAID but I think the cost won't justify it
> given ZFS.
>
> Given that, what motherboard and RAM configuration would you recommend
> to work with FreeBSD [and probably ZFS]. The lists seems to indicate
> that more RAM is better with ZFS.

Just before Christmas, I rebuilt my own storage backend server for my
home, so I've had a recent look at "what's there". Some hardware I kept
from the old solution, and some was new. Some of it is a tad more
expensive than what you had in mind here, but the logic is (mostly)
the same. I'll also include the replacements I'm looking at for some of
the old parts.

Heirlooms of the old server:
- -Disks (four Samsung HD501LJ, Four Seagate ST31500341AS)
- -Disk Controller AMI/Lsilogic Megaraid SAS 8308ELP (8chan MFI)

The new hardware around this:
- -Chieftec UNB-410F-B
- -Two Chieftec SST3141SAS
- -Chieftec APS-850C (850 watt modular power supply)
- -Intel E7500 CPU using the bundled stock cooler, and Arctic Silver paste
- -4 2GB Corsair Valueram DDR2 1066 sticks
- -Asus P5Q Premium mainboard
- -LSI SAS3801E (for the tape autoloader)
- -Some old graphics board (unless you need a lot of fancy 3D stuff, use
what you have around that's not ESD-damaged here).

Had I started from scratch, I'd have used Seagate 2TB "Green"
disks, due to their lower temperatures and power consumption. And
that's about the only thing I'd do differently. The MFI controller
(Megaraid) would stay, simply because it has built-in logic to
periodically do patrol reads and consistency checks, and I've had
previous experience with the RAID controller's checks discovering bad
disks before they go critical. But this breed of controllers is a little
costly (customers are willing to pay for the features, so the
manufacturer milks them for all they can).

I recommend you go for a modular power supply that is rated for quite a
bit more than what you expect to draw from it. The reason is that as
current increases, the efficiency of the conversion drops, so a supply
running at half its rated maximum is more efficient than one pushed to
its limits. Go for modular so you don't have to have the extra cables
tied into coils inside your machine disrupting airflow (and creating EMF
noise).

Make sure you get yourself a proper ESD wrist strap (or ankle strap)
before handling any of these components, and make sure you use the
correct torque for all the screws holding the components (and disks).
This machine will probably have a lot of uptime, and disks (and fans)
create vibrations. If in doubt, use some fancy-colored nail polish (or
Loctite) on the screws to make sure they don't work loose from vibration
over time. (A loose screw has a will of its own, and WILL short-circuit
the most expensive component in your computer.)

Also make sure you use cable ties to get the cables out of the airflow,
so you get sufficient cooling. Speaking of cooling, make sure your air
intakes have some sort of filtering, or you'll learn where Illiad
(userfriendly.org) got the idea for "Dust Puppy". No matter how pedantic
you are about cleaning your house, a computer is basically a large,
expensive vacuum cleaner and WILL suck in dust from the air.

These are some of the pointers I'd like to share on the subject. :)

//Svein


- --
- --------+-------------------+-------------------------------
/"\ |Svein Skogen | sv...@d80.iso100.no
\ / |Solberg Østli 9 | PGP Key: 0xE5E76831
X |2020 Skedsmokorset | sv...@jernhuset.no
/ \ |Norway | PGP Key: 0xCE96CE13
| | sv...@stillbilde.net
ascii | | PGP Key: 0x58CD33B6
ribbon |System Admin | svein-l...@stillbilde.net
Campaign|stillbilde.net | PGP Key: 0x22D494A4
+-------------------+-------------------------------
|msn messenger: | Mobile Phone: +47 907 03 575
|sv...@jernhuset.no | RIPE handle: SS16503-RIPE
- --------+-------------------+-------------------------------
If you really are in a hurry, mail me at
svein-...@stillbilde.net
This mailbox goes directly to my cellphone and is checked
even when I'm not in front of my computer.
- ------------------------------------------------------------
Picture Gallery:
https://gallery.stillbilde.net/v/svein/
- ------------------------------------------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.12 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktv3sUACgkQODUnwSLUlKSdCQCcDzIFDv4zSRmPwYP3XhxQyIBe
Tc0AnikVuqUs0IO1Z6bcaeLJWjXJ2jVv
=zV8R
-----END PGP SIGNATURE-----

Miroslav Lachman

Feb 8, 2010, 5:59:26 AM
to Dan Langille, FreeBSD Stable
Dan Langille wrote:
> Hi,
>
> I'm looking at creating a large home use storage machine. Budget is a
> concern, but size and reliability are also a priority. Noise is also a
> concern, since this will be at home, in the basement. That, and cost,
> pretty much rules out a commercial case, such as a 3U case. It would be
> nice, but it greatly inflates the budget. This pretty much restricts me
> to a tower case.
>
> The primary use of this machine will be a backup server[1]. It will do
> other secondary use will include minor tasks such as samba, CIFS, cvsup,
> etc.


It depends on your needs (storage capacity [number of drives],
performance etc.)

One year ago I purchased an HP ProLiant ML110 G5 / P2160 / 1GB / 250GB
SATA / DVDRW / Tower (with 3 years Next Business Day support!). It was
sold for about 9000 CZK ($500); I added another 4GB of RAM and 4x 1TB
Samsung F1 drives instead of the original 250GB Seagate. The system is
booted from a 2GB internal USB flash drive and all drives are in a
RAIDZ pool.
The machine is really quiet.
All in all, the cost is about $1000 with 3 years NBD.
You can put in 2TB drives instead of 1TB drives.

It is a really low-end machine, but it has run without problems for more
than a year.
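
For reference, a pool like that boils down to something like the following
sketch (the pool name and the ad4/ad6/ad8/ad10 device names are hypothetical
and depend on your controller):

    # create one raidz vdev from the four data drives, then check it
    zpool create tank raidz ad4 ad6 ad8 ad10
    zpool status tank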

Miroslav Lachman

Christer Solskogen

Feb 8, 2010, 6:15:26 AM
to Miroslav Lachman, FreeBSD Stable, Dan Langille
On Mon, Feb 8, 2010 at 11:59 AM, Miroslav Lachman <000....@quip.cz> wrote:
> System is booted from 2GB internal USB flash

Be aware that not all USB sticks work as a root device on 8.0-RELEASE.
I've tried a couple of different sticks that are probed *after* the
kernel tries to mount /. It seems to be a problem that emerged with 8.0,
as 7.x worked like a charm on the same USB stick.

http://www.freebsd.org/cgi/query-pr.cgi?pr=138798
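
A commonly suggested workaround (only a sketch; whether it helps depends on
the stick and controller) is to give slow-probing USB devices extra time
before the root mount, via /boot/loader.conf:

    # /boot/loader.conf: wait up to 10 seconds for devices before mounting /
    kern.cam.boot_delay="10000"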

--
chs,

Daniel Engberg

Feb 8, 2010, 8:57:01 PM
to freebsd...@freebsd.org
While I'm not a heavy FreeBSD user, I can offer you some advice on
hardware, at least based on my own experience.

If you want things to work as well as possible, go with an Intel chipset
and LAN. AMD chipsets work (mostly), but you'll get worse performance,
and you won't get an Intel NIC, which performs much better than the
Realtek or Attansic NICs you usually find on AMD motherboards.

A general tip is to go for "business" chipsets, as Intel likes to call
them: Q35 (I have a few of those and they work very well), Q45 and Q57.
By doing so you can be sure to get an Intel NIC; these boards aren't
much more expensive than your average motherboard and usually carry
some kind of remote management.

Keeping in mind that FreeBSD may or may not support the newest hardware
around, I'd guess that Q57 needs -CURRENT for now, but I would still
highly recommend it as Socket 775 is slowly dying.

The ASUS P7Q57-M DO looks like a very nice board if you want "bleeding
edge". Keep in mind, though, that at the time of writing support for its
NIC (82578DM) doesn't seem to be in FreeBSD, but I guess it's only a
matter of time. Pair it with the slowest Core i3 CPU you can find and
you have a very nice solution. If you step up to an i5 you get hardware
encryption =)

If you want to stay with the legacy platform, the Intel DQ45CB should be
a pretty nice choice, with the LAN supported out of the box. An Intel
Pentium E6300 should be more than enough for storage.

Both MSI and Gigabyte also make Q-chipset motherboards; they don't seem
to be widely available in the US, but their boards should be fine too.

Since you want more than 5 HDDs you need a controller card of some sort;
in that case I would recommend having a look at the Supermicro ones
mentioned in this post:
http://forums.freebsd.org/showpost.php?p=59735&postcount=5
UIO is just a backwards PCIe slot, so turning the card around will make
it fit, although you may need to secure it somehow. They may be a bit
hard to find, but there are a few sellers on eBay. What I don't know is
how a given motherboard will react if you pop one in, so you'll need to
do some research on that.

As for memory, you'll need at least 2GB, but 4GB is highly recommended
if you're going to use ZFS. Just make sure the sticks follow JEDEC
standards and you'll be fine (the Corsair Value Select series or stock
Crucial are fine).

//Daniel

Matthew D. Fuller

Feb 9, 2010, 12:30:02 AM
to Daniel O'Connor, freebsd...@freebsd.org, Dan Langille
On Mon, Feb 08, 2010 at 03:56:46PM +1030 I heard the voice of
Daniel O'Connor, and lo! it spake thus:

>
> I have something similar (5x1Tb) - I have a Gigabyte GA-MA785GM-US2H
> with an Athlon X2 and 4Gb of RAM (only half filled - 2x2Gb)
>
> [...]

>
> Note that it doesn't support ECC, I don't know if that is a problem.

How's that? Is the BIOS just stupid, or is the board physically
missing traces?


--
Matthew Fuller (MF4839) | full...@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.

Charles Sprickman

Feb 9, 2010, 1:15:24 AM
to Dan Langille, FreeBSD Stable
On Mon, 8 Feb 2010, Dan Langille wrote:

> Hi,
>
> I'm looking at creating a large home use storage machine. Budget is a
> concern, but size and reliability are also a priority. Noise is also a
> concern, since this will be at home, in the basement. That, and cost, pretty
> much rules out a commercial case, such as a 3U case. It would be nice, but
> it greatly inflates the budget. This pretty much restricts me to a tower
> case.

I recently had to put together something very cheap for a client for
disk-only backups (rsync + zfs snapshots). As you noticed, rack
enclosures that will hold a bunch of drives are insanely expensive. I put
my "wishlist" from NewEgg below. While the $33 case is a bit flimsy, the
extra high-CFM fan in the back and the fan that sits in front of the
drive bays keep the drives extremely cool. For $33, I lucked out.

> The primary use of this machine will be a backup server[1]. It will do other
> secondary use will include minor tasks such as samba, CIFS, cvsup, etc.
>
> I'm thinking of 8x1TB (or larger) SATA drives. I've found a case[2] with
> hot-swap bays[3], that seems interesting. I haven't looked at power
> supplies, but given that number of drives, I expect something beefy with a
> decent reputation is called for.

For home use is the hot-swap option really needed? Also, it seems like
people who use zfs (or gmirror + gstripe) generally end up buying pricey
hardware raid cards for compatibility reasons. There seem to be no decent
add-on SATA cards that play nice with FreeBSD other than that weird
supermicro card that has to be physically hacked about to fit.

I did "splurge" for a server-class board from Supermicro since I wanted
bios serial port console redirection, and as many SATA ports on-board that
I could find.

> Whether I use hardware or software RAID is undecided. I
>
> I think I am leaning towards software RAID, probably ZFS under FreeBSD 8.x
> but I'm open to hardware RAID but I think the cost won't justify it given
> ZFS.

I've had two very different ZFS experiences so far. On the hardware I
mention in this message, I had zero problems and excellent performance
(bonnie++ showing 145MB/s reads, 132MB/s writes on a 4 disk RAIDZ1 array)
with 8.0/amd64 w/4GB of RAM. I did no "tuning" at all - amd64 is the way
to go for ZFS.

On an old machine at home with 2 old (2003-era) 32-bit Xeons, I ran into
all the issues people see with i386+ZFS: kernel memory exhaustion
resulting in a panic, screwing around with an old 3Ware RAID card (JBOD
mode) that cannot properly scan for new drives - just a total mess
without lots of futzing about.
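
For what it's worth, the commonly suggested mitigation on i386 at the time was
to enlarge the kernel address space and cap the ARC in /boot/loader.conf; the
values below are only illustrative, not a recommendation:

    # /boot/loader.conf (i386 + ZFS): more kmem, smaller ARC
    vm.kmem_size="512M"
    vm.kmem_size_max="512M"
    vfs.zfs.arc_max="160M"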

> Given that, what motherboard and RAM configuration would you recommend to
> work with FreeBSD [and probably ZFS]. The lists seems to indicate that more
> RAM is better with ZFS.

Here's the list:

http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=8441629

Just over $1K, and I've got 4 nice drives, ECC memory, and a server
board. Going with the Celeron saved a ton of cash with no impact on ZFS
that I can discern, and again, going with a cheap tower case slashed the
cost as well. That whole combo works great. When I use up those 6 SATA
ports, I don't know how to add more cheaply, but I'll worry about that
later...

Charles

> Thanks.
>
>
> [1] - FYI running Bacula, but that's out of scope for this question
>
> [2] - http://www.newegg.com/Product/Product.aspx?Item=N82E16811192058
>
> [3] - nice to have, especially for a failure.


Daniel O'Connor

Feb 9, 2010, 1:07:50 AM
to Matthew D. Fuller, freebsd...@freebsd.org, Dan Langille
On Tue, 9 Feb 2010, Matthew D. Fuller wrote:
> On Mon, Feb 08, 2010 at 03:56:46PM +1030 I heard the voice of
>
> Daniel O'Connor, and lo! it spake thus:
> > I have something similar (5x1Tb) - I have a Gigabyte
> > GA-MA785GM-US2H with an Athlon X2 and 4Gb of RAM (only half filled
> > - 2x2Gb)
> >
> > [...]
> >
> > Note that it doesn't support ECC, I don't know if that is a
> > problem.
>
> How's that? Is the BIOS just stupid, or is the board physically
> missing traces?

I don't know... Some consumer Gigabyte motherboards seem to support it
(e.g. the GA-MA770T-UD3P).

Probably the result of idiotic penny pinching though :-/


Andrew Snow

Feb 9, 2010, 1:21:32 AM
to freebsd...@freebsd.org

http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H


Supermicro just released a new Mini-ITX fanless Atom server board with
6xSATA ports (based on Intel ICH9) and a PCIe 16x slot. It takes up to
4GB of RAM, and there's even a version with KVM-over-LAN for headless
operation and remote management.


Jeremy Chadwick

Feb 9, 2010, 1:33:10 AM
to freebsd...@freebsd.org

Neat hardware. But with regards to the KVM-over-LAN stuff: it's IPMI,
and Supermicro has a very, *very* long history of having shoddy IPMI
support. I've been told the latter by too many different individuals in
the industry (some co-workers, some work at Yahoo, some at Rackable,
etc.) for me to rely on it. If you *have* to go this route, make sure
you get the IPMI module which has its own dedicated LAN port on the
module and ***does not*** piggyback on top of an existing LAN port on
the mainboard.

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

Daniel O'Connor

Feb 9, 2010, 1:47:02 AM
to freebsd...@freebsd.org, Charles Sprickman, Dan Langille
On Tue, 9 Feb 2010, Charles Sprickman wrote:
> For home use is the hot-swap option really needed?  Also, it seems
> like people who use zfs (or gmirror + gstripe) generally end up
> buying pricey hardware raid cards for compatibility reasons.  There
> seem to be no decent add-on SATA cards that play nice with FreeBSD
> other than that weird supermicro card that has to be physically
> hacked about to fit.

A friend of mine is building one and I couldn't get the Supermicro card
to work (the older version). Since the driver is a black box, I couldn't
see what the problem was.

I had good success with the onboard SATA ports (AHCI compliant), however
there are only 6 ports on the board I picked.

Hopefully port multipliers will be fully working when I need some more
disks ;)


Matthew D. Fuller

Feb 9, 2010, 5:32:28 AM
to Daniel O'Connor, freebsd...@freebsd.org, Dan Langille
On Tue, Feb 09, 2010 at 04:37:50PM +1030 I heard the voice of

Daniel O'Connor, and lo! it spake thus:
>
> Probably the result of idiotic penny pinching though :-/

Irritating. One of my favorite parts of AMD's amd64 chips is that I
no longer have to pay through the nose or be a detective (or, often,
both) to get ECC. So far, it seems like there are relatively few
hidden holes on that path, and I haven't stepped in one, but every new
one I hear about increases my terror of the day when there are more
holes than solid ground :(

Gerrit Kühn

Feb 9, 2010, 5:32:19 AM
to Charles Sprickman, FreeBSD Stable, Dan Langille

CS> pricey hardware raid cards for compatibility reasons. There seem to
CS> be no decent add-on SATA cards that play nice with FreeBSD other than
CS> that weird supermicro card that has to be physically hacked about to
CS> fit.

BTW: I recently built some more machines with this card. I can confirm
now that you can use it with "standard" brackets, if you have some
spares. The spacing of the two mounting holes is the same as for e.g.
3ware 95/96 controllers, and I had some full-height spares there because
I use the 3wares in low-profile setups. The brackets of Intel NICs seem
to fit, too. The only thing that is different with the card then is the
side on which the components are mounted. But this should not be a
problem unless you want to place it next to a graphics card.


cu
Gerrit

Gerrit Kühn

Feb 9, 2010, 5:35:48 AM
to Andrew Snow, freebsd...@freebsd.org
On Tue, 09 Feb 2010 17:21:32 +1100 Andrew Snow <and...@modulus.org> wrote
about Re: hardware for home use large storage:

AS> http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H

The good thing about this board is that the Pineview Atoms seem to be
64-bit capable, which makes them attractive for ZFS. I bought a board
with a VIA Nano processor for this reason last year, as I could not find
decent hardware with a 64-bit capable Atom.


cu
Gerrit

Dan Langille

Feb 9, 2010, 6:37:47 AM
to Charles Sprickman, FreeBSD Stable
Charles Sprickman wrote:
> On Mon, 8 Feb 2010, Dan Langille wrote:
>
>> I'm thinking of 8x1TB (or larger) SATA drives. I've found a case[2]
>> with hot-swap bays[3], that seems interesting. I haven't looked at
>> power supplies, but given that number of drives, I expect something
>> beefy with a decent reputation is called for.
>
> For home use is the hot-swap option really needed?

Is anything needed?

The option is cheap and convenient. When it comes time to swap disks,
you don't have to take the case apart, etc. Yes, it saves downtime, but
it is also easier.
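
As an aside, on controllers driven through CAM (ahci(4), siis(4), or ata(4)
with ATA_CAM), picking up a swapped disk is usually just a rescan; this is a
sketch, and older ata(4) setups may still need a reboot:

    # list attached disks, then rescan after swapping one
    camcontrol devlist
    camcontrol rescan all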

> Also, it seems like
> people who use zfs (or gmirror + gstripe) generally end up buying pricey
> hardware raid cards for compatibility reasons. There seem to be no
> decent add-on SATA cards that play nice with FreeBSD other than that
> weird supermicro card that has to be physically hacked about to fit.

They use software RAID and hardware RAID at the same time? I'm not sure
what you mean by this. Compatibility with FreeBSD?

Karl Denninger

Feb 9, 2010, 7:53:26 AM
to Jeremy Chadwick, freebsd...@freebsd.org
Jeremy Chadwick wrote:
> On Tue, Feb 09, 2010 at 05:21:32PM +1100, Andrew Snow wrote:
>
>> http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H
>>
>> Supermicro just released a new Mini-ITX fanless Atom server board
>> with 6xSATA ports (based on Intel ICH9) and a PCIe 16x slot. It
>> takes up to 4GB of RAM, and there's even a version with KVM-over-LAN
>> for headless operation and remote management.
>>
>
> Neat hardware. But with regards to the KVM-over-LAN stuff: it's IPMI,
> and Supermicro has a very, *very* long history of having shoddy IPMI
> support. I've been told the latter by too many different individuals in
> the industry (some co-workers, some work at Yahoo, some at Rackable,
> etc.) for me to rely on it. If you *have* to go this route, make sure
> you get the IPMI module which has its own dedicated LAN port on the
> module and ***does not*** piggyback on top of an existing LAN port on
> the mainboard.
>
What's wrong with the Supermicro IPMI implementations? I have several -
all have a SEPARATE LAN port on the main board for the IPMI KVM
(separate and distinct from the board's primary LAN ports), and I've not
had any trouble with any of them.

-- Karl

Tom Evans

Feb 9, 2010, 7:51:35 AM
to Charles Sprickman, FreeBSD Stable, Dan Langille
On Tue, Feb 9, 2010 at 6:15 AM, Charles Sprickman <sp...@bway.net> wrote:
> ....

> Here's the list:
>
> http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=8441629
>
> Just over $1K, and I've got 4 nice drives, ECC memory, and a server board.
> Going with the celeron saved a ton of cash with no impact on ZFS that I can
> discern, and again, going with a cheap tower case slashed the cost as well.
>  That whole combo works great.  Now when I use up those 6 SATA ports, I
> don't know how to get more cheaply, but I'll worry about that later...
>
> Charles
>

As long as those SATA ports are AHCI compliant, they should work quite
nicely with a SiI port multiplier. Failing that, a simple 2-port SiI
PCI-E SATA card (supported by the siis(4) driver) + 2 x SiI port
multipliers would give you 10 extra SATA ports.

My SiI PCI-E card cost £15, and the PM about £50, so it is about
£13/port, or ~$20/port. You can probably get the components cheaper in
the US, actually. I also found some nice simple drive racks for £20 per
4 drives - not completely hot-swappable, but much easier to replace than
drives screwed into the case.

Cheers

Tom

Andriy Gapon

Feb 9, 2010, 7:53:27 AM
to Matthew D. Fuller, freebsd...@freebsd.org
on 09/02/2010 12:32 Matthew D. Fuller said the following:

> On Tue, Feb 09, 2010 at 04:37:50PM +1030 I heard the voice of
> Daniel O'Connor, and lo! it spake thus:
>> Probably the result of idiotic penny pinching though :-/
>
> Irritating. One of my favorite parts of AMD's amd64 chips is that I
> no longer have to spend through the nose or be a detective (or, often,
> both) to get ECC. So far, it seems like there are relatively few
> hidden holes on that path, and I haven't stepped in one, but every new
> one I hear about increases my terror of the day when there are more
> holes than solid ground :(

Yep.
For sure, the Gigabyte BIOS on this board is completely missing the ECC
initialization code. I mean not only the menus in setup, but also the
code that does the memory controller programming.
Not sure about the physical traces, though.

--
Andriy Gapon

Jeremy Chadwick

Feb 9, 2010, 8:44:56 AM
to freebsd...@freebsd.org

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2008-01/msg01206.html
http://forums.freebsd.org/showthread.php?t=7750
http://www.beowulf.org/archive/2007-November/019925.html
http://bivald.com/lessons-learned/2009/06/supermicro_ipmi_problems_web_i.html
http://lists.freebsd.org/pipermail/freebsd-stable/2008-August/044248.html
http://lists.freebsd.org/pipermail/freebsd-stable/2008-August/044237.html

(Last thread piece does mention that the user was able to get keyboard
working by disabling umass(4) of all things)

It gets worse when you use one of the IPMI modules that piggybacks on an
existing Ethernet port -- the NIC driver for the OS, from the ground up,
has to be fully aware of ASF and any quirks/oddities involved. For
example, on bge(4) and bce(4), you'll find this (bge mentioned below):

hw.bge.allow_asf
    Allow the ASF feature for cooperating with IPMI. Can cause
    system lockup problems on a small number of systems. Disabled
    by default.

So unless the administrator intentionally sets the loader tunable prior
to booting the OS installation, they'll find all kinds of MAC problems
as a result of the IPMI piggybacking. "Why isn't this enabled by
default?" I believe it's because there were reports of failures/problems
on people's systems which *did not* have IPMI cards. A lose-lose situation.
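
In practice that means a line like the following in /boot/loader.conf before
the OS boots (shown purely to illustrate the man page excerpt above):

    # /boot/loader.conf: let bge(4) cooperate with an IPMI/ASF-sharing NIC
    hw.bge.allow_asf="1"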

If you really want me to dig up people at Yahoo who have dealt with IPMI
on thousands of Supermicro servers and the insanity involved (due to
bugs, quirks, or implementation differences between the IPMI firmwares
and which revision/model of module used), I can do so. Most of the
complaints I've heard of stem from serial-over-IPMI. I don't think
it'd be a very positive/"supportive" thread, however. :-)

One similar product that does seem to work well is iLO, available on
HP/Compaq hardware.

Dan Langille

Feb 9, 2010, 8:45:12 AM
to Tom Evans, Charles Sprickman, FreeBSD Stable, Dan Langille

Now there's an idea. Drive racks? Got a URL?


--
Dan Langille -- http://langille.org/

Karl Denninger

Feb 9, 2010, 9:03:36 AM
to Jeremy Chadwick, freebsd...@freebsd.org
I load these things over the IPKVM all the time. I leave a DVD-ROM in
the drive when I install them and my initial load is done over the IPKVM
on the board. It "just works."

Maybe they have had trouble in the past (most of those complaints look
to be 2007/2008 issues), but the current stuff I use from them (their
dual Xeon boards) hasn't given me a lick of trouble. And you can't
argue with the price of the boards I use, considering that they have
dual gigabit networking ports plus a separate IPMI LAN interface, and
support ECC memory and dual Xeons.

I don't use the IPMI protocol itself but I **DO** use the remote console
and management over HTTPS. No problems at all and FreeBSD has yet to
throw up on it in any way.

-- Karl

Jeremy Chadwick

Feb 9, 2010, 9:15:47 AM
to freebsd...@freebsd.org

http://www.supermicro.com/products/chassis/mobileRack/

I'd recommend staying away from anything with SAF-TE (for SCSI) or SES2
(for SAS or SATA) however. At least with regards to SCSI, I've seen
quite a few of the QLogic SAF-TE chips get in the way of drive failures
and start changing SCSI IDs of all the disks (yes you read that right)
on the bus willy-nilly.

That means that basically the CSE-M34T or CSE-M35T-1 would be good
choices. Yes, they come in black.

Tom Evans

Feb 9, 2010, 9:09:43 AM
to Dan Langille, Charles Sprickman, FreeBSD Stable

These aren't the exact racks I bought; those seem to be discontinued
(glad I bought 3 at once!). These are slightly more expensive, but the
same idea:
http://www.scan.co.uk/Products/Silverstone-SST-CFP51B-Aluminum-Bay-converter-3x525-to-4x35-in-Black-with-120mm-Fan-RoHS

I got the SiI add-in card and port multiplier from the same place:
http://www.scan.co.uk/Products/Lycom-PE-103-x2-Port-SATAII-3Gbps-PCI-E-Controller-Card-with-NCQ-PC-MAC-Linux
http://www.scan.co.uk/Products/Lycom-ST-126RM-SATA-II-3Gbps-1-To-5-Port-Multiplier-bridge-board-(for-Rack-Mount)

For fixing the port multiplier into the case, I recommend No More Nails :)

I bought one of those cases that has 5.25" bays all down the front - 10
bays on mine: 1 with a DVD recorder and 9 filled with three of those
drive racks, which gives me 12 'easily accessible' drive bays plus 2
internal ones. With 6 SATA ports on the motherboard, together with the
SiI controller and one port multiplier, I have 12 bays and 12 SATA ports
for not too much.

I currently have 6 of them filled with 1.5TB SATA drives in a raidz
pool, and can expand the pool by adding another 6 as I run out of
space. Works very nicely for my needs :)
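
To make the expansion plan concrete, growing such a pool means adding a second
raidz vdev alongside the first; a sketch with hypothetical device names:

    # original six-drive raidz pool
    zpool create tank raidz da0 da1 da2 da3 da4 da5
    # later, grow the pool with a second six-drive raidz vdev
    zpool add tank raidz da6 da7 da8 da9 da10 da11
    zpool status tank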

One thing to point out about using a PM like this: you won't get
fantastic bandwidth out of it. For my needs (a home storage server),
this really doesn't matter; I just want oodles of online storage, with
redundancy and reliability.

Cheers

Tom

Miroslav Lachman

Feb 9, 2010, 9:37:52 AM
to Jeremy Chadwick, freebsd...@freebsd.org
Jeremy Chadwick wrote:
> On Tue, Feb 09, 2010 at 06:53:26AM -0600, Karl Denninger wrote:

[...]

I can't agree with the last statement about HP's iLO. I have an add-on
card in an ML110 G5 (dedicated NIC); the card is "expensive" and the
bugs are amazing. The management NIC freezes once a day (or more often)
with older firmware and must be restarted from inside the installed
system by an IPMI command on "localhost". With newer firmware, the
interface is periodically restarted. The virtual media doesn't work at
all. It is my worst experience with remote management cards.
I believe that other HP servers with a built-in card and different
firmware work better; this is just my experience.

Next is the eLOM in a Sun Fire X2100 (shared NIC using bge + ASF). ASF
works without problems, but virtual media works only if you are
connecting by IP address, not by domain name (from Windows machines),
and there is some issue with timeouts of the virtual media / console.
I reported this plus 8 different bugs in the web management interface to
Sun more than a year ago; none were fixed.

Next place goes to the IBM 3650 + RSA II card (dedicated NIC).
Expensive; some things work, some don't. For example, the card can't
read the CPU temperature, so you will not receive any alert in case of
overheating. (That was 2 years ago; maybe newer firmware has fixed it.)

Then I have one Supermicro Twin server 6016TT-TF with built-in IPMI /
KVM and a dedicated NIC port. I found one bug with the fan RPM readings
(half the number compared to the BIOS numbers) and one problem with
FreeBSD 7.x sysinstall (USB keyboard not working, but sysinstall from
8.x works without problems). In the installed FreeBSD system the
keyboard and virtual media work without problems.

At the top is the Dell R610 DRAC (dedicated NIC) - I didn't find any
bugs, and there are a lot more features compared to competing products.

Miroslav Lachman

Dan Langille

Feb 9, 2010, 10:01:23 AM
to Tom Evans, Charles Sprickman, FreeBSD Stable, Dan Langille


A PM? What's that?

Yes, my priority is reliable storage. Speed is secondary.

What bandwidth are you getting?

Jeremy Chadwick

Feb 9, 2010, 10:07:35 AM
to freebsd...@freebsd.org
On Tue, Feb 09, 2010 at 10:01:23AM -0500, Dan Langille wrote:
> On Tue, February 9, 2010 9:09 am, Tom Evans wrote:
> > On Tue, Feb 9, 2010 at 1:45 PM, Dan Langille <d...@langille.org> wrote:
> > One thing to point out about using a PM like this: you won't get
> > fantastic bandwidth out of it. For my needs (home storage server),
> > this really doesn't matter, I just want oodles of online storage, with
> > redundancy and reliability.
>
> A PM? What's that?

Port multiplier.

Svein Skogen (Listmail Account)

Feb 9, 2010, 10:02:59 AM
to freebsd...@freebsd.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09.02.2010 15:37, Miroslav Lachman wrote:
*SNIP*

I think the general consensus here is "nice theory, lousy
implementation", with the added migraine that there is no such thing as
a common standard.

Maybe creating a common standard for this could be a nice GSoC project:
building a nice "remote console" based on SSH and ARM/MIPS?

p.s. I've seen the various proprietary remote console solutions. They
didn't really impress me much, so I ended up using off-the-shelf
components for building my servers. Not necessarily cheaper, but at
least it's under _MY_ control.

//Svein

iEYEARECAAYFAktxeSIACgkQODUnwSLUlKQrFgCgoWo9wjqQoQMUe2WmTm8wwB19
1QYAoKHy8i8B+sBd6eCkAN+hdfMscJW4
=gzs3
-----END PGP SIGNATURE-----

Tom Evans

Feb 9, 2010, 10:16:13 AM
to Dan Langille, Charles Sprickman, FreeBSD Stable

PM = Port Multiplier

I'm getting disk speed, as I only have one device behind the PM
currently (just making sure it works properly :). The limits are that
the link from the siis card to the PM is SATA (3Gb/s, 375MB/s), and the
siis card sits on a PCIe 1x bus (2Gb/s, 250MB/s), so that bandwidth is
shared amongst the up to 5 disks behind the PM.

Writing from /dev/zero to the pool, I get around 120MB/s. Reading from
the pool, and writing to /dev/null, I get around 170 MB/s.
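
For reference, numbers like these typically come from something along the
lines of the commands below (file name and sizes are arbitrary, and as noted
elsewhere in the thread, caching and compression can skew the results):

    # sequential write test, then read the file back
    dd if=/dev/zero of=/tank/testfile bs=1m count=8192
    dd if=/tank/testfile of=/dev/null bs=1m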

Cheers

Tom

Peter C. Lai

Feb 9, 2010, 11:18:19 AM
to Dan Langille, Charles Sprickman, FreeBSD Stable
On 2010-02-09 06:37:47AM -0500, Dan Langille wrote:
> Charles Sprickman wrote:
>> On Mon, 8 Feb 2010, Dan Langille wrote:
> > Also, it seems like
>> people who use zfs (or gmirror + gstripe) generally end up buying pricey
>> hardware raid cards for compatibility reasons. There seem to be no decent
>> add-on SATA cards that play nice with FreeBSD other than that weird
>> supermicro card that has to be physically hacked about to fit.

Mostly only because certain cards have issues with shoddy JBOD
implementations. Some cards (most notably ones like the Adaptec 2610A,
which was rebranded by Dell as the "CERC SATA 1.5/6ch" back in the day)
won't let you run the drives in passthrough mode and all seem to want to
stick their grubby little RAID paws into your JBOD setup (i.e. the only
way to have minimal participation from the "hardware" RAID is to set
each disk as its own RAID-0 volume in the controller BIOS), which then
cascades into issues with SMART, AHCI, "triple caching"/write
reordering, etc. on the FreeBSD side (the controller's own craptastic
cache, the ZFS vdev cache, the vmm/app cache, oh my!). So *some* people
go with something tried-and-true (basically bordering on server-level
cards that let you ditch any BIOS type of RAID config and present the
raw disk devices to the kernel).

>
> They use software RAID and hardware RAID at the same time? I'm not sure
> what you mean by this. Compatibility with FreeBSD?


--
===========================================================
Peter C. Lai | Bard College at Simon's Rock
Systems Administrator | 84 Alford Rd.
Information Technology Svcs. | Gt. Barrington, MA 01230 USA
peter AT simons-rock.edu | (413) 528-7428
===========================================================

Boris Kochergin

Feb 9, 2010, 11:35:07 AM
to Peter C. Lai, Charles Sprickman, FreeBSD Stable, Dan Langille
Peter C. Lai wrote:
> On 2010-02-09 06:37:47AM -0500, Dan Langille wrote:
>
>> Charles Sprickman wrote:
>>
>>> On Mon, 8 Feb 2010, Dan Langille wrote:
>>> Also, it seems like
>>> people who use zfs (or gmirror + gstripe) generally end up buying pricey
>>> hardware raid cards for compatibility reasons. There seem to be no decent
>>> add-on SATA cards that play nice with FreeBSD other than that weird
>>> supermicro card that has to be physically hacked about to fit.
>>>
>
> Mostly only because certain cards have issues w/shoddy JBOD implementation.
> Some cards (most notably ones like Adaptec 2610A which was rebranded by
> Dell as the "CERC SATA 1.5/6ch" back in the day) won't let you run the
> drives in passthrough mode and seem to all want to stick their grubby
> little RAID paws into your JBOD setup (i.e. the only way to have minimal
> participation from the "hardware" RAID is to set each disk as its own
> RAID-0/volume in the controller BIOS) which then cascades into issues with
> SMART, AHCI, "triple caching"/write reordering, etc on the FreeBSD side (the
> controller's own craptastic cache, ZFS vdev cache, vmm/app cache, oh my!).
> So *some* people go with something tried-and-true (basically bordering on
> server-level cards that let you ditch any BIOS type of RAID config and
> present the raw disk devices to the kernel)
As someone else has mentioned, recent SiI stuff works well. I have
multiple http://www.newegg.com/Product/Product.aspx?Item=N82E16816132008
cards servicing RAID-Z2 and GEOM_RAID3 arrays on 8.0-RELEASE and
8.0-STABLE machines, using both the old ata(4) driver and ATA_CAM. Don't
let the RAID label scare you - that stuff is off by default and the
controller just presents the disks to the operating system. Hot swap
works. I haven't had the time to try the siis(4) driver with them, which
would result in better performance.
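
If you do want to try the siis(4) path, the CAM-based drivers can be loaded at
boot from /boot/loader.conf; a sketch for 8.x (adjust to the hardware you
actually have):

    # /boot/loader.conf: use the newer CAM-based SATA drivers
    siis_load="YES"
    ahci_load="YES"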

-Boris

Dan Langille

Feb 9, 2010, 11:29:50 AM
to Tom Evans, Charles Sprickman, FreeBSD Stable, Dan Langille

That leads me to conclude that a number of SATA cards is better than a
port multiplier. But the impression I'm getting is that few of these
work well with FreeBSD. Which is odd... I thought these cards would
merely present the HDDs to the host and no driver would be required, as
opposed to RAID cards, for which OS-specific drivers are required.

Peter C. Lai

Feb 9, 2010, 11:31:55 AM
to Tom Evans, Charles Sprickman, FreeBSD Stable, Dan Langille
That's faster than just about anything I have at home.
So you should be fine. It should be good enough to serve as primary media
center storage even (for retrievals, anyway, probably a tad bit slow for
live transcoding).

Also, does anybody know whether benchmarking dd if=/dev/zero onto a ZFS
volume that has compression turned on might affect what dd (which only
knows what it gets from vfs/vmm) reports?


--

Freddie Cash

Feb 9, 2010, 1:04:26 PM
to FreeBSD Stable

Add-on (PCI-X/PCIe) RAID controllers tend to have solid drivers in
FreeBSD. Add-on SATA controllers, not so much. The RAID controllers also
tend to support more SATA features like NCQ, hot-swap, monitoring, etc.
They also enable you to use the same hardware across OSes (FreeBSD,
Linux, etc).

For example, we use 3Ware controllers in all our servers, as they have good,
solid support under FreeBSD and Linux. On the Linux servers, we use
hardware RAID. On the FreeBSD servers, we use them as SATA controllers
(Single Disk arrays, not JBOD). Either way, the management is the same, the
drivers are the same, the support is the same.

It's hard to find good, non-RAID, SATA controllers with solid FreeBSD
support, and good throughput, with any kind of management/monitoring
features.

--
Freddie Cash
fjw...@gmail.com

Andre Wensing

Feb 9, 2010, 1:52:05 PM
to FreeBSD Stable

And I thought I had found one in the Adaptec 1405 integrated SAS/SATA
controller, because it's marketed as an inexpensive SAS/SATA non-RAID
add-on card. On top of that, they advertise it as having FreeBSD 6 and
FreeBSD 7 support and drivers. So I ordered it for my storage box
(FreeNAS) with great expectations. Sadly, they have neither support nor
drivers for FreeBSD ("drivers will be released Q4 2009") at all, so I'm
thinking of leaving FreeNAS and trying some Linux flavor that does
support this card...
But Adaptec doesn't have a great track record for FreeBSD support, does it?

André Wensing

Peter C. Lai

Feb 9, 2010, 2:27:07 PM
to Andre Wensing, FreeBSD Stable

Everything is a repackage of some OEM these days (basically gone are the
days of Adaptec==LSI==mpt(4) and all). Find the actual chipset make and
model if you can; then you can look at what is supported, as the drivers
deal with the actual chipset and couldn't care less about the brand of
some vertically-integrated FPGA package.

You'll probably want to give a shout-out on -hardware, e.g.:
http://markmail.org/message/b5imismi5s3iafc5#query:+page:1+mid:5htpj5fw7uijtzqp+state:results

Matthew Dillon

Feb 9, 2010, 2:49:55 PM
to FreeBSD Stable
Consider the Silicon Image 3124A chipset (the PCIe version of the 3124;
the original 3124 was PCI-X). The 3124A's are starting to make their way
into distribution channels. This is probably the best 'cheap' solution
which offers fully concurrent multi-target NCQ operation through a port
multiplier enclosure, with more than the PCIe 1x bus the ultra-cheap
3132 offers. I think the 3124A uses an 8x bus (not quite sure, but it
is more than 1x).

AHCI on-motherboard with equivalent capabilities do not appear to be
in wide distribution yet. Most AHCI chips can do NCQ to a single
target (even a single target behind a PM), but not concurrently to
multiple targets behind a port multiplier. Even though SATA bandwidth
constraints might seem to make this a reasonable alternative it
actually isn't because any seek heavy activity to multiple drives
will be serialized and perform EXTREMELY poorly. Linear performance
will be fine. Random performance will be horrible.

It should be noted that while hotswap is supported with silicon image
chipsets and port multiplier enclosures (which also use Sili chips in
the enclosure), the hot-swap capability is not anywhere near as robust
as you would find with a more costly commercial SAS setup. SI chips
are very poorly made (this is the same company that went bust under
another name a few years back due to shoddy chipsets), and have a lot
of on-chip hardware bugs, but fortunately OSS driver writers (linux
guys) have been able to work around most of them. So even though the
chipset is a bit shoddy actual operation is quite good. However,
this does mean you generally want to idle all activity on the enclosure
to safely hot swap anything, not just the drive you are pulling out.
I've done a lot of testing and hot-swapping an idle disk while other
drives in the same enclosure are hot is not reliable (for a cheap port
multiplier enclosure using a Sili chip inside, which nearly all do).

Also, a disk failure within the enclosure can create major command
sequencing issues for other targets in the enclosure because error
processing has to be serialized. Fine for home use but don't expect
miracles if you have a drive failure.

The Sili chips and port multiplier enclosures are definitely the
cheapest multi-disk solution. You lose on aggregate bandwidth and
you lose on some robustness but you get the hot-swap basically for free.

--

Multi-HD setups for home use are usually a lose. I've found over
the years that it is better to just buy a big whopping drive and
then another one or two for backups and not try to gang them together
in a RAID. And yes, at one time in the past I was running three
separate RAID-5 using 3ware controllers. I don't anymore and I'm
a lot happier.

If you have more than 2TB worth of critical data you don't have much
of a choice, but I'd go with as few physical drives as possible
regardless. The 2TB Maxtor green or black drives are nice. I
strongly recommend getting the highest-capacity drives you can
afford if you don't want your power bill to blow out your budget.

The bigger problem is always having an independent backup of the data.
Depending on a single-instanced filesystem, even one like ZFS, for a
lifetime's worth of data is not a good idea. Fire, theft... there are
a lot of ways the data can be lost. So when designing the main
system you have to take care to also design the backup regimen
including something off-site (or swapping the physical drive once
a month, etc). i.e. multiple backup regimens.

If single-drive throughput is an issue then using ZFS's caching
solution with a small SSD is the way to go (and yes, DragonFly has an
SSD caching solution now too, but that's not pertinent to this thread).
The Intel SSDs are really nice, but I am singularly unimpressed with
the OCZ Colossus drives, which don't even negotiate NCQ. I don't know
much re: other vendors.

A little $100 Intel 40G SSD has around a 40TB write endurance and can
last 10 years as a disk meta-data caching environment with a little care,
particularly if you only cache meta-data. A very small incremental
cost gives you 120-200MB/sec of seek-agnostic bandwidth which is
perfect for network serving, backup, remote filesystems, etc. Unless
the box has 10GigE or multiple 1xGigE network links there's no real
need to try to push HD throughput beyond what the network can do
so it really comes down to avoiding thrashing the HDs with random seeks.
That is what the small SSD cache gives you. It can be like night and
day.
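
For the ZFS side of that, attaching a small SSD as an L2ARC cache device, and
optionally restricting it to metadata as suggested above, is roughly the
following (device name hypothetical, and the secondarycache property requires
a ZFS version that has it):

    # add the SSD as an L2ARC cache device and keep only metadata on it
    zpool add tank cache ada2
    zfs set secondarycache=metadata tank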

-Matt

Christian Weisgerber

Feb 9, 2010, 4:33:37 PM
to freebsd...@freebsd.org
Matthew D. Fuller <full...@over-yonder.net> wrote:

> > I have something similar (5x1Tb) - I have a Gigabyte GA-MA785GM-US2H
> > with an Athlon X2 and 4Gb of RAM (only half filled - 2x2Gb)
> >
> > Note that it doesn't support ECC, I don't know if that is a problem.
>
> How's that? Is the BIOS just stupid, or is the board physically
> missing traces?

Doesn't matter really, does it?

I have a GA-MA78G-DS3H. According to the specs, it supports ECC
memory. And that is all the mention of ECC you will find anywhere.
There is nothing in the BIOS. My best guess is that they quite
literally mean that you can plug ECC memory into the board and it
will work, but that there are no provisions to actually use ECC.

That said, I also have an Asus M2N-SLI Deluxe. If I enable ECC in
the BIOS, the board locks up sooner or later, even when just sitting
in the BIOS. memtest86 dies a screaming death immediately. When
I disable ECC, the board is solid, both in actual use and with
memtest.

I thought that if I built a PC from components, I'd already be a step
above the lowest dregs of the consumer market, but apparently not.

--
Christian "naddy" Weisgerber na...@mips.inka.de

Charles Sprickman

Feb 9, 2010, 5:21:07 PM
to Dan Langille, FreeBSD Stable

From what I've seen on this list, people buy a nice Areca or 3Ware card,
put it in JBOD mode, and run ZFS on top of the drives. The card is
just being used to get lots of SATA ports with a stable driver and known
good hardware. I've asked here a few times in the last few years for
recommendations on a cheap SATA card and it seems like such a thing does
not exist. This might be a bit dated at this point, but you're playing
it safe if you go with a 3ware/Areca/LSI card.

I don't recall all the details, but there were issues with SiI,
HighPoint, etc. IIRC it was not really FreeBSD's issue, but bugginess in
those cards. The Intel ICH9 chipset works well, but there are no add-on
PCIe cards that have an Intel chip on them...

I'm sure someone will correct me if my info is now outdated or flat-out
wrong. :)

Charles

Charles Sprickman

Feb 9, 2010, 5:32:02 PM
to Jeremy Chadwick, freebsd...@freebsd.org

I have a box down at Softlayer (one of the few major server rental outfits
that officially supports FreeBSD), and one of the reasons I went with them
is that they advertised "IP-KVM support". Turns out they run Supermicro
boxes with the IPMI card. It mostly works, but it is very quirky and you
have to use a very wonky Java client app to get the remote console. You
have to build a kernel that omits certain USB devices to make the keyboard
work over the KVM connection (and their stock FBSD install has it
disabled).

I can usually get in, but sometimes I have to open a ticket with them
and a tech does some kind of reset on the card. I don't know if they are
hitting a button on the card/chassis or if they have some way to do this
remotely. After they do that, I'll see something like this in dmesg:

umass0: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 2> on
uhub4
ums0: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 2> on
uhub4
ums0: 3 buttons and Z dir.
ukbd0: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 2> on
uhub4
kbd2 at ukbd0

The umass device is to support the "virtual media" feature that simply
does not work. It's supposed to allow you to point the IPMI card at an
ISO or disk image on an SMB server and boot your server off of it. I had
no luck with this.

All the IPMI power on/off, reset, and hw monitoring functions do work well
though.

> It gets worse when you use one of the IPMI modules that piggybacks on an
> existing Ethernet port -- the NIC driver for the OS, from the ground up,
> has to be fully aware of ASF and any quirks/oddities involved. For
> example, on bge(4) and bce(4), you'll find this (bge mentioned below):
>
> hw.bge.allow_asf
> Allow the ASF feature for cooperating with IPMI. Can cause sys-
> tem lockup problems on a small number of systems. Disabled by
> default.
>
> So unless the administrator intentionally sets the loader tunable prior
> to booting the OS installation, they'll find all kinds of MAC problems
> as a result of the IPMI piggybacking. "Why isn't this enabled by
> default?" I believe because there were reports of failures/problems on
> people's systems who *did not* have IPMI cards. Lose-lose situation.

I don't think they have this setup, or if they do, they are using it on
the internal LAN, so I don't notice any weirdness.

> If you really want me to dig up people at Yahoo who have dealt with IPMI
> on thousands of Supermicro servers and the insanity involved (due to
> bugs, quirks, or implementation differences between the IPMI firmwares
> and which revision/model of module used), I can do so. Most of the
> complaints I've heard of stem from serial-over-IPMI. I don't think
> it'd be a very positive/"supportive" thread, however. :-)
>
> One similar product that does seem to work well is iLO, available on
> HP/Compaq hardware.

I've heard great things about that. It seems like a much better design -
it's essentially a small server that is independent of the main host. It
has its own LAN and serial ports as well.

Charles

> --
> | Jeremy Chadwick j...@parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, USA |
> | Making life hard for others since 1977. PGP: 4BD6C0CB |
>

Peter C. Lai

Feb 9, 2010, 5:56:32 PM
to Charles Sprickman, freebsd...@freebsd.org, Jeremy Chadwick

Dell PowerEdge Remote Access (DRAC) cards also provide this, and for a
while there, you could actually VNC into them. But HP offers iLO at no
extra charge (and no discount upon removal; DRACs are worth about $250),
and it has become such a prominent "must-have" datacenter feature that
the "iLO" term is beginning to become genericized for web-accessible,
virtual-disc-capable, onboard out-of-band IP-console management.

Daniel O'Connor

Feb 9, 2010, 6:57:03 PM
to freebsd...@freebsd.org, Christian Weisgerber
On Wed, 10 Feb 2010, Christian Weisgerber wrote:
> Matthew D. Fuller <full...@over-yonder.net> wrote:
> > > I have something similar (5x1Tb) - I have a Gigabyte
> > > GA-MA785GM-US2H with an Athlon X2 and 4Gb of RAM (only half
> > > filled - 2x2Gb)
> > >
> > > Note that it doesn't support ECC, I don't know if that is a
> > > problem.
> >
> > How's that? Is the BIOS just stupid, or is the board physically
> > missing traces?
>
> Doesn't matter really, does it?
>
> I have a GA-MA78G-DS3H. According to the specs, it supports ECC
> memory. And that is all the mention of ECC you will find anywhere.
> There is nothing in the BIOS. My best guess is that they quite
> literally mean that you can plug ECC memory into the board and it
> will work, but that there are no provisions to actually use ECC.

FWIW I can't see ECC support listed for that board on Gigabyte's
website... (vs the GA-MA770T-UD3P, which does list ECC as supported,
though that is a DDR3 board)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C


Emil Mikulic

Feb 9, 2010, 7:40:03 PM
to Peter C. Lai, FreeBSD Stable
On Tue, Feb 09, 2010 at 11:31:55AM -0500, Peter C. Lai wrote:
> Also does anybody know if benching dd if=/dev/zero onto a zfs volume that
> has compression turned on might affect what dd (which is getting what it
> knows from vfs/vmm) might report?

Absolutely!

Compression on:
4294967296 bytes transferred in 16.251397 secs (264282961 bytes/sec)
4294967296 bytes transferred in 16.578707 secs (259065276 bytes/sec)
4294967296 bytes transferred in 16.178586 secs (265472353 bytes/sec)
4294967296 bytes transferred in 16.069003 secs (267282747 bytes/sec)

Compression off:
4294967296 bytes transferred in 58.248351 secs (73735432 bytes/sec)
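
For context, the property being toggled between those runs is a one-liner to
change (dataset name hypothetical; lzjb was the default algorithm of that era):

    # enable or disable compression on the dataset under test
    zfs set compression=on tank/test
    zfs set compression=off tank/test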

Dan Langille

Feb 9, 2010, 11:28:57 PM
to Boris Kochergin, Peter C. Lai, Charles Sprickman, FreeBSD Stable

That's a really good price. :)

If needed, I could host all eight SATA drives for $160, much cheaper
than any of the other RAID cards I've seen.

The issue then is finding a motherboard which has 4x PCI Express slots. ;)

Dan Langille

Feb 10, 2010, 12:02:49 AM
to Matthew Dillon, FreeBSD Stable
Trying to make sense of stuff I don't know about...

Matthew Dillon wrote:
>
> AHCI on-motherboard with equivalent capabilities do not appear to be
> in wide distribution yet. Most AHCI chips can do NCQ to a single
> target (even a single target behind a PM), but not concurrently to
> multiple targets behind a port multiplier. Even though SATA bandwidth
> constraints might seem to make this a reasonable alternative it
> actually isn't because any seek heavy activity to multiple drives
> will be serialized and perform EXTREMELY poorly. Linear performance
> will be fine. Random performance will be horrible.

Don't use a port multiplier and this goes away. I was hoping to avoid a
PM and using something like the Syba PCI Express SATA II 4 x Ports RAID
Controller seems to be the best solution so far.

http://www.amazon.com/Syba-Express-Ports-Controller-SY-PEX40008/dp/B002R0DZWQ/ref=sr_1_22?ie=UTF8&s=electronics&qid=1258452902&sr=1-22

>
> It should be noted that while hotswap is supported with silicon image
> chipsets and port multiplier enclosures (which also use Sili chips in
> the enclosure), the hot-swap capability is not anywhere near as robust
> as you would find with a more costly commercial SAS setup. SI chips
> are very poorly made (this is the same company that went bust under
> another name a few years back due to shoddy chipsets), and have a lot
> of on-chip hardware bugs, but fortunately OSS driver writers (linux
> guys) have been able to work around most of them. So even though the
> chipset is a bit shoddy actual operation is quite good. However,
> this does mean you generally want to idle all activity on the enclosure
> to safely hot swap anything, not just the drive you are pulling out.
> I've done a lot of testing and hot-swapping an idle disk while other
> drives in the same enclosure are hot is not reliable (for a cheap port
> multiplier enclosure using a Sili chip inside, which nearly all do).

What I'm planning to use is an SATA enclosure but I'm pretty sure a port
multiplier is not involved:

http://www.athenapower.us/web_backplane_zoom/bp_sata3141b.html

> Also, a disk failure within the enclosure can create major command
> sequencing issues for other targets in the enclosure because error
> processing has to be serialized. Fine for home use but don't expect
> miracles if you have a drive failure.

Another reason to avoid port multipliers.

Niki Denev

unread,
Feb 10, 2010, 12:57:27 AM2/10/10
to Peter C. Lai, freebsd...@freebsd.org

I thought that their VNC implementation is non-standard, and
I wasn't able to VNC into them, at least on the latest Core i7 models.

Pieter de Goeje

unread,
Feb 10, 2010, 5:27:53 AM2/10/10
to freebsd...@freebsd.org, Peter C. Lai, Charles Sprickman, Boris Kochergin, Dan Langille

You should be able to put PCIe 4x card in a PCIe 16x or 8x slot.
For an explanation allow me to quote wikipedia:

"A PCIe card will fit into a slot of its physical size or bigger, but may not
fit into a smaller PCIe slot. Some slots use open-ended sockets to permit
physically longer cards and will negotiate the best available electrical
connection. The number of lanes actually connected to a slot may also be less
than the number supported by the physical slot size. An example is a x8 slot
that actually only runs at ×1; these slots will allow any ×1, ×2, ×4 or ×8
card to be used, though only running at the ×1 speed. This type of socket is
described as a ×8 (×1 mode) slot, meaning it physically accepts up to ×8 cards
but only runs at ×1 speed. The advantage gained is that a larger range of PCIe
cards can still be used without requiring the motherboard hardware to support
the full transfer rate—in so doing keeping design and implementation costs
down."

-- Pieter

Jeremy Chadwick

unread,
Feb 10, 2010, 5:55:16 AM2/10/10
to freebsd...@freebsd.org
> the full transfer rate—in so doing keeping design and implementation costs
> down."

Correction -- more than likely on a consumer motherboard you *will not*
be able to put a non-VGA card into the PCIe x16 slot. I have numerous
Asus and Gigabyte motherboards which only accept graphics cards in their
PCIe x16 slots; this """feature""" is documented in user manuals. I
don't know how/why these companies chose to do this, but whatever.

I would strongly advocate that the OP (who has stated he's focusing on
stability and reliability over speed) purchase a server motherboard that
has a PCIe x8 slot on it and/or server chassis (usually best to buy both
of these things from the same vendor) and be done with it.

Gót András

unread,
Feb 10, 2010, 6:06:34 AM2/10/10
to freebsd...@freebsd.org, Jeremy Chadwick
>> example is a x8 slot that actually only runs at ×1; these slots will
>> allow any ×1, ×2, ×4 or ×8 card to be used, though only running at the
>> ×1 speed. This type of socket is
>> described as a ×8 (×1 mode) slot, meaning it physically accepts up to ×8
>> cards but only runs at ×1 speed. The advantage gained is that a larger

>> range of PCIe cards can still be used without requiring the motherboard
>> hardware to support the full transfer rate—in so doing keeping design
>> and implementation costs down."
>
> Correction -- more than likely on a consumer motherboard you *will not*
> be able to put a non-VGA card into the PCIe x16 slot. I have numerous Asus
> and Gigabyte motherboards which only accept graphics cards in their PCIe
> x16 slots; this """feature""" is documented in user manuals. I don't know
> how/why these companies chose to do this, but whatever.
>
> I would strongly advocate that the OP (who has stated he's focusing on
> stability and reliability over speed) purchase a server motherboard that
> has a PCIe x8 slot on it and/or server chassis (usually best to buy both
> of these things from the same vendor) and be done with it.

Hi,

We're running an 'old' LSI U320 x4 (or x8) PCIe hw raid card in a simple
Gigabyte mobo without any problems. It was plug and play. The mobo has
some P35 chipset and an E7400 CPU. If the exact types are needed I'll look
them up. (And yes, the good old U320 SCSI is lightning fast compared to
any new SATA drives and only 3x36GB disks are in raid5. I know that it
won't win the capacity contest... :) )

I think these single-CPU server boards are quite overpriced considering
the few extra features that would make someone buy them.

Anyway, I liked that Atom D510 supermicro mobo that was mentioned earlier.
I think it would handle any good PCIe cards and would fit in a nice
Supermicro tower. I'd also suggest going with as few disks as you can. 2TB
disks are here, so you can make a 4TB R5 array with only 3 and your power
bill won't wipe out your bank account.

Regards,
Andras Got

Miroslav Lachman

unread,
Feb 10, 2010, 8:30:54 AM2/10/10
to Svein Skogen (Listmail Account), freebsd...@freebsd.org

Does anybody have experience with the ATEN IP8000 card?
I found it today
http://www.aten.com/products/productItem.php?pcid=2006041110563001&psid=20060411131311002&pid=20080401180847001&layerid=subClass1

It is not cheap, but it seems like a universal solution for any motherboard
with a PCI slot.

"Host-side OS support - Windows 2000/2003/XP
/NT/VistaRedhat 7.1 and above; FreeBSD, Novell"

Miroslav Lachman

Jeremy Chadwick

unread,
Feb 10, 2010, 8:51:41 AM2/10/10
to freebsd...@freebsd.org

There's also the PC Weasel[1], which does VGA-to-serial and provides
reset/power-cycle capability over the serial port. 100% OS-independent.
The concept itself is really cool[2], but there are 3 major problems:

1) PCI version is 5V; some systems are limited to 3.3V PCI slots (see
Wikipedia: http://en.wikipedia.org/wiki/File:PCI_Keying.png) -- not
to mention lots of systems are doing away with PCI altogether (in
servers especially)
2) Limited to 38400 bps throughput (I run serial consoles at 115200),
3) Very expensive -- US$350 *per card*.

I'm surprised no one else has come up with a similar solution, especially
given how common DSPs, CPLDs, and FPGAs are in this day and age.

[1]: http://www.realweasel.com/intro.html
[2]: http://www.realweasel.com/design.html

Matthew Dillon

unread,
Feb 10, 2010, 11:35:44 AM2/10/10
to freebsd...@freebsd.org
:Correction -- more than likely on a consumer motherboard you *will not*

:be able to put a non-VGA card into the PCIe x16 slot. I have numerous
:Asus and Gigabyte motherboards which only accept graphics cards in their
:PCIe x16 slots; this """feature""" is documented in user manuals. I
:don't know how/why these companies chose to do this, but whatever.
:
:I would strongly advocate that the OP (who has stated he's focusing on
:stability and reliability over speed) purchase a server motherboard that
:has a PCIe x8 slot on it and/or server chassis (usually best to buy both
:of these things from the same vendor) and be done with it.
:
:--
:| Jeremy Chadwick j...@parodius.com |

It is possible this is related to the way Intel on-board graphics
work in recent chipsets. e.g. i915 or i925 chipsets. The
on-motherboard video uses a 16-lane internal PCI-e connection which
is SHARED with the 16-lane PCI-e slot. If you plug something into
the slot (e.g. a graphics card), it disables the on-motherboard
video. I'm not sure if the BIOS can still boot if you plug something
other than a video card into these MBs and no video at all is available.
Presumably it should be able to, you just wouldn't have any video at
all.

Insofar as I know AMD-based MBs with on-board video don't have this
issue, though it should also be noted that AMD-based MBs tend to be
about 6-8 months behind Intel ones in terms of features.

-Matt

Boris Kochergin

unread,
Feb 10, 2010, 1:35:06 PM2/10/10
to Dan Langille, Peter C. Lai, Charles Sprickman, FreeBSD Stable
If you want to go this route, I bought one a while ago so that I could
stuff as many dual-port Gigabit Ethernet controllers into it as possible
(it was a SPAN port replicator):
http://www.newegg.com/Product/Product.aspx?Item=N82E16813130136. Newegg
doesn't carry it anymore, but if you can find it elsewhere, I can vouch
for its stability:

# uptime
1:20PM up 494 days, 5:23, 1 user, load averages: 0.05, 0.07, 0.05

In my setups with those Silicon Image cards, though, they serve as
additional controllers, with the following onboard SATA controllers
being used to provide most of the ports:

SB600 (AMD/ATI)
SB700 (AMD/ATI)
ICH9 (Intel)
63XXESB2 (Intel)

I haven't had any problems with any of them.

-Boris

Dan Langille

unread,
Feb 10, 2010, 1:46:42 PM2/10/10
to Boris Kochergin, FreeBSD Stable

I don't know what the above means.

I think it means you are primarily using the onboard SATA controllers and
have those Silicon Image cards providing additional ports where required.

>
> SB600 (AMD/ATI)
> SB700 (AMD/ATI)
> ICH9 (Intel)
> 63XXESB2 (Intel)

These are the chipsets on that motherboard?

Boris Kochergin

unread,
Feb 10, 2010, 2:10:12 PM2/10/10
to Dan Langille, FreeBSD Stable
Correct.

>
>>
>> SB600 (AMD/ATI)
>> SB700 (AMD/ATI)
>> ICH9 (Intel)
>> 63XXESB2 (Intel)
>
> These are the chipsets on that motherboard?
Those are the SATA controller chipsets. Here are the corresponding
chipsets advertised on the motherboards, in north bridge/south bridge form:

SB600 SATA: AMD 770/AMD SB600
SB700 SATA: AMD SR5690/AMD SP5100
ICH9 SATA: Intel 3200/Intel ICH9
63XXESB2 SATA: Intel 5000X/Intel ESB2

-Boris

David N

unread,
Feb 10, 2010, 2:10:03 PM2/10/10
to Christian Weisgerber, freebsd...@freebsd.org
> _______________________________________________
> freebsd...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"
>

I had an M2A-VM HDMI that had the ECC problem, ASUS released a BIOS
update for it, not sure for the M2N if they fixed that problem.

From what I've seen, most ASUS boards have the ECC option, don't take
my word for it though.

Regards
David N

Steve Polyack

unread,
Feb 10, 2010, 2:40:04 PM2/10/10
to Dan Langille, FreeBSD Stable
On 2/10/2010 12:02 AM, Dan Langille wrote:
> Trying to make sense of stuff I don't know about...
>
> Matthew Dillon wrote:
>>
>> AHCI on-motherboard with equivalent capabilities do not appear to be
>> in wide distribution yet. Most AHCI chips can do NCQ to a single
>> target (even a single target behind a PM), but not concurrently to
>> multiple targets behind a port multiplier. Even though SATA
>> bandwidth
>> constraints might seem to make this a reasonable alternative it
>> actually isn't because any seek heavy activity to multiple drives
>> will be serialized and perform EXTREMELY poorly. Linear performance
>> will be fine. Random performance will be horrible.
>
> Don't use a port multiplier and this goes away. I was hoping to avoid
> a PM and using something like the Syba PCI Express SATA II 4 x Ports
> RAID Controller seems to be the best solution so far.
>
> http://www.amazon.com/Syba-Express-Ports-Controller-SY-PEX40008/dp/B002R0DZWQ/ref=sr_1_22?ie=UTF8&s=electronics&qid=1258452902&sr=1-22
>

Dan, I can personally vouch for these cards under FreeBSD. We have 3 of
them in one system, with almost every port connected to a port
multiplier (SiI5xxx PMs). Using the siis(4) driver on 8.0-RELEASE
provides very good performance, and supports both NCQ and FIS-based
switching (an essential for decent port-multiplier performance).

One thing to consider, however, is that the card is only single-lane
PCI-Express. The bandwidth available is only 2.5Gb/s (~312MB/sec,
slightly less than that of the SATA-2 link spec), so if you have 4
high-performance drives connected, you may hit a bottleneck at the
bus. I'd be particularly interested if anyone can find any similar
Silicon Image SATA controllers with a PCI-E 4x or 8x interface ;)
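
(For the record, the arithmetic: a PCIe 1.0 lane signals at 2.5 Gbit/s, and
2.5 Gbit/s divided by 8 is roughly 312 MB/s; after the 8b/10b line encoding
the usable payload is closer to 250 MB/s, and protocol overhead takes it
lower still, which is why sustained transfers through an x1 card tend to top
out around 200 MB/s in practice.)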

>
>>
>> It should be noted that while hotswap is supported with silicon
>> image
>> chipsets and port multiplier enclosures (which also use Sili
>> chips in
>> the enclosure), the hot-swap capability is not anywhere near as
>> robust
>> as you would find with a more costly commercial SAS setup. SI chips
>> are very poorly made (this is the same company that went bust under
>> another name a few years back due to shoddy chipsets), and have a
>> lot
>> of on-chip hardware bugs, but fortunately OSS driver writers (linux
>> guys) have been able to work around most of them. So even though
>> the
>> chipset is a bit shoddy actual operation is quite good. However,
>> this does mean you generally want to idle all activity on the
>> enclosure
>> to safely hot swap anything, not just the drive you are pulling out.
>> I've done a lot of testing and hot-swapping an idle disk while other
>> drives in the same enclosure are hot is not reliable (for a cheap
>> port
>> multiplier enclosure using a Sili chip inside, which nearly all do).
>
>

I haven't had such bad experience as the above, but it is certainly a
concern. Using ZFS we simply 'offline' the device, pull, replace with a
new one, glabel, and zfs replace. It seems to work fine as long as
nothing is accessing the device you are replacing (otherwise you will
get a kernel panic a few minutes down the line). m...@FreeBSD.org has
also committed a large patch set to 9-CURRENT which implements "proper"
SATA/AHCI hot-plug support and error-recovery through CAM.
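
For anyone following along, that replacement sequence is roughly the
following, assuming the pool was built on glabel names and the new disk
appears at the same device node (all names below are placeholders):

# zpool offline tank label/disk3
(swap the physical drive)
# glabel label disk3 ada3
# zpool replace tank label/disk3

zpool status will then show the resilver progress.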

-Steve Polyack

Dmitry Morozovsky

unread,
Feb 10, 2010, 3:20:37 PM2/10/10
to Dan Langille, FreeBSD Stable
On Mon, 8 Feb 2010, Dan Langille wrote:

DL> I'm looking at creating a large home use storage machine. Budget is a
DL> concern, but size and reliability are also a priority. Noise is also a
DL> concern, since this will be at home, in the basement. That, and cost,
DL> pretty much rules out a commercial case, such as a 3U case. It would be
DL> nice, but it greatly inflates the budget. This pretty much restricts me to
DL> a tower case.

[snip]

We use the following at work, but it's still pretty cheap and pretty silent:

Chieftec WH-02B-B (9x5.25 bays)

filled with

2 x Supermicro CSE-MT35T
http://www.supermicro.nl/products/accessories/mobilerack/CSE-M35T-1.cfm
for regular storage, 2 x raidz1

1 x Promise SuperSwap 1600
http://www.promise.com/product/product_detail_eng.asp?product_id=169
for changeable external backups

and still have 2 5.25 bays for anything interesting ;-)

other parts are regular SocketAM2+ motherboard, Athlon X4, 8G ram,
FreeBSD/amd64

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
------------------------------------------------------------------------

Jonathan

unread,
Feb 10, 2010, 3:17:00 PM2/10/10
to freebsd...@freebsd.org
On 2/8/2010 12:01 AM, Dan Langille wrote:
> Hi,
>
> I'm thinking of 8x1TB (or larger) SATA drives. I've found a case[2] with
> hot-swap bays[3], that seems interesting. I haven't looked at power
> supplies, but given that number of drives, I expect something beefy with
> a decent reputation is called for.

I have a system with two of these [1] and an 8 port LSI SAS card that
runs fine for me. I run an 8 drive ZFS array off the LSI card and then
have 2 drives mirrored off the motherboard SATA ports for booting with
ZFS. Hotswap works fine for me as well with this hardware.

Jonathan

http://www.newegg.com/Product/Product.aspx?Item=N82E16816215001

Dmitry Morozovsky

unread,
Feb 10, 2010, 3:24:12 PM2/10/10
to Dan Langille, FreeBSD Stable
On Wed, 10 Feb 2010, Dmitry Morozovsky wrote:

DM> other parts are regular SocketAM2+ motherboard, Athlon X4, 8G ram,
DM> FreeBSD/amd64

well, not exactly "regular" - it's ASUS M2N-LR-SATA with 10 SATA channels, but
I suppose there are comparable in "workstation" mobo market now...

Dan Langille

unread,
Feb 10, 2010, 4:05:57 PM2/10/10
to Dmitry Morozovsky, FreeBSD Stable
Dmitry Morozovsky wrote:
> On Wed, 10 Feb 2010, Dmitry Morozovsky wrote:
>
> DM> other parts are regular SocketAM2+ motherboard, Athlon X4, 8G ram,
> DM> FreeBSD/amd64
>
> well, not exactly "regular" - it's ASUS M2N-LR-SATA with 10 SATA channels, but
> I suppose there are comparable in "workstation" mobo market now...

10 SATA channels? Newegg claims only 6:

http://www.newegg.com/Product/Product.aspx?Item=N82E16813131134

Dmitry Morozovsky

unread,
Feb 10, 2010, 6:10:48 PM2/10/10
to Dan Langille, FreeBSD Stable
On Wed, 10 Feb 2010, Dan Langille wrote:

DL> Dmitry Morozovsky wrote:
DL> > On Wed, 10 Feb 2010, Dmitry Morozovsky wrote:
DL> >
DL> > DM> other parts are regular SocketAM2+ motherboard, Athlon X4, 8G ram, DM>
DL> > FreeBSD/amd64
DL> >
DL> > well, not exactly "regular" - it's ASUS M2N-LR-SATA with 10 SATA channels,
DL> > but I suppose there are comparable in "workstation" mobo market now...
DL>
DL> 10 SATA channels? Newegg claims only 6:

You refer to the regular M2N-LR; the M2N-LR-SATA contains an additional
4-channel Marvell chip:

marck@moose:~> grep '^atapci.*: <' /var/run/dmesg.boot
atapci0: <nVidia nForce MCP55 UDMA133 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 4.0 on pci0
atapci1: <nVidia nForce MCP55 SATA300 controller> port
0xc400-0xc407,0xc080-0xc083,0xc000-0xc007,0xbc00-0xbc03,0xb880-0xb88f mem
0xef9bd000-0xef9bdfff irq 21 at device 5.0 on pci0
atapci2: <nVidia nForce MCP55 SATA300 controller> port
0xb800-0xb807,0xb480-0xb483,0xb400-0xb407,0xb080-0xb083,0xb000-0xb00f mem
0xef9bc000-0xef9bcfff irq 22 at device 5.1 on pci0
atapci3: <nVidia nForce MCP55 SATA300 controller> port
0xac00-0xac07,0xa880-0xa883,0xa800-0xa807,0xa480-0xa483,0xa400-0xa40f mem
0xef9b3000-0xef9b3fff irq 23 at device 5.2 on pci0
atapci4: <Adaptec 1420SA SATA300 controller> port 0xe800-0xe8ff mem
0xefd00000-0xefdfffff irq 17 at device 0.0 on pci3
atapci5: <Marvell 88SX6041 SATA300 controller> port 0xe400-0xe4ff mem
0xefb00000-0xefbfffff irq 18 at device 6.0 on pci3

(atapci4 is now used for a 1-disk Promise enclosure; I tried to use a SiL card
to drive the eSATA port natively, but it failed to initialize there, so I use a
simple SATA-eSATA bracket to provide eSATA capabilities to this Eternal Beast
[tm] ;-P)

Bruce Simpson

unread,
Feb 10, 2010, 10:00:39 PM2/10/10
to freebsd...@freebsd.org
On 02/10/10 19:40, Steve Polyack wrote:
>
> I haven't had such bad experience as the above, but it is certainly a
> concern. Using ZFS we simply 'offline' the device, pull, replace with
> a new one, glabel, and zfs replace. It seems to work fine as long as
> nothing is accessing the device you are replacing (otherwise you will
> get a kernel panic a few minutes down the line). m...@FreeBSD.org has
> also committed a large patch set to 9-CURRENT which implements
> "proper" SATA/AHCI hot-plug support and error-recovery through CAM.

I've been running with this patch in 8-STABLE for well over a week now
on my desktop w/o issues; I am using main disk for dev, and eSATA disk
pack for light multimedia use.

Dan Langille

unread,
Feb 12, 2010, 1:35:41 PM2/12/10
to Bruce Simpson, freebsd...@freebsd.org

MFC to 8.x?

--
Dan Langille -- http://langille.org/

Dan Langille

unread,
Feb 14, 2010, 1:17:05 AM2/14/10
to FreeBSD Stable
Dan Langille wrote:
> Hi,
>
> I'm looking at creating a large home use storage machine. Budget is a
> concern, but size and reliability are also a priority. Noise is also a
> concern, since this will be at home, in the basement. That, and cost,
> pretty much rules out a commercial case, such as a 3U case. It would be
> nice, but it greatly inflates the budget. This pretty much restricts me
> to a tower case.
>
> The primary use of this machine will be a backup server[1]. It will do
> other secondary use will include minor tasks such as samba, CIFS, cvsup,
> etc.

>
> I'm thinking of 8x1TB (or larger) SATA drives. I've found a case[2]
> with hot-swap bays[3], that seems interesting. I haven't looked at
> power supplies, but given that number of drives, I expect something
> beefy with a decent reputation is called for.
>
> Whether I use hardware or software RAID is undecided. I
>
> I think I am leaning towards software RAID, probably ZFS under FreeBSD
> 8.x but I'm open to hardware RAID but I think the cost won't justify it
> given ZFS.
>
> Given that, what motherboard and RAM configuration would you recommend
> to work with FreeBSD [and probably ZFS]. The lists seems to indicate
> that more RAM is better with ZFS.
>
> Thanks.
>
>
> [1] - FYI running Bacula, but that's out of scope for this question
>
> [2] - http://www.newegg.com/Product/Product.aspx?Item=N82E16811192058
>
> [3] - nice to have, especially for a failure.

After creating three different system configurations (Athena,
Supermicro, and HP), my configuration of choice is this Supermicro setup:

1. Samsung SATA CD/DVD Burner $20 (+ $8 shipping)
2. SuperMicro 5046A $750 (+$43 shipping)
3. LSI SAS 3081E-R $235
4. SATA cables $60
5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)
6. Xeon W3520 $310

Total price with shipping $1560

Details and links at http://dan.langille.org/2010/02/14/supermicro/

I'll probably start with 5 HDD in the ZFS array, 2x gmirror'd drives for
the boot, and 1 optical drive (so 8 SATA ports).
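
The pool layout described above maps to something like this (device names
are placeholders; the data disks would show up wherever the LSI card puts
them):

# gmirror label -b round-robin gm0 ada0 ada1
# zpool create tank raidz1 da0 da1 da2 da3 da4

Worth remembering: adding a sixth data disk later means adding another vdev
or rebuilding the pool, since a raidz vdev can't be widened in place.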

Daniel O'Connor

unread,
Feb 14, 2010, 2:08:42 AM2/14/10
to freebsd...@freebsd.org, Dan Langille
On Sun, 14 Feb 2010, Daniel O'Connor wrote:

> On Sun, 14 Feb 2010, Dan Langille wrote:
> > After creating three different system configurations (Athena,
> > Supermicro, and HP), my configuration of choice is this Supermicro
> > setup:
> >
> >     1. Samsung SATA CD/DVD Burner $20 (+ $8 shipping)
> >     2. SuperMicro 5046A $750 (+$43 shipping)
> >     3. LSI SAS 3081E-R $235
> >     4. SATA cables $60
> >     5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)
> >     6. Xeon W3520 $310
> >
> > Total price with shipping $1560
> >
> > Details and links at http://dan.langille.org/2010/02/14/supermicro/
> >
> > I'll probably start with 5 HDD in the ZFS array, 2x gmirror'd
> > drives for the boot, and 1 optical drive (so 8 SATA ports).
>
> That is f**king expensive for a home setup :)
>
> I priced a decent ZFS PC for a small business and it was AUD$2500
> including the disks (5x750Gb), case, PSU etc..

Also, that one booted off a 4Gb CF card (non RAID/mirror though).


Daniel O'Connor

unread,
Feb 14, 2010, 2:05:10 AM2/14/10
to freebsd...@freebsd.org, Dan Langille
On Sun, 14 Feb 2010, Dan Langille wrote:
> After creating three different system configurations (Athena,
> Supermicro, and HP), my configuration of choice is this Supermicro
> setup:
>
>     1. Samsung SATA CD/DVD Burner $20 (+ $8 shipping)
>     2. SuperMicro 5046A $750 (+$43 shipping)
>     3. LSI SAS 3081E-R $235
>     4. SATA cables $60
>     5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)
>     6. Xeon W3520 $310
>
> Total price with shipping $1560
>
> Details and links at http://dan.langille.org/2010/02/14/supermicro/
>
> I'll probably start with 5 HDD in the ZFS array, 2x gmirror'd drives
> for the boot, and 1 optical drive (so 8 SATA ports).

That is f**king expensive for a home setup :)

I priced a decent ZFS PC for a small business and it was AUD$2500
including the disks (5x750Gb), case, PSU etc..

--


Dan Langille

unread,
Feb 14, 2010, 9:07:53 AM2/14/10
to Daniel O'Connor, freebsd...@freebsd.org
Daniel O'Connor wrote:
> On Sun, 14 Feb 2010, Dan Langille wrote:
>> After creating three different system configurations (Athena,
>> Supermicro, and HP), my configuration of choice is this Supermicro
>> setup:
>>
>> 1. Samsung SATA CD/DVD Burner $20 (+ $8 shipping)
>> 2. SuperMicro 5046A $750 (+$43 shipping)
>> 3. LSI SAS 3081E-R $235
>> 4. SATA cables $60
>> 5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)
>> 6. Xeon W3520 $310
>>
>> Total price with shipping $1560
>>
>> Details and links at http://dan.langille.org/2010/02/14/supermicro/
>>
>> I'll probably start with 5 HDD in the ZFS array, 2x gmirror'd drives
>> for the boot, and 1 optical drive (so 8 SATA ports).
>
> That is f**king expensive for a home setup :)
>
> I priced a decent ZFS PC for a small business and it was AUD$2500
> including the disks (5x750Gb), case, PSU etc..

Yes, and this one doesn't yet have HDD.

Can you supply details of your system?

Dan Naumov

unread,
Feb 14, 2010, 9:53:54 AM2/14/10
to FreeBSD-STABLE Mailing List
> On Sun, 14 Feb 2010, Dan Langille wrote:
>> After creating three different system configurations (Athena,
>> Supermicro, and HP), my configuration of choice is this Supermicro
>> setup:
>>
>> 1. Samsung SATA CD/DVD Burner $20 (+ $8 shipping)
>> 2. SuperMicro 5046A $750 (+$43 shipping)
>> 3. LSI SAS 3081E-R $235
>> 4. SATA cables $60
>> 5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)
>> 6. Xeon W3520 $310

You do realise how much of a massive overkill this is and how much you
are overspending?


- Dan Naumov

Wes Morgan

unread,
Feb 14, 2010, 11:43:27 AM2/14/10
to Dan Langille, FreeBSD Stable
On Sun, 14 Feb 2010, Dan Langille wrote:

> 5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)


> 6. Xeon W3520 $310
>
> Total price with shipping $1560
>
> Details and links at http://dan.langille.org/2010/02/14/supermicro/

Wow um... That's quite a setup. Do you really need the Xeon W3520? You
could get a regular core 2 system for much less and still use the ECC ram
(highly recommended). The case you're looking at only has 6 hot-swap bays
according to the manuals, although the pictures show 8 (???). You could
shave some off the case and cpu, upgrade your 3081E-R to an ARC-1222 for
$200 more and have the hardware raid option.

If I was building a tower system, I'd put together something like this:

Case with 8 hot-swap SATA bays ($250):
http://www.newegg.com/Product/Product.aspx?Item=N82E16811192058
Or if you prefer screwless, you can find the case without the 2 hotswap
bays and use an icy dock screwless version.

Intel server board (for ECC support) ($200):
http://www.newegg.com/Product/Product.aspx?Item=N82E16813121328

SAS controller ($120):
http://www.buy.com/prod/supermicro-lsi-megaraid-lsisas1068e-8-port-sas-raid-controller-16mb/q/loc/101/207929556.html
Note: You'll need to change or remove the mounting bracket since it is
"backwards". I was able to find a bracket with matching screw holes on an
old nic and secure it to my case. It uses the same chipset as the more
expensive 3081E-R, if I remember correctly.

Quad-core CPU ($190):
http://www.newegg.com/Product/Product.aspx?Item=N82E16819115131

4x2gb ram sticks (97*2):
http://www.newegg.com/Product/Product.aspx?Item=N82E16820139045

same SATA cables for sata to mini-sas, same CD burner. Total cost probably
$400 less, which you can use to buy some of the drives.

For my personal (overkill) setup I have a chenbro 4U chassis with 16
hotswap bays and mini-SAS backplanes, a zippy 2+1 640 watt redundant power
supply (sounds like a freight train). I cannot express the joy I felt in
ripping out all the little SATA cables and snaking a couple fat 8087s
under the fans. 8 of the bays are dedicated to my media array, and the
other 8 are there for swapping in and out of backup drives mostly, but the
time they REALLY come in handy is when you need to upgrade your array. Buy
the replacement drives, pop them in, migrate the pool, and remove the old
drives.

I've been running with this for almost 3 years. If I had to do it over
again, I probably wouldn't get the power supply, it was more expensive
than the chassis and I don't think it has ever "saved" me from anything
(although I can't complain, it runs 24/7 and never had a glitch).

If I could find a good tower case I might consider it, but I've never seen
one I liked with mini-sas backplanes. Really the only thing I'm missing is
a nice 21U rack on casters, then the whole thing disappears into a corner
humming away.

Alexander Motin

unread,
Feb 14, 2010, 12:51:42 PM2/14/10
to Steve Polyack, FreeBSD Stable, Dan Langille
Steve Polyack wrote:
> On 2/10/2010 12:02 AM, Dan Langille wrote:
>> Don't use a port multiplier and this goes away. I was hoping to avoid
>> a PM and using something like the Syba PCI Express SATA II 4 x Ports
>> RAID Controller seems to be the best solution so far.
>>
>> http://www.amazon.com/Syba-Express-Ports-Controller-SY-PEX40008/dp/B002R0DZWQ/ref=sr_1_22?ie=UTF8&s=electronics&qid=1258452902&sr=1-22
>
> Dan, I can personally vouch for these cards under FreeBSD. We have 3 of
> them in one system, with almost every port connected to a port
> multiplier (SiI5xxx PMs). Using the siis(4) driver on 8.0-RELEASE
> provides very good performance, and supports both NCQ and FIS-based
> switching (an essential for decent port-multiplier performance).
>
> One thing to consider, however, is that the card is only single-lane
> PCI-Express. The bandwidth available is only 2.5Gb/s (~312MB/sec,
> slightly less than that of the SATA-2 link spec), so if you have 4
> high-performance drives connected, you may hit a bottleneck at the
> bus. I'd be particularly interested if anyone can find any similar
> Silicon Image SATA controllers with a PCI-E 4x or 8x interface ;)

Here is SiI3124 based card with built-in PCIe x8 bridge:
http://www.addonics.com/products/host_controller/adsa3gpx8-4em.asp

It is not so cheap, but with 12 disks connected via 4 Port Multipliers
it can give up to 1GB/s (4x250MB/s) of bandwidth.

The cheaper PCIe x1 version mentioned above gave me up to 200MB/s, which is
the maximum I've seen from PCIe 1.0 x1 controllers. Given the NCQ and FBS
support it can be enough for many real-world applications that don't need
such high linear speeds but have many concurrent I/Os.

--
Alexander Motin

Dan Langille

unread,
Feb 14, 2010, 5:02:40 PM2/14/10
to Dmitry Morozovsky, FreeBSD Stable
Dmitry Morozovsky wrote:
> On Wed, 10 Feb 2010, Dmitry Morozovsky wrote:
>
> DM> other parts are regular SocketAM2+ motherboard, Athlon X4, 8G ram,
> DM> FreeBSD/amd64
>
> well, not exactly "regular" - it's ASUS M2N-LR-SATA with 10 SATA channels, but
> I suppose there are comparable in "workstation" mobo market now...

I couldn't find this one for sale, FWIW. But looks interesting. Thanks.

Dan Langille

unread,
Feb 14, 2010, 4:38:02 PM2/14/10
to Dan Naumov, FreeBSD-STABLE Mailing List


I appreciate the comments and feedback. I'd also appreciate alternative
suggestions in addition to what you have contributed so far. Spec out
the box you would build.

Dan Langille

unread,
Feb 14, 2010, 5:16:30 PM2/14/10
to Alexander Motin, FreeBSD Stable, Steve Polyack

Is that the URL you meant to post? "4 Port eSATA PCI-E 8x Controller
for Mac Pro". I'd rather use internal connections.

Charles Sprickman

unread,
Feb 14, 2010, 5:32:52 PM2/14/10
to Dan Langille, FreeBSD-STABLE Mailing List, Dan Naumov

$1200, and I'll run any benchmarks you'd like to see:

http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=8441629

This box is really only for backups, so no fancy CPU. The sub-$100
celeron seems to not impact ZFS performance a bit. It does have ECC
memory, and a fancy "server" mainboard.

C

Dan Naumov

unread,
Feb 14, 2010, 5:42:00 PM2/14/10
to FreeBSD-STABLE Mailing List

======================
Case: Fractal Design Define R2 - 89 euro:
http://www.fractal-design.com/?view=product&prod=32

Mobo/CPU: Supermicro X7SPA-H / Atom D510 - 180-220 euro:
http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H

PSU: Corsair 400CX 80+ - 59 euro:
http://www.corsair.com/products/cx/default.aspx

RAM: Corsair 2x2GB, DDR2 800MHz SO-DIMM, CL5 - 85 euro
======================
Total: ~435 euro

The motherboard has 6 native AHCI-capable ports on ICH9R controller
and you have a PCI-E slot free if you want to add an additional
controller card. Feel free to blow the money you've saved on crazy
fast SATA disks and if your system workload is going to have a lot of
random reads, then spend 200 euro on a 80gb Intel X25-M for use as a
dedicated L2ARC device for your pool.


- Sincerely,
Dan Naumov

Dan Naumov

unread,
Feb 14, 2010, 6:10:49 PM2/14/10
to FreeBSD-STABLE Mailing List, d...@langille.org

And to expand a bit, if you want that crazy performance without
blowing silly amounts of money:

Get a dock for holding 2 x 2,5" disks in a single 5,25" slot and put
it at the top, in the only 5,25" bay of the case. Now add an
additional PCI-E SATA controller card, like the often mentioned PCIE
SIL3124. Now you have 2 x 2,5" disk slots and 8 x 3,5" disk slots,
with 6 native SATA ports on the motherboard and more ports on the
controller card. Now get 2 x 80gb Intel SSDs and put them into the
dock. Now partition each of them in the following fashion:

1: swap: 4-5gb
2: freebsd-zfs: ~10-15gb for root filesystem
3: freebsd-zfs: rest of the disk: dedicated L2ARC vdev

GMirror your SSD swap partitions.
Make a ZFS mirror pool out of your SSD root filesystem partitions
Build your big ZFS pool however you like out of the mechanical disks you have.
Add the 2 x ~60gb partitions as dedicated independent L2ARC devices
for your SATA disk ZFS pool.

Now you have redundant swap, redundant and FAST root filesystem and
your ZFS pool of SATA disks has 120gb worth of L2ARC space on the
SSDs. The L2ARC vdevs don't need to be redundant, because should an IO
error occur while reading off L2ARC, the IO is deferred to the "real"
data location on the pool of your SATA disks. You can also remove your
L2ARC vdevs from your pool at will, on a live pool.
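
A minimal sketch of that layout with gpart/gmirror/zpool, assuming the two
SSDs show up as ada0/ada1 and the pool of mechanical disks is called tank
(names and sizes are placeholders; installing boot code for the ZFS root is
left out):

# gpart create -s gpt ada0
# gpart add -t freebsd-swap -s 4G ada0
# gpart add -t freebsd-zfs -s 15G ada0
# gpart add -t freebsd-zfs ada0
(repeat the same partitioning on ada1)
# gmirror label -b round-robin swap ada0p1 ada1p1
# zpool create ssdroot mirror ada0p2 ada1p2
# zpool add tank cache ada0p3 ada1p3

Swap then goes into /etc/fstab as /dev/mirror/swap, and the two cache
partitions act as independent L2ARC devices for tank.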


- Sincerely,
Dan Naumov

Dan Langille

unread,
Feb 14, 2010, 6:07:33 PM2/14/10
to Wes Morgan, FreeBSD Stable

Going to
http://www.supermicro.com/products/system/tower/5046/SYS-5046A-X.cfm it
does say 6 hot-swap and two spare. I'm guessing they say that because
the M/B supports only 6 SATA connections:

http://www.supermicro.com/products/motherboard/Core2Duo/X58/C7X58.cfm

> You could
> shave some off the case and cpu, upgrade your 3081E-R to an ARC-1222 for
> $200 more and have the hardware raid option.

That is a nice card. However, I don't want hardware RAID. I want ZFS.


> If I was building a tower system, I'd put together something like this:

Thank you for the suggestions.


> Case with 8 hot-swap SATA bays ($250):
> http://www.newegg.com/Product/Product.aspx?Item=N82E16811192058
> Or if you prefer screwless, you can find the case without the 2 hotswap
> bays and use an icy dock screwless version.

I do like this case, it's one I have priced:

http://dan.langille.org/2010/02/14/pricing-the-athena/

> Intel server board (for ECC support) ($200):
> http://www.newegg.com/Product/Product.aspx?Item=N82E16813121328

ECC, nice, which is something I've found appealing.

> SAS controller ($120):
> http://www.buy.com/prod/supermicro-lsi-megaraid-lsisas1068e-8-port-sas-raid-controller-16mb/q/loc/101/207929556.html
> Note: You'll need to change or remove the mounting bracket since it is
> "backwards". I was able to find a bracket with matching screw holes on an
> old nic and secure it to my case. It uses the same chipset as the more
> expensive 3081E-R, if I remember correctly.

I follow what you say, but cannot comprehend why the bracket is backwards.

> Quad-core CPU ($190):
> http://www.newegg.com/Product/Product.aspx?Item=N82E16819115131
>
> 4x2gb ram sticks (97*2):
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820139045
>
> same SATA cables for sata to mini-sas, same CD burner. Total cost probably
> $400 less, which you can use to buy some of the drives.

I put this all together, and named it after you (hope you don't mind):

http://dan.langille.org/2010/02/14/273/

You're right, $400 less.

I also wrote up the above suggestions with a Supermicro case instead:

SUPERMICRO CSE-743T-645B Black 4U Pedestal Chassis w/ 645W Power Supply
$320
http://www.newegg.com/Product/Product.aspx?Item=N82E16811152047

I like your suggestions with the above case. It is now my preferred
solution.

> For my personal (overkill) setup I have a chenbro 4U chassis with 16
> hotswap bays and mini-SAS backplanes, a zippy 2+1 640 watt redundant power
> supply (sounds like a freight train). I cannot express the joy I felt in
> ripping out all the little SATA cables and snaking a couple fat 8087s
> under the fans. 8 of the bays are dedicated to my media array, and the
> other 8 are there for swapping in and out of backup drives mostly, but the
> time they REALLY come in handy is when you need to upgrade your array. Buy
> the replacement drives, pop them in, migrate the pool, and remove the old
> drives.

This is really nice. :)

Thank you for your suggestions. They have been very helpful. :)

Dan Langille

unread,
Feb 14, 2010, 6:31:34 PM2/14/10
to Dmitry Morozovsky, FreeBSD Stable
Dmitry Morozovsky wrote:
> On Mon, 8 Feb 2010, Dan Langille wrote:
>
> DL> I'm looking at creating a large home use storage machine. Budget is a
> DL> concern, but size and reliability are also a priority. Noise is also a
> DL> concern, since this will be at home, in the basement. That, and cost,
> DL> pretty much rules out a commercial case, such as a 3U case. It would be
> DL> nice, but it greatly inflates the budget. This pretty much restricts me to
> DL> a tower case.
>
> [snip]
>
> We use the following at work, but it's still pretty cheap and pretty silent:
>
> Chieftec WH-02B-B (9x5.25 bays)

$130 http://www.ncixus.com/products/33591/WH-02B-B-OP/Chieftec/ but not
available

$87.96 at http://www.xpcgear.com/chieftec-wh-02b-b-mid-tower-case.html

http://www.chieftec.com/wh02b-b.html

> filled with
>
> 2 x Supermicro CSE-MT35T
> http://www.supermicro.nl/products/accessories/mobilerack/CSE-M35T-1.cfm
> for regular storage, 2 x raidz1

I could not find a price on that, but guessing at $100 each

> 1 x Promise SuperSwap 1600
> http://www.promise.com/product/product_detail_eng.asp?product_id=169
> for changeable external backups

$100 from
http://www.overstock.com/Electronics/Promise-SuperSwap-1600-Drive-Enclosure/2639699/product.html

So that's $390. Not bad.

Still need RAM, M/B, PSU, and possibly video.

> and still have 2 5.25 bays for anything interesting ;-)

I'd be filling those three with DVD-RW and two SATA drives in a gmirror
configuration.

> other parts are regular SocketAM2+ motherboard, Athlon X4, 8G ram,
> FreeBSD/amd64

Let's say $150 for the M/B, $150 for the CPU, and $200 for the RAM.

Total is $890. Nice.

Tortise

unread,
Feb 14, 2010, 6:36:57 PM2/14/10
to FreeBSD Stable
----- Original Message -----
From: "Dan Langille" <d...@langille.org>
To: "Wes Morgan" <mor...@chemikals.org>
Cc: "FreeBSD Stable" <freebsd...@freebsd.org>
Sent: Monday, February 15, 2010 12:07 PM
Subject: Re: hardware for home use large storage


>>>> Whether I use hardware or software RAID is undecided. I
>>>>
>>>> I think I am leaning towards software RAID, probably ZFS under FreeBSD 8.x
>>>> but I'm open to hardware RAID but I think the cost won't justify it given
>>>> ZFS.
>>>>
>>>> Given that, what motherboard and RAM configuration would you recommend to
>>>> work with FreeBSD [and probably ZFS]. The lists seems to indicate that more
>>>> RAM is better with ZFS.
>>>>

.....

> That is a nice card. However, I don't want hardware RAID. I want ZFS.

I hope it's not too rude to ask, and no rudeness is intended.

Is ZFS better under FreeBSD or OpenSolaris? (I gather a server version of OpenSolaris is less than a couple of months away.)


Dan Langille

unread,
Feb 14, 2010, 7:22:44 PM2/14/10
to Dan Naumov, FreeBSD Stable
Dan Naumov wrote:
> On Sun, Feb 14, 2010 at 11:38 PM, Dan Langille <d...@langille.org> wrote:
> ======================
> Case: Fractal Design Define R2 - 89 euro -
> http://www.fractal-design.com/?view=product&prod=32

That is a nice case. It's one slot short for what I need. The trays
are great. I want three more slots for 2xSATA for a gmirror base-OS and
an optical drive. As someone mentioned on IRC, there are many similar
non hot-swap cases. From the website, I couldn't see this for sale in the
USA. But converting your price to US$, it is about $121.

Looking around, this case was suggested to me. I like it a lot:

LIAN LI PC-A71F Black Aluminum ATX Full Tower Computer Case $240
http://www.newegg.com/Product/Product.aspx?Item=N82E16811112244

> Mobo/CPU: Supermicro X7SPA-H / Atom D510 - 180-220 euro -
> http://www.supermicro.com/products/motherboard/ATOM/ICH9/X7SPA.cfm?typ=H

Non-ECC RAM, though; ECC is something I'd like to have. $175

> PSU: Corsair 400CX 80+ - 59 euro -
> http://www.corsair.com/products/cx/default.aspx

http://www.newegg.com/Product/Product.aspx?Item=N82E16817139008 for $50

Is that sufficient power for up to 10 SATA HDDs and an optical drive?

> RAM: Corsair 2x2GB, DDR2 800MHz SO-DIMM, CL5 - 85 euro

http://www.newegg.com/Product/Product.aspx?Item=N82E16820145238 $82


> ======================
> Total: ~435 euro

With my options, it's about $640 with shipping etc.

> The motherboard has 6 native AHCI-capable ports on ICH9R controller
> and you have a PCI-E slot free if you want to add an additional
> controller card. Feel free to blow the money you've saved on crazy
> fast SATA disks and if your system workload is going to have a lot of
> random reads, then spend 200 euro on a 80gb Intel X25-M for use as a
> dedicated L2ARC device for your pool.

I have been playing with the idea of an L2ARC device. They sound crazy
cool.

Thank you Dan.

-- dan

Artem Belevich

unread,
Feb 14, 2010, 7:09:14 PM2/14/10
to Dan Naumov, FreeBSD-STABLE Mailing List, d...@langille.org
> your ZFS pool of SATA disks has 120gb worth of L2ARC space

Keep in mind that housekeeping of 120G L2ARC may potentially require
fair amount of RAM, especially if you're dealing with tons of small
files.

See this thread:
http://www.mail-archive.com/zfs-d...@opensolaris.org/msg34674.html

--Artem

Dan Naumov

unread,
Feb 14, 2010, 7:37:45 PM2/14/10
to Dan Langille, FreeBSD-STABLE Mailing List
>> PSU: Corsair 400CX 80+ - 59 euro -
>
>> http://www.corsair.com/products/cx/default.aspx
>
> http://www.newegg.com/Product/Product.aspx?Item=N82E16817139008 for $50
>
> Is that sufficient power up to 10 SATA HDD and an optical drive?

Disk power use varies from about 8 watts/disk for "green" disks to 20
watts/disk for really power-hungry ones. So yes.
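
(Quick worst-case arithmetic: 10 drives at ~20 W each is about 200 W steady
state, which leaves room for the board and CPU on a 400 W unit; the thing to
watch is the 12V surge at spin-up, which staggered spin-up or a bit of PSU
headroom takes care of.)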


- Sincerely,
Dan Naumov

Dan Langille

unread,
Feb 14, 2010, 7:33:07 PM2/14/10
to Dan Naumov, FreeBSD-STABLE Mailing List

That sounds very interesting. I was just looking around for such a thing
and could not find it. Is there a more specific name? URL?

> Now add an
> additional PCI-E SATA controller card, like the often mentioned PCIE
> SIL3124.

http://www.newegg.com/Product/Product.aspx?Item=N82E16816124026 for $35

> Now you have 2 x 2,5" disk slots and 8 x 3,5" disk slots,
> with 6 native SATA ports on the motherboard and more ports on the
> controller card. Now get 2 x 80gb Intel SSDs and put them into the
> dock. Now partition each of them in the following fashion:
>
> 1: swap: 4-5gb
> 2: freebsd-zfs: ~10-15gb for root filesystem
> 3: freebsd-zfs: rest of the disk: dedicated L2ARC vdev
>
> GMirror your SSD swap partitions.
> Make a ZFS mirror pool out of your SSD root filesystem partitions
> Build your big ZFS pool however you like out of the mechanical disks you have.
> Add the 2 x ~60gb partitions as dedicated independant L2ARC devices
> for your SATA disk ZFS pool.
>
> Now you have redundant swap, redundant and FAST root filesystem and
> your ZFS pool of SATA disks has 120gb worth of L2ARC space on the
> SSDs. The L2ARC vdevs dont need to be redundant, because should an IO
> error occur while reading off L2ARC, the IO is deferred to the "real"
> data location on the pool of your SATA disks. You can also remove your
> L2ARC vdevs from your pool at will, on a live pool.

That is nice.

Thank you.

Dan Langille

unread,
Feb 14, 2010, 7:54:57 PM2/14/10
to Charles Sprickman, FreeBSD-STABLE Mailing List, Dan Naumov
Charles Sprickman wrote:
> On Sun, 14 Feb 2010, Dan Langille wrote:
>
>> Dan Naumov wrote:
>>>> On Sun, 14 Feb 2010, Dan Langille wrote:
>>>>> After creating three different system configurations (Athena,
>>>>> Supermicro, and HP), my configuration of choice is this Supermicro
>>>>> setup:
>>>>>
>>>>> 1. Samsung SATA CD/DVD Burner $20 (+ $8 shipping)
>>>>> 2. SuperMicro 5046A $750 (+$43 shipping)
>>>>> 3. LSI SAS 3081E-R $235
>>>>> 4. SATA cables $60
>>>>> 5. Crucial 3×2G ECC DDR3-1333 $191 (+ $6 shipping)
>>>>> 6. Xeon W3520 $310
>>>
>>> You do realise how much of a massive overkill this is and how much you
>>> are overspending?
>>
>>
>> I appreciate the comments and feedback. I'd also appreciate
>> alternative suggestions in addition to what you have contributed so
>> far. Spec out the box you would build.
>
> $1200, and I'll run any benchmarks you'd like to see:
>
> http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=8441629
>
>
> This box is really only for backups, so no fancy CPU. The sub-$100
> celeron seems to not impact ZFS performance a bit. It does have ECC
> memory, and a fancy "server" mainboard.

That's pretty neat. Especially given it has 4x1TB of disks.

For my needs, I'd like a bigger case and PSU: $720 without HDD.

https://secure.newegg.com/WishList/MySavedWishDetail.aspx?ID=8918889

My system will have a minimum of 8 SATA devices (5 for ZFS, 2 for the
gmirror'd OS, and 1 for the optical drive). Thus, I'd still need to buy
another SATA controller on top of the above.

Thank you.

Daniel O'Connor

unread,
Feb 14, 2010, 8:24:55 PM2/14/10
to Dan Langille, freebsd...@freebsd.org
On Mon, 15 Feb 2010, Dan Langille wrote:
> > I priced a decent ZFS PC for a small business and it was AUD$2500
> > including the disks (5x750Gb), case, PSU etc..
>
> Yes, and this one doesn't yet have HDD.
>
> Can you supply details of your system?

1 AP400791A 4U Rackmount chassis (no PSU)
1 MB455SPF 5 drive hot swap bay (in 3x5.25")
5 HAWD7502ABYS WD 750Gb 24x7 RAID
1 GA-MA770T-UD3P Gigabyte AMD770T AM3 motherboard
1 CPAP-965 AMD PhenomII X4 AM2+/3
2 MEK-4G1333D3D4R Kingston 4Gb DDR3/1333 ECC RAM
1 PSS-PSR700 Seasonic 700W PSU
1 VCMS4350-D512H Radeon 4350 PCIe video card
1 FMCFP4G 4Gb CF card
1 n/a CF to IDE adapter

Note that I haven't actually built it yet, I don't expect any problems
though.

I built a much cheaper version (non hot swap) at home using a Gigabyte
GA-MA785GM-US2H, Athlon II X2 240 2.8GHz, 4Gb DDR2 RAM and 5 1Tb WD
drives in an Antec NineHundred case. It boots off a CF card too, but has
onboard video and only a 400W PSU (which is probably overkill, steady
state draw was ~110W)


Alexander Motin

unread,
Feb 15, 2010, 1:57:01 AM2/15/10
to Dan Langille, FreeBSD-STABLE Mailing List, Dan Naumov
Dan Langille wrote:
> Dan Naumov wrote:
>> Now add an
>> additional PCI-E SATA controller card, like the often mentioned PCIE
>> SIL3124.
>
> http://www.newegg.com/Product/Product.aspx?Item=N82E16816124026 for $35

This is the PCI-X version. Unless you have a PCI-X slot, the PCIe x1 version
seems preferable: http://www.newegg.com/Product/Product.aspx?Item=N82E16816124027

--
Alexander Motin

Alexander Motin

unread,
Feb 15, 2010, 1:33:04 AM2/15/10
to Dan Langille, FreeBSD Stable, Steve Polyack

Not exactly what I meant, as it is a Mac version, but yes. At least such
controllers exist. Maybe they can also be found with internal SATA.

--
Alexander Motin

Dmitry Morozovsky

unread,
Feb 15, 2010, 3:01:33 AM2/15/10
to Dan Naumov, FreeBSD-STABLE Mailing List, Dan Langille
On Mon, 15 Feb 2010, Dan Naumov wrote:

DN> >> PSU: Corsair 400CX 80+ - 59 euro -
DN> >
DN> >> http://www.corsair.com/products/cx/default.aspx
DN> >
DN> > http://www.newegg.com/Product/Product.aspx?Item=N82E16817139008 for $50
DN> >
DN> > Is that sufficient power up to 10 SATA HDD and an optical drive?
DN>
DN> Disk power use varies from about 8 watt/disk for "green" disks to 20
DN> watt/disk for really powerhungry ones. So yes.

The only thing one should be aware of is that the startup current of
contemporary 3.5" SATA disks can exceed 2.5A on the 12V bus, so staggered
spin-up is rather vital.

Or get 500-520 VA PSU to be sure. Or do both just to be on the safe side ;-)

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
------------------------------------------------------------------------

Alexander Leidinger

unread,
Feb 15, 2010, 2:57:10 AM2/15/10
to Dan Naumov, FreeBSD-STABLE Mailing List, d...@langille.org
Quoting Dan Naumov <dan.n...@gmail.com> (from Mon, 15 Feb 2010
01:10:49 +0200):

> Get a dock for holding 2 x 2,5" disks in a single 5,25" slot and put
> it at the top, in the only 5,25" bay of the case. Now add an
> additional PCI-E SATA controller card, like the often mentioned PCIE
> SIL3124. Now you have 2 x 2,5" disk slots and 8 x 3,5" disk slots,
> with 6 native SATA ports on the motherboard and more ports on the
> controller card. Now get 2 x 80gb Intel SSDs and put them into the
> dock. Now partition each of them in the following fashion:
>
> 1: swap: 4-5gb
> 2: freebsd-zfs: ~10-15gb for root filesystem
> 3: freebsd-zfs: rest of the disk: dedicated L2ARC vdev

If you already have 2 SSDs, I suggest making 4 partitions. The
additional one is for the ZIL (decide yourself what you want to speed up
"more" and size the L2ARC and ZIL partitions accordingly...). This
should speed up write operations. The ZIL one should be ZFS-mirrored,
because the ZIL is more sensitive to disk failures than the L2ARC:
zpool add <pool> log mirror <SSD1pX> <SSD2pX>

> GMirror your SSD swap partitions.
> Make a ZFS mirror pool out of your SSD root filesystem partitions
> Build your big ZFS pool however you like out of the mechanical disks
> you have.
> Add the 2 x ~60gb partitions as dedicated independant L2ARC devices
> for your SATA disk ZFS pool.

BTW, the cheap way of doing something like this is to add a USB memory
stick as L2ARC:
http://www.leidinger.net/blog/2010/02/10/making-zfs-faster/
This will not give you the speed boost of a real SSD attached via
SATA, but for the price (maybe you even got the memory stick for free
somewhere) you can not get something better.
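
If the stick shows up as, say, da0 (the device name is just an example),
adding and later dropping it is a one-liner each:

# zpool add tank cache da0
# zpool remove tank da0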

Bye,
Alexander.

--
Crito, I owe a cock to Asclepius; will you remember to pay the debt?
-- Socrates' last words

http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137

Wes Morgan

unread,
Feb 15, 2010, 3:04:51 AM2/15/10
to Dmitry Morozovsky, FreeBSD Stable, Dan Langille
On Mon, 15 Feb 2010, Dmitry Morozovsky wrote:

> On Sun, 14 Feb 2010, Dan Langille wrote:
>

> [snip]
>
> DL> > SAS controller ($120):
> DL> > http://www.buy.com/prod/supermicro-lsi-megaraid-lsisas1068e-8-port-sas-raid-controller-16mb/q/loc/101/207929556.html
> DL> > Note: You'll need to change or remove the mounting bracket since it is
> DL> > "backwards". I was able to find a bracket with matching screw holes on an
> DL> > old nic and secure it to my case. It uses the same chipset as the more
> DL> > expensive 3081E-R, if I remember correctly.
> DL>
> DL> I follow what you say, but cannot comprehend why the bracket is backwards.
>
> It's because the IO slot is on the other side of the bracket, like good old ISA

Yeah. Mirror image would be a more accurate description. I'm surprised I
had an ISA card that matched up with the mounting holes. Supermicro calls
it "UIO".

Dmitry Morozovsky

unread,
Feb 15, 2010, 2:55:17 AM2/15/10
to Dan Langille, FreeBSD Stable, Wes Morgan
On Sun, 14 Feb 2010, Dan Langille wrote:

[snip]

DL> > SAS controller ($120):
DL> > http://www.buy.com/prod/supermicro-lsi-megaraid-lsisas1068e-8-port-sas-raid-controller-16mb/q/loc/101/207929556.html
DL> > Note: You'll need to change or remove the mounting bracket since it is
DL> > "backwards". I was able to find a bracket with matching screw holes on an
DL> > old nic and secure it to my case. It uses the same chipset as the more
DL> > expensive 3081E-R, if I remember correctly.
DL>
DL> I follow what you say, but cannot comprehend why the bracket is backwards.

It's because the IO slot is on the other side of the bracket, like good old ISA

Dan Naumov

unread,
Feb 15, 2010, 3:49:47 AM2/15/10
to FreeBSD-STABLE Mailing List
> I had a feeling someone would bring up L2ARC/cache devices. This gives
> me the opportunity to ask something that's been on my mind for quite
> some time now:
>
> Aside from the capacity different (e.g. 40GB vs. 1GB), is there a
> benefit to using a dedicated RAM disk (e.g. md(4)) to a pool for
> L2ARC/cache? The ZFS documentation explicitly states that cache
> device content is considered volatile.

Using a ramdisk as an L2ARC vdev doesn't make any sense at all. If you
have RAM to spare, it should be used by regular ARC.


- Sincerely,
Dan Naumov

Jeremy Chadwick

unread,
Feb 15, 2010, 3:30:27 AM2/15/10
to freebsd...@freebsd.org

I had a feeling someone would bring up L2ARC/cache devices. This gives
me the opportunity to ask something that's been on my mind for quite
some time now:

Aside from the capacity difference (e.g. 40GB vs. 1GB), is there a
benefit to dedicating a RAM disk (e.g. md(4)) to a pool for
L2ARC/cache? The ZFS documentation explicitly states that cache
device content is considered volatile.

Example:

# zpool status storage
pool: storage
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
mirror ONLINE 0 0 0
ad10 ONLINE 0 0 0
ad14 ONLINE 0 0 0

errors: No known data errors

# mdconfig -a -t malloc -o reserve -s 256m -u 16
# zpool add storage cache md16
# zpool status storage
pool: storage
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
mirror ONLINE 0 0 0
ad10 ONLINE 0 0 0
ad14 ONLINE 0 0 0
cache
md16 ONLINE 0 0 0


And removal:

# zpool remove storage md16
# mdconfig -d -u 16
#

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |

Ulf Zimmermann

unread,
Feb 15, 2010, 3:56:55 AM2/15/10
to Dan Langille, FreeBSD-STABLE Mailing List, Dan Naumov
On Sun, Feb 14, 2010 at 07:33:07PM -0500, Dan Langille wrote:
> >Get a dock for holding 2 x 2,5" disks in a single 5,25" slot and put
> >it at the top, in the only 5,25" bay of the case.
>
> That sounds very interesting. I just looking around for such a thing,
> and could not find it. Is there a more specific name? URL?

I had an Addonics 5.25" frame for 4x 2.5" SAS/SATA but the small fans in it
are unfortunately of the cheap kind. I ended up using the 2x2.5" to 3.5"
frame from Silverstone (for the small Silverstone case I got).

--
Regards, Ulf.

---------------------------------------------------------------------
Ulf Zimmermann, 1525 Pacific Ave., Alameda, CA-94501, #: 510-865-0204
You can find my resume at: http://www.Alameda.net/~ulf/resume.html

Jeremy Chadwick

unread,
Feb 15, 2010, 4:07:56 AM2/15/10
to freebsd...@freebsd.org

...except that it's already been proven on FreeBSD that the ARC getting
out of control can cause kernel panics[1], horrible performance until
ZFS has had its active/inactive lists flushed[2], and brings into
question how proper tuning is to be established and what the effects are
on the rest of the system[3]. There are still reports of people
disabling ZIL "for stability reasons" as well.

My thought process basically involves "getting rid" of the ARC and using
L2ARC entirely, given that it provides more control/containment which
cannot be achieved on FreeBSD (see above). In English: I'd trust a
whole series of md(4) disks (with sizes that I choose) over something
"variable/dynamic" which cannot be controlled or managed effectively.

The "Internals" section of Brendan Gregg's blog[4] outlines where the
L2ARC sits in the scheme of things; the open question is whether the ARC
could essentially be disabled by capping it at something very small (a
few megabytes), relying instead on the L2ARC, which is manageable.

[1]: http://lists.freebsd.org/pipermail/freebsd-questions/2010-January/211009.html
[2]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-January/053949.html
[3]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055073.html
[4]: http://blogs.sun.com/brendan/entry/test

Alexander Leidinger

unread,
Feb 15, 2010, 4:50:00 AM2/15/10
to Jeremy Chadwick, freebsd...@freebsd.org

Quoting Jeremy Chadwick <fre...@jdc.parodius.com> (from Mon, 15 Feb
2010 01:07:56 -0800):

> On Mon, Feb 15, 2010 at 10:49:47AM +0200, Dan Naumov wrote:
>> > I had a feeling someone would bring up L2ARC/cache devices. This gives
>> > me the opportunity to ask something that's been on my mind for quite
>> > some time now:
>> >
>> > Aside from the capacity difference (e.g. 40GB vs. 1GB), is there a
>> > benefit to using a dedicated RAM disk (e.g. md(4)) to a pool for
>> > L2ARC/cache? The ZFS documentation explicitly states that cache
>> > device content is considered volatile.
>>
>> Using a ramdisk as an L2ARC vdev doesn't make any sense at all. If you
>> have RAM to spare, it should be used by regular ARC.
>
> ...except that it's already been proven on FreeBSD that the ARC getting
> out of control can cause kernel panics[1], horrible performance until

There are other ways (not related to ZFS) to shoot yourself in the foot too.
I'm tempted to say that this is
a) a documentation bug
and
b) a lack of sanity checking of the values... anyone out there with
a good algorithm for something like this?

Normally you do some testing with the values you use, so once you've
resolved the issues, the system should be stable.

> ZFS has had its active/inactive lists flushed[2], and brings into

Someone needs to sit down and play a little bit with ways to tell the
ARC that there is free memory. The mail you reference already suggests
that the inactive/cached lists should maybe be taken into account too (I
haven't had a look at this part of the ZFS code).

> question how proper tuning is to be established and what the effects are
> on the rest of the system[3]. There are still reports of people

That's what I was talking about regarding b) above. If you specify an arc_max
which is too big (arc_max > kmem_size - SOME_SAFE_VALUE), there should
be a message from the kernel and the value should be adjusted to a
safe amount.

Until the problems are fixed, an MD for L2ARC may be a viable
alternative (if you have enough memory to give up for this). Feel free to
provide benchmark numbers, but in general I see this just as a
workaround for the current issues.

> disabling ZIL "for stability reasons" as well.

For the ZIL you definitely do not want to have an MD. If you do not
specify a log vdev for the pool, the ZIL will be written somewhere on
the disks of the pool. When the data hits the ZIL, it has to really be
on non-volatile storage. If you lose the ZIL, you lose data.
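
As a rough sketch only (re-using the "storage" pool from the earlier example;
"ad16" is a purely hypothetical SSD or battery-backed device), a dedicated log
vdev on non-volatile storage would be added along these lines:

# zpool add storage log ad16

Unlike the md-backed cache shown earlier, a log vdev has to survive power
loss, so flash or battery-backed devices are the usual choice.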

> The "Internals" section of Brendan Gregg's blog[4] outlines where the
> L2ARC sits in the scheme of things, or if the ARC could essentially
> be disabled by setting the minimum size to something very small (a few
> megabytes) and instead using L2ARC which is manageable.

At least in 7-stable, 8-stable and 9-current, the arc_max now really
corresponds to a max value, so it is more a matter of providing a safe
arc_max than a minimal arc_max. No matter how you construct the L2ARC,
ARC access will be faster than L2ARC access.

Bye,
Alexander.

--
BOFH excuse #439:

Hot Java has gone cold

Jeremy Chadwick

unread,
Feb 15, 2010, 7:27:44 AM2/15/10
to freebsd...@freebsd.org
On Mon, Feb 15, 2010 at 10:50:00AM +0100, Alexander Leidinger wrote:
> Quoting Jeremy Chadwick <fre...@jdc.parodius.com> (from Mon, 15 Feb
> 2010 01:07:56 -0800):
>
> >On Mon, Feb 15, 2010 at 10:49:47AM +0200, Dan Naumov wrote:
> >>> I had a feeling someone would bring up L2ARC/cache devices. This gives
> >>> me the opportunity to ask something that's been on my mind for quite
> >>> some time now:
> >>>
> >>> Aside from the capacity difference (e.g. 40GB vs. 1GB), is there a
> >>> benefit to using a dedicated RAM disk (e.g. md(4)) to a pool for
> >>> L2ARC/cache? The ZFS documentation explicitly states that cache
> >>> device content is considered volatile.
> >>
> >>Using a ramdisk as an L2ARC vdev doesn't make any sense at all. If you
> >>have RAM to spare, it should be used by regular ARC.
> >
> >...except that it's already been proven on FreeBSD that the ARC getting
> >out of control can cause kernel panics[1], horrible performance until

First and foremost, sorry for the long post. I tried to keep it short,
but sometimes there's just a lot to be said.

> There are other ways (not related to ZFS) to shoot into your feet
> too, I'm tempted to say that this is
> a) a documentation bug
> and
> b) a lack of sanity checking of the values... anyone out there with
> a good algorithm for something like this?
>
> Normally you do some testing with the values you use, so once you
> resolved the issues, the system should be stable.

What documentation? :-) The Wiki? If so, that's been outdated for
some time; I know Ivan Voras was doing his best to put good information
there, but it's hard given the below chaos.

The following tunables are recurrently mentioned as focal points, but no
one's explained in full how to tune these "properly", or which does what
(perfect example: vm.kmem_size_max vs. vm.kmem_size. _max used to be
what you'd adjust to solve kmem exhaustion issues, but now people are
saying otherwise?). I realise it may differ per system (given how much
RAM the system has), so different system configurations/examples would
need to be provided. I realise that the behaviour of some have changed
too (e.g. -RELEASE differs from -STABLE, and 7.x differs from 8.x).
I've marked commonly-referred-to tunables with an asterisk:

kern.maxvnodes
* vm.kmem_size
* vm.kmem_size_max
* vfs.zfs.arc_min
* vfs.zfs.arc_max
vfs.zfs.prefetch_disable (auto-tuned based on available RAM on 8-STABLE)
vfs.zfs.txg.timeout
vfs.zfs.vdev.cache.size
vfs.zfs.vdev.cache.bshift
vfs.zfs.vdev.max_pending
vfs.zfs.zil_disable
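
Purely for illustration (the numbers below are placeholders for a box with a
few GB of RAM, not recommendations), most of these are set as loader
tunables in /boot/loader.conf, e.g.:

vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
vfs.zfs.arc_min="128M"
vfs.zfs.arc_max="512M"
vfs.zfs.prefetch_disable="1"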

Then, when it comes to debugging problems caused by improper tuning
(or the lack of any), the following counters (not tunables)
are thrown into the mix as "things people should look at":

kstat.zfs.misc.arcstats.c
kstat.zfs.misc.arcstats.c_min
kstat.zfs.misc.arcstats.c_max
kstat.zfs.misc.arcstats.evict_skip
kstat.zfs.misc.arcstats.memory_throttle_count
kstat.zfs.misc.arcstats.size

None of these have sysctl descriptions (sysctl -d) either. I can
provide posts to freebsd-stable, freebsd-current, freebsd-fs,
freebsd-questions, or freebsd-users referencing these variables or
counters if you need context.
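
For reference, the raw counters can at least be dumped in one shot; nothing
is assumed here beyond the arcstats sysctls listed above:

# sysctl kstat.zfs.misc.arcstats | egrep 'size|c_min|c_max|evict_skip|throttle'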

All that said:

I would be more than happy to write some coherent documentation that
folks could refer to "officially", but rather than spend my entire
lifetime reverse-engineering the ZFS code I think it'd make more sense
to get some official parties involved to explain things.

I'd like to add some kind of monitoring section as well -- how
administrators could keep an eye on things and detect, semi-early, if
additional tuning is required or something along those lines.

> >ZFS has had its active/inactive lists flushed[2], and brings into
>
> Someone needs to sit down and play a little bit with ways to tell
> the ARC that there is free memory. The mail you reference already
> tells that the inactive/cached lists should maybe taken into account
> too (I didn't had a look at this part of the ZFS code).
>
> >question how proper tuning is to be established and what the effects are
> >on the rest of the system[3]. There are still reports of people
>
> That's what I talk about regarding b) above. If you specify an
> arc_max which is too big (arc_max > kmem_size - SOME_SAVE_VALUE),
> there should be a message from the kernel and the value should be
> adjusted to a save amount.
>
> Until the problems are fixed, a MD for L2ARC may be a viable
> alternative (if you have enough mem to give for this). Feel free to
> provide benchmark numbers, but in general I see this just as a
> workaround for the current issues.

I've played with this a bit (2-disk mirror + one 256MB md), but I'm not
completely sure how to read the bonnie++ results, nor am I sure I'm
using the right arguments (bonnie++ -s8192 -n64 -d/pool on a machine
that has 4GB).

L2ARC ("cache" vdev) is supposed to improve random reads, while a "log"
vdev (presumably something that links in with the ZIL) improves random
writes. I'm not sure where bonnie++ tests random reads, but I do see it
testing random seeks.

> >disabling ZIL "for stability reasons" as well.
>
> For the ZIL you definitively do not want to have a MD. If you do not
> specify a log vdev for the pool, the ZIL will be written somewhere
> on the disks of the pool. When the data hits the ZIL, it has to be
> really on a non-volatile storage. If you lose the ZIL, you lose
> data.

Thanks for the clarification here. In my case, I never disable the ZIL.
I never have and I never will given the above risk. However there's
lots of folks who advocate doing this because they have systems which
crash if they don't. I've never understood how/why that is (I've never
seen the ZIL responsible for any crash I've witnessed either).

> >The "Internals" section of Brendan Gregg's blog[4] outlines where the
> >L2ARC sits in the scheme of things, or if the ARC could essentially
> >be disabled by setting the minimum size to something very small (a few
> >megabytes) and instead using L2ARC which is manageable.
>
> At least in 7-stable, 8-stable and 9-current, the arc_max now really
> corresponds to a max value, so it is more of providing a save
> arc_max than a minimal arc_max.

Ahh, that might explain this semi-old post where a user was stating that
arc_max didn't appear to really be a hard limit, but just some kind of
high water mark.

> No matter how you construct the L2ARC, ARC access will be faster than
> L2ARC access.

Yes, based on Brendan's blog, I can see how that'd be the case; there'd
be some added overhead given the design/placement of L2ARC.

The options as I see them are (a) figure out some *reliable* way to
describe to folks how to tune their systems to not experience ARC or
memory exhaustion related issues, or (b) utilise L2ARC exclusively and
set the ARC (arc_max) to something fairly small.
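
A rough, untested sketch of option (b), assuming a pool named "storage" and
placeholder sizes (the md-backed cache is volatile, so it has to be recreated
and re-added after every boot):

/boot/loader.conf:
  vfs.zfs.arc_min="16M"
  vfs.zfs.arc_max="64M"

/etc/rc.local:
  /sbin/mdconfig -a -t malloc -o reserve -s 2g -u 16
  /sbin/zpool remove storage md16   # drop the stale cache entry, if any
  /sbin/zpool add storage cache md16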

Dan Langille

unread,
Feb 15, 2010, 9:34:40 AM2/15/10
to u...@alameda.net, FreeBSD-STABLE Mailing List, Dan Naumov
Ulf Zimmermann wrote:
> On Sun, Feb 14, 2010 at 07:33:07PM -0500, Dan Langille wrote:
>>> Get a dock for holding 2 x 2,5" disks in a single 5,25" slot and put
>>> it at the top, in the only 5,25" bay of the case.
>> That sounds very interesting. I just looking around for such a thing,
>> and could not find it. Is there a more specific name? URL?
>
> I had an Addonics 5.25" frame for 4x 2.5" SAS/SATA but the small fans in it
> are unfortunatly of the cheap kind. I ended up using the 2x2.5" to 3.5"
> frame from Silverstone (for the small Silverstone case I got).

Ahh, something like this:

http://silverstonetek.com/products/p_contents.php?pno=SDP08&area=usa

I understand now. Thank you.

Alexander Leidinger

unread,
Feb 15, 2010, 10:11:05 AM2/15/10
to Jeremy Chadwick, freebsd...@freebsd.org
Quoting Jeremy Chadwick <fre...@jdc.parodius.com> (from Mon, 15 Feb
2010 04:27:44 -0800):

> On Mon, Feb 15, 2010 at 10:50:00AM +0100, Alexander Leidinger wrote:
>> Quoting Jeremy Chadwick <fre...@jdc.parodius.com> (from Mon, 15 Feb
>> 2010 01:07:56 -0800):
>>
>> >On Mon, Feb 15, 2010 at 10:49:47AM +0200, Dan Naumov wrote:
>> >>> I had a feeling someone would bring up L2ARC/cache devices. This gives
>> >>> me the opportunity to ask something that's been on my mind for quite
>> >>> some time now:
>> >>>
>> >>> Aside from the capacity difference (e.g. 40GB vs. 1GB), is there a
>> >>> benefit to using a dedicated RAM disk (e.g. md(4)) to a pool for
>> >>> L2ARC/cache? The ZFS documentation explicitly states that cache
>> >>> device content is considered volatile.
>> >>
>> >>Using a ramdisk as an L2ARC vdev doesn't make any sense at all. If you
>> >>have RAM to spare, it should be used by regular ARC.
>> >
>> >...except that it's already been proven on FreeBSD that the ARC getting
>> >out of control can cause kernel panics[1], horrible performance until
>
> First and foremost, sorry for the long post. I tried to keep it short,
> but sometimes there's just a lot to be said.

And sometimes a shorter answer takes longer...

>> There are other ways (not related to ZFS) to shoot into your feet
>> too, I'm tempted to say that this is
>> a) a documentation bug
>> and
>> b) a lack of sanity checking of the values... anyone out there with
>> a good algorithm for something like this?
>>
>> Normally you do some testing with the values you use, so once you
>> resolved the issues, the system should be stable.
>
> What documentation? :-) The Wiki? If so, that's been outdated for

Hehe... :)

> some time; I know Ivan Voras was doing his best to put good information
> there, but it's hard given the below chaos.

Do you want write access to it (in case you don't have it already; I didn't check)?

> The following tunables are recurrently mentioned as focal points, but no
> one's explained in full how to tune these "properly", or which does what
> (perfect example: vm.kmem_size_max vs. vm.kmem_size. _max used to be
> what you'd adjust to solve kmem exhaustion issues, but now people are
> saying otherwise?). I realise it may differ per system (given how much
> RAM the system has), so different system configurations/examples would
> need to be provided. I realise that the behaviour of some have changed
> too (e.g. -RELEASE differs from -STABLE, and 7.x differs from 8.x).
> I've marked commonly-referred-to tunables with an asterisk:

It can also be that some people repeat things without really
knowing what they are saying (based upon some kind of observed evidence,
not because of being a bad guy).

> kern.maxvnodes

Needs to be tuned if you run out of vnodes... ok, this is obvious. I
do not know how it will show up (panic or graceful error handling,
e.g. ENOMEM).

> * vm.kmem_size
> * vm.kmem_size_max

I tried kmem_size_max on -current (this year) and got a panic
during use; I changed kmem_size to the same value I had for _max and
it didn't panic anymore. It looks (from mails on the lists) like _max
is supposed to give a ceiling for auto-tuning, but at least it was not
working with ZFS last month (and I doubt it works now).

> * vfs.zfs.arc_min
> * vfs.zfs.arc_max

_min = minimum even when the system is running out of memory (the ARC
gives back memory if other parts of the kernel need it).
_max = maximum (with a recent ZFS on 7/8/9 (7.3 will have it, 8.1 will
have it too) I've never seen the size exceed the _max anymore)

> vfs.zfs.prefetch_disable (auto-tuned based on available RAM on 8-STABLE)
> vfs.zfs.txg.timeout

It looks like tuning the txg timeout is just a workaround. I've read a
little bit in Brendan's blog and it seems they noticed the periodic writes
too (with the nice graphical performance monitoring of OpenStorage) and
they are investigating the issue. It looks like we are more affected by
this (for whatever reason). What the tuning does (attention, this is an
observation, not a technical description of code I've read!) seems to
be to write data out to the disks earlier (and thus there is less
data to write -> less blocking to notice).

> vfs.zfs.vdev.cache.size
> vfs.zfs.vdev.cache.bshift
> vfs.zfs.vdev.max_pending

Uhm... this smells like you got it out of one of my posts where I said
that I was experimenting with this on a system. I can tell you that I have
no system with this tuned anymore; tuning kmem_size (and KVA_PAGES during
kernel compile) has a bigger impact.

> vfs.zfs.zil_disable

What it does should be obvious. IMHO this should not help much
regarding stability (changing kmem_size should have a bigger impact).
As I don't know what was tested on systems where this is disabled, I
want to highlight the "IMHO" in the previous sentence...

> Then, when it comes to debugging problems as a result of tuning
> improperly (or entire lack of), the following counters (not tunables)
> are thrown into the mix as "things people should look at":
>
> kstat.zfs.misc.arcstats.c
> kstat.zfs.misc.arcstats.c_min
> kstat.zfs.misc.arcstats.c_max

c_max is vfs.zfs.arc_max, c_min is vfs.zfs.arc_min.
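
This is easy to verify on a running system; a trivial check using only the
sysctls already named:

# sysctl vfs.zfs.arc_max kstat.zfs.misc.arcstats.c_max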

> kstat.zfs.misc.arcstats.evict_skip
> kstat.zfs.misc.arcstats.memory_throttle_count
> kstat.zfs.misc.arcstats.size

I'm not very sure about size and c... both represent some kind of
current size, but they are not the same.


For the tuning I would recommend relying on a more human-readable
representation of these counters. I've seen someone post something like
this, but I do not know how it was generated (some kind of script, but
I do not know where to get it).
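
A minimal sketch of such a script (using nothing beyond the
kstat.zfs.misc.arcstats sysctls mentioned above; prints values in megabytes):

#!/bin/sh
# print a few ZFS ARC counters in megabytes
for stat in size c c_min c_max; do
    bytes=$(sysctl -n kstat.zfs.misc.arcstats.${stat})
    printf "%-6s %6d MB\n" "${stat}" $((bytes / 1048576))
done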

> All that said:
>
> I would be more than happy to write some coherent documentation that
> folks could refer to "officially", but rather than spend my entire
> lifetime reverse-engineering the ZFS code I think it'd make more sense
> to get some official parties involved to explain things.

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

> I'd like to add some kind of monitoring section as well -- how
> administrators could keep an eye on things and detect, semi-early, if
> additional tuning is required or something along those lines.
>
>> >ZFS has had its active/inactive lists flushed[2], and brings into
>>
>> Someone needs to sit down and play a little bit with ways to tell
>> the ARC that there is free memory. The mail you reference already
>> tells that the inactive/cached lists should maybe taken into account
>> too (I didn't had a look at this part of the ZFS code).
>>
>> >question how proper tuning is to be established and what the effects are
>> >on the rest of the system[3]. There are still reports of people
>>
>> That's what I talk about regarding b) above. If you specify an
>> arc_max which is too big (arc_max > kmem_size - SOME_SAVE_VALUE),
>> there should be a message from the kernel and the value should be
>> adjusted to a save amount.
>>
>> Until the problems are fixed, a MD for L2ARC may be a viable
>> alternative (if you have enough mem to give for this). Feel free to
>> provide benchmark numbers, but in general I see this just as a
>> workaround for the current issues.
>
> I've played with this a bit (2-disk mirror + one 256MB md), but I'm not
> completely sure how to read the bonnie++ results, nor am I sure I'm
> using the right arguments (bonnie++ -s8192 -n64 -d/pool on a machine
> that has 4GB).
>
> L2ARC ("cache" vdev) is supposed to improve random reads, while a "log"

It is supposed to improve random reads, if the working set is in the cache...

> vdev (presumably something that links in with the ZIL) improves random
> writes. I'm not sure where bonnie++ tests random reads, but I do see it

It is not supposed to improve random writes; it is supposed to improve
synchronous writes (man 2 open, search for O_FSYNC... in Solaris it is
O_DSYNC).

> testing random seeks.

[...]

> The options as I see them are (a) figure out some *reliable* way to
> describe to folks how to tune their systems to not experience ARC or
> memory exhaustion related issues, or (b) utilise L2ARC exclusively and
> set the ARC (arc_max) to something fairly small.

I would prefer a) together with some more sanity checking when
changing the values. :)

It is just that it is not easy to come up with a correct sanity check...

Bye,
Alexander.

--
If sarcasm were posted on Usenet, would anybody notice?
-- James Nicoll
