
dual 400 -> dual 600 worth it?


Alfred Perlstein

Dec 1, 1999

Anyone here do a buildworld with dual PIII-600s?

I'd like to know if dual 600s are worth the investment,
right now I have 2x400 but I'm unsure if it's worth it to
spend almost 1k for this upgrade.

thanks,
-Alfred


Doug Barton

Dec 1, 1999
On Wed, 1 Dec 1999, Alfred Perlstein wrote:

>
> Anyone here do a buildworld with dual PIII-600s?
>
> I'd like to know if dual 600s are worth the investment,
> right now i have 2x400 but unsure if it's worth it to
> spend almost 1k for this upgrade.

We have some machines here with dual 550's and 1/2G of ram. I can
do a full make world in c. 57 minutes (from memory, but that's pretty
accurate). Here are my make.conf settings:

CFLAGS= -O -pipe
NOPROFILE= true
COPTFLAGS= -O -pipe
USA_RESIDENT= YES
TOP_TABLE_SIZE= 32503

Other than that I just do a straight 'make -j4 -DNOCLEAN world'. That's
with both directories new and clean. If I'm doing a second make world I
can get it done in just under 42 minutes. *grin*

HTH,

Doug
--
"Welcome to the desert of the real."

- Laurence Fishburne as Morpheus, "The Matrix"

Daniel O'Connor

Dec 1, 1999

On 01-Dec-99 Doug Barton wrote:
> We have some machines here with dual 550's and 1/2G of ram. I can
> do a full make world in c. 57 minutes (from memory, but that's pretty
> accurate). Here are my make.conf settings:

Well I think you're doing something wrong..

I have a dual PII-350 system with 128 meg of ram and crappy 5400 rpm IDE disks
and my fastest buildworld is 55 minutes..

Takes about 10 minutes for installworld..

---
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum

David Scheidt

Dec 2, 1999
On Thu, 2 Dec 1999, Daniel O'Connor wrote:

>
> On 01-Dec-99 Doug Barton wrote:
> > We have some machines here with dual 550's and 1/2G of ram. I can
> > do a full make world in c. 57 minutes (from memory, but that's pretty
> > accurate). Here are my make.conf settings:
>
> Well I think you're doing something wrong..
>
> I have a dual PII-350 system with 128 meg of ram and crappy 5400 rpm IDE disks
> and my fastest buildworld is 55 minutes..

Indeed. My dual PII-400 256MB with LVD-SCSI disks does make -j8 -DNOGAMES
in ~50 minutes. Try upping the number of jobs until it slows down.
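
A quick, untested sketch of one way to probe that sweet spot; the -j values,
paths and log names are only examples, and each pass wipes /usr/obj so the
runs stay comparable:

#!/bin/sh
# time buildworld at a few different -j values, one log per run
cd /usr/src || exit 1
for j in 4 6 8 12; do
    rm -rf /usr/obj/*                    # clean object tree before each run
    echo "==> make -j${j} buildworld"
    time make -j${j} buildworld > /tmp/buildworld-j${j}.log 2>&1
    # the time(1) summary lands at the end of each log
done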

In answer to the original question, I don't think that going to dual 600MHz
processors is worth the money. If the money is burning a hole in your
pocket, sure, but otherwise, nope.


David Scheidt

Doug Barton

Dec 2, 1999
On Wed, 1 Dec 1999, David Scheidt wrote:

> On Thu, 2 Dec 1999, Daniel O'Connor wrote:
>
> >
> > On 01-Dec-99 Doug Barton wrote:
> > > We have some machines here with dual 550's and 1/2G of ram. I can
> > > do a full make world in c. 57 minutes (from memory, but that's pretty
> > > accurate). Here are my make.conf settings:
> >
> > Well I think you're doing something wrong..
> >
> > I have a dual PII-350 system with 128 meg of ram and crappy 5400 rpm IDE disks
> > and my fastest buildworld is 55 minutes..

Well, I would lay money that my crappy IDE drives are crappier
than yours. :) Now that my project is "official" as opposed to
"experimental" I'm working on getting some better ones.

> Indeed. My dual PII-400 256MB with LVD-SCSI disks does make -j8 -DNOGAMES
> in ~50 minutes. Try upping the number of jobs until it slows down.

Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all
crashed, while the same exact sources compiled without -j just fine. (More
to come with that on freebsd-stable when the -j 2 test is done). I didn't
pay too much attention at the time, but IIRC the no -j make world
completed in just under 50 minutes. I didn't pay too much attention 'cuz I
thought -j would work and be much faster.

Thanks,

Doug
--
"Welcome to the desert of the real."

- Laurence Fishburne as Morpheus, "The Matrix"


Sean-Paul Rees

Dec 2, 1999
David Scheidt wrote:

> Indeed. My dual PII-400 256MB with LVD-SCSI disks does make -j8 -DNOGAMES
> in ~50 minutes. Try upping the number of jobs until it slows down.

I did a buildworld (just a straight make buildworld, no funny options in
make.conf) on a uniprocessor PII 300, 192MB RAM, and UltraWide SCSI
disks, and it finished in 1 hour 18 minutes 32.69 seconds (~78.5 min).
This is in full multiuser in regular operation, although it's really not
under any load. I'm considering adding this leftover 300 I have here as a
second CPU, just gotta buy a VRM.

Cheers,
Sean

Alfred Perlstein

Dec 2, 1999
On Wed, 1 Dec 1999, Doug Barton wrote:

> On Wed, 1 Dec 1999, Alfred Perlstein wrote:
>
> >
> > Anyone here do a buildworld with dual PIII-600s?
> >
> > I'd like to know if dual 600s are worth the investment,
> > right now i have 2x400 but unsure if it's worth it to
> > spend almost 1k for this upgrade.
>

> We have some machines here with dual 550's and 1/2G of ram. I can
> do a full make world in c. 57 minutes (from memory, but that's pretty
> accurate). Here are my make.conf settings:
>

> CFLAGS= -O -pipe
> NOPROFILE= true
> COPTFLAGS= -O -pipe
> USA_RESIDENT= YES
> TOP_TABLE_SIZE= 32503
>
> Other than that I just do a straight 'make -j4 -DNOCLEAN world'. That's
> with both directories new and clean. If I'm doing a second make world I
> can get it done in just under 42 minutes. *grin*

How are your filesystems mounted/what kind of disks?

Personally I've found that a higher -j can work even better; something
like 8 or even 12 works pretty well. (Depends on whether I want to play mp3s
while building or not :) )

If you're not using softupdates, or your FS's are mounted normally,
how well do you fare with 'async'? And with a clean slate (without NOCLEAN)?
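
For reference, roughly what I mean; an untested sketch, where the device name
and mount point are only examples (soft updates also has to be compiled into
the kernel, and tunefs has to run on an unmounted filesystem):

# enable soft updates on the filesystem holding /usr/obj
umount /usr/obj
tunefs -n enable /dev/da1s1e
mount /usr/obj

# or just remount it async for the duration of the build
mount -u -o async /usr/obj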

thanks,
-Alfred

Matthew D. Fuller

Dec 2, 1999
On Wed, Dec 01, 1999 at 11:29:41PM -0800, a little birdie told me
that Alfred Perlstein remarked

> On Wed, 1 Dec 1999, Doug Barton wrote:
> >
> > Other than that I just do a straight 'make -j4 -DNOCLEAN world'. That's
> > with both directories new and clean. If I'm doing a second make world I
> > can get it done in just under 42 minutes. *grin*
>
> How are your filesystems mounted/what kind of disks?
>
> Personally I've found that a higher -j can work even better, something
> like 8 or even 12 works pretty good. (depends if I want to play mp3s
> while building or not :) )

As a data point, I just started a buildworld when I left work last night.
This is a dual PPro 200/512k, 256 meg RAM, /usr/src and /usr/obj on
separate 7200 RPM UW SCSI drives on an aic7880 controller.
After the build:
/dev/da3s1e on /usr/obj (ufs, asynchronous, NFS exported, local, noatime,
nosuid, writes: sync 82 async 44465)
/dev/da4s1e on /usr/src (ufs, asynchronous, NFS exported, local, noatime,
nosuid, writes: sync 6 async 44600)

mortis:/usr/src
root% time nice +20 make -j6 buildworld
....
7111.969u 2208.024s 1:38:44.38 157.3% 1386+1422k 32924+4824io 5880pf+0w

Could probably push it a bit faster with a higher -j (8 or 10), but I
didn't. X was running at the time, along with Netscape, ~30 xterms, and
a few other sundry thingymabobs.

--
Matthew Fuller (MF4839) | full...@over-yonder.net
Unix Systems Administrator | full...@futuresouth.com
Specializing in FreeBSD | http://www.over-yonder.net/
FutureSouth Communications | ISPHelp ISP Consulting

"The only reason I'm burning my candle at both ends, is because I
haven't figured out how to light the middle yet"

will andrews

Dec 8, 1999
On 02-Dec-99 Doug Barton wrote:
> Well, I would lay money that my crappy IDE drives are crappier
> than yours. :) Now that my project is "official" as opposed to
> "experimental" I'm working on getting some better ones.

My (single) PII-450 w/ 128MB SDRAM + U2W SCSI LVD 10000rpm disks covered a Nov.
16 -STABLE make world in about 1 hour, 15 minutes. That's _with_ softupdates
and without anything of value (CPU-wise) in the background other than rc5des.

And that's without a -j flag. I don't like -j. :-)

> Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
> lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all

It _SHOULD_ go faster with SCSI as opposed to (E)IDE/UDMA/etc. Say, a dt of
about 10-20 minutes, depending on overall bus speed & other minor factors.

There is no such thing as a "good" non-SCSI controller. ;)

--
Will Andrews <and...@technologist.com>
GCS/E/S @d- s+:+>+:- a--->+++ C++ UB++++ P+ L- E--- W+++ !N !o ?K w---
?O M+ V-- PS+ PE++ Y+ PGP+>+++ t++ 5 X++ R+ tv+ b++>++++ DI+++ D+
G++>+++ e->++++ h! r-->+++ y?

Dag-Erling Smorgrav

Dec 8, 1999
will andrews <and...@technologist.com> writes:
> On 02-Dec-99 Doug Barton wrote:
> > Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
> > lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all
> It _SHOULD_ go faster with SCSI as opposed to (E)IDE/UDMA/etc.

Why, because "Scuzzy" is a cooler name than "Eye-dee-ee"? SCSI has
higher overhead than IDE, so for a single-disk system (or a two-disk
system, provided each is on a separate IDE bus), IDE wins (given
otherwise identical disks, of course).

DES
--
Dag-Erling Smorgrav - d...@flood.ping.uio.no

David Scheidt

Dec 8, 1999
On 8 Dec 1999, Dag-Erling Smorgrav wrote:

> will andrews <and...@technologist.com> writes:
> > On 02-Dec-99 Doug Barton wrote:
> > > Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
> > > lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all
> > It _SHOULD_ go faster with SCSI as opposed to (E)IDE/UDMA/etc.
>
> Why, because "Scuzzy" is a cooler name than "Eye-dee-ee"? SCSI has
> higher overhead than IDE, so for a single-disk system (or a two-disk
> system, provided each is on a separate IDE bus), IDE wins (given
> otherwise identical disks, of course).

Sun claims this about the Ultra 5 workstation. The problem with this theory
is that "otherwise identical disks" don't seem to exist for IDE
disks. The Ultra 5s have been no end of trouble with their disks, at least
until they get Ultra-SCSI ones.

Dag-Erling Smorgrav

Dec 10, 1999
David Scheidt <dsch...@enteract.com> writes:
> Sun claims this about the Ultra 5 workstation. The problem with this theory
> seems to be that "otherwise identical disks" don't seem to exist in IDE
> disks.

Yes, they do. Check the manufacturers' web sites if you don't believe
me. All but the most expensive models are available with IDE or ATAPI
interfaces (usually labeled N or A) as well as with various types of
SCSI interfaces (labeled S, W, LW etc.)

> disks. The ultra 5's have been no end of trouble with their disks, at least
> until they get ultra-SCSI ones.

Because Sun tried to shave a few bucks off the cost by equipping them
with bottom-of-the-line disks (Seagate Medalist). If the Ultra 5
shipped with e.g. IBM DeskStar disks, there wouldn't be any trouble.

Terry Lambert

Dec 10, 1999
> > > Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
> > > lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all
> > It _SHOULD_ go faster with SCSI as opposed to (E)IDE/UDMA/etc.
>
> Why, because "Scuzzy" is a cooler name than "Eye-dee-ee"? SCSI has
> higher overhead than IDE, so for a single-disk system (or a two-disk
> system, provided each is on a separate IDE bus), IDE wins (given
> otherwise identical disks, of course).

FWIW, while the IDE specification supports tagged command queues
to allow more than one disk transaction to be outstanding, there
are no IDE drives currently available that support this (IBM has
run some in some labs, but there was no real interest in getting
them out, and I am told the project was scrapped for lack of
controller support on other than lab-based controllers).

This means that for server systems, a SCSI drive with a tagged
command queue depth of 128 (common on a number of IBM drives,
just to keep the vendor the same) can support 128 times as much
concurrency as an IDE drive, everything else about the drive
being equal.
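
For what it's worth, on FreeBSD's CAM layer you can see (and cap) how many
tags the driver keeps outstanding per device; a sketch, with the device name
and tag count only as examples:

# show how many tagged openings da0 is currently allowed
camcontrol tags da0 -v

# cap it at 64 outstanding tags
camcontrol tags da0 -N 64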


It's constantly amazing to me that the same people who state
that FreeBSD should not go after the desktop and should not
have graphical logins and other desktop workstation fluff are
the same people who claim that IDE is as good as, or better
than, SCSI.

Perhaps for a single user workstation, IDE _is_ better than
SCSI. All of the benchmarks that claim this are non-concurrent,
after all, just like the one application likely to be running
at a time on a single user workstation.

For heavily loaded servers, however, there is absolutely no
comparison: SCSI wins because of concurrency, and latency for
single-user, single-threaded operations be damned.


PS: My SCSI-based, mirrored disk array in my NOC is capable
of handling all of BEST Internet Inc.'s mail for a full month
in just under 48 hours... for in excess of 10,000 transiently
connected servers sending it ETRNs; what's your IDE-based NOC
capable of?


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Brett Glass

Dec 10, 1999
At 07:13 PM 12/9/1999 , Terry Lambert wrote:

>It's constantly amazing to me that the same people who state
>that FreeBSD should not go after the desktop and should not
>have graphical logins and other destop workstation fluff, are
>the same people who claim that IDE is as good as, or better
>than, SCSI.
>
>Perhaps for a single user workstation, IDE _is_ better than
>SCSI. All of the benchmarks that claim this are non-concurrent,
>after all, just like the one application likely to be running
>at a time on a single user workstation.
>
>For heavily loaded servers, howwever, there is absolutely no
>comparison: SCSI wins because of concurrency, and latency for
>single-user, single-threaded operations be damned.

I think the real question is, "Can the host CPU provide that
concurrency as effectively as the embedded CPU in the drive?"

If it can at least get close, one probably won't see a significant
difference in performance; the host CPU will do the same scheduling
that the embedded processor would, and the raw hardware (the disk
and heads) will be used about as efficiently.

I honestly don't know how clever the controllers in Joe SCSI
Drive are, or how boneheaded (or smart!) a UNIX file system can
be as regards efficient use of the disk, so I'll admit that
I don't know if this is the way it works on a FreeBSD system.
I do suspect that in a RAID system, it might actually be *better*
to have IDE drives in the array and a SCSI interface to the
computer, since the RAID controller is expected to take
concurrency and head position into account. And the "I" in
RAID does stand for "inexpensive," which means that, these
days, it might as well stand for "IDE."

--Brett

Jamie Bowden

Dec 10, 1999
On Fri, 10 Dec 1999, Brett Glass wrote:

:I do suspect that in a RAID system, it might actually be *better*

:to have IDE drives in the array and a SCSI interface to the
:computer, since the RAID controller is expected to take
:concurrency and head position into account. And the "I" in
:RAID does stand for "inexpensive," which means that, these
:days, it might as well stand for "IDE."

Inexpensive != Cheap

Jamie Bowden

--

"Of course, that's sort of like asking how other than Marketing, how Microsoft is different from any other software company..."
Kenneth G. Cavness

Brett Glass

Dec 10, 1999
At 06:59 AM 12/10/1999 , Jamie Bowden wrote:

>Inexpensive != Cheap

Actually, RAID systems were conceived as a way of using *very*
cheap drives from companies you've probably never heard of -- such
as Kalok -- without worrying so much about failures. You knew that
these drives would fail every so often, but the odds that more than
one would fail at a time were tiny. So, buying a few more drives
at a greatly reduced cost (and getting more speed on top of that)
was a win.

--Brett

Jay Nelson

Dec 11, 1999
On Fri, 10 Dec 1999, Terry Lambert wrote:

>> > > Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
>> > > lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all
>> > It _SHOULD_ go faster with SCSI as opposed to (E)IDE/UDMA/etc.

[snip]

>This means that for server systems, A SCSI drive with a tagged
>command queue depth of 128 (common on a number of IBM drives,
>just to keep the vendor the same) can support 128 times as much
>concurrency as an IDE drive, everything else about the drive
>being equal.

This may be a stupid question, but would soft updates improve IDE
performance in relation to SCSI? Or would it simply block longer less
often?

-- Jay

Terry Lambert

Dec 11, 1999
> >> > > Yeah, the new box I'm evaluating has SCA LVD SCSI, and it goes a
> >> > > lot faster. I'm compiling -Stable and so far -j 6, 8 and 12 have all
> >> > It _SHOULD_ go faster with SCSI as opposed to (E)IDE/UDMA/etc.
>
> [snip]
>
> >This means that for server systems, A SCSI drive with a tagged
> >command queue depth of 128 (common on a number of IBM drives,
> >just to keep the vendor the same) can support 128 times as much
> >concurrency as an IDE drive, everything else about the drive
> >being equal.
>
> This may be a stupid question, but would soft updates improve IDE
> performance in relation to SCSI? Or would it simply block longer less
> often?

Soft updates speak to the ability to stall dependent writes
until they either _must_ be done, or until they are no longer
relevant. They do this as a strategy for ordering the metadata
updates (other methods are DOW - Delayed Ordered Writes - and
synchronous writing of metadata... in decreasing order of
performance).

Soft updates basically represents possible file system operations
as a directed acyclic graph, and then registers dependency
resolvers that resolve dependencies between operations on an
edge (this is why Kirk often calls them "soft dependencies"
instead of "soft updates", the original name used by Ganger and
Patt).

For operations where the cached metadata updates that have been
delayed result in a cache hit and a rewrite of the update, you
can expect high performance from soft updates. They also do an
implicit implementation of what is commonly called "write gathering"
by virtue of being able to resolve metadata updates that revert
data before it has to go to disk. This is why when you copy a
huge tree, then delete it, and you do it within 30 seconds (or
whatever you've set your syncerd for), the only thing that gets
updated is the access and modification times of the directory
in which you did the work.
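
You can watch that gathering happen with nothing fancier than iostat; a rough,
untested sketch, assuming soft updates is enabled on the filesystem holding
/tmp, and with the source tree and sleep interval chosen arbitrarily:

# watch disk traffic while copying and removing a tree inside one syncer interval
iostat -w 1 &
IOSTAT_PID=$!
cp -R /usr/src/bin /tmp/softdep-demo
rm -rf /tmp/softdep-demo
sleep 60        # wait out the syncer; very little of the copied data ever hits the disk
kill $IOSTAT_PID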

But then you must add in locality of reference. The locality
of reference theorem states that applications tend to operate
against data sets which are localized, and that different
applications will operate against different localities by
virtue of operating against different data sets.

In this case, you have soft updates graphs with dependencies
hooked off of them, but the directories to which these graphs
apply are members of non-intersecting sets.

This means that there is no dependency-related stalling as a
result of the application's modifications to file system data or
metadata, and thus, if it can be supported, the operations can
be interleaved to stable storage.

So the very short answer to the question is "on a multi-application
server, using soft updates doesn't mean that you wouldn't benefit
from interleaving your I/O".


To speak to Brett's issue of RAID 5, parity is striped across
all disks, and doing parity updates on one stripe on one disk
will block all non-interleaved I/O to all other areas of the
disk; likewise, doing a data write will prevent a parity write
from completing for the other four disks in the array.

This effectively means that, unless you can interleave I/O
requests, as tagged command queues do, you are much worse off.
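
(To put a number on that for the common small-write case: a write that can't
be gathered into a full stripe turns into a read-modify-write,

	new parity = old parity XOR old data XOR new data

i.e. two reads plus two writes, four disk operations per logical write, where
a plain mirror needs only two writes.)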

As to the issue of spindle sync, which Brett alluded to, I
don't think that it is supported for IDE, so you will be
eating a full rotational latency on average, instead of one
half a rotational latency, on average: (0 + 1)/2 vs. (1 + 1)/2.
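
(Rough numbers, for a hypothetical 7,200 rpm spindle: one revolution takes
60/7200 s, about 8.3 ms, so the expected wait is roughly 4.2 ms when you
average half a rotation and roughly 8.3 ms when you average a full one.)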

Rod Grimes did some experimentation with CCD and spindle sync
on SCSI devices back when CCD first became capable of mirroring,
and has some nice hard data that you should ask him for (or dig
it out of DejaNews on the FreeBSD news group).


As I said, tagged command queues are supported by the IDE
standard; they just aren't supported by commonly available
IDE disks or controllers.

Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Jay Nelson

Dec 11, 1999
I've pruned the cc list so people don't have to read this twice.

On Sat, 11 Dec 1999, Terry Lambert wrote:

[snip]

>Soft updates speak to the ability to stall dependent writes
>until they either _must_ be done, or until they no longer are
>relevent. It does this as a strategy for ordering the metadata
>updates (other methods are DOW - Delayed Ordered Writes - and
>synchronous writing of metadata... in decreasing order of
>performance).

It's this ability to delay and gather writes that prompted the
question. If a SCSI bus can handle 8-12MB/s with tagged queuing and
UltraDMA can do 30MB/s while blocking, where do the performance lines
cross -- or do they? As the number of spindles goes up, I would expect
SCSI to outperform IDE -- but on a single drive system, do writes to
an IDE drive at UDMA speeds block less than gathered writes to a drive on a
nominally slower SCSI bus?

[snip]

>So the very short answer to the question is "on a multi-applicaiton
>server, using soft updates doesn't mean that you wouldn't benefit
>from interleaving your I/O".

Hmm... that suggests you also might not. On a single drive system
with soft updates, would an Ultra IDE perform worse, on par or better
than SCSI with a light to moderate IO load?

>To speak to Brett's issue of RAID 5, parity is striped across
>all disks, and doing parity updates on one stripe on one disk
>will block all non-interleaved I/O to all other areas of the
>disk; likewise, doing a data write will prevent a parity write
>from completing for the other four disks in the array.

I have seen relatively few benefits of RAID 5. Performance sucks
relative to mirroring across separate controllers, and until you reach
10-12 drives, the cost is about the same. I never thought about the
parity, though. Now that I have, I like RAID 5 even less.

>This effectively means that, unless you can interleave I/O
>requests, as tagged command queues do, you are much worse off.

Applying that to the single drive IDE vs. SCSI question suggests that,
even with higher bus burst speeds, I'm still likely to end up worse
off, depending on load, than I would with SCSI -- soft updates
notwithstanding. Is that correct?

>As to the issue of spindle sync, which Brett alluded to, I
>don't think that it is supported for IDE, so you will be
>eating a full rotational latency on average, instead of one
>half a rotational latency, on average: (0 + 1)/2 vs. (1 + 1)/2.

I think that just answered my earlier question.

>Rod Grimes did some experimentation with CCD and spindle sync
>on SCSI devices back when CCD first became capable of mirroring,
>and has some nice hard data that you should ask him for (or dig
>it out of DejaNews on the FreeBSD news group).

Thanks -- I'll look them up. And -- I appreciate your answer. I
learned quite a bit from it. It did raise the question of differences
between soft updates and lfs -- but I'll save that for another time.

-- Jay

David Scheidt

Dec 11, 1999
On Fri, 10 Dec 1999, Jay Nelson wrote:

> On Sat, 11 Dec 1999, Terry Lambert wrote:
>
> Hmm... that suggests you also might not. On a single drive system
> with soft updates, would an Ultra IDE perform worse, on par or better
> than SCSI with a light to moderate IO load?

Under light to moderate IO loads, the disk interface isn't likely to be the
overall limiting factor on the machine. You certainly save some money by
going with IDE. On a low-end box, perhaps as much as 15 or 20% of the total
cost of the machine. Once you move away from the bottom end, or you want
more than a couple disks, SCSI looks much better.

David

Brett Glass

Dec 11, 1999
At 10:45 PM 12/10/1999 , David Scheidt wrote:

>Under light to moderate IO loads, the disk interface isn't likely to be the
>overall limiting factor on the machine. You certainly save some money by
>going with IDE. On a low-end box, perhaps as much as 15 or 20% of the total
>cost of the machine. Once you move away from the bottom end, or you want
>more than a couple disks, SCSI looks much better.

Why wouldn't IDE retain an advantage -- so long as you put the disks on
separate controllers to avoid having one block another? (I like
SCSI too, but given the realities -- or unrealities -- of hard drive
pricing I'm always looking to milk more performance out of IDE drives
when I can.)

--Brett

David Scheidt

Dec 11, 1999
On Fri, 10 Dec 1999, Brett Glass wrote:

> At 10:45 PM 12/10/1999 , David Scheidt wrote:
>
> >Under light to moderate IO loads, the disk interface isn't likely to be the
> >overall limiting factor on the machine. You certainly save some money by
> >going with IDE. On a low-end box, perhaps as much as 15 or 20% of the total
> >cost of the machine. Once you move away from the bottom end, or you want
> >more than a couple disks, SCSI looks much better.
>
> Why wouldn't IDE retain an advantage -- so long as you put the disks on
> separate controllers to avoid having one block another? (I like
> SCSI too, but given the realities -- or unrealities -- of hard drive
> pricing I'm always looking to milk more performance out of IDE drives
> when I can.)

For the highest level of performance, you really must have each disk on its
own IDE channel. I don't have much experience with machines with lots of
IDE disks. The most I have worked with is 4 IDE disks, with two on the
onboard controller and two on a PCI card controller. The machine didn't
seem to do as many IO transactions per second as a similar machine with 4
LVD SCSI disks.

David

Brett Glass

Dec 11, 1999
At 11:32 AM 12/11/1999 , David Scheidt wrote:

>For the highest level of performance, you really must have each disk on its
>own IDE channel. I don't have much experience with machines with lots of
>IDE disks. The most I have worked with is 4 IDE disks, with two on the
>onboard controller and two on a PCI card controller. The machine didn't
>seem to do as many IO transactions per second as a similiar machine with 4
>LVD SCSI disks.

AFAIK, IDE doesn't really have an equivalent of disconnect/reconnect (though
some vendors have tried to implement something like it). So, by having
more than one drive per interface, you're likely to slow things down
because one drive must wait for the other.

The IDE interface is cheap TTL, though -- a lot cheaper than SCSI. So you
really CAN have an interface per drive at reasonable expense. At that
point, I suspect that the smartness of the OS, and the speed of the host
CPU, would determine performance.

I sometimes long for the days of ESDI, where the host could control
EVERYTHING about the way the drive was read and written. (I did
some experiments which involved positioning the head on a blank track
when a write was expected, then writing the data to the very next
sector that came along while simultaneously updating an intention FIFO
for the metadata on a different spindle. The metadata itself was updated
later, so the intention log kept things from getting out of sync
due to power loss. You could pull the plug on that machine when the
disks were grinding away and never lose a thing.)

I also wrote some pretty good disk cache software during that era -- mostly
in assembler. I decribed how I did it in an article in BYTE circa 1985, and
even threw in a tool that let reader test the effectiveness of different
algorithms in real life situations.

--Brett

David Kelly

Dec 12, 1999
Brett Glass writes:
> I sometimes long for the days of ESDI, where the host could control
> EVERYTHING about the way the drive was read and written. (I did
> some experiments which involved positioning the head on a blank track
> when a write was expected, then writing the data to the very next
> sector that came along while simultaneously updating an intention FIFO
> for the metadata on a different spindle. The metadata itself was updated
> later, so the intention log kept things from getting out of sync
> due to power loss. You could pull the plug on that machine when the
> disks were grinding away and never lose a thing.)

How similar is that to the log partition in SGI's XFS? There was no
restriction as to what spindle the log filesystem was placed on. Quite to
the contrary, it was indicated that using a separate drive on a separate
SCSI bus would help performance.

XFS for Linux was to be released by now. I haven't been paying
attention. Was it?


--
David Kelly N4HHE, dke...@hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its
capacity -- the rest is overhead for the operating system.

Jay Nelson

Dec 12, 1999
On Sat, 11 Dec 1999, David Kelly wrote:

[snip]

>How similar is that to the log partition in SGI's XFS? There was no
>restriction as to what spindle the log filesystem was placed. Quite to
>the contrary, it was indicated using a separate drive on a separate
>SCSI bus would help performance.

XFS sounds a lot like AIX's JFS. Which raises the question: What is
the connection between BSD's lfs, soft updates, SGI's XFS and AIX's
jfs? Don't they all do essentially the same thing except for where the
log is written?

Also -- and this is just curiosity -- why did we go with soft updates
instead of finishing lfs? Aside from the fact that soft updates
appears cleaner than lfs, is there any outstanding superiority of one
over the other?

Finally, has anyone used soft updates with vinum?

-- Jay

Kris Kennaway

Dec 12, 1999
On Sat, 11 Dec 1999, Jay Nelson wrote:

> On Sat, 11 Dec 1999, David Kelly wrote:
>
> [snip]
>
> >How similar is that to the log partition in SGI's XFS? There was no
> >restriction as to what spindle the log filesystem was placed. Quite to
> >the contrary, it was indicated using a separate drive on a separate
> >SCSI bus would help performance.
>
> XFS sounds a lot like AIX's JFS. Which raises the question: What is
> the connection between BSD's lfs, soft updates, SGI's XFS and AIX's
> jfs? Don't they all do essentially the same thing except for where the
> log is written?
>
> Also -- and this is just curiosity, why did we go with soft updates
> instead of finishing lfs? Aside from the fact that soft updates
> appears cleaner than lfs, is there any outstanding superiority of one
> over the other?

These are FAQs - instead of wasting people's cycles explaining it again,
you'd probably be better served just checking the archives. Terry has
posted about it extensively in past threads.

> Finally, has anyone used soft updates with vinum?

There should be no reason why it won't work, as they're orthogonal
systems. Again, check the archives.

Kris

Jay Nelson

Dec 12, 1999
On Sat, 11 Dec 1999, Kris Kennaway wrote:

[snip]

>> Also -- and this is just curiosity, why did we go with soft updates
>> instead of finishing lfs? Aside from the fact that soft updates
>> appears cleaner than lfs, is there any outstanding superiority of one
>> over the other?
>
>These are FAQs - instead of wasting peoples cycles in explaining it again

I'm sure you're right, but I couldn't find the answer in the FAQ I
supped this morning. Is there a different FAQ?

>you'd probably be better served just checking the archives. Terry has
>posted about it extensively in past threads.

Terry's posts did answer a number of questions. Specifically, that lfs
and soft updates both can only roll a file system back to a known
good state -- unlike a journaled file system, which is capable of
rolling forward to a known state. Neither lfs nor soft updates
appears to have much to do with journaling. Still, I didn't find
anything that explained the decision to go with soft updates. Perhaps
I missed the relevant threads. Were they prior to '98?

Sorry for wasting your cycles.

-- Jay

David Scheidt

Dec 12, 1999
On Sat, 11 Dec 1999, Kris Kennaway wrote:

> On Sat, 11 Dec 1999, Jay Nelson wrote:
> > Finally, has anyone used soft updates with vinum?
>
> There should be no reason why it won't work, as they're orthogonal
> systems. Again, check the archives.

There were some issues with RAID-5 vinum and softupdates. I don't use
vinum, so I don't know if they were fixed or not.

David Scheidt

Kris Kennaway

Dec 12, 1999
On Sat, 11 Dec 1999, David Scheidt wrote:

> > There should be no reason why it won't work, as they're orthogonal
> > systems. Again, check the archives.
>
> There were some issues with RAID-5 vinum and softupdates. I don't use
> vinum, so I don't know if they were fixed or not.

I thought these were more a case of one exposing bugs in the other due to
an abnormal use pattern. In any case, there "should be no reason why it
won't work" :-)

Kris

Mattias Pantzare

Dec 12, 1999
> >How similar is that to the log partition in SGI's XFS? There was no
> >restriction as to what spindle the log filesystem was placed. Quite to
> >the contrary, it was indicated using a separate drive on a separate
> >SCSI bus would help performance.
>
> XFS sounds a lot like AIX's JFS. Which raises the question: What is
> the connection between BSD's lfs, soft updates, SGI's XFS and AIX's
> jfs? Don't they all do essentially the same thing except for where the
> log is written?

No. lfs is a logging filesystem. You only have a log that contains everything,
including all your files. The downside to this is that you have to have a
garbage collector that cleans deleted data from the log. The good thing is
that you never have to seek for writes. All writes are to the end of the log.
http://www.cs.berkeley.edu/projects/sprite/papers/ has some papers on lfs.

A journaling filesystem is like a normal filesystem, but you have a transaction
log that turns synchronous writes into a synchronous write to the log and an
asynchronous write to the normal filesystem. This avoids seeks when latency is
important.

Soft updates do not have a log at all. Take a look at
http://www.ece.cmu.edu/~ganger/papers/CSE-TR-254-95/.


Lfs will not roll in the normal sense; it will simply discard the half-done
write at the end of the log if there is one.

Soft updates can't do rolling as there is no log.

Brett Glass

Dec 12, 1999
At 05:47 PM 12/11/1999 , David Kelly wrote:

>How similar is that to the log partition in SGI's XFS?

I actually don't know, as the product was for DOS and Netware -- not
UNIX.

--Brett

David Kelly

Dec 13, 1999
Jay Nelson writes:
> Terry's posts did answer a number of questions. Specifically that lfs
> and soft updates both could only roll a file system back to a known
> good state -- instead of a journaled file system which is capable of
> rolling forward to a known state. Neither lfs or soft updates
> appear to have much to do with journaling. Still, I didn't find
> anything that explained the decision to go with soft updates. Perhaps
> I missed the relevant threads. Were they prior to '98?

I believe the correct answer as to why today we have soft updates
rather than lfs is simply the fact that Dr. McKusick tackled soft updates
and made it reliable and easy to apply before anybody got lfs to the
same state.

Can't find any mention of whether XFS for Linux has been released. May 1999
announcement that SGI intends to:
http://www.sgi.com/developers/oss/sgi_resources/feature5.html

More info on XFS:
http://www.sgi.com/Technology/xfs-whitepaper.html

--
David Kelly N4HHE, dke...@hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its
capacity -- the rest is overhead for the operating system.

Terry Lambert

Dec 14, 1999
> On Sat, 11 Dec 1999, Terry Lambert wrote:
>
> [snip]
>
> >Soft updates speak to the ability to stall dependent writes
> >until they either _must_ be done, or until they no longer are
> >relevent. It does this as a strategy for ordering the metadata
> >updates (other methods are DOW - Delayed Ordered Writes - and
> >synchronous writing of metadata... in decreasing order of
> >performance).
>
> It's this ability to delay and gather writes that prompted the
> question. If a SCSI bus can handle 8-12MB with tagged queuing and
> UltraDMA can do 30MB while blocking, where do the performance lines
> cross -- or do they? As the number of spindles go up, I would expect
> SCSI to outperform IDE -- but on a single drive system, do writes to
> an IDE at UDMA speeds block less than gathered writes to a drive on a
> nominally slower SCSI bus?

The question you have to ask is whether or not your I/O requests
are going to be interleaved (and therefore concurrently outstanding)
or not.

You only need four outstanding concurrent requests for your 8M
SCSI bus to beat a 30M UltraDMA. I'll note for the record here
that your PCI bus is capable of 133M burst, and considerably
less when doing continuous duty, so bus limits will hit before
that (I rather suspect your 30M number, if it's real and not a
"for instance", is a burst rate, not a continuous transfer rate,
and depends either on pre-reading or a disk read cache hit).
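
(Presumably the four comes from the ratio of the two figures: 30 MB/s divided
by 8 MB/s is roughly 4, so about four requests kept in flight, with their seek
and rotational waits overlapped, are enough to make up for the slower bus.)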


Effectively, what you are asking is "if I have a really fast
network interface, can I set my TCP/IP window size down to
one frame per window, and get better performance than a not
quite as fast interface using a large window?".

The answer depends on your latency. If you have zero latency,
then you don't need the ability to interleave I/O. If you have
non-zero latency, then interleaving I/O will help you move
more data.

For tagged command queues, the questions are:

o are your seek times so fast that elevator sorting the
requests (only possible if the drive knows about two
or more requests) will not yield better performance?

o is your transfer latency so low that interleaving
your I/O will not yield better performance?

o is all of your I/O so serialized, because you are only
actively running a single application at a time, that
you won't benefit from increased I/O concurrency?

> [snip]
>
> >So the very short answer to the question is "on a multi-applicaiton
> >server, using soft updates doesn't mean that you wouldn't benefit
> >from interleaving your I/O".
>

> Hmm... that suggests you also might not. On a single drive system
> with soft updates, would an Ultra IDE perform worse, on par or better
> than SCSI with a light to moderate IO load?

Worse, if there is any significance to the I/O, so long as
you have non-zero latency.

Unfortunately, with drives which lie about their true geometry,
and which claim to be "perfect" by automatically redirecting
bad sectors, it's not possible to elevator sort in the OS (this
used to be common, and variable/unknown drive geometry is why
this tuning is currently turned off by default in newfs).


> >To speak to Brett's issue of RAID 5, parity is striped across
> >all disks, and doing parity updates on one stripe on one disk
> >will block all non-interleaved I/O to all other areas of the
> >disk; likewise, doing a data write will prevent a parity write
> >from completing for the other four disks in the array.
>
> I have seen reletavely few benefits of RAID 5. Performance sucks
> relative to mirroring across separate controllers and until you reach
> 10-12 drives, the cost is about the same. I never thought about the
> parity, though. Now that I have, I like RAID 5 even less.

If your I/O wasn't being serialized by your drives/controller,
the parity would be much less of a performance issue.


> >This effectively means that, unless you can interleave I/O
> >requests, as tagged command queues do, you are much worse off.
>
> Applying that to the single drive IDE vs. SCSI question suggests that,
> even with higher bus burst speeds, I'm still lkely to end up worse
> off, depending on load, than I would with SCSI -- soft updates
> not withstanding. Is that correct?

Yes, depending on load.

For a single user desktop connected to a human, generally you
only run one application at a time, and so serialized I/O is
OK, even if you are doing streaming media to or from the disk.

The master/slave bottleneck and multimedia are the reason
that modern systems with IDE tend to have two controllers,
with one used for the main disk, and the other used for the
CDROM/DVD.


> >As to the issue of spindle sync, which Brett alluded to, I
> >don't think that it is supported for IDE, so you will be
> >eating a full rotational latency on average, instead of one
> >half a rotational latency, on average: (0 + 1)/2 vs. (1 + 1)/2.
>
> I think that just answered my earlier question.

This is a RAID-specific issue. Most modern drives in a one
drive system record sectors in descending order on a track. As
soon as the seek completes, it begins caching data, and returns
the data you asked for as soon as it has cached back far enough.
For single application sequential read behaviour, this pre-caches
the data that you are likely to ask for next.

A different issue that probably applies to what you meant is
for a multiple program machine with two or more readers, the
value of the cache is greatly reduced, unless the drive supports
multiple track caches, one per reader, in order to preserve
locality of reference for each reader. This one is not RAID
specific, but is load specific.

In general, good SCSI drives will support a track cache per
tagged command queue.

It would be interesting to test algorithms for increasing I/O
locality to keep the number of files on which I/O is outstanding
under the number of track caches (probably by delaying I/O for
files that were not in the last track cache count minus one I/Os
in favor of I/O requests that are).


> >Rod Grimes did some experimentation with CCD and spindle sync
> >on SCSI devices back when CCD first became capable of mirroring,
> >and has some nice hard data that you should ask him for (or dig
> >it out of DejaNews on the FreeBSD news group).
>
> Thanks -- I'll look them up. And -- I appreciate your answer. I
> learned quite a bit from it. It did raise the question of differences
> between soft updates and lfs -- but I'll save that for another time.

8-).


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Terry Lambert

Dec 14, 1999
> >Under light to moderate IO loads, the disk interface isn't likely to be the
> >overall limiting factor on the machine. You certainly save some money by
> >going with IDE. On a low-end box, perhaps as much as 15 or 20% of the total
> >cost of the machine. Once you move away from the bottom end, or you want
> >more than a couple disks, SCSI looks much better.
>
> Why wouldn't IDE retain an advantage -- so long as you put the disks on
> separate controllers to avoid having one block another? (I like
> SCSI too, but given the realities -- or unrealities -- of hard drive
> pricing I'm always looking to milk more performance out of IDE drives
> when I can.)

I will let you in on a "secret": SCSI drives cost more because
that's what the market will bear, based on their performance
characteristics relative to IDE.

They cost the same to manufacture; it doesn't matter what mask
you use to burn your 1 square inch ASIC.

FWIW: IBM has demonstrated IDE hardware with tagged command queues,
but is not manufacturing it (so far as I have been able to tell).
The only other remaining thorn in IDE is the need to add more
controllers after you have two disks installed; SCSI has it rather
well beat there.

Terry Lambert

Dec 14, 1999
> > I sometimes long for the days of ESDI, where the host could control
> > EVERYTHING about the way the drive was read and written. (I did
> > some experiments which involved positioning the head on a blank track
> > when a write was expected, then writing the data to the very next
> > sector that came along while simultaneously updating an intention FIFO
> > for the metadata on a different spindle. The metadata itself was updated
> > later, so the intention log kept things from getting out of sync
> > due to power loss. You could pull the plug on that machine when the
> > disks were grinding away and never lose a thing.)
>
> How similar is that to the log partition in SGI's XFS? There was no
> restriction as to what spindle the log filesystem was placed. Quite to
> the contrary, it was indicated using a separate drive on a separate
> SCSI bus would help performance.

You can look at the logging code, and find out. SGI has put the
logging code up for download as the first installment on XFS.

The short answer is that XFS doesn't require a separate spindle,
and unless you turned off write caching most modern drives tend
to lie about stuff having been committed to stable storage, and
tend to do The Wrong Thing after a power failure, which is that they
fail to write the data they have in cache before they are done
spinning down. A cache that crosses a seek boundary (e.g. for
bad sector sparing) is particularly at risk.


> XFS for Linux was to be released by now. I haven't been paying
> attention. Was it?

Right now, it is vapor-ware. They have still not cleaned out the
USL code enough to be able to release it. I have heard rumors
that they attempted to clean room it, but can't find "virgins" to
write the new code that are also capable of doing the job.

Jay Nelson

Dec 14, 1999
Thanks for the reply and the references. Your answer _should_ be a
FAQ. It would save the repeat questions.

-- Jay

On Sun, 12 Dec 1999, Mattias Pantzare wrote:

>> >How similar is that to the log partition in SGI's XFS? There was no
>> >restriction as to what spindle the log filesystem was placed. Quite to
>> >the contrary, it was indicated using a separate drive on a separate
>> >SCSI bus would help performance.
>>

>> XFS sounds a lot like AIX's JFS. Which raises the question: What is
>> the connection between BSD's lfs, soft updates, SGI's XFS and AIX's
>> jfs? Don't they all do essentially the same thing except for where the
>> log is written?
>
>No. lfs is a logging filesystem. You only have a log that contains everything,
>including all your files. The downside to this is that you have to have a
>garbage collector that cleans deleted data from the log. The good thing is
>that you never have to seek for writes. All writes are to the end of the log.
>http://www.cs.berkeley.edu/projects/sprite/papers/ has some papers on lfs.
>
>A journaling filesystem is like a normal filesystem but you have a transaction
>log that turns synchronous writes into a synchronous write to the log and a
>asynchronous write to the normal filesystem. This avoids seeks when latency is
>important.
>
>Soft updates do not have a log att all. Take a look at
> http://www.ece.cmu.edu/~ganger/papers/CSE-TR-254-95/.
>
>
>Lfs will not roll in the normal sense, it will simply discard the half done
>write at the end of the log if there is one.
>
>Soft updates can't do rolling as there is no log.
>


Jay Nelson

Dec 14, 1999
This answers the questions. Thanks, Terry. I've left the whole message
intact for the benefit of others searching the archives.

-- Jay

Terry Lambert

Dec 14, 1999
> >> Also -- and this is just curiosity, why did we go with soft updates
> >> instead of finishing lfs? Aside from the fact that soft updates
> >> appears cleaner than lfs, is there any outstanding superiority of one
> >> over the other?
> >
> >These are FAQs - instead of wasting peoples cycles in explaining it again
>
> I'm sure you're right, but I couldn't find the answer in the FAQ I
> supped this morning. Is there a different FAQ?

They are FAQs, not "in the FAQ".

The archives you should be looking at, and the place you should be
asking the question are the freebsd-fs list.

Soft Updates was implemented because Whistle paid Kirk to do the
work, as well as throwing in some of Julian's time and my time in
the bargain. The reason for doing the work was to get rid of the
UPS circuitry and heavy battery in the next generation Whistle
InterJet product.

LFS wasn't finished because the implementation is incomplete (but
only minorly so), and because it was not kept up to date with VM
and other system interface changes (IMO: you change the interface,
you're responsible for fixing all the code that uses it). The
minor missing piece was a "cleaner" daemon to follow behind and
reclaim logs that are no longer referenced by inodes. It's pretty
trivial to write one of these.

Frankly, logging solves different problems than soft updates, and
the technology is orthogonal. Soft Updates solves the metadata
ordering problem, without requiring synchronous writes. The LFS
solves the fast recovery following a crash problem; it does this
at the cost of a latency between when disk space is no longer
being used, and when it becomes available for reuse, as well as
adding in a certain level of fragmentation (the cleaner also needs
to be a defragger, for a small definition of defragger).

Soft Updates is capable of being generalized to allow dependencies
to span file system layers, including externalizing a generalized
transactioning interface to user space (Very Useful). Logging is
a raw disk management mechanism.


> Still, I didn't find
> anything that explained the decision to go with soft updates. Perhaps
> I missed the relevant threads. Were they prior to '98?

Soft Updates came in because someone paid for its development;
there is a bit of difference between the Ganger/Patt implementation,
and the one in FreeBSD, but not a huge amount. It leverages greatly
on work Kirk had already done for BSDI and OpenBSD.

Brett Glass

Dec 14, 1999
At 05:39 PM 12/13/1999 , Terry Lambert wrote:

>I will let you in on a "secret": SCSI drives cost more because
>that's what the market will bear, based on their performance
>characteristics relative to IDE.

Unfortunately, I don't believe that the price/performance
ratio of UltraSCSI is anywhere near that of Ultra-66 ATAPI.

>They cost the same to manufacture; it doesn't matter what mask
>you use to burn your 1 square inch ASIC.

Which is the problem. You're being charged a premium for hardware
that's very similar, due to lower volume. And SCSI has higher command
latency than IDE. SCSI drives usually make up for this with tagged
command queueing, hidden elevator seeking, and larger on-drive
caches. Sometimes this is a clear win, but sometimes it is not.

The ideal thing would be a hybrid: a drive which supported the
full SCSI command repertoire but didn't have the overhead of
selection, arbitration, bus settling time, signal deskewing,
etc.

I would do this by making the one drive per ATAPI cable act like
a SCSI device that was always selected, eliminating the bus
selection phase altogether. The interface would be cheaper,
because straight TTL is much less expensive than the terminators
and transceivers needed for SCSI. It'd be more energy-efficient,
too. And it'd have a higher peak transfer rate, because SCSI is
limited by having to handle the more varied transmission line
characteristics that come from an 8-foot or 16-foot cable with
multiple irregularly spaced taps.

The "SCATAPI" drives would use a 1 meter cable -- ALWAYS 1 meter,
even if you could get away with less. Fold it up neatly
if it's too long. No taps, 28 AWG conductors, controlled impedance,
and twists in the signal lines all the way. Peak speed ought to
reach 132 MBps easily. This just happens to be the capacity of
32-bit PCI. A later generation could move up to AGP speeds and
run off the motherboard chipset's AGP circuitry. CAM would work
with no modification.

--Brett

Jay Nelson

Dec 14, 1999
On Tue, 14 Dec 1999, Terry Lambert wrote:

[snip]

>They are FAQs, not "in the FAQ".

I suspect they probably should be in the FAQ. The average admin who
doesn't follow mailing lists asks questions like this. The more we
claim (justifiably) stability, the more seriously they evaluate
FreeBSD against commercial alternatives. This is an area where few of
us really understand the issues involved.

>The archives you should be looking at, and the place you should be
>asking the question are the freebsd-fs list.

I did look in the fs archives -- although I'm not sure the general
question belongs there since it seems to have more to do with the
differences between FreeBSD and the commercial offerings.

Is it fair to summarize the differences as:

Soft updates provide little in terms of recovering data, but enhance
performance at runtime; recovery is limited to ignoring
metadata that wasn't written to disk.

Log file systems offer little data recovery in return for faster
system recovery after an unclean halt, at the cost of a runtime
penalty.

Journaled filesystems offer the potential of data recovery at a boot
time and runtime cost.

I know this is disgustingly oversimplified, but it's about all you can get
through to typical management.

I also have to admit, I'm a little confused with your usage of the
word orthogonal. Do you mean that an orthogonal technology projects
cleanly or uniformly into different dimensions of system space?

-- Jay

David Scheidt

Dec 14, 1999
On Mon, 13 Dec 1999, Brett Glass wrote:

>
> The "SCATAPI" drives would use a 1 meter cable -- ALWAYS 1 meter,
> even if you could get away with less. Fold it up neatly
> if it's too long. No taps, 28 AWG conductors, controlled impedance,
> and twists in the signal lines all the way. Peak speed ought to
> reach 132 MBps easily. This just happens to be the capacity of
> 32-bit PCI. A later generation could move up to AGP speeds and
> run off the motherboard chipset's AGP circuitry. CAM would work
> with no modification.

There are those of us who have machines that need long cables. One of the
boxes I manage has disks that are 15 meters away from the CPU cabinet. If
you need to have hundreds of disks, you can't have silly cable lengths. Of
course, the next generation box will be all Fibre Channel, but still. Even
on my home box, I have a need for greater than 1 metre bus lengths.

david

Olaf Hoyer

Dec 14, 1999
>> The "SCATAPI" drives would use a 1 meter cable -- ALWAYS 1 meter,
>> even if you could get away with less. Fold it up neatly
>> if it's too long. No taps, 28 AWG conductors, controlled impedance,
>> and twists in the signal lines all the way. Peak speed ought to
>> reach 132 MBps easily. This just happens to be the capacity of
>> 32-bit PCI. A later generation could move up to AGP speeds and
>> run off the motherboard chipset's AGP circuitry. CAM would work
>> with no modification.
Hi!

Shouldn't those be shielded cables, or am I missing something about keeping signals clean?

AGP, IMHO, is an el cheapo implementation, a simple electrical
extension of a simple interface for fast video transfer, like the VESA local
bus design of past days.

>
>There are those of us who have machines that need long cables. One of the
>boxes I manage has disks that are 15 meters away from the CPU cabinet. If
>you need to have hundreds of disks, you can't have silly cable lengths. Of
>course, the next generation box will be all Fibre Channel, but still. Even
>on my home box, I have a need for greater than 1 metre bus lengths.

Yes, but if someone really needs, say, 20 disks/CD-ROMs attached some meters
away from the box, wouldn't it make sense and be cheaper to put them in a
dedicated file server box and attach them via a fast network?

BTW: How do they do the 15 meters?

Regards
Olaf Hoyer

------
Olaf Hoyer ICQ: 22838075 mailto: Olaf....@nightfire.de
home: www.nightfire.de (The home of the burning CPU)

Death be my master, my soul and saviour... (The book of inferno, chapter II)
"There is no justice, there is just me", said the Reaper (Terry Pratchett)

He who fights with monsters should see to it that he does not
become a monster himself.
And if you gaze long into an abyss, the abyss also gazes into you.
(Friedrich Nietzsche, Beyond Good and Evil)

Brett Glass

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
At 09:12 PM 12/13/1999 , David Scheidt wrote:

>There are those of us who have machines that need long cables. One of the
>boxes I manage has disks that are 15 meters away from the CPU cabinet.

There will always be some of these, but they'll be statistically rare.

>If
>you need to have hundreds of disks, you can't have silly cable lengths.

SCSI can really only have 7 disks per cable. (Yes, I know, you can extend
the addressing to get 15, or use logical units, but this causes problems
with bus loading and also with contention for the bus.)

Also, I'm sure you will agree that hundreds of spindles on one computer
is not the norm. We shouldn't bog down our core standards because of one
case that's several sigma off the low end of the probability scale.

>Of
>course, the next generation box will be all Fibre Channel, but still. Even
>on my home box, I have a need for greater than 1 metre bus lengths.

No problem! Use the shorter cables within the disk array and then use
SCSI or Fiber Channel -- on the other side of the RAID controller -- for
the longer-haul connections.

Also, putting that much disk space on a single machine may not be a good idea.
If it has that much data to serve up or search, it's probably going to be
strapped for CPU cycles or network bandwidth. Depending on the situation,
you might be better off distributing your files or databases and putting several
disks (but not hundreds) on each server. This makes the system more failsafe,
too: one bad CPU won't take down the whole operation.

--Brett

Brett Glass

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
At 09:25 PM 12/13/1999 , Olaf Hoyer wrote:

>Hi!
>
>should be shielded cables, or do I miss something about keeping signals clean?

Twisted pairs don't provide the same shielding as coax, but they resist
noise surprisingly well. (It's a standard exercise in advanced E&M physics
classes to show this.) And twists don't lower the impedance as much as a
shield, nor do they require baluns. If you buy a long differential SCSI
ribbon cable, you'll find that the pairs are twisted within the ribbon.
(The twists stop at the places where the connectors are crimped
on, then resume.)

>AGP IMHO is some el cheapo implementation of some simple electrical
>prolonging of a simple interface for fast vid transfer, like the VESA local
>bus design in past days.

It is, indeed, a somewhat quick and dirty interface. But it's a good one.
It does very fast DMA without involving the CPU, which means good
concurrency. In some ways, it's even better suited for hard disks than it
is for graphics!

>Yes, but if someone really needs, say 20 disks/CD-ROMs attached some meters
>away from your box, wouldn't it make sense and be cheaper to put them in a
>dedicated file server /server box and attach them via a fast network?

That's what I'd do. Inside the box, you could use regular IDE for CD-ROMs
or fast interfaces for hard disks. ATAPI is cheap; additional interfaces
would probably cost about $5 each. Just a few more chips and connectors
on a PC board. Outside the box, you'd use FireWire, SCSI, fast Ethernet,
or gigabit Ethernet.

Jamie Bowden

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
On Mon, 13 Dec 1999, Brett Glass wrote:

:Also, putting that much disk space on a single machine may not be a good idea.


:If it has that much data to serve up or search, it's probably going to be
:strapped for CPU cycles or network bandwidth. Depending on the situation,
:you might be better off distributing your files or databases and putting several
:disks (but not hundreds) on each server. This makes the system more failsafe,
:too: one bad CPU won't take down the whole operation.

Can I point out that the PC isn't the only platform on the planet? When I
was at NASA 16 processor (or more) Origin2000's and Sun Enterprise servers
with anywhere from 200GB to 1TB+ drive arrays on them were quite common.

Eventually PC's won't be single processor toys. Hell, you can build dual
CPU boxes now for less than a 286 cost 10 years ago. Any spec you come up
with better be scalable, and not ignore multi cpu configurations.

Jamie Bowden

--

"Of course, that's sort of like asking how other than Marketing, how Microsoft is different from any other software company..."
Kenneth G. Cavness

David Scheidt

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
On Tue, 14 Dec 1999, Olaf Hoyer wrote:

> >There are those of us who have machines that need long cables. One of the

> >boxes I manage has disks that are 15 meters away from the CPU cabinet. If
> >you need to have hundreds of disks, you can't have silly cable lengths. Of


> >course, the next generation box will be all Fibre Channel, but still. Even
> >on my home box, I have a need for greater than 1 metre bus lengths.
>

> Yes, but if someone really needs, say 20 disks/CD-ROMs attached some meters
> away from your box, wouldn't it make sense and be cheaper to put them in a
> dedicated file server /server box and attach them via a fast network?

I am waiting to meet a network that can do 400MB/s -- the aggregate
throughput of 20 Fast/Wide/Differential SCSI controllers.

>
> BTW: How do they the 15 meters?

High voltage differential FW SCSI. The spec allows for a cable length of 25
metres. With good cables, controllers, disks, enclosures, and terminators,
you can do 30, though I wouldn't for a production box.

David scheidt

Terry Lambert

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
> >I will let you in on a "secret": SCSI drives cost more because
> >that's what the market will bear, based on their performance
> >characteristics relative to IDE.
>
> Unfortunately, I don't believe that the price/performance
> ratio of UltraSCSI is anywhere near that of Ultra-66 ATAPI.

Who said anything about that? The market will bear higher
prices from SCSI, so SCSI costs more.


> >They cost the same to manufacture; it doesn't matter what mask
> >you use to burn your 1 square inch ASIC.
>
> Which is the problem. You're being charged a premium for hardware
> that's very similar, due to lower volume. And SCSI has higher command
> latency than IDE. SCSI drives usually make up for this with tagged
> command queueing, hidden elevator seeking, and larger on-drive
> caches. Sometimes this is a clear win, but sometimes it is not.

Sorry, but Bzzzt. SCSI is actually cheaper in Europe than the
US. It is a function of market pressures. I can put you in
contact with the IBM Santa Teresa (disk drive manufacturing)
people if you need me to...


> The ideal thing would be a hybrid: a drive which supported the
> full SCSI command repertoire but didn't have the overhead of
> selection, arbitration, bus settling time, signal deskewing,
> etc.

Yeah, that's called "ATAPI". All IDE CDROMs are SCSI CDROMS
in disguise.


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Terry Lambert

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
> >They are FAQs, not "in the FAQ".
>
> I suspect they probably should be in the FAQ. The average admin who
> doesn't follow mailing lists asks questions like this. The more we
> claim (justifiably) stability, the more seriously they evaluate
> FreeBSD against commercial alternatives. This is an area where few of
> us really understand the issues involved.
>
> >The archives you should be looking at, and the place you should be
> >asking the question are the freebsd-fs list.
>
> I did look in the fs archives -- although I'm not sure the general
> question belongs there since it seems to have more to do with the
> differences between FreeBSD and the commercial offerings.
>
> Is it fair to summarize the differences as:
>
> Soft updates provide little in terms of recovering data, but enhance
> performance at runtime; recovery is limited to ignoring metadata that
> wasn't written to disk.

No.

Soft updates:

What is lost are uncommitted writes. Committed writes are
guaranteed to have been ordered. This means that you can
deterministically recover the disk not just to a stable state,
but to the stable state that it was intended to be in. The
things that are lost are implied state between files (e.g.
a record file and an index file for a database); this can be
worked around using two stage commits on the data in the
database software.
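
To make the two-stage commit idea concrete, here is a minimal sketch
in C. The file layout, the structure, and the function name are
invented for illustration; they are not taken from any real database:

#include <stdint.h>
#include <unistd.h>

/*
 * Two-stage commit over soft updates: make the record itself stable
 * first, then write the small index entry that points at it.  After
 * a crash, an index entry without a matching record never "happened",
 * and a record without an index entry is unreferenced garbage that a
 * scavenger pass can reclaim.
 */
struct idx_entry {
    off_t    rec_off;   /* where the record starts in the data file */
    uint32_t rec_len;   /* how many bytes it occupies */
};

int
append_record(int datafd, int idxfd, const void *buf, uint32_t len)
{
    struct idx_entry e;
    off_t off;

    /* Stage 1: the data itself. */
    off = lseek(datafd, 0, SEEK_END);
    if (off == -1 || write(datafd, buf, len) != (ssize_t)len)
        return (-1);
    if (fsync(datafd) == -1)        /* data is now on stable storage */
        return (-1);

    /* Stage 2: the commit record. */
    e.rec_off = off;
    e.rec_len = len;
    if (write(idxfd, &e, sizeof(e)) != sizeof(e))
        return (-1);
    return (fsync(idxfd));          /* the record is now committed */
}

Nothing here is specific to FFS; it is just the classic
write-then-commit ordering, done from user space.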

Soft updates is slow to recover because of the need to tell
the difference between a hard failure and a soft failure (a
hard failure is a software or hardware fault; a soft failure
is the loss of power). If you can tell this, then you don't
need to fsck the drive, only recover over-allocated cylinder
group bitmaps. This can be done in the background, locking
access to a cylinder group at a time.

Distinguishing the failure type is the biggest problem here,
and requires NVRAM or a technology like soft read-only (first
implemented by a team I was on at Artisoft around 1996 for a
port of the Heidemann framework and soft updates to Windows
95, as far as I can tell).


> Log file systems offer little data recovery in return for faster
> system recovery after a disorderly halt, at the cost of a runtime
> penalty.

Log structured FSs:

Zero rotational latency on writes, fast recovery after a hard
or soft failure.

What is lost are uncommitted writes (see above).

LFSs recover quickly because they look for the metadata log
entry with the most recent date, and they are "magically"
recovered to that point.
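
As a crude illustration of that scan, in C -- the on-disk segment
summary below is invented and does not match any real LFS layout:

#include <stddef.h>
#include <stdint.h>

/*
 * Pick the newest segment summary that checks out; everything the log
 * recorded up to that point is the recovered state.  Torn or partial
 * writes fail the checksum and are simply ignored.
 */
struct segsum {
    uint64_t ss_serial;     /* monotonically increasing write serial */
    uint32_t ss_cksum;      /* checksum over the summary */
};

static uint32_t
segsum_cksum(const struct segsum *ss)
{
    /* Toy checksum; a real filesystem would use something stronger. */
    return ((uint32_t)(ss->ss_serial ^ (ss->ss_serial >> 32)));
}

/* Returns the index of the newest intact summary, or -1 if none. */
int
find_recovery_point(const struct segsum *sums, size_t nseg)
{
    uint64_t best_serial = 0;
    int best = -1;
    size_t i;

    for (i = 0; i < nseg; i++) {
        if (segsum_cksum(&sums[i]) != sums[i].ss_cksum)
            continue;       /* partial write: pretend it never happened */
        if (best == -1 || sums[i].ss_serial > best_serial) {
            best_serial = sums[i].ss_serial;
            best = (int)i;
        }
    }
    return (best);
}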

There is still a catch-22 with regard to soft vs. hard failures,
but most hard failures can be safely ignored, since any data
dated from before the hard failure is OK, unless the drive is
going south. You must therefore differentiate hard failures in
the kernel "panic" messages, so that a human has an opportunity
to see them.

LFSs have an ongoing runtime cost that is effectively the need
to "garbage collect" outdated logs so that their extents can
be reused by new data.


> Journaled filesystems offer the potential of data recovery, at a
> boot-time and runtime cost.

JFSs:

A JFS maintains a Journal; this is sometimes called an intention
log. Because it logs its intent before the fact, it can offer
a transactional interface to user space. This lets the programmer
skip the more expensive two stage commit process in favor of hooks
into the intention log.

Because transactions done this way can be nested, a completed
but uncommitted transaction can be rolled forward to the extent
that the nesting level has returned to "0" -- in other words,
all nested transaction intents have been logged.

Because transactions can be rolled forward, you will recover to
the state that the JFS would have been in had the failure not
ever occurred. This works, because writes, etc., are not
acknowledged back to the caller until the intention has been
carried out. Things like an intent to delete a file, rename a
file, etc. are logged at level 0 (i.e. not in a user defined
transaction bound), and so can be acknowledged immediately;
writes of actual data need to be delayed, if they are in a
transaction bound.

This lets you treat a JFS as committed stable storage, without
second-guessing the kernel or the drive cache, etc.

A JFS recovery, like an LFS recovery, uses the most recent valid
timestamp in the intention log, and then rolls forward all
transactions that have completed.

Like LFS, hard errors can be ignored, unless the hard errors
occur during replay of the journal in rolling some completed
transaction forward. Because of this, care must be taken on
recovery.

JFS recovery can take a while, if there are a lot of completed
intentions in the journal.

Many JFS implementations also use logs in order to write user
data, so that the write acknowledge can be accelerated.
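
To illustrate the roll-forward rule, here is a toy replay loop in C.
The record format and the apply callback are invented; a real journal
replay is far more involved:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Roll forward only those transactions whose nesting level got back
 * to zero before the failure; anything still "open" at the crash is
 * discarded.  Records are re-applied in log order.
 */
struct jrec {
    uint64_t jr_txid;       /* transaction this record belongs to */
    int      jr_depth;      /* nesting level after this record */
    /* ... opcode and operands would go here ... */
};

void
replay(const struct jrec *log, size_t nrec,
    void (*apply)(const struct jrec *))
{
    size_t i, j;

    for (i = 0; i < nrec; i++) {
        bool completed = false;

        /*
         * The record that closes this transaction (depth back to 0)
         * can only appear at or after record i, so scan forward.
         * A real implementation would use a hash table instead of
         * this O(n^2) rescan.
         */
        for (j = i; j < nrec && !completed; j++)
            if (log[j].jr_txid == log[i].jr_txid &&
                log[j].jr_depth == 0)
                completed = true;

        if (completed)
            apply(&log[i]);
    }
}

Anything logged inside a transaction bound that never got back to
depth zero is exactly the "uncommitted write" that gets dropped.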


> I know this is disgustingly oversimplified, but it's about all you
> can get through to typical management.
>
> I also have to admit, I'm a little confused with your usage of the
> word orthogonal. Do you mean that an orthogonal technology projects
> cleanly or uniformly into different dimensions of system space?

Yes. "Mutually perpendicular" and "Intersecting at only one point".
It's my training in physics seeping through...

Terry Lambert

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
> >There are those of us who have machines that need long cables. One of the
> >boxes I manage has disks that are 15 meters away from the CPU cabinet.
>
> There will always be some of these, but they'll be statistically rare.

They may be disproportionately rare, but I'd also say they are
the boxes with a disproportionately large share of the disks in
the world. 8-).


> SCSI can really only have 7 disks per cable. (Yes, I know, you can extend
> the addressing to get 15, or use logical units, but this causes problems
> with bus loading and also with contention for the bus.)
>
> Also, I'm sure you will agree that hundreds of spindles on one computer
> is not the norm. We shouldn't bog down our core standards because of one
> case that's several sigma off the low end of the probability scale.

I don't see how core standards are getting bogged down by this;
you personally use IDE, right? So it doesn't bog you down, at
least not except in the sense that you are already bogged down
by not being able to interleave your commands because your chosen
interface doesn't fully implement its core standard.


> Also, putting that much disk space on a single machine may not be a good idea.
> If it has that much data to serve up or search, it's probably going to be
> strapped for CPU cycles or network bandwidth.

I only have two things to say to that:

1) Altavista
2) www.cdrom.com

Terry Lambert

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to
> Can I point out that the PC isn't the only platform on the planet? When I
> was at NASA 16 processor (or more) Origin2000's and Sun Enterprise servers
> with anywhere from 200GB to 1TB+ drive arrays on them were quite common.
>
> Eventually PC's won't be single processor toys. Hell, you can build dual
> CPU boxes now for less than a 286 cost 10 years ago. Any spec you come up
> with better be scalable, and not ignore multi cpu configurations.

Eventually, PCs will go away, and you will host your sessions
on mainframes once again; only your desktop will always be
there for you to attach to.

A lot of things had to happen for us to go from mainframe to
mini to PC to LAN to client/server and then back to mainframe;
most of them had to do with the economic model (getting away from
CPU-second charges, getting "click-through" as a revenue model,
getting low cost high speed networks in place, etc.).

See:

http://www.qubit.net/

Perhaps you can even convince them to let you use FreeBSD instead
of Linux (they were WIN/CE).

I think you will also start to see true "capabilities" based OSs
("capabilities" is a security model), where you can kick out the
plug, plug it back in, reattach your "webpad", and be back exactly
where you were before you kicked out the plug.

Now if only IKE/ISAKMP weren't based on clipper chip technology...

Jonathan M. Bresler

unread,
Dec 14, 1999, 3:00:00 AM12/14/99
to

>
> Now if only IKE/ISAKMP weren't based on clipper chip technology...
>
>
> Terry Lambert
> te...@lambert.org


???? Certain chip vendors' chips may be based upon or
include the Clipper chip (do you know of any?).

IKE/ISAKMP is not based upon Clipper. The LEAF fields, the
key escrow, and all the rest of it are not part of IKE/ISAKMP. This
statement is based upon reading the RFCs, and on IPSec by Naganand
Doraswamy and Dan Harkins. Surely you are not suggesting that KAME has
implemented a software version of Clipper chip technology in their
code.

jmb

Brett Glass

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
At 06:05 AM 12/14/1999 , Jamie Bowden wrote:

>Can I point out that the PC isn't the only platform on the planet? When I
>was at NASA 16 processor (or more) Origin2000's and Sun Enterprise servers
>with anywhere from 200GB to 1TB+ drive arrays on them were quite common.
>
>Eventually PC's won't be single processor toys.

Multiprocessing has always been a stopgap measure to get extra performance
out of a machine until uniprocessors caught up. The diminishing returns
make tightly coupled multiprocessing far less desirable than loosely
coupled (or uncoupled!) distributed computing.

Just my 2 cents.

--Brett Glass

Brett Glass

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
At 11:53 AM 12/14/1999 , Terry Lambert wrote:

>Sorry, but Bzzzt. SCSI is actually cheaper in Europe than the
>US.

Or is IDE more expensive, so that the prices converge that way?
This was the impression I got when I was in Europe.

> > The ideal thing would be a hybrid: a drive which supported the
> > full SCSI command repertoire but didn't have the overhead of
> > selection, arbitration, bus settling time, signal deskewing,
> > etc.
>
>Yeah, that's called "ATAPI". All IDE CDROMs are SCSI CDROMS
>in disguise.

ATAPI is *sort of* that. But not really. Some important SCSI
features are missing.

--Brett

Brett Glass

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
At 12:28 PM 12/14/1999 , Terry Lambert wrote:

> > Also, I'm sure you will agree that hundreds of spindles on one computer
> > is not the norm. We shouldn't bog down our core standards because of one
> > case that's several sigma off the low end of the probability scale.
>
>I don't see how core standards are getting bogged down by this;
>you personally use IDE, right?

I personally use both IDE and SCSI. Even the laptop I'm typing this
on now has both.

>So it doesn't bog you down, at
>least not except in the sense that you are already bogged down
>by not being able to interleave your commands because your chosen
>interface doesn't full implement its core standard.

Again, I haven't chosen just one interface. But what I'd like to
see is a core standard that offers the maximum performance at
the least cost in the greatest number of applications. A full
implementation of SCSI features over a fast TTL interface with
relatively short cables would be the best of both worlds.

Of course, none of the other interfaces would go away, so if you
really wanted to run a 15-foot cable SCSI would still be there.

> > Also, putting that much disk space on a single machine may not be a good idea.
> > If it has that much data to serve up or search, it's probably going to be
> > strapped for CPU cycles or network bandwidth.
>
>I only have two things to say to that:
>
>1) Altavista
>2) www.cdrom.com

Neither should be one big machine. (Yes, I know that CDROM.COM is, but that's
not the way I personally would have implemented it. I would have used at
least two machines for redundancy's sake, if for no other reason.)

Terry Lambert

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
> > Now if only IKE/ISAKMP weren't based on clipper chip technology...
>
>
> ???? certain chip vendors chips may be based upon or
> include clipper chip (do you know of any?).
>
> IKE/ISAKMP is not based upon clipper. the leaf fields, the
> key escrow and all the rest of it are not part of IKE/ISAKMP. this
> statemtne is based upon reading the RFC's, IPSec by naganamd doraswamy
> and dan harkins. surely you are not suggesting that KAME has
> implemented a software version of clipper chip technology in their
> code.


Read the December 1999 ";login:" magazine from Usenix, and see
the article:

IKE/ISAKMP considered harmful
William Allen Simpson

I quote from the first paragraph following the abstract:

The Internet Security Association and Key Management
Protocol (ISAKMP) [RFC-2408] framework was originally
developed by the United States National Security
Agency (NSA) with an ASN.1 syntax from the initial
Fortezza (used in the nefarious Clipper chip). The
Internet Key Exchange (IKE) [RFC-2409] is a session-key
exchange mechanism that fits alongside Fortezza under
its own "Domain of Interpretation" (DOI).

He goes on to state that it has "egregious fundamental design
flaws", and states that he was administratively prevented from
publishing the information in the IETF until after publication
of IKE/ISAKMP.

It's interesting that OpenBSD has implemented IKE/ISAKMP already.


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

David Scheidt

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
On Tue, 14 Dec 1999, Brett Glass wrote:

> At 06:05 AM 12/14/1999 , Jamie Bowden wrote:
>
> >Can I point out that the PC isn't the only platform on the planet? When I
> >was at NASA 16 processor (or more) Origin2000's and Sun Enterprise servers
> >with anywhere from 200GB to 1TB+ drive arrays on them were quite common.
> >
> >Eventually PC's won't be single processor toys.
>
> Multiprocessing has always been a stopgap measure to get extra performance
> out of a machine until uniprocessors caught up. The diminishing returns

But uniprocessors will never catch up. The glue needed to build an N-way
machine will always be less expensive than N uniprocessor boxes. N may
change in value as technology changes, but the benefit of being able to
share resources like memory and I/O channels will always exist for some
applications.

> make tightly coupled multiprocessing far less desirable than loosely
> coupled (or uncoupled!) distributed computing.

For some applications loosely coupled multi-processing makes sense. For
others, like operations on one datastream, it doesn't.


David Scheidt

Nathan Kinsman

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to

I won't speak for IDE drives in general. However, I'm currently using
Maxtor DiamondMax Plus hard drives for servers with moderate or lower
workload. They have DualWave™ twin processors, a 2 MB 100 MHz SDRAM cache
buffer, spin at 7200 RPM, and run on UltraDMA/66. On a simple "dd" benchmark,
I get 20 megs per second (yeah I know, single tasking). Coupled with an
Arco DupliDisk IDE RAID 1 controller, this gives me up to 40 megabytes
per disk of fault tolerant storage for around $500.00-$600.00 USD. This
is also the price of one WD Enterprise 9 gig SCSI last I checked (a few
months ago). I freely admit there are advantages to using SCSI, but
with the above configuration I can put together entire clusters of IDE
RAID servers for the price of a single SCSI RAID system. Clearly any
performance advantages of SCSI end at that point.
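
For what it's worth, that "dd" number is just a sequential read test;
a minimal C version of the same measurement is below. The /dev/ad0
path is only an example -- point it at whichever disk you want to
time, and remember a single sequential reader says nothing about
seek-heavy loads:

#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLOCK   (1024 * 1024)   /* 1 MB reads, like dd bs=1m */
#define NBLOCKS 256             /* read 256 MB in total */

int
main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/ad0";
    static char buf[BLOCK];
    struct timeval t0, t1;
    double secs;
    int fd, i;

    if ((fd = open(dev, O_RDONLY)) == -1) {
        perror(dev);
        return (1);
    }
    gettimeofday(&t0, NULL);
    for (i = 0; i < NBLOCKS; i++)
        if (read(fd, buf, BLOCK) != BLOCK) {
            perror("read");
            return (1);
        }
    gettimeofday(&t1, NULL);
    close(fd);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%d MB in %.2f s = %.1f MB/s\n", NBLOCKS, secs,
        NBLOCKS / secs);
    return (0);
}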

DupliDisk-Bay 5.25 IDE RAID 1 Mirroring Controller ......$250 USD
20.40GB Diamond MAX Plus .................................$183 USD
20.40GB Diamond MAX Plus .................................$183 USD

Along with rest of server hardware:
AMD Athlon 500 ...........................................$181 USD
Asus K7M Motherboard .....................................$150 USD
256 Megs SDRAM ...........................................$220 USD
Intel EtherExpress Pro 100+ NIC ..........................$ 60 USD
3DCool Tornado 1000 Case .................................$155 USD
(supercooling!)
Floppy, keyboard, etc .....................................$ 50 USD
approx

And there you go, a high performance hardware RAID 1 server for FreeBSD
at around $1400 USD, or the price of a cheap RAID controller. Buy 6 of
them and build a server cluster that would massively outperform a
similarly priced Dell Poweredge 4350 with Dual Pentium 600s and mirrored
18 gig SCSI.

Prices obtained by Pricewatch. http://www.pricewatch.com

-
Nathan Kinsman, |nat...@kinsman.nu| don't send spam...@mentisworks.com
Network Integrator, Systems Architect |FreeBSD/Linux/Netware/MS Windows|
Phone/Fax: |Chicago| +1 312 803-2220 |Sydney| + 61 2 9475 4500
http://nathan.kinsman.nu | http://www.mentisworks.com

Jeroen C. van Gelderen

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
Terry Lambert wrote:
> > > Now if only IKE/ISAKMP weren't based on clipper chip technology..

It's sad to see someone like you issue such a FUDish statement. IKE
may have its problems but this has nothing to do with its 'Clipper
heritage'.

> Read the December 1999 ";login:" magazine from Usenix, and see
> the article:
>
> IKE/ISAKMP considered harmful
> William Allen Simpson
>
> I quote from the first paragraph following the abstract:
>
> The Internet Security Association and Key Management
> Protocol (ISAKMP) [RFC-2408] framework was originally
> developed by the United States National Security
> Agency (NSA) with an ASN.1 syntax from the initial
> Fortezza (used in teh nefarious clipper chip). The
> Internet Key Exchange (IKE) [RFC-2409] is a session-key
> excahnge mechanism that fits alongside Fortezza under
> its own "Domain of Interpretation" (DOI).
>
> He goes on to state that it has "egregious fundamental design
> flaws", and states that he was administratively prevented from
> publishing the information in the IETF until after publication
> of IKE/ISAKMP.

This reinforces my comments above. And if you quote the *relevant*
sections of the document it will become even clearer...

> It's interesting that OpenBSD has implemented IKE/ISAKMP already.

What are you trying to say?

Cheers,
Jeroen
--
Jeroen C. van Gelderen - jer...@vangelderen.org
Interesting read: http://www.vcnet.com/bms/ JLF

Brett Glass

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
At 09:07 PM 12/14/1999 , David Scheidt wrote:

> > Multiprocessing has always been a stopgap measure to get extra performance
> > out of a machine until uniprocessors caught up. The diminishing returns
>
>But uniprocessors will never catch up.

Actually, uniprocessors often do best in price/performance, because multiprocessor
servers are priced so high and CPUs represent such a large percentage of the price
of the system.

>The glue needed to build an N-way
>machine will always be less expensive than N uniprocessor boxes.

Not so. The special chip sets are usually priced at a premium.

> > make tightly coupled multiprocessing far less desirable than loosely
> > coupled (or uncoupled!) distributed computing.
>
>For some applications loosely coupled multi-processing makes sense. For
>others, like operations on one datastream, it doesn't.

Actually, a Web page that draws images from several servers via IMG
tags is very much like an "operation on one datastream," very neatly
distributed.

--Brett

Daniel O'Connor

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to

On 15-Dec-99 Brett Glass wrote:
> Not so. The special chip sets are usually priced at a premium.

Well, not in the case of 'low end' SMP..

ie an Epox dual PII/PIII mobo is about AU$40 more expensive than the UP
version..

It's a basic board, and doesn't have anything on board, but they work
really nicely under FreeBSD :)

---
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
-- Andrew Tanenbaum

Brett Glass

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
At 09:02 AM 12/15/1999 , David Scheidt wrote:

>On NonPC machines, CPU cost is a pretty small fraction of system
>cost.

That's because, on more proprietary systems, the costs of other
components are artificially high -- usually by artifice. I remember
trying to put a generic SCSI drive into an SGI system several years
ago. It was a struggle, because they used special mounting brackets
and a special connector, trying to make it look as if you HAD to
buy the drive from them at 4X the going price. But it was a plain
old SCSI drive, and you could tell which brand by looking at the
mechanical design.

--Brett

David Scheidt

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
[ cc list trimmed. ]

On Wed, 15 Dec 1999, Brett Glass wrote:

> At 09:02 AM 12/15/1999 , David Scheidt wrote:
>
> >On NonPC machines, CPU cost is a pretty small fraction of system
> >cost.
>
> That's because, on more proprietary systems, the costs of other
> components are artificially high -- usually by artifice. I remember

My point was that NonPC machines tend to have higher peripheral loads,
and better-designed hardware. They have things like memory subsystems that
support 16-way interleaving and multiple-bit ECC. They are also much lower
volume, which pushes up the cost of parts. The setup costs for an ASIC are
much the same if you are making 20,000 or a million.

> trying to put a generic SCSI drive into an SGI system several years
> ago. It was a struggle, because they used special mounting brackets
> and a special connector, trying to make it look as if you HAD to
> buy the drive from them at 4X the going price. But it was a plain
> old SCSI drive, and you could tell which brand by looking at the
> mechanical design.

HP aren't so bad about that. They do use a proprietary disk enclosure
system, but it is very nice, and we did look at other systems. Equivalent
systems weren't much cheaper, and didn't have as good guarantees about
long-term replacement part availability. If a system has a design life-time
of seven years, you want to be sure you can get spares in 6.5! (You can, of
course, stockpile them yourself, but there are non-trivial costs involved
with that.)

David scheidt

Mark Ovens

unread,
Dec 15, 1999, 3:00:00 AM12/15/99
to
On Tue, Dec 14, 1999 at 05:54:56PM -0700, Brett Glass wrote:
> At 11:53 AM 12/14/1999 , Terry Lambert wrote:
>
> >Sorry, but Bzzzt. SCSI is actually cheaper in Europe than the
> >US.
>
> Or is IDE more expensive, so that the prices converge that way?
> This was the impression I got when I was in Europe.
>

Where in Europe? In this month's Personal Computer World (Dabs Direct advert,
http://www.dabs.com):


IBM 18ES U2W 18.1GB UKP364.25
IBM 36ZX U2W 36.7GB UKP821.32

IBM Deskstar GP UDMA66 20.3GB UKP139.82
IBM Deskstar GXP UDMA66 34.2GB UKP252.62

The exchange rate as I type is US$1.6067 to the UKP
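
Running those numbers through that exchange rate (just arithmetic,
and only for the four drives quoted above):

#include <stdio.h>

int
main(void)
{
    static const struct {
        const char *name;
        double gb, ukp;
    } d[] = {
        { "IBM 18ES U2W (SCSI)",       18.1, 364.25 },
        { "IBM 36ZX U2W (SCSI)",       36.7, 821.32 },
        { "Deskstar GP UDMA66 (IDE)",  20.3, 139.82 },
        { "Deskstar GXP UDMA66 (IDE)", 34.2, 252.62 },
    };
    const double rate = 1.6067;     /* US$ per UKP, as above */
    unsigned i;

    for (i = 0; i < sizeof(d) / sizeof(d[0]); i++)
        printf("%-28s US$%8.2f  US$%5.2f/GB\n", d[i].name,
            d[i].ukp * rate, d[i].ukp * rate / d[i].gb);
    return (0);
}

That works out to roughly US$32-36 per GB for the two SCSI drives
against US$11-12 per GB for the two IDE drives -- about a 3x gap at
these particular prices.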


> > > The ideal thing would be a hybrid: a drive which supported the
> > > full SCSI command repertoire but didn't have the overhead of
> > > selection, arbitration, bus settling time, signal deskewing,
> > > etc.
> >
> >Yeah, that's called "ATAPI". All IDE CDROMs are SCSI CDROMS
> >in disguise.
>
> ATAPI is *sort of* that. But not really. Some important SCSI
> features are missing.
>
> --Brett
>
>


--
PERL has been described as "the duct tape of the Internet"
and "the Unix Swiss Army chainsaw"
- Computer Shopper 12/99
________________________________________________________________
FreeBSD - The Power To Serve http://www.freebsd.org
My Webpage http://ukug.uk.freebsd.org/~mark/
mailto:ma...@ukug.uk.freebsd.org http://www.radan.com

Terry Lambert

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
> > > > Now if only IKE/ISAKMP weren't based on clipper chip technology..
>
> It's sad to see someone like you issue such a FUDish statement. IKE
> may have its problems but this has nothing to do with its 'Clipper
> heritage'.

";login:" is read by a hell of a lot more people than my
posts to "chat".

The ";login:" article identifies many attacks against IKE/ISAKMP,
and provides source code for one of them.


> This reinforces my comments above. And if you quote the *relevant*
> sections of the document it will become even clearer...

The ";login:" document, or the IKE/ISAKMP document?


> > It's interesting that OpenBSD has implemented IKE/ISAKMP already.
>
> What are you trying to say?

That perhaps they would have something useful to say on the
subject.


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Terry Lambert

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
> >The glue needed to build an N-way
> >machine will always be less expensive than N uniprocessor boxes.
>
> Not so. The special chip sets are usually priced at a premium.

I think this is because they work, and allow things like more
than 2 PCI bus masters at a time, compared to many chipsets,
whose arbitration logic fails over 2 PCI masters.

Brett Glass

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
At 10:45 AM 12/15/1999 , David Scheidt wrote:

>HP aren't so bad about that. They do use a proprietary disk enclosure
>system, but it is very nice, and we did look at other systems. Equivelent
>systems weren't much cheaper, and didn't have as good guarantees about
>long-term replacement part availability. If a system has a design life-time
>of seven years, you want to be sure you can get spares in 6.5! (You can, of
>course, stockpile them yourself, but there are non-trivial costs involved
>with that. )

HP is the very WORST manufacturer when it comes to out-of-warranty service.
Costs to do anything more than routine maintenance on any product that's
no longer shipping are designed to exceed the replacement cost. They once
wanted to charge me $350 for -- believe it or not -- a KEYCAP! (They wouldn't
send it without charging me for a whole new keyboard.) In fact, in many cases,
they just WILL NOT SELL PARTS. I'll never buy a printer or laptop from them
again due to such policies.

--Brett Glass

Jeroen C. van Gelderen

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
Terry Lambert wrote:
> > > > > Now if only IKE/ISAKMP weren't based on clipper chip technology..
> >
> > It's sad to see someone like you issue such a FUDish statement. IKE
> > may have its problems but this has nothing to do with its 'Clipper
> > heritage'.
>
> ";login:" is read by a hell of a lot more people than my
> posts to "chat".

What's your point?

> The ";login:" article identifies many attacks against IKE/ISAKMP,
> and provides source code for one of them.

This still has nothing to do with its 'Clipper heritage' as you
originally implied[1].

> > This reinforces my comments above. And if you quote the *relevant*
> > sections of the document it will become even clearer...
>
> The ";login:" document, or the IKE/ISAKMP document?

The ";login:" document. The part you quoted doesn't tell us that
the problems stem from any 'Clipper heritage', so quote the
relevant part.

> > > It's interesting that OpenBSD has implemented IKE/ISAKMP already.
> >
> > What are you trying to say?
>
> That perhaps they would have something useful to say on the
> subject.

Can't get less FUD^H^H^Huseful, so I agree.

Cheers,
Jeroen

[1] "Now if only IKE/ISAKMP weren't based on clipper chip
technology..." -- Terry Lambert


--
Jeroen C. van Gelderen - jer...@vangelderen.org
Interesting read: http://www.vcnet.com/bms/ JLF

Brett Glass

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
At 05:07 PM 12/15/1999 , Terry Lambert wrote:
> > >The glue needed to build an N-way
> > >machine will always be less expensive than N uniprocessor boxes.
> >
> > Not so. The special chip sets are usually priced at a premium.
>
>I think this is because they work, and allow things like more
>than 2 PCI bus masters at a time, compared to many chipsets,
>whose arbitration logic fails over 2 PCI masters.

That's correct. Most of these chipsets are produced in relatively
small volumes by server manufacturers, who must devote a lot of
time, effort, equipment, and staff to R&D. One pays a premium
for that!

The most cost-effective solution, when one needs more computing
resources than fit cheaply into one box, is to find ways to
distribute the problem cleanly among MANY boxes. SMP is, most
of the time, either a last resort or a way to throw money at
the problem rather than finessing it.

--Brett Glass

Terry Lambert

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
> > > > > > Now if only IKE/ISAKMP weren't based on clipper chip technology..
> > >
> > > It's sad to see someone like you issue such a FUDish statement. IKE
> > > may have its problems but this has nothing to do with its 'Clipper
> > > heritage'.
> >
> > ";login:" is read by a hell of a lot more people than my
> > posts to "chat".
>
> What's your point?

That my post informs people about the ";login:" article, and,
having a smaller circulation, should be taken as a pointer for
indignant people such as yourself, rather than as FUD originating
with me.


> > The ";login:" article identifies many attacks against IKE/ISAKMP,
> > and provides source code for one of them.
>
> This still has nothing to do with it's 'Clipper heritage' as you
> originally implied[1].

I don't understand how you can make so bald a statement; the
problem with Fortezza-based systems is that the underlying
state machine sucks.

Why is it when knee-jerk reactionaries see "Clipper", they
automatically think I'm talking about back doors, rather than
the quality of the technology?


> > The ";login:" document, or the IKE/ISAKMP document?
>
> The ";login:" document. The part you quoted doesn't tell us that
> the problems stem from any 'Clipper heritage', so quote the
> relevant part.

A great many of the problematic specifications are due
to the IKE/ISAKMP framework. This is not surprising,
since the early drafts used ASN.1 and were fairly clearly
ISO-inspired. The observations of another ISO implementor
(and security analyst) appear applicable:

The specification was so general, and left so many
choices, that it was necessary to hold "implementor
workshops" to agree on what subsets to build and
what choices to make. The specification wasn't a
specification of a protocol. Instead it was a
framework in which a protocol could be designed and
implemented. [Folklore-00]

The IKE/ISAKMP framework relies on a "Domain of
Interpretation" (DOI) for the actual details. IKE/ISAKMP
has required numerous implementation workshops to reach
agreement on the interpretations of the specifications.
Implementation and testing has already taken several years.

In any case, if you want to read more, you can always get a copy
of the December ";login:" from any technical library, instead of
having me type it in for you.


> > > > It's interesting that OpenBSD has implemented IKE/ISAKMP already.
> > >
> > > What are you trying to say?
> >
> > That perhaps they would have something useful to say on the
> > subject.
>
> Can't get less FUD^H^H^Huseful, so I agree.

I meant that I would be interested in how they answer Mr. Simpson's
objections. All of them, not just the Fortezza based ones. He
outlines a number of vulnerabilities:

o Cookie crumb attack
o Cookie Jar Attack
o Cookie race attack
o Aggressive denial of service
o Cookie deficiency
o Revealed identities
o Futile filters
o Quick denial of service

and provides source code for the "Cookie crumbs" exploit.


I would be very interested in how people are going to defend an
IKE/ISAKMP system against this exploit.

The code runs on FreeBSD.

The author can be reached at: <wsim...@greendragon.com> if
you want to obtain source code.


Terry Lambert
te...@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Terry Lambert

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
> > > >The glue needed to build an N-way
> > > >machine will always be less expensive than N uniprocessor boxes.
> > >
> > > Not so. The special chip sets are usually priced at a premium.
> >
> >I think this is because they work, and allow things like more
> >than 2 PCI bus masters at a time, compared to many chipsets,
> >whose arbitration logic fails over 2 PCI masters.
>
> That's correct. Most of these chipsets are produced in relatively
> small volumes by server manufacturers, who must devote a lot of
> time, effort, equipment, and staff to R&D. One pays a premium
> for that!

My point here was that I don't give a damn how cheap it is,
if it doesn't work. It doesn't matter if I'm getting a palm
computer, a pager, or installing a network operations center:
if it doesn't work, it's not useful for anything but landfill.


> The most cost-effective solution, when one needs more computing
> resources than fit cheaply into one box, is to find ways to
> distribute the problem cleanly among MANY boxes. SMP is, most
> of the time, either a last resort or a way to throw money at
> the problem rather than finessing it.

I'm not even involved in your SMP thread; I'm only saying that
what's "special" about the chipsets you seem to find too expensive
is that they actually _work_, compared to the cheaper chipsets
you are putatively defending.

Brett Glass

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to
At 05:57 PM 12/15/1999 , Terry Lambert wrote:

>My point here was that I don't give a damn how cheap it is,
>if it doesn't work.

In this business, "cheap" doesn't necessarily mean "doesn't
work;" more often, it means "high volume." Which, in a
competitive market, means that it is more likely to work.
The bugs will come out quickly, and customers will switch
to other products if they're not fixed. (A non-competitive
market, such as desktop operating systems, is another
story, of course.)

> > The most cost-effective solution, when one needs more computing
> > resources than fit cheaply into one box, is to find ways to
> > distribute the problem cleanly among MANY boxes. SMP is, most
> > of the time, either a last resort or a way to throw money at
> > the problem rather than finessing it.
>
>I'm not even involved in your SMP thread; I'm only saying that
>what's "special" about the chipsets you seem to find too expensive
>is that they actually _work_, compared to the cheaper chipsets
>you are putatively defending.

I've seen some darn good "cheap" chipsets. VIA's come to
mind. In fact, VIA was good enough to be tapped by AMD to design
the motherboard chipsets for the Athlon.

--Brett

Jamie Bowden

unread,
Dec 16, 1999, 3:00:00 AM12/16/99
to

Brett Glass wrote:

> trying to put a generic SCSI drive into an SGI system several years
> ago. It was a struggle, because they used special mounting brackets
> and a special connector, trying to make it look as if you HAD to
> buy the drive from them at 4X the going price. But it was a plain
> old SCSI drive, and you could tell which brand by looking at the
> mechanical design.

Sorry Brett, but this is just wrong. SGI uses mounting sleds that only
work in their machines, but so does everyone else. The connectors are all
standard 50pin, 68pin, or SCA.

Jamie Bowden

--

"Of course, that's sort of like asking how other than Marketing, how Microsoft is different from any other software company..."
Kenneth G. Cavness

Brett Glass

unread,
Dec 17, 1999, 3:00:00 AM12/17/99
to
At 05:35 AM 12/16/1999 , Jamie Bowden wrote:

>Sorry Brett, but this is just wrong. SGI uses mounting sleds that only
>work in their machines, but so does everyone else. The connectors are all
>standard 50pin, 68pin, or SCA.

They weren't on THAT machine. Perhaps they've given up on that strategy
now.

--Brett
