
RAID 7 Is An Extension Of The 5 Berkeley RAID Levels


Lester Buck

Sep 20, 1992, 10:09:08 AM
[More reposting...]

>Newsgroups: comp.periphs.scsi
>From: ror...@icon.com (Randy Rorden)
>Organization: Icon International, Inc.
>Date: Sun, 20 Sep 1992 01:34:57 GMT
>Message-ID: <1992Sep20.0...@icon.com>
>References: <1992091715...@tuna.wang.com>
>Sender: ne...@icon.com
>Nntp-Posting-Host: kolob.icon.com
>Lines: 162

In article <1992091715...@tuna.wang.com> MICHAEL...@OFFICE.WANG.COM ("Michael Willett") writes:
>Peter Whittaker asks how SCSI-based RAID 7 relates to the earlier
>RAID levels, as far as standards go. I'm just now reading one of
>Storage Computer's reference papers entitled "RAID Aid: A Taxonomic
>Extension of the Berkeley Disk Array Schema." The paper summarizes
>the different Berkeley RAID 1 through 5 levels in terms of their
>architecture, advantages, and disadvantages. It then describes the
>extended levels 6 and 7 for RAID architecture, and presents conclusions.
>(Apparently RAID 7 is an extension of the earlier RAIDs, probably like
>the way computer vendors add their own extensions to UNIX.)
>
>Another one of their technical papers says that RAID 7 has three unique
>architectural features that differentiate it from other RAID levels,
>producing improved performance:
>
>1) RAID 7 is asynchronous with respect to usage of I/O data paths.
>Each I/O drive (includes all data and one or more of the parity
>drives) as well as each host interface (there may be multiple host
>interfaces) has independent control and data paths. This means that
>each can be accessed completely independent of the other. This is
>facilitated by a separate device cache for each device/interface as
>well.
>
>2) RAID 7 is asynchronous with respect to device hierarchy and data
>bus utilization. Each drive and each interface is connected to a
>high speed data bus and controlled by the embedded operating system
>to make independent transfers to and from the central cache.
>
>3) RAID 7 is asynchronous with respect to the operation of an embedded
>real time process oriented operating system. This means that
>exclusive of and independent of the host, or multiple host paths, the
>embedded operating system manages all I/O transfers asynchronously
>across the RAID 7 data and parity drives.
>
>The articles explain how RAID 7 overcomes the earlier RAID performance
>problems where RAID 3 fails on small reads and small writes,
>RAID 5 handles large and small writes poorly, and RAID 1 does not
>execute large and small writes well.
>
>The articles conclude that the architectural changes of level 6 and
>particularly level 7 explicitly recognize the asynchronous nature of
>disk drive usage, and thereby optimize the (host viewed) process of
>asynchronous data transfers. Authors and practitioners alike (references
>cited are a Berkeley report, Information Week, Datamation, and Computer
>Technology Review reports) have identified the requirement for RAID
>architecture to serve the transaction processing -- or asynchronous --
>environment, and levels 6 and 7 are a natural evolution of the RAID
>architecture to support these requirements.
>
>Since I'm not really an expert in this field, interested people might
>want to check directly with Storage Computer (Tel. 603-880-3005,
>FAX 603-889-7232) for "the rest of the story" and copies of the
>articles. Alternatively, I could US mail out some xerox copies of these
>technical articles to interested people. (They're a little too long
>for me to rekey.)
>
>I'll see if I can get some more information on this at their Banyan
>Users' Group meeting booth next week. I may also see them at today's
>AIX product fair at IBM.
>
>What opinions and advice might people have about this?
>
>------------------...@OFFICE.Wang.com-----------------------

I too read the press releases that implied that some kind of new RAID
level had been invented at Storage Computer Corp. I then read an article
by John O'Brien of Storage Computer that was published in the Spring
1992 Computer Technology Review. In it, he mentions the above-listed
three features that supposedly make RAID 7 different from other
RAID levels. I do not agree that these "architectural" features
constitute a different RAID level. They simply describe an implementation
of an already defined RAID organization.

As an introduction, O'Brien indicates that each RAID level reflects
a different design architecture. This is not correct. The original
and subsequent Berkeley papers [Patterson88] [Katz89] made it clear
that even though they described the RAID organizations in terms of
hardware implementations, it was done solely to simplify the presentation.
They also discuss the potential benefits and disadvantages of architectural
features such as buffering, caching, and asynchronous access to enhance the
performance of systems using various RAID organizations [Katz89]. They do not
assign new RAID levels to these features. That's because RAID levels
define different ways of organizing data on disk drives and ways of
providing redundancy so that lost data can be recovered when a drive fails,
not how those drives may be connected, controlled, cached, or buffered.

The RAID organization described in Mr. O'Brien's article is simply a
RAID level 4 with device-level and global caching. RAID 4 does block-level
interleaving across a set of data drives and stores the xor of the blocks
for a given "stripe" on a parity drive. RAID 4 is rarely used in
commercial disk arrays because of its tendency to "bottleneck"
on the parity drive. All writes that are smaller than the number of blocks
in a stripe must do a read-modify-write to the parity drive. RAID 5
helps to alleviate this problem by rotating the parity block across all
the drives, thus allowing multiple parity updates to occur simultaneously.
The overhead to calculate the drive for a given parity block is very
small - just a modulo of the stripe number. For this reason, array vendors
choose to implement RAID 5 instead. The Storage Computer product uses
caching to help reduce the parity drive "bottleneck" problem, for example
by keeping the old parity in cache to avoid an extra read, but the same
techniques can be applied to RAID 5 as well.
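
To make the parity arithmetic concrete, here is a minimal sketch in Python of
the xor parity and the read-modify-write update for a small write. The
function names and the two-byte blocks are made up for illustration; they are
not taken from any vendor's product or from the Berkeley papers.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of one or more equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: update parity without reading the other data drives."""
    return xor_blocks(old_parity, old_data, new_data)


# Three data blocks in one stripe, plus their parity block (illustrative values).
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(*stripe)

# Overwrite block 1 only; the new parity needs just the old data and old parity.
new_block = b"\xaa\xbb"
new_parity = small_write_parity(parity, stripe[1], new_block)
stripe[1] = new_block
assert new_parity == xor_blocks(*stripe)

# If the drive holding block 0 fails, its contents are the XOR of the survivors.
recovered = xor_blocks(stripe[1], stripe[2], new_parity)
assert recovered == stripe[0]
```

The old parity is an input to every small write, which is why keeping it in
cache saves a read, and why the same trick works whether the parity sits on
one drive (RAID 4) or is rotated (RAID 5).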

Other statements in the O'Brien article that I take issue with:

"Unlike RAID 5, the RAID 7 device expends no overhead to support a
rotated parity distribution scheme."

Technically, this is true, but as mentioned above, the overhead to perform
a modulo divide is extremely small in relation to the overall data transfer
time and can be done in parallel with device access such that it has no
impact at all.
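
As a rough illustration of how little work the rotation involves, the sketch
below locates the parity drive for a stripe under each scheme and counts where
the parity updates land. The layout (parity drive = stripe number modulo drive
count, and RAID 4 parity on the last drive) is an assumption chosen for
simplicity; real arrays use various rotation patterns.

```python
from collections import Counter


def raid4_parity_drive(stripe: int, n_drives: int) -> int:
    """RAID 4: parity always lives on one dedicated drive (assumed: the last)."""
    return n_drives - 1


def raid5_parity_drive(stripe: int, n_drives: int) -> int:
    """RAID 5: rotate parity across the drives; one modulo per lookup (assumed layout)."""
    return stripe % n_drives


N_DRIVES = 5
stripes_written = range(20)  # hypothetical burst of small writes, one per stripe

raid4_load = Counter(raid4_parity_drive(s, N_DRIVES) for s in stripes_written)
raid5_load = Counter(raid5_parity_drive(s, N_DRIVES) for s in stripes_written)

print("RAID 4 parity updates per drive:", dict(raid4_load))
print("RAID 5 parity updates per drive:", dict(raid5_load))
```

In this toy run all twenty parity updates hit one drive under RAID 4, while
RAID 5 spreads them four to a drive, at the cost of a single integer modulo
per request.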

"Unlike RAID 3 and RAID 5, RAID 7 supports multiple host connections."

There are no intrinsic restrictions to the support of multiple host
connections within any RAID organization, including levels 3 and 5.
I know of several disk array products, including one my own company
has just announced, that support multiple hosts on various RAID levels.

"RAID 3 offers an average access time longer than that of a single
spindle and cannot match the single spindle performance for small writes."

Again, technically correct, but only if you add "in cases where the disks in
the group are not synchronized." For this reason, implementations of
RAID 3 use spindle sync, which gives them exactly the same access time as
a single spindle.

"Unlike RAID 5 which can only be scaled up in multiples of its specific
write group size, RAID 7 is capable of being linearly scaled up."

To be honest, I'm not sure I understood this statement. There is nothing
that prevents a RAID 5 organization from being scaled to any number
of drives. If Mr. O'Brien meant that you can't add a drive to an
existing RAID 5 set to expand its capacity, this too is incorrect.
This can be done in two ways: a new drive can add its blocks at
the end of the existing set of logical blocks, with each new block
participating in the parity calculation of the existing stripes, or
the data and parity blocks can be re-spread across the new stripe
width. The first method can happen instantaneously (assuming all the
new blocks were pre-zeroed, so the existing parity remains valid), but
accesses to the added blocks will not enjoy the performance benefits
of striping.
In any case, there is no difference in "scalability" between RAID 5
and Storage Computer's RAID 4.
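
The "instantaneous" expansion works because xor with zero changes nothing. A
short Python sketch of that argument follows; the block contents and names
are hypothetical, and this shows only the parity arithmetic, not a real
expansion procedure.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of one or more equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# An existing stripe and its parity (illustrative values).
old_stripe = [b"\x12\x34", b"\x56\x78", b"\x9a\xbc"]
parity = xor_blocks(*old_stripe)

# A pre-zeroed block from the new drive joins the stripe's parity group.
zeroed_block = bytes(len(parity))
assert xor_blocks(*old_stripe, zeroed_block) == parity  # parity is still valid

# Later writes to the new block use the ordinary read-modify-write update.
new_data = b"\xde\xad"
parity_after_write = xor_blocks(parity, zeroed_block, new_data)
assert parity_after_write == xor_blocks(*old_stripe, new_data)
```

Nothing here depends on whether the parity block is fixed or rotated, so the
same reasoning applies to a RAID 4 or a RAID 5 set.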

I don't want to give the impression that I consider the architectural
features of the Storage Computer array to be inconsequential - on the
contrary, our own RAID 5 product uses many of these features to greatly
improve performance. I just don't agree with the implication that these
features, or any other implementation details, can rightly be called
a new RAID level. If this were true, there would arguably be as many
RAID levels as there are RAID vendors and products (a quickly expanding
number). "RAID 7" is a name a marketing type picked for a product, not
a new RAID level. I just hope the rest of the marketing types don't
pick this up and play "one-upmanship" with their own RAID 8, 9, etc.
(a co-worker suggested we should beat them to it by announcing a
RAID aleph-null :-) ).

--
Randy Rorden ror...@icon.com
Engineering Director
Sanyo Icon International, Inc. PHONE: (801) 225-6888
Orem, Utah 84057 FAX: (801) 226-0651

--
A. Lester Buck bu...@siswat.hou.tx.us ...!uhnix1!siswat!buck

Randy Rorden

Sep 20, 1992, 6:44:44 PM
My thanks to Lester Buck for recognizing that my posting is more
appropriate to this group. I also neglected to include my references,
a netiquette "faux pas" for sure. Here they are:

[Patterson88] D.A. Patterson, G.A. Gibson, and R.H. Katz, "A Case
for Redundant Arrays of Inexpensive Disks (RAID)," Proc. ACM SIGMOD Conf.,
Chicago, IL, June 1988.

[Katz89] R.H. Katz, G.A. Gibson, and D.A. Patterson, "Disk System
Architectures for High Performance Computing," Proc. IEEE, vol. 77,
no. 12, December 1989.
