Cache Coherency protocols for DSM.

Steven Hancock

Aug 18, 1994, 4:07:58 PM
to comp-p...@uunet.uu.net
Hi all,

I am working on my Master's thesis and would like more information
about software-based cache coherency protocols for Distributed
Shared Memory (DSM). Bibliographic references would be most helpful.

Thanks in advance.

+-----------------------------------------------------------+
| Steve Hancock steve....@msfc.nasa.gov |
+-----------------------------------------------------------+

John Carter

Aug 20, 1994, 1:53:22 AM
steve....@msfc.nasa.gov (Steven Hancock) writes:

> I am working on my Master's thesis and would like more information
> about software-based cache coherency protocols for Distributed
> Shared Memory (DSM). Bibliographic references would be most helpful.

[WARNING -- Yet again I've fallen prey to writing ten times more text
than I intended -- this is a LONG message. Hopefully you and others will
find it useful.]

Here's my quick "history of the world" when it comes to DSM protocols. The
discussion of consistency models is inherently wrapped up in this somewhat,
as all the protocols after the first take advantage of some form of relaxed
consistency (weak, release, entry, lazy release, ...).

DISCLAIMER: Most of the systems/protocols I list below were described in
multiple publications, but in general I only include one citation (the one
I consider "best"). I use bibtex formats throughout. Apologies to those
whose systems I forget or those who I misrepresent -- there have been more
DSM systems than you can shake a stick at. Also, remember that I'm doing
this from memory while sitting 5000 miles from my office.

First, here's a survey of the issues -- a good starting point, although it
has a couple of errors (e.g., Munin ran on the V system, not Sys V Unix :-):

@ARTICLE{nitzberglo91,
AUTHOR = {B. Nitzberg and V. Lo},
TITLE = {Distributed Shared Memory: A Survey of Issues and Algorithms},
JOURNAL = IEEE-COMPUTER,
VOLUME = {24},
NUMBER = {8},
PAGES = {52-60},
MONTH = aug,
YEAR = 1991}

>>> Grand progenitor of all that is DSM: Ivy (Li and Hudak) <<<

First system to support the shared memory abstraction on distributed memory
machines. Employed a single writer, write-invalidate (ownership-based)
sequentially consistent coherency protocol. Tried several different
directory management schemes -- distributed schemes worked best.
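
A minimal Python sketch of the single-writer, write-invalidate idea
described above. This is not Ivy's code: the names IvyLikeDSM, Page,
copyset, and invalidations are invented for illustration, a "page" holds
one value, and the directory is centralized to keep the sketch short.

```python
class Page:
    """Toy state for one shared page under single-writer write-invalidate."""

    def __init__(self, owner):
        self.owner = owner        # node holding the authoritative copy
        self.copyset = {owner}    # nodes that currently hold a valid copy
        self.value = 0            # page contents (one word, for brevity)


class IvyLikeDSM:
    """Hypothetical centralized directory; Ivy itself tried several schemes."""

    def __init__(self, num_pages, first_owner=0):
        self.pages = [Page(first_owner) for _ in range(num_pages)]
        self.invalidations = 0    # invalidation messages sent so far

    def read(self, node, page_no):
        page = self.pages[page_no]
        if node not in page.copyset:
            # Read fault: the node obtains a read-only copy.
            page.copyset.add(node)
        return page.value

    def write(self, node, page_no, value):
        page = self.pages[page_no]
        if page.owner != node or page.copyset != {node}:
            # Write fault: invalidate every other copy, then take ownership.
            self.invalidations += len(page.copyset - {node})
            page.copyset = {node}
            page.owner = node
        page.value = value
```

Two nodes that alternately write the same page pay one invalidation per
write in this model, which is the "ping-pong" behavior several of the
later systems try to avoid.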

@ARTICLE{lihudak89,
AUTHOR = {K. Li and P. Hudak},
JOURNAL = {ACM Transactions on Computer Systems},
NUMBER = 4,
PAGES = {321-359},
TITLE = {Memory Coherence in Shared Virtual Memory Systems},
VOLUME = 7,
MONTH = nov,
YEAR = 1989,
xNOTE = {An earlier version appeared in
{\em Proceedings of the 1986 5th Annual ACM
Symposium on Principles of Distributed Computing},
pages 229--239, August 1986.}}

>>> Early object-based DSMs: Clouds, Emerald, Amber, Orca <<<

CLOUDS introduced a number of interesting mechanisms on top of conventional
DSM (sequentially consistent, write-invalidate), including support for
objects and the ability to lock objects down to mitigate the potential for
"ping ponging" that can occur in Ivy (as well as conventional shared memory
hardware) due to false (and true) sharing.

@TECHREPORT{ramachandranahamad88,
AUTHOR = {U. Ramachandran and M. Ahamad and Y.A. Khalidi},
TITLE = {Unifying Synchronization and Data Transfer in Maintaining
Coherence of Distributed Shared Memory},
INSTITUTION = {Georgia Institute of Technology},
YEAR = 1988,
NUMBER = {GIT-CS-88/23},
MONTH = jun}

[Quite a few Clouds papers have been published over the years, but I've
never figured out which one is best to cite. This tech report is the one I
learned from, and includes all of the important details, so I cite it.
Perhaps one of the authors could follow up and tell us which one they
prefer.]

EMERALD supported consistency in a very different way from conventional
DSM systems -- by disallowing replication. Traditionally, threads migrated
to where the data was (via RPCs/object invocations), although the programmer
could move objects using special system calls.

@ARTICLE{jullevy88,
AUTHOR = {E. Jul and H. Levy and N. Hutchinson and A. Black},
TITLE = {Fine-Grained Mobility in the {Emerald} System},
JOURNAL = ACM-TOCS,
YEAR = 1988,
VOLUME = {6},
NUMBER = {1},
PAGES = {109-133},
MONTH = feb}

AMBER added elements of DSM to an Emerald-like distributed objects system
(a gross simplification, I know). The basic consistency protocol was
similar to that of previous systems, but not as restrictive as Emerald's.

@CONFERENCE{chaseamador89,
AUTHOR = {J.S. Chase and F.G. Amador and E.D. Lazowska and
H.M. Levy and R.J. Littlefield},
TITLE = {The {A}mber System: Parallel Programming on a Network
of Multiprocessors},
BOOKTITLE = SOSP12,
YEAR = 1989,
PAGES = {147-158},
MONTH = dec}

ORCA added a number of nice enhancements on top of the conventional
distributed shared object model, including the use of compiler
optimizations to dynamically switch between write-invalidate and
write-update schemes. It ran on the Amoeba operating system, exploiting
Amoeba's ordered multicast protocol to perform efficient updates.

@ARTICLE{balkaashoek92,
AUTHOR = {H.E. Bal and M.F. Kaashoek and A.S. Tanenbaum},
TITLE = {Orca: A Language for Parallel Programming of Distributed
Systems},
YEAR = {1992},
JOURNAL = IEEE-TSE,
PAGES = {190-205},
MONTH = mar}


>>> Early sequentially consistent follow-ons to IVY: Mirage, Mether <<<

MIRAGE attacked the ping-pong (excessive invalidation and read miss)
problem that could occur in Ivy in the presence of heavy (false) sharing by
introducing a unique timeout parameter. Once a node received (exclusive?)
access to a piece of data, it was guaranteed to keep the data for at least
some fixed minimum amount of time (delta-T). In this way, forward progress
was guaranteed. Although I never published anything about it, this type of
mechanism seriously improved the performance of conventional read-replicate
and write-invalidate in a number of circumstances.
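
The delta-T rule can be sketched in a few lines of Python. This is only
an illustration of the timing guarantee, not Mirage's implementation;
MiragePage, request, and the abstract time units are assumptions made
for the example.

```python
class MiragePage:
    """Toy model of a page whose holder keeps it for at least delta_t."""

    def __init__(self, delta_t):
        self.delta_t = delta_t    # minimum hold time, in arbitrary time units
        self.holder = None        # node that currently has the page
        self.granted_at = None    # time at which the holder received it

    def request(self, node, now):
        """Return the time at which `node` gets access to the page."""
        if self.holder == node:
            return now            # already holds the page, no transfer needed
        if self.holder is None:
            grant = now           # nobody holds it yet
        else:
            # The current holder may not be preempted before its window ends.
            grant = max(now, self.granted_at + self.delta_t)
        self.holder, self.granted_at = node, grant
        return grant
```

With delta_t set to 10, a request from a second node at time 3 is not
served until time 10, so the first node gets a guaranteed window in
which to make progress.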

@CONFERENCE{fleischpopek89,
AUTHOR = {B. Fleisch and G. Popek},
TITLE = {Mirage: A Coherent Distributed Shared Memory Design},
BOOKTITLE = SOSP12,
PAGES = {211-223},
MONTH = dec,
YEAR = 1989}

METHER was one of the few (only?) DSM systems that actually made it out
into public release. It supported a variety of specialized protocols on
top of SunOS - one set for traditional data motion and another that was
most useful for synchronization/short messages.

@CONFERENCE{minnichfarber89,
AUTHOR = {R.G. Minnich and D.J. Farber},
TITLE = {The {Mether} System: A Distributed Shared Memory for
{SunOS} 4.0},
BOOKTITLE = {Proceedings of the Summer 1989 USENIX Conference},
PAGES = {51-60},
YEAR = 1989,
MONTH = jun}

>>> Weak consistency <<<

Somewhere in this time frame, Scheurich and Dubois (and Briggs?) introduced
the notion of relaxed consistency models, through the introduction of "weak
consistency". The general idea of weak consistency is that even on
uniprocessor systems, programmers need to use synchronization to protect
shared data to handle arbitrarily-timed context switches. Since
programmers already worry about synchronizing access to shared data, why
not relax the requirements on the memory system so that shared data is
only required to be consistent at synchronization points? This relaxation
has little impact on program correctness (none in most cases -- see the
papers for details), but can lead to significant improvements in memory
system performance through the introduction of buffering and pipelining.
Although I don't know of any DSM systems that made use of weak consistency
as originally designed, its development was an important turning point in
DSM protocol design. The Myrias SPS-1 multiprocessor supported a form of
parallel-loop vector memory accesses that was probably weakly consistent, but I
never saw their protocol discussed in sufficient detail in any publication
to say for sure.

@CONFERENCE{duboisscheurich86,
AUTHOR = {M. Dubois and C. Scheurich and F.A. Briggs},
TITLE = {Memory access buffering in multiprocessors},
BOOKTITLE = SIGARCH86,
MONTH = may,
YEAR = 1986,
PAGES = {434-442}}

@ARTICLE{duboisscheurich88,
AUTHOR = {M. Dubois and C. Scheurich and F.A. Briggs},
TITLE = {Synchronization, coherence, and event ordering in
multiprocessors},
JOURNAL = {{IEEE} Computer},
VOLUME = 21,
NUMBER = 2,
PAGES = {9-21},
MONTH = feb,
YEAR = 1988}

@BOOKLET{myrias90,
TITLE = {System Overview},
AUTHOR = {{Myrias Corporation}},
ADDRESS = {Edmonton, Alberta},
YEAR = 1990}

>>> Release consistent DSMs: DASH, Munin, Unnamed (Dubois, et al.) <<<

DASH isn't really a "DSM system," rather it's a hardware DSM multiprocessor,
but many of the issues are the same. Of particular interest to software
DSM developers is the development of the Release Consistency memory model,
a follow-on to Weak Consistency that exploited the logical difference
between acquires and releases. Only when you perform a release operation
(it's easiest to think of this as being identical to releasing the lock at
the end of a critical section, but it turns out that "release operation" is
more general than this) do you need to ensure that the modifications you
made to shared data be made globally visible. Only at acquires do you need
to guarantee that you're consistent with remote writes made before the
corresponding release. This further relaxation improves memory system
performance even further. The DASH multiprocessor took advantage of this
by introducing a write buffer that allowed invalidation requests and
acknowledgements to be pipelined, essentially dropping the cost of
performing a shared write to the cost of communicating with the directory
to obtain ownership (almost all data invalidations were ack'd by the time
you hit the release point).
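
A rough Python sketch of the buffering that release consistency permits.
It is an assumption-laden toy: it propagates updates rather than
pipelining invalidations as DASH does, and RCNode, Cluster, pending, and
messages are names invented for the example.

```python
class RCNode:
    """Toy node that buffers shared writes until it performs a release."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.cache = {}      # this node's view of shared memory
        self.pending = {}    # writes made since the last release

    def read(self, addr):
        return self.cache.get(addr)

    def write(self, addr, value):
        self.cache[addr] = value
        self.pending[addr] = value    # later writes overwrite earlier ones

    def release(self):
        """Make all buffered writes visible to the other nodes."""
        if not self.pending:
            return
        for peer in self.cluster.nodes.values():
            if peer is not self:
                peer.cache.update(self.pending)
                self.cluster.messages += 1    # one message per peer
        self.pending = {}


class Cluster:
    """Hypothetical container that counts coherence messages."""

    def __init__(self, names):
        self.messages = 0
        self.nodes = {name: RCNode(name, self) for name in names}
```

Three writes followed by one release cost a single message per peer in
this model, where a sequentially consistent protocol would have to
communicate on each write.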

@CONFERENCE{gharachorloolenoski90,
AUTHOR = {K. Gharachorloo and D. Lenoski and J. Laudon and
P. Gibbons and A. Gupta and J. Hennessy},
TITLE = {Memory Consistency and Event Ordering in Scalable
Shared-Memory Multiprocessors},
BOOKTITLE = sigarch90,
ADDRESS = {Seattle, Washington},
MONTH = may,
YEAR = 1990,
PAGES = {15-26}}

@CONFERENCE{lenoskilaudon90,
AUTHOR = {D. Lenoski and J. Laudon and K. Gharachorloo and
A. Gupta and J. Hennessy},
TITLE = {The directory-based cache coherence protocol for
the {DASH} multiprocessor},
BOOKTITLE = SIGARCH90,
YEAR = 1990,
PAGES = {148-159},
MONTH = may}

@CONFERENCE{gharachorloogupta91,
AUTHOR = {K. Gharachorloo and A. Gupta and J. Hennessy},
TITLE = {Performance Evaluations of Memory Consistency Models
for Shared-Memory Multiprocessors},
BOOKTITLE = ASPLOS4,
YEAR = 1991,
MONTH = apr}


[WARNING: Munin is my baby, so I'm not exactly unbiased about it.]

MUNIN introduced a number of novel mechanisms designed to reduce what I saw
as the most serious performance problem of conventional DSMs -- excessive
communication. In particular, false sharing is common given the large
granularity of VM-based DSM systems (4k-16k pages), and the cost of read
misses due to invalidations will eat your lunch if there is more than a
trivial amount of sharing in your application. The major contributions of
Munin were (i) the first use of relaxed consistency in a DSM, (ii) support
for multiple protocols in a single DSM system, and (iii) efficient support
for multiple concurrent writers to a single data item. Points (i) and
(iii) are closely related -- Munin's "write-shared" protocol allowed
multiple nodes to simultaneously modify their cached copy of a data item
without communicating, using a "diffing" mechanism to merge updates when
the nodes synchronized. This general idea has been incorporated in several
follow on systems. Of particular importance, the write-shared protocol was
update-based (as opposed to invalidation-based as in all previous systems),
which greatly reduced the number of high-latency read misses that occur
when the invalidated nodes next access the shared data. Previous update
schemes had serious bandwidth problems, but the use of RC allowed Munin to
buffer and coalesce updates to the same cache line between release points,
greatly reducing the amount of update traffic.
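
The "diffing" idea can be sketched in Python as follows. This is a
simplified illustration rather than Munin's code: pages are short lists
of words, each writer keeps a pristine copy (called a twin here) taken
before its first write, and make_diff and apply_diff are hypothetical
helper names.

```python
def make_diff(twin, page):
    """Return {offset: new_value} for every word that differs from the twin."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page, diff):
    """Merge another writer's changes into the local copy, in place."""
    for offset, value in diff.items():
        page[offset] = value


def merge_example():
    """Two nodes write disjoint words of one page, then exchange diffs."""
    base = [0] * 8
    page_a, page_b = list(base), list(base)
    twin_a, twin_b = list(page_a), list(page_b)   # snapshots before writing

    page_a[0], page_a[1] = 1, 2    # node A updates the front of the page
    page_b[6] = 9                  # node B updates a word near the end

    diff_a = make_diff(twin_a, page_a)
    diff_b = make_diff(twin_b, page_b)
    apply_diff(page_a, diff_b)     # at synchronization, swap only the diffs
    apply_diff(page_b, diff_a)
    return page_a, page_b
```

Because only the changed words travel, two writers that touch different
parts of the same page (false sharing) end up with identical merged
copies without ever invalidating each other.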

In addition, Munin supported a suite of consistency protocols (read-only,
write-shared, conventional, migratory, synch) as well as hooks to let you
write your own, so that the programmer could specify the way in which they
wanted specific data items managed (using the protocol that most
efficiently managed each variable). In addition, programmers could specify
a tie between locks and data so that data would migrate with the locks that
protect it (similar to some ideas that follow, but not nearly so clean).

@CONFERENCE{bennettcarter90b,
AUTHOR = {J.K. Bennett and J.B. Carter and W. Zwaenepoel},
TITLE = {Adaptive Software Cache Management for
Distributed Shared Memory Architectures},
BOOKTITLE = SIGARCH90,
YEAR = 1990,
PAGES = {125-134},
MONTH = may}

@CONFERENCE{carterbennett91,
AUTHOR = {J.B. Carter and J.K. Bennett and W. Zwaenepoel},
TITLE = {Implementation and Performance of {M}unin},
BOOKTITLE = SOSP13,
PAGES = {152-164},
MONTH = oct,
YEAR = 1991}

@ARTICLE{carterbennett94,
AUTHOR = {J.B. Carter and J.K. Bennett and W. Zwaenepoel},
TITLE = {Techniques for Reducing Consistency-Related Communication
in Distributed Shared Memory Systems},
JOURNAL = ACM-TOCS,
NOTE = {To appear},
YEAR=1994}

[You can get these and other papers via anonymous ftp to cs.utah.edu:pub/dsm]


Another use of release consistency to delay coherence operations was
described in a paper by Dubois, Wang, Barroso, Lee, and Chen in
Supercomputing 1991. They evaluated the effect of adding buffers to a
multiprocessor cache to delay the sending and/or receiving of invalidation
messages. They found that for large cache line sizes and applications with
a lot of sharing, this could dramatically reduce the number of spurious
(false) read misses that occurred. Again, this was for a hardware system,
but the same protocol mechanisms could be incorporated into a software DSM.

[I just came across this one recently, so excuse the partial reference.]

@CONFERENCE{duboiswang,
AUTHOR = {M.D. Dubois and J-C Wang and L.A. Barroso and K. Lee and
Y-S Chen},
TITLE = {Delayed Consistency and Its Effects on the Miss Rate of
Parallel Programs},
BOOKTITLE = SUPERCOMPUTING91,
YEAR = {1991},
MONTH = {nov}}


>>> Lazy Release Consistent DSMs: Memo -> Treadmarks <<<

TREADMARKS (nee Memo) extends the multiwriter release consistency ideas of
Munin by exploiting a variant of release consistency known as Lazy Release
Consistency. Unlike Munin, which purged updates immediately upon the
release of a lock/barrier (pushes data "eagerly"), Treadmarks pulls data
lazily when a processor performs the corresponding acquire request. While
this has no impact when barriers are used, it can reduce communication
requirements dramatically in some cases (with some tradeoff in the amount
of past state that a node needs to maintain).
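
A toy Python sketch of the lazy "pull at acquire" behavior. It assumes a
single lock and keeps a node's whole write history forever; a real
system has to track ordering across many locks and prune old state,
which this sketch ignores. LazyNode, Lock, history, and stats are names
made up for the example.

```python
class Lock:
    """Toy lock that only remembers who released it last."""

    def __init__(self):
        self.last_releaser = None


class LazyNode:
    def __init__(self, name, stats):
        self.name = name
        self.stats = stats    # shared dict holding a "messages" counter
        self.cache = {}       # this node's view of shared memory
        self.history = {}     # writes this node knows about and may pass on

    def read(self, addr):
        return self.cache.get(addr)

    def write(self, addr, value):
        self.cache[addr] = value
        self.history[addr] = value

    def release(self, lock):
        # Lazy: nothing is sent now, the releaser just leaves its name.
        lock.last_releaser = self

    def acquire(self, lock):
        prev = lock.last_releaser
        if prev is not None and prev is not self:
            # Pull everything the previous releaser knew, in one message.
            self.cache.update(prev.history)
            self.history.update(prev.history)
            self.stats["messages"] += 1
```

A node that never acquires the lock never receives the updates, which
is where the communication savings over an eager push come from; the
history that each node carries is the extra past state mentioned above.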

[Treadmarks is currently being commercialized. For information on getting
a license to run it on networks of SUN, DEC, SGI, and HP (soon) workstations,
send email to {alc,willy}@cs.rice.edu. A free plug for friends...]

@CONFERENCE{kelehercox92,
AUTHOR = {P. Keleher and A. L. Cox and W. Zwaenepoel},
TITLE = {Lazy Consistency for Software Distributed Shared Memory},
BOOKTITLE = SIGARCH92,
YEAR = 1992,
PAGES = {13-21},
MONTH = may}

@CONFERENCE{coxdwarkadas94,
AUTHOR = {A.L. Cox and S. Dwarkadas and P. Keleher and H. Lu and
R. Rajamony and W. Zwaenepoel},
TITLE = {Software Versus Hardware Shared-Memory Implementation:
A Case Study},
BOOKTITLE = SIGARCH94,
YEAR = 1994,
MONTH = may}

@CONFERENCE{keleherdwarkadas94,
AUTHOR = {P. Keleher and S. Dwarkadas and A. Cox and W. Zwaenepoel},
TITLE = {TreadMarks: Distributed Shared Memory On Standard Workstations
and Operating Systems},
BOOKTITLE = W-USENIX94,
YEAR = 1994,
PAGES = {115-131},
MONTH = jan}


>>> Entry Consistent DSMs: Midway <<<

MIDWAY introduced the notion of Entry Consistency, which requires that each
variable be explicitly linked to a lock. Then, the data is moved with the
lock when it is acquired. Unlike previous DSM systems, which relied on the
VM system (page faults) or a strict object model (object invocations) to
detect accesses to shared data, Midway used a modified version of gcc to
detect when a shared data object was modified. This avoids the high cost
(on current OSs) of handling page faults and playing games with page
protections. (I'm going to put some words in the authors' mouths here -- I
hope they'll correct me if I'm wrong! ->) To overcome a couple of problems
with entry consistency's programming model, namely the need to write your
programs so that each block of shared data was protected by a single lock
(which in certain cases could force the programmer to decompose their data
into excessively small pieces), Midway also supports release and
sequentially consistent coherency protocols. The programmer (or compiler)
specifies what strength model to use when.
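
A small Python sketch of the entry consistency binding between locks and
data. It is illustrative only: Midway detects writes with compiler
support, whereas this toy simply copies the guarded variables at release
time, and EntryLock, ECNode, guarded, and mem are invented names.

```python
class EntryLock:
    """Toy lock that carries the variables explicitly bound to it."""

    def __init__(self, guarded):
        self.guarded = tuple(guarded)                  # names bound to this lock
        self.values = {name: 0 for name in guarded}    # data that moves with it


class ECNode:
    def __init__(self):
        self.mem = {}    # this node's local copies of shared variables

    def acquire(self, lock):
        # Only the variables bound to this lock are made consistent.
        self.mem.update(lock.values)

    def release(self, lock):
        # Hand the guarded variables back to the lock for the next acquirer.
        for name in lock.guarded:
            lock.values[name] = self.mem.get(name, lock.values[name])
```

A write to a variable that is not bound to the lock being held stays
local in this model, which is the programming-model burden discussed
above: every piece of shared data has to be associated with a lock.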

@CONFERENCE{bershadzekauskas93,
AUTHOR = {B.N. Bershad and M.J. Zekauskas and W.A. Sawdon},
TITLE = {The {Midway} Distributed Shared Memory System},
BOOKTITLE = {COMPCON '93},
PAGES = {528-537},
MONTH = feb,
YEAR = {1993}}

@TECHREPORT{bershadzekauskas91,
AUTHOR = {B.N. Bershad and M.J. Zekauskas},
TITLE = {Midway: Shared Memory Parallel Programming with Entry
Consistency for Distributed Memory Multiprocessors},
INSTITUTION = {Carnegie-Mellon University},
YEAR = {1991},
NUMBER = {CMU-CS-91-170},
MONTH = sep}


>>> Mixed Hardware/Software DSM systems: FLASH and Tempest/Typhoon <<<

FLASH is a follow-on to the DASH project (see above). It's basically a
hardware DSM multiprocessor, but the cache controller is going to include a
slimmed down RISC processor, and thus will be "software controlled" (by the
controller's processor, not the main CPU) and could include various
software protocols used in pure software DSMs. Some issues they're
exploring include support for both message passing and shared memory as
communication features (available in some of the above sw systems) and
the ability to model either a CC-NUMA or COMA architecture. I'm sure
they've got a pile of other things up their sleeves that they haven't
let the rest of us in on yet. :)

@CONFERENCE{kuskinofelt94,
AUTHOR = {J. Kuskin and D. Ofelt and others},
TITLE = {The {Stanford FLASH} Multiprocessor},
BOOKTITLE = SIGARCH94,
YEAR = 1994,
MONTH = may}


The TEMPEST/TYPHOON project combines a multiprocessor architecture and
software interface. From the point of view of a DSM designer, the hardware
provides a set of fast support mechanisms for efficient software-controlled
consistency management, and the software (including a sophisticated
compiler) can exploit the hardware to support mechanisms like Munin's
multiple writer protocol, Midway's entry consistency-like links between
data and synchronization, and other mechanisms still being developed. It
is the only hardware system that supports multiple consistency protocols
(like Munin) or memory modes (like Midway). There are a pile of papers
coming out in the next couple of months on recent work (see other messages
in this newsgroup).

@CONFERENCE{hilllarus92,
AUTHOR={M. D. Hill and J. R. Larus and S. K. Reinhardt
and D. A. Wood},
TITLE={Cooperative Shared Memory: Software and Hardware Support
for Scaleable Multiprocessors},
YEAR = 1992,
PAGES = {262-273},
MONTH = oct,
BOOKTITLE = ASPLOS5}

@CONFERENCE{woodchandra93,
AUTHOR = {D.A. Wood and S. Chandra and B. Falsafi and M.D. Hill and
J.R. Larus and A.R. Lebeck and J.C. Lewis and S.S. Mukherjee
and S. Palacharla and S.K. Reinhardt},
TITLE = {Mechanisms for Cooperative Shared Memory},
BOOKTITLE = SIGARCH93,
YEAR = 1993,
PAGES = {156-167},
MONTH = may}

@CONFERENCE{reinhardtlarus94,
AUTHOR = {S.K. Reinhardt and J.R. Larus and D.A. Wood},
TITLE = {{Tempest and Typhoon}: User-Level Shared Memory},
BOOKTITLE = SIGARCH94,
MONTH = apr,
YEAR = 1994}


That pretty much gets us up to the current situation. Don't read too much
into the order of the systems near the end of my list -- several of the
projects are ongoing and more results are sure to come soon.

Here are some more readings that you might find useful either as further
background or as a glimpse at how hardware folks are incorporating some of
the ideas seen in the above DSM systems (perhaps without realizing that
they are doing so) or just plain because I think it's interesting work that
is relevant. Of particular interest is the general discussion of directory
based consistency protocols by Agarwal et al and their implementation in
Alewife.

[These entries are in alphabetical order by first author's surname.]

@CONFERENCE{agarwallim90,
AUTHOR = {A. Agarwal and B.-H. Lim and D. Kranz and J. Kubiatowicz},
TITLE = {{APRIL}: A Processor Architecture for Multiprocessing},
BOOKTITLE = SIGARCH90,
PAGES = {104-114},
MONTH = may,
YEAR = 1990}

@CONFERENCE{ahamadhutto91,
AUTHOR = {M. Ahamad and P.W. Hutto and R. John},
TITLE = {Implementing and Programming
Causal Distributed Shared Memory},
BOOKTITLE = DCS91,
YEAR = 1991,
PAGES = {274-281},
MONTH = may}

@ARTICLE{ahujacarreiro86,
AUTHOR = {S. Ahuja and N. Carreiro and D. Gelernter},
TITLE = {Linda and Friends},
JOURNAL = IEEE-COMPUTER,
YEAR = 1986,
VOLUME = {19},
NUMBER = {8},
PAGES = {26-34},
MONTH = aug}

@ARTICLE{almesblack85,
AUTHOR = {G.T. Almes and A.P. Black and E.D. Lazowska and J.D. Noe},
TITLE = {The {E}den system: A technical review},
JOURNAL = IEEE-TSE,
VOLUME = {SE-11},
NUMBER = {1},
PAGES = {43-59},
YEAR = {1985},
MONTH = jan}

@CONFERENCE{alversoncallahan90,
AUTHOR = {R. Alverson and D. Callahan and D. Cummings and
B. Koblenz and A. Porterfield and B. Smith},
TITLE = {The {Tera} Computer System},
BOOKTITLE = ICS90,
YEAR = 1990,
PAGES = {1-6},
MONTH = sep}

@CONFERENCE{andersonlevy91,
AUTHOR = {T.E. Anderson and H.M. Levy and
B.N. Bershad and E.D. Lazowska},
TITLE = {The Interaction of Architecture and Operating System Design},
BOOKTITLE = ASPLOS4,
YEAR = 1991,
PAGES = {108-120},
MONTH = apr}

@ARTICLE{archibaldbaer86,
AUTHOR = {J. Archibald and J.-L. Baer},
TITLE = {Cache Coherence Protocols: Evaluation Using a Multiprocessor
Simulation Model},
JOURNAL = ACM-TOCS,
VOLUME = {4},
NUMBER = {4},
PAGES = {273-298},
MONTH = nov,
YEAR = 1986}

@CONFERENCE{baltanenbaum88,
AUTHOR = {H.E. Bal and A.S. Tanenbaum},
TITLE = {Distributed Programming with Shared Data},
BOOKTITLE = {Proceedings of the 1988 International Conference on
Computer Languages},
PAGES = {82-91},
MONTH = oct,
YEAR = 1988}

@CONFERENCE{bisianiravishankar90,
AUTHOR = {R. Bisiani and M. Ravishankar},
TITLE = {{PLUS}: A Distributed Shared-Memory System},
BOOKTITLE = SIGARCH90,
PAGES = {115-124},
MONTH = may,
YEAR = 1990}

@CONFERENCE{blackhutchinson86,
AUTHOR = {A. Black and N. Hutchinson and E. Jul and H. Levy},
TITLE = {Object Structure in the {E}merald System},
YEAR = 1986,
BOOKTITLE = OOPSLA86,
PAGES = {78-86},
MONTH = oct,
NOTE = {Special Issue of SIGPLAN Notices, Volume 21,
Number 11, November, 1986}}

@ARTICLE{blackhutchinson87,
AUTHOR = {A. Black and N. Hutchinson and E. Jul and
H. Levy and L. Carter},
TITLE = {Distribution and Abstract Types in {E}merald},
JOURNAL = IEEE-TSE,
YEAR = 1987,
VOLUME = {SE-13},
NUMBER = {1},
PAGES = {65-74},
MONTH = jan}

@CONFERENCE{borrmannherdieckerhoff90,
AUTHOR = {Lothar Borrmann and Martin Herdieckerhoff},
TITLE = {A Coherency Model for Virtually Shared Memory},
BOOKTITLE = ICPP90,
PAGES = {252-257},
MONTH = aug,
YEAR = 1990}

@ARTICLE{censierfeautrier78,
AUTHOR = {L. Censier and P. Feautrier},
TITLE = {A New Solution to Coherence Problems in Multicache Systems},
JOURNAL = IEEE-TC,
YEAR = 1978,
VOLUME = {C-27},
NUMBER = {12},
PAGES = {1112-1118},
MONTH = dec}

@CONFERENCE{chaikenagarwal94,
AUTHOR = {D. Chaiken and A. Agarwal},
TITLE = {Software-Extended Coherent Shared Memory:
Performance and Cost},
BOOKTITLE = SIGARCH94,
MONTH = apr,
YEAR = 1994}

@CONFERENCE{chaikenkubiatowicz91,
AUTHOR = {D. Chaiken and J. Kubiatowicz and A. Agarwal},
TITLE = {Limit{LESS} Directories: A Scalable Cache Coherence Scheme},
BOOKTITLE = ASPLOS4,
PAGES = {224-234},
MONTH = apr,
YEAR = 1991}

@CONFERENCE{cheongveidenbaum88,
AUTHOR = {H. Cheong and A.V. Veidenbaum},
TITLE = {A Cache Coherence Scheme with Fast Selective Invalidation},
BOOKTITLE = sigarch88,
YEAR = 1988,
PAGES = {138-145},
MONTH = jun}

@ARTICLE{cheriton85,
AUTHOR = {D.R. Cheriton},
TITLE = {Preliminary Thoughts on Problem-Oriented Shared Memory:
A Decentralized Approach to Distributed Systems},
JOURNAL = osr,
YEAR = 1985,
VOLUME = {19},
NUMBER = {4},
PAGES = {26-33},
MONTH = oct}

@CONFERENCE{cheritonslavenburg86,
AUTHOR = {D.R. Cheriton and G.A. Slavenburg and P.D. Boyle},
TITLE = {Software-Controlled Caches in the {VMP} Multiprocessor},
BOOKTITLE = sigarch86,
YEAR = 1986,
MONTH = dec}

@CONFERENCE{coxfowler93,
AUTHOR = {A.L. Cox and R.J. Fowler},
TITLE = {Adaptive Cache Coherency for Detecting Migratory Shared Data},
BOOKTITLE = SIGARCH93,
YEAR = 1993,
PAGES = {98-108},
MONTH = may}

@CONFERENCE{delpsethi88,
AUTHOR = {G.S. Delp and A.S. Sethi and D.J. Farber},
TITLE = {An Analysis of {MemNet}: An Experiment in High-Speed
Shared-Memory Local Networking},
BOOKTITLE = SIGCOMM88,
YEAR = 1988,
PAGES = {165-174},
MONTH = aug}

@CONFERENCE{dubnickileblanc92,
AUTHOR = {C. Dubnicki and T. LeBlanc},
TITLE = {Adjustable block size coherent caches},
BOOKTITLE = SIGARCH92,
YEAR = 1992,
PAGES = {170-180},
MONTH = may}

@CONFERENCE{eggerskatz89,
AUTHOR = {S.J. Eggers and R.H. Katz},
TITLE = {The Effect of Sharing on the Cache and
Bus Performance of Parallel Programs},
BOOKTITLE = ASPLOS3,
YEAR = 1989,
PAGES = {257-270},
MONTH = apr}

@ARTICLE{fitzgeraldrashid86,
AUTHOR = {R. Fitzgerald and R.F. Rashid},
TITLE = {The Integration of Virtual Memory Management
and Interprocess Communication in {Accent}},
JOURNAL = ACM-TOCS,
VOLUME = {4},
NUMBER = {2},
PAGES = {147-177},
MONTH = may,
YEAR = 1986}

@CONFERENCE{forinbarrera89,
AUTHOR = {A. Forin and J. Barrera and R. Sanzi},
TITLE = {The Shared Memory Server},
BOOKTITLE = W-USENIX89,
YEAR = 1989,
PAGES = {229-243},
MONTH = dec}

@TECHREPORT{goodman91,
AUTHOR = {J.R. Goodman},
TITLE = {Cache consistency and sequential consistency},
INSTITUTION = {University of Wisconsin-Madison},
YEAR = 1991,
NUMBER = {CS-1006},
MONTH = feb}

@CONFERENCE{goodmanvernon89,
AUTHOR = {J. R. Goodman and M. K. Vernon and P.J. Woest},
TITLE = {Efficient Synchronization Primitives for Large-Scale
Cache-Coherent Multiprocessors},
BOOKTITLE = ASPLOS3,
PAGES = {64-75},
YEAR = 1989,
MONTH = apr}

@ARTICLE{guptaweber92,
AUTHOR = {A. Gupta and W.-D. Weber},
TITLE = {Cache Invalidation Patterns in Shared-Memory Multiprocessors},
JOURNAL = IEEE-TC,
VOLUME = {41},
NUMBER = {7},
PAGES = {794-810},
MONTH = jul,
YEAR = 1992}

@ARTICLE{hagerstenlandin92,
AUTHOR = {E. Hagersten and A. Landin and S. Haridi},
TITLE = {{DDM} -- {A} Cache-Only Memory Architecture},
JOURNAL = IEEE-COMPUTER,
VOLUME = {25},
NUMBER = {9},
PAGES = {241-248},
YEAR = 1992,
MONTH = sep}

@CONFERENCE{heinleingharachorloo94,
AUTHOR = {J. Heinlein and K. Gharachorloo and A. Gupta},
TITLE = {Integrating Multiple Communication Paradigms in
High Performance Multiprocessors},
BOOKTITLE = SIGARCH94,
MONTH = apr,
YEAR = 1994}

@CONFERENCE{huttoahamad90,
AUTHOR = {P.W. Hutto and M. Ahamad},
TITLE = {Slow Memory: Weakening Consistency to Enhance Concurrency
in Distributed Shared Memories},
BOOKTITLE = DCS90,
MONTH = may,
YEAR = 1990,
PAGES = {302-311}}

@ARTICLE{james90,
AUTHOR = {D.V. James},
TITLE = {Distributed Directory Scheme: Scalable Coherent Interface},
JOURNAL = IEEE-COMPUTER,
YEAR = 1990,
VOLUME = 23,
NUMBER = 6,
PAGES = {74-77},
MONTH = jun}

@CONFERENCE{kranzjohnson93,
Author="David Kranz and Kirk Johnson and Anant Agarwal and
John Kubiatowicz and Beng-Hong Lim",
Title="{Integrating Message-Passing and Shared-Memory: Early
Experience}",
BookTitle="Principles and Practice of Parallel
Programming (PPoPP) 1993",
Month="May",
Year="1993",
Pages="54-63"}

@ARTICLE{lamport79,
AUTHOR = {L. Lamport},
TITLE = {How to Make a Multiprocessor Computer that Correctly
Executes Multiprocess Programs},
JOURNAL = IEEE-TC,
YEAR = 1979,
VOLUME = {C-28},
NUMBER = {9},
PAGES = {690-691},
MONTH = sep}

@TECHREPORT{liptonsandberg88,
AUTHOR = {R.J. Lipton and J.S. Sandberg},
TITLE = {{PRAM}: A scalable shared memory},
INSTITUTION = {Princeton University},
YEAR = 1988,
NUMBER = {CS-TR-180-88},
MONTH = sep}

@ARTICLE{liskov88,
AUTHOR = {B. Liskov},
TITLE = {Distributed Programming in {A}rgus},
JOURNAL = CACM,
YEAR = {1988},
VOLUME = {31},
NUMBER = {3},
PAGES = {300-312},
MONTH = mar}

@CONFERENCE{minnichfarber90,
AUTHOR = {R.G. Minnich and D.J. Farber},
TITLE = {Reducing Host Load, Network Load, and Latency in a
Distributed Shared Memory},
BOOKTITLE = DCS90,
YEAR = 1990,
PAGES = {468-475},
MONTH = may}

@CONFERENCE{owickiagarwal89,
AUTHOR = {Susan Owicki and Anant Agarwal},
TITLE = {Evaluating the Performance of Software Cache Coherence},
BOOKTITLE = ASPLOS3,
YEAR = 1989,
PAGES = {230-242},
MONTH = apr}

@CONFERENCE{ramachandranahamad89,
AUTHOR = {U. Ramachandran and M. Ahamad},
TITLE = {Programming with Distributed Shared Memory},
BOOKTITLE = {Proceedings of COMPSAC '89},
YEAR = 1989,
MONTH = sep}

@ARTICLE{ramachandrankhalidi89,
AUTHOR = {U. Ramachandran and M.Y.A. Khalidi},
YEAR = 1989,
JOURNAL = {Distributed and Multiprocessor Systems Workshop},
PAGES = {21-38},
TITLE = {An Implementation of Distributed Shared Memory}}

@ARTICLE{rashidtevanian88,
AUTHOR = {R. Rashid and A. Tevanian and M. Young and D. Golub and
R. Baron and D. Black and W.J. Bolosky and J. Chew},
TITLE = {Machine-Independent Virtual Memory Management for
Paged Uniprocessor and Multiprocessor Architectures},
JOURNAL = IEEE-TC,
YEAR = 1988,
VOLUME = {TC-37},
NUMBER = {8},
PAGES = {896-908},
MONTH = aug}

@CONFERENCE{reinhardthill93,
AUTHOR = {S.K. Reinhardt and M.D. Hill and J.R. Larus and A.R. Lebeck
and J.C. Lewis and D.A. Wood},
TITLE = {{The Wisconsin Wind Tunnel}:
Virtual Prototyping of Parallel Computers},
BOOKTITLE = {Proceedings of the 1993 ACM Sigmetrics Conference on
Measurement and Modeling of Computer Systems},
PAGES = {48-60},
MONTH = may,
YEAR = 1993}

@ARTICLE{scheurichdubois89,
AUTHOR = {C. Scheurich and M. Dubois},
YEAR = 1989,
MONTH = aug,
JOURNAL = {IEEE Trans. on Computers},
NUMBER = {8},
PAGES = {1154-1163},
TITLE = {Dynamic Page Migration in Multiprocessors with
Distributed Global Memory},
VOLUME = {38},
xNOTE = {An earlier version appeared in
{\em Proceedings of the 8th International Conference
on Distributed Computing Systems},
pages 162--169, 1988.}}

@CONFERENCE{shapiro86,
AUTHOR = {M. Shapiro},
TITLE = {Structure and Encapsulation in Distributed
Systems: The Proxy Principle},
BOOKTITLE = DCS86,
YEAR = {1986},
PAGES = {198-204}}

@CONFERENCE{stenstrombrorsson93,
AUTHOR = {P. Stenstr\"{o}m and M. Brorsson and L. Sandberg},
TITLE = {An Adaptive Cache Coherence Protocol Optimized
for Migratory Sharing},
BOOKTITLE = SIGARCH93,
YEAR = 1993,
PAGES = {109-118},
MONTH = may}

@CONFERENCE{stenstromjoe92,
AUTHOR = {P. Stenstr\"{o}m and T. Joe and A. Gupta},
TITLE = {Comparative Performance Evaluation of Cache-Coherent {NUMA}
and {COMA} Architectures},
BOOKTITLE = SIGARCH92,
YEAR = 1992,
PAGES = {80-91},
MONTH = may}

@ARTICLE{stummzhou90,
AUTHOR = {M. Stumm and S. Zhou},
TITLE = {Algorithms Implementing Distributed Shared Memory},
JOURNAL = IEEE-COMPUTER,
VOLUME = {23},
NUMBER = {5},
PAGES = {54-64},
MONTH = may,
YEAR = 1990}

@CONFERENCE{veenstrafowler92,
AUTHOR = {J.E. Veenstra and R.J. Fowler},
TITLE = {A Performance Evaluation of
Optimal Hybrid Cache Coherency Protocols},
BOOKTITLE = ASPLOS5,
YEAR = 1992,
PAGES = {149-160},
MONTH = sep}

@CONFERENCE{webergupta89,
AUTHOR = {W.-D. Weber and A. Gupta},
TITLE = {Analysis of Cache Invalidation Patterns in Multiprocessors},
BOOKTITLE = ASPLOS3,
YEAR = 1989,
PAGES = {243-256},
MONTH = apr}

@ARTICLE{wilsonlarowe92,
AUTHOR = {A. Wilson and R. LaRowe},
TITLE = {Hiding Shared Memory Reference Latency on the {GalacticaNet}
Distributed Shared Memory Architecture},
JOURNAL = {Journal of Parallel and Distributed Computing},
VOLUME = {15},
NUMBER = {4},
PAGES = {351-367},
YEAR = 1992,
MONTH = aug}

@CONFERENCE{wilsonlarowe93,
AUTHOR = {A. Wilson and R. LaRowe and M. Teller},
TITLE = {Hardware Assist for Distributed Shared Memory},
BOOKTITLE = DCS93,
PAGES = {???-???},
YEAR = 1993,
MONTH = {May}}

@CONFERENCE{wittiemaples89,
AUTHOR = {L.D. Wittie and C. Maples},
TITLE = {Merlin: Massively Parallel Processing},
BOOKTITLE = ICPP89,
YEAR = 1989,
PAGES = {142-150},
ADDRESS = {St. Charles, IL},
MONTH = aug}

@CONFERENCE{wittieharmannsson92,
AUTHOR = {L.D. Wittie and G. Hermannsson and A. Li},
TITLE = {Eager Sharing for Efficient Massive Parallelism},
BOOKTITLE = ICPP92,
YEAR = 1992,
PAGES = {251-255},
ADDRESS = {St. Charles, IL},
MONTH = aug}

@ARTICLE{wufuchs90a,
AUTHOR = {K.-L. Wu and W.K. Fuchs},
TITLE = {Recoverable Distributed Shared Memory},
JOURNAL = IEEE-TC,
VOLUME = {39},
NUMBER = {4},
PAGES = {460-469},
MONTH = apr,
YEAR = 1990}

@CONFERENCE{youngtevanian87,
AUTHOR = {M. Young and A. Tevanian and R. Rashid and D. Golub and
J. Eppinger and J. Chew and W. Bolosky and D. Black and
R. Baron},
TITLE = {The Duality of Memory and Communication in the
Implementation of a Multiprocessor Operating System},
BOOKTITLE = SOSP11,
PAGES = {63-76},
MONTH = oct,
YEAR = 1987}

@CONFERENCE{zhoustumm90,
AUTHOR = {S. Zhou and M. Stumm and T. McInerney},
TITLE = {Extending Distributed Shared Memory to Heterogeneous Environments},
BOOKTITLE = DCS90,
PAGES = {30-37},
MONTH = may,
YEAR = 1990}

