Sorry about the multi-posting, but I wanted to get a general opinion of
those online. The groups I chose (while on the surface they may not seem
to have a lot in common with distributed computing) all intersect with
the idea of collaboratively solving problems in an environment where one
may not have a complete set of information - i.e., where there is no
'global' entity that sees the global state of the system, and each
participating entity's view of the world is relative to the other
entities in direct proximity to it. Anyways, on to the original post:
I was wondering if anyone in this group is looking into the theoretical side
of distributed computing. I keep reading definitions of DC to the effect of
"DC is a method of computing where one a) breaks a large problem into
smaller parts, b) distributes the partial workloads to a set of processes
that compute the partial solutions, and c) recombines the partial
solutions into a total solution" - i.e.: SETI, etc...
This is an *example* of distributed computing - a particular, and very
straight-forward use of many computers, but is by no means a definition of
DC.
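That a)/b)/c) recipe is essentially scatter/gather and fits in a few lines,
which is part of why it makes a poor definition. A toy sketch in Python,
with a thread pool standing in for the distributed workers and summing
squares standing in for the 'large problem' (all names here are
illustrative, not from any real framework):

```python
# Scatter/gather in miniature: a) split the problem, b) hand the parts
# to a pool of workers, c) recombine the partial solutions.

from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # b) each worker computes a partial solution
    return sum(x * x for x in chunk)

def distributed_sum_of_squares(n, parts=4):
    data = list(range(n))
    size = (len(data) + parts - 1) // parts
    # a) break the large problem into smaller parts
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # c) recombine the partial solutions into a total solution
    return sum(partials)

print(distributed_sum_of_squares(1000) == sum(x * x for x in range(1000)))  # True
```

Nothing here touches the hard parts of DC - asynchrony, failures,
incomplete views - which is exactly the point being made above.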
Does anyone here look into things such as decision tasks (the renaming
problem, k-set agreement, consensus) or any theoretical papers in the field
(FLP, the asynchronous computability theorem, the relationship between
algebraic topology and distributed computing, dihomotopy, ditopological
homology groups, various calculi for concurrent systems, fine-grained
concurrency, fault-tolerance, provable impossibility of certain tasks,
computability in asynchronous, semi-synchronous and synchronous networks,
worst-case message complexity of distributed tasks, randomized distributed
algorithms, self-stabilization, shared-memory vs. message-passing models,
Byzantine failures, crash failures, omission failures, etc,
etc, etc...)?
Check out the following link to get a small peek into what I mean:
http://scholar.google.com/scholar?q=algebraic.topology+distributed.computing&ie=UTF-8&oe=UTF-8&hl=en&btnG=Search
I'd be interested to know if these things are still entirely a part of
academia or whether there are people out there who discuss these things in
their spare time.
Thanks,
Mike N. Christoff
The link was quite refreshing. Thanks.
A desirable result in mathematics is the existence and uniqueness of a
solution whenever we face the problem of discovering an algorithm for
finding solutions (differential equations, linear systems, etc.). In my
lower undergrad courses, I usually gave students the following example.
Suppose you have two equations in two unknowns, and the only way you
know to find a solution is by trial and error. If you at least knew a
way of determining whether a solution exists, and that the solution is
unique, you could then continue to try, and once you found a pair of
numbers that satisfied both equations you would be content.
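The example can be made concrete in a few lines (the function names are
mine, for illustration only): for a 2x2 linear system, the determinant
tells you up front whether a unique solution exists, before any guessing
starts.

```python
# For the system a*x + b*y = e, c*x + d*y = f: check existence and
# uniqueness first, then verify a guessed pair against both equations.

def has_unique_solution(a, b, c, d):
    """A 2x2 linear system has exactly one solution iff det != 0."""
    return a * d - b * c != 0

def check_guess(a, b, c, d, e, f, x, y, tol=1e-9):
    """One trial step: does the guessed pair (x, y) satisfy both equations?"""
    return abs(a * x + b * y - e) < tol and abs(c * x + d * y - f) < tol

# 2x + y = 5, x - y = 1: det = 2*(-1) - 1*1 = -3, so a unique solution
# exists and the guessing process is worth carrying out at all.
print(has_unique_solution(2, 1, 1, -1))      # True
print(check_guess(2, 1, 1, -1, 5, 1, 2, 1))  # True: x=2, y=1 works
```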
Some of the articles at the link you have posted (I will have to spend
more time on them) prove interesting results along the lines of the
example I mentioned. Once we are aware of such results, we can try to
find a solution, though in this case the uniqueness is not the issue.
Here, we need to know what can or cannot be accomplished, and what are
the limits etc. of various formulations (shared memory, message
passing, etc).
At some point, we need to use such results to provide an
infrastructure. That is, there is more than mathematics here. We have a
theoretical solution (as in mathematics) and then we need a
computational model for the theoretical solution. That is where CORBA
etc come in (as AndyW nicely puts it). With regard to this, my view of
(concurrent) distributed applications (on heterogeneous systems) is
very different from DSOM (and DCOM). It has taken me decades to
translate the theory into a computational model.
The link was quite refreshing. Not sure whether you were asking
something?
Regards,
zor...@ZHMicro.com
http://www.zhmicro.com
http://distributed-software.blogspot.com
http://groups-beta.google.com/group/computerlangZ?lnk=la&hl=en
[snip]
> >
> You left out discussion of many practical engineering issues of
> implementation including
> provisions for security, discovery, and other issues described in the
> Reference Model for
> Open Distributed Processing, an ISO/ITU standard. Without an appropriate
> implementation, many
> of the theoretical subjects described cannot be implemented.
> Ed
>
Hi Ed.
The types of problems I'm thinking of seem to be a bit more 'low-level'.
For example, problems such as 'leader election' in anonymous networks and
'consensus' (which are heavily studied in theoretical DC) are things that
may be employed by the higher-level distributed services that you describe
above, resulting in a more robust/fault-tolerant system overall. So although
what you say is true about the theory sometimes needing an appropriate
implementation first, it's often the case that the implementation benefits
from the results of the theory. For example, there are many results in
theoretical DC where proofs showing lower bounds on the message complexity
of certain tasks also produce real-world distributed algorithms that solve
the task as fast as is provably possible.
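To give a taste of what such a 'low-level' building block looks like, here
is a toy round-based simulation of the classic Chang-Roberts leader-election
protocol on a unidirectional ring. It is a simplification (the real setting
is asynchronous), and it assumes distinct process ids - which matters,
since leader election is provably impossible in a fully anonymous ring:

```python
# Chang-Roberts on a unidirectional ring: every process sends its id
# clockwise; a process relays ids larger than its own, swallows smaller
# ones, and is elected when its own id survives a full lap.

def chang_roberts(ids):
    """ids: the distinct ids of the processes, in clockwise ring order."""
    n = len(ids)
    outgoing = list(ids)   # round 0: everyone sends its own id clockwise
    while True:
        # message on link i-1 -> i is delivered to process i
        incoming = [outgoing[(i - 1) % n] for i in range(n)]
        outgoing = [None] * n
        for i, m in enumerate(incoming):
            if m is None:
                continue              # nothing arrived on this link
            if m == ids[i]:
                return ids[i]         # my id survived a full lap: elected
            if m > ids[i]:
                outgoing[i] = m       # relay larger ids, swallow smaller

print(chang_roberts([3, 1, 4, 2]))   # 4: the largest id wins
```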
Thanks,
Mike
Hi Zorro.
I'm glad you enjoyed the links. With regard to theory vs practice - this
will always be an issue. However, as I mentioned to Ed, the tasks that are
currently being explored, at least in the literature I have read, have more
to do with the provable robustness of more basic tasks upon which
larger-grained infrastructure can be built.
But perhaps, it would be better for those who are interested to read for
themselves the aims of theoretical DC. I've provided a much more distilled
list of papers than the link to scholar.google.com I posted last time.
Here is, IMO, a good sampling of some of the most important papers in the
field (and by 'the field of DC', I mean the small part of it that I have
been looking into). I believe the papers are in chronological order and
should help to illustrate how the field evolved from models based on graph
theory, to models that employ tools from algebraic topology.
[1] Impossibility of distributed consensus with one faulty process
---
(The famous 'FLP' paper)
http://theory.lcs.mit.edu/tds/papers/Lynch/jacm85.pdf
[2] Extended Impossibility Results for Asynchronous Complete Networks
Unfortunately - could not locate a free copy online - though the ACM Portal
has it
[3] A Combinatorial Characterization of the Distributed Tasks which are
Solvable in the Presence of One Faulty Processor
---
http://citeseer.ist.psu.edu/context/80502/0
[4] The Topological Structure of Asynchronous Computability
---
(This was the 2004 'Principles of Distributed Computing' (PODC) Gödel
Prize winning paper)
[5] Distributed Computing - A Glimmer of a Theory
---
http://citeseer.ist.psu.edu/334587.html
[6] Hundreds of Impossibility Results for Distributed Computing
---
http://citeseer.ist.psu.edu/fich03hundreds.html
I would *especially* like to direct your attention to [6], as it is a recent
and well-written overall survey of the field. [5] is also a very good
survey with more of a focus on fine-grained concurrency (e.g.: using point-set
and algebraic topology in deadlock detection, and other problems). Finally,
[4], if you can locate it online (I know it's there, but I have had trouble
finding it), is a *very* important paper, but is technically very 'heavy'.
> The link was quite refreshing. Not sure whether you were asking
> something?
>
Just trying to get a feel for the amount of interest there is in these
topics. It seems like a big part of getting people more excited in the
theory will be to illustrate how it can be practically exploited.
Thanks for your time,
Mike
> I was wondering if anyone in this group is looking into the theoretical side
> of distributed computing.
The theoretical side would get a boost by better availability of parallel
hardware.
Humanity seems to be stuck in the vicious circle where:
* No parallel hardware exists - because no parallel software exists to
take advantage of it, and so it doesn't pay hardware engineers to
build it.
* No parallel software exists - because there's no parallel hardware
to run it on - so extra effort in developing it is pointless.
IMO, it's got to the point where this is a pretty embarrassing state
of affairs - indicating a lack of planning and a communications failure
between the software guys and the hardware guys.
The attempts of hardware engineers to get better parallelism by
developing "VLIW" architectures are the most obvious embarrassment.
I reckon the best place to break the circle is by developing new
computer programming languages, that make it easier to write
software that can get better performance on parallel machines.
Threads and "fork" don't cut it. Apart from HDLs and Erlang, there's
been little effort towards making concurrent programming more popular.
ISTM that what's needed is something more like a system where every
inter-object message with no return value can be sent asynchronously at
the flick of a modifier - where those with return values can be dealt
with using return-value handlers - and where half-decent simulations of
parallel environments exist to test under.
If we had something like that, people might start using it,
and the vicious circle above might start to break down.
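One way the 'flick of a modifier' idea might look, sketched with Python's
standard futures as the simulation environment. The AsyncObject wrapper
and its send/ask methods are made up for this illustration, not taken
from any real framework; a single worker thread per object keeps messages
processed in the order they were sent.

```python
# Messages with no return value are fired asynchronously; messages with
# return values hand back a future whose result a handler can consume.

from concurrent.futures import ThreadPoolExecutor

class AsyncObject:
    """Wraps an object so that method calls become messages."""
    def __init__(self, target):
        self._target = target
        self._pool = ThreadPoolExecutor(max_workers=1)  # one mailbox thread

    def send(self, method, *args):
        """Fire-and-forget: a message with no return value."""
        self._pool.submit(getattr(self._target, method), *args)

    def ask(self, method, *args, handler=None):
        """A message with a return value, consumed by a handler."""
        future = self._pool.submit(getattr(self._target, method), *args)
        if handler is not None:
            future.add_done_callback(lambda f: handler(f.result()))
        return future

class Counter:
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1
    def value(self):
        return self.n

c = AsyncObject(Counter())
c.send("bump")            # asynchronous, no reply expected
c.send("bump")
f = c.ask("value")        # reply arrives via a future
print(f.result())         # 2
```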
--
__________
|im |yler http://timtyler.org/ t...@tt1lock.org Remove lock to reply.
Hi.
One of the problems is that algebraic topology is hard stuff. Really hard.
You end up doing a lot of stuff and sit there wondering where it's all
leading. But eventually it all starts to make sense. It's just that it
takes a lot of time and a lot of effort, and in the meantime you're not
writing any 'cool frameworks'. I understand how that can suck and is
probably a turn-off to the 99% of people who just want to have fun, write
their own systems and try to perfect them through trial and error. My
personal belief is that distributed computing is just another example of a
'complex system'. I imagine it would be like telling someone from the past
who fashioned sailing ships that manipulating a bunch of symbols on a piece
of paper could possibly help them design a better vessel. Or that the very
laws of nature could be well predicted by this same seemingly entirely
unrelated endeavour (math). But who knows, right? Maybe all breakthroughs
in DC will be the result of nothing more than trial, error and elbow-grease.
Anyhow. I initially wrote my email due to a desire to meet people to
discuss such things with. I really don't have the energy to try and
convince people that this field of research isn't one of "the biggest
gaps in theoretical computer science". If it piques your interest, cool. If
not, then c'est la vie. You're always going to have the adamant anti-theory
guys and adamant pro-theory guys. I hope I can remain comfortably in the
centre.
But in leaving, I'll throw in an entirely rudimentary example of the idea of
'similarity', or 'indistinguishability'. Imagine the simple task of
deciding whether the majority of an odd number of processes (at least 3)
was given an input of 0 or 1. So each process starts with a private input
(a 0 or 1), then after a finite number of steps, all non-faulty processes
must output 1 if the majority were given the input 1, and 0 if the majority
were given 0. They communicate by passing messages over a truly
asynchronous network (e.g.: UDP packets over the internet). By
asynchronous I mean that messages can be delayed for arbitrary amounts
of time.
So let's say we have 3 processes: a,b,c with inputs 0,0,1 respectively.
a,b,c should all output 0 at the end of the protocol. Easy. Also easy to
figure out is that if their inputs were 0,1,1 (respectively) then they
should all output 1. Now what if processor b 'fell asleep'? I.e., its
messages were extremely delayed, or the computer it was being hosted on
started disk defragmenting and it slowed down to a crawl, etc... Then a and
c would eventually have to make a decision (since they must decide in a
finite amount of time). But what decision do they make? What if they
decide 1 and b 'wakes up' and announces its input was a 0 - then they made
the wrong decision. But if they decide 0 and b wakes up announcing its
input was a 1, then they also have made a bad decision. The problem is that
if b is arbitrarily delayed, a and c alone cannot distinguish between the
initial conditions 0,0,1 and 0,1,1. Hence there will always be valid runs of
the protocol where the distributed algorithm makes the wrong decision.
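The argument can be made concrete in a few lines. This is only a toy
illustration of the indistinguishability idea, not the proof itself:
'view' here just means everything the awake processes could possibly
learn in a run where b's messages never arrive.

```python
# If b is silent, the views of a and c are built only from the inputs
# of a and c - so the runs starting from (0,0,1) and (0,1,1) look
# identical to them, even though the correct majority outputs differ.

def view(inputs, silent):
    """What each awake process can observe when `silent` sends nothing:
    its own input plus the inputs relayed by the other awake processes."""
    awake = [p for p in inputs if p != silent]
    return {p: {q: inputs[q] for q in awake} for p in awake}

run1 = view({"a": 0, "b": 0, "c": 1}, silent="b")   # correct output: 0
run2 = view({"a": 0, "b": 1, "c": 1}, silent="b")   # correct output: 1

print(run1 == run2)   # True: a and c cannot tell the runs apart, so any
                      # finite-time decision is wrong in one of them
```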
One of the things you can do with algebraic topology is determine in a
system of N processes, whether two states are indistinguishable to N-1
processes. Well, you can do that with graph theory. But graph theory will
only let you see similarity results in the face of up to one faulty
processor (f = 1). With algebraic topology ("simplicial complexes" in
particular - an n-dimensional extension of graphs) one can determine
whether, for example, 3 global states are indistinguishable to 4 processors,
or 2 global states are indistinguishable to 6 processors, etc... This
allows you to prove many impossibility results for almost arbitrary
numbers of failures.
There's way too much to write about in one go. I'm just scratching the
tip of the iceberg.
But anyways, as I said. If it piques your interest, great. If not, there's
not much I can do about it. Good luck.
-mike
> -----Original Message-----
> anon
>One of the things you can do with algebraic topology is determine in a
>system of N processes, whether two states are indistinguishable to N-1
>processes. Well, you can do that with graph theory. But graph theory will
>only let you see similarity results in the face of up to one faulty
>processor (f = 1). With algebraic topology ("simplicial complexes" in
>particular - an n-dimensional extension of graphs) one can determine
>whether, for example, 3 global states are indistinguishable to 4
>processors, or 2 global states are indistinguishable to 6 processors,
>etc... This allows you to prove many impossibility results for almost
>arbitrary numbers of failures.
What concepts from algebraic topology do you use? I could think of homology
groups, since the zeroth homology group counts the number of connected
components of a space. But how do the other homology groups come in (if that
is the method you use)?
--
Markus Redeker Hamburg, Germany
They are used to determine, for example, if the input complex of a protocol
has any 'holes' below a certain dimension.
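To make the 'holes' remark concrete, here is a small sketch computing Betti
numbers of a simplicial complex from boundary matrices over GF(2): beta_0
counts connected components (the zeroth homology group you mention) and
beta_1 counts 1-dimensional holes. A hollow triangle has one of each, and
filling in the face kills the hole. This is generic homology machinery,
not the input complexes of any particular protocol.

```python
# Betti numbers over GF(2): beta_k = dim ker(d_k) - rank(d_{k+1}).
# Rows of a boundary matrix are bit vectors (ints) over the faces.

from itertools import combinations

def rank_gf2(rows):
    """Gaussian elimination over GF(2); rows are ints used as bit vectors."""
    rank = 0
    rows = [r for r in rows if r]
    while rows:
        pivot = rows.pop()
        rank += 1
        low = pivot & -pivot                 # pick one bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
        rows = [r for r in rows if r]
    return rank

def boundary_matrix(k_simplices, km1_simplices):
    """One row per k-simplex: the GF(2) sum of its (k-1)-faces."""
    index = {s: i for i, s in enumerate(km1_simplices)}
    rows = []
    for s in k_simplices:
        r = 0
        for face in combinations(s, len(s) - 1):
            r |= 1 << index[face]
        rows.append(r)
    return rows

# Hollow triangle: three vertices, three edges, no filled face.
vertices = [("a",), ("b",), ("c",)]
edges = [("a", "b"), ("a", "c"), ("b", "c")]

r1 = rank_gf2(boundary_matrix(edges, vertices))
beta0 = len(vertices) - r1               # connected components
beta1 = (len(edges) - r1) - 0            # cycles minus boundaries (no faces)

# Filling in the triangle face kills the hole:
d2 = boundary_matrix([("a", "b", "c")], edges)
beta1_filled = (len(edges) - r1) - rank_gf2(d2)

print(beta0, beta1, beta1_filled)        # 1 1 0
```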
-mike