[erlang-questions] Please criticise these principles

Richard A. O'Keefe

Aug 27, 2008, 12:47:21 AM

to erlang-questions Questions

I've just been looking at ISO/IEC DTR 13211-5:2007
"Prolog Multi-threading predicates".
This is a proposed addition to the ISO Prolog standard,
whose declared aim is "to promote the portability of
multi-threaded Prolog applications".

As I read through it, I felt sicker and sicker and
sicker. If any of you are parents, you may know the
feeling when a child is being naughty and seems to be
going out of his/her way to do things that are OBVIOUSLY
to his/her detriment.

I've boiled my reactions down to "here is a short list
of design principles, every single one of which is
violated by the proposal." Before sending them off to
the ISO Prolog crowd, I thought I'd ask the opinion of
Erlangers, especially Joe Armstrong, should he happen
to read this.

I concede that debugging tools may need to do all sorts
of things that are otherwise risky. There are quite a
few predicates in the 'erlang' module that are labelled
as "debugging only". What I'm talking about is *core*
facilities to be used in *normal* code that is meant to
be portable and reliable.

Some principles for simpler safer threading.
============================================

1. No omniscient users.

Users shall not be required to provide information,
such as space allocation or tuning parameters, the
values of which they cannot determine.

Applicability: consider a list cell. In NU Prolog,
a list cell holding a single character could be as
little as 1 byte (on a 32-bit machine). On some
systems, a list cell could be 4 words, which on a
64-bit system would be 32 bytes. While it is
imaginable that a user might know how many list cells
a thread would need, it is not possible for them to
say how many BYTES will be needed and if they did
give a number it would not be portable. This means
that while a user *could* give an initial heap size
for a thread, they could *not* give a *fixed* size
that would suit all systems.

2. No distinction between indistinguishables.

A specification shall not mandate distinct responses
to situations that user programs cannot distinguish.

Application: sending a message to a process.
        Case 1          Case 2
T+1     Pid is alive    Pid is alive
T+2     Pid dies        Pid ! Message
T+3     Pid ! Message   Pid dies
The ISO draft requires case 1 to produce a runtime
exception in the send call. In case 2, Pid dies
and there is no send call to be blamed, so there is
no such exception.

3. No breaches of encapsulation.

If process A wants process B to do something, it should
ASK. It should not FORCE process B to perform some action.

Application 1: the ISO draft includes an operation to
kill any process. There are mutexes. There are global
variables of a kind. If you kill a process that is
holding some mutexes, all those mutexes are released.
This means that all the data protected by those mutexes
is now in an unknown state and you dare not use it for
the rest of the program's existence.

Application 2: the ISO draft includes an operation
thread_signal(Thread, Goal) which causes Thread to be
interrupted at the next opportunity and forced to call
Goal. The goal can do anything, including unlocking a
mutex that the Thread is holding (and after the
interrupt, mistakenly believes it is still holding).
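
A minimal Python sketch of the "ask, don't force" idea, not taken from the draft or from Erlang. The names `transfer_worker`, `inbox`, and `done` are hypothetical. The worker honours a stop request only between critical sections, so the data protected by the lock is never left half-updated.

```python
import queue
import threading

def transfer_worker(inbox, accounts, lock, done):
    """Move one unit from 'a' to 'b' repeatedly until asked to stop."""
    while True:
        try:
            if inbox.get_nowait() == "stop":
                break                    # honoured only between critical sections
        except queue.Empty:
            pass
        with lock:
            accounts["a"] -= 1
            accounts["b"] += 1           # invariant a + b == 100 holds again before unlock
    done.put("stopped")                  # tell the requester we finished cleanly

def demo():
    accounts = {"a": 50, "b": 50}
    lock = threading.Lock()
    inbox, done = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=transfer_worker,
                              args=(inbox, accounts, lock, done))
    worker.start()
    inbox.put("stop")                    # ask, do not force
    notice = done.get(timeout=5)
    worker.join()
    with lock:
        return notice, accounts["a"] + accounts["b"]
```

Calling `demo()` returns the worker's own notice plus the sum of the two balances. Because the worker chooses when to stop, the sum is unchanged however many transfers ran first.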

4. No unprotected shared mutable variables.

While some thread has the power to write a variable,
it is VERIFIED that no other thread has the power to
read or write that variable.

Application 1: Prolog has an analogue of Erlang process
dictionaries, but it is global. [More precisely, it is
partitioned into named pieces each of which is local to
a *module*, but the pieces are global to *threads.*]
While SWI Prolog offers thread-local mutable data, the
ISO draft includes no such thing. It's as if Erlang
offered only global ETS tables accessed without locks.
While mutexes (but oddly, not reader/writer locks) are
present in the ISO draft, there is no *intrinsic*
connection between any mutable table and any mutex.
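
To illustrate what an *intrinsic* connection between a table and its lock could look like, here is a small Python sketch. `GuardedTable` is a hypothetical class, not anything in the draft or in SWI Prolog. The data can be reached only through methods that take the lock.

```python
import threading

class GuardedTable:
    """A table that can only be reached through its own lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def update(self, key, fn, default=0):
        """Apply fn to the current value under the lock and store the result."""
        with self._lock:
            self._data[key] = fn(self._data.get(key, default))
            return self._data[key]

    def snapshot(self):
        """Return a copy, so callers never hold a reference to the live table."""
        with self._lock:
            return dict(self._data)

def demo(n_threads=8, n_increments=1000):
    table = GuardedTable()

    def bump():
        for _ in range(n_increments):
            table.update("hits", lambda v: v + 1)

    threads = [threading.Thread(target=bump) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.snapshot()["hits"]
```

With the read-modify-write done inside `update`, no increment is lost. With a separate table and a separate mutex, nothing stops a caller from forgetting the lock.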

Application 2: the draft introduces three kinds of IPC
data: thread IDs, mutex IDs, and message queue IDs.
There are three name-spaces for 'aliases', rather like
the Erlang registry for process ids. These things are
in effect mutable variables. There are operations to
create and destroy threads, mutexes, and message queues.
Although there are no operations for rebinding aliases,
this can happen:
    create a thingy and give it the alias 'fred'
    create a thread that refers to 'fred'
    destroy the thingy
    create another thingy and give it the alias 'fred'
So the other thread *thinks* it knows what 'fred' refers
to, but it is wrong. As an example, there is a
'thread_join(Thread, Result)' operation which waits for
the Thread to complete and then picks up its Result; if
Thread is an alias, this could wait for the wrong thread.
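
The alias hazard can be reproduced in a few lines of Python. This is only a sketch: `Registry` and its methods are hypothetical stand-ins for an alias name-space, not the draft's interface.

```python
import itertools

class Registry:
    """Hypothetical alias table: a name maps to whatever currently holds it."""
    _ids = itertools.count(1)            # every created object gets a fresh identity

    def __init__(self):
        self._by_alias = {}

    def create(self, alias):
        ident = next(self._ids)
        self._by_alias[alias] = ident
        return ident

    def destroy(self, alias):
        del self._by_alias[alias]

    def resolve(self, alias):
        return self._by_alias[alias]

def demo():
    registry = Registry()
    first = registry.create("fred")      # create a thingy with alias 'fred'
    remembered_alias = "fred"            # another thread keeps only the alias
    registry.destroy("fred")             # destroy the thingy
    second = registry.create("fred")     # a different thingy takes the alias
    resolved = registry.resolve(remembered_alias)
    return first, second, resolved
```

Here `resolved` equals `second`, not `first`: the holder of the alias silently ends up talking to a different object. Holding the identity itself, rather than the name, avoids this.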

5. No intrinsically unreliable information flows.

There should be no query operations that give you
information that you would have to be crazy to use.
In particular, if you want some information about a
thread, you should ASK it [so this may be a version
of principle 3] and then you know that the information
should be interpreted with reference to that specific
synchronisation point.

Application: the ISO draft provides some operations
of which it says "almost any usage of these ... is
unsafe". These relate to finding the 'instantaneous'
state of IPC objects. Because these are 'direct'
queries that do not involve any explicit synchronisation,
the point in the lifetime of the other thread that they
refer to is entirely unknown. You cannot expect these
values to apply "now" (whatever that means) and you
cannot tell at _what_ point in the past of the other
thread they do relate to.

6. No zombies.

When a process dies, a death notice should be sent to
its family and friends, if any, but the process itself
should disappear completely.

Application: because the thread_join/2 operation
merely _exists_ in the interface, at least the full
exit or exception status of a process must be kept
around as long as there is a live copy of its Pid
anywhere in a process or the global data base, in
case someone should wait for it. The term given to
thread_exit/1 could be arbitrarily large. There is
no way to promise that you won't use thread_join.
In effect this is a mandatory space leak.
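
A rough Python sketch of the alternative: the dying thread pushes one death notice to whoever registered interest, and nothing else is retained. The `spawn` helper and the notice tuples are assumptions made for illustration; they only loosely imitate Erlang's exit signals.

```python
import queue
import threading

def spawn(fn, watchers):
    """Run fn in a thread; on exit, post one notice to each watcher's queue."""
    def body():
        try:
            notice = ("exit", "normal", fn())
        except Exception as exc:
            notice = ("exit", "error", repr(exc))
        for mailbox in watchers:
            mailbox.put(notice)          # the notice is pushed, never pulled
    threading.Thread(target=body, daemon=True).start()
    # Deliberately return nothing: there is no handle to join on later,
    # so no exit status has to be kept around "just in case".

def demo():
    watcher = queue.Queue()
    spawn(lambda: 6 * 7, [watcher])
    spawn(lambda: 1 // 0, [watcher])
    # Sorting puts the "error" notice before the "normal" one.
    return sorted(watcher.get(timeout=5) for _ in range(2))
```

If no watcher is registered, the result is simply dropped when the thread ends, which is the behaviour principle 6 asks for.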

While Erlang doesn't perfectly conform to these principles
(the process registry being a particularly painful example),
you can program *as if* it did. And if you think I would
prefer multi-threading in Prolog to look as much as possible
like Erlang, why yes, I would. I would like that very much.

What got me looking at this was someone asking me to review
a paper about how to implement thread_cancel/1, the operation
that kills any thread. The paper claims that

"The ability to cancel a thread is useful for
application development and is critical to
Prolog embeddability."

and I found myself saying "but the ability to cancel a
thread is like the ability to apply a chainsaw to your
own neck! It's an incredibly easy way to violate
system integrity."

What's really frightening is that if I hadn't been exposed to
Erlang, my previous exposure to Ada and Occam and Concurrent
Pascal, nice though they are, might not have been enough to
stop me reading the DTR and going "yeah, this looks like a
fairly straightforward layer over pthreads, nice job" instead
of "yuck". THANK YOU JOE!

--
If stupidity were a crime, who'd 'scape hanging?

Joe Armstrong

Aug 27, 2008, 4:54:35 AM

to Richard A. O'Keefe, erlang-questions

Hi Richard, I've added my comments inline.

/Joe

On Wed, Aug 27, 2008 at 6:47 AM, Richard A. O'Keefe <o...@cs.otago.ac.nz> wrote:
> ...

>
> I've boiled my reactions down to "here is a short list
> of design principles, every single one of which is
> violated by the proposal." Before sending them off to
> the ISO Prolog crowd, I thought I'd ask the opinion of
> Erlangers, especially Joe Armstrong, should he happen
> to read this.

I'm reading with bated breath. (BTW, I still love Prolog. Robert Virding has
implemented a Prolog in Erlang which runs at 100 KLips on the nrev
benchmark. I've found the old Erlang-in-Prolog emulator, so soon we should
be able to implement erlang-in-prolog-in-erlang :-) )

>
> I concede that debugging tools may need to do all sorts
> of things that are otherwise risky. There are quite a
> few predicates in the 'erlang' module that are labelled
> as "debugging only". What I'm talking about is *core*
> facilities to be used in *normal* code that is meant to
> be portable and reliable.
>

Absolutely - things like erlang:display/1 are *very* useful for debugging
the I/O code but should never be used in production programs.

>
>
> Some principles for simpler safer threading.
> ============================================
>
>
> 1. No omniscient users.
>
> Users shall not be required to provide information,
> such as space allocation or tuning parameters, the
> values of which they cannot determine.
>
> Applicability: consider a list cell. In NU Prolog,
> a list cell holding a single character could be as
> little as 1 byte (on a 32-bit machine). On some
> systems, a list cell could be 4 words, which on a
> 64-bit system would be 32 bytes. While it is
> imaginable that a user might know how many list cells
> a thread would need, it is not possible for them to
> say how many BYTES will be needed and if they did
> give a number it would not be portable. This means
> that while a user *could* give an initial heap size
> for a thread, they could *not* give a *fixed* size
> that would suit all systems.

Agree

> 2. No distinction between indistinguishables.
>
> A specification shall not mandate distinct responses
> to situations that user programs cannot distinguish.
>
> Application: sending a message to a process.
>         Case 1          Case 2
> T+1     Pid is alive    Pid is alive
> T+2     Pid dies        Pid ! Message
> T+3     Pid ! Message   Pid dies
> The ISO draft requires case 1 to produce a runtime
> exception in the send call. In case 2, Pid dies
> and there is no send call to be blamed, so there is
> no such exception.

Impossible. Suppose Pid is on a remote machine. You cannot distinguish
communication failure from machine failure. So you cannot implement this.

The Erlang Tao says if you want to know if a message was received then
send a reply and wait for it.

Even if you send a message and it is received there is no guarantee of
"liveness": the receiver might receive the message and go into an
infinite loop.
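
The "send a reply and wait for it" rule can be sketched in Python with queues standing in for mailboxes. This is an illustration under stated assumptions, not Erlang's actual mechanism: `echo_server`, `call`, and the `("ack", ...)` and `("timeout", ...)` tuples are all hypothetical.

```python
import queue
import threading

def echo_server(inbox):
    """Hypothetical server: acknowledges each request, stops on None."""
    while True:
        request = inbox.get()
        if request is None:
            return
        reply_to, payload = request
        reply_to.put(("ack", payload))

def call(inbox, payload, timeout=0.5):
    """Send a request and wait for the reply; no reply shows up as a timeout."""
    reply_to = queue.Queue()
    inbox.put((reply_to, payload))       # the send itself never fails
    try:
        return reply_to.get(timeout=timeout)
    except queue.Empty:
        return ("timeout", payload)

def demo():
    inbox = queue.Queue()
    server = threading.Thread(target=echo_server, args=(inbox,))
    server.start()
    answered = call(inbox, "ping")       # server alive: we get an ack
    inbox.put(None)
    server.join()
    # The server is gone, yet the send still succeeds; only the reply is missing.
    unanswered = call(inbox, "ping", timeout=0.1)
    return answered, unanswered
```

The sender learns about the receiver only through replies or their absence. A dead receiver, a lost message, and a receiver stuck in a loop all look the same: a timeout.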

This is why we invented the link mechanism.

> 3. No breaches of encapsulation.
>
> If process A wants process B to do something, it should
> ASK. It should not FORCE process B to perform some action.

Yes

> Application 1: the ISO draft includes an operation to
> kill any process. There are mutexes. There are global
> variables of a kind. If you kill a process that is
> holding some mutexes, all those mutexes are released.
> This means that all the data protected by those mutexes
> is now in an unknown state and you dare not use it for
> the rest of the program's existence.

Madness

>
> Application 2: the ISO draft includes an operation
> thread_signal(Thread, Goal) which causes Thread to be
> interrupted at the next opportunity and forced to call
> Goal. The goal can do anything, including unlocking a
> mutex that the Thread is holding (and after the
> interrupt, mistakenly believes it is still holding).

Daft

> 4. No unprotected shared mutable variables.
>
> While some thread has the power to write a variable,
> it is VERIFIED that no other thread has the power to
> read or write that variable.

I have always thought that there should not be shared variables AT ALL.
You can't actually share a variable, it's a violation of causality - think
rays of light, electrons running along wires, think time, failures.

The problem is not the mutexes, it's the shared state. Since you shouldn't
have shared state, then you shouldn't need mutexes to protect the shared state.

Erlang programmers have happily been writing distributed and concurrent
programs for twenty years without the use of mutexes - they are just NOT
needed.

Now deep under the covers, where no user should be lurking, there are some
ets tables, *which were added to implement large data bases* (solving the
"we can't copy the entire universe" problem). The correct way to use ets
tables is not to use them directly, but to use them via mnesia transactions
(a form of transactional memory). If you really know what you are doing you
can disregard this advice.

>
> Application 1: Prolog has an analogue of Erlang process
> dictionaries, but it is global. [More precisely, it is
> partitioned into named pieces each of which is local to
> a *module*, but the pieces are global to *threads.*]
> While SWI Prolog offers thread-local mutable data, the
> ISO draft includes no such thing. It's as if Erlang
> offered only global ETS tables accessed without locks.
> While mutexes (but oddly, not reader/writer locks) are
> present in the ISO draft, there is no *intrinsic*
> connection between any mutable table and any mutex.

This will lead to many horrific errors.

> Application 2: the draft introduces three kinds of IPC
> data: thread IDs, mutex IDs, and message queue IDs.
> There are three name-spaces for 'aliases', rather like
> the Erlang registry for process ids. These things are
> in effect mutable variables. There are operations to
> create and destroy threads, mutexes, and message queues.
> Although there are no operations for rebinding aliases,
> this can happen:
>     create a thingy and give it the alias 'fred'
>     create a thread that refers to 'fred'
>     destroy the thingy
>     create another thingy and give it the alias 'fred'
> So the other thread *thinks* it knows what 'fred' refers
> to, but it is wrong. As an example, there is a
> 'thread_join(Thread, Result)' operation which waits for
> the Thread to complete and then picks up its Result; if
> Thread is an alias, this could wait for the wrong thread.

Not good

>
> 5. No intrinsically unreliable information flows.
>
> There should be no query operations that give you
> information that you would have to be crazy to use.
> In particular, if you want some information about a
> thread, you should ASK it [so this may be a version
> of principle 3] and then you know that the information
> should be interpreted with reference to that specific
> synchronisation point.

yes

> Application: the ISO draft provides some operations
> of which it says "almost any usage of these ... is
> unsafe". These relate to finding the 'instantaneous'
> state of IPC objects. Because these are 'direct'
> queries that do not involve any explicit synchronisation,
> the point in the lifetime of the other thread that they
> refer to is entirely unknown. You cannot expect these
> values to apply "now" (whatever that means) and you
> cannot tell at _what_ point in the past of the other
> thread they do relate to.

Ugh - impossible - there is no "instantaneous" state of a remote object.
Light takes finite time to propagate through the ether. Think special
relativity - I guess the ISO standards committee members are not
ex-physicists :-)

>
> 6. No zombies.
>
> When a process dies, a death notice should be sent to
> its family and friends, if any, but the process itself
> should disappear completely.

yes

Concurrency isn't a "nice layer over pthreads" - the most important thing
is isolation - anything that mucks up isolation is a mistake.

If my computer (the one I'm typing on NOW) crashes I hope that
this will not crash your computer. So it should be with threads.

/Joe

> --
> If stupidity were a crime, who'd 'scape hanging?

> _______________________________________________
> erlang-questions mailing list
> erlang-q...@erlang.org
> http://www.erlang.org/mailman/listinfo/erlang-questions
>

Richard A. O'Keefe

Aug 28, 2008, 12:56:52 AM

to Joe Armstrong, erlang-questions

Joe Armstrong replied to my request for criticism.

>>
>> 2. No distinction between indistinguishables.
>>
>> A specification shall not mandate distinct responses
>> to situations that user programs cannot distinguish.

> Impossible. Suppose Pid is on a remote machine. You cannot distinguish
> communication failure from machine failure. So you cannot implement
> this.

There is one fundamental difference between what you were thinking of
when implementing Erlang and what the ISO DTR authors were thinking of
when they came up with their design.

You were thinking "I have to implement distributed concurrent systems.
What kind of language would give me a fighting chance at getting such
programs right (or right _enough_)?"

They were thinking "How can we adapt the model of multiple threads in
a single address space to Prolog". It is an _essential_ characteristic
of the ISO DTR that it is not and cannot (without major hassle) be
distributed. It probably never occurred to the authors that a
concurrent Prolog-like language *could* be distributed over a cluster
or NOW just as easily as running in a single UNIX or Windows process.
I suspect they were thinking of "distribution" as "a problem solved
by sockets" and "concurrency" as a separate "problem solved by threads".

>> 4. No unprotected shared mutable variables.
>>
>> While some thread has the power to write a variable,
>> it is VERIFIED that no other thread has the power to
>> read or write that variable.
>

> I have always thought that there should not be shared variables AT ALL.

Of course I agree. In fact that principle is a little sneaky. It's
such an obvious thing to agree to (and it is basic to the way
Concurrent Pascal, Ada, and Occam work) that I hoped people would
agree before they realised that banning shared variables entirely is
by far the easiest way to do the verification in Prolog.

>
> Erlang programmers have happily been writing distributed and
> concurrent programs for twenty years without the use of mutexes -
> they are just NOT needed.

When I was writing that message, I was trying to figure out a way
to work the phrase "astonished delight" into it, as in

"When you consider with what astonished delight programmers
have found that concurrent programming in Erlang and Haskell
is easier than they imagined possible, it would be a great
shame for Prolog to adopt an approach famed for the opposite."

I'm also of the view that if you have possibly unreliable components
co-operating on shared data of some sort (such as a data base) you
will find yourself wanting transactions sooner or later. It will be
interesting to see how the new Sun machines work out (if only rich
old Uncle Nemo would order me one for Christmas; too bad I have no
rich old Uncle Nemo) with their hardware support for STM.

>
> Concurrency isn't a "nice layer over pthreads" - the most important
> thing is isolation - anything that mucks up isolation is a mistake.