
Threads


Andrew Cooke

Mar 30, 2000

Hi,

I realise threads are outside the ANSI spec, but I am curious if there
are any generally accepted conventions on Lisp threading? In
particular:

- If I understood an earlier topic correctly, assigning a value to a
dynamic scoped variable using let copies the original value to a safe
place for the scope of the let before replacing it. In a multi-threaded
environment this implies that the value will change for all threads - is
that correct?

- Is the model for thread support C/Java-like (mutexes etc) or
CSP/Occam/Jsp-like? (I suspect the former, but wouldn't be very
surprised, given the elegance of the latter (imho), if multi-threaded
lisps actually had separate dynamic scope variables with all
communication between threads limited to "meeting" threads).

Thanks,
Andrew
http://www.andrewcooke.free-online.co.uk/index.html


Sent via Deja.com http://www.deja.com/
Before you buy.

Erik Naggum

Mar 30, 2000
* Andrew Cooke <and...@andrewcooke.free-online.co.uk>

| - If I understood an earlier topic correctly, assigning a value to a
| dynamic scoped variable using let copies the original value to a safe
| place for the scope of the let before replacing it. In a multi-threaded
| environment this implies that the value will change for all threads - is
| that correct?

let doesn't assign values, it creates a new binding. how it is actually
implemented underneath has _some_ implications for the rest of the
system, but no matter how it is implemented, you should not assume that
the implementation violates the semantics. this implies that the new
binding is local to the process (thread), since the other processes
(threads) are not in the dynamic scope of the new binding.

#:Erik

Tim Bradshaw

Mar 30, 2000
* Andrew Cooke wrote:

> - If I understood an earlier topic correctly, assigning a value to a
> dynamic scoped variable using let copies the original value to a safe
> place for the scope of the let before replacing it. In a multi-threaded
> environment this implies that the value will change for all threads - is
> that correct?

No. Although it's formally undefined of course, an implementation
which works like this would typically arrange life so that the symbol
value got replaced on a thread switch, thus ensuring that bindings
work per thread. Of course this approach has problems on a
multiprocessor system where there are no thread switches, and a
system like that would do something else.

--tim

Joe Marshall

Mar 30, 2000
Andrew Cooke <and...@andrewcooke.free-online.co.uk> writes:

> Hi,
>
> I realise threads are outside the ANSI spec, but I am curious if there
> are any generally accepted conventions on Lisp threading? In
> particular:
>

> - If I understood an earlier topic correctly, assigning a value to a
> dynamic scoped variable using let copies the original value to a safe
> place for the scope of the let before replacing it. In a multi-threaded
> environment this implies that the value will change for all threads - is
> that correct?

In all multi-threaded lisps I have worked with, each thread has its
own set of dynamic bindings. Thus dynamically binding a value in
thread A has no discernible effect on thread B. I remember at least
one system that had a function SYMEVAL-IN-PROCESS or somesuch name
that given a symbol and a thread would return the `current value' of
the symbol in that thread.

If you were to SETQ a dynamically bound variable in a thread, the side
effect would only be seen in that thread. However, if the symbol were
NOT dynamically bound (global), *all* the threads would see the change.
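The distinction is easy to see by analogy in Python (my own sketch, not any Lisp API): `threading.local` behaves like a per-thread dynamic binding, while an ordinary global is shared by every thread:

```python
import threading

binding = threading.local()   # per-thread storage: analog of a dynamic binding
binding.value = "main"        # "bound" in the main thread only
shared = {"value": "global"}  # ordinary global state: visible to all threads

def worker(results):
    binding.value = "worker"      # rebinding here is invisible to other threads
    shared["value"] = "changed"   # mutating the global is visible everywhere
    results.append(binding.value)

results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
t.join()

print(binding.value)    # -> main: the worker's rebinding did not leak out
print(shared["value"])  # -> changed: the global mutation did
```

Something like SYMEVAL-IN-PROCESS would mean peeking at another thread's `threading.local` slots, which Python's standard API deliberately does not expose.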

> - Is the model for thread support C/Java-like (mutexes etc) or
> CSP/Occam/Jsp-like?

I don't know what CSP/Occam/Jsp-like is. Most multithreaded Lisps I
have used borrowed heavily from the Lisp machine thread model. The
primary method of synchronization was WITHOUT-INTERRUPTS which would
defer interrupt processing during the dynamic execution of a form.

The other primitive synchronization mechanisms were arbitrary chunks
of Lisp code that were evaluated by the scheduler when it was deciding
when a process should run. This method was used to implement locking
(mutexes).

In some respects, the multi-threading code was crude, but in a Lisp
system, it is far less likely that different threads will interfere
with each other. No thread can corrupt memory (unless it goes out of
its way to use `subprimitives') so the integrity of the heap is never
in question. Since there is a GC, consing unshared structure is a
reasonable way to ensure that no other thread can modify your data
from underneath you. The code that ran the different threads was
often in different packages, so it would not have access to the same
symbols that other threads use. So it is far easier to make a
working multi-tasking system in Lisp than in many languages.

Sam Steingold

Mar 30, 2000
>>>> In message <8buulu$jp$1...@nnrp1.deja.com>
>>>> On the subject of "Threads"
>>>> Sent on Thu, 30 Mar 2000 07:11:03 GMT

>>>> Honorable Andrew Cooke <and...@andrewcooke.free-online.co.uk> writes:
>>
>> I realise threads are outside the ANSI spec, but I am curious if
>> there are any generally accepted conventions on Lisp threading?

lispm-type threads appear to be the common ground.

you might want to look at clocc/src/port/proc.lsp (see
http://clocc.sourceforge.net and
http://www.podval.org/~sds/data/port.html) for thread unification.

--
Sam Steingold (http://www.podval.org/~sds)
Micros**t is not the answer. Micros**t is a question, and the answer is Linux,
(http://www.linux.org) the choice of the GNU (http://www.gnu.org) generation.
It's not just a language, it's an adventure. Common Lisp.

Andrew Cooke

Mar 30, 2000
In article <u7lekw...@alum.mit.edu>,

Joe Marshall <jmar...@alum.mit.edu> wrote:
> I don't know what CSP/Occam/Jsp-like is. Most multithreaded Lisps I
> have used borrowed heavily from the Lisp machine thread model. The
> primary method of synchronization was WITHOUT-INTERRUPTS which would
> defer interrupt processing during the dynamic execution of a form.

Thanks for all the useful replies.

For the record, CSP/Occam/JSP type multi-threading is based on Hoare's
book (Communicating Sequential Processes) where threads "meet" to
exchange data. I'm not an expert - I'm used to Java threads (mutexes,
semaphores) - but what I've read strikes me as very elegant. For more
info check out http://archive.comlab.ox.ac.uk/csp.html (all this is in
the light of imperative programming - as at least one reply (this one?)
pointed out, many of the problems with threads are much less significant
in a more functional environment).
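For concreteness, the rendezvous idea can be sketched in Python (a rough illustration of my own, not CSP-the-formalism): an unbuffered channel where the sender blocks until the receiver has actually taken the value, so the two threads "meet":

```python
import queue
import threading

class Channel:
    """CSP-style unbuffered channel: send blocks until a receiver takes
    the value, so sender and receiver rendezvous to exchange data."""
    def __init__(self):
        self._item = queue.Queue(maxsize=1)
        self._ack = queue.Queue(maxsize=1)

    def send(self, value):
        self._item.put(value)   # hand the value over...
        self._ack.get()         # ...and wait until the receiver has taken it

    def receive(self):
        value = self._item.get()
        self._ack.put(None)     # release the waiting sender
        return value

ch = Channel()
received = []

def consumer():
    received.append(ch.receive())

t = threading.Thread(target=consumer)
t.start()
ch.send(42)   # blocks until the consumer has received
t.join()
print(received)  # [42]
```

Occam's channels (and, much later, Go's unbuffered channels) provide exactly this meeting point as a language primitive.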

Thanks again,

Tim Bradshaw

Mar 30, 2000
* Joe Marshall wrote:

> I don't know what CSP/Occam/Jsp-like is. Most multithreaded Lisps I
> have used borrowed heavily from the Lisp machine thread model. The
> primary method of synchronization was WITHOUT-INTERRUPTS which would
> defer interrupt processing during the dynamic execution of a form.

Which is a hideously bad way of doing it unfortunately. Assuming it's
taken to mean `no context switches during this bit of code' then on a
multiprocessor this is an appallingly bad method of synchronisation as
this translates as `stop all other processors during this form'. It's
unfortunate that a lot of lisp multithreaded code does seem to do
synchronisation like this.

--tim

Joe Marshall

Mar 30, 2000
Tim Bradshaw <t...@cley.com> writes:

At Lucid, we had `without-interruptions', which basically meant that
asynchronous signal processing would be deferred in the stack group
that was using it. Thread switches would still happen, synchronous
interrupts (usually caused by errors) would still work, but things
like control-c would be deferred until interruptions were enabled.

I've never liked the `stack-group' model of multitasking anyway. I
don't know how many customers got screwed by writing code like this:

(process-wait
 #'(lambda ()
     (hairy-calculation-involving-thread-specific-state)))

Which would run in arbitrary stack-groups and make the scheduler
drop dead.

--
~jrm

Tim Bradshaw

Mar 31, 2000
* Joe Marshall wrote:

> At Lucid, we had `without-interruptions', which basically meant that
> asynchronous signal processing would be deferred in the stack group
> that was using it. Thread switches would still happen, synchronous
> interrupts (usually caused by errors) would still work, but things
> like control-c would be deferred until interruptions were enabled.

Related to this, I suspect that what most uses of what CLIM calls
WITHOUT-SCHEDULING really want to do is say `only one thread can be in
this section of code at once', not `no other code can run'. The right
way to do this is to have a lock which you grab before getting into
that code. This is conventionally kind of clunky because you have
to invent a lock object purely to protect the code section.

Java gets around this by being able to say `synchronised' around a
method -- by adding the feature especially to the language.

But in Lisp, if you have locks, you can *write* a SYNCHRONISED macro
with no explicit lock object -- you can create a suitable lock object
(you have to worry about MAKE-LOAD-FORM &c, but that's not a big pain)
in the macro expansion which gets wired into the code. So you can
just say (synchronised (do-this) (do-that)).
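A rough Python rendering of the same trick (my sketch, not Tim's macro): a decorator manufactures the lock at definition time, so call sites never mention a lock object:

```python
import functools
import threading

def synchronised(fn):
    """Wrap fn so only one thread runs it at a time. The lock is created
    here in the closure, playing the role of the lock object the macro
    expansion would wire into the code."""
    lock = threading.Lock()
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with lock:
            return fn(*args, **kwargs)
    return wrapper

counter = 0

@synchronised
def bump():
    global counter
    counter += 1   # serialized by the hidden lock

def hammer():
    for _ in range(1000):
        bump()

threads = [threading.Thread(target=hammer) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 4000
```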

I always thought that this ability to implement what other languages
have to provide as primitives was enormously cool.

--tim

dave

Apr 4, 2000
Since we're on the topic of threads, I've had a few questions myself.

recently I called up Xanalys, Harlequin's successor. I had just been
reading the docs on ACL5, and it said that threads in ACL5 are still
implemented as stack groups in the UNIX implementations, but ACL5 for Win32
platforms use OS threads.

LispWorks doesn't use OS threads on any of their implementations, based on
the answers I got.

My questions really go out to the people that have actually written and/or
worked on Lisp systems from the ground up. What are the benefits and
drawbacks of OS threads vs. stack groups?

I remember reading about the threads implementation in Linux, and how it was
very efficient, and how threads and processes are both scheduled by the same
scheduler, and the whole thing was extremely efficient (for reasons that I
didn't dive too deeply into).

My guess is that OS threads would be better, in that you don't have to
re-invent an efficient scheduler, and also that you can take better
advantage of nuggets in the kernel of whatever OS you're operating on.

dave

Andrew Cooke <and...@andrewcooke.free-online.co.uk> wrote in message
news:8buulu$jp$1...@nnrp1.deja.com...
>
> Hi,
>
> I realise threads are outside the ANSI spec, but I am curious if there
> are any generally accepted conventions on Lisp threading? In
> particular:
>
> - If I understood an earlier topic correctly, assigning a value to a
> dynamic scoped variable using let copies the original value to a safe
> place for the scope of the let before replacing it. In a multi-threaded
> environment this implies that the value will change for all threads - is
> that correct?
>
> - Is the model for thread support C/Java-like (mutexes etc) or
> CSP/Occam/Jsp-like? (I suspect the former, but wouldn't be very
> surprised, given the elegance of the latter (imho), if multi-threaded
> lisps actually had separate dynamic scope variables with all
> communication between threads limited to "meeting" threads).
>
> Thanks,
Christopher C Stacy

Apr 5, 2000
>>>>> On Tue, 4 Apr 2000 20:16:03 -0700, dave ("dave") writes:
dave> My guess is that OS threads would be better, in that you don't have to
dave> re-invent an efficient scheduler, and also that you can take better
dave> advantage of nuggets in the kernel of whatever OS you're operating on.

I don't think it's that simple. The OS threads scheduler might not have
everything that you need to easily support the kind of scheduling control
that you want from Lisp. It seems very likely that OS threads could play
a useful role for some purposes in the Lisp runtimes, and it also seems
like a good idea to be able to access that capability at least as an FFI,
but it's not automatically clear that "OS threads are better than Lisp threads"
or anything like that. It would be nice if Lisp could take advantage of
the multiprocessor support, though.


Robert Monfera

Apr 5, 2000

dave wrote:

> LispWorks doesn't use OS threads on any of their implementations, based on
> the answers I got.

Except LWW AFAIK.

Robert

Tim Bradshaw

Apr 5, 2000
* dave wrote:

> My guess is that OS threads would be better, in that you don't have to

> re-invent an efficient scheduler, and also that you can take better

> advantage of nuggets in the kernel of whatever OS you're operating on.

I think the significant potential win for OS threads is that you can
potentially run on multiple processors, which means you can take
advantage of multiprocessor machines easily. However that's far from
trivial to do even if you do use OS threads because of
memory-management issues.

I suspect (as a non-implementor) that the downside of OS threads is
that implementations of supposedly `standard' posix threads differ
significantly from vendor to vendor & release to release, making it a
nightmare for the implementor.

It looks to me like all the lisp threading APIs are similar enough
that it's not impossible to write code that will port reasonably
easily from one to the other though.

--tim

William Deakin

Apr 5, 2000
Tim Bradshaw wrote:

> I suspect (as a non-implementor) that the downside of OS threads is
> that implementations of supposedly `standard' posix threads differ
> significantly from vendor to vendor & release to release, making it a
> nightmare for the implementor.

Having tried to compile some threaded LDAP code I can vouch for this.

One of the big standardisation problems with threading is how you
actually implement a threading system. As I understand it, under UNIX there
are basically two solutions: some form of shared process resources (using
IPC communications/semaphores/memory handling) or to wire something extra
into the kernel to handle this. I think these methods produce what are then
called `light' and `heavy' threads. As examples of this (again, this is as
far as I understand it) Solaris 2.6 provides kernel (`heavy') support while the
gnu-libthread module, part of the gnu c-libraries (used by Linux), is an
example of `light' thread support.

However, please take this with a pinch of salt as there is a good chance
that I am talking through my trousers (again).

Best Regards,

:) will


Michael Hudson

Apr 5, 2000
William Deakin <wi...@pindar.com> writes:

> One of the big standardisation problems with threading is how you
> actually implement a threading system. As I understand it, under UNIX there
> are basically two solutions: some form of shared process resources (using
> IPC communications/semaphores/memory handling) or to wire something extra
> into the kernel to handle this. I think these methods produce what are then
> called `light' and `heavy' threads. As examples of this (again, this is as
> far as I understand it) Solaris 2.6 provides kernel (`heavy') support while the
> gnu-libthread module, part of the gnu c-libraries (used by Linux), is an
> example of `light' thread support.

Linux certainly has kernel support for threading; threads are
lightweight processes. Whether various threading libraries use this,
is of course up to them, but I'm pretty sure pthreads does.

> However, please take this with a pinch of salt as there is a good chance
> that I am talking through my trousers (again).

Given how much thread programming I've done (none), the same should be
said for me.

Cheers,
M.

--
well, take it from an old hand: the only reason it would be easier
to program in C is that you can't easily express complex problems
in C, so you don't. -- Erik Naggum, comp.lang.lisp

William Deakin

Apr 5, 2000
Michael Hudson wrote:

> Linux certainly has kernel support for threading; threads are
> lightweight processes. Whether various threading libraries use this,
> is of course up to them, but I'm pretty sure pthreads does.

Ok. I am happy to believe that I am wrong and that within the Linux kernel there
is code that is specifically there to deal with threads rather than using the gnu
C-threads library, which I thought used IPC communications. When I get a chance
I'll dig through the kernel code and have a look ;)

However, I was aware that threads are lightweight processes cf `normal' UNIX
processes. What I was trying to say (and obviously failing) is that there are
different ways of implementing threads and that these implementations fall into
two broad categories, which I believe goes some way to explain some of the
problems with using threads.

But then again this would not be the first mistake I've made,

Best Regards,

:) will


Marc Feeley

Apr 5, 2000
> > One of the big standardisation problems with threading is how you
> > actually implement a threading system. As I understand it, under UNIX there
> > are basically two solutions: some form of shared process resources (using
> > IPC communications/semaphores/memory handling) or to wire something extra
> > into the kernel to handle this. I think these methods produce what are then
> > called `light' and `heavy' threads. As examples of this (again, this is as
> > far as I understand it) Solaris 2.6 provides kernel (`heavy') support while the
> > gnu-libthread module, part of the gnu c-libraries (used by Linux), is an
> > example of `light' thread support.
>
> Linux certainly has kernel support for threading; threads are
> lightweight processes. Whether various threading libraries use this,
> is of course up to them, but I'm pretty sure pthreads does.

I don't think the term "lightweight" is appropriate for Linux threads
because they involve the OS kernel for scheduling, process creation
etc. I have run a simple benchmark (threaded fibonacci, code below)
to compare LinuxThreads, Java green threads (with the Kaffe JIT) and
the Scheme thread system I have designed for the Gambit-C Scheme
compiler. The difference is dramatic: the Scheme thread system runs
this benchmark 1000 times faster than LinuxThreads and 100 times
faster than the Kaffe JIT. Moreover LinuxThreads has a limit of about
1000 threads and a high per thread space overhead, which can be a
problem in some applications.

Marc Feeley


#include <stdio.h>
#include <stdlib.h>  /* for atoi and exit */
#include <pthread.h>

void *tfib (void *n)
{
  if ((int)n < 2)
    return (void*)1;
  else
    {
      pthread_t x;
      void *y;
      void *z;
      if (pthread_create (&x, NULL, tfib, (void*)((int)n - 2)) != 0) exit (1);
      y = tfib ((void*)((int)n - 1));
      if (pthread_join (x, &z) != 0) exit (1);
      return (void*)((int)y + (int)z);
    }
}

int main (int argc, char *argv[])
{
  int n = atoi (argv[1]);
  int repeat = atoi (argv[2]);
  void *result;
  do
    {
      pthread_t x;
      if (pthread_create (&x, NULL, tfib, (void*)((int)n)) != 0) exit (1);
      if (pthread_join (x, &result) != 0) exit (1);
      repeat--;
    } while (repeat > 0);
  printf ("nb. threads = fib(%d) = %d\n", n, (int)result);
  return 0;
}

Michael Hudson

Apr 5, 2000
Marc Feeley <fee...@trex.IRO.UMontreal.CA> writes:

> I don't think the term "lightweight" is appropriate for Linux threads
> because they involve the OS kernel for scheduling, process creation
> etc.

Gah, bad wording; linux threads are lightweight compared to processes
- if your benchmark used fork it would presumably be even slower - but
heavyweight compared to most userland threads. Do Kaffe threads or
your scheme threads work on multiprocessor machines (i.e. split across
processors, not just work at all)? Your benchmark seemed to thrash thread
creation, which is not what kernel-land threads were intended for, as
I understand it.

Is this off-topic enough yet?

M.

--
it's not that perl programmers are idiots, it's that the language
rewards idiotic behavior in a way that no other language or tool
has ever done -- Erik Naggum, comp.lang.lisp

William Deakin

Apr 5, 2000
Marc Feeley wrote:

> I have run a simple benchmark (threaded fibonacci, code below)
> to compare LinuxThreads, Java green threads (with the Kaffe JIT) and
> the Scheme thread system I have designed for the Gambit-C Scheme
> compiler.

How do these benchmarks compare with a full-on heavy-weight process creation?

Thanks anyway,

:) will

Fernando D. Mato Mira

Apr 5, 2000
Enough with "speed". Response time and jitter are the real issues.

--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1 email: matomira AT acm DOT org
CH-2007 Neuchatel tel: +41 (32) 720-5157
Switzerland FAX: +41 (32) 720-5720

www.csem.ch www.vrai.com ligwww.epfl.ch/matomira.html


David Hanley

Apr 5, 2000

dave wrote:

>
> My guess is that OS threads would be better, in that you don't have to
> re-invent an efficient scheduler, and also that you can take better
> advantage of nuggets in the kernel of whatever OS you're operating on.

And perhaps more critically: the ability to make use of more than one
CPU if present.

dave


Marc Feeley

Apr 5, 2000
> Marc Feeley <fee...@trex.IRO.UMontreal.CA> writes:
>
> > I don't think the term "lightweight" is appropriate for Linux threads
> > because they involve the OS kernel for scheduling, process creation
> > etc.
>
> Gah, bad wording; linux threads are lightweight compared to processes
> - if your benchmark used fork it would presumably be even slower - but
> heavyweight compared to most userland threads.

Actually it is the other way around! The same benchmark using fork
is 10 times faster than the one using LinuxThreads. Here are the
timings on a 600MHz Athlon with Linux RH 6.1:

% time ./pthread-demo 14 10
nb. threads = fib(14) = 610
0.30user 32.87system 0:33.17elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (109major+6114minor)pagefaults 0swaps
% time ./fork-demo 14 10
fib(14) mod 256 = 98
0.35user 2.97system 0:03.32elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (18389major+48818minor)pagefaults 0swaps

The code is below if anybody would like to duplicate the experiment.

Marc


/* fork-demo.c */

#include <stdio.h>
#include <stdlib.h>  /* for atoi and exit */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int tfib (int n)
{
  if (n < 2)
    return 1;
  else
    {
      pid_t x;
      int y;
      int z;
      x = fork ();
      if (x < 0) exit (1);
      if (x == 0) exit (tfib (n - 2));
      y = tfib (n - 1);
      if (waitpid (x, &z, 0) < 0) exit (1);
      return y + WEXITSTATUS(z);
    }
}

int main (int argc, char *argv[])
{
  int n = atoi (argv[1]);
  int repeat = atoi (argv[2]);
  int result;
  do
    {
      pid_t x;
      x = fork ();
      if (x < 0) exit (1);
      if (x == 0) exit (tfib (n));
      if (waitpid (x, &result, 0) < 0) exit (1);
      repeat--;
    } while (repeat > 0);
  printf ("fib(%d) mod 256 = %d\n", n, WEXITSTATUS(result));
  return 0;
}



Tim Bradshaw

Apr 5, 2000
* Marc Feeley wrote:

> Actually it is the other way around! The same benchmark using fork
> is 10 times faster than the one using LinuxThreads. Here are the
> timings on a 600MHz Athlon with Linux RH 6.1:

For Solaris the results are pretty much backwards from this -- 2.2
secs for the thread one, 16 for the fork one (on a 333MHz machine,
with gcc).

I'm not sure what if anything this says, except that solaris has fast
thread creation perhaps. I'd kind of expect their thread impl to be
pretty good because their market is big multiprocessor machines.

--tim

Joe Marshall

Apr 5, 2000
"dave" <deen...@yahoo.com> writes:

>
> My guess is that OS threads would be better, in that you don't have to
> re-invent an efficient scheduler, and also that you can take better
> advantage of nuggets in the kernel of whatever OS you're operating on.
>

In *theory*, most of the threads aren't doing much but waiting around
for something interesting to happen. The OS is the ultimate oracle
for interesting events, so in *theory*, the OS can schedule things
*very* efficiently. But....

In *practice*, the sort of events a thread might find `interesting'
might not be easily described to the OS. For instance, you might want
a thread to wait until a particular lisp symbol has a value of NIL.
In a case like this (unless something *extraordinarily* clever is
going on), you have little recourse but to ask the scheduler to check
the value of the symbol periodically. It would be difficult to
persuade unix to do this, but rather trivial if you wrote your own
scheduler.

In *practice*, the OS scheduler could be very poorly implemented.

So you have to find a balance between the expressive power you need to
describe your thread synchronization and the actual efficiency of the
OS scheduler.
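The polling arrangement Joe describes is easy to sketch in Python (illustrative only; the names are made up): a PROCESS-WAIT-style function that re-checks an arbitrary predicate on a period, which is trivial for a user-level scheduler to do but awkward to describe to a kernel:

```python
import threading
import time

state = {"symbol": "busy"}

def process_wait(predicate, poll_interval=0.01):
    """Block the calling thread until predicate() is true, by periodic
    re-checking -- the user-level scheduler trick, not a kernel wait."""
    while not predicate():
        time.sleep(poll_interval)

def setter():
    time.sleep(0.05)
    state["symbol"] = None   # the 'interesting event': the symbol becomes NIL

t = threading.Thread(target=setter)
t.start()
process_wait(lambda: state["symbol"] is None)   # arbitrary lisp-level predicate
t.join()
print(state["symbol"])  # None
```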

Erik Naggum

Apr 5, 2000
* Tim Bradshaw <t...@cley.com>

| I'm not sure what if anything this says, except that solaris has fast
| thread creation perhaps. I'd kind of expect their thread impl to be
| pretty good because their market is big multiprocessor machines.

I'd also expect Sun to want a "favorable demo" effect that shows this.
Linux has made fork exceptionally fast (my experience is that PIDs are
used up about three times faster under Linux than under SunOS 4.1.3_U1,
with equivalent work being performed on the systems), and has implemented
the performance sensitive parts of vfork in fork. ironically, it now
appears that threads have to do _more_ work than processes because they
_share_ the memory and a bunch of other stuff that fork splits off into
objects with distinct identity.

#:Erik

Erik Naggum

Apr 5, 2000
* David Hanley <d...@ncgr.org>

| And perhaps more critically: the ability to make use of more than one
| CPU if present.

yet this may be what slows down the threads and speeds up the processes.

I think a valuable benchmarking approach would be to test the same kind of
communication between the various kinds of processes, like sockets and
some simple protocol.
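Such a benchmark might look like this sketch (my construction, not Erik's code): time message round-trips over a socketpair between two threads; the same harness could then be run between forked processes for comparison:

```python
import socket
import threading
import time

def echo(sock, n):
    # Echo n one-byte messages back to the peer.
    for _ in range(n):
        sock.sendall(sock.recv(1))

def ping(sock, n):
    # Measure n request/reply round-trips over the socket.
    start = time.perf_counter()
    for _ in range(n):
        sock.sendall(b"x")
        assert sock.recv(1) == b"x"
    return time.perf_counter() - start

a, b = socket.socketpair()
n = 1000
t = threading.Thread(target=echo, args=(b, n))
t.start()
elapsed = ping(a, n)
t.join()
a.close()
b.close()
print(f"{n} round-trips in {elapsed:.3f}s")
```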

#:Erik

David Bakhash

Apr 5, 2000
Joe Marshall <jmar...@alum.mit.edu> writes:

> In *practice*, the sort of events a thread might find `interesting'
> might not be easily described to the OS. For instance, you might want
> a thread to wait until a particular lisp symbol has a value of NIL.
> In a case like this (unless something *extraordinarily* clever is
> going on), you have little recourse but to ask the scheduler to check
> the value of the symbol periodically. It would be difficult to
> persuade unix to do this, but rather trivial if you wrote your own
> scheduler.

But the main Lisp process can still send an asynch event to wake up
the thread that was sleeping.

Of course, some would instinctively say that this adds another level
of indirection, but I don't think that the argument above is valid.

If you have a Lisp-based threading system (i.e. non-OS threads), and
you wanted a thread to wake up when a symbol got bound, for example,
then what's the difference between the Lisp process (which is
scheduled identically to threads in Linux, for example) seeing the
boundp and waking up the "thread" (as a stack group) vs. sending some
kind of interrupt or event to another process-like thread to wake it
up?

If you think about all this all over again, you'll probably see that
the better thing all around is to use OS threads, when possible.

The reason for this has a lot to do with the subtle differences
between OSs. The people who understand the OS very well probably are
the best ones to take advantage of low-level, subtle optimizations
that, at the Lisp level, are unreachable. Also, pushing off
responsibility (e.g. scheduling) is a good thing, especially if you
consider what KMP once said, which is that it's not good to write Lisp
to use local Lisp-level optimizations; it's better to write nice,
elegant, simple code and use tools (I think he was referring to the
LOOP macro in some way) that just do it, and not to try too hard to
figure out if the system you're on does it well. For example, in this
case, if the Lisp guys relied on the OS guys to make (future)
improvements in multithreading, then I'd be happy to use a slightly
less efficient implementation now, knowing that it would get better
and better as the OS (in this case Linux) gets better (with respect to
threads). The bottom line is for the Lisp implementor NOT to try too
hard to figure out if the system he's on has multiple processors,
etc. If there's a mechanism to abstract that away, then they should
use it, unless breaking into the guts reaps massive rewards.

> In *practice*, the OS scheduler could be very poorly implemented.
>
> So you have to find a balance between the expressive power you need to
> describe your thread synchronization and the actual efficiency of the
> OS scheduler.

again, even if this is so, the implementors should assume that the OS
guys knew what they were doing. Even if they didn't do the best job,
it's inevitably going to improve.

dave

Tim Bradshaw

Apr 5, 2000
* Erik Naggum wrote:

> I'd also expect Sun to want a "favorable demo" effect that shows this.
> Linux has made fork exceptionally fast (my experience is that PIDs are
> used up about three times faster under Linux than under SunOS 4.1.3_U1,
> with equivalent work being performed on the systems), and has implemented
> the performance sensitive parts of vfork in fork. ironically, it now
> appears that threads have to do _more_ work than processes because they
> _share_ the memory and a bunch of other stuff that fork splits off into
> objects with distinct identity.

I doubt performance of sunos 4.1.3 says anything about recent solaris
as that was before all the multiprocessor support in sunos 5 (I'm not
even sure how much code they share, probably very little). However
I'm not interested in performance comparisons between the systems (I'm
sure linux is quicker on any small machine, but almost anything is
fast enough now unless you have to run Word or something), I was kind
of trying to point out that the benchmark was spurious.

--tim

Sam Steingold

Apr 5, 2000
>>>> In message <ey37lec...@cley.com>
>>>> On the subject of "Re: more questions about threads..."
>>>> Sent on 05 Apr 2000 11:10:52 +0100

>>>> Honorable Tim Bradshaw <t...@cley.com> writes:
>>
>> It looks to me like all the lisp threading APIs are similar enough
>> that it's not impossible to write code that will port reasonably
>> easily from one to the other though.

you might want to try
http://clocc.sourceforge.net
clocc/src/port/proc.lsp

--
Sam Steingold (http://www.podval.org/~sds)
Micros**t is not the answer. Micros**t is a question, and the answer is Linux,
(http://www.linux.org) the choice of the GNU (http://www.gnu.org) generation.

The only time you have too much fuel is when you're on fire.

Erik Naggum

Apr 5, 2000
* Tim Bradshaw <t...@cley.com>

| I doubt performance of sunos 4.1.3 says anything about recent solaris
| as that was before all the multiprocessor support in sunos 5 (I'm not
| even sure how much code they share, probably very little).

note that I was talking about the consumption of PIDs, not performance.
this has to do with the number of scripts and invoked interpreters and
such, which may or may not impact performance. what I was suggesting was
that Linux is very heavily script-and-interpreters-oriented, while BSD-
based Unices have traditionally been less so than AT&T-based Unices, for
whatever this is worth. in any case, I would expect Linux to be faster
at forking _because_ it is used so much more. Linux was heavily into
this mode of thinking long before it had SMP support.

#:Erik

Rolf Rander Naess

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to
[ Marc Feeley, 05 Apr 2000 16:25 ]

> > Gah, bad wording; linux threads are lightweight compared to processes
> > - if your benchmark used fork it would presumably be even slower - but
> > heavyweight compared to most userland threads.
>
> Actually it is the other way around! The same benchmark using fork
> is 10 times faster than the one using LinuxThreads. Here are the
> timings on a 600MHz Athlon with Linux RH 6.1:

I believe I read a quote from the linux-kernel developers once stating
that they wouldn't put much effort into making threads efficient, as
they regarded heavy use of threads as bad design in the first place.
(No, I don't have a reference)


Rolf Rander

--
(c) 2000 Rolf Rander Næss
http://www.pvv.org/~rolfn/

My mailer limits .sigs to 4 lines. But ingeniously I bypassed this by

Jochen Schmidt

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to
Your benchmark only compares thread creation - what about the speed
and quality of the locking mechanisms, or the context-switch times?
Production-ready code doesn't simply create threads on demand - you
should use thread pools, so that thread-creation time stops mattering.
As I remember, the actual implementation of fork in Linux is only a
compatibility wrapper around the "clone()" system call, which
LinuxThreads also uses. LinuxThreads has to manage cloning of file
descriptors and much more OS-level stuff than user-space threads do.

Marc Feeley wrote:
>
> > Marc Feeley <fee...@trex.IRO.UMontreal.CA> writes:
> >
> > > I don't think the term "lightweight" is appropriate for Linux threads
> > > because they involve the OS kernel for scheduling, process creation
> > > etc.
> >

> > Gah, bad wording; linux threads are lightweight compared to processes
> > - if your benchmark used fork it would presumably be even slower - but
> > heavyweight compared to most userland threads.
>
> Actually it is the other way around! The same benchmark using fork
> is 10 times faster than the one using LinuxThreads. Here are the
> timings on a 600MHz Athlon with Linux RH 6.1:
>

--
cya,
Jochen Schmidt
j...@dataheaven.de
http://www.dataheaven.de

Andrew Cooke

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to
In article <ukwln2s...@pvv.org>,

Rolf Rander Naess <rolfn...@pvv.org> wrote:
> I believe I read a quote from the linux-kernel developers once stating
> that they wouldn't put much effort into making threads efficient, as
> they regarded heavy use of threads as bad design in the first place.
> (No, I don't have a reference)

Wow. That's a pretty extreme position. Personally, I'd like an OS that
lets me decide what *I* want to do. I don't have multithreaded Lisp
experience, but work in Java, where threads are part of the standard
language, and they are invaluable in making responsive, robust systems
without complicated coupling between separate parts of the program. So
they often help *good design*.

They certainly have a cost - they were a major source of errors until we
became experienced with suitable patterns - but so do many other useful,
powerful ideas in computing. (I'd say they need an extra level of
abstraction, but I don't know what it is yet).

Maybe it was in a particular context, or a long time ago....

Andrew

Joe Marshall

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to

I wasn't arguing that lisp-level threads were unconditionally superior
to OS threads, just that there are cases where OS threads may not have
the necessary functionality or performance, or cases where Lisp
threads have poor performance. You have to evaluate the benefits of
each on a case by case basis. In a perfect world, OS threads would
win hands down, but the world is far from perfect.

David Bakhash <ca...@alum.mit.edu> writes:

> Joe Marshall <jmar...@alum.mit.edu> writes:
>
> > In *practice*, the sort of events a thread might find `interesting'
> > might not be easily described to the OS. For instance, you might want
> > a thread to wait until a particular lisp symbol has a value of NIL.
> > In a case like this (unless something *extraordinarily* clever is
> > going on), you have little recourse but to ask the scheduler to check
> > the value of the symbol periodically. It would be difficult to
> > persuade unix to do this, but rather trivial if you wrote your own
> > scheduler.
>
> But the main Lisp process can still send an asynch event to wake up
> the thread that was sleeping.

Let's make a more concrete example. Suppose we have a lisp process
servicing http requests. While it is serving such a request, it binds
a special variable *HTTP-REQUEST-BEING-SERVICED* to T.

Now suppose we have another process that cannot begin some sort of
action if an http-request is being serviced. The code for this
process might have a fragment like this in it:

(progn
  (process-wait "waiting for http to be done"
                #'(lambda () (not *http-request-being-serviced*)))
  (begin-some-action))

(yes, there is a race condition here. Ignore it.)

PROCESS-WAIT, or something like it, is typically found in lisp
multi-threading packages. It tells the scheduler to put this process
to sleep until the function returns T. The scheduler is supposed to
evaluate the function from time to time.

Finally, suppose yet another process is listening to the serial port.
It is using some sort of magic involving the OS.

This third process is using an OS thread in the most effective
manner. Provided the OS is clever, it doesn't need to consider
scheduling the thread or look at the thread resources *at all* until a
serial line interrupt occurs.

The second process, however, doesn't have it so lucky. Someone,
either the process itself or a scheduling process, has to periodically
peek at the value of *http-request-being-serviced*. This process has
to be in lisp because we can't indicate to the OS that we're waiting
on a symbol binding. This also cannot be the process servicing the
http port because it is busy with other things. (Let's not violate
abstraction by rewriting the http service routine to handle
synchronization of multiple unrelated threads.)

So we have a lisp process polling the value of
*http-request-being-serviced*. The OS doesn't know under what
conditions this might happen; as far as it is concerned, this is a
running thread and must be considered when processes are scheduled.

> If you have a Lisp-based threading system (i.e. non-OS threads), and
> you wanted a thread to wake up when a symbol got bound, for example,
> then what's the difference between the Lisp process (which is
> scheduled identically to threads in Linux, for example) sees the
> boundp and wakes up the "thread" (as a stack group) vs. sending some
> kind of interrupt or event to another process-like thread to wake it
> up.

I don't know how threads are scheduled in Linux. If they are
scheduled with a simple `round robin' type of scheduler that simply
polls pending OS-level events, then it really wouldn't make much
difference.

On the other hand, suppose each thread on the OS is awaiting some form
of I/O. The OS can simply turn off the scheduler until an interrupt
comes in. Or suppose that all threads but one are awaiting I/O. The
OS can transfer all CPU time to that thread.

> If you think about all this all over again, you'll probably see that
> the better thing all around is to use OS threads, when possible.

Not necessarily.

> The reason for this has a lot to do with the subtle differences
> between OSs.

An ideal reason to use lisp-level threads: it is more likely to
present a uniform interface.

> The people who understand the OS very well probably are
> the best ones to take advantage of low-level, subtle optimizations
> that, at the Lisp level, are unreachable.

An ideal reason to use OS threads.

> Also, pushing off
> responsibility (e.g. scheduling) is a good thing, especially if you
> consider what KMP once said which is that it's not good to code [Lisp]
> to use local Lisp-level optimizations;

Again, a tradeoff. If you want portable, high-level code, of course
not. If portability is not an issue and performance (and correctness)
is, then why not.

> it's better to write nice,
> elegant, simple code and use tools (I think he was referring to the
> LOOP macro in some way) when just do it, and don't try too hard to
> figure out if the system you're on does it well.

Yes. You shouldn't re-invent the wheel.

But suppose the system you're on doesn't do it well enough? Or does
it wrong?


> For example, in this
> case, if the Lisp guys relied on the OS guys to make (future)
> improvements in multithreading, then I'd be happy to use a slightly
> less efficient implementation now, knowing that it would get better
> and better as the OS (in this case Linux) gets better (with respect to
> threads). The bottom line is for the Lisp implementor NOT to try too
> hard to figure out if the system he's on has multiple processors,
> etc. If there's a mechanism to abstract that away, then they should
> use it, unless the advantages to breaking into the guts reaps massive
> rewards.

Again, in *theory* this is correct. Reality often rears its ugly
head, though.

> > In *practice*, the OS scheduler could be very poorly implemented.
> >
> > So you have to find a balance between the expressive power you need to
> > describe your thread synchronization and the actual efficiency of the
> > OS scheduler.
>
> again, even if this is so, the implementors should assume that the OS
> guys knew what they were doing.

Even if the OS is Unix or NT? I don't think so.

> Even if they didn't do the best job,
> it's inevitably going to improve.

You have got to be kidding! Unix is about 25 years old and it *still*
doesn't handle pcluser'ing correctly. It still runs fsck at
boot time. NFS still has no cache coherency.

Or look at Windows 2000. Is 30 million lines of code an improvement?

--
~jrm

Erik Naggum

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to
* Andrew Cooke <and...@andrewcooke.free-online.co.uk>

| I don't have multithreaded Lisp experience, but work in Java, where
| threads are part of the standard language, and they are invaluable in
| making responsive, robust systems without complicated coupling between
| separate parts of the program. So they often help *good design*.

you don't need OS support to make this work the way you want. the desire
for OS support can often lead you to dismiss what you can actually do as
"undesirable". many would-be CL users have run into this "mindset trap"
where they refuse to use Common Lisp because they have this fixation that
some feature or another must be "standard" before they can use it.

investigate your Common Lisp environment. programming only in Common
Lisp as per the standard _only_ is like programming in any other language
as per the standard _only_ (with the exception that you can actually get
quite a lot of interesting work done in standard CL) -- ignoring the
programming environment (such as Allegro CL for CL, and Unix for C) is
just plain stupid.

#:Erik

David Hanley

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to

Joe Marshall wrote:

> "dave" <deen...@yahoo.com> writes:
>
> >
> > My guess is that OS threads would be better, in that you don't have to
> > re-invent an efficient scheduler, and also that you can take better
> > advantage of nuggets in the kernel of whatever OS you're operating on.
> >
>
> In *theory*, most of the threads aren't doing much but waiting around
> for something interesting to happen.

Not for every app. I have an app which runs on a multiple CPU
machine, and needs to generate a number of images ( time intensive ).
I shoot off a number of threads to work on the image ( one each )
and this makes the whole thing happen a *lot* faster.

> The OS is the ultimate oracle
> for interesting events, so in *theory*, the OS can schedule things
> *very* efficiently. But....
>

> In *practice*, the sort of events a thread might find `interesting'
> might not be easily described to the OS. For instance, you might want
> a thread to wait until a particular lisp symbol has a value of NIL.
> In a case like this (unless something *extraordinarily* clever is
> going on), you have little recourse but to ask the scheduler to check
> the value of the symbol periodically. It would be difficult to
> persuade unix to do this, but rather trivial if you wrote your own
> scheduler.

What your scheduler is going to do, then, is

(loop while *waiting-flag*
      do (sleep 5))

Which you can do in lisp, and still use the system threads.

> In *practice*, the OS scheduler could be very poorly implemented.

That's true, it could. If so, it's likely you're going to have *other*
performance issues, and still not use more than one CPU if present.

dave


David Hanley

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to

Erik Naggum wrote:

> * David Hanley <d...@ncgr.org>
> | And perhaps more critically: the ability to make use of more than one
> | CPU if present.
>
> yet this may be what slows down the threads and speeds up the processes.

This is true, particularly if you're doing a lot of memory allocation. It
depends a *lot* on the task at hand. However, I don't think it justifies
using some homebrewed version of threads instead of OS threads.

Here: Java used/uses homebrewed "green" threads. We don't
want to be like java, do we? :)

dave


Joe Marshall

unread,
Apr 5, 2000, 3:00:00 AM4/5/00
to
David Hanley <d...@ncgr.org> writes:

> Joe Marshall wrote:
>
> >
> > In *theory*, most of the threads aren't doing much but waiting around
> > for something interesting to happen.
>
> Not for every app. I have an app which runs on a multiple CPU
> machine, and needs to generate a number of images ( time intensive ).
> I shoot off a number of threads to work on the image ( one each )
> and this makes the whole thing happen a *lot* faster.

A real good argument for OS threads.

> > The OS is the ultimate oracle
> > for interesting events, so in *theory*, the OS can schedule things
> > *very* efficiently. But....
> >
> > In *practice*, the sort of events a thread might find `interesting'
> > might not be easily described to the OS. For instance, you might want
> > a thread to wait until a particular lisp symbol has a value of NIL.
> > In a case like this (unless something *extraordinarily* clever is
> > going on), you have little recourse but to ask the scheduler to check
> > the value of the symbol periodically. It would be difficult to
> > persuade unix to do this, but rather trivial if you wrote your own
> > scheduler.
>
> What your scheduler is going to do, then, is
> (loop while *waiting-flag*
>       do (sleep 5))
>
> Which you can do in lisp, and still use the system threads.

Yes, but you don't get all the benefits of OS threads. You'd have a
thread that unconditionally wakes up every five seconds (and gets
swapped in). If you were waiting on something the OS is familiar with
(like an i/o event) the OS wouldn't even bother with your thread until
it was actually runnable.

> > In *practice*, the OS scheduler could be very poorly implemented.
>
> That's true, it could. If so, it's likely you're going to have *other*

> performance issues, and still not use more than one CPU if present.

I was thinking of bugs and race conditions more than performance.
Certainly if you want to use multiple cpu's, you are almost required
to use OS threads (unless there is some funky mechanism that lets you
gain control of the other cpus directly).

--
~jrm


Jens Kilian

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
Rolf Rander Naess <rolfn...@pvv.org> writes:
> I believe I read a quote from the linux-kernel developers once stating
> that they wouldn't put much effort into making threads efficient, as
> they regarded heavy use of threads as bad design in the first place.
> (No, I don't have a reference)

References? (BeOS is a good counterexample to this opinion, BTW.)
--
mailto:j...@acm.org phone:+49-7031-464-7698 (HP TELNET 778-7698)
http://www.bawue.de/~jjk/ fax:+49-7031-464-7351
PGP: 06 04 1C 35 7B DC 1F 26 As the air to a bird, or the sea to a fish,
0x555DA8B5 BB A2 F0 66 77 75 E1 08 so is contempt to the contemptible. [Blake]

William Deakin

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
Joe Marshall wrote:

> A real good argument for OS threads.

This depends on what you mean by OS threads. A lightweight thread
implementation which is supplied with the OS but keeps the threads within a
single process's memory and resources will not be affected by whether you
have 1 or 87 CPUs.

Best Regards,

:) will


Marc Battyani

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
William Deakin <wi...@pindar.com> wrote in message
news:38EC451A...@pindar.com...

> This depends on what you mean by OS threads. A lightweight thread
> implementation which is supplied with the OS but keeps the threads within
> a single process's memory and resources will not be affected by whether
> you have 1 or 87 CPUs.

I don't think that OS threads are limited to run on one CPU.
At least it's not the case under NT. There are even functions to specify
on which processors a thread can and should run.

The SetThreadIdealProcessor function is used to specify a preferred
processor for a thread. The system schedules threads on their preferred
processors whenever possible.

The SetThreadAffinityMask function sets a processor affinity mask for a
specified thread.
A thread affinity mask is a bit vector in which each bit represents the
processors that a thread is allowed to run on.

Marc Battyani


William Deakin

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
Marc Battyani wrote:

> I don't think that OS threads are limited to run on one CPU.

Which is not what I tried to say. What I was trying to say is that there are
implementations of OS threads (looking at UNIX which I know a little more about
than NT) in which you *are* limited to run on one CPU. This is because although
the thread is supplied by the OS (that is the operation comes bundled with the
kernel [1]) the thread operations are limited to run in user and not kernel
space. As user processes generally do not have access to the SMP/kernel
scheduler, the OS thread is limited to run on the CPU on which the user process
runs. Thread scheduling is then handled within the user process. This is a
pretty bad way of implementing threads but such implementations *do* exist.

This is in contrast to systems in which the thread system is implemented in
such a way as to give access to kernel space, at which point threads can be
happily fired off onto whatever CPU is free. Thread scheduling is then
handled by the kernel. However, this again is somewhat problematic, as
these threads are very similar to processes. (A way round this is to have
some hybrid system with a happy mixture of the light-weight user-thread
system and the heavy-weight kernel-thread system.)

Anyway, maybe it's just *me* that is confused, but I am fairly sure that I
am not that unique. So my problem with this thread thread (sic) is the
confusion as to what threads are, the jargon used to describe them, and how
this maps to the way in which they are implemented. For example: the
differences between OS threads, green threads, kernel threads, light and
heavy threads, nice threads &c ;)

Cheers,

:) will

[1] and the reason why I was asking for a definition of an OS thread. If an OS
thread is `a thread that has access to kernel space' then you have defined away
the problems.


William Deakin

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
Fernando D. Mato Mira wrote:

> Enough with "speed". Response time and jitter are the real issues.

That is very true. But this wouldn't be c.l.l. if we didn't have some
arbitrary benchmarks to argue about,

;) will


Fernando D. Mato Mira

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
Joe Marshall wrote:

> The second process, however, doesn't have it so lucky. Someone,
> either the process itself or a scheduling process, has to periodically
> peek at the value of *http-request-being-serviced*. This process has
> to be in lisp because we can't indicate to the OS that we're waiting
> on a symbol binding. This also cannot be the process servicing the
> http port because it is busy with other things. (Let's not violate
> abstraction by rewriting the http service routine to handle
> synchronization of multiple unrelated threads.)
>
> So we have a lisp process polling the value of
> *http-request-being-serviced*. The OS doesn't know under what
> conditions this might happen, as far as it is concerned, this is a
> running thread and must be considered when processes are scheduled.

This means Lisp globals should be implemented using condition variables
(better yet, do not support their use for synchronization unless a
corresponding (nonignorable) declaration has been performed).

Jens Kilian

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to

Rolf Rander Naess <rolfn...@pvv.org> writes:
> (No, I don't have a reference)

Jens Kilian <Jens_...@agilent.com> writes:
> References?

Well, duh. Seems I have to increase my caffeine intake.
Sorry.

Erik Naggum

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
* David Hanley <d...@ncgr.org>

| Not for every app. I have an app which runs on a multiple CPU machine,
| and needs to generate a number of images ( time intensive ). I shoot off
| a number of threads to work on the image ( one each ) and this makes the
| whole thing happen a *lot* faster.

how do they communicate the results back to each other or a master?

#:Erik

David Hanley

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to

William Deakin wrote:

> Joe Marshall wrote:
>
> > A real good argument for OS threads.
>

> This depends on what you mean by OS threads. A lightweight thread
> implementation which is supplied with the OS but keeps the threads within
> a single process's memory and resources will not be affected by whether
> you have 1 or 87 CPUs.

True, with a sufficiently stupid implementation. I draw a parallel to
the "lisp is slow" arguments, by pointing to cruddy lisp implementations.

dave


David Hanley

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to

Erik Naggum wrote:

They each get an object to modify, and call a synchronized reporting function
when they are complete, which calls another function when every thread has
reported in.

Unfortunately, processes wouldn't work well for this. The images have to come
back to one place, and it's prohibitively expensive to send an image via IPC
in java, and, of course, java has no shared memory. The threads do
not have to communicate with each other once they are started, though they
do have to make sure someone isn't already generating a resource they need
(conflict). If someone else is, they wait for it. It took a while to get
right, but it's worth it, and I learned quite a bit about threaded
caching. :)

dave


David Hanley

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to

Joe Marshall wrote:

> David Hanley <d...@ncgr.org> writes:
>
> > What your scheduler is going to do, then is
> > (while *waiting-flag*
> > (sleep 5))
> >
> > Which you can do in lisp, and still use the system threads.
>
> Yes, but you don't get all the benefits of OS threads. You'd have a
> thread that unconditionally wakes up every five seconds (and gets
> swapped in). If you were waiting on something the OS is familiar with
> (like an i/o event) the OS wouldn't even bother with your thread until
> it was actually runnable.

Yes, but the OS is going to do something almost exactly like the above,
is my point. Whether it takes place in OS space or user space doesn't
make a huge amount of difference.

Actually in CLOS, you could put an :after method on the setter of that
flag and have it wake up the thread. Even more efficient.


>
> I was thinking of bugs and race conditions more than performance.
> Certainly if you want to use multiple cpu's, you are almost required
> to use OS threads (unless there is some funky mechanism that lets you
> gain control of the other cpus directly).

You're right, you will have fewer potential and real threading bugs using
non-OS threads. I still wouldn't trust such code, however. Suppose
your compiler is upgraded to use OS threads, and all the code breaks?
Or you port to a certain platform? It's also true, as you say (I think)
that OS threads could cause more system problems too. I say, get
the system right.

On a tangent, there were bugs in java's stop() of a thread, so it is
deprecated, and will be removed. They say a thread should
programmatically test a stop flag periodically if someone else
may need to kill it. This is a *really horrible* idea. Flag checks
take up time, potentially a lot, and what if you're calling long-running
library code? Geez.

dave


not_s...@my-deja.com

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
Hey,

I'm going to chime in, as half the things I have read here go against
everything I have learned about threading and concurrent programming.
That may well be because I haven't done that much of it, so this might
turn out to be an interesting learning experience. Please don't take
the rest of this message as preaching, I'm merely regurgitating what I
remember. Feel free to correct me :-)

The OS attempts to provide a generic level of functionality that works
well across a variety of applications. Sometimes, the OS
characteristics match the application well, and sometimes not. If not,
we often find ourselves tweaking the OS/kernel to get better
performance. That is true of threading as well.

OS threads have to be scheduled by the kernel. Each time such
scheduling takes place, the previous thread's context has to be saved
by the kernel, the next thread's context restored, and the thread
started. This makes OS threads rather crufty. They turn out to be great
for dividing work within a process where there isn't that much
interaction between the parts. This is often the case, for example, in
image processing and handling HTTP requests. An area of an image can be
handed over to different OS threads, which can now potentially run on
multiple CPU's. The same with different threads handling different HTTP
requests. There is sometimes a large cost in collating results through
IPC, but that's a one-time cost. Overall, performance is much improved.

This also shows the problems with OS level threads. If the threads have
to be closely synchronized when they run, then user threads are
preferable. Consider the consumer/producer problem, where there is one
thread acting as a listener on an intermittent stream (for example user
input), and a worker thread collecting data from the listener thread
when it has time. The turn-around time for such coordination has to be
relatively short, otherwise we will probably find a rather frustrated
user. In the absence of shared memory, we incur yet more overhead in
getting stuff from the listener to the worker. The context switch that
would be necessary in using OS threads would slow down the application
substantially.

Another problem with user threads is that if one thread in the user
process blocks on an OS call, the entire process blocks. That sometimes
is unavoidable and unacceptable. In such a situation, we again turn to
OS threads, so that the entire process cannot be blocked from just one
call.

I've heard rumblings about lightweight OS threads that don't require a
full context switch and some such, but I don't know much about this at
all. Yes, they would be more efficient, but still not as efficient as
user threads. Simply because the process knows better than the OS what
its scheduling requirements are.

Comments welcome.

Sunil

In article <38ECC298...@ncgr.org>,

--
Sunil Mishra ** www.everest.com
All messages sent to the above email account shall go unread.

Tim Bradshaw

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
* Joe Marshall wrote:
> The Demers-Boehm conservative GC can support the use of multiple
> threads, but they cause some difficulty. In order to keep allocation
> serialized, they have a lock that must be seized on each allocation.
> This is a lot of overhead. When it is time to flip, all the threads
> must be synchronized (basically halted) so that the GC can act
> atomically. This turns out to be an issue on Windows NT because there
> is no system call that says `don't allow any other threads in this
> process to run'. You have to enumerate each thread (and there is no
> call for *that*, either), then send them a suspend message, then wait
> for them all to suspend, before you can do a gc-flip.

I'm not sure if the Boehm GC is single-threaded in the sense that the
GC needs to halt all threads during the GC or just for a short time
while it fiddles some parameters. However, multithreaded GCs are
something that are going to have to be faced if you want to run on
reasonable-sized multiprocessors, or you just crash into Amdahl's law
(if GC and other serial memory management is 10%, then a 10 processor
machine is not a very big limit to scaling any more: large commercial
machines already seem to be between 6 and ?20? times that).

I realise these things are *very* hard to do, and probably have
significant overhead.

--tim

Dave Bakhash

unread,
Apr 6, 2000, 3:00:00 AM4/6/00
to
I agree with many of Joe's arguments, but this following one, I think,
was a misunderstanding:

Joe Marshall <jmar...@alum.mit.edu> writes:

> > The reason for this has a lot to do with the subtle differences
> > between OSs.
>
> An ideal reason to use lisp-level threads: it is more likely to
> present a uniform interface.

But I wasn't arguing about the interface that the Lisp programmer
sees. Let it remain to be the Genera-style threads or whatever. What
I'm talking about should absolutely NOT affect the Lisp interface from
the perspective of the programmer; just for the implementor.

as a user, I just want a fast Lisp that doesn't suck up too much
memory.

dave

Andrew Cooke

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
In article <38ECC15D...@ncgr.org>,

David Hanley <d...@ncgr.org> wrote:
> On a tangent, there were bugs in java's stop() of a thread, so it is
> deprecated, and will be removed. They say a thread should
> programmatically test a stop flag periodically if someone else
> may need to kill it. This is a *really horrible* idea. Flag checks
> take up time, potentially a lot, and what if you're calling
> long-running library code? Geez.

Not sure this was a bug as much as a realisation that it could really
mess up state - if threads only stop where they check flags, they can
exit cleanly (I may be wrong, but I'm sure Doug Lea's book will have a
well reasoned explanation....)

Not a problem with functional code, I guess. Andrew

Rob Warnock

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
Tim Bradshaw <t...@cley.com> wrote:
+---------------
| However, multithreaded GCs are something that are going to have to be
| faced if you want to run on reasonable-sized multiprocessors, or you just
| crash into Amdahl's law (if GC and other serial memory management is 10%,
| then a 10 processor machine is not a very big limit to scaling any more:
| large commercial machines already seem to be between 6 and ?20? times that).
+---------------

Indeed. SGI routinely ships 64-P & 128-P machines, and have even shipped
a few 256-P & 512-P machines. These are "single-system image" (single copy
of the operating system) CC-NUMA (single cache-coherent memory space, though
access times to "local" and "other-node" memories are different) machines.

Of course, we have *lots* of compiler & library hooks for writing multi-
processing code in C, C++, and Fortran (especially the latter!), and
several flavors of "user" and "operating-system" threads to choose from
(SGI "sprocs", POSIX threads, copy-on-write fork(), etc.), depending on
what kind of sharing you want between the threads (e.g., total, partial,
or no sharing of address spaces, open files, etc.), and the usual primitives
for synchronization (spin-locks, semaphores, condition variables, etc.).

But I have long wondered whether there was a Common Lisp anywhere that
could do something useful with all those CPUs. Is there? Anybody know?

+---------------


| I realise these things are *very* hard to do, and probably have
| significant overhead.

+---------------

I'm sure. My guess is that at a minimum you'd want some sort of real-time
or at least incremental GC, so you could run the mutator on some CPUs while
running the collector on others, and for efficiency you'd almost certainly
need separate allocation pools per mutator thread, so you could do lock-free
allocation "most" of the time. Stuff like that...


-Rob

-----
Rob Warnock, 41L-955 rp...@sgi.com
Applied Networking http://reality.sgi.com/rpw3/
Silicon Graphics, Inc. Phone: 650-933-1673
1600 Amphitheatre Pkwy. PP-ASEL-IA
Mountain View, CA 94043

Tim Bradshaw

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
* not smishra wrote:

> OS threads have to be scheduled by the kernel. Each time such
> scheduling takes place, the previous thread's context has to be saved
> by the kernel, the next thread's context restored, and the thread
> started.

If there are no more active threads than processors then there will
essentially be no context switching.

Purely userland threads typically can not use more than one processor,
so on any machine with more than one you're compelled to use kernel
threads to at least some extent if you want to get any significant
benefit from the extra hardware.

--tim

William Deakin

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
David Hanley wrote:

> > This depends what you mean by OS threads. A lightweight thread implementation
> > which is supplied withing the OS but keeps the threads within a single process
> > memory and resources will not be affected by whether you have 1 or 87 cpus.
>
> True, with a sufficently stupid implementation. I draw a parallel to
> "lisp is slow" arguments, by pointing to croddy lisp implementations.

Clearly your glass is neither half-full nor half-empty but permanently full. At
which point I shall stop raining on your parade,

Cheers,

:) will


William Deakin

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
not_smishra wrote:

> I've heard rumblings about lightweight OS threads that don't require a
> full context switch and some such, but I don't know much about this at
> all.

There have been no `rumblings' here ;)

Ok then, please could you define what is meant by OS threads? Why can you
not have OS threads that only work within the user space? This may be the
source of my confusion and what I mean in my `rumblings' about lightweight
OS threads.

Best Regards,

:) will


not_s...@my-deja.com

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
In article <38ED9B63...@pindar.com>,

I'm defining OS threads to be those that are scheduled for execution by
the kernel. So to switch between two kernel threads requires an exit
from the user context into an OS context, then a switch to another user
context. User threads are scheduled for execution by some scheduler
residing in the user process, so all the user threads reside in one OS
thread/process. They do not require any such context switches.

Note that a user thread library may be provided by the OS. (pthreads, I
believe, is a good example on several unixes (unices?).) That does not
make them OS threads. The defining criterion is whether the kernel or
the user process takes responsibility for scheduling the threads.

Regards,

Sunil

--
Sunil Mishra ** www.everest.com
All messages sent to the above email account shall go unread.

ro...@corman.net

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
In article <8ck1rp$rfp$1...@nnrp1.deja.com>,

Andrew Cooke <and...@andrewcooke.free-online.co.uk> wrote:
> In article <38ECC15D...@ncgr.org>,
> David Hanley <d...@ncgr.org> wrote:
> > On a tangent, there were bugs in java's stop() of a thread, so it is
> > deprecated, and will be removed. They say a thead should
> > programmatically test a stop flag periodically if someone else
> > may need to kill it. This is a *really horrible* idea. Flag checks
> > take up time, potentially a lot, and what if you're calling
> long-running
> > library code? Geez.
>
> Not sure this was a bug as much as a realisation that it could really
> mess up state - if threads only stop where they check flags, they can
> exit cleanly (I may be wrong, but I'm sure Doug Lea's book will have a
> well reasoned explanation....)
>
I have to deal with Java threads on a daily basis, and believe I
understand exactly what the issues with stop() are.
Calling stop() from another thread arbitrarily halts the thread
wherever it happens to be. "Finally" code blocks (unwind-protect)
still get called, but any state information which is in
the process of being modified could cause an object to
be left unusable. This is a problem in Java libraries as well as
user code.

Fundamentally, the best way around this (for Java) is to make
sure that a thread stops itself, rather than being stopped by
another thread. The polled flag method is just an example. If you
do it intelligently i.e. sleep for a period of time, wake up,
poll, etc., or judiciously check a flag after various operations,
it works OK.
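The sleep-wake-poll idiom Roger describes can be sketched outside Java as well. Here is a minimal transposition to Python's threading module (the names `worker` and `stop_requested`, and the timings, are invented for illustration; this is a sketch of the idiom, not anyone's production code):

```python
import threading
import time

stop_requested = threading.Event()

def worker():
    # The thread stops itself: it checks the flag at points where
    # its own state is known to be consistent.
    while not stop_requested.is_set():
        # ... do one bounded unit of work ...
        time.sleep(0.01)        # sleep briefly, wake up, poll again
    # State is consistent here, so cleanup and exit are safe.

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
stop_requested.set()            # ask the thread to stop itself
t.join(timeout=2)
```

The cost Roger mentions is visible here: the flag check and the sleep are overhead paid on every iteration, and the thread only notices the request at its next check.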

However, it makes working with threads a real pain in Java.
I found working with threads much easier in C++, using the Win32
thread APIs, because they were much richer with things like
non-blocking IO calls (which Java doesn't have) and
WaitForMultipleObject. Another problem with stop(): if a thread
is blocked on IO (very common with reader threads, for example),
stop() will not break out of the IO call. Getting the thread to
stop in this instance is a big hassle. Various workarounds have
caused various browsers to crash.

I find threads to be good in a server, where they can be
managed nicely in a pool, started up, and run for a long time.
Using threads in an applet is not very good, because it is
meant to start quickly in response to a user URL click, and to
end just as quickly when the user clicks on another link or button.
Starting and stopping threads really messes this up. The big problem
is that many Java APIs pretty much require you use threads to
avoid the whole applet blocking. I think this is a main reason
why java applets are becoming less popular (not that they ever
were very popular). These days most of the interest in Java is
in servers.

Roger Corman

Colin Walters

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
>>>>> "William" == William Deakin <wi...@pindar.com> writes:

William> Ok then, please could you define what is meant by OS
William> threads. Why can you not have OS threads that only work
William> within the user space?

I believe that an OS thread is essentially the same thing as a UNIX
process, except it shares text and heap with the other threads. The
benefit of having your threads be the same thing as processes (like
they are with LinuxThreads) is that the kernel can schedule them to
(for example) run on different CPUs. This is something that can
really only be done from inside the kernel, as far as I know.

So generally, the distinction between user-level threads and
kernel-level threads is simply that the user-level threads are really
one single process, whereas the kernel-level threads are each a
separate process. (Although I hear operating systems like Solaris do
have "lightweight processes", which are somewhere in between.)

Now, knowing this, how could the kernel schedule multiple user-space
threads? After all, to the kernel, those user-space threads appear as
a single process.


Joe Marshall

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
not_s...@my-deja.com writes:

> Hey,
>
> I'm going to chime in, as half the things I have read here go against
> everything I have learned about threading and concurrent programming.

I'm curious as to which particular things you've read seem to be so
incorrect.

> The OS attempts to provide a generic level of functionality that works
> well across a variety of applications. Sometimes, the OS
> characteristics match the application well, and sometimes not. If not,
> we often find ourselves tweaking the OS/kernel to get better
> performance. That is true of threading as well.

Yep. And sometimes the OS neglects to provide functionality that is
critical to the application, in which case you are pretty much
screwed. In my experience, the OS services concerning multi-tasking,
multi-threading and synchronization are often missing crucial elements
or have buggy implementation.

> OS threads have to be scheduled by the kernel. Each time such
> scheduling takes place, the previous thread's context has to be saved
> by the kernel, the next thread's context restored, and the thread

> started. This makes OS threads rather crufty.

Not necessarily. The CPU usually has hardware designed to make
context switching fairly fast (but CPU designers have been known to
get it wrong as well: consider the 68000 and the 80286). The OS
usually has more context than the CPU needs, so there is bound to be
some extra overhead. An OS designed for multitasking would make a
strong effort to reduce the extra overhead.

> They turn out to be great
> for dividing work within a process where there isn't that much
> interaction between the parts. This is often the case, for example, in
> image processing and handling HTTP requests. An area of an image can be
> handed over to different OS threads, which can now potentially run on
> multiple CPU's. The same with different threads handling different HTTP
> requests. There is sometimes a large cost in collating results through
> IPC, but that's a one time cost. Overall, performance is much improved.

In a situation where you are running an application or a set of
applications that share very little information, there would be a lot
of context to switch anyway, so there is little noticeable penalty if
the processor is saving a lot of context.

> This also shows the problems with OS level threads. If the threads have
> to be closely synchronized when they run, then user threads are
> preferable. Consider the consumer/producer problem, where there is one
> thread acting as a listener on an intermittent stream (for example user
> input), and a worker thread collecting data from the listener thread
> when it has time. The turn-around time for such coordination has to be
> relatively short, otherwise we will probably find a rather frustrated
> user. In the absence of shared memory, we incur yet more overhead in
> getting stuff from the listener to the worker. The context switch that
> would be necessary in using OS threads would slow down the application
> substantially.

Certainly true (well, except for the case of listening for user input,
actually, the user is so slow compared to the computer that context
switches are hardly a problem). In a non-clever implementation,
there'd be a context switch from producer to kernel, a switch from
kernel to consumer, a switch back to the kernel, and finally a switch
back to the producer --- way too much.

> Another problem with user threads is that if one thread in the user
> process blocks on an OS call, the entire process blocks. That sometimes
> is unavoidable and unacceptable. In such a situation, we again turn to
> OS threads, so that the entire process cannot be blocked from just one
> call.

Well, if the user is clever, and the OS is helpful, you can ameliorate
this kind of problem. For instance, suppose all your user threads
were awaiting input from various file descriptors (under unix), you
could make a blocking call to SELECT with the union of all the
descriptors because nothing needs to run.
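That blocking-SELECT-over-the-union trick can be sketched concretely. The following Python fragment stands in for a user-level thread package whose threads are all blocked on file descriptors; the pipes are invented stand-ins for whatever descriptors the threads are waiting on (a sketch, assuming a POSIX system):

```python
import os
import select

# Three pipes standing in for the descriptors three user threads block on.
pipes = [os.pipe() for _ in range(3)]
read_fds = [r for r, w in pipes]

os.write(pipes[1][1], b"x")     # input arrives for the second "thread"

# The user-level scheduler makes ONE blocking select() over the union
# of all descriptors, since no thread can run until something arrives.
ready, _, _ = select.select(read_fds, [], [], 1.0)
# Only the descriptor with pending data comes back; the scheduler would
# now resume the user thread that was waiting on it.
```

The point is that the whole process blocks exactly once, in one system call, rather than each user thread making its own blocking read and stalling its siblings.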

> I've heard rumblings about lightweight OS threads that don't require a
> full context switch and some such, but I don't know much about this at

> all. Yes, they would be more efficient, but still not as efficient as
> user threads. Simply because the process knows better than the OS what
> its scheduling requirements are.

They could be competitive with user threads. Let's suppose you are
writing a user-level thread implementation. You have the option of
requiring the user threads to periodically yield control (polling or
co-operative multitasking), or you could ask the OS to periodically
deliver timer interrupts (pre-emptive multitasking).

Now each thread runs with its own stack, but the OS only gave you one
stack (some OS's allow you to put your stack wherever you want, others
give you a fixed chunk and you have to live with it). When you switch
threads, you have to switch stacks. If you have only one stack, you
have two options: partition your stack into stacklets (contiguous
regions each containing a thread stack), or replace the contents
of the stack with the contents of the next thread's stack. Neither
mechanism is appealing. The former is prone to wreaking havoc on
stacklet overflow because it bypasses any OS stack overflow
protection; the latter makes user context switches enormously
expensive.

A lightweight OS thread would be able to efficiently switch between
user threads by dumping the entire processor state on the current
stack, and reloading it with the state from the next stack.
Presumably, the OS has provided a different stack for each thread and
can maintain stack protection correctly for all of them.
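The scheduling half of a user-level thread package (as opposed to the stack-switching half, which needs OS or processor support as Joe describes) can be sketched with Python generators standing in for threads; each `yield` is a voluntary context-switch point, and the deque is the run queue. All names here are invented for illustration:

```python
from collections import deque

def thread_a(log):
    for i in range(3):
        log.append(("A", i))
        yield                   # voluntary context-switch point

def thread_b(log):
    for i in range(2):
        log.append(("B", i))
        yield

def run(threads):
    ready = deque(threads)      # the user-level run queue
    while ready:
        t = ready.popleft()
        try:
            next(t)             # run the thread until its next yield
            ready.append(t)     # still runnable: back of the queue
        except StopIteration:
            pass                # thread finished

log = []
run([thread_a(log), thread_b(log)])
# log now interleaves A and B in round-robin order
```

This is co-operative multitasking: the process knows its own scheduling requirements, and no kernel context switch ever happens. The hard parts Joe raises (pre-emption, real stacks, blocking I/O) are exactly what this toy sidesteps.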

--
~jrm

Joe Marshall

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
rp...@rigden.engr.sgi.com (Rob Warnock) writes:

>
> I'm sure. My guess is that at a minimum you'd want some sort of real-time
> or at least incremental GC, so you could run the mutator on some CPUs while
> running the collector on others, and for efficiency you'd almost certainly
> need separate allocation pools per mutator thread, so you could do lock-free
> allocation "most" of the time. Stuff like that...

You want a generational GC. Each processor would get its own youngest
generation and be able to allocate from it without interlocking.
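The lock-free per-thread youngest generation amounts to bump-pointer allocation in a private pool. A minimal sketch (the `Nursery` class and its sizes are invented; real nurseries hand out raw memory, not integer offsets):

```python
class Nursery:
    """Per-thread/per-CPU youngest generation: bump-pointer allocation,
    no interlocking, because no other thread ever allocates here."""

    def __init__(self, size):
        self.size = size
        self.top = 0            # next free offset in the nursery

    def alloc(self, nbytes):
        if self.top + nbytes > self.size:
            return None         # nursery full: time for a minor GC
        addr = self.top
        self.top += nbytes      # just a bump, no lock taken
        return addr

# One nursery per mutator thread; allocation never synchronizes.
n = Nursery(1024)
a = n.alloc(16)
b = n.alloc(32)
```

Synchronization is then only needed when a nursery fills and objects get promoted into the shared older generations, which is where the locking questions in this thread actually bite.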

William Deakin

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
Thank you for your description of user and kernel/OS threads.

Colin Walters wrote:

> Now, knowing this, how could the kernel schedule multiple user-space
> threads?

OK then, if the kernel is not scheduling multiple user-space threads, what
is? What I think happens is: the OS provides scheduling in the user space.
However, as the kernel runs this scheduling code, it must in some
sense `schedule multiple user-space threads.' Although I agree it must use
a different mechanism from that used for other operations such as
process scheduling or io buffering or whatever.

> After all, to the kernel, those user-space threads appear as a single
> process.

So to the kernel, what does user-space thread scheduling code appear like?
I think making arbitrary statements about how things do or don't appear to
the kernel doesn't really help.

Cheers,

:) will

ps:
Dear Bored Reader,
I am aware that this is getting dull and is verging on `counting angels on
the heads of pins' territory and so I will bow out of this scrap before
anybody starts quoting koans at me ;)

Tim Bradshaw

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
* William Deakin wrote:
> OK then, if the kernel is not scheduling multiple user-space threads, what
> is? What I think happens is: the OS provides scheduling in the user space.
> However, as the kernel runs this code scheduling code, it must in some
> sense `schedule multiple user-space threads.' Although I agree it must use
> a different mechanism from that used to schedule other operations such as
> process scheduling or io buffering or whatever.

No. All you need is timed interrupts and you can schedule your own
threads. You simply arrange life so that you get an interrupt
(signal) every nth of a second, and have the signal handler be the
scheduler, which schedules the threads appropriately. The kernel's
only involvement in things is to keep sending you signals.
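Tim's "kernel just keeps sending you signals" arrangement can be sketched with an interval timer; the handler is where a user-level scheduler would pick the next thread and switch to it (here it only records the tick). A sketch, assuming a Unix system where `setitimer` is available and signals are delivered to the main thread:

```python
import signal
import time

ticks = []

def scheduler(signum, frame):
    # In a real user-level thread package this handler would choose
    # the next thread and switch contexts; here it just records ticks.
    ticks.append(time.monotonic())

signal.signal(signal.SIGALRM, scheduler)
signal.setitimer(signal.ITIMER_REAL, 0.02, 0.02)   # a tick every 20 ms

t0 = time.time()
while time.time() - t0 < 0.2:
    pass                        # the "current thread" runs undisturbed

signal.setitimer(signal.ITIMER_REAL, 0, 0)         # cancel the timer
```

The kernel's only job is delivering the periodic signal; everything else (run queues, switching) lives in user space, which is exactly why blocking I/O is the weak spot Tim mentions next.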

The classic problem with this scheme is I/O, since it's traditionally
been hard to make it so that if one of your threads blocks on I/O then
others can still run. I think this kind of thing is addressed better
in recent Unixes.

--tim

Tim Moore

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
On 7 Apr 2000, Tim Bradshaw wrote:

> * William Deakin wrote:
> > OK then, if the kernel is not scheduling multiple user-space threads, what
> > is? What I think happens is: the OS provides scheduling in the user space.
>

> No. All you need is timed interrupts and you can schedule your own
> threads. You simply arrange life so that you get an interrupt
> (signal) every nth of a second, and have the signal handler be the
> scheduler, which schedules the threads appropriately. The kernel's
> only involvement in things is to keep sending you signals.
>
> The classic problem with this scheme is I/O, since it's traditionally
> been hard to make it so that if one of your threads blocks on I/O then
> others can still run. I think this kind of thing is addressed better
> in recent Unixes.

Yeah. All the libc interfaces to I/O system calls that can block are
mapped into something that is either non-blocking or that can block on
multiple events like select() or poll(). It's pretty hairy.

On user vs. kernel threads: Unixes that support kernel threads generally
also support user threads, mapping many user threads onto a single kernel
thread. This way you get the benefit of kernel scheduling and
multiprocessors without the overhead of kernel data structures and
scheduling for every thread.

Tim

David Hanley

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to

Joe Marshall wrote:

> David Hanley <d...@ncgr.org> writes:
>
> The OS wouldn't necessarily be doing this sort of polling (if it were
> implemented cleverly). Suppose you were awaiting some sort of input
> from the serial port.

Yes, but the example was not waiting for data from the serial port.
The example was waiting for a flag to change. I know full well you
don't have to poll ( in your program, it's happening in hardware ) to get
data from a socket.


> In the shower this morning I thought of another reason one might
> prefer non-OS threads. OS threads are usually pre-emptive -- they can
> occur between any pair of instructions. When implementing Lisp in the
> presence of OS threads, you have to be very careful that the state of
> the processor and the stack are well behaved at *all* times. If they
> aren't, the GC could run at an inopportune time and get confused by
> untagged values and such.

Yeah.. This is an issue. I don't pretend to have all the answers to this,
but it's obviously possible, as there are systems which do it.


> The Demers-Boehm conservative GC can support the use of multiple
> threads, but they cause some difficulty. In order to keep allocation
> serialized, they have a lock that must be siezed on each allocation.
> This is a lot of overhead.

Yep. It seems that some of this, however, is due to the fact that
objects are not allowed to be moved. If you can move them, you can make
clear pages, and allocations can happen in them in a stack-like
fashion. If each thread gets a few free pages ( no big deal ) then
there needs to be no synchronization lock.


> [problems on NT]

Yes, this does sound like a huge pain.

> The Lisp Machine, on the other hand, implemented threads at the Lisp
> level. The lisp machine microcode was single threaded. Synchronizing

>
> with the GC for flips or allocation wasn't necessary.

Yeah, but I don't know if that's really relevant or worth discussing.
You could ( sort of ) simulate threads in lisp, but the expense would
be hideous on normal architectures.

dave


David Hanley

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to

Andrew Cooke wrote:

> In article <38ECC15D...@ncgr.org>,
> David Hanley <d...@ncgr.org> wrote:
> > On a tangent, there were bugs in java's stop() of a thread, so it is
> > deprecated, and will be removed. They say a thead should
> > programmatically test a stop flag periodically if someone else
> > may need to kill it. This is a *really horrible* idea. Flag checks
> > take up time, potentially a lot, and what if you're calling
> long-running
> > library code? Geez.
>
> Not sure this was a bug as much as a realisation that it could really
> mess up state - if threads only stop where they check flags, they can
> exit cleanly (I may be wrong, but I'm sure Doug Lea's book will have a
> well reasoned explanation....)

Yes, it can mess up state, but I don't think that's an excuse. It is
possible to code it so it won't mess up state ( after all, posix threads
have stop() ) but instead of fixing it ( even if it's a lot of work ) they
just took away this important functionality. I seem to recall this being
referred to as the New Jersey approach.

dave


Joe Marshall

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
David Hanley <d...@ncgr.org> writes:

No, the New Jersey approach would be to leave it in and document it in
the BUGS section. Conforming code would be required to avoid calling
stop() when the target thread was in an unsafe state.


Joe Marshall

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
Tim Bradshaw <t...@cley.com> writes:

> No. All you need is timed interrupts and you can schedule your own
> threads. You simply arrange life so that you get an interrupt
> (signal) every nth of a second, and have the signal handler be the
> scheduler, which schedules the threads appropriately. The kernel's
> only involvement in things is to keep sending you signals.

Easier said than done. If you are trying to do pre-emptive
scheduling, exactly how do you switch to a different thread? If you
simply jump to the context of the next thread (without dismissing the
interrupt), you can confuse the kernel (I have no idea why the kernel
would care if you did a return-from-interrupt, but I've seen kernel
panics caused by `too deep recursion in the signal handler'). If you
*do* dismiss the interrupt, the kernel restarts your code right where
the interrupt took place, so you don't accomplish anything.

You *could* attempt to parse out the processor context that is dumped
on the stack when the timer goes off, and bash out the saved stack
pointer and the saved return pc. If you do you'd better hope that the
interrupt context was actually saved on the user stack, and that the
OS doesn't double check it when you return from interrupts.

--
~jrm

David Hanley

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to

ro...@corman.net wrote:

> In article <8ck1rp$rfp$1...@nnrp1.deja.com>,


> Andrew Cooke <and...@andrewcooke.free-online.co.uk> wrote:
> > In article <38ECC15D...@ncgr.org>,
> > David Hanley <d...@ncgr.org> wrote:
> > > On a tangent, there were bugs in java's stop() of a thread, so it is
> > > deprecated, and will be removed. They say a thead should
> > > programmatically test a stop flag periodically if someone else
> > > may need to kill it. This is a *really horrible* idea. Flag checks
> > > take up time, potentially a lot, and what if you're calling
> > long-running
> > > library code? Geez.
> >
> > Not sure this was a bug as much as a realisation that it could really
> > mess up state - if threads only stop where they check flags, they can
> > exit cleanly (I may be wrong, but I'm sure Doug Lea's book will have a
> > well reasoned explanation....)
> >

> I have to deal with Java threads on a daily basis, and believe I
> understand exactly what the issues with stop() are.
> Calling stop() from another thread arbitrarily halts the thread
> wherever it happens to be. "Finally" code blocks (unwind-protect)
> still get called, but any state information which is in
> the process of being modified could cause an object to
> be left unusable. This is a problem in Java libraries as well as
> user code.

Yes, an object could be left funky, but it seems to me that accepting
this is better than taking out something that might actually be necessary.
Now, if there's a security issue, it seems more justified, but I can't
think of a case where it's a real security problem that could not be
fixed. For example, you might stop a thread constructing a security
manager, and get a funky one, but the stop() method could refuse to
stop threads until they leave the security package. You shouldn't get
totally invalid objects, since pointer assignments are still atomic.


> Fundamentally, the best way around this (for Java) is to make
> sure that a thread stops itself, rather than being stopped by
> another thread. The polled flag method is just an example. If you
> do it intelligently i.e. sleep for a period of time, wake up,
> poll, etc., or judiciously check a flag after various operations,
> it works OK.

I don't agree with this at all. There are a couple of large issues:

1) You might call code which doesn't do this testing and will run
for a long time. For example, certain maths code.

2) All your design becomes obfuscated by this thread stopping,
which gets shot through the architecture. Programming is hard
enough without having to work out what might need to be stopped,
how, why, and when.

3) It's not so easy to design where to put stops. As a very stupid example:

int sumVals( Vector v )
{
    int sum = 0;
    Enumeration e = v.elements();
    while ( e.hasMoreElements() )
    {
        sum += ((InterfaceWithValueMethod)e.nextElement()).value();
    }
    return sum;
}

Now, if your flag test goes in the sumVals header, the loop might take
a very long time ( maybe the call to value() is expensive ) before
noticing the thread-stopping flag has been tripped. Inside the loop,
it might become too expensive, particularly if you try to figure out
a clever way to use sumVals as library code. In fact, I'm not sure I
can think of a very elegant way of doing this. What happens to
reusability in this case?

dave


Will Deakin

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
Tim wrote:
>The kernel's only involvement in things is to keep sending you signals.
I could go all sophist on you and say `ah but yes the kernel has some (albeit
minimal) involvement.' But this would be stupid and annoying, so I won't ;)
Instead I'll say urghh. That is rather ugly.

>
>The classic problem with this scheme is I/O, since it's traditionally
>been hard to make it so that if one of your threads blocks on I/O then
>others can still run.
Ouch.

Cheers,

:) will

Will Deakin

unread,
Apr 7, 2000, 3:00:00 AM4/7/00
to
In which case I agree with what you have said. My confusion is one of
definition.

Cheers,

:)will

Rob Warnock

unread,
Apr 8, 2000, 3:00:00 AM4/8/00
to
Joe Marshall <jmar...@alum.mit.edu> wrote:
+---------------

| No, the new jersy approach would be to leave it in and document it in
| the BUGS section. Conforming code would be required to avoid calling
| stop() when the target thread was in an unsafe state.
+---------------

Would that require all threads to link to the "omniscience" module?
Or would (use :clairvoyance) be adequate?

Rob Warnock

unread,
Apr 8, 2000, 3:00:00 AM4/8/00
to
Joe Marshall <jmar...@alum.mit.edu> wrote:

Yes, I was already assuming a generational GC, certainly with a per-thread
(or per-CPU) "nursery", but perhaps even with one or two generations beyond
that being allocated per-thread/CPU. But maybe you might want some sort of
incremental collection that can run entirely in parallel with the per-thread
mutators so you don't have to lock the whole bloody mess to do a higher-
generation collection. And, gee, wouldn't it be nice if you could have
more than one of those collector threads or CPUs, too, so you could adjust
the balance between mutators (the ones that are doing the "work") and
collectors depending on the number of CPUs you had available and the kind
of problem you were solving?

If you have "enough" CPUs to throw at the problem, you could afford to
spend quite a bit of overhead in fine-grained locking (say, as might be
added by introducing read barriers), as long as by doing so you could
avoid almost entirely any *global* locking.

That's the kind of evolution we saw on SGI's "Irix" kernel as we tuned it
for larger & larger configurations. Yes, changing all the classical Unix
global locks to fine-grained locks (e.g., in the protocol stacks, replacing
global "splnet()"s with spinlocks or semaphores on individual PCBs, devices,
etc.) did in some cases hurt single-CPU performance a little. But it also
allowed the operating system to scale well to *much* larger numbers of CPUs
(as mentioned before, up to 512 so far).

That's why I was wondering if anyone had done that sort of thing for
Common Lisp yet...

Tim Bradshaw

unread,
Apr 8, 2000, 3:00:00 AM4/8/00
to
* Will Deakin wrote:
> Instead I'll say urghh. That is rather ugly.

I don't see why it's ugly. Why *should* the kernel get involved?

--tim

Rahul Jain

unread,
Apr 8, 2000, 3:00:00 AM4/8/00
to
In article <ey3itxs...@cley.com> posted on Saturday, April 8, 2000

So that we can use more than one CPU, for one

--
-> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <-
-> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <-
-> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <-
-> -/- http://photino.sid.rice.edu/ -=- mailto:rahul...@usa.net -\- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
Version 11.423.999.210020101.23.50110101.042
(c)1996-2000, All rights reserved. Disclaimer available upon request.


Tim Bradshaw

unread,
Apr 8, 2000, 3:00:00 AM4/8/00
to
* Rahul Jain wrote:

> So that we can use more than one CPU, for one

yes, I know that (in fact I mentioned just this earlier on). I meant
that, other things being equal, I don't see anything wrong with a
purely user-level threading system.

--tim

Joe Marshall

unread,
Apr 9, 2000, 3:00:00 AM4/9/00
to
rp...@rigden.engr.sgi.com (Rob Warnock) writes:

> Joe Marshall <jmar...@alum.mit.edu> wrote:
> +---------------

> | No, the new jersy approach would be to leave it in and document it in
> | the BUGS section. Conforming code would be required to avoid calling
> | stop() when the target thread was in an unsafe state.
> +---------------
>
> Would that require all threads to link to the "omniscience" module?
> Or would (use :clairvoyance) be adequate?

import java.lang.Dwim;

Fernando D. Mato Mira

unread,
Apr 9, 2000, 3:00:00 AM4/9/00
to
Tim Bradshaw wrote:

1 usec jitter @ 5KHz. That's the kind of specs I've had to work with
lately

--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1 email: matomira AT acm DOT org
CH-2007 Neuchatel tel: +41 (32) 720-5157
Switzerland FAX: +41 (32) 720-5720

www.csem.ch www.vrai.com ligwww.epfl.ch/matomira.html


Marc Battyani

unread,
Apr 9, 2000, 3:00:00 AM4/9/00
to

Fernando D. Mato Mira <mato...@acm.org> wrote in message
news:38F0AC93...@acm.org...

> 1 usec jitter @ 5KHz. That's the kind of specs I've had to work with
> lately

In LISP or in LISP + some other language.
When I have such hard requirements I generally use LISP + VHDL.

Marc Battyani

Marc Battyani

unread,
Apr 9, 2000, 3:00:00 AM4/9/00
to
Marc Battyani <Marc_B...@csi.com> wrote in message
news:F569DFA344693086.FBD2D0E6...@lp.airnews.net...

Hum, I forgot the #\Question-Mark and my question is not very good.
Let's make it more clear:

Have you done this in LISP, in LISP + some other language, or not in lisp?

Marc Battyani.

Tim Bradshaw

unread,
Apr 9, 2000, 3:00:00 AM4/9/00
to
* Fernando D Mato Mira wrote:

> 1 usec jitter @ 5KHz. That's the kind of specs I've had to work with
> lately

Are you trying to imply that conventional *nix / NT kernel threads can
give you this but user ones can't, or just trying to look clever and
cryptic?

If the former then I see no reason why you can't achieve jitter as low
as you like if the kernel can send you interrupts accurately enough.

--tim

Will Deakin

unread,
Apr 9, 2000, 3:00:00 AM4/9/00
to
Sorry, I should be clearer about what I post.

Tim wrote:
>I don't see why it's ugly. Why *should* the kernel get involved?

It is not the kernel-vs-no-kernel aesthetic that I find ugly. It is the
use of timed interrupts to provide scheduling that jars my sensibilities. For
what that is worth.

Now, although I've learnt some stuff, I have made my quota of dumb remarks
about threads, and to save compounding my ignorance and stupidity with arrogance
and insult, I will stop.

Best Regards,

Will Deakin

Tim Bradshaw

unread,
Apr 10, 2000, 3:00:00 AM4/10/00
to
* Will Deakin wrote:
>> I don't see why it's ugly. Why *should* the kernel get involved?
> The aesthetic of kernel vs kernel absence is not what I think is ugly. It is the
> use of timed interrupts to provide scheduling that jars my sensibilities. For
> what that is worth.

But remember that's how all (preemptive) schedulers work -- something
kicks the scheduler periodically and it wakes up, takes control of the
machine, and decides who gets to run next. Of course there are other
events that wake the scheduler, like the current task blocking for
something, but ultimately something needs to wake it once in a while.
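To make that concrete, here is a minimal sketch of a tick-driven round-robin
scheduler in Java (all names are mine, and the "tick" is just a loop counter
standing in for the timer interrupt):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Toy round-robin scheduler: a periodic "tick" preempts the current
// task and hands the CPU to the next one in the run queue.
public class TickScheduler {
    private final ArrayDeque<String> runQueue = new ArrayDeque<>();
    private final List<String> trace = new ArrayList<>();

    public void addTask(String name) { runQueue.addLast(name); }

    // Run for `ticks` timer interrupts; on each tick the running
    // task is preempted and moved to the back of the queue.
    public List<String> run(int ticks) {
        for (int t = 0; t < ticks && !runQueue.isEmpty(); t++) {
            String current = runQueue.pollFirst(); // task gets the CPU
            trace.add(current);                    // ...runs for one quantum
            runQueue.addLast(current);             // tick fires: preempt, requeue
        }
        return trace;
    }

    public static void main(String[] args) {
        TickScheduler s = new TickScheduler();
        s.addTask("A"); s.addTask("B"); s.addTask("C");
        System.out.println(s.run(6)); // [A, B, C, A, B, C]
    }
}
```

A real scheduler would of course also wake on blocking events, as the post
says; this only models the periodic kick.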

--tim


ro...@corman.net

unread,
Apr 10, 2000, 3:00:00 AM4/10/00
to
On Fri, 07 Apr 2000 13:59:58 -0600, David Hanley <d...@ncgr.org> wrote:

>Yes, an object could be left funky, but it seems to me that accepting
>this is better than taking out something that might actually be necessary.
>Now, if there's a security issue, it seems more justified, but I can't
>think of a case where it's a real security problem that could not be
>fixed. For example, you might stop a thread constructing a security
>manager, and get a funky one, but the stop() method could refuse to
>stop threads until they leave the security package. You shouldn't get
>totally invalid objects, since pointer assignments are still atomic.

I have not heard that stop() was being removed--only that it is now
deprecated. As far as I know all Java compilers continue to support
it, and no doubt will for the foreseeable future (for backward
compatibility if nothing else). However, if you use it you are asking
for trouble unless you are very careful and avoid calling library
functions in your threads.
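For the record, the replacement Sun recommends for stop() is cooperative:
the worker polls a flag (or its interrupt status) and exits only at points
where its data structures are known to be consistent. A minimal sketch
(class and method names are mine):

```java
// Cooperative alternative to Thread.stop(): the worker checks a
// volatile flag once per iteration and only exits at that safe point.
public class CooperativeWorker implements Runnable {
    private volatile boolean stopRequested = false;
    private long iterations = 0;

    public void requestStop() { stopRequested = true; }
    public long iterations()  { return iterations; }

    @Override
    public void run() {
        while (!stopRequested) {  // safe point: state is consistent here
            iterations++;         // the "work"
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CooperativeWorker w = new CooperativeWorker();
        Thread t = new Thread(w);
        t.start();
        Thread.sleep(50);   // let it run a little
        w.requestStop();    // ask politely instead of calling stop()
        t.join(1000);
        System.out.println("stopped after " + w.iterations() + " iterations");
    }
}
```

This is exactly the hassle described above: the cooperation has to be
threaded through your own code, and library code you call may not play along.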
>

>I don't agree with this at all. There are a couple of large issues:
>
>1) You might call code which doesn't do this testing and will run
>for a long time. For example, certain math code.
>
>2) All your design becomes obfuscated by this thread-stopping,
>which gets shot through the architecture. Programming is hard
>enough without having to work out what might need to be stopped,
>how, why, and when.

I don't think we really disagree on anything. When I said it works OK
I meant that a cooperative thread start/stop model can be made to work
if you want to deal with the hassle, and you don't call library code
which doesn't cooperate. I certainly didn't mean to imply that it
wasn't a big pain (it is) or that I like it (I don't).

Java is one of the first mainstream languages to support threads, and
a lot is always made of this among Java supporters. In my opinion its
support for them is very poor, yet the design of the APIs basically
requires you to use them. I certainly would not brag about Java's
thread capabilities if I were a proponent of Java.

I am not cross-posting this to the java group simply because I don't
care to get into an argument with java fanatics. There are plenty of
nice things about java (compared to C++). I actually like C++ also
(oh my god, I hope I haven't blasphemed!) but I will try to save
myself by saying I find Common Lisp superior to both Java and C++.

Roger Corman


Fernando D. Mato Mira

unread,
Apr 10, 2000, 3:00:00 AM4/10/00
to
Marc Battyani wrote:

> Have you done this in LISP, in LISP + some other language, or not in lisp?

C on VxWorks. And 5KHz is as fast as I can go (at least on a specific
300MHz PPC603-based SBC with a classic Zilog 8536 I/O chip + interrupt ASIC).

[Should really be using some tiny open-source kernel, not VxWorks, BTW]

Joe Marshall

unread,
Apr 10, 2000, 3:00:00 AM4/10/00
to
ro...@corman.net writes:

> I actually like C++ also (oh my god, I hope I haven't blasphemed!)
> but I will try to save myself by saying I find Common Lisp superior
> to both Java and C++.

Infidel! For your penance, you must implement CommonLisp on Windows
using COM and ActiveX! (oh, you already did that....)


Joerg-Cyril Hoehle

unread,
Apr 14, 2000, 3:00:00 AM4/14/00
to
Hi,

Joe Marshall <jmar...@alum.mit.edu> writes:
> No, the New Jersey approach would be to leave it in and document it in
> the BUGS section. Conforming code would be required to avoid calling
> stop() when the target thread was in an unsafe state.

Have you thought about what CLtL2 and the CLHS say about traversing
hash tables and modifying entries other than the current one? Using
MAPHASH one can call an arbitrary function, which can itself call some
other function, which could then in turn directly or indirectly access
and modify that same hash table (at some other place).

Can you guarantee that for every application of a given complexity,
there are no nested accesses like this? Speaking about iteration and
state :-(

Regards,
Jorg Hohle
Telekom Research Center -- SW-Reliability

Thomas A. Russ

unread,
Apr 14, 2000, 3:00:00 AM4/14/00
to
hoehle...@tzd.dont.telekom.spam.de.me (Joerg-Cyril Hoehle) writes:

> Can you guarantee that for every application of a given complexity,
> there are no nested accesses like this? Speaking about iteration and
> state :-(

No. That's why we will often do a two pass operation where the first
pass constructs a list of the elements of the hashtable, and the second
pass iterates through the list, performing operations that mutate the
hashtable.
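The same two-pass trick translates directly to Java, whose fail-fast
HashMap iterators turn the nested-modification problem that MAPHASH leaves
undefined into a ConcurrentModificationException (example names are mine):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPass {
    // Pass 1: snapshot the keys; pass 2: mutate the table freely.
    // Mutating `table` structurally inside a for-each over it would
    // make Java's fail-fast iterator throw ConcurrentModificationException
    // (with MAPHASH, the consequences are simply undefined).
    public static Map<String, Integer> doubleAndPrune(Map<String, Integer> table) {
        List<String> keys = new ArrayList<>(table.keySet()); // pass 1: snapshot
        for (String k : keys) {                              // pass 2: mutate
            int v = table.get(k);
            if (v == 0) table.remove(k);    // delete some entries...
            else table.put(k + "'", v * 2); // ...and insert new ones
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, Integer> t = new HashMap<>();
        t.put("a", 1); t.put("b", 0);
        System.out.println(doubleAndPrune(t));
    }
}
```

The snapshot costs a list of keys, but it decouples the iteration from the
mutation completely, which is the point of the two passes.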

--
Thomas A. Russ, USC/Information Sciences Institute t...@isi.edu

Flemming Gram Christensen

unread,
Apr 16, 2000, 3:00:00 AM4/16/00
to
Joe Marshall <jmar...@alum.mit.edu> writes:

> rp...@rigden.engr.sgi.com (Rob Warnock) writes:
>
> >
> > I'm sure. My guess is that at a minimum you'd want some sort of real-time
> > or at least incremental GC, so you could run the mutator on some CPUs while
> > running the collector on others, and for efficiency you'd almost certainly
> > need separate allocation pools per mutator thread, so you could do lock-free
> > allocation "most" of the time. Stuff like that...
>
> You want a generational GC. Each processor would get its own youngest
> generation and be able to allocate from it without interlocking.

What about references between the heaps? You need some way to find the
references in one processor's heap in order to GC the other. You cannot
scan the stack unless the thread is in a "GC-safe" state.
Because of this, many Java implementations wait for all threads to reach
such a state before doing GC.

Potential calls to external code (like C routines) increase the difficulty.
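For what it's worth, the lock-free per-processor allocation mentioned in
the quote above can be sketched like this (a toy version of what JVMs call
thread-local allocation buffers; all names are mine, and "addresses" are
just offsets into a simulated heap):

```java
import java.util.concurrent.atomic.AtomicLong;

// Each thread grabs a chunk of the shared heap with one atomic
// operation, then bump-allocates inside that chunk with no locking
// at all on the common path.
public class PerThreadNursery {
    private static final long CHUNK = 4096;
    private static final AtomicLong sharedHeapTop = new AtomicLong(0);

    // Per-thread {next, limit} bump pointer into the current chunk.
    private static final ThreadLocal<long[]> tlab =
        ThreadLocal.withInitial(() -> new long[] {0, 0});

    // Returns the "address" of `size` freshly allocated bytes.
    public static long allocate(long size) {
        long[] t = tlab.get();
        if (t[0] + size > t[1]) {                  // chunk exhausted:
            t[0] = sharedHeapTop.getAndAdd(CHUNK); // one interlocked op,
            t[1] = t[0] + CHUNK;                   // then back to...
        }
        long addr = t[0];
        t[0] += size;                              // ...lock-free bumps
        return addr;
    }

    public static void main(String[] args) {
        long a = allocate(16), b = allocate(16);
        System.out.println(b - a); // consecutive bumps in the same chunk
    }
}
```

The cross-heap reference problem in the post still stands, of course: this
only makes allocation cheap, not collection.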

Flemming Gram Christensen


eli

unread,
Apr 17, 2000, 3:00:00 AM4/17/00
to
Hi,
We are looking for Java gurus to join our
warm and challenging start-up, which has a successful product and
branches in Israel and in Michigan, doing automated
data service provisioning.

Any "Commando Java Person" who is interested in finding out more, please send
a CV to eli...@harmonycom.com.

Eli

Joe Marshall

unread,
Apr 17, 2000, 3:00:00 AM4/17/00
to

Yes. These are all problems.

I was just pointing out to Rob Warnock that he probably wanted to
ensure his GC was generational before investigating real-time or
incremental GC.

Synchronization among multiple threads/processors is a big problem
with GC. I wonder if a replicating GC would have attractive
performance characteristics.

--
~jrm

Flemming Gram Christensen

unread,
Apr 17, 2000, 3:00:00 AM4/17/00
to
Joe Marshall <jmar...@alum.mit.edu> writes:

I am involved in the maintenance of the BETA programming language.
The garbage collector problems are one of the reasons why we do not
use system threads. Another reason is that we would like to support
both programs with one or two threads and simulation programs with,
e.g., 500 threads. You then need to be able to support user-defined
schedulers. You cannot expect the OS to perform well in both cases.

What do you mean replicating GC?

/gram

> --
> ~jrm


Joe Marshall

unread,
Apr 17, 2000, 3:00:00 AM4/17/00
to
Flemming Gram Christensen <gr...@ull.mjolner.dk> writes:

>
> What do you mean replicating GC?
>

I first heard about them from James O'Toole, but I don't know if it
was him or someone else who first thought them up. The idea is this:
the garbage collector, instead of modifying the current heap, makes an
identical, compact copy of the heap instead. The mutator continues to
run in the old heap. When the replicator has made an identical copy
of the heap, the new heap is swapped for the old and the old is
discarded.

There is a write barrier of sorts. The mutator may at times modify
data that the replicator has already copied, thus invalidating the
replicated copy. The way to get around this is to write a `log' of
the write operations. The replicator uses the log to fix-up the copy
it has of the heap.
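A toy, single-threaded model of that log-replay scheme (all names are mine,
and a real collector would trace an object graph rather than copy a map):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Replicating GC in miniature: the mutator keeps writing to the old
// heap while the replicator copies it; every write also goes to a
// log, which the replicator replays to fix up its copy before the flip.
public class ReplicatingGc {
    static Map<String, Integer> heap = new HashMap<>();           // "from" heap
    static List<Map.Entry<String, Integer>> writeLog = new ArrayList<>();

    // Mutator writes go to the current heap AND the log.
    static void mutatorWrite(String slot, int value) {
        heap.put(slot, value);
        writeLog.add(Map.entry(slot, value));
    }

    static void collect() {
        Map<String, Integer> replica = new HashMap<>(heap); // replication pass
        // In a real system the mutator runs concurrently; here we just
        // interleave one write that lands after the copy was taken.
        mutatorWrite("x", 42);
        for (Map.Entry<String, Integer> w : writeLog)       // replay the log
            replica.put(w.getKey(), w.getValue());
        writeLog.clear();
        heap = replica;                                     // the flip
    }

    public static void main(String[] args) {
        mutatorWrite("x", 1);
        mutatorWrite("y", 2);
        collect();
        System.out.println(heap); // the replica, with the late write applied
    }
}
```

The only synchronization point is the flip, which is the loose coupling the
post describes: the replica is allowed to be stale until the log catches it up.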

This method is interesting for use with databases because the
mutator process is *already* generating log entries for transaction
consistency, so there is no extra overhead for the GC. Another reason
it is interesting is that the coupling between the GC and the
mutator processes is very loose. The GC is allowed to replicate stale
data provided it can eventually ascertain that its copy of the world
is functionally equivalent to the original. You only have to
synchronize the processes at flip time.

It seems to me that you might be able to exploit this loose coupling
on a multi-processor system. But I don't have any specific ideas
about it.

--
~jrm

Pekka P. Pirinen

unread,
Apr 18, 2000, 3:00:00 AM4/18/00
to
Joe Marshall <jmar...@alum.mit.edu> writes:
> Flemming Gram Christensen <gr...@ull.mjolner.dk> writes:
> > What do you mean replicating GC?
>
> I first heard about them from James O'Toole, but I don't know if it
> was him or someone else who first thought them up.

He and his coworkers invented it and wrote a series of papers about
it. <URL:http://www.xanalys.com/software_tools/mm/glossary/r.html#replicating.garbage.collector>
lists four; there are at least three more (which I'll add RSN).

> The idea is this: the garbage collector, instead of modifying the
> current heap, makes an identical, compact copy of the heap instead.

"Identical" meaning functionally equivalent; omitting the garbage.

> [...] It seems to me that you might be able to exploit this loose


> coupling on a multi-processor system.

O'Toole and Nettles did try exactly that in SML/NJ, and reported
"excellent pause time performance and moderate execution time
speedups". It's not clear from the paper (James O'Toole, Scott
Nettles. 1994. Concurrent Replicating Garbage Collection.) how well
this would scale with more processors.
--
Pekka P. Pirinen
Censure, not censor!

0 new messages