Reasons why --enable-threads shouldn't be on by default

George Peter Staplin Dec 10, 2007 12:25 AM
Posted in group: comp.lang.tcl
After much research I want to share my concerns about the ideas and
conflicting goals some of the core team have regarding threads, and why
I am against making --enable-threads the default.

We have the following goals/constraints:

1. The core team wants extensions that use fork() like Expect, TclX,
etc. to work in a threaded core.

2. The core team wants the ability to work with existing libraries that
don't necessarily use the Tcl pthread interfaces (based on what I
understood KBK to say).

3. QNX, and possibly other platforms, disable fork() when pthreads are
used.  According to the QNX documentation, calling fork() in a process
that has created more than 1 thread fails with ENOSYS.  The alternative
to fork() + exec*() is spawn().  This means that QNX would be the odd
one out in the Tcl world, and TclX and Expect wouldn't work there
anyway unless the user built with --disable-threads.
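
To make the spawn-style alternative in point 3 concrete, here is a
minimal sketch.  It assumes the POSIX posix_spawnp() call as a stand-in,
which is not the QNX spawn() function mentioned above (that one has its
own signature).  The helper name run_command is hypothetical, and
whether a given platform allows this call in a threaded process is
something to check in its documentation.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start argv[0] (looked up on PATH) without
 * calling fork() directly, wait for it, and return its exit status.
 * Returns -1 if the process could not be started or did not exit
 * normally. */
int run_command(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    /* No file actions and no spawn attributes: inherit everything. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0) {
        return -1;
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An extension such as Expect relies on doing work between fork() and
exec*(), so a single spawn call like this is not a drop-in replacement
for it.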

Constraint 1 is possible on some platforms using pthread_atfork(), but
it requires keeping a record of every mutex, spinlock, etc. used by
Tcl.  However, constraint 2 eliminates this possibility: a library that
doesn't go through Tcl's interfaces can create locks that Tcl has no
record of.
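
A rough sketch of the record-keeping that pthread_atfork() would need
follows.  This is not Tcl's code; the registry and the names
track_mutex, install_fork_handlers and fork_and_probe are hypothetical,
and only mutexes are covered.  Any lock that is never passed to
track_mutex() is invisible to the handlers, which is exactly the problem
constraint 2 raises.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TRACKED 64              /* hypothetical fixed-size registry */

static pthread_mutex_t *tracked[MAX_TRACKED];
static int ntracked = 0;

/* Record a mutex so the fork handlers can see it.  Not thread-safe:
 * meant to be called during single-threaded start-up.  Returns 0 on
 * success, -1 if the registry is full. */
int track_mutex(pthread_mutex_t *m)
{
    assert(m != NULL);
    if (ntracked >= MAX_TRACKED) {
        return -1;
    }
    tracked[ntracked++] = m;
    return 0;
}

/* Runs before fork(): take every recorded lock, in registration order,
 * so that no other thread holds one at the moment of the fork. */
static void prepare(void)
{
    for (int i = 0; i < ntracked; i++) {
        pthread_mutex_lock(tracked[i]);
    }
}

/* Runs after fork() in both parent and child: release in reverse. */
static void release(void)
{
    for (int i = ntracked - 1; i >= 0; i--) {
        pthread_mutex_unlock(tracked[i]);
    }
}

/* Register the handlers once.  Returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(prepare, release, release);
}

/* Fork, and have the child check that every recorded mutex is usable.
 * Returns the child's exit status (0 means all locks were free),
 * or -1 on error. */
int fork_and_probe(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        for (int i = 0; i < ntracked; i++) {
            if (pthread_mutex_trylock(tracked[i]) != 0) {
                _exit(1);
            }
            pthread_mutex_unlock(tracked[i]);
        }
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Even in this small form the handlers have to lock everything on every
fork(), and the locking order has to be consistent with the rest of the
program to avoid deadlock.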

I now suggest you read what David Butenhof has to say about
pthread_atfork().  He is a recognized expert on pthreads, the author of
a pthreads book, and helped form the pthreads standard.

He has also written other articles about how one might implement a
solution to constraint 1.  But given what he has written, "threads by
default" may not be as appealing as some have thought.

I believe the following:
  A) threaded builds should be specifically marked, and use a
library/executable suffix.
  B) package distributors should probably shy away from threaded builds
of Tcl, unless absolutely needed.

On a somewhat related note, I'm concerned about the implementation of
Tcl's threading.  In particular it bothers me that thread-specific data
(TSD) is stored in a bunch of hash tables, so every time a C function
needs TSD, it has to find the hash table, do a hash table lookup, and
return the result.  This is not ideal.  I believe this was chosen
because many threading implementations, such as pthreads and Win32,
have a limited, fixed number of TSD slots.  A proper fix in my opinion
is to use one or two structures per thread, and restructure Tcl to keep
its data in those structures.  This should result in better
performance, and not be detrimental in the general case where you have
1 thread.
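
To illustrate the "one structure per thread" idea, here is a minimal
sketch using a single pthread key.  The ThreadData layout, its field
names and the helper names are hypothetical and are not taken from
Tcl's sources.  The point is only that one TSD slot is consumed no
matter how many subsystems keep per-thread state, and that a lookup is
one pthread_getspecific() call rather than a hash table search.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread structure: each subsystem gets a field here
 * instead of its own hash table entry. */
typedef struct ThreadData {
    int  notifier_state;
    int  io_state;
    long counter;
} ThreadData;

static pthread_key_t  td_key;
static pthread_once_t td_once = PTHREAD_ONCE_INIT;
static int            td_key_err = 0;

/* Destructor: called by pthreads when a thread exits. */
static void td_free(void *p)
{
    free(p);
}

/* Create the single key exactly once for the whole process. */
static void td_make_key(void)
{
    td_key_err = pthread_key_create(&td_key, td_free);
}

/* Return the calling thread's structure, allocating it zero-filled on
 * first use.  Returns NULL on failure. */
ThreadData *get_thread_data(void)
{
    ThreadData *td;

    pthread_once(&td_once, td_make_key);
    if (td_key_err != 0) {
        return NULL;
    }
    td = pthread_getspecific(td_key);
    if (td == NULL) {
        td = calloc(1, sizeof *td);
        if (td != NULL && pthread_setspecific(td_key, td) != 0) {
            free(td);
            td = NULL;
        }
    }
    return td;
}

/* Thread body: store the argument in this thread's own structure and
 * hand it back as the thread's result. */
static void *worker(void *arg)
{
    ThreadData *td = get_thread_data();

    if (td == NULL) {
        return (void *)(intptr_t)-1;
    }
    td->counter = (long)(intptr_t)arg;
    return (void *)(intptr_t)td->counter;
}

/* Set the counter in a second thread and return what that thread saw.
 * Returns -1 if the thread could not be created or joined. */
long run_in_other_thread(long value)
{
    pthread_t t;
    void *result = NULL;

    if (pthread_create(&t, NULL, worker, (void *)(intptr_t)value) != 0) {
        return -1;
    }
    if (pthread_join(t, &result) != 0) {
        return -1;
    }
    return (long)(intptr_t)result;
}
```

With only 1 thread the cost is a single key lookup per access, and a
caller that already holds the ThreadData pointer can pass it along and
skip the lookup entirely.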

What are your thoughts on all of this?