After much research, I want to share my concerns about the ideas and conflicting goals some of the core team have for threads, and explain why I am against making --enable-threads the default.
We have the following goals/constraints:
1. The core team wants extensions that use fork() like Expect, TclX, etc. to work in a threaded core.
2. The core team wants the ability to work with existing libraries that don't necessarily use the Tcl pthread interfaces (based on what I understood KBK to say).
3. QNX, and possibly other platforms, disable the fork() syscall when pthreads are in use. According to the QNX documentation, calling fork() in a process that has created more than one thread fails with ENOSYS. The alternative to fork() + exec*() is spawn(). This would make QNX the odd one out in the Tcl world: TclX and Expect wouldn't work there anyway unless the user built with --disable-threads.
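To make the fork() + exec*() alternative concrete, here is a minimal sketch. It uses POSIX posix_spawn() as a portable stand-in rather than the QNX-specific spawn() call, and the helper name spawn_and_wait is hypothetical. I am only illustrating the shape of the replacement, not claiming this is how Expect or TclX would have to do it.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start `path` with `argv` without the caller
 * invoking fork() itself, then wait for the child.
 * Returns the child's exit status, or -1 on any failure. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid;
    int status;

    /* NULL file actions and NULL attributes: inherit everything as-is. */
    if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The catch for extensions like Expect is that anything they currently do between fork() and exec*() (setting up a pty, changing the controlling terminal, and so on) has to be expressed some other way, since there is no longer a point where their own code runs in the child.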
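Constraint 1 is achievable on some platforms using pthread_atfork(), but it requires keeping a record of every mutex, spinlock, etc. used by Tcl. However, constraint 2 eliminates this possibility, because Tcl cannot keep such a record for locks that live inside libraries that bypass its interfaces.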
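A rough sketch of what that record keeping might look like follows. The registry, its fixed capacity, and the function names (track_mutex, fork_prepare, fork_release, install_fork_handlers) are all hypothetical; only pthread_atfork() and the pthread_mutex_* calls are real. The idea is to take every known lock before fork() and release them all afterwards in both parent and child.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_TRACKED 64                 /* hypothetical fixed capacity */

static pthread_mutex_t *tracked[MAX_TRACKED];
static size_t ntracked = 0;
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record a mutex so the fork handlers know about it.
 * Returns 0 on success, -1 if the registry is full. */
int track_mutex(pthread_mutex_t *m)
{
    int rc = -1;
    pthread_mutex_lock(&registry_lock);
    if (ntracked < MAX_TRACKED) {
        tracked[ntracked++] = m;
        rc = 0;
    }
    pthread_mutex_unlock(&registry_lock);
    return rc;
}

/* Runs before fork(): take the registry lock, then every tracked
 * mutex in registration order, so no other thread holds one. */
void fork_prepare(void)
{
    pthread_mutex_lock(&registry_lock);
    for (size_t i = 0; i < ntracked; i++)
        pthread_mutex_lock(tracked[i]);
}

/* Runs after fork() in both parent and child: release everything
 * in reverse order. */
void fork_release(void)
{
    for (size_t i = ntracked; i > 0; i--)
        pthread_mutex_unlock(tracked[i - 1]);
    pthread_mutex_unlock(&registry_lock);
}

/* Install the handlers once at startup; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(fork_prepare, fork_release, fork_release);
}
```

Two things this sketch glosses over: a thread that calls fork() while already holding a tracked mutex would deadlock in fork_prepare(), and any lock that was never passed to track_mutex() (for example, one inside a third-party library) is simply invisible to the handlers. The second point is exactly where constraint 2 bites.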
He has also written other articles about how a solution to constraint 1 might be implemented. But given what he has written, "threads by default" may not be as appealing as some have thought.
I believe the following: A) threaded builds should be specifically marked, and use a library/executable suffix. B) package distributors should probably shy away from threaded builds of Tcl, unless absolutely needed.
On a somewhat related note, I'm concerned about the implementation of Tcl's threading. In particular, it bothers me that thread-specific data (TSD) is stored in a bunch of hash tables, so every time a C function needs TSD, it has to find the hash table, do a hash table lookup, and return the result. This is not ideal. I believe this approach was chosen because many threading implementations, such as pthreads and Win32, have a limited, fixed number of TSD slots. A proper fix, in my opinion, is to use one or two structures per thread and restructure Tcl to keep its data in those structures. This should result in better performance, and should not be detrimental in the general case where you have one thread.
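To illustrate the one-structure-per-thread idea, here is a minimal sketch using plain pthreads. The ThreadData struct, its fields, and get_thread_data() are hypothetical names of mine and not part of Tcl. The point is only that a single TSD slot can hold a pointer to everything a thread needs, so each access is one pthread_getspecific() call plus a field read instead of a hash lookup.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state. A real core would gather every
 * subsystem's thread-specific data into a structure like this. */
typedef struct ThreadData {
    int   notifier_state;   /* placeholder field */
    void *io_state;         /* placeholder field */
} ThreadData;

static pthread_key_t  td_key;
static pthread_once_t td_once = PTHREAD_ONCE_INIT;

/* Destructor: called by pthreads when a thread with non-NULL data exits. */
static void td_free(void *p) { free(p); }

/* Create the single TSD key exactly once for the whole process. */
static void td_make_key(void) { pthread_key_create(&td_key, td_free); }

/* Return the calling thread's structure, allocating it zeroed on
 * first use. Returns NULL if allocation or registration fails. */
ThreadData *get_thread_data(void)
{
    ThreadData *td;

    pthread_once(&td_once, td_make_key);
    td = pthread_getspecific(td_key);
    if (td == NULL) {
        td = calloc(1, sizeof *td);
        if (td != NULL && pthread_setspecific(td_key, td) != 0) {
            free(td);
            td = NULL;
        }
    }
    return td;
}
```

A caller would write something like get_thread_data()->notifier_state instead of looking up a per-module key. This consumes one TSD slot no matter how many subsystems need per-thread data, which sidesteps the fixed slot limit mentioned above.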