From what I guess, Python's use of operating system threads results in an
ugly global interpreter lock which actually means that at any time, only one
Python thread can run the interpreter!
On the other hand, simulating threads in Python results in simpler code but
any blocking call in an extension module freezes all Ruby threads at once.
Why not get the best of both worlds:
* use one main thread which runs the Ruby interpreter
* use worker threads in which potentially blocking extensions module calls
would run.
This way, the structure of the interpreter and the threading model could
remain under tight control while allowing blocking native calls to run in
parallel without blocking the ruby threads.
Sure enough, it is easier said than done !!
I don't know to which extent the interpreter core would need to be modified
in order to support this model.
Would anyone care to comment?
>
> On the other hand, simulating threads in Python results in simpler code but
I assume that was meant as "simulating threads in Ruby"
> any blocking call in an extension module freezes all Ruby threads at once.
>
> Why would not choose to get the best of both worlds :
>
> * use one main thread which runs the Ruby interpreter
> * use worker threads in which potentially blocking extensions module calls
> would run.
What stops you now from firing up a worker thread for your blocking
syscall(s)? As long as you're not calling back into the Ruby interpreter you
should be fine (the implicit assumption here is that the Ruby interpreter is
not reentrant, which is likely to be the case since it doesn't support
system threads).
>
> This way, the structure of the interpreter and the threading model could
> remain under tight control while allowing blocking native calls to run in
> parallel without blocking the ruby threads.
>
> Sure enough, it is easier said than done !!
> I don't know to which extent the interpreter core would need to be modified
> in order to support this model.
> Anyone cares to comment ?
Umm. let's review the options here, for any "interpreter", not only Ruby:
1) "green" threads [Ruby as of today]
2) system threads, with a non-reentrant interpreter [Python, global
interpreter lock]
3) system threads, fully reentrant interpreter [no example comes to mind]
The reason Python stopped at 2) is because 3) is hard. The reason Ruby
stopped at 1) is because ...LOL.
The issue with 2) is that one cannot take advantage of multiprocessor
machines, but it will solve your blocking problem with minimal effort. I
don't think you have any other option [in Ruby] but to spawn your worker thread
from an extension module (and not call back into the interpreter). The only
reason I'm repeating myself is that last week I was doing just that in a
Python extension module, without owning the lock LOL.
Alex
Practically, I'm not sure implementing 3) buys you much. Threads are
often used for two purposes:
1) When writing a network server, using threads may make the
code simpler -- you don't need to write a state machine that
interweaves the protocol, you just write clearer threaded
code. (It's not obvious to me that the threaded code actually
is clearer or more maintainable, given how tricky it can be to
find threading bugs, and eventually you probably find yourself
implementing the state machine for high performance, but
anyway...)
In this application, the program spends most of its time
waiting for I/O, and the global interpreter lock would be
released around the relevant system calls, so it's not much of
a hindrance.
2) You're doing heavy computing and want to use multiple CPUs.
But if you're doing number crunching, you almost certainly
will not be implementing it in Python/Ruby/some other
scripting language; you'll likely write it in C or Fortran,
and write a little wrapper to make it available from your
scripting language.
Thus, the global interpreter lock doesn't hurt much because
most of the computational work is done in low-level wrapped code,
and the wrapper can release the global interpreter lock.
Since full threading is complicated and causes increased overhead even
for single-threaded programs, it doesn't seem to be worth the trouble;
the two most common applications for threading don't really benefit
from it.
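To make the first case concrete, here is a minimal Python sketch, assuming
that time.sleep() is a fair stand-in for a blocking system call (CPython
releases the global interpreter lock while a thread sleeps). The helper
names fake_io, run_serial and run_threaded are made up for illustration.
It only shows that I/O-style waits overlap across threads despite the lock.

```python
import threading
import time


def fake_io(delay):
    # Stand-in for a blocking system call; the global interpreter lock
    # is released while this thread sleeps.
    time.sleep(delay)


def run_serial(n, delay):
    # Perform n waits one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    # Perform the same n waits, each on its own thread.
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("serial:   %.2fs" % run_serial(4, 0.2))
    print("threaded: %.2fs" % run_threaded(4, 0.2))
```

The threaded run should take roughly as long as one wait rather than four.
A CPU-bound loop written in pure Python would not show the same gain.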
--amk
I suspect that the most common application for threading is
to allow a GUI to be responsive even when the program is
busy doing something else.
In this case increased overhead is not really an issue,
and this application definitely benefits from having real
threading. Considering that GUI wrappers are a common
use for scripting languages, I would not underestimate the
importance of this case.
Cheers,
Ben
In message "[ruby-talk:10716] Re: Threading model change, proposal"
on 01/02/12, "Alex Maranda" <alex_m...@telus.net> writes:
|1) "green" threads [Ruby as of today]
|2) system threads, with a non-reentrant interpreter [Python, global
|interpreter lock]
|3) system threads, fully reentrant interpreter [no example comes to mind]
|
|The reason Python stopped at 2) is because 3) is hard. The reason Ruby
|stopped at 1) is because ...LOL.
Two good reasons, one bad reason:
* universal availability, even on DOS machines
* universal behavior
* conservative GC does not run fine with native threads without
affecting portability
matz.
But note that most GUI toolkits implement an event loop internally,
and CPU-heavy computations need to take this into account in order to
work smoothly. Having a fine-threaded scripting language won't help
if your GUI has a global lock of its own. Quoting from documentation
for various toolkits:
GTk+: http://www.gtk.org/faq/#AEN473
The GLib library can be used in a thread-safe mode by calling
g_thread_init() before making any other GLib calls. In this
mode GLib automatically locks all internal data structures as
needed. This does not mean that two threads can simultaneously
access, for example, a single hash table, but they can access
two different hash tables simultaneously. If two different
threads need to access the same hash table, the application is
responsible for locking itself.
When GLib is initialized to be thread-safe, GTK+ is thread
aware. There is a single global lock that you must acquire
with gdk_threads_enter() before making any GDK calls, and
release with gdk_threads_leave() afterwards.
Qt: http://doc.trolltech.com/threads.html
In version 2.2, Qt introduced thread support to Qt in the
shape of some basic platform-independent threading classes, a
thread-safe way of posting events and a global Qt library lock
that allows you to call Qt methods from different threads.
...
Calling a function in Qt without holding a mutex will
generally result in unpredictable behavior. Calling a
GUI-related function in Qt from a different thread requires
holding the Qt library mutex.
Tcl -- I can't find anything specific on Tk itself, but assuming this
description of Tcl is still correct, I can't see how Tk could be more
flexible than Tcl: http://dev.scriptics.com/doc/howto/thread_model.html
Tcl lets you have one or more Tcl interpreters (e.g., created
with Tcl_CreateInterp()) in each operating system thread.
However, each interpreter is tightly bound to its OS thread
and errors will occur if you let more than one thread call
into the same interpreter (e.g., with Tcl_Eval).
So most GUI toolkits impose their own serialization, apparently
similar to having a global lock.
--amk
I think we are talking past each other. It makes perfect
sense for the GUI toolkit to serialize various internal
operations. However what is frustrating is to (for
instance) push a button that launches a possibly long
database request in the background and then have your
screen freeze. Or to be unable to break off a web page
request.
Multi-threading with real threads is valuable, and can
work just fine even if only one thread talks to the GUI.
In real-world development practice, having only "one
thread talk to the GUI" is nearly universal. I can't
think of any good reason to have more. I'm keeping an
open mind on this one, though.
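A rough Python sketch of that arrangement, with no real toolkit involved:
the toy event_loop below stands in for the GUI's event loop, and slow_query
for the long database request. Both names, and the "42 rows" result, are
invented for the example. The worker never touches the "GUI" state; it only
posts its result on a queue that the single GUI thread polls.

```python
import queue
import threading
import time


def slow_query(results):
    # Hypothetical long database request, run on a worker thread.
    # It never touches the "GUI"; it only posts its result.
    time.sleep(0.3)
    results.put("42 rows")


def event_loop(results, tick=0.05, max_ticks=100):
    # Toy event loop: the only thread that updates the "GUI" state.
    label = "working..."
    ticks = 0
    while ticks < max_ticks:
        ticks += 1  # a real loop would redraw and handle clicks here
        try:
            label = results.get_nowait()  # non-blocking check for a result
            break
        except queue.Empty:
            time.sleep(tick)
    return label, ticks


def main():
    results = queue.Queue()
    threading.Thread(target=slow_query, args=(results,), daemon=True).start()
    return event_loop(results)


if __name__ == "__main__":
    label, ticks = main()
    print(label, "after", ticks, "ticks")
```

The loop keeps ticking while the query runs, which is the responsiveness
Ben is asking for, and only one thread ever talks to the "GUI".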
Along with the categories Andrew listed, there are a few
other reasons to use threads:
1. Fashion and misinformation. There are a
lot of people who sincerely believe they
need to or should use threads, just be-
cause ... well, someone said so.
2. System-level bindings that only give good
programmability for threaded APIs.
3. Mildly tangled with the first two, Ben's
cases of external stuff that demands thread-
level management. He rightly described
a long-running database query. A *good*
interface will have an asynchronous API, but
a lot of the interfaces we all deal with
aren't good ones.
A counter to this that occasionally helps is
that, if it really is a long-running execu-
tion context that's external to the body of
the application, it can just as well be in its
own process as its own thread. At that point,
pre-emptive scheduling becomes easier again.
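As a sketch of the separate-process variant, assuming the long-running work
can be expressed as a standalone command: here a second Python interpreter
running a made-up JOB string plays the external task, and the parent polls
it without blocking. The operating system schedules the child, so the
parent's own threading model does not matter.

```python
import subprocess
import sys
import time

# Hypothetical long-running job, executed by a second Python interpreter.
JOB = "import time; time.sleep(0.3); print('done')"


def run_in_own_process(poll_interval=0.05):
    child = subprocess.Popen(
        [sys.executable, "-c", JOB],
        stdout=subprocess.PIPE,
        text=True,
    )
    polls = 0
    # poll() returns None while the child is still running, so the
    # parent is free to do other work between checks.
    while child.poll() is None:
        polls += 1
        time.sleep(poll_interval)
    output = child.stdout.read().strip()
    return child.returncode, output, polls


if __name__ == "__main__":
    print(run_in_own_process())
```

Reading the pipe only after the child exits is fine for a few bytes of
output; a chattier child would need its output drained while it runs.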
--
Cameron Laird <cla...@NeoSoft.com>
Business: http://www.Phaseit.net
Personal: http://starbase.neosoft.com/~claird/home.html
> Multi-threading with real threads is valuable, and can
> work just fine even if only one thread talks to the GUI.
In KSH we use & and wait... works just fine ;)
=====
John van Vlaanderen
#############################################
# CXN, Inc. Contact: jo...@thinman.com # #
# Proud Sponsor of Perl/Unix of NY #
# http://puny.vm.com #
#############################################
What about an application server ?
> Along with the categories Andrew listed, there are a few
> other reasons to use threads:
> 1. Fashion and misinformation....
> 2. System-level bindings that only give good
> programmability for threaded APIs.
> 3. Mildly tangled with the first two, Ben's
> cases of external stuff that demands thread-
> level management....
>
So, to sum things up,
1 - one GUI thread is enough and more is definitely bad.
2 - not having system threads running several copies of the interpreter is
not a major hindrance.
3 - having the interpreter run in only one thread eases the implementation
of the GC.
4 - threads are really useful for :
* coding style (a very personal matter !!)
* keeping the system responsive under external API calls.
What triggered this discussion is the fact that some calls in the Ruby lib
actually block, according to Dave Thomas's excellent book.
All the contributions seem to point in the same direction : providing a
simple worker thread wrapper around some of the extension modules should be
more than enough.
My thought was that if the Ruby API could provide a standard, easy worker
threading framework for extension modules to call, it would be nice.
That way, the interpreter could remain untouched.
I'll try to work on that.
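To give an idea of the shape such a wrapper could take, here is a small
sketch in Python rather than in Ruby's C extension API, so it is an analogy
and not the proposed implementation. The names in_worker and
blocking_lookup are hypothetical, and the pool size of 4 is arbitrary. A
blocking call is handed to a pool of worker threads and the caller gets a
future back immediately.

```python
import functools
import time
from concurrent.futures import ThreadPoolExecutor

# Shared pool of worker threads; the size is an arbitrary choice.
_pool = ThreadPoolExecutor(max_workers=4)


def in_worker(func):
    """Wrap a blocking function so each call runs on a worker thread.

    The wrapped call returns a Future right away instead of blocking.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return _pool.submit(func, *args, **kwargs)
    return wrapper


@in_worker
def blocking_lookup(name):
    # Stand-in for a blocking extension call (hypothetical).
    time.sleep(0.2)
    return name.upper()


if __name__ == "__main__":
    pending = blocking_lookup("ruby")
    print("caller keeps running; done yet?", pending.done())
    print(pending.result())  # blocks only when the value is needed
```

The caller decides when, or whether, to wait for the result, which keeps
the blocking behaviour out of the main thread without touching the
interpreter itself.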
Cheers.
I don't mean to devalue the former reason, by
the way. Programming with threads is a fine
and wonderful thing, apart from any perfor-
mance considerations, if it's a model that
helps a developer express solutions.
In regard to the second, note that it's for
a rather subtle sense of "external". It means
something like, "external to the language",
with occasional exceptions that might be de-
pendent on implementation. People know pretty
quickly when they have this need.
>
>What triggered this discussion is the fact that some calls in the Ruby lib
>actually block according to Dave Thomas excellent book.
>All the contributions seem to point in the same direction : providing a
>simple worker thread wrapper around some of the extension modules should be
>more than enough.
Yes, it matches my experience that a simple
thread wrapping of this sort brings a lot of
quick happiness.