
Threads... last call


Dan Sugalski

Jan 22, 2004, 2:59:30 PM
to perl6-i...@perl.org
Last chance to get in comments on the first half of the proposal. If
it looks adequate, I'll put together the technical details
(functions, protocols, structures, and whatnot) and send that off for
abuse^Wdiscussion. After that we'll finalize it, PDD the thing, and
get the implementation in and going.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Deven T. Corzine

Jan 22, 2004, 5:24:04 PM
to Dan Sugalski, perl6-i...@perl.org
Dan Sugalski wrote:

> Last chance to get in comments on the first half of the proposal. If
> it looks adequate, I'll put together the technical details (functions,
> protocols, structures, and whatnot) and send that off for
> abuse^Wdiscussion. After that we'll finalize it, PDD the thing, and
> get the implementation in and going.

Dan,

Sorry to jump in out of the blue here, but did you respond to Damien
Neil's message about locking issues? (Or did I just miss it?)

This sounds like it could be a critically important design question;
wouldn't it be best to address it before jumping into implementation?
If there's a better approach available, wouldn't this be the best time
to determine that?

Deven

Date: Wed, 21 Jan 2004 13:32:52 -0800
From: Damien Neil <ne...@misago.org>
To: perl6-i...@perl.org
Subject: Re: Start of thread proposal
Message-ID: <20040121213...@misago.org>
References: <a0601020dbc31eb8d1b67@[10.0.1.2]> <200401210925...@thu8.leo.home> <a06010201bc34710dcfb2@[10.0.1.2]>
In-Reply-To: <a06010201bc34710dcfb2@[10.0.1.2]>
Content-Length: 1429

On Wed, Jan 21, 2004 at 01:14:46PM -0500, Dan Sugalski wrote:
> >... seems to indicate that even whole ops like add P,P,P are atomic.
>
> Yep. They have to be, because they need to guarantee the integrity of
> the pmc structures and the data hanging off them (which includes
> buffer and string stuff)

Personally, I think it would be better to use corruption-resistant
buffer and string structures, and avoid locking during basic data
access. While there are substantial differences in VM design--PMCs
are much more complicated than any JVM data type--the JVM does provide
a good example that this can be done, and done efficiently.

Failing this, it would be worth investigating what the real-world
performance difference is between acquiring multiple locks per VM
operation (current Parrot proposal) vs. having a single lock
controlling all data access (Python) or jettisoning OS threads
entirely in favor of VM-level threading (Ruby). This forfeits the
ability to take advantage of multiple CPUs--but Leopold's initial
timing tests of shared PMCs were showing a potential 3-5x slowdown
from excessive locking.

I've seen software before that was redesigned to take advantage of
multiple CPUs--and then required no less than four CPUs to match
the performance of the older, single-CPU version. The problem was
largely attributed to excessive locking of mostly-uncontested data
structures.

- Damien
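The per-operation cost Damien is weighing can be made concrete. The following Python sketch (editorial; names and numbers are illustrative, not Leopold's Parrot figures) compares a bare counter bump against the same bump behind a mutex, which is roughly the shape of "one lock per VM operation":

```python
import threading
import time

N = 1_000_000

# Unlocked counter bump: roughly what an op pays with no locking.
counter = 0
t0 = time.perf_counter()
for _ in range(N):
    counter += 1
plain = time.perf_counter() - t0

# The same bump behind a mutex: roughly one lock per VM operation.
lock = threading.Lock()
locked_counter = 0
t0 = time.perf_counter()
for _ in range(N):
    with lock:
        locked_counter += 1
locked = time.perf_counter() - t0

# Both loops compute the same result; only the per-operation cost differs.
assert counter == locked_counter == N
print(f"unlocked: {plain:.3f}s  locked: {locked:.3f}s")
```

The absolute numbers vary wildly by platform; the point is only that an uncontended lock/unlock pair can dwarf the cost of the operation it guards when that operation is a single cheap update.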


Josh Wilmes

Jan 22, 2004, 5:58:50 PM
to Deven T. Corzine, Dan Sugalski, perl6-i...@perl.org

I'm also concerned by those timings that leo posted.
0.0001 vs 0.0005 ms on a set -- that magnitude of locking overhead
seems pretty crazy to me.

It seemed like a few people have said that the JVM style of locking
can reduce this, so it seems to me that it merits some serious
consideration, even if it may require some changes to the design of
parrot.

I'm not familiar enough with the implementation details here to say much
one way or another. But it seems to me that if this is one of those
low-level decisions that will be impossible to change later and will
forever constrain perl's performance, then it's important not to rush
into a bad choice because it seems more straightforward.

Perhaps some more experimentation is in order at this time.

--Josh


At 17:24 on 01/22/2004 EST, "Deven T. Corzine" <de...@ties.org> wrote:

> Dan Sugalski wrote:
>
> > Last chance to get in comments on the first half of the proposal. If
> > it looks adequate, I'll put together the technical details (functions,
> > protocols, structures, and whatnot) and send that off for
> > abuse^Wdiscussion. After that we'll finalize it, PDD the thing, and
> > get the implementation in and going.
>
> Dan,
>
> Sorry to jump in out of the blue here, but did you respond to Damien
> Neil's message about locking issues? (Or did I just miss it?)
>
> This sounds like it could be a critically important design question;
> wouldn't it be best to address it before jumping into implementation?
> If there's a better approach available, wouldn't this be the best time
> to determine that?
>
> Deven
>
> Date: Wed, 21 Jan 2004 13:32:52 -0800
> From: Damien Neil <ne...@misago.org>
> To: perl6-i...@perl.org
> Subject: Re: Start of thread proposal
> Message-ID: <20040121213...@misago.org>

> References: <a0601020dbc31eb8d1b67@[10.0.1.2]> <200401210925.i0L9PQc26131@thu

Dan Sugalski

Jan 23, 2004, 10:07:25 AM
to Deven T. Corzine, perl6-i...@perl.org
At 5:24 PM -0500 1/22/04, Deven T. Corzine wrote:
>Dan Sugalski wrote:
>
>>Last chance to get in comments on the first half of the proposal.
>>If it looks adequate, I'll put together the technical details
>>(functions, protocols, structures, and whatnot) and send that off
>>for abuse^Wdiscussion. After that we'll finalize it, PDD the thing,
>>and get the implementation in and going.
>
>Dan,
>
>Sorry to jump in out of the blue here, but did you respond to Damien
>Neil's message about locking issues? (Or did I just miss it?)

Damien's issues were addressed before he brought them up, though not
in one spot.

A single global lock, like python and ruby use, kills any hope of SMP-ability.

Hand-rolled threading has unpleasant complexity issues, is a big
pain, and terribly limiting. And kills any hope of SMP-ability.

Corruption-resistant data structures without locking just don't exist.

Dan Sugalski

Jan 23, 2004, 10:24:30 AM
to Josh Wilmes, Deven T. Corzine, perl6-i...@perl.org
At 5:58 PM -0500 1/22/04, Josh Wilmes wrote:
>I'm also concerned by those timings that leo posted.
>0.0001 vs 0.0005 ms on a set -- that magnitude of locking overhead
>seems pretty crazy to me.

It looks about right. Don't forget, part of what you're seeing isn't
that locking mutexes is slow, it's that parrot does a lot of stuff
awfully fast. It's also a good idea to get more benchmarks before
jumping to any conclusions -- changing designs based on a single,
first cut, quick-n-dirty benchmark isn't necessarily a wise thing.

>It seemed like a few people have said that the JVM style of locking
>can reduce this, so it seems to me that it merits some serious
>consideration, even if it may require some changes to the design of
>parrot.

There *is* no "JVM-style" locking. I've read the docs and looked at
the specs, and they're not doing anything at all special, and nothing
different from what we're doing. Some of the low-level details are
somewhat different because Java has more immutable base data
structures (which don't require locking) than we do. Going more
immutable is an option, but one we're not taking since it penalizes
things we'd rather not penalize. (String handling mainly)

There is no "JVM Magic" here. If you're accessing shared data, it has
to be locked. There's no getting around that. The only way to reduce
locking overhead is to reduce the amount of data that needs locking.

>I'm not familiar enough with the implementation details here to say much
>one way or another. But it seems to me that if this is one of those
>low-level decisions that will be impossible to change later and will
>forever constrain perl's performance, then it's important not to rush
>into a bad choice because it seems more straightforward.

This can all be redone if we need to -- the locking and threading
strategies can be altered in a dozen ways or ripped out and
rewritten, as none of them affect the semantics of bytecode execution.

Deven T. Corzine

Jan 23, 2004, 4:14:21 PM
to Dan Sugalski, perl6-i...@perl.org
Dan Sugalski wrote:

> At 5:24 PM -0500 1/22/04, Deven T. Corzine wrote:
>
> Damian's issues were addressed before he brought them up, though not
> in one spot.
>
> A single global lock, like python and ruby use, kills any hope of
> SMP-ability.
>
> Hand-rolled threading has unpleasant complexity issues, is a big pain,
> and terribly limiting. And kills any hope of SMP-ability.

What about the single-CPU case? If it can really take 4 SMP CPUs with
locking to match the speed of 1 CPU without locking, as mentioned,
perhaps it would be better to support one approach for single-CPU
systems (or applications that are happy to be confined to one CPU), and
a different approach for big SMP systems?

> Corruption-resistant data structures without locking just don't exist.

The most novel approach I've seen is the one taken by Project UDI
(Uniform Driver Interface). Their focus is on portable device drivers,
so I don't know if this idea could work in the Parrot context, but the
approach they take is to have the driver execute in "regions". Each
driver needs to have at least one region, and it can create more if it
wants better parallelism. All driver code executes "inside" a region,
but the driver does no locking or synchronization at all. Instead, the
environment on the operating-system side of the UDI interface handles
such issues. UDI is designed to ensure that only one driver instance
can ever be executing inside a region at any given moment; the
mechanism it uses is entirely up to the environment and can be changed
without touching the driver code.

This white paper has a good technical overview (the discussion of
regions starts on page 9):

http://www.projectudi.org/Docs/pdf/UDI_tech_white_paper.pdf

I'm told that real-world experience with UDI has shown performance is
quite good, even when layered over existing native drivers. The
interesting thing is that a UDI driver could run just as easily on a
single-tasking, single-CPU system (like DOS) as on a multi-tasking SMP
system, without touching the driver code. It doesn't
have to know or care if it's an SMP system or not, although it does have
to create multiple regions to actually be able to benefit from SMP. (Of
course, even with single-region drivers, multiple instances of the same
driver could benefit from SMP, since each instance could run on a
different CPU.)

I don't know if it would be possible to do anything like this with
Parrot, but it might be interesting to consider...

Deven
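The region idea Deven describes can be sketched in miniature: funnel all work for a region through one queue drained by a single worker, so code running "inside" the region is serialized by construction and needs no locks of its own. (This sketch and its names are editorial, not UDI's actual API.)

```python
import queue
import threading

class Region:
    """UDI-flavored region sketch: all work for a region goes through
    one queue and is executed by a single worker thread, so code
    running "inside" the region never needs a lock."""

    def __init__(self):
        self.state = 0                       # only ever touched by the worker
        self._work = queue.Queue()
        self._worker = threading.Thread(target=self._run)
        self._worker.start()

    def _run(self):
        while True:
            fn = self._work.get()
            if fn is None:                   # shutdown sentinel
                break
            fn(self)                         # at most one fn runs at a time

    def submit(self, fn):
        # Callable from any thread; serialization is the queue's job.
        self._work.put(fn)

    def shutdown(self):
        self._work.put(None)
        self._worker.join()

region = Region()
for _ in range(1000):
    region.submit(lambda r: setattr(r, "state", r.state + 1))
region.shutdown()
assert region.state == 1000                  # no locks, no lost updates
```

The environment (here, the queue and worker) decides how exclusion is enforced, and could be swapped for something else without touching the submitted code -- which is exactly the property Deven highlights.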

Damien Neil

Jan 23, 2004, 3:27:41 PM
to perl6-i...@perl.org
On Fri, Jan 23, 2004 at 10:07:25AM -0500, Dan Sugalski wrote:
> A single global lock, like python and ruby use, kills any hope of
> SMP-ability.

Assume, for the sake of argument, that locking almost every PMC
every time a thread touches it causes Parrot to run four times
slower. Assume also that all multithreaded applications are
perfectly parallelizable, so overall performance scales linearly
with number of CPUs. In this case, threaded Parrot will need
to run on a 4-CPU machine to match the speed of a single-lock
design running on a single CPU. The only people that will benefit
from the multi-lock design are those using machines with more than
4 CPUs--everyone else is worse off.

This is a theoretical case, of course. We don't know exactly how
much of a performance hit Parrot will incur from a lock-everything
design. I think that it would be a very good idea to know for
certain what the costs will be, before it becomes too late to change
course.

Perhaps the cost will be minimal--a 20% per-CPU overhead would
almost certainly be worth the ability to take advantage of multiple
CPUs. Right now, however, there is no empirical data on which to
base a decision. I think that making a decision without that data
is unwise.

As I said, I've seen a real-world program which was rewritten to
take advantage of multiple CPUs. The rewrite fulfilled the design
goals: the new version scaled with added CPUs. Unfortunately, lock
overhead made it sufficiently slower that it took 2-4 CPUs to match
the old performance on a single CPU--despite the fact that almost
all lock attempts succeeded without contention.

The current Parrot design proposal looks very much like the locking
model that app used.


> Corruption-resistant data structures without locking just don't exist.

An existence proof:

Java Collections are a standard Java library of common data structures
such as arrays and hashes. Collections are not synchronized; access
involves no locks at all. Multiple threads accessing the same
collection at the same time cannot, however, result in the virtual
machine crashing. (They can result in data structure corruption,
but this corruption is limited to "surprising results" rather than
"VM crash".)

- Damien
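Damien's existence proof has a direct analogue in CPython: the interpreter protects its own internal structures, so entirely unsynchronized sharing can produce surprising results but cannot corrupt the runtime. A small editorial sketch (not from the thread):

```python
import threading

# Several threads hammer one list with no user-level locking at all.
# The interpreter's own integrity is preserved even though the program
# makes no effort to synchronize; "surprising results" are the worst
# case, never a crashed VM.
shared = []
THREADS, OPS = 4, 10_000

def worker():
    for i in range(OPS):
        shared.append(i)   # no lock anywhere

threads = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The structure is intact and fully usable afterwards -- no "VM crash".
assert len(shared) == THREADS * OPS
```

(In CPython, list.append happens to be atomic, so even the element count survives here; the general claim is only that the runtime itself cannot be corrupted by unsynchronized access.)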

nigels...@btconnect.com

Jan 23, 2004, 4:10:38 PM
to perl6-i...@perl.org
On Fri, 23 Jan 2004 10:24:30 -0500, d...@sidhe.org (Dan Sugalski) wrote:
> If you're accessing shared data, it has
> to be locked. There's no getting around that. The only way to reduce
> locking overhead is to reduce the amount of data that needs locking.
>

One slight modification I would make to that statement is:

You can reduce locking overhead by only incurring that overhead
when locking is actually necessary. If there is a 'cheaper' way of
detecting the need for locking, then avoiding the cost of a lock by
only taking one when needed is beneficial.

This requires the detection mechanism to be extremely fast and
simple relative to the cost of acquiring a lock. This was what I
attempted to describe before, in win32 terms, without much success.

I still can't help thinking that other platforms probably have similar
possibilities, but I do not know enough of them to describe the
mechanism in those terms.

Nigel.
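Nigel's "cheap detection" idea can be sketched portably: keep a flag recording whether the structure has ever been shared, and take the lock only when it has. (The sketch and its names are editorial; note the transition to shared=True must itself happen before the object becomes visible to a second thread, or the fast path races.)

```python
import threading

class MaybeShared:
    """Counter that only pays for a mutex once it is marked shared."""

    def __init__(self):
        self.value = 0
        self.shared = False            # flipped once, before sharing
        self._lock = threading.Lock()

    def incr(self):
        if self.shared:                # the cheap detection step
            with self._lock:
                self.value += 1
        else:                          # single-owner fast path: no lock
            self.value += 1

c = MaybeShared()
c.shared = True                        # about to hand it to other threads
threads = [threading.Thread(target=lambda: [c.incr() for _ in range(10_000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert c.value == 40_000
```

Unshared objects pay only a flag test per operation, which is the saving Nigel is after; the hard part, as he says, is making that test reliably cheap and correct on each platform.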


Gordon Henriksen

Jan 23, 2004, 5:45:58 PM
to Deven T. Corzine, perl6-i...@perl.org
Deven T. Corzine wrote:

> The most novel approach I've seen is the one taken by Project UDI
> (Uniform Driver Interface).

This is very much the "ithreads" model which has been discussed. The
problem is that, from a functional perspective, it's not so much
threading as it is forking.

--

Gordon Henriksen
IT Manager
ICLUBcentral Inc.
gor...@iclub.com

Melvin Smith

Jan 28, 2004, 12:53:09 PM
to Damien Neil, perl6-i...@perl.org
Pardon me, I am trudging through all this thread backlog and
have been trying not to post to add to the bandwidth.


At 12:27 PM 1/23/2004 -0800, Damien Neil wrote:
>An existence proof:
>
>Java Collections are a standard Java library of common data structures
>such as arrays and hashes. Collections are not synchronized; access
>involves no locks at all. Multiple threads accessing the same
>collection at the same time cannot, however, result in the virtual
>machine crashing. (They can result in data structure corruption,
>but this corruption is limited to "surprising results" rather than
>"VM crash".)

But this accomplishes nothing useful and still means the data structure
is not re-entrant, nor is it corruption "resistant", regardless of how we
judge it.

-Melvin


Gordon Henriksen

Jan 28, 2004, 11:45:50 PM
to Melvin Smith, Damien Neil, perl6-i...@perl.org
On Wednesday, January 28, 2004, at 12:53 , Melvin Smith wrote:

> At 12:27 PM 1/23/2004 -0800, Damien Neil wrote:
>
>> Java Collections are a standard Java library of common data structures
>> such as arrays and hashes. Collections are not synchronized; access
>> involves no locks at all. Multiple threads accessing the same
>> collection at the same time cannot, however, result in the virtual
>> machine crashing. (They can result in data structure corruption, but
>> this corruption is limited to "surprising results" rather than "VM
>> crash".)
>
> But this accomplishes nothing useful and still means the data structure
> is not re-entrant, nor is it corruption "resistant", regardless of how
> we judge it.

It does accomplish something very useful indeed: It avoids the overhead
of automatic locking when it isn't necessary. When *is* that locking
necessary? To a second order approximation, ***NEVER.***


Never? Yes. From the user's perspective, "synchronized objects" more
often than not provide no value! For instance, this Java code, run from
competing threads, does not perform its intended purpose:

// Vector was Java 1's dynamically sizable array class.
if (!vector.contains(obj))
    vector.add(obj);

It does not prevent obj from appearing more than once in the collection.
It's equivalent to this:

boolean temp;
synchronized (vector) {
    temp = vector.contains(obj);
}
// Preemption is possible between here...
if (!temp) {
    synchronized (vector) {
        // ... and here.
        vector.add(obj);
    }
}

The correct code is this:

synchronized (vector) {
    if (!vector.contains(obj))
        vector.add(obj);
}

If we again make explicit Vector's synchronized methods, a startling
redundancy will become apparent:

synchronized (vector) {
    boolean temp;
    synchronized (vector) {
        temp = vector.contains(obj);
    }
    if (!temp) {
        synchronized (vector) {
            vector.add(obj);
        }
    }
}

This code is performing 3 times as many locks as necessary! More
realistic code will perform many times more than 3 times the necessary
locking. Imagine the waste in sorting a 10,000 element array.

Beyond that stunning example of wasted effort, most structures aren't
shared in the first place, and so the overhead is *completely*
unnecessary for them.

Still further, many "shared" objects are unmodified once they become
shared, and so again require no locking.

The only time automatically synchronized objects are useful is when the
user's semantics exactly match the object's methods. (e.g., an
automatically synchronized queue might be quite useful.) In that case,
it's a trivial matter for the user to wrap the operation in a
synchronized block—or subclass the object with a method that does so.
But it is quite impossible to remove the overhead from the other 99% of
uses, since the VM cannot discern the user's locking strategy.


If the user is writing a threaded program, then data integrity is
necessarily his problem—a virtual machine cannot solve it for him, and
automatic synchronization provides little help. "Protecting" a
policy-neutral object (such as an array) from its user is pointless; all
we could do is to ensure that the object is in a state consistent with
an incorrect sequence of operations. That's useless to the user, because
his program still behaves incorrectly; the object is, to his eyes,
corrupted since it no longer maintains his invariant conditions.

Thus, since the VM cannot guarantee that the user's program behaves
as intended (only as written), and all that locking is wasted 4 times
over—the virtual machine would be wise to limit its role to the
prevention of crashes and to limiting the corruption that can result
from incorrect (unsynchronized) access to objects. If automatic locking
is the only way to do that for a particular data structure, then so be
it. Oftentimes, though, thoughtful design can ensure that even
unsynchronized accesses cannot crash or corrupt the VM as a whole--even
if those operations might corrupt one object's state, or provide less
than useful results.


As an example of how this applies to parrot, all of the FixedMumbleArray
classes[*] recently discussed could clearly be implemented completely
and safely without automatic locks on all modern platforms—if only
parrot could allow lock-free access to at least some PMCs. ([*] Except
FixedMixedArray. That stores UVals [plus more for a discriminator
field], and couldn't be quite safe, as a UVal isn't an atomic write on
many platforms. Then again, nor is a double.)

Remember, Java already made the mistake of automatic synchronization
with their original set of collections (incl. Vector). It was a major
design flaw, crippling the performance of early Java programs. Fixing it
for Java 2's class library was quite non-trivial, but because the VM was
powerful enough to allow it, more efficient unsynchronized collections
could be made available as a class library even for the earlier Java VMs.

Gordon Henriksen
mali...@mac.com
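Gordon's point about one lock spanning the whole test-then-act sequence translates directly out of Java. An editorial Python sketch: the single user-level lock is what makes the check-then-insert atomic, and per-method locking on the container could never provide that.

```python
import threading

items = []
items_lock = threading.Lock()

def add_unique(obj):
    # One lock held across the whole check-then-insert -- the
    # equivalent of Gordon's single synchronized block around
    # both the contains() and the add().
    with items_lock:
        if obj not in items:
            items.append(obj)

threads = [threading.Thread(target=lambda: [add_unique(i) for i in range(100)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every value appears exactly once, despite four competing threads.
assert sorted(items) == list(range(100))
```

If the list itself also locked internally on every access, those inner locks would be pure overhead here: the user's lock already excludes all competitors, which is exactly the redundancy Gordon's expanded Java example exposes.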

Damien Neil

Jan 28, 2004, 8:28:53 PM
to perl6-i...@perl.org
On Wed, Jan 28, 2004 at 12:53:09PM -0500, Melvin Smith wrote:
> At 12:27 PM 1/23/2004 -0800, Damien Neil wrote:
> >Java Collections are a standard Java library of common data structures
> >such as arrays and hashes. Collections are not synchronized; access
> >involves no locks at all. Multiple threads accessing the same
> >collection at the same time cannot, however, result in the virtual
> >machine crashing. (They can result in data structure corruption,
> >but this corruption is limited to "surprising results" rather than
> >"VM crash".)
>
> But this accomplishes nothing useful and still means the data structure
> is not re-entrant, nor is it corruption "resistant", regardless of how we
> judge it.

Quite the contrary--it is most useful.

Parrot must, we all agree, under no circumstances crash due to
unsynchronized data access. For it to do so would be, among other
things, a gross security hole when running untrusted code in a
restricted environment.

There is no need for any further guarantee about unsynchronized data
access, however. If unsynchronized threads invariably cause an exception,
that's fine. If they cause the threads involved to halt, that's fine
too. If they cause what was once an integer variable to turn into a
string containing the definition of "mulching"...well, that too falls
under the heading of "undefined results". Parrot cannot and should
not attempt to correct for bugs in user code, beyond limiting the extent
of the damage to the threads and data structures involved.

Java, when released, took the path that Parrot appears to be about
to take--access to complex data structures (such as Vector) was
always synchronized. This turned out to be a mistake--sufficiently
so that Java programmers would often implement their own custom,
unsynchronized replacements for the core classes. As a result,
when the Collections library (which replaces those original data
structures) was released, the classes in it were left unsynchronized.

In Java's case, the problem was at the library level, not the VM
level; as such, it was relatively easy to fix at a later date.
Parrot's VM-level data structure locking will be less easy to change.

- Damien

Melvin Smith

Jan 29, 2004, 11:55:09 AM
to Gordon Henriksen, Damien Neil, perl6-i...@perl.org
At 11:45 PM 1/28/2004 -0500, Gordon Henriksen wrote:
>On Wednesday, January 28, 2004, at 12:53 , Melvin Smith wrote:
>
>>At 12:27 PM 1/23/2004 -0800, Damien Neil wrote:
>>
>>>Java Collections are a standard Java library of common data structures
>>>such as arrays and hashes. Collections are not synchronized; access
>>>involves no locks at all. Multiple threads accessing the same
>>>collection at the same time cannot, however, result in the virtual
>>>machine crashing. (They can result in data structure corruption, but
>>>this corruption is limited to "surprising results" rather than "VM crash".)
>>
>>But this accomplishes nothing useful and still means the data structure
>>is not re-entrant, nor is it corruption "resistant", regardless of how we
>>judge it.
>
>It does accomplish something very useful indeed: It avoids the overhead of
>automatic locking when it isn't necessary. When *is* that locking
>necessary? To a second order approximation, ***NEVER.***

Pardon me but I've apparently lost track of context here.

I thought we were discussing correct behavior of a shared data structure,
not general cases. Or maybe this is the general case and I should
go read more backlog? :)

-Melvin


Leopold Toetsch

Jan 29, 2004, 4:11:29 PM
to Melvin Smith, perl6-i...@perl.org
Melvin Smith <mrjol...@mindspring.com> wrote:

> I thought we were discussing correct behavior of a shared data structure,
> not general cases. Or maybe this is the general case and I should
> go read more backlog? :)

Basically we have three kinds of locking:
- HLL user level locking [1]
- user level locking primitives [2]
- vtable pmc locking to protect internals

Locking at each stage and for each PMC will be slow and can deadlock
too. Very coarse-grained locking (like Python's interpreter_lock) doesn't
give any advantage on MP systems - only one interpreter is running at
one time.

We can't solve user data integrity at the lowest level: data logic and
such isn't really visible here. But we should be able to integrate HLL
locking with our internal needs, so that the former doesn't cause
deadlocks[3] because we have to lock internally too, and we should be able
to omit internal locking, if HLL locking code already provides this
safety for a specific PMC.

The final strategy when to lock what depends on the system the code is
running. Python's model is fine for single processors. Fine-grained PMC
locking gives more boost on multi-processor machines.

All generalization is evil and 47.4 +- 1.1% of all
statistics^Wbenchmarks are wrong;)

leo

[1] BLOCK scoped, e.g. synchronized {... } or { lock $x; ... }
These can be rwlocks or mutex typed locks
[2] lock.acquire, lock.release

[3] not user caused deadlocks - the mix of e.g. one user lock and
internal locking.
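Leo's footnote [3] -- a user-level lock and internal locking deadlocking against each other -- is the classic argument for making the PMC's lock reentrant, so the internal path can nest inside a held HLL lock. An editorial sketch (not Parrot code):

```python
import threading

class SharedPMC:
    """One reentrant lock serves both the HLL-level lock and the
    internal vtable locking, so an op invoked inside a user-level
    lock block does not deadlock against its own PMC."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.RLock()   # reentrant: same thread may nest

    def hll_lock(self):
        # What a HLL `lock $x { ... }` block would take.
        return self._lock

    def incr(self):
        # Internal "vtable" locking; safe even when the HLL lock is held,
        # because the same thread may re-acquire an RLock.
        with self._lock:
            self.value += 1

p = SharedPMC()
with p.hll_lock():       # user-level lock held...
    p.incr()             # ...internal locking nests instead of deadlocking
assert p.value == 1
```

This also shows Leo's other point: when the HLL lock is already held, the nested internal acquisition is pure overhead and could be omitted entirely if the VM can tell the outer lock covers that PMC.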


Gordon Henriksen

Jan 30, 2004, 1:27:12 AM
to Melvin Smith, perl6-i...@perl.org

A "shared" data structure, as per Dan's document? It's a somewhat novel
approach, trying to avoid locking overhead with dynamic dispatch and
vtable swizzling. I'm discussing somewhat more traditional technologies,
which simply allow an object to perform equally correctly and with no
differentiation between shared and unshared cases. In essence, I'm
arguing that a "shared" case isn't necessary for some data structures in
the first place.

Gordon Henriksen
mali...@mac.com

Dan Sugalski

Feb 2, 2004, 11:52:13 AM
to Gordon Henriksen, Melvin Smith, perl6-i...@perl.org
At 1:27 AM -0500 1/30/04, Gordon Henriksen wrote:
>On Thursday, January 29, 2004, at 11:55 , Melvin Smith wrote:
>
>>At 11:45 PM 1/28/2004 -0500, Gordon Henriksen wrote:
>>
>>>On Wednesday, January 28, 2004, at 12:53 , Melvin Smith wrote:
>>>
>>>>At 12:27 PM 1/23/2004 -0800, Damien Neil wrote:
>>>>
>>>>>Java Collections are a standard Java library of common data
>>>>>structures such as arrays and hashes. Collections are not
>>>>>synchronized; access involves no locks at all. Multiple threads
>>>>>accessing the same collection at the same time cannot, however,
>>>>>result in the virtual machine crashing. (They can result in
>>>>>data structure corruption, but this corruption is limited to
>>>>>"surprising results" rather than "VM crash".)
>>>>
>>>>But this accomplishes nothing useful and still means the data
>>>>structure is not re-entrant, nor is it corruption "resistant",
>>>>regardless of how we judge it.
>>>
>>>It does accomplish something very useful indeed: It avoids the
>>>overhead of automatic locking when it isn't necessary. When *is*
>>>that locking necessary? To a second order approximation,
>>>***NEVER.***
>>
>>Pardon me but I've apparently lost track of context here.
>>
>>I thought we were discussing correct behavior of a shared data
>>structure, not general cases. Or maybe this is the general case and I
>>should go read more backlog? :)
>
>A "shared" data structure, as per Dan's document? It's a somewhat
>novel approach, trying to avoid locking overhead with dynamic
>dispatch and vtable swizzling. I'm discussing somewhat more
>traditional technologies, which simply allow an object to perform
>equally correctly and with no differentiation between shared and
>unshared cases. In essence, I'm arguing that a "shared" case isn't
>necessary for some data structures in the first place.

Oh, absolutely.

One thing that was made very clear from the (arguably failed) perl 5
experiments with threads is that trying to have the threaded and
unthreaded code paths be merged was an exercise in extreme pain and
code maintenance hell.

Allowing for vtable fiddling's a way to get around that problem by
isolating the threaded and nonthreaded paths. It also allows us to
selectively avoid threaded paths on a per-class per-method basis--on
those classes that don't permit morph (and it is *not* a requirement
that pmcs support it).

For those data types that don't require synchronization on some (or
all) paths, the locking method can be a noop. Not free, as it still
has to be called (unfortunately) but as cheap as possible. If someone
wants to make a good case for it I'm even OK with having a NULL be
valid for the lock vtable method and checking for non-NULLness in the
lock before calling if anyone wants to try their hand at
benchmarking. (I could see it going either way -- if(!NULL) or an
empty sub being faster.)
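Dan's two options -- an empty lock sub that is always called, versus a NULL vtable slot plus a test before the call -- can be shown in miniature with function references (an editorial sketch; names are mine, not Parrot's):

```python
calls = []

class PMC:
    """Miniature stand-in for a PMC with a `lock` vtable slot."""
    def __init__(self, lock_fn=None):
        self.lock_fn = lock_fn    # None plays the role of a NULL slot

def real_lock():
    calls.append("lock")          # stands in for an actual mutex acquire

def noop_lock():
    pass                          # option 1: empty sub, called anyway

def call_lock_checked(pmc):
    # Option 2: test for NULL before calling, skipping the call
    # entirely for classes that need no synchronization.
    if pmc.lock_fn is not None:
        pmc.lock_fn()

unsync = PMC()                    # NULL slot: the call is skipped
noop = PMC(lock_fn=noop_lock)     # empty sub: call made, does nothing
sync = PMC(lock_fn=real_lock)     # real lock: call made, acquires

for pmc in (unsync, noop, sync):
    call_lock_checked(pmc)
assert calls == ["lock"]          # only the truly-locked PMC paid anything
```

Which variant wins in C is exactly the branch-versus-indirect-call tradeoff Dan mentions, and only benchmarking on real hardware can settle it.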

Still the "No crash" VM requirement is a bit of a killer, though. It
will definitely impact performance, and there's no way around that
that I know of. I've a few ideas to minimize the locking needs,
though, and I'll try and get an addendum to the design doc out soon.
