Making CLOS thread-safe, how difficult?

Jean-Claude Beaudoin

Aug 10, 2010, 12:10:55 AM

I have been reading the CLOS code of ECL lately,
a somewhat far derivative of PCL I think, and I
realized that all that metaobject business was
quite clearly NOT thread-safe. Classes, methods
and generic functions are happily modified
outside of any lock.

The implementation notes of CLISP currently also
state that "CLOS is NOT thread-safe". So the
situation seems to be somewhat common among CL
implementations.

That cannot be too much of a good thing in
these days of multi-core CPUs and it bothers
me quite a bit! So I'd like to fix it, at least
in the implementation I currently use (ECL).

How difficult can it be to make CLOS thread-safe?
Does any of you have an (informed?) opinion on that?

I saw that SBCL wraps a good number of CLOS
functions in a "with-world-lock". Is that a
sound approach? Wouldn't a more focused
"with-metadata-lock" be good enough instead
of locking the whole world?

CLOS feels like it has so many customizable hooks
that the potential for deadlock through some
of that customization code is quite great.
In that context a clearly stated and widely
publicized metadata locking policy would have
to be established. Am I wrong on this?

Also, make-instance and its "two step"
process feels like a source of troubles.
First call generic-function "allocate-instance"
and then call generic-function "initialize-instance";
what about the void between these two?
I would hate to have to grab a lock to
instantiate objects!

Any advice on this whole subject would be most appreciated.

Thanks,

Jean-Claude Beaudoin

D Herring

Aug 10, 2010, 8:28:49 PM

On 08/10/2010 12:10 AM, Jean-Claude Beaudoin wrote:
> Any advice on this whole subject would be most appreciated.

The following comments are an initial reaction to reading your post.
i.e. not well thought out.

> I have been reading the CLOS code of ECL lately,
> a somewhat far derivative of PCL I think, and I
> realized that all that metaobject business was
> quite clearly NOT thread-safe. Classes, methods
> and generic functions are happily modified
> outside of any lock.

...

> How difficult can it be to make CLOS thread-safe?
> Does any of you have an (informed?) opinion on that?
>
> I saw that SBCL wraps a good number of CLOS
> functions in a "with-world-lock". Is that a
> sound approach? Wouldn't a more focused
> "with-metadata-lock" be good enough instead
> of locking the whole world?

SBCL's approach seems like a good one. It is relatively simple and
effective.

I imagine SBCL achieves the sync by using the same mechanism it uses
for GC. Stop the world. At that point, there is no benefit to using
a different name. This is slow when it happens; but it removes the
need for conditional checks during normal runtime.

> CLOS feels like it has so many customizable hooks
> that the potential for deadlock through some
> of that customization code is quite great.
> In that context a clearly stated and widely
> publicized metadata locking policy would have
> to be established. Am I wrong on this?

As long as all synchronization uses the same CLOS-mutation lock, and
the code interior to CLOS doesn't grab any other locks, it's probably
deadlock-free. If the CLOS mutex is always the last to be grabbed,
then it guarantees a proper partial order. A fine-grain system with
multiple locks or lock-free algorithms would require a better
specification.

> Also, make-instance and its "two step"
> process feels like a source of troubles.
> First call generic-function "allocate-instance"
> and then call generic-function "initialize-instance";
> what about the void between these two?
> I would hate to have to grab a lock to
> instantiate objects!

That is indeed something to fear. I don't see an obvious solution.
Any function which sequences CLOS calls could be affected. Stuff like
this argues for a sort of immutable structure system, something that
allows each thread to use a consistent set of bindings until its stack
frame returns past some boundary. Pascal C. has illustrated the use
of ContextL for cleanly solving such issues. Don't know if it was
truly thread-safe, though.

- Daniel

Kenneth Tilton

Aug 11, 2010, 1:02:20 AM

Sorry, what exactly is the problem? A relevant method being defined
between allocate-instance and initialize-instance calls on one instance
in a way that matters?

What we do in my shop is take that monkey out back and shoot them. PETA
gives us hell, but it does wonders for discipline.

hth, kt

--
http://www.stuckonalgebra.com
"The best Algebra tutorial program I have seen... in a class by itself."
Macworld

Tim Bradshaw

Aug 11, 2010, 5:33:20 AM

On 2010-08-11 01:28:49 +0100, D Herring said:
>
> SBCL's approach seems like a good one. It is relatively simple and effective.

If this world-lock means "stop all other mutation in the system", and
if it happens significantly often, then Amdahl's law will be hurting you.

It seems to me that it would be reasonable for the system to protect
things which (for instance) create new classes and so on, but for
something like instance creation it's really up to you, since CLOS
can't know what methods you define. And even then there are so many
places where user code can intervene that it's hard to see how much
that is useful could be done.

There are clearly cases where the system needs to at least ensure
things are safe. For instance I think that any serious implementation
would be caching effective methods, and that cache needs to be safe
(it's OK for multiple things to think they need to recompute the
method).
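
A minimal sketch of the kind of cache described above, under stated assumptions: lookups and stores are serialized by a mutex, but the computation itself runs outside it, so two threads may both compute the same entry and the first store wins. The names `*emf-cache*`, `*emf-cache-lock*` and `cached-effective-method` are hypothetical, and SBCL's `sb-thread` mutexes are used only for concreteness; this is not taken from any implementation.

```lisp
;; Hypothetical names throughout; assumes SBCL's sb-thread package.
(defvar *emf-cache* (make-hash-table :test #'equal))
(defvar *emf-cache-lock*
  (sb-thread:make-mutex :name "effective-method cache"))

(defun cached-effective-method (key compute)
  "Return the cached value for KEY, calling COMPUTE (a thunk) on a miss.
Cached values are assumed to be non-NIL."
  (or (sb-thread:with-mutex (*emf-cache-lock*)
        (gethash key *emf-cache*))
      ;; Miss: compute outside the lock. Several threads may get here at
      ;; once for the same KEY; that wastes some work but is harmless.
      (let ((fresh (funcall compute)))
        (sb-thread:with-mutex (*emf-cache-lock*)
          ;; Keep whichever value was stored first so all callers agree.
          (or (gethash key *emf-cache*)
              (setf (gethash key *emf-cache*) fresh))))))
```

The table itself is never touched without the lock, which is the part that has to be safe; invalidating entries when a method is added or removed is left out of the sketch.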

I agree with Kenny that it's not realistic to expect things to be safe
in the presence of (for instance) new methods being defined half way
through the computation of an effective method: if your program does
that, you deserve to lose.

Jean-Claude Beaudoin

Aug 11, 2010, 10:46:51 AM

I missed the fact that there is a third generic function involved in the business of make-instance: shared-initialize.
So it makes for at least two (if not three) holes through which trouble can creep in. First between allocate-instance
and initialize-instance and second between the beginning of initialize-instance and its own invocation of shared-initialize.
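
To make the holes concrete, here is a simplified model of what make-instance does, written as an ordinary function. It is only a sketch: `sketch-make-instance` and the `point` class are hypothetical, and a real make-instance also validates the initargs and merges in default initargs before these steps.

```lisp
;; A simplified model of MAKE-INSTANCE, not any implementation's source.
(defun sketch-make-instance (class &rest initargs)
  (let ((instance (apply #'allocate-instance class initargs)))
    ;; Hole 1: the instance exists but every slot is still unbound. A
    ;; class redefinition or a new method arriving here is seen only by
    ;; the second half of the protocol.
    (apply #'initialize-instance instance initargs)
    ;; Hole 2 sits inside INITIALIZE-INSTANCE, whose standard method
    ;; calls (apply #'shared-initialize instance t initargs).
    instance))

(defclass point ()
  ((x :initarg :x :reader point-x)
   (y :initarg :y :initform 0 :reader point-y)))

(defvar *p* (sketch-make-instance (find-class 'point) :x 1))
```

Nothing in this sequence holds a lock, so each arrow between the generic function calls is a place where another thread can change the metadata.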

Kenneth Tilton wrote:
>
> Sorry, what exactly is the problem? A relevant method being defined
> between allocate-instance and initialize-instance calls on one instance
> in a way that matters?
>

Currently I see at least two problematic scenarios.

The first one is in the event of a change in the DAG of classes that happens to modify the class precedence list (CPL) of
a class for which a make-instance is in progress. I don't have much hope for the coherence of the whole set of
allocate-instance, initialize-instance and shared-initialize if they have been customized with code that shares some
critical assumptions about the class they work on. Worse, in this situation you will end up in a call to the generic
function update-instance-for-redefined-class right in the middle of the ongoing make-instance, in the context
of a partially or totally uninitialized instance. I doubt this is a context that update-instance-for-redefined-class is
likely to handle gracefully. The potential for all of this to end in a call to the debugger is much too large for me to
be happy about it. I think a solution here would be to somehow defer the instance updating to a point after the
initialization is done.
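
For readers less familiar with the redefinition protocol, the following single-threaded sketch shows when update-instance-for-redefined-class runs. The `account` class and `*update-count*` are hypothetical names used only for illustration.

```lisp
(defclass account ()
  ((balance :initarg :balance :accessor balance)))

(defvar *acct* (make-instance 'account :balance 10))
(defvar *update-count* 0)

;; Count how often the update protocol runs for ACCOUNT instances.
(defmethod update-instance-for-redefined-class :after
    ((instance account) added-slots discarded-slots property-list
     &rest initargs)
  (declare (ignore added-slots discarded-slots property-list initargs))
  (incf *update-count*))

;; Redefine the class with an extra slot. Existing instances become
;; obsolete; each one is updated no later than its next slot access.
(defclass account ()
  ((balance :initarg :balance :accessor balance)
   (owner :initform :unknown :accessor owner)))

(owner *acct*)   ; touching a slot guarantees the update has run
```

The update is triggered by whichever thread touches the instance next, which is why a redefinition landing in the middle of another thread's make-instance can run the update on an instance whose slots are not yet filled in.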

The second scenario is a call to make-instance that lands between a (defmethod allocate-instance ...) and a (defmethod
initialize-instance ...), or vice versa, where again allocate-instance and initialize-instance share some critical
assumptions about the class to instantiate. Caching the three effective methods involved in the execution of
make-instance as one single unit is probably a solution to this one, the cache line being filled under the protection
of the metadata lock.

For the rest of the CLOS metadata modifying code (defclass, defmethod, defgeneric, ...) I am about to do pretty much
what SBCL does and wrap most of it in a bunch of with-metadata-lock...
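
A sketch of what such a with-metadata-lock could look like, assuming SBCL's `sb-thread` package for the lock itself (ECL's own threading API would differ). The macro, `*metadata-lock*`, and the `describe-shape` example are hypothetical and not taken from SBCL or ECL sources.

```lisp
;; Hypothetical sketch; assumes SBCL's sb-thread package.
(defvar *metadata-lock* (sb-thread:make-mutex :name "CLOS metadata"))

(defmacro with-metadata-lock (&body body)
  "Run BODY while holding the one global metadata lock.
The lock is taken recursively, since metaobject code re-enters itself
(DEFCLASS may finalize superclasses, add accessor methods, and so on)."
  `(sb-thread:with-recursive-lock (*metadata-lock*)
     ,@body))

(defgeneric describe-shape (shape))

(defun install-default-shape-method ()
  ;; Every mutation of classes, methods or generic functions goes
  ;; through the same lock, so mutators cannot interleave.
  (with-metadata-lock
    (defmethod describe-shape ((shape t))
      :generic-shape)))
```

Note that this serializes only the mutators. Code that merely reads the metadata (dispatch, make-instance) can still observe an intermediate state unless it also takes the lock or the new state is published in a single step.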

Cheers,

Jean-Claude Beaudoin

Captain Obvious

Aug 11, 2010, 10:57:35 AM

JCB> Also, make-instance and its "two step"
JCB> process feels like a source of troubles.
JCB> First call generic-function "allocate-instance"
JCB> and then call generic-function "initialize-instance";
JCB> what about the void between these two?
JCB> I would hate to have to grab a lock to
JCB> instantiate objects!

Then maybe it is better solved on application level?

During development you probably do not need to care.
If you want to hot-patch things in production you might want to make
something like a read-write lock.

E.g. if your application is serving requests, it grabs a "code read" lock for
the duration of each request.
When you want to hot-patch you grab the "code write" lock, and after all requests
release their read locks you can hot-patch the code.
Then new requests will use the new code.
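
The scheme above can be sketched as a small read-write lock built from a mutex and a condition variable. This is only an illustration under stated assumptions: `code-lock` and the two `call-with-...` functions are hypothetical names, and SBCL's `sb-thread` package is assumed for the primitives.

```lisp
;; Hypothetical application-level "code lock"; assumes SBCL's sb-thread.
(defstruct code-lock
  (mutex (sb-thread:make-mutex :name "code lock"))
  (queue (sb-thread:make-waitqueue))
  (readers 0 :type fixnum)   ; requests currently running
  (writer-active nil))       ; true while a hot-patch is being applied

(defun call-with-code-read-lock (lock thunk)
  "Run THUNK as a request: wait out any patch in progress, then proceed."
  (let ((mutex (code-lock-mutex lock))
        (queue (code-lock-queue lock)))
    (sb-thread:with-mutex (mutex)
      (loop while (code-lock-writer-active lock)
            do (sb-thread:condition-wait queue mutex))
      (incf (code-lock-readers lock)))
    (unwind-protect (funcall thunk)
      (sb-thread:with-mutex (mutex)
        (decf (code-lock-readers lock))
        (sb-thread:condition-broadcast queue)))))

(defun call-with-code-write-lock (lock thunk)
  "Run THUNK as a hot-patch: wait until no request or patch is running."
  (let ((mutex (code-lock-mutex lock))
        (queue (code-lock-queue lock)))
    (sb-thread:with-mutex (mutex)
      (loop while (or (code-lock-writer-active lock)
                      (plusp (code-lock-readers lock)))
            do (sb-thread:condition-wait queue mutex))
      (setf (code-lock-writer-active lock) t))
    (unwind-protect (funcall thunk)
      (sb-thread:with-mutex (mutex)
        (setf (code-lock-writer-active lock) nil)
        (sb-thread:condition-broadcast queue)))))
```

A request handler would wrap its body in `call-with-code-read-lock`, and the patch loader would wrap `load` in `call-with-code-write-lock`. The sketch is deliberately simple: a patch can wait indefinitely under a steady stream of requests, and a request that tries to patch from inside its own read lock will deadlock.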

Probably this is the only way to guarantee consistency for hot-patching. (It's
not like I'm an expert in this, just speculating.)

You cannot guarantee this sort of consistency at the implementation level, so
why try at all?
From a practical perspective, any attempt to do that would be half-assed, but
still might impose some overhead, as you've noted above.

So I think instead of making it abstractly safe it is better to make it
documented and reasonable -- bad things which can happen should be
documented and unreasonably bad things should be fixed.

Tim Bradshaw

Aug 11, 2010, 1:36:29 PM

On 2010-08-11 15:46:51 +0100, Jean-Claude Beaudoin said:

> The first one is in the event of a change in the DAG of classes that
> happens to modify the class precedence list (CPL) of a class for which a
> make-instance is in progress. I don't have much hope for the coherence
> of the whole set of allocate-instance, initialize-instance and
> shared-initialize if they have been customized with code that shares
> some critical assumptions about the class they work on.

I'm with Kenny: find the people who write code which has this problem,
kill them and sell them for pet food.

Günther Thomsen

Aug 11, 2010, 7:58:03 PM

On Aug 11, 7:46 am, Jean-Claude Beaudoin <jean.claude.beaud...@gmail.com> wrote:
> I missed the fact that there is a third generic function involved in the business of make-instance: shared-initialize.
> So it makes for at least two (if not three) holes through which trouble can creep in. First between allocate-instance
> and initialize-instance and second between the beginning of initialize-instance and its own invocation of shared-initialize.
>

Two/three holes? This assumes that allocate-instance,
initialize-instance, etc. are atomic operations, doesn't it? Please excuse my
ignorance, but are they specified to be that way? Are they implemented
that way?

Jean-Claude Beaudoin

Aug 12, 2010, 4:37:38 AM

Well, in CLOS, class redefinition is an advertised feature supported by an officially standard protocol
(update-instance-for-???-class, make-instances-obsolete). Considering that, the scenario I mentioned is not far-fetched
and is not an abuse of the system. So I think that the issue here cannot be dismissed trivially.

Jean-Claude Beaudoin

Aug 12, 2010, 5:23:38 AM

Yes, in that statement I did assume the mentioned generic functions to be "atomic with respect to CLOS metadata". That
assumption is the least favorable one for the point I was informally trying to make, which is that there is a nonzero
lower bound on the number of regions (what I called holes) where synchronization issues can appear. If you remove
that "atomicity" assumption the situation only gets worse.

The ANSI CL standard is silent on any aspect involving multi-threading including this one. But usually the business of
generic functions with respect to CLOS metadata is done by a call to compute-applicable-methods followed by a call to
compute-effective-method, both MOP generic functions. The result of this is usually cached somewhere and is reused in
subsequent calls to the generic function, thus it gives somewhat of an "atomic" flavor to the thing. One would also
wish very strongly that the sequence compute-applicable-methods+compute-effective-method be atomic with respect to CLOS
metadata mutation, but the standard says nothing about that.

There is currently some locking done in the CLOS code of SBCL but one gets the impression from reading the code that
this was done more as a partial treatment of an immediate symptom (crash following redefinition of a class) than as a
cure for the disease.

Captain Obvious

Aug 12, 2010, 7:36:17 AM

JCB> Well, in CLOS, class redefinition is an advertised feature supported
JCB> by an officially standard protocol (update-instance-for-???-class,
JCB> make-instances-obsolete). Considering that, the scenario I mentioned
JCB> is not far-fetched and is not an abuse of the system. So I think that
JCB> the issue here cannot be dismissed trivially.

I think the point is that an application should not use classes _while_
redefining them.
Redefining classes is OK, but a multi-threaded application should take
measures to maintain consistency and only do modifications when objects are
not in use.

Tim Bradshaw

Aug 12, 2010, 2:06:30 PM

On 2010-08-12 09:37:38 +0100, Jean-Claude Beaudoin said:

> Well, in CLOS, class redefinition is an advertised feature supported by
> an officially standard protocol (update-instance-for-???-class,
> make-instances-obsolete). Considering that, the scenario I mentioned is
> not far-fetched and is not an abuse of the system. So I think that the
> issue here cannot be dismissed trivially

Yes, redefinition is supported. But function redefinition, loading
code, and so on are also supported, and I seriously doubt if many
implementations are thread-safe in the presence of people (say) loading
patches.

Personally, I think that there's a fair amount to learn from Java here:
they originally had almost everything defined to be "thread safe" and
it turned out to hurt performance a good deal and not actually to help,
because making operations defined by the language/library thread safe
does not actually make programs thread safe: programs tend to do more
than one of these things, so they needed to interlock *anyway*. So
the second generation of Java stuff (Java 2?) had far fewer thread-safe
things.

If I was going to spend time thinking about CLOS and thread-safety, the
approach I'd take would be to think about sealing, and specifically the
ability to seal parts of the system while leaving others unsealed. If
you, say, seal a generic function, then you *know* people will not be
defining new methods on it, and for added value you can do a lot of
work to optimise it, because the sealing declaration licenses that as
well. Similarly sealing classes allows a lot of nice assumptions to be
made. Finally sealing declarations are great ways of expressing intent
(in the same way that type declarations can be in CL, even when there
is no expectation that the system will enforce them). It's a long time
since I read the Dylan documentation (probably not since it had an
sexp-based syntax in fact), but I remember they seemed to have quite a
good approach to this - probably not surprising since the Dylan object
system is very much a post-CLOS system, if I remember rightly.

Tim Bradshaw

Aug 12, 2010, 2:07:58 PM

On 2010-08-12 12:36:17 +0100, Captain Obvious said:

> I think the point is that an application should not use classes _while_
> redefining them.
> Redefining classes is OK, but a multi-threaded application should take
> measures to maintain consistency and only do modifications when objects
> are not in use.

Exactly. In just the same way that one would not expect to use a
system while it was being recompiled & loaded (or if that was
supported, which it can be, you would expect to do a lot of
application-level work to make it work properly).

Jean-Claude Beaudoin

Aug 13, 2010, 11:04:07 PM

Tim Bradshaw wrote:
>
> Yes, redefinition is supported. But function redefinition, loading
> code, and so on are also supported, and I seriously doubt if many
> implementations are thread-safe in the presence of people (say) loading
> patches.

In that case the problematic data "belongs" to the application; with CLOS the problematic data belongs to the internals
of the CLOS runtime system. Nothing else in CL has that amount of (somewhat) complex internal data in play.

>
> Personally, I think that there's a fair amount to learn from Java here:
> they originally had almost everything defined to be "thread safe" and it
> turned out to hurt performance a good deal and not actually to help,

I agree to some extent here. A few years ago during a conference I heard over lunch, from someone very well in the know
about the details of the design process of Java, that the biggest mistake committed during the design of Java had been
the inclusion of monitors (the famous "synchronized" keyword). That person went on to say that they didn't know what
they were doing when they did that. That event made a very strong and lasting impression on me.

> If I was going to spend time thinking about CLOS and thread-safety, the
> approach I'd take would be to think about sealing, and specifically the
> ability to seal parts of the system while leaving others unsealed.

This is an interesting idea but I think it is orthogonal to my current concern of thread-safety. Surely worth pursuing.
Could you point me to some reference on the subject that would be somewhat more focused than the whole of the Dylan
reference manual?

Finally, I would like to take a few words to thank everyone who replied to my original post of a few days ago.
Your input was valuable and useful in helping me progress toward what I think will be a very viable solution. It may
be somewhat of a bold new experiment, but so be it.

Cheers,

Jean-Claude Beaudoin

Tim Bradshaw

Aug 16, 2010, 8:57:50 AM

On 2010-08-14 04:04:07 +0100, Jean-Claude Beaudoin said:

> This is an interesting idea but I think it is orthogonal to my current
> concern of thread-safety.

I don't think it is: sealed bits of the system are easy to make
thread-safe because you can't redefine things.

> Surely worth pursuing. Could you point me to some reference on the
> subject that would be somewhat more focused than the whole of the Dylan
> reference manual?

No, not really, there must have been some documents but I don't
remember them or whether they still exist.

Vladimir Sedach

Sep 9, 2010, 12:44:29 PM

On Aug 11, 10:57 am, "Captain Obvious" <udode...@users.sourceforge.net> wrote:
> JCB> Also, make-instance and its "two step"
> JCB> process feels like a source of troubles.
> JCB> First call generic-function "allocate-instance"
> JCB> and then call generic-function "initialize-instance";
> JCB> what about the void between these two?
> JCB> I would hate to have to grab a lock to
> JCB> instantiate objects!
>
> Then maybe it is better solved on application level?

That is indeed the only reasonable conclusion. Even if you combine
allocate-instance and initialize-instance, creating object instances
can still be a multi-step process because of call-next-method.
Actually, object instantiation is not necessarily thread-safe even for
*immutable* objects with trivial constructors! Java punted on this
issue until version 5: http://pveentjer.wordpress.com/2007/03/18/immutability-doesnt-guarantee-thread-safety/

Be careful what you do with your new object reference until you're
absolutely sure all its slots have been initialized.

Vladimir