Done with thread-safe CLOS.

Jean-Claude Beaudoin

Aug 23, 2010, 5:15:02 AM

Yesterday, after a few days of effort, I finished coding my modifications to the internals of the CLOS implementation I
currently use. So now I have (what I think to be) a thread-safe CLOS environment. I will make it available to you all in
a release I will make a few weeks from now, as soon as I get reasonably convinced that nothing got terribly broken in
the Win64 port by this surgery of the internals (there will also be Linux 32/64 and Win32 versions).

The solution I went for is rather simple (simplistic, one may think) but, I hope, efficient. I defined a macro
"with-metadata-lock" (in the obvious way) over a recursive lock and wrapped with it every upper-layer entry point of
CLOS and a few of what looked to me like mid-level entry points. Here is the list of the wrapped functions:

add-method
compute-applicable-methods
compute-effective-method
defclass
defgeneric
define-method-combination
defmethod
ensure-generic-function
make-instances-obsolete
remove-method
reinitialize-instance
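
A minimal sketch of what "the obvious way" might look like. It assumes SBCL's sb-thread package, since the post does not name the implementation or its lock primitives; *metadata-lock* and locked-add-method are hypothetical names used only for illustration.

```lisp
;; Assumption: SBCL with thread support. The post does not say which
;; implementation or lock primitives were actually used.
(defvar *metadata-lock*
  (sb-thread:make-mutex :name "CLOS metadata lock")
  "Hypothetical global lock guarding all CLOS metadata.")

(defmacro with-metadata-lock (&body body)
  "Run BODY while holding the metadata lock. The lock is taken
recursively, so nested uses in the same thread do not deadlock."
  `(sb-thread:with-recursive-lock (*metadata-lock*)
     ,@body))

;; Hypothetical wrapper showing how one entry point could be guarded.
(defun locked-add-method (generic-function method)
  (with-metadata-lock
    (add-method generic-function method)))
```

The recursive lock matters because the wrapped entry points call each other: defmethod, for example, typically ends up in ensure-generic-function and add-method within the same thread.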

The function make-instance does not take the lock as long as its internal constructor cache is valid. If this cache
becomes invalid, the metadata lock is acquired before building a closure over the required effective methods; this
should be rare.
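
The fast-path/slow-path split described above could be sketched roughly as follows. This is not the poster's code: it assumes SBCL, every name is hypothetical, and build-constructor is only a stand-in for the real work of closing over effective methods. A production version would also need to consider memory ordering on the unlocked read.

```lisp
;; Assumption: SBCL; every name below is hypothetical.
(defvar *metadata-lock* (sb-thread:make-mutex :name "CLOS metadata lock"))

(defmacro with-metadata-lock (&body body)
  `(sb-thread:with-recursive-lock (*metadata-lock*) ,@body))

;; Alist of (class-name . constructor). It is never mutated in place:
;; writers install a fresh list while holding the lock, so a reader
;; only needs a single read of the variable.
(defvar *constructor-cache* '())

(defun build-constructor (class-name)
  "Stand-in for the expensive step that closes over effective methods."
  (let ((class (find-class class-name)))
    (lambda (&rest initargs)
      (apply #'make-instance class initargs))))

(defun cached-constructor (class-name)
  (let ((entry (assoc class-name *constructor-cache*)))
    (if entry
        (cdr entry)                      ; fast path: no lock taken
        (with-metadata-lock              ; slow path: rebuild under the lock
          (let ((entry (assoc class-name *constructor-cache*)))
            (if entry
                (cdr entry)              ; another thread got there first
                (let ((constructor (build-constructor class-name)))
                  (push (cons class-name constructor) *constructor-cache*)
                  constructor)))))))

(defun invalidate-constructor-cache ()
  "To be called whenever the metadata changes."
  (with-metadata-lock
    (setf *constructor-cache* '())))
```

Given a class demo with an :x initarg, (funcall (cached-constructor 'demo) :x 3) builds the constructor on first use and reuses it afterwards until the cache is invalidated.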

It is my opinion that the existence of the metadata lock cannot be kept as an internal implementation secret and needs
to be made public. Macro with-metadata-lock must be part of the external interface of CLOS on the same level as
defclass or, say, slot-value. One of the first big users of it will most probably be the code of the inspector.

The most difficult part of this CLOS thread-safety mini-project was making the instance constructor cacheable. This
required a significant rewrite of the shared-initialize main/default methods in order to disconnect them from their
direct (and uncontrolled) access to the metadata. Doing this made me realize that the make-instance protocol as
currently specified (with its triad of generic functions: allocate-instance, initialize-instance and shared-initialize)
is essentially slow, brittle, complicated and almost impossible to implement in a thread-safe manner. One really gets
the feeling that this complexity has no real purpose. Since it is implemented by every compliant CLOS implementation it
is most probably used, but is it really useful? (BTW, the implementation details of the instance constructor cache will
have to be shared with any extension method on the triad since they too cannot be permitted to access the metadata
without control. So much for modularity!)

I come out of this experience with a strong urge to define my own implementation-specific extension to CLOS to fix this.
It will most probably take the form of a new defining macro like this:

(defconstructor my-class-name #| put some options here |#)

Obviously this would define a constructor function (with keyword arguments or BOA style), more in the defstruct style,
that would do the whole business of instance initialization on its own without any of the make-instance protocol. I bet
such a constructor would be lock-free, a lot faster than its make-instance counterpart and entirely sufficient in the
vast majority of cases.
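
Purely as an illustration of the idea, here is one way such a macro could be written in portable Common Lisp. The argument syntax (function name, class name, slot names) is an assumption, since the post leaves the options open. The sketch still calls allocate-instance and slot-value, so it skips initialize-instance and shared-initialize but makes no claim to being lock-free.

```lisp
;; Hypothetical syntax: (defconstructor function-name class-name slot...).
;; Defines a keyword-argument constructor that fills the named slots
;; directly and never calls initialize-instance or shared-initialize.
(defmacro defconstructor (name class-name &rest slot-names)
  (let ((instance (gensym "INSTANCE")))
    `(defun ,name (&key ,@slot-names)
       (let ((,instance (allocate-instance (find-class ',class-name))))
         ,@(loop for slot in slot-names
                 collect `(setf (slot-value ,instance ',slot) ,slot))
         ,instance))))

(defclass point () (x y))

(defconstructor make-point point x y)
```

(make-point :x 1 :y 2) then returns a point whose slots are set without running any initialization methods; slot initforms are ignored, and an omitted keyword leaves the slot set to NIL.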

Cheers,

Jean-Claude Beaudoin

P.S.: Since I saw no mention of any CLOS metadata locking in the public documentation of the commercial CL
implementations (I looked at AllegroCL, LispWorks and Clozure), and since I believe the disclosure of the existence of
such a lock is pretty much unavoidable, should I conclude that those CLOS implementations are not thread-safe?

Christophe Rhodes

Aug 23, 2010, 6:48:34 AM

Jean-Claude Beaudoin <jean.claud...@gmail.com> writes:

> It is my opinion that the existence of the metadata lock cannot be
> kept as an internal implementation secret and needs to be made public.
> Macro with-metadata-lock must be part of the external interface of
> CLOS on the same level as defclass or, say, slot-value. One of the
> first big users of it will most probably be the code of the inspector.

I think this is true, but I don't think that your lock is sufficient.
(I'd be happy to be proved wrong.) I think that even with the presence
of a metadata lock, things such as accessing slots, invoking generic
functions, executing method bodies and the like are very difficult to
implement atomically, and hence either also need to take a lock or will
be unsafe. This is most easy to see in the presence of CHANGE-CLASS (in
one thread) altering an object between effective method computation and
invoking the effective method (in another thread), but I believe similar
issues apply to various other racy cases.
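
The CHANGE-CLASS window can be shown deterministically by replaying the interleaving in a single thread. The classes and the generic function below are hypothetical; with real threads, steps 1 and 3 would run in one thread and step 2 in another.

```lisp
(defclass circle () ((radius :initarg :radius :reader radius)))
(defclass blob () ())

(defgeneric describe-shape (shape))
(defmethod describe-shape ((shape circle))
  (format nil "circle of radius ~a" (radius shape)))
(defmethod describe-shape ((shape blob))
  "blob")

(defun stale-dispatch-p ()
  "Return true if the methods selected before CHANGE-CLASS differ from
those applicable afterwards."
  (let* ((object (make-instance 'circle :radius 2))
         ;; Step 1, 'thread A': select methods for OBJECT as a circle.
         (before (compute-applicable-methods #'describe-shape (list object))))
    ;; Step 2, 'thread B': the object changes class in between.
    (change-class object 'blob)
    ;; Step 3, 'thread A' again: the earlier selection no longer matches.
    (let ((after (compute-applicable-methods #'describe-shape (list object))))
      (not (equal before after)))))
```

If thread A went on to run the method it selected in step 1, the circle method would be applied to an object that no longer has a radius slot.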

Christophe

Jean-Claude Beaudoin

Aug 23, 2010, 8:06:27 AM

Christophe Rhodes wrote:
>
> I think this is true, but I don't think that your lock is sufficient.

How embarrassing. :-( I am not quite done after all. I have to chase down
all those hidden "slot-value" calls in readers and writers...

In the case of generic function invocation I am sorry to say that
I don't understand what role the CLOS metadata has to play outside
the call to compute-effective-method (which is properly locking now).
I don't see references to class objects without access to their content
to be what I call metadata use; I see such references as being right
on the border but not crossing it. Would this have something to do with
subtyping relationships?

I also missed the fact that change-class has to be included in the
list of the CLOS functions wrapped in the with-metadata-lock macro.

Sorry for all the premature noise,

Jean-Claude Beaudoin

Pascal Costanza

Aug 24, 2010, 7:11:33 AM

Yes, the functionality is useful. However, it should be possible to
determine at the right moments in time that no applicable methods on
allocate-instance, initialize-instance and shared-initialize exist other
than the standard methods, and in that case the initialization steps can
all be inlined. This is even possible if there are only :after methods
on these functions. ANSI CL comments on these optimization opportunities
in several places. So you should be able to arrange the implementation
in such a way that only that code pays the price for the more complex
initializations that actually needs them.
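
A rough sketch of such a check, using only standard functions. It is an approximation under stated assumptions: plain-object is a hypothetical class whose applicable methods stand for "the standard methods", the function names are invented, and allocate-instance (which dispatches on the class rather than the instance) is not examined.

```lisp
(defclass plain-object () ()
  (:documentation "Hypothetical class with no user-defined methods."))

(defun only-after-extensions-p (generic-function class-name &rest more-args)
  "True if every method applicable to an instance of CLASS-NAME is either
also applicable to a PLAIN-OBJECT (a standard method) or an :after method."
  (flet ((applicable (name)
           (compute-applicable-methods
            generic-function
            (cons (allocate-instance (find-class name)) more-args))))
    (let ((standard (applicable 'plain-object)))
      (every (lambda (method)
               (or (member method standard)
                   (equal (method-qualifiers method) '(:after))))
             (applicable class-name)))))

(defun inlinable-initialization-p (class-name)
  (and (only-after-extensions-p #'initialize-instance class-name)
       ;; shared-initialize takes a second required argument, SLOT-NAMES.
       (only-after-extensions-p #'shared-initialize class-name t)))
```

An implementation would run a test like this when it builds a constructor and would have to repeat it whenever methods are added to or removed from these generic functions.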

Pascal

--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/

Jean-Claude Beaudoin

Aug 25, 2010, 8:29:58 PM

Pascal Costanza wrote:
>
> Yes, the functionality is useful.
>

Would you care to substantiate this statement? I suspect that we each
put the usefulness bar at very different levels, but considering the
amount of detail in this statement I can only do that: suspect.

Jean-Claude

Simon Brooke

Aug 27, 2010, 4:06:08 PM

On Mon, 23 Aug 2010 05:15:02 -0400, Jean-Claude Beaudoin wrote:

> P.S.: Since I saw no mention of any CLOS metadata locking in the public
> documentation of the commercial CL implementations (I looked at
> AllegroCL, LispWorks and Clozure), and since I believe the disclosure of
> the existence of such a lock is pretty much unavoidable, should I
> conclude that those CLOS implementations are not thread-safe?

Not directly answering your question: in the InterLISP/Medley environment
in which the development work on LOOPS and Common LOOPS (from which PCL,
the ancestor of most CLOS implementations, evolved) was done, there was a
'round robin' scheduler which scheduled LISP threads. This
was not pre-emptive - you could block the whole machine permanently (big
red switch time) simply by evaluating the CLISP statement
'uninterruptably do'. The processor these things ran on was a highly
configurable (in microcode) bit sliced processor; I never wrote any
microcode for it but I don't believe the hardware supported multi-
threading, and certainly the task scheduler was itself written in LISP.

I'm not certain whether the MIT LISP Machine and the Symbolics machines
which broadly followed it had pre-emptive multi-tasking, but I don't
think they did. The Connection Machine obviously had (very large numbers
of) parallel hardware threads, but as each thread ran on a processor with
its own private physical memory classic thread-safety issues don't apply,
I think. In any case I don't recall CLOS being implemented in *Lisp.

Obviously, being round robin, processes had to explicitly yield control to
allow other processes to get a look in, but you explicitly chose when to
yield. So the thread safety problems which arise from multiple concurrent
threads and pre-emptive multi-tasking were not a design issue when CLOS
was designed, I think.

--

;; Semper in faecibus sumus, sole profundam variat

Jean-Claude Beaudoin

Aug 27, 2010, 9:54:12 PM

Simon Brooke wrote:
>
> I'm not certain whether the MIT LISP Machine and the Symbolics machines
> which broadly followed it had pre-emptive multi-tasking,

Quite a long time ago, in what now feels like a previous life, I worked
exclusively and for an extended period (about 15 months) on a set of LMI
Lisp Machines, one of which was dedicated to me. I had time enough to
hack pretty deep into it. AFAICR the scheduler on it was preemptive but
with a rather coarse scheduling slice of 1 second.

The OO system we used on them was Flavors and it was used extensively in
the system software. At the very least, the scheduler and the whole of
the windowing system were implemented in Flavors. Some parts of our
application code were also in Flavors but the bulk of it was CLTL1.
CLOS was too much of a new thing for us to try it before the project
ended.

But I strongly believe that, had we used CLOS for real then its
thread-safety flaws would have been exposed way back then.
It is my opinion that all of this is rooted in a fundamental
mistake in the design of CLOS pretty much from its origins
(I write this with, in front of me, my copy of the special issue
of SIGPLAN Notices, September 1988, for those who remember it).
Watch for my upcoming post in this newsgroup for my analysis
of that root cause and a description of my solution to fix
the situation.

Cheers,

Jean-Claude Beaudoin

Tim Bradshaw

Aug 28, 2010, 1:06:02 PM

On 2010-08-28 02:54:12 +0100, Jean-Claude Beaudoin said:

> But I strongly believe that, had we used CLOS for real then its
> thread-safety flaws would have been exposed way back then.

I'm reasonably sure that (new) flavors had all the issues that CLOS
has. I also think you are making mountains out of molehills in some
mad kind of way.

Pascal Costanza

Aug 29, 2010, 9:03:54 AM

There are cases where you need to define additional actions on
initialize-instance and reinitialize-instance that differ in some ways.
In such cases, it's useful that you can define methods on each. In other
cases, they don't differ, and then it's useful that you can define them
on shared-initialize, in order to avoid code duplication. I can imagine
a 'world' where shared-initialize wouldn't exist, but then you would get
unnecessary code duplication. I do have examples for both kinds of
methods in my own code. Considering that it's quite straightforward how
to optimize these protocols, I don't see any significant issues here.
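
To make the code-duplication point concrete, here is a small hypothetical example: a derived slot that must be recomputed both on creation and on reinitialization. A single :after method on shared-initialize covers both paths.

```lisp
(defclass temperature ()
  ((celsius :initarg :celsius :accessor celsius)
   (fahrenheit :accessor fahrenheit)))

;; Runs after MAKE-INSTANCE and after REINITIALIZE-INSTANCE alike,
;; because both protocols end up calling SHARED-INITIALIZE.
(defmethod shared-initialize :after ((temp temperature) slot-names &key)
  (declare (ignore slot-names))
  (when (slot-boundp temp 'celsius)
    (setf (fahrenheit temp) (+ 32 (* 9/5 (celsius temp))))))
```

Without shared-initialize, the same body would have to be written twice, once as an :after method on initialize-instance and once on reinitialize-instance.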

Pascal Costanza

Aug 29, 2010, 9:08:42 AM

On 28/08/2010 03:54, Jean-Claude Beaudoin wrote:

> But I strongly believe that, had we used CLOS for real then its
> thread-safety flaws would have been exposed way back then.
> It is my opinion that all of this is rooted in a fundamental
> mistake in the design of CLOS pretty much from its origins
> (I write this with, in front of me, my copy of the special issue
> of SIGPLAN Notices, September 1988, for those who remember it).
> Watch for my upcoming post in this newsgroup for my analysis
> of that root cause and a description of my solution to fix
> the situation.

I'm definitely interested in that post, so please make sure to write it,
if possible.

Just as a side note, the LispWorks folks claim to have solved
integrating CLOS with their support for SMP in LispWorks 6.0. It may be
interesting to read their documentation.

Duane Rettig

Aug 29, 2010, 10:19:45 AM

Flavors didn't (doesn't) have a change-class that affects instances.
Many years ago one of our large clients who used flavors for their
system asked us to look into the possibility of reworking flavors to
allow for a clos-like change-class. It was going to be a tough
project; unfortunately the emphasis changed and the funds for that
project dried up. If you think about it, though, one of the hardest
things to pin down in CLOS is the combination of change-class and lazy-
updating of instances. It's also one of its greatest assets.
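
For readers who have not met lazy updating, a small hypothetical illustration: an existing instance is brought up to date with a redefined class no later than the next time one of its slots is accessed. The names below are invented for the example.

```lisp
(defclass account ()
  ((balance :initarg :balance :accessor balance)))

(defvar *account* (make-instance 'account :balance 10))
(defvar *update-count* 0)

;; Count how often the instance is migrated to a new class layout.
(defmethod update-instance-for-redefined-class :after
    ((instance account) added-slots discarded-slots property-list &key)
  (declare (ignore added-slots discarded-slots property-list))
  (incf *update-count*))

;; Redefine the class with an extra slot. The existing instance may be
;; updated now or later, but no later than its next slot access.
(defclass account ()
  ((balance :initarg :balance :accessor balance)
   (currency :initform :eur :accessor currency)))

(defun touch-account ()
  "Access the old instance, forcing the pending update if any."
  (values (currency *account*) (balance *account*) *update-count*))
```

The hard part for thread safety is that the update can be triggered by an ordinary slot read in any thread, at a moment the programmer does not choose.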

Duane

Jean-Claude Beaudoin

Aug 29, 2010, 6:30:11 PM

Pascal Costanza wrote:
>
> I'm definitely interested in that post, so please make sure to write it,
> if possible.
>

I'll do my best to post it in the coming 24 hours.

> Just as a side note, the LispWorks folks claim to have solved
> integrating CLOS with their support for SMP in LispWorks 6.0. It may be
> interesting to read their documentation.
>

I just finished browsing it and all I could find in it was 3 small references.
In the User's Manual, section 15.3.2 says that slot access is atomic with
respect to slot modification and with respect to class redefinition (probably
because slots are one machine word large and the accessors cache the
effective slot location), section 15.4 says that slot-value locks the instance.
In the Release Notes, section 13.14 says that update-instance-for-redefined-class
and update-instance-for-different-class each lock the redefined instance and
then adds "so your methods should take care to avoid deadlocks". And I have
the feeling that avoiding deadlocks will be quite sporty with such a scheme.

I am also uneasy about the fact that the instance locking they keep referring
to seems to be a semi-secret device with no public interface. (The ghost of
Java "synchronized" keeps coming back in my mind.)

I could not find a word about CLOS metadata locking.

We'll see how this plays out...

Cheers,

Jean-Claude Beaudoin

Aug 29, 2010, 6:35:50 PM

Duane Rettig wrote:
>
> Flavors didn't (doesn't) have a change-class that affects instances.

That is precisely it: that, and very different class redefinition
semantics.

> If you think about it, though, one of the hardest
> things to pin down in CLOS is the combination of change-class and lazy-
> updating of instances. It's also one of its greatest assets.
>

I will dispute it for the "lazy-updating of instances" part.

Jean-Claude

Joe Marshall

Aug 30, 2010, 1:56:39 PM

On Aug 27, 1:06 pm, Simon Brooke <stillyet+n...@googlemail.com> wrote:
>
> I'm not certain whether the MIT LISP Machine and the Symbolics machines
> which broadly followed it had pre-emptive multi-tasking, but I don't
> think they did.

They did.

Simon Brooke

Aug 30, 2010, 3:06:35 PM

OK, thanks. I sit corrected.

Antony

Aug 31, 2010, 2:25:29 AM

On 8/29/2010 3:30 PM, Jean-Claude Beaudoin wrote:
> effective slot location), section 15.4 says that slot-value locks the
> instance.

I read (skimmed) chapter 15 of the manual. I think it's a bit unclear.
I think what they are saying is
(paraphrasing)
an instance lock is obtained *if* you ask for the atomic version of
an operation on the slot value

It would help if they showed two examples - one that is not atomic and
another that's atomic and hence slower (which is ok).

I hope my interpretation is right, else it would be a pretty slow CLOS

-Antony

Jean-Claude Beaudoin

Aug 31, 2010, 3:13:52 AM

Antony wrote:
> On 8/29/2010 3:30 PM, Jean-Claude Beaudoin wrote:
>> effective slot location), section 15.4 says that slot-value locks the
>> instance.
> I read(skimmed) the manual chap 15. I think it's a bit unclear.
> I think what they are saying is
> (paraphrasing)
> an instance lock is obtained *if* you ask for the atomic version of
> an operation on the slot value
>

I re-read 15.4 and I get pretty much the same thing. (slot-value obj name)
is just one of the few setfable places on which a low-level atomic operation,
like say atomic-incf, can be applied.

So it says that slot-value can lock the instance, not that it must.
But the real scoop here is that there is a lock associated with each
instance. And the question then is: What else is done with that lock?
Oh the ghost of Java monitors...

>
> I hope my interpretation is right, else it would be a pretty slow CLOS
>

I am afraid that it could get a lot worse than simply being slow: it
could get dead slow.

Jean-Claude

Pascal Costanza

Aug 31, 2010, 4:33:52 AM

Yes, the documentation could be clearer here. From conversations with
the folks from LispWorks I learned, though, that slot accesses normally
don't take any locks; locks are used only in the documented cases.

Pascal Costanza

Sep 6, 2010, 6:41:00 AM

On 30/08/2010 00:30, Jean-Claude Beaudoin wrote:
> Pascal Costanza wrote:
>>
>> I'm definitely interested in that post, so please make sure to write
>> it, if possible.
>>
>
> I'll do my best to post it in the coming 24 hours.
>
>> Just as a side note, the LispWorks folks claim to have solved
>> integrating CLOS with their support for SMP in LispWorks 6.0. It may
>> be interesting to read their documentation.
>>
>
> I just finished browsing it and all I could find in it was 3 small
> references. In the User's Manual, section 15.3.2 says that slot access
> is atomic with respect to slot modification and with respect to class
> redefinition (probably because slots are one machine word large and the
> accessors cache the effective slot location), section 15.4 says that
> slot-value locks the instance.

15.3.2 doesn't imply that the lock is taken on every slot access. This
is only true for class redefinition. (I found it hard to understand that
part as well, so I asked the LispWorks folks, and they confirmed that
slot access doesn't take a lock under normal operation.)

15.4 only states that locks are taken for low-level atomic operations
(like compare-and-swap and atomic-incf, etc.). 15.4 also clearly
recommends against using CLOS slots for such atomic operations.

> In the Release Notes, section 13.14 says that
> update-instance-for-redefined-class and
> update-instance-for-different-class each lock the redefined instance
> and then adds "so your methods should take care to avoid deadlocks".
> And I have the feeling that avoiding deadlocks will be quite sporty
> with such a scheme.

I don't understand why. update-instance-for-redefined-class is supposed
to perform only short-lived initializations of new slots, or adaptations
of existing slots. This shouldn't require taking locks in the first place.

> I am also uneasy about the fact that the instance locking they keep
> referring to seems to be a semi-secret device with no public interface.
> (The ghost of Java "synchronized" keeps coming back in my mind.)

The comparison is not quite right, because the locks that each and every
Java object has are always publicly accessible, so everybody can lock an
object if they want to.