Clojure + Terracotta = Yeah, Baby!


Paul Stadig

Feb 27, 2009, 3:20:33 PM
to Clojure
I've recently done some experimentation with Clojure and Terracotta.
I've detailed my experience at:

http://paul.stadig.name/2009/02/clojure-terracotta-yeah-baby.html

and shared my code at:

http://github.com/pjstadig/terraclojure/tree/master/

I'm the first to admit that I'm not an expert in Terracotta. This is
actually my first time working with it.

I was able to set up a permanent store, and to run a couple of
transactions against some refs shared across multiple JVMs. I ran into
two snags. The first was that Keyword did not implement hashCode,
which was a simple fix. Rich graciously added a hashCode method. :)

The other was how to point Terracotta to a var from the Java side. I
ended up using a simple Java class to hold on to my refs, since then
it was easy to point Terracotta to the right place. It works just
fine, but I'm not sure if there is a better way to do it.
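
A minimal sketch of the holder-class idea follows. The class name SharedRefHolder, its field, and its method are hypothetical and are not taken from the terraclojure repository; the field is typed as Object so the sketch compiles without Clojure on the classpath. The only point is that a static field has a fixed, fully qualified name that a Terracotta root declaration can refer to.

```java
// Hypothetical holder class (names are assumptions, not the terraclojure code).
final class SharedRefHolder {
    // A static field has a stable fully qualified name
    // (SharedRefHolder.sharedRef) that a root declaration can point at.
    // In real use this would hold a clojure.lang.Ref.
    private static Object sharedRef;

    private SharedRefHolder() {}

    // Install the ref once; later callers get the value already installed.
    static synchronized Object getOrInit(Object candidate) {
        if (sharedRef == null) {
            sharedRef = candidate;
        }
        return sharedRef;
    }
}
```

From the Clojure side such a class could be reached through ordinary Java interop, so the Clojure code never needs to expose a var to Terracotta directly.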

I'd be interested in knowing if anyone else has experience to share
about using Terracotta with Clojure.


Paul Stadig

Luc Prefontaine

Feb 27, 2009, 4:37:44 PM
to clo...@googlegroups.com
We are trying to get Clojure shared over Terracotta, not just specific things but the whole Clojure object space
(name spaces, root values, ....) except stuff that needs to remain local (streams, ....).

We take an all-or-nothing approach here: we would like to see many Clojure instances work as a single
entity.

We are facing a number of issues, one of them being the lack of support by Terracotta of AtomicReference.

Terracotta has placed this class on the list of classes they will eventually support, but as of 2.7.3
it's still not supported.

The Sun AtomicReference does peeks and pokes in the heap using internal routines of the JVM (JNI).
Clearly this cannot be done with multiple JVMs, different object heaps, ....

Terracotta suggests using the oswego library, but since Java 5 came out it has been in maintenance mode only, and
adding another library to the puzzle for that single purpose did not look very efficient to us.

So we created a SharedAtomicReference implementation that uses standard locks to control access to
the value with the corresponding Terracotta configuration.

We use a factory to decide at run time which implementation to use based on a system property.
To preserve the Clojure code base, we implemented an AtomicReference interface and a delegate to
the AtomicReference class. The Clojure code now uses the interface.
This allows us to make progress without disturbing the Clojure code base too much.
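
The arrangement described above (an interface, a delegate to the JDK class, a lock-based alternative, and a factory driven by a system property) can be sketched roughly as follows. This is an illustration only, not the code Luc's team wrote: the names AtomicRef, LocalAtomicRef, SharedAtomicRef, AtomicRefFactory and the property name clojure.atomicref.shared are all hypothetical, and the Terracotta lock configuration that would accompany the shared version is not shown.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical interface the runtime would code against.
interface AtomicRef<T> {
    T get();
    void set(T value);
    boolean compareAndSet(T expected, T update);
}

// "Local" mode: delegate straight to the JDK class.
final class LocalAtomicRef<T> implements AtomicRef<T> {
    private final AtomicReference<T> delegate;

    LocalAtomicRef(T initial) {
        this.delegate = new AtomicReference<T>(initial);
    }

    public T get() { return delegate.get(); }
    public void set(T value) { delegate.set(value); }
    public boolean compareAndSet(T expected, T update) {
        return delegate.compareAndSet(expected, update);
    }
}

// "Shared" mode: a plain field guarded by the object's monitor, so that
// ordinary locks (which a clustering layer can instrument) control access.
final class SharedAtomicRef<T> implements AtomicRef<T> {
    private T value;

    SharedAtomicRef(T initial) {
        this.value = initial;
    }

    public synchronized T get() { return value; }
    public synchronized void set(T newValue) { value = newValue; }
    public synchronized boolean compareAndSet(T expected, T update) {
        // Identity comparison, matching AtomicReference semantics.
        if (value != expected) {
            return false;
        }
        value = update;
        return true;
    }
}

// Factory that picks an implementation at run time.
final class AtomicRefFactory {
    // Hypothetical property name, chosen for this sketch only.
    static final String SHARED_PROPERTY = "clojure.atomicref.shared";

    private AtomicRefFactory() {}

    static <T> AtomicRef<T> create(T initial) {
        if (Boolean.getBoolean(SHARED_PROPERTY)) {
            return new SharedAtomicRef<T>(initial);
        }
        return new LocalAtomicRef<T>(initial);
    }
}
```

With a layout like this, starting the JVM with -Dclojure.atomicref.shared=true would select the lock-based version, and the calling code would not need to change.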

This has some performance implications that we do not fully understand yet since we need a living Clojure implementation
to work with, something we are in the process of creating... what a catch 22 :)))
We thought about spending time working on this aspect right now but we prefer to wait for the first alpha release.

As for the rest of the Clojure code, we are working on the classes implementing vectors, hash maps, ....
to get them shared through terracotta.

Our main challenge these days is putting together the terracotta configuration of the internal classes of Clojure that need to be shared.

We may hit issues that make some classes non portable.
These will require us to implement an alternative. We will then use a slightly different approach:
we need only factories to allocate either an alternate version or the "local" implementation.
The Clojure code already uses interfaces to access the data objects so this will be almost transparent in the code.

We prefer to write a "distributed" implementation side by side with the original one and switch easily between them
to compare behaviour and performance with the same code base.

When we have a clearer understanding of how it performs we will look at how to merge this in the code base
that will have changed in between. We may then be able to reduce the number of alternate implementations
if more were created.

The work is in progress; next week is a school break week here so the pace will slow down a bit.
We wanted to start on this earlier (before Xmas) but we had updates to put in production, client care to do, ...
and other mundane tasks to get the bread and butter on the table.

Comments/suggestions are welcome...

Luc
-- 

Luc Préfontaine

Off.:(514) 993-0320
Fax.:(514) 993-0325

Armageddon was yesterday, today we have a real problem...

Phil Hagelberg

Feb 27, 2009, 4:49:00 PM
to clo...@googlegroups.com
Paul Stadig <pa...@stadig.name> writes:

> I've recently done some experimentation with Clojure and Terracotta.
> I've detailed my experience at:
>
> http://paul.stadig.name/2009/02/clojure-terracotta-yeah-baby.html

Very exciting; I'm looking forward to trying this out! Thanks for posting.

-Phil

Paul Stadig

Feb 27, 2009, 4:51:51 PM
to clo...@googlegroups.com
Yeah, after sharing clojure.lang.Keyword.table I tried to share
clojure.lang.Namespace.namespaces, but ran into a problem because
Namespace uses an AtomicReference for its mappings and aliases
members. I thought that not sharing namespaces would be a problem (and
maybe it still is; I don't have as much practical experience with this
as you), but I wasn't sure it would be. Looking up a var in
a namespace on different JVMs would find different vars, but since
they are thread local, I'm not sure it's an issue. Having all of
your refs shared without having to specify each specific ref would be
interesting, but since most of the stuff (functions, vars, etc.) is
immutable or thread local, I'm not sure how much of an issue it
is.

Obviously, if you were going to redefine some functions or something,
then you'd either have to do so on each JVM, or just restart all of
the clients.

And as I said in my article, I didn't do any work with agents, so
maybe there's a lot missing from my part of the puzzle.


Paul

Luc Prefontaine

Feb 27, 2009, 6:54:14 PM
to clo...@googlegroups.com
Having the ability to redefine a function once for all instances is something we really want...
and you need name spaces to be shared for that to happen.
We take the approach of sharing everything that seems worth it, then we will see
what we might need to keep private to each JVM.

Sharing var roots and refs is also something we want, so we can initiate STM transactions
involving multiple JVMs.

We have much to learn from a prototype and that is what we strive to achieve in the immediate future.
After this step, the real fun will begin; right now we're having only an appetizer...

Luc

Rich Hickey

Feb 27, 2009, 9:05:09 PM
to Clojure


On Feb 27, 6:54 pm, Luc Prefontaine <lprefonta...@softaddicts.ca>
wrote:
> Having the ability to redefine a function once for all instances is
> something we really want...
> and you need name spaces to be shared for that to happen.
> We take the approach of sharing everything that seems to worth it, then
> we will see
> what we might need to keep private to each JVM.
>
> Sharing var roots and refs also is something we want so we can initiate
> STM transactions
> involving multiple JVMs.
>
> We have much to learn from a prototype and that is what we strive to
> achieve the immediate future.
> After this step, real fun will begin, right now were having only an
> appetizer...
>
> Luc
>

It would be great if you could mention the difficulties you face as
you go, before you spend too much time on workarounds. I am interested
in seeing Clojure on Terracotta and if there are things I can do
easily to support this effort I will.

Rich

Luc Prefontaine

Feb 27, 2009, 9:55:29 PM
to clo...@googlegroups.com
You're right Rich,

We all have to agree on the means used to implement this in the Clojure runtime.
Any code we throw right now has to be somewhat aligned with these decisions.

The decision to hide the AtomicReference class was easy to take. It was an unavoidable
obstacle. Any other issues from the Clojure run time will need more thinking and
there might be other ways to work around these issues.

I can post an update every 2 weeks or so, or on demand before we spit out code
if we face an issue.

Right now we are busy writing a tool in Clojure to generate the terracotta configuration
from a set of classes. Finding each dependent class one by one is too slow.
We always end up missing one. It will also spit out the locking section with all these member functions.

This tool will speed us up a lot in getting the prototype up. We will get the main pieces of
the XML configuration. We can then remove unwanted classes from the configuration
and tune the locking strategy as we test things.
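
To give an idea of what such a generator does, here is a rough sketch in Java (the tool described above is written in Clojure and is not shown in this thread). It walks the declared field types, superclasses, and interfaces reachable from a starting class and prints one include entry per class. The class and method names are hypothetical, the element names follow the tc-config layout as best understood here and should be checked against the Terracotta documentation, and the locking section is left out entirely.

```java
import java.lang.reflect.Field;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

// Two tiny stand-in classes so the sketch has something to walk.
class SketchLeaf {
    int x;
}

class SketchNode {
    SketchLeaf leaf;
    SketchLeaf[] more;
    String label;
}

// Hypothetical generator; not the tool described in the thread.
final class TcConfigSketch {
    private TcConfigSketch() {}

    // Collect the names of classes reachable from root through
    // superclasses, interfaces, and declared field types.
    static Set<String> dependentClasses(Class<?> root) {
        Set<String> seen = new TreeSet<String>();
        Deque<Class<?>> work = new ArrayDeque<Class<?>>();
        work.push(root);
        while (!work.isEmpty()) {
            Class<?> c = work.pop();
            while (c.isArray()) {
                c = c.getComponentType();      // unwrap array types
            }
            if (c.isPrimitive() || c.getName().startsWith("java.")) {
                continue;                      // skip primitives and JDK classes
            }
            if (!seen.add(c.getName())) {
                continue;                      // already visited
            }
            if (c.getSuperclass() != null) {
                work.push(c.getSuperclass());
            }
            for (Class<?> i : c.getInterfaces()) {
                work.push(i);
            }
            for (Field f : c.getDeclaredFields()) {
                if (!f.isSynthetic()) {
                    work.push(f.getType());
                }
            }
        }
        return seen;
    }

    // Render the collected names as include entries (assumed element names).
    static String includes(Set<String> classNames) {
        StringBuilder sb = new StringBuilder("<instrumented-classes>\n");
        for (String name : classNames) {
            sb.append("  <include><class-expression>")
              .append(name)
              .append("</class-expression></include>\n");
        }
        return sb.append("</instrumented-classes>\n").toString();
    }
}
```

Calling TcConfigSketch.includes(TcConfigSketch.dependentClasses(SketchNode.class)) yields a fragment listing SketchNode and SketchLeaf, which could then be pruned by hand as described above.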

Luc

Paul Stadig

Feb 28, 2009, 6:35:53 AM
to clo...@googlegroups.com
My approach was just to share what few refs I wanted, but another
approach (like Luc's) is to share everything. The obvious advantage
being that you can set! the root binding of vars (like function
definitions). The goal with Terracotta is to make things as
transparent as possible, so I don't think "share everything" would be
a huge number of changes here. I think there are only a few
roadblocks.

1. Remove AtomicReferences from Namespace class. If this is possible,
I think a major win for the share everything approach would be to
share clojure.lang.Namespace.namespaces. Once that is done almost all
of the objects will get shared by nature of being included in the
clojure.lang.Namespace.namespaces object graph. Almost all of the
static members of classes are assigned by looking up a var in a
namespace, or by creating Symbols, which are not a problem as far as I
have seen.

2. Static members that are not assigned by looking up a var in a
namespace. I'm not sure how many there are, but one example would be
PersistentHashMap.EMPTY. There is one trick to sharing this.
PersistentHashMap depends upon APersistentMap to generate and assign
its hashCode value to the _hash member. The first time that hashCode
is called on the PersistentHashMap.EMPTY instance, Terracotta throws
an exception because PersistentHashMap.EMPTY is changed outside of a
lock. This can be fixed by adding
    <autolock auto-synchronized="true">
      <method-expression>* clojure.lang.APersistentMap.hashCode(..)</method-expression>
      <lock-level>write</lock-level>
    </autolock>
to the Terracotta config. This will make Terracotta guard the
APersistentMap.hashCode method. This works, but adding the
synchronized keyword to the APersistentMap.hashCode method (as well as
ASeq, APersistentSet, and APersistentVector) would be a simple change
(assuming there aren't any consequences I'm missing).
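
The source-level alternative mentioned in item 2 can be pictured with a small stand-in class. This is a sketch under assumptions, not Clojure's actual code: CachedHashBase and CountingEmpty are hypothetical names, and the use of -1 as the "not yet computed" marker is assumed for illustration. It only shows the lazily cached hash being written inside a synchronized method, so the field is never changed outside a lock. Whether a Terracotta lock entry is still needed for the method once it is synchronized in the source is not settled in this thread.

```java
// Hypothetical stand-in for a collection base class that caches its hash.
abstract class CachedHashBase {
    // -1 marks "not computed yet" (an assumption made for this sketch).
    private int hash = -1;

    // Subclasses supply the actual hash computation.
    protected abstract int computeHash();

    // The cached field is only ever written while holding this object's lock.
    @Override
    public synchronized int hashCode() {
        if (hash == -1) {
            hash = computeHash();
        }
        return hash;
    }
}

// Minimal subclass that counts how often the hash is really computed.
final class CountingEmpty extends CachedHashBase {
    int computations = 0;

    @Override
    protected int computeHash() {
        computations++;
        return 0;
    }
}
```

The trade-off is that every hashCode call now takes a lock, including calls in a purely local JVM, which is the kind of consequence the paragraph above leaves open.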

I don't see why any of the Persistent data structures would need to be
modified. There may be other issues that will come about after these
changes, but I think it would be a good first step. After that fine
tuning the lock definitions could be done (although one could always
just define a broad lock config). I could look into creating a
Terracotta Integration Module so that people could just say,
"Terracotta, I'm using Clojure," and all of the instrumented classes,
roots, and locks would be configured automagically, then the work
wouldn't have to be duplicated every time.


Paul

Luc Prefontaine

Feb 28, 2009, 11:51:23 AM
to clo...@googlegroups.com

1) AtomicReference is used in several places. Instead of changing it, we think we can keep
it when Clojure runs "locally" and provide an alternative when running in "shared" mode.

AtomicReference is optimized to be efficient in a standalone JVM. We would like to
keep it that way. Eventually Terracotta will provide instrumentation on this class
by default so the "shared" implementation could be thrown away in the near future.
We see the double implementations as a transition period until Terracotta supports
it directly.

2) Noted

Shared versus local mode:

That's what we have in mind, getting Clojure to work in a "shared" mode versus a
local/standalone mode. We want 0 impacts on the user code. Eventually we
could use meta data to provide some hints that would allow us to fine tune
shared interactions from user code. This would not impact "local" mode behaviours.
We're not there yet but we know that this possibility exists so that's reassuring
for the future. 

Integration is pretty simple once the common code base integrates the necessary
changes. We need a shell script, a Terracotta configuration that will be maintained
as part of the Clojure code base and some documentation.

As of now we use a system property to toggle the modes, we will implement a
transparent way (testing the presence of a terracotta property most probably).


Luc

Paul Stadig

Feb 28, 2009, 2:33:39 PM
to clo...@googlegroups.com
In the Namespace case, it might be premature optimization to worry
about AtomicReference being replaced. If there is a way to rewrite
that code with, say, synchronized blocks, and it will work better with
Terracotta, I think it would be worth doing. I don't think it would be
normal usage to be updating the mappings and aliases in a namespace
1,000 times a second.

AtomicReference is also used in Atom and Agent. Those cases may not be
as straightforward.


Paul

Luc Prefontaine

Feb 28, 2009, 3:02:16 PM
to clo...@googlegroups.com
We think the same way. Our first implementation of an alternative to AtomicReference
is straightforward, we will look at improving it if the need arises.

It will be easier to do so when we get stats from Terracotta after running some benchmarks.
There's much to do before getting there.

Luc

Nabib El-Rahman

Feb 28, 2009, 9:48:43 PM
to clo...@googlegroups.com
Hi guys,

I work for Terracotta (on the server side) and find this work with Clojure + Terracotta very exciting. Writing a TIM is definitely the way to go. It's a place to hide the glue until Terracotta and Clojure catch up with each other. If you have any questions, feel free to post on our forums: http://forums.terracotta.org/forums/forums/list.page

If you check out our trunk version, there's also an effort to make a common-api which will make writing a TIM easier for you guys.

Good luck!

-Nabib

hank williams

Feb 28, 2009, 11:16:59 PM
to clo...@googlegroups.com

> Writing a TIM is definitely the way to go, It's a place to hide the glue until both Terracotta and Clojure catches up with each other.


uhhh.... what is a TIM?

 
Thanks
Hank



--
blog: whydoeseverythingsuck.com

Nabib El-Rahman

Feb 28, 2009, 11:35:10 PM
to clo...@googlegroups.com
It's a way to package integration details into a module. For example, if I want to cluster EHCache, I can go through the code and figure out which data structures to share and subsequently lock on. All that work can be packaged into a module for Terracotta, so that people who just want to use ehcache + Terracotta can just include tim-ehcache in the Terracotta configuration and that's it.

The same can be done for Clojure: the details can be abstracted into a tim-clojure.

http://www.terracotta.org/web/display/docs/Terracotta+Integration+Modules+Manual

-Nabib

On Sat, Feb 28, 2009 at 8:16 PM, hank williams <han...@gmail.com> wrote:

Paul Stadig

Mar 1, 2009, 6:38:58 AM
to clo...@googlegroups.com
I've started work on a Terracotta Integration Module for Clojure
already. As I understand it, we can package up the Terracotta config
as well as any replacement classes. This way we can "patch" Clojure
temporarily until either Terracotta supports the features we need, or
Clojure can be rewritten so that it doesn't use classes that are
unsupported by Terracotta (if prudent and possible), and there would
be no need to fracture the Clojure code base.

I'll keep everyone apprised of my progress.


Paul

Luc Prefontaine

Mar 1, 2009, 11:37:40 AM
to clo...@googlegroups.com
We will go for a TIM. Just looked at the doc and yes, that would simplify our work a lot.

Thank you,

Luc

Amit Rathore

Mar 1, 2009, 3:21:47 PM
to Clojure
Are any of the folks on this thread in/around the bay area? (I know
Nabib is).
We're having a clojure user-group meeting on the 12th of March - and
the clojure/terracotta topic is of interest to a lot of people...
It would be wonderful if someone would come and talk about the
progress...

Regards,
Amit.

http://www.meetup.com/The-Bay-Area-Clojure-User-Group/

On Mar 1, 8:37 am, Luc Prefontaine <lprefonta...@softaddicts.ca>
wrote:

Luc Prefontaine

Mar 1, 2009, 6:46:36 PM
to clo...@googlegroups.com
I'm in Montreal, Quebec; we have several feet of snow here and still have a month of snow storms to go.
If I had some spare time I would love to visit you, but there's too much work here
and your area is a bit too far away :)))

Luc

Paul Stadig

Mar 1, 2009, 7:48:56 PM
to clo...@googlegroups.com
Sounds fun! I'd love to come out, but I'm in the Washington DC area.
It would be an expensive flight.

I pushed an initial version of a TIM for Clojure. It is simply my
previous work bundled as a TIM. Not much, but it's a base to build on.
I'm starting to figure out how to have Terracotta replace Clojure
classes with versions that can be clustered.

http://github.com/pjstadig/tim-clojure-1.0-snapshot/tree/master


Paul

Joseph Mikhail

Mar 2, 2009, 4:25:49 PM
to Clojure
Amit, is it possible to set up a video conference for the next
Clojure meeting?