Using Clojure as an embedded / hosted scripting language


Randy Kahle

Apr 22, 2008, 5:18:27 PM
to Clojure
In a system written in Java that wishes to have some processing
delegated to Clojure (such as in NetKernel) it seems the approach is
to do a static

RT.init();

and then for each script that is to be run, the full context must be
established by

Compiler.load(...)

the common.clj and the custom script file (e.g. custom.clj) before
doing

Compiler.eval(...)

In a situation where only the contents of "custom.clj" are going to
change is there a way to save the state of the system after the
RT.init() and the first Compiler.load(...) call?

-- Randy

Randy Kahle

Apr 22, 2008, 6:46:34 PM
to Clojure
I must restate my question as I took a closer look at RT.init() and
realized that it does in fact boot up the whole standard Clojure
language including boot.clj.

What I am trying to figure out is how to:

* Perform the maximum amount of Clojure initialization once
* Create a clean context for each subsequent "load/eval" of a
clojure language file (*.clj)
* Have the system eliminate as much redundant work as possible.

Am I correct that the following is optimal and correct:

* RT.init() once and only once
* For each separate invocation of Clojure do Compiler.load(...)
and Compiler.eval(...)

-- Randy

Steve Harris

Apr 23, 2008, 2:20:08 PM
to Clojure
Randy,
I don't know if you've seen it already or not, but Rich's response to
me before I started the NK module in this thread may help as a
starting point:

http://groups.google.com/group/clojure/browse_frm/thread/e45be60b8f7a4d16/da0555cdd9f816be?lnk=gst&q=netkernel#da0555cdd9f816be

Another thing to do is to look at the Clojure source code,
particularly:
src/clojure/jvm/lang/Repl.java and Compiler.java

It looks like if you're interested in greater efficiency, maybe you
could keep track of which clj files have already been loaded and have
not changed since loading, and skip the Compiler.load() step in that
case, going straight to Compiler.eval(). Any top level commands in
the source file would only be guaranteed to execute once, but I don't
think it would be a good idea for the user to rely on such side
effects of loading anyway. I think the loading/compiling in Clojure
is pretty quick though so I'm not sure it's worth the trouble.

As for a clean environment for each new Compiler.load(), I guess it
*could* be a problem for scripts that were changed in certain ways by
masking problems with the change, e.g. if a declaration were removed
but not references to the declared item - the problem wouldn't show up
until restart. I'd be happy to live with that myself, and I can't
really think of any other sort of problems off hand though. How is it
solved for the Java code when the source changes?


Steve Harris

Apr 23, 2008, 2:26:53 PM
to Clojure
> and skip the Compiler.load() step in that
> case, going straight to Compiler.eval().

Now that I think about it, it's a bad idea - you'd probably end up
calling some other source file's process-request function, from
whichever clojure accessor last executed.

I think evaluating each source file in its own Clojure namespace,
hashed from the source file's uri, might be the safest way to achieve
isolation. I believe Rich does such a thing in Repl.java that I
mentioned in the above message.
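
One way the URI-hashing idea could look, as a minimal sketch: the helper
below turns a script URI into a name that is safe to use as a namespace
symbol. The class name ScriptNamespaces, the "nk.script.h" prefix, the
choice of SHA-1 and the 8-byte truncation are all assumptions made for
illustration; none of it comes from Clojure or NetKernel.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a per-script namespace name from the script's URI.
final class ScriptNamespaces {
    private ScriptNamespaces() {}

    /** Returns the same name for the same URI, and (barring collisions) distinct names otherwise. */
    static String namespaceFor(String scriptUri) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(scriptUri.getBytes(StandardCharsets.UTF_8));
            // The leading "h" keeps the last segment from starting with a digit.
            StringBuilder name = new StringBuilder("nk.script.h");
            for (int i = 0; i < 8; i++) {
                name.append(String.format("%02x", digest[i] & 0xff));
            }
            return name.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("SHA-1 unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namespaceFor("ffcpl:/custom.clj"));
    }
}
```

The resulting string would then be handed to whatever call creates or
removes the namespace; that part is left out on purpose because it
depends on the Clojure version in use.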



Steve Harris

Apr 23, 2008, 2:46:41 PM
to Clojure
> I believe Rich does such a thing in Repl.java that I
> mentioned in the above message.

PS: I meant by this that he pushes a new namespace ("user"), not the
hash stuff.

Randy Kahle

Apr 24, 2008, 8:13:46 AM
to Clojure
Steve,

I think you are right about the need to focus on isolation. First is
correctness - efficiency questions are secondary.

In NetKernel a programming language is a service and all languages are
treated the same - BeanShell, Ruby, Groovy, JavaScript, XSLT, XQuery -
they are all languages that run as a service. For example, to run a
BeanShell program with the BeanShell language service one would
request the following URI:
"active:beanshell+operator@ffcpl:/myprogram.bsh"

With Clojure the language service (I'm using Tom Hicks' mod_clonk)
loads the static classes when it is first instantiated as an object in
its constructor. Ideally, Clojure would not use static classes so it
can be treated as other language services are within NetKernel.

However, given that it is static I will need to pursue your idea of using
namespaces. Ideally I could remove an entire namespace and get a
pristine context. I'm new to Clojure so I'll have to look at the
source code, etc. and see if this can be done. If it can, then naming
the namespace based on the hash of the requesting URI might be the
best way to go.

This would mean the Clojure runtime accessor would check for the
existence of the namespace, delete it, load the program into the
namespace and then do the Compiler.eval().

Randy

Rich Hickey

Apr 24, 2008, 9:18:24 AM
to Clojure


On Apr 24, 8:13 am, Randy Kahle <randyka...@gmail.com> wrote:
> Steve,
>
> I think you are right about the need to focus on isolation. First is
> correctness - efficiency questions are secondary.
>
> In NetKernel a programming language is a service and all languages are
> treated the same - BeanShell, Ruby, Groovy, JavaScript, XSLT, XQuery -
> they are all languages that run as a service. For example, to run a
> BeanShell program with the BeanShell language service one would
> request the following URI:
> "active:beanshell+operator@ffcpl:/myprogram.bsh"
>
> With Clojure the language service (I'm using Tom Hicks' mod_clonk)
> loads the static classes when it is first instantiated as an object in
> its constructor. Ideally, Clojure would not use static classes so it
> can be treated as other language services are within NetKernel.
>

Doesn't NetKernel support Java? Clojure's model is the same as Java's.

> However, given that it is static I will need to pursue your idea of using
> namespaces. Ideally I could remove an entire namespace and get a
> pristine context. I'm new to Clojure so I'll have to look at the
> source code, etc. and see if this can be done. If it can, then naming
> the namespace based on the hash of the requesting URI might be the
> best way to go.
>

Yes, you can make unique namespaces and load into them.

> This would mean the Clojure runtime accessor would check for the
> existence of the namespace, delete it, load the program into the
> namespace and then do the Compiler.eval().
>

Well, since you are functional (vs. say something like a Clojure
server pages that might keep the code around between invocations as
long as the source hasn't changed), you might as well delete the
namespace on your way out.

Rich

Tom Hicks

Apr 26, 2008, 1:37:48 PM
to Clojure
Rich,

I could be wrong but I think the problem that we are worried about is
that Clojure appears to have only one runtime instance. So, if Clojure
is embedded in a server environment which provides Clojure to multiple
users, how do we provide the Clojure runtime to multiple users without
the users being able to access/inspect/alter each other's
environments?

As I understand Clojure at the moment (please correct my
misunderstandings):
1) Clojure will correctly handle the threading and concurrent access
   to various data structures for multiple users but this doesn't
   isolate the users in any way from each other.
2) Namespaces can be used to separate users' work but this does not
   enforce isolation, since users can switch namespaces and inspect
   the contents of other namespaces.
3) The examples of namespace usage that I've found in the code seem
   to work on a "push/do work/pop" basis and it isn't clear to me how
   that could be extended to allow a user to work within a context for
   a long period of time (an extended session -- such as an
   interactive conversation between a server and a user on a client).

Can Clojure be useful in this kind of multiple-user server environment?
-tom

Rich Hickey

Apr 26, 2008, 8:01:19 PM
to Clojure


On Apr 26, 1:37 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
> Rich,
>
> I could be wrong but I think the problem that we are worried about is
> that Clojure appears to have only one runtime instance. So, if Clojure
> is embedded in a server environment which provides Clojure to multiple
> users, how do we provide the Clojure runtime to multiple users without
> the users being able to access/inspect/alter each other's
> environments?
>
> As I understand Clojure at the moment (please correct my
> misunderstandings):
> 1) Clojure will correctly handle the threading and concurrent access
>    to various data structures for multiple users but this doesn't
>    isolate the users in any way from each other.
> 2) Namespaces can be used to separate users' work but this does not
>    enforce isolation, since users can switch namespaces and inspect
>    the contents of other namespaces.

In these 2 ways Clojure is the same as Java. How do you isolate Java
in the same environment?

> 3) The examples of namespace usage that I've found in the code seem
>    to work on a "push/do work/pop" basis and it isn't clear to me how
>    that could be extended to allow a user to work within a context for
>    a long period of time (an extended session -- such as an
>    interactive conversation between a server and a user on a client).
>
> Can Clojure be useful in this kind of multiple-user server environment?

Sure, you'll just have to work out when to cleanup the per-session
namespace.

Tom Hicks

Apr 27, 2008, 1:50:35 PM
to Clojure

On Apr 26, 5:01 pm, Rich Hickey <richhic...@gmail.com> wrote:
> On Apr 26, 1:37 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
>
> > I could be wrong but I think the problem that we are worried about is
> > that Clojure appears to have only one runtime instance. So, if Clojure
> > is embedded in a server environment which provides Clojure to multiple
> > users, how do we provide the Clojure runtime to multiple users without
> > the users being able to access/inspect/alter each other's
> > environments?
>
> > As I understand Clojure at the moment (please correct my
> > misunderstandings):
> > 1) Clojure will correctly handle the threading and concurrent access
> >    to various data structures for multiple users but this doesn't
> >    isolate the users in any way from each other.
> > 2) Namespaces can be used to separate users' work but this does not
> >    enforce isolation, since users can switch namespaces and inspect
> >    the contents of other namespaces.
>
> In these 2 ways Clojure is the same as Java. How do you isolate Java
> in the same environment?

Wow....I was hoping for a little less terse and Socratic reply.
Perhaps you thought the question was too stupid to answer so I'll try
to rephrase it and perhaps you can elaborate on where I'm going wrong
in my thinking.

For a Java application, I would provide each user with a separate
instance of the application's central state-holding class. Since
Clojure provides a language and its environment, I am assuming the
central state-holding class of Clojure is RT. But, since Clojure uses
a single static RT class, how can I provide each user with his/her
own, non-shared copy of the "state" of the Clojure environment?

Thanks for your consideration of my question,
-tom

Rich Hickey

Apr 27, 2008, 2:56:10 PM
to Clojure
Please don't read anything into that terseness, I really am interested
in the answer, and needed to know it in order to follow up more fully.
I know nothing about NetKernel, but I do think it is important to
consider Clojure more like Java than the interpreted languages that
were mentioned.

For instance, you could be using classloaders for Java isolation - I
have no idea. But if you were, that same technique could be used to
achieve the same degree of isolation with Clojure instances, including
separate instances of RT.

> For a Java application, I would provide each user with a separate
> instance of the application's central state-holding class.

Then the Java instances aren't name-environment-isolated, i.e. they
all share the same set of Java classes, and the same set of static
instances/members of those classes, and any Java name maps to a single
entity. Clojure is the same.

> Since Clojure provides a language and its environment, I am assuming
> the central state-holding class of Clojure is RT. But, since Clojure
> uses a single static RT class, how can I provide each user with
> his/her own, non-shared copy of the "state" of the Clojure
> environment?
>

Here's where you lose me, in treating Clojure's name environment
differently than Java's. The state of the Clojure environment is the
same as the state of the Java environment, a shared set of names bound
to loaded code, i.e. there is no difference between clojure/rest and
java.lang.String.format - both are compiled, get loaded once and are
shared by and visible to everyone (under the same classloader). But
you wouldn't use that name environment for isolated application data
any more than you would use Java static variables for isolated
application data.

If you are able to create a per-user instance of a state-holding class
in Java, you can and should do a similar thing in Clojure. Whatever
mechanism you use to create those instances and make them available to
the sessions in Java can be done similarly in Clojure.
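
As a rough illustration of the per-user instance approach in plain
Java (a sketch only; Sessions, State and forUser are hypothetical
names, and keying sessions by a user id string is an assumption): the
registry is static and therefore shared, while each State instance
belongs to one user.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical session registry; not part of Clojure or NetKernel.
final class Sessions {
    /** Per-user state holder: each instance is private to one session. */
    static final class State {
        private final Map<String, String> values = new ConcurrentHashMap<>();

        void put(String key, String value) { values.put(key, value); }

        String get(String key) { return values.get(key); }
    }

    // Static, so there is exactly one registry per classloader, shared by everyone.
    private static final Map<String, State> BY_USER = new ConcurrentHashMap<>();

    private Sessions() {}

    /** Returns the state for a user, creating it on first use. */
    static State forUser(String userId) {
        return BY_USER.computeIfAbsent(userId, id -> new State());
    }

    public static void main(String[] args) {
        forUser("alice").put("cursor", "42");
        System.out.println("alice: " + forUser("alice").get("cursor"));
        System.out.println("bob: " + forUser("bob").get("cursor"));
    }
}
```

The isolation here comes from which instance a session is handed, not
from the names being hidden: any code that can reach the registry can
still look up another user's state.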

I guess my point is that if you are using named global variables in
those interpreted languages to hold state, and relying on separate
interpreter instances to make them distinct, then that is not the
model to follow for Clojure, since, like Java, it is compiled and has
a single shared name environment.

> Thanks for your consideration of my question,

Sure. You definitely misread me if you thought I considered it any
less valid than the hundreds of other questions I've answered here on
the list.

Rich

Tom Hicks

Apr 27, 2008, 10:57:38 PM
to Clojure

On Apr 27, 11:56 am, Rich Hickey <richhic...@gmail.com> wrote:
> On Apr 27, 1:50 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
>
> Please don't read anything into that terseness, I really am interested
> in the answer, and needed to know it in order to follow up more fully.
> I know nothing about NetKernel, but I do think it is important to
> consider Clojure more like Java than the interpreted languages that
> were mentioned.

OK, thanks, I'll keep that in mind.

> For instance, you could be using classloaders for Java isolation - I
> have no idea. But if you were, that same technique could be used to
> achieve the same degree of isolation with Clojure instances, including
> separate instances of RT.

I should know more about what the 1060 Research guys are doing
but I'm still learning the internals of NetKernel. They do use
classloaders extensively to implement the dynamic linking of modules,
so perhaps I can use that infrastructure to get the user separation
I am looking for.


> > For a Java application, I would provide each user with a separate
> > instance of the application's central state-holding class.
>
> Then the Java instances aren't name-environment-isolated, i.e. they
> all share the same set of Java classes, and the same set of static
> instances/members of those classes, and any Java name maps to a single
> entity. Clojure is the same.
> ....
> Here's where you lose me, in treating Clojure's name environment
> differently than Java's. The state of the Clojure environment is the
> same as the state of the Java environment, a shared set of names bound
> to loaded code, i.e. there is no difference between clojure/rest and
> java.lang.String.format - both are compiled, get loaded once and are
> shared by and visible to everyone (under the same classloader). But
> you wouldn't use that name environment for isolated application data
> any more than you would use Java static variables for isolated
> application data.
>
> If you are able to create a per-user instance of a state-holding class
> in Java, you can and should do a similar thing in Clojure. Whatever
> mechanism you use to create those instances and make them available to
> the sessions in Java can be done similarly in Clojure.
>
> I guess my point is that if you are using named global variables in
> those interpreted languages to hold state, and relying on separate
> interpreter instances to make them distinct, then that is not the
> model to follow for Clojure, since, like Java, it is compiled and has
> a single shared name environment.

So, after reading the above, I've decided I must have some
misconceptions about what's going on behind the scenes so
I'm going to follow your pointers and go learn some more.
Thanks for the detailed response as it's helping me focus
on what I need to know.


> > Thanks for your consideration of my question,
>
> Sure. You definitely misread me if you thought I considered it any
> less valid than the hundreds of other questions I've answered here on
> the list.

I apologize for being overly sensitive. I am very impressed with
Clojure and I do appreciate your help and feedback as I
try to understand and use it.
regards,
-tom

Tom Hicks

Apr 28, 2008, 3:26:04 PM
to Clojure
So, I started to dig into how NetKernel provides scripting
languages as served resources (using Javascript as the sample)
and immediately found the kinds of programming abstractions
that I had been imagining in our discussion and which I
had been asking about for Clojure. For examples, please see:

http://www.mozilla.org/rhino/apidocs/

and note the documentation for the classes Context, ContextFactory,
and Script. The Rhino engine Context class, for instance, says:

"This class represents the runtime context of an executing script.
Before executing a script, an instance of Context must be created and
associated with the thread that will be executing the script. The
Context will be used to store information about the executing of the
script such as the call stack. Contexts are associated with the
current thread using the call(ContextAction) or enter() methods."

Please read on (below) for embedded comments....


On Apr 27, 11:56 am, Rich Hickey <richhic...@gmail.com> wrote:
> On Apr 27, 1:50 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
>
> > On Apr 26, 5:01 pm, Rich Hickey <richhic...@gmail.com> wrote:
>
> > > On Apr 26, 1:37 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
>
> > > > I could be wrong but I think the problem that we are worried about is
> > > > that Clojure appears to have only one runtime instance. So, if Clojure
> > > > is embedded in a server environment which provides Clojure to multiple
> > > > users, how do we provide the Clojure runtime to multiple users without
> > > > the users being able to access/inspect/alter each other's
> > > > environments?
>
> > For a Java application, I would provide each user with a separate
> > instance of the application's central state-holding class.
>
> If you are able to create a per-user instance of a state-holding class
> in Java, you can and should do a similar thing in Clojure. Whatever
> mechanism you use to create those instances and make them available to
> the sessions in Java can be done similarly in Clojure.

OK but I guess what I'm asking is not whether Clojure CAN do it, but
whether Clojure DOES do it. Is there any Clojure implementation class
(i.e. written in Java) which aggregates and holds the execution state
of a script on a per-script (or per-user) basis? Clearly this kind of
state must exist somewhere within Clojure classes during the
evaluation process (in the RT?, Compiler?, etc?). Perhaps the question
is then can that state be captured and associated with a user thread
such that multiple users can be using the Clojure language
simultaneously in isolation from one another (while running in the
same JVM)?


> I guess my point is that if you are using named global variables in
> those interpreted languages to hold state, and relying on separate
> interpreter instances to make them distinct, then that is not the
> model to follow for Clojure, since, like Java, it is compiled and has
> a single shared name environment.

I don't think that these languages use global variables but instances
of a context-holding class. The instances are then associated with
a single user's thread of execution through a script. (I think of this
as directly analogous to a process' context in an OS). The script
interpreters I have seen provide such an execution context on a
per-execution basis, encapsulated in some kind of Context class, and
available for programmatic use by the underlying implementation
language (in this case by Java).

Does this make any sense? If not, is there some fundamental principle
of Clojure which I am missing here?
seeking enlightenment,
-tom

Steve Harris

Apr 28, 2008, 4:30:58 PM
to Clojure
Tom,
I'm not Rich but from my discussion with him prior to my doing the
initial NK module, I can tell you that Clojure doesn't provide for you
the kind of interpreter-like indirection infrastructure you're seeing
in those scripting languages.

It's better to think of Clojure as a class loader whose preferred
format is not jars but rather clojure source code - loading Clojure
source code is like loading any other Java archives, except they are
stored in source format and compiled on the fly as they are loaded.
For that reason, as Rich pointed out (and as I asked Randy too), it's
better to compare and study the case of NK accessors written in *Java*
than those of the scripting languages. When NK detects a change in
Java source, it compiles it on the fly and reloads the classes, I
believe, IIRC using a separate classloader to guarantee some kind of
isolation (Randy?) That's the critical case to study - how is
isolation achieved in the Java case? That's what Rich meant by "it's
the same model as Java" I believe...

Steve

Rich Hickey

Apr 28, 2008, 5:24:53 PM
to Clojure


On Apr 28, 3:26 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
> So, I started to dig into how NetKernel provides scripting
> languages as served resources (using Javascript as the sample)

This is a bit frustrating, as I keep repeating that the model to
follow is Java's, and that Clojure is not like these interpreted
languages.
What does not make sense is the presumption that therefore all
languages work that way. Java does not, for instance.

> If not, is there some fundamental principle
> of Clojure which I am missing here?

Yes, you are missing that Clojure is a compiler, not an interpreter.
Thus above where it says

>The
> Context will be used to store information about the executing of the
> script such as the call stack.

is no more true of Clojure than it is of Java, which is to say, not at
all. In fact, Clojure uses Java's call stack, which is inaccessible to
user programs. It does not need any context-like object in order to
work, and thus has none (and neither does Java). Clojure has no notion
of per-script scope (and neither does Java).

That said, Clojure's vars can be thread-locally bound. So, if you
wanted some var, say nk/context, to have a different, isolated value
per thread, you could bind it thread-locally in a block surrounding
the script execution like this:

    static final Var context = RT.var("nk", "context");

    Var.pushThreadBindings(RT.map(context, somePerScriptState));

    try {
        // load and run script
    } finally {
        Var.popThreadBindings();
    }

Each script would see a unique nk/context, and any changes they made
to it would be isolated.
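
To make the push/run/pop discipline concrete without a Clojure
dependency, here is a rough plain-Java analogy. DynamicVar is a
hypothetical class, not Clojure's Var, and it models only the part
described above: a root value visible everywhere, plus a per-thread
stack of bindings that is pushed before the script runs and popped in
a finally block. Bound values are assumed to be non-null.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical stand-in for a thread-locally bindable var; not Clojure's implementation.
final class DynamicVar<T> {
    private final T root;
    // Each thread gets its own, initially empty, stack of bindings.
    private final ThreadLocal<Deque<T>> bindings = ThreadLocal.withInitial(ArrayDeque::new);

    DynamicVar(T root) {
        this.root = root;
    }

    /** Returns the innermost binding on the current thread, or the root value if there is none. */
    T get() {
        Deque<T> stack = bindings.get();
        return stack.isEmpty() ? root : stack.peek();
    }

    /** Binds value for the duration of body on the current thread only, then restores the previous value. */
    <R> R withBinding(T value, Supplier<R> body) {
        Deque<T> stack = bindings.get();
        stack.push(value);
        try {
            return body.get();
        } finally {
            stack.pop();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DynamicVar<String> context = new DynamicVar<>("root");
        Runnable script = () -> {
            String name = Thread.currentThread().getName();
            String seen = context.withBinding("state-for-" + name, context::get);
            System.out.println(name + " saw " + seen);
        };
        Thread a = new Thread(script, "script-a");
        Thread b = new Thread(script, "script-b");
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("main saw " + context.get());
    }
}
```

Each thread reads only the value it bound itself, and the main thread,
which never bound anything, still reads the root value.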

There are likely many ways in which Clojure can be integrated into
NetKernel, which would likely be apparent to someone well versed in
NetKernel and willing to learn about how Clojure actually works, but
further pursuit of the wishing-Clojure-was-an-interpreter path is
unlikely to be productive.

All of this discussion of per-script state in the context of NetKernel
is a bit confusing to me, as I thought that NetKernel had a
fundamentally functional model.

Rich

Tom Hicks

Apr 28, 2008, 7:27:13 PM
to Clojure
On Apr 28, 2:24 pm, Rich Hickey <richhic...@gmail.com> wrote:
> This is a bit frustrating, as I keep repeating that the model to
> follow is Java's, and that Clojure is not like these interpreted
> languages.

I'm sorry to frustrate you...I'll try to stay away from these other
languages.


> That said, Clojure's vars can be thread-locally bound. So, if you
> wanted some var, say nk/context, to have a different, isolated value
> per thread, you could bind it thread-locally in a block surrounding
> the script execution like this:
>
> static final Var context = RT.var("nk", "context");
>
> Var.pushThreadBindings(RT.map(context, somePerScriptState));
>
> try {
>     // load and run script
> } finally {
>     Var.popThreadBindings();
> }
>
> Each script would see a unique nk/context, and any changes they made
> to it would be isolated.

I think this is getting very close to what I am struggling toward,
except that I believe that I want to thread-locally bind the entire
Clojure "environment": the set of all vars accessible to a script. I
believe I want to do this because then I will have captured the
"state" of the Clojure runtime in each separate user thread and each
thread can then make local modifications which will not affect the
state of the other users/threads (in fact they will never see the
other modifications).
Does this make sense or is it also "off-base"?

To step back a bit, what I am trying to accomplish is to be able
to have two threads execute the same script in the same JVM
and not to interfere with each other. For Clojure, I'm thinking that
means that they don't change each other's vars?
It is possible that I could achieve this by just loading the Clojure
JAR file into two separate class loaders but this seems to me to be a
large overhead with a large startup time. So, I am looking for some
"entry point" within the existing code base that I can hook into to
achieve the same effect.


> There are likely many ways in which Clojure can be integrated into
> NetKernel, which would likely be apparent to someone well versed in
> NetKernel and willing to learn about how Clojure actually works, but
> further pursuit of the wishing-Clojure-was-an-interpreter path is
> unlikely to be productive.

Hey, come on....not fair....I'm willing to learn how Clojure works but
you must admit there is almost no documentation on how it works
internally and you yourself admitted, in one of your lectures, that
your coding approach (mostly with static methods and few class
instances) is not the usual object-oriented implementation approach.


> All of this discussion of per-script state in the context of NetKernel
> is a bit confusing to me, as I thought that NetKernel had a
> fundamentally functional model.

Which properties do you mean to connote by "functional"?...Immutable
data structures? Referentially transparent methods?

NetKernel could be best described as a REST-based micro-kernel.
It is written in Java but is heavily oriented toward the use of XML as
a kind of logical data model. It attempts to separate out the notions
of logical and physical level computing. It's hard to describe, as it
is a bit like an application server crossed with an OS crossed with a
web server (and probably a couple more things in there, too).
I don't think I've seen it described as "functional" but there is
the assumption that most resources will be immutable, although this
is certainly not a requirement of the system as it allows accessors
to implement SINK and DELETE (side-effecting) verbs.

Thanks for the continued dialog...hang in there...don't give up on
me yet...Clojure is a unique system and I'm struggling to understand
it.
regards,
-tom

Rich Hickey

Apr 28, 2008, 8:54:55 PM
to Clojure
It makes no more sense than trying to capture the "state" of Java.

I think the term environment is a loaded one in this context, with
direct connotations for interpreters that don't cross over. In normal
use, most var values will not change, i.e. the only reason to change
the root value of a var is to reload code in order to fix a bug or
provide an enhancement. Vars used for data should usually be thread-
locally bound. Most code should be functional, passing and returning
immutable values.

So a key presumption in this discussion is the need for
'modifications' and I would ask - modifications to what, specifically
(i.e. the answer should not be 'the environment')? What is an example
of something a script would modify? Then we can talk about how to
isolate that.

> To step back a bit, what I am trying to accomplish is to be able
> to have two threads execute the same script in the same JVM
> and not to interfere with each other.

Well, the most straightforward way is to make them functional. I guess
that's part of the problem here - an idiomatic Clojure program
wouldn't need the feature you are requesting.
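
A small plain-Java sketch of what "functional" buys here, under the
assumption that a script boils down to a function from a request to a
response (FunctionalScript and process are hypothetical names, and the
upper-casing is a placeholder for real work): the function reads only
its argument and returns a new unmodifiable value, so two threads can
run it at once with nothing to isolate.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical "script" expressed as a pure function over immutable values.
final class FunctionalScript {
    private FunctionalScript() {}

    /** Depends only on its argument and mutates nothing shared. */
    static List<String> process(List<String> request) {
        return Collections.unmodifiableList(
                request.stream().map(String::toUpperCase).collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> request =
                Collections.unmodifiableList(Arrays.asList("source", "ffcpl:/custom.clj"));
        // Two concurrent invocations of the same function on the same input.
        CompletableFuture<List<String>> first =
                CompletableFuture.supplyAsync(() -> process(request));
        CompletableFuture<List<String>> second =
                CompletableFuture.supplyAsync(() -> process(request));
        System.out.println(first.join());
        System.out.println(second.join());
    }
}
```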

> For Clojure, I'm thinking that
> means that they don't change each other's vars?

Scripts don't have/own vars. There are vars. They have namespace
segregated global names. Scripts should avoid modifying them if they
are not thread-locally bound. This is the same advice as for all
Clojure programs.

> It is possible that I could achieve this by just loading the Clojure
> JAR file into two separate class loaders but this seems to me to be a
> large overhead with a large startup time.

I agree.

> So, I am looking for some "entry point" within the existing code base
> that I can hook into to achieve the same effect.
>
> > There are likely many ways in which Clojure can be integrated into
> > NetKernel, which would likely be apparent to someone well versed in
> > NetKernel and willing to learn about how Clojure actually works, but
> > further pursuit of the wishing-Clojure-was-an-interpreter path is
> > unlikely to be productive.
>
> Hey, come on....not fair....I'm willing to learn how Clojure works but
> you must admit there is almost no documentation on how it works
> internally and you yourself admitted, in one of your lectures, that
> your coding approach (mostly with static methods and few class
> instances) is not the usual object-oriented implementation approach.
>

Ok, I suppose I'm frustrated that in this conversation you seem to not
be listening to me when I attempt to explain how it works, i.e. it is
compiled, it works like Java, how the namespaces work etc. It seems
your expectations of how it should work are getting in the way of
understanding how it does work.

> > All of this discussion of per-script state in the context of NetKernel
> > is a bit confusing to me, as I thought that NetKernel had a
> > fundamentally functional model.
>
> Which properties do you mean to connote by "functional"?...Immutable
> data structures? Referentially transparent methods?
>

Both.

> NetKernel could be best described as a REST-based micro-kernel.
> It is written in Java but is heavily oriented toward the use of XML as
> a kind of logical data model. It attempts to separate out the notions
> of logical and physical level computing. It's hard to describe, as it
> is a bit like an application server crossed with an OS crossed with a
> web server (and probably a couple more things in there, too).
> I don't think I've seen it described as "functional" but there is
> the assumption that most resources will be immutable, although this
> is certainly not a requirement of the system as it allows accessors
> to implement SINK and DELETE (side-effecting) verbs.
>

Again, I don't know NetKernel, but the functional nature was mentioned
in this thread:

http://groups.google.com/group/clojure/browse_frm/thread/4db8c56a0b8e0a29#

If the system supports stateful multi-conversation interactions then
it must have a place to store the state independent of script
variables.

> Thanks for the continued dialog...hang in there...don't give up on
> me yet...Clojure is a unique system and I'm struggling to understand
> it.

I'm not giving up, just please consider carefully what I am saying in
this dialog. The whole NetKernel concept sounds neat - I'd like there
to be a good story for using Clojure there, as it seems like a good
fit conceptually, and I'll keep trying to help you and the others here
trying to make it work.

Thanks,

Rich

Tom Hicks

Apr 28, 2008, 11:08:12 PM4/28/08
to Clojure

> On Apr 28, 5:54 pm, Rich Hickey <richhic...@gmail.com> wrote:
> > On Apr 28, 7:27 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
> >
> > I think this is getting very close to what I am struggling toward,
> > except that I believe that I want to local thread-bind the entire
> > Clojure "environment": the set of all vars accessible to a script. I
> > believe I want to do this because then I will have captured the
> > "state" of the Clojure runtime in each separate user thread and each
> > thread can then make local modifications which will not affect the
> > state of the other users/threads (in fact they will never see the
> > other modifications).
>
> I think the term environment is a loaded one in this context, with
> direct connotations for interpreters that don't cross over.
> In normal use, most var values will not change, i.e. the only reason to change
> the root value of a var is to reload code in order to fix a bug or
> provide an enhancement.

OK, so I focused on the wrong thing as the "environment". The
"environment" in Clojure must be the set of all symbol-to-var bindings
in all namespaces. (This seems to be supported by the following text
from the Vars page under the Interning section: "...It also means that
namespaces constitute a global environment...").


> > Does this make sense or is it also "off-base"?
>
> It makes no more sense than trying to capture the "state" of Java.

But that does have a reasonable interpretation for me:
I can't capture the "state" of the Java language and
I can't capture the "state" of a Java source code program
but I can certainly capture the "state" of the Java runtime
environment when it's running on a JVM (debuggers do it and
display it to us).

I want to capture the state of the Clojure runtime environment
at a certain point in its execution and replicate it for a new
thread, which is assigned to a new user so that I don't have
to recreate that state for each user.

> So a key presumption in this discussion is the need for
> 'modifications' and I would ask - modifications to what, specifically
> (i.e. the answer should not be 'the environment')? What is an example
> of something a script would modify? Then we can talk about how to
> isolate that.

OK, take this script:

(if (somerandombooleanfn)
  (def a 88)
  (def a 99))

Now, in the same JVM, if user A executes this script and,
simultaneously, user B executes this script and the fn returns 'true'
for user A and false for user B, what is the value of the
global symbol 'a'? Obviously, I'm having trouble understanding
Clojure but I believe the following:
1) there is one symbol 'a' in the 'user' namespace after this
code is executed.
2) Neither of the threads bound the symbol 'a' locally while
executing, so
3) Clojure has mediated the binding of the symbol
and one of the threads 'won' (i.e. its binding was 'last in' and
is now the current global binding of symbol 'a' (in the 'user'
namespace)).

If (1), (2), and (3) above are true then the two users just had
their programs interfere with each other. That's what I'm
trying to avoid.


> Scripts don't have/own vars. There are vars. They have namespace
> segregated global names. Scripts should avoid modifying them if they
> are not thread-locally bound. This is the same advice as for all
> Clojure programs.

OK, I see the distinction. But scripts create vars and give them root
values? And they create symbols which point to those vars? During
the execution of the program they rebind symbols to point to different
vars? So the set of symbols in all the namespaces is what I am calling
the "environment" and a snapshot of that set at any point in the
execution of the Clojure runtime environment is the "state" of the
program.


> Ok, I suppose I'm frustrated that in this conversation you seem to not
> be listening to me when I attempt to explain how it works,

Oh man....don't confuse not listening with not understanding. :)
I'm definitely guilty of the latter. :)


> It seems
> your expectations of how it should work are getting in the way of
> understanding how it does work.

I apologize. I try not to, but I think we're all guilty of seeing
things through the lens of things we already know.


> Again, I don't know NetKernel, but the functional nature was mentioned
> in this thread:
>
> http://groups.google.com/group/clojure/browse_frm/thread/4db8c56a0b8e...

I think that was Steve Harris' observation that NK could be used
functionally. Personally, I think that while NK can be used
functionally, it is not inherently functional, but this is arguing
over semantics.


> The whole NetKernel concept sounds neat - I'd like there
> to be a good story for using Clojure there, as it seems like a good
> fit conceptually, and I'll keep trying to help you and the others here
> trying to make it work.

Thanks, your help is definitely appreciated.
I think it is a good fit too, if I can just understand a little more
about how it works.
regards,
-tom

Rich Hickey

Apr 29, 2008, 8:50:48 AM4/29/08
to Clojure
No they don't, not as a single entity. Debuggers let you see
individual locations, but don't create an enumerable/storable/copyable
environment.

> I want to capture the state of the Clojure runtime environment
> at a certain point in its execution and replicate it for a new
> thread, which is assigned to a new user so that I don't have
> to recreate that state for each user.
>

I know very well what you want to do. I've been trying to explain to
you that you can't, and why. Here again, I think you dropped the Java
analogy too soon. Try replacing 'Clojure' with 'Java' above. You can't
do that either. Java names and the storage locations to which they
refer are shared for all code loaded under the same classloader, in
all threads on which that code runs. Same for Clojure.
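
To make the Java side of that analogy concrete, here is a minimal sketch. The class and method names are made up for illustration and are not taken from any code in this thread. A static field is a single storage location per classloader, so whatever one thread writes is what every other thread reads afterwards.

```java
// Illustrative sketch only: SharedLocation and runWriters are hypothetical names.
class SharedLocation {
    // One storage location per classloader, shared by every thread.
    static volatile Object a;

    // Runs two writers on separate threads, one after the other, and
    // returns what any thread would read afterwards.
    static Object runWriters(Object first, Object second) {
        Thread userA = new Thread(() -> a = first);
        Thread userB = new Thread(() -> a = second);
        try {
            userA.start();
            userA.join();
            userB.start();
            userB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a; // the last write wins for everybody
    }
}
```

Calling SharedLocation.runWriters(88, 99) leaves 99 in the field for both "users"; there is no per-thread copy to snapshot or restore.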

> > So a key presumption in this discussion is the need for
> > 'modifications' and I would ask - modifications to what, specifically
> > (i.e. the answer should not be 'the environment')? What is an example
> > of something a script would modify? Then we can talk about how to
> > isolate that.
>
> OK, take this script:
>
> (if (somerandombooleanfn)
>   (def a 88)
>   (def a 99))
>

That code shows you really haven't gotten the gist of Clojure yet.

> Now, in the same JVM, if user A executes this script and,
> simultaneously, user B executes this script and the fn returns 'true'
> for user A and false for user B, what is the value of the
> global symbol 'a'? Obviously, I'm having trouble understanding
> Clojure but I believe the following:
> 1) there is one symbol 'a' in the 'user' namespace after this
> code is executed.

One _var_ named a.

> 2) Neither of the threads bound the symbol 'a' locally while
> executing so

the _var_ user/a

> 3) Clojure has mediated the binding of the symbol
> and one of the threads 'won' (i.e. its binding was 'last in' and
> is now the current global binding of symbol 'a' (in the 'user'
> namespace)).
>

not symbol, var

> If (1), (2), and (3) above are true then the two users just had
> their programs interfere with each other. That's what I'm
> trying to avoid.
>

Then don't write code like that! The whole point of Clojure is that
programs that depend upon shared mutable state are to be avoided. If
you want script isolation then don't put your state in vars, or, make
sure your script entry point thread-locally binds the var with
'binding'.

Here's a more reasonable script:

(def a) ; a is unbound; will throw an exception if accessed without a binding

(defn script-entry-point [init]
  (binding [a init]
    ... do rest of script, some of which might refer to a))
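
For readers who think in Java, a rough analogue of that pattern can be sketched with ThreadLocal. This is only an approximation of what 'binding' does, and every name in it (ThreadBoundVar, withBinding, ScriptEntryPoint) is hypothetical. The sketch also treats null as "unbound", which is a simplification.

```java
import java.util.function.Supplier;

// Illustrative sketch only: a location with no root value that gets a
// per-thread value for the extent of one call. Null means "unbound" here.
class ThreadBoundVar<T> {
    private final ThreadLocal<T> current = new ThreadLocal<>();

    // Reads the value bound on the calling thread; fails if none is bound.
    T get() {
        T value = current.get();
        if (value == null) {
            throw new IllegalStateException("var is unbound on this thread");
        }
        return value;
    }

    // Binds value while body runs, then restores the previous state.
    <R> R withBinding(T value, Supplier<R> body) {
        T previous = current.get();
        current.set(value);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                current.remove();
            } else {
                current.set(previous);
            }
        }
    }
}

class ScriptEntryPoint {
    static final ThreadBoundVar<Integer> A = new ThreadBoundVar<>();

    // Counterpart of script-entry-point: code inside sees this thread's A only.
    static int run(int init) {
        return A.withBinding(init, () -> A.get() + 1);
    }
}
```

Two threads calling ScriptEntryPoint.run with different arguments never see each other's value, and reading A outside of run fails because nothing is bound there.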

> > Scripts don't have/own vars. There are vars. They have namespace
> > segregated global names. Scripts should avoid modifying them if they
> > are not thread-locally bound. This is the same advice as for all
> > Clojure programs.
>
> OK, I see the distinction. But scripts create vars and give them root
> values? And they create symbols which point to those vars?

No, your understanding of vars/symbols is off.

Suppose you had this Java code:

public class User {
    public static Object a;
    ...
}

and somewhere else (in a different class) you have this:

void foo()
{
    if (User.a != null)
    {
        ...
    }
}

When foo is compiled, a symbolic reference to User.a is placed in the
bytecode. At some point prior to foo being accessed, that reference to
User.a is looked up via the classloader that loaded foo and resolved
to a particular location. When foo is called, the symbol/token "a" is
not used to find the location.

Most compilers work the same way - at some point during compilation/
linking/loading, names are resolved to locations and the names are not
used to find the locations during runtime, the resolved locations are
used directly.

Clojure works the same way. While you may see, and even access, the
namespaces/vars environment, that doesn't mean it's being used in the
way you presume. The _compiler_ does a one-time lookup of any vars
(essentially locations) used by a function and resolves all symbols
directly to those vars, and the static initializer of the class that
represents a function places those vars into fields that have been
compiled into the bytecode. Thus, when the function runs, symbols are
not used to look up the locations, the vars are used directly - >there
is no runtime lookup<.

This is the contrast I am trying to make with interpreters, where it
is quite common to pass a name->value environment through every
function call, and all symbolic references are resolved dynamically
via a runtime lookup in the environment. There it is easy to copy
environments and plug them into the call chain.
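
The difference between the two styles can be sketched in a few lines of plain Java. This is a toy model, not how the Clojure compiler is implemented: Cell stands in for a var, the map stands in for a name-to-location table, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative sketch only: contrasts resolve-once with lookup-per-call.
class ResolutionDemo {
    static final class Cell {
        volatile Object value;
        Cell(Object value) { this.value = value; }
    }

    // "Compiled" style: the name is looked up once, when the function is built.
    static Supplier<Object> compile(Map<String, Cell> table, String name) {
        Cell resolved = table.get(name);
        return () -> resolved.value; // no table lookup at call time
    }

    // "Interpreted" style: the name is looked up in the table on every call.
    static Supplier<Object> interpret(Map<String, Cell> table, String name) {
        return () -> table.get(name).value;
    }

    // Returns {compiled result, interpreted result} after the table entry
    // for "a" has been replaced with a different cell.
    static Object[] demo() {
        Map<String, Cell> table = new HashMap<>();
        table.put("a", new Cell(88));
        Supplier<Object> compiled = compile(table, "a");
        Supplier<Object> interpreted = interpret(table, "a");
        table.put("a", new Cell(99)); // swap the environment entry
        return new Object[] { compiled.get(), interpreted.get() };
    }
}
```

In this model the compiled function still reads 88 from the cell it resolved earlier, while the interpreted one picks up 99. Swapping or copying the table changes nothing for code that has already been resolved, which is why a snapshot of the table is of no use to it.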

The tradeoffs are mostly in performance (pre-resolved access being
faster), but also, in Clojure's case, interoperability, since Clojure
code can be called from Java code that would have no knowledge of, or
access to, the environment. Putting environments in thread-local
storage complicates multithreaded programming.

I'm quite happy with the choices I've made for Clojure and the
resulting performance and ease of interop, so it's not going to change
in this area. You simply must consider it a compiled language like
Java.

> During
> the execution of the program they rebind symbols to point to different
> vars?

No, vars can be bound to different values. Every reference to user/a
under the same classloader is resolved to the same var object.

> So the set of symbols in all the namespaces is what I am calling
> the "environment" and a snapshot of that set at any point in the
> execution
> of the Clojure runtime environment is the "state" of the program.
>

...set of vars in all the namespaces...

Therein lies the rub - while you could create a snapshot of the
environment, you would have no mechanism of getting code to use it, as
the environment is not used by code at runtime. (Just like Java's
internal symbol table is not used for runtime access, only during
loading/linking).

Hope that helps,

Rich

Steve Harris

Apr 29, 2008, 10:26:31 AM4/29/08
to Clojure
> I think that was Steve Harris' observation that NK could be used
> functionally

FWIW, the 1060 docs describe active URI's like we're using here as
"functional programs":
http://docs.1060.org/docs/3.0.0/book/solutiondeveloperguide/doc_guide_activeURI.html

Tom Hicks

Apr 29, 2008, 11:36:19 AM4/29/08
to Clojure
Thank-you!....that helps tremendously and I see now where I was
going wrong. But, for others like me coming from other backgrounds
(including Lisp and Scheme) let me explore a few of the ramifications
to make sure I understand things correctly. Please correct any
mis-statements below:

1) Named vars really exist only in source code....by load time the
names have been "resolved away" to storage locations
2) New vars cannot be created at runtime (the set of vars was
entirely fixed at compile time)
3) There is no runtime "symbol table" so
4) there is no "intern" function and
5) new symbols may not be created and stored at runtime
6) There is no runtime "environment" (bindings of symbols to
storage locations) that is so prevalent in interpreters (and in
descriptions of Lisp evaluation)
7) vars (storage locations established at compile time) may be
rebound but code should avoid modifying them if they
are not thread-locally bound
8) vars bound to data should always be thread-locally bound


OK, but something is still nagging at me. Named vars are also
used for (bound to) function definitions, right? You didn't like my
example with the data in vars but what about this example:

User A and User B, in the same JVM, each load a script
called "init" from their respective home directories. In A's
script is the definition:

(defn foo [x] (foo (baz x)))

but in B's script it's a different definition:

(defn foo [x y] (bar (foo x) 88 (baz y)))

At compile time, a var for foo is created in the 'user' namespace

the _var_ user/foo

but since there's only one....whose definition is loaded?
(Since Clojure resolves entirely to Java then the answer
must be tied to the behavior of the Java classloader?)
regards,
-tom

Tom Hicks

Apr 29, 2008, 11:43:59 AM4/29/08
to Clojure
Oh thanks Steve...I had not seen that reference but, again,
I think we're just arguing semantics. My advice is to interpret this
statement to say that URIs can be used functionally: as there is
nothing in NetKernel that prevents an accessor, referenced by a URI,
from side-effecting (or even deleting) the resources referenced by
its argument URIs.
regards,
-tom
> "functional programs":http://docs.1060.org/docs/3.0.0/book/solutiondeveloperguide/doc_guide...

Rich Hickey

Apr 29, 2008, 12:27:52 PM4/29/08
to Clojure
No. Vars are real objects. What resolving does is find them by name,
so that, from then on, all accesses are direct. You can get at a var
object using 'var' or #':

user=> (var first)
#'clojure/first
user=> (class (var first))
class clojure.lang.Var

> 2) New vars cannot be created at runtime (the set of vars was
> entirely fixed at compile time)

Nope. You can create a var at any time with 'def'.

> 3) There is no runtime "symbol table" so

Nope. There is, but it is not used to find vars in compiled code. It
is still used by the compiler (which is available at runtime), and can
be used for reflective purposes (see the various ns-* functions).

> 4) there is no "intern" function and

Nope. def interns.

> 5) new symbols may not be created and stored at runtime

Symbols are not vars. Vars represent locations, symbols are just
names:

user=> 'ethel
ethel
user=> (var ethel)
java.lang.Exception: Unable to resolve var: ethel in this context

user=> (def ethel)
#'user/ethel
user=> (var ethel)
#'user/ethel

> 6) There is no runtime "environment" (bindings of symbols to
> storage locations) that is so prevalent in interpreters (and in
> descriptions of Lisp evaluation)

Again, there is, but it is not used by compiled code to find
locations. Runtime code can perform lookups with resolve:

user=> (resolve 'ethel)
#'user/ethel
user=> (resolve 'lucy)
nil

> 7) vars (storage locations established at compile time) may be
> rebound but code should avoid modifying them if they
> are not thread-locally bound
> 8) vars bound to data should always be thread-locally bound
>

In an app that will run each script once (e.g. not multiple users
running the same script), putting a ref in a var and changing it
transactionally is also fine.

> OK, but something is still nagging at me. Named vars are also
> used for (bound to) function definitions, right? You didn't like my
> example with the data in vars but what about this example:
>
> User A and User B, in the same JVM, each load a script
> called "init" from their respective home directories. In A's
> script is the definition:
>
> (defn foo [x] (foo (baz x)))
>
> but in B's script it's a different definition:
>
> (defn foo [x y] (bar (foo x) 88 (baz y)))
>
> At compile time, a var for foo is created in the 'user' namespace
>
> the _var_ user/foo
>
> but since there's only one....whose definition is loaded?

This makes no more sense than expecting 2 pieces of Java code that
both define a Foo class in the User package to coexist. In this case,
either the 2 scripts should be in explicitly different namespaces,
or, if they have no namespace, whatever system is loading user scripts
should load them in separate namespaces.
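
As a toy illustration of why separate namespaces remove the clash, here is a plain-Java model in which definitions are keyed by namespace and then by name. It does not use Clojure's namespace classes; NamespaceRegistry and the "script-a"/"script-b" names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Illustrative sketch only: definitions are keyed by (namespace, name).
class NamespaceRegistry {
    private final Map<String, Map<String, IntUnaryOperator>> namespaces = new HashMap<>();

    // Defines or redefines name inside the given namespace only.
    void define(String ns, String name, IntUnaryOperator fn) {
        namespaces.computeIfAbsent(ns, key -> new HashMap<>()).put(name, fn);
    }

    // Returns the definition, or null if the namespace or name is unknown.
    IntUnaryOperator lookup(String ns, String name) {
        Map<String, IntUnaryOperator> table = namespaces.get(ns);
        return table == null ? null : table.get(name);
    }

    // Two scripts each define "foo"; neither replaces the other.
    static int[] demo() {
        NamespaceRegistry registry = new NamespaceRegistry();
        registry.define("script-a", "foo", x -> x + 1);
        registry.define("script-b", "foo", x -> x * 2);
        return new int[] {
            registry.lookup("script-a", "foo").applyAsInt(10),
            registry.lookup("script-b", "foo").applyAsInt(10)
        };
    }
}
```

Had both scripts defined foo under the same namespace key, the second define would simply have replaced the first, which is the situation described in the question.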

Any such "multi-user" stuff is outside the scope of Clojure itself.

> (Since Clojure resolves entirely to Java then the answer
> must be tied to the behavior of the Java classloader?)

I never said Clojure resolves entirely to Java, I said they work
similarly. Clojure, for example, is much more dynamic, and its
namespaces are reified and enumerable. You can't enumerate a Java
package, for instance.

Instances of Clojure loaded in separate classloaders are completely
independent, and scripts loaded into each would be similarly so.

Rich

Tom Hicks

Apr 29, 2008, 2:23:31 PM4/29/08
to Clojure
> > Thank-you!....that helps tremendously and I see now where I was
> > going wrong. But, for others like me coming from other backgrounds
> > (including Lisp and Scheme) let me explore a few of the ramifications
> > to make sure I understand things correctly. Please correct any
> > mis-statements below:
>
> > 1) Named vars really exist only in source code....by load time the
> > names have been "resolved away" to storage locations
>
> No. Vars are real objects. What resolving does is find them by name,
> so that, from then on, all accesses are direct. You can get at a var
> object using 'var' or #':
>
> user=> (var first)
> #'clojure/first
> user=> (class (var first))
> class clojure.lang.Var

Sorry, my sloppy description. I should have said names have been
"resolved away" to objects in memory (the vars).


> > 2) New vars cannot be created at runtime (the set of vars was
> > entirely fixed at compile time)
>
> Nope. You can create a var at any time with 'def'.

OK, but you can't dynamically, on-the-fly, at runtime create a new
var that doesn't correspond to a 'def' call in the original
Clojure source code, right? So, the compiler, by examining
the source code, can determine how many vars will be
created at runtime....right?


> > 3) There is no runtime "symbol table" so
>
> Nope. There is, but it is not used to find vars in compiled code. It
> is still used by the compiler (which is available at runtime), and can
> be used for reflective purposes (see the various ns-* functions).

OK, thanks for clarification.


> > 4) there is no "intern" function and
>
> Nope. def interns.

OK, but it seems like def interns as part of compilation in the sense
that the compiler generates code to create a new var and to intern the
associated symbol in the global namespace map of symbols to vars?
So def is the only way to create an entry in the global namespace map
of symbols to vars? (i.e. there is no explicit 'intern' function).


> > 5) new symbols may not be created and stored at runtime
>
> Symbols are not vars. Vars represent locations, symbols are just
> names:
>
> user=> 'ethel
> ethel
> user=> (var ethel)
> java.lang.Exception: Unable to resolve var: ethel in this context
>
> user=> (def ethel)
> #'user/ethel
> user=> (var ethel)
> #'user/ethel

OK, thanks for the clarification.


> > 6) There is no runtime "environment" (bindings of symbols to
> > storage locations) that is so prevalent in interpreters (and in
> > descriptions of Lisp evaluation)
>
> Again, there is, but it is not used by compiled code to find
> locations. Runtime code can perform lookups with resolve:
>
> user=> (resolve 'ethel)
> #'user/ethel
> user=> (resolve 'lucy)
> nil

So this means that 'resolve' is just checking for the
existence or nonexistence of the symbol, right?
Because, unlike Lisp, symbols are not bound to values?


> > 7) vars (storage locations established at compile time) may be
> > rebound but code should avoid modifying them if they
> > are not thread-locally bound
> > 8) vars bound to data should always be thread-locally bound
>
> In an app that will run each script once (e.g. not multiple users
> running the same script), putting a ref in a var and changing it
> transactionally is also fine.

OK, thanks for the clarification.


> > OK, but something is still nagging at me. Named vars are also
> > used for (bound to) function definitions, right? You didn't like my
> > example with the data in vars but what about this example:
>
> > User A and User B, in the same JVM, each load a script
> > called "init" from their respective home directories. In A's
> > script is the definition:
>
> > (defn foo [x] (foo (baz x)))
>
> > but in B's script it's a different definition:
>
> > (defn foo [x y] (bar (foo x) 88 (baz y)))
>
> > At compile time, a var for foo is created in the 'user' namespace
>
> > the _var_ user/foo
>
> > but since there's only one....whose definition is loaded?
>
> This makes no more sense than expecting 2 pieces of Java code that
> both define a Foo class in the User package to coexist. In this case,
> either the 2 scripts should be in explicitly different namespaces,
> or, if they have no namespace, whatever system is loading user scripts
> should load them in separate namespaces.

OK. Of course it will be Clojure itself that is loading the scripts,
and so even separate namespaces will not enforce total isolation
between the scripts.

> Any such "multi-user" stuff is outside the scope of Clojure itself.

I understand.

> > (Since Clojure resolves entirely to Java then the answer
> > must be tied to the behavior of the Java classloader?)
>
> I never said Clojure resolves entirely to Java, I said they work
> similarly. Clojure, for example, is much more dynamic, and its
> namespaces are reified and enumerable. You can't enumerate a Java
> package, for instance.

Ah so...thanks for the clarification.


> Instances of Clojure loaded in separate classloaders are completely
> independent, and scripts loaded into each would be similarly so.

Right, and this may be what we ultimately have to do to ensure user
separation in the system. But I really appreciate all your assistance
and help in trying to figure this out. I feel like I understand Clojure
a lot better than before (although I still have a way to go :).
regards,
-tom

Randy Kahle

Apr 30, 2008, 5:37:21 PM4/30/08
to Clojure
Steve,

Thank you for this tip. I had not fully understood what you meant when
you suggested that I think of Clojure this way. Making the comparison
with a Java classloader is most interesting.

NetKernel uses a custom highly specialized classloader for each
"module". In NetKernel a module is the unit of physical distribution
but more importantly it creates a private logical address space (sort
of like having your own world wide web as the logical addresses in the
module's address space are URIs). A module may export a portion of its
address space to be imported by other modules. The reason NetKernel
uses a classloader per module is to support the logical level
programming model, support dynamic updating of modules in a 24*7
system and other reasons. External requests are detected by a
transport (HTTP, JMS, anything really) and the transport creates a
corresponding internal request specifying a logical URI address and
issues that into the logical address space of the transport's hosting
module. The microkernel in NetKernel then initiates a two-phase
process, (1) resolution and then (2) bind and execute. During
resolution (using the web analogy it's like a DNS resolution) a search
is initiated for a mapping from the logical address to a physical
endpoint (an endpoint is analogous to a web server). This resolution
search may traverse a path that leads into imported modules and when
the first mapping is found the microkernel then binds the request to
the physical endpoint and schedules work on one of the threads it has
in its pool of worker threads (analogous to a load balancer deciding
which server in a server farm to send the web request to). The binding
exists only for the duration of the request processing; new requests,
even for the same URI, follow the exact same two-phase process.

A physical endpoint may issue a request back to the microkernel for
information (for example, its configuration information) and these
logical sub-requests go through the same two phase process. This means
that physical code is -not- bound together as it is in an OO
environment. One might think this inefficient but it turns out that
the overhead of resolution, etc. is not high and NetKernel can cache
responses (which are immutable representations of information) using
the URI as the key. The net result is that NetKernel applications run
about 3x to 4x compared to pure Java / J2EE and can scale linearly
with CPU cores (without any threaded programming).

It is important to note that (the vast majority of) physical endpoints
are stateless. They get their entire execution state per request. For
scripted languages in NetKernel a language runtime is a physical
endpoint. Each time a script is to be run, the (probably compiled form
of the) program is passed to the runtime engine and the appropriate
"state" of the full context is provided via a "context" object. In
Java the context object is passed as a parameter to the
processRequest() method, in scripting languages a "context" variable
is injected into the script language's name space. (This is your and
Tom's approach)

Given this context, I think I can comment on parts of the rest of this
thread...

The idiom for script language support in NetKernel is to build a
module that is dedicated to the language (e.g. JRuby) and the language
JAR files are placed in that module. When a script language is run,
its hosting module's classloader loads the language JAR file and
provides compilation services (transparently) so that the result for
the programmer working at the logical level is that the language
appears as a service:

active:beanshell+operator@ffcpl:/program.bsh

Once the script exits, all local state created by the program is lost.

Physical endpoints written in Java are different because they can hold
state between uses. NetKernel is frugal with objects and will
instantiate only one object per physical endpoint class. For example,
one can write an endpoint to which the entire "session:" scheme is
mapped. If one SINKS to e.g. session:{GUID}+myvar, that representation
will be held in memory so that subsequent SOURCE requests for
session:{GUID}+myvar will return the previously "saved" representation.

So, the mode of thinking in NetKernel is one of "islands" of
physically separated object-graphs that support a unified logical
information processing model and which are isolated by a logical-
address-resolving microkernel.

What Tom and I have been trying to figure out is how to isolate state
changes induced by a thread executing a Clojure program from other
threads executing other (or possibly the same) program.

The point of confusion for me has been how to think about Clojure's
model - which relies on static classes to store shared information -
and how to make that work in NetKernel's model - which works hard to
prevent such sharing. It may just be that the implementations of the
physical models are in fact diametrically opposed to each other and a
satisfactory approach to integration is not technologically feasible.
This would be a frustratingly unfortunate outcome as it appears that
the intents of Clojure and NetKernel are very similar, if not
potentially complementary.

Rich's example of using thread-bound local store sounds like a good
idea. For this to work:

1) Associate "context" with the in-bound thread
2) Clojure programmers -only- modify state bound to the thread
3) On exit, pop the thread-bound context

So far, so good!

However, after this point things get tricky.

In the ideal world, Clojure could "load and compile" a program and
load it into a distinct classloader and then provide access to that
classloader. On exit from an invocation not only do we want to pop the
thread-local bound variables, we want to remove the compiled code from
Clojure. Or, even better, if the code in Clojure that reads in and
compiles the program returned the classloader into which it compiled
the code then integration with NetKernel would probably be
straightforward.

So - let me ask my question (given all of this context!)... would it
be possible to separate the Clojure read/compile code so that the
dynamic classloader used for the compilation is returned back to
NetKernel?

Thanks -- Randy

Rich Hickey

Apr 30, 2008, 9:46:37 PM4/30/08
to Clojure


On Apr 30, 5:37 pm, Randy Kahle <randyka...@gmail.com> wrote:

> What Tom and I have been trying to figure out is how to isolate state
> changes induced by a thread executing a Clojure program from other
> threads executing other (or possibly the same) program.
>

Ok. I took a (10 minute) look at NetKernel and some of the script
engines. I still know almost nothing about it, nevertheless, here is
my advice:

Clojure is not a scripting language. By that, I mean that Clojure
doesn't have the notion of a script as an execution unit.

So, a presumption of all these discussions that the Clojure 'scripts'
that you run under NetKernel will look like Jython/Groovy/Ruby scripts
with data variables at the top level, whose 'statements' get run every
time the script runs, is wrong.

Here's the way to think about it - for Clojure, a script gets loaded
(once) and functions get executed (many times, even simultaneously).

NetKernel seems to encourage caching and sharing a script across
multiple/simultaneous invocations until/unless it changes. To the
extent a script contains only code and vars intended to be thread-
locally bound, that should be fine.

A Clojure script destined for NetKernel should be loaded once during
initialise() and a designated function in that script should be run
during execute().

Scripts destined for NetKernel should have no namespace.
ClojureScriptEngine should implement IScriptEngine as follows:

During initialise:

Set the current thread's context classloader to be the supplied
classloader.

Obtain a fresh UUID to name a one-off namespace.

Call Namespace.findOrCreate() to create a namespace with the UUID as
its name.

Use Var.pushThreadBindings to bind *ns* to the one-off namespace.
(See Repl.java)

in a try block:

load the script using Compiler.load(rdr)

You'll need a convention for the name of the function in the script
that is the script entry point for NetKernel - let's say it must be
called 'run'. Find the run var using oneOffNamespace.findInternedVar()
- if it's not there that is a script error. Save the var you get in a
field of the script engine of type Var.

Remove the one-off namespace using Namespace.remove(). This will
allow the code and all related memory to be GCed when the ScriptEngine
is gone. It's not needed in the global namespace registry since no
other namespace could refer to it and all of its internal references
are already resolved.

pop the thread bindings in the finally clause!

During execute:

Set the current thread's context classloader to be the supplied
classloader.

Call invoke() on the run var, passing any args.

You might want to bind some well-known vars to NetKernel things that
scripts might want to access.

There may be some classloader issues we'll have to work out.
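
The shape of that lifecycle (load once, keep a direct reference to the entry point, drop the one-off namespace, invoke many times) can be sketched in plain Java. To keep the sketch self-contained it does not call the Clojure API at all: the REGISTRY map and the Loader interface are hypothetical stand-ins for the namespace registry and for loading a script, and restoring the previous context classloader is an extra precaution that is not part of the steps above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only: stand-ins replace the real Clojure calls.
class ScriptEngineSketch {
    // Stand-in for the global namespace registry: name -> (fn name -> fn).
    static final Map<String, Map<String, Function<Object, Object>>> REGISTRY =
            new ConcurrentHashMap<>();

    // Stand-in for loading a script: it defines functions into a namespace.
    interface Loader {
        void loadInto(Map<String, Function<Object, Object>> namespace);
    }

    private final ClassLoader loader;
    private Function<Object, Object> run; // saved entry point, like the Var field

    ScriptEngineSketch(ClassLoader loader) {
        this.loader = loader;
    }

    void initialise(Loader script) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        String name = UUID.randomUUID().toString(); // one-off namespace name
        Map<String, Function<Object, Object>> ns = new ConcurrentHashMap<>();
        REGISTRY.put(name, ns);
        try {
            script.loadInto(ns); // the script is loaded exactly once
            run = ns.get("run"); // convention: the entry point is named 'run'
            if (run == null) {
                throw new IllegalStateException("script defines no 'run' function");
            }
        } finally {
            REGISTRY.remove(name); // the saved reference keeps the code alive
            thread.setContextClassLoader(previous);
        }
    }

    Object execute(Object arg) {
        if (run == null) {
            throw new IllegalStateException("initialise must succeed first");
        }
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return run.apply(arg); // direct call, no lookup by name
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

After initialise returns, the registry no longer mentions the script, yet execute keeps working because the engine holds the entry point directly. Once the engine itself is unreachable, nothing else pins the loaded code.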

Hope that makes sense,

Rich

Tom Hicks

May 1, 2008, 11:56:14 AM5/1/08
to Clojure

On Apr 30, 6:46 pm, Rich Hickey <richhic...@gmail.com> wrote:
> On Apr 30, 5:37 pm, Randy Kahle <randyka...@gmail.com> wrote:
>
> > What Tom and I have been trying to figure out is how to isolate state
> > changes induced by a thread executing a Clojure program from other
> > threads executing other (or possibly the same) program.
>
> Ok. I took a (10 minute) look at NetKernel and some of the script
> engines. I still know almost nothing about it, nevertheless, here is
> my advice:

Thanks Rich! Your algorithm is very similar to my best attempt
to do this, except that you've got two enhancements that I think
might make it really workable. Small comments and questions
below.


>....
> Here's the way to think about it - for Clojure, a script gets loaded
> (once) and functions get executed (many times, even simultaneously).
>
> NetKernel seems to encourage caching and sharing a script across
> multiple/simultaneous invocations until/unless it changes. To the
> extent a script contains only code and vars intended to be thread-
> locally bound, that should be fine.

You're exactly right about NetKernel's implementation of scripting
languages.


> A Clojure script destined for NetKernel should be loaded once during
> initialise() and a designated function in that script should be run
> during execute().
>
> Scripts destined for NetKernel should have no namespace.
> ClojureScriptEngine should implement IScriptEngine as follows:
>
> During initialise:
>
> Set the current thread's context classloader to be the supplied
> classloader.
>
> Obtain a fresh UUID to name a one-off namespace.
>
> Call Namespace.findOrCreate() to create a namespace with the UUID as
> its name.
>
> Use Var.pushThreadBindings to bind *ns* to the one-off namespace.
> (See Repl.java)

Do I also need to execute the equivalent of a Clojure refer to allow
access to the main 'clojure' namespace? "(clojure/refer 'clojure)"


> in a try block:
>
> load the script using Compiler.load(rdr)
>
> You'll need a convention for the name of the function in the script
> that is the script entry point for NetKernel - let's say it must be
> called 'run'.

Right. The current implementation of the CloNK module also requires
this and calls it 'main', just in keeping with the NK scripting
modules.


> Find the run var using oneOffNamespace.findInternedVar()
> - if it's not there that is a script error. Save the var you get in a
> field of the script engine of type Var.
>
> Remove the one-off namespace using Namespace.remove(). This will
> allow the code and all related memory to be GCed when the ScriptEngine
> is gone. It's not needed in the global namespace registry since no
> other namespace could refer to it and all of its internal references
> are already resolved.

The fact that we can do this is a major embellishment since it would
seem to solve the problem of "peeking" between namespaces. I
didn't realize it was possible but, now that you say it, it makes
sense in the context of everything else you've told me about Clojure.


> pop the thread bindings in the finally clause!
>
> During execute:
>
> Set the current thread's context classloader to be the supplied
> classloader.
>
> Call invoke() on the run var, passing any args.
>
> You might want to bind some well-known vars to NetKernel things that
> scripts might want to access.

The current CloNK module implementation allows the "pre-loading"
of a Clojure 'initialization' script where this common stuff can be
defined.


> There may be some classloader issues we'll have to work out.
> Hope that makes sense,

It sounds great. I will try it out and let you know how it goes.
Thanks again for all your help on this!
regards,
-tom

Rich Hickey

May 1, 2008, 7:24:29 PM5/1/08
to Clojure
Probably would be nice for scripts to know that has been done. Just
make sure you follow Repl.java. Nothing about this hosting should
require eval.

Rich