two pass compilation, intern and def


Phillip Lord

Sep 19, 2013, 3:27:51 AM
to clo...@googlegroups.com


In a clean repl, try the following...

(use 'clojure.test)
(is (do (intern 'user 'bob2 2) bob2))
(is (do (def bob3 3) bob3))

The first "is" form breaks with an "Unable to resolve symbol" error
at the compilation stage. The second one, on the other hand, is quite
happy.

Now, I guess what is happening is this. At the compilation stage,
Clojure has identified that

(def bob3 3)

happens before

bob3

so, everything is fine. But the same is not happening with the intern
statement. But surely it should be? After all

(intern 'user 'bob3 3)
bob3

is perfectly happy.

I want to use intern because I can do stuff like

(let [nm "bob4"
      vl 4]
  (intern 'user (symbol nm) vl))

which is harder to do with def.

Any advice?

Phil

Meikel Brandmeyer (kotarak)

Sep 19, 2013, 5:06:21 AM
to clo...@googlegroups.com
Hi,

Clojure's compile unit is one toplevel form. Therefore


(intern 'user 'bob3 3)
bob3


works, while


(is (do (intern 'user 'bob2 2) bob2))

does not, because the former are two compilation units while the latter is only one. (Note: (do ...) is a special case. (do (intern 'user 'bob3 3) bob3) should actually work.)

Then there is a separate effect in that def immediately creates the Var when encountered during compilation. That explains why the def variant works as well. (And as a side-note: that's also why you don't do (defn foo [] (def a ...) ...))
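
A minimal sketch of that second effect, assuming a fresh namespace; the names make-a and a-inside are made up for illustration. Compiling the defn appears to be enough to create the Var, even though the function body has not run yet:

```clojure
;; Hypothetical names, for illustration only.
(defn make-a []
  (def a-inside 1))

;; make-a has not been called, yet the Var already exists,
;; because def interned it while the defn was being compiled.
(def var-exists? (some? (resolve 'a-inside)))

;; The Var has no root value until make-a actually runs.
(def bound-before-call? (bound? (resolve 'a-inside)))

(make-a)
(def bound-after-call? (bound? (resolve 'a-inside)))
```

This is also why a def nested in a function body is usually avoided: the Var is global and is created as a side effect of compilation, not of the call.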

Kind regards
Meikel

Phillip Lord

Sep 19, 2013, 5:35:45 AM
to clo...@googlegroups.com
"Meikel Brandmeyer (kotarak)" <m...@kotka.de> writes:
> Clojure's compile unit is one toplevel form. Therefore
>
> (intern 'user 'bob3 3)
> bob3
>
> works, while
>
> (is (do (intern 'user 'bob2 2) bob2))
>
> does not, because the former are two compilation units while the latter is
> only one. (Note: (do ...) is a special case. (do (intern 'user 'bob3 3)
> bob3) should actually work.)

Yep, do on its own does work.

The problem, here, then is that a chunk of code like

(do (intern 'user 'bob3 3) bob3)

sometimes works, and sometimes doesn't. Whether it does or does not
depends on its context. As I said in the last post, I'd worked out
the immediate reason it fails. But I cannot understand from looking at
the code why

(do (intern 'user 'bob3 3) bob3)

is two compilation units, while

(is (do (intern 'user 'bob3 3) bob3))

or

(try (do (intern 'user 'bob3 3) bob3))

are both one (the latter fails also).

Seems rather like a bug to me. If the compiler can identify that

(is (do (def bob3) bob3))

is valid, the same should be true for an intern form.

Phil

Gary Verhaegen

Sep 19, 2013, 6:01:15 PM
to clo...@googlegroups.com
As Meikel said in his previous mail, 'do' at the top-level is treated specially: each form is treated as a separate top-level form. This is, for example, useful for defining a macro that defines multiple functions.

So what Meikel was really trying to say is that the reason (do (intern 'user 'bob3 3) bob3) works is that it is treated the same as
(intern 'user 'bob3 3)
bob3

This special handling only occurs when do appears as a top-level form, which is the reason why your other examples fail.
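
The difference can be sketched with eval, which treats a top-level do the same way the REPL does. This is an illustration under that assumption, not a description of the compiler internals; demo-a and demo-b are made-up names, and *ns* is used in place of 'user so the sketch runs in any namespace:

```clojure
;; Top-level do: each subform is compiled and run in turn,
;; so demo-a exists by the time it is looked up.
(def top-level-result
  (eval '(do (intern *ns* 'demo-a 1) demo-a)))

;; The same body nested inside let: the whole form is compiled first,
;; so demo-b is looked up before intern has had a chance to run.
(def nested-result
  (try
    (eval '(let [] (intern *ns* 'demo-b 2) demo-b))
    (catch Throwable _ :compile-error)))
```

In the nested case the intern call never runs at all, because compilation fails before any of the form is executed.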
--
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Colin Fleming

Sep 20, 2013, 1:05:30 AM
to clo...@googlegroups.com
This is interesting - are there any other cases where forms are treated specially at top-level?

Meikel Brandmeyer

Sep 20, 2013, 3:49:02 AM
to clo...@googlegroups.com
Not that I am aware of. do is special to allow macros expanding into several defs where the later depend on the former.

(defmacro foo [a b]
  `(do
     (def ~a 5)
     (def ~b ~a)))

This is necessary, because a macro can only return one form.
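
A small usage sketch of a macro with this shape; def-pair, first-var and second-var are hypothetical names chosen so as not to clash with anything in the thread:

```clojure
;; def-pair is a hypothetical name for the macro shape shown above.
(defmacro def-pair [a b]
  `(do
     (def ~a 5)
     (def ~b ~a)))

;; The macro returns a single do form wrapping both defs.
(def expansion (macroexpand-1 '(def-pair first-var second-var)))

;; Used at top level, the do is unwrapped and the defs run in order.
(def-pair first-var second-var)
(def pair-values [first-var second-var])
```

Here expansion is the list (do (def first-var 5) (def second-var first-var)), and pair-values ends up as [5 5].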

And the name resolution for ns is special-cased, IIRC.

Meikel




Phillip Lord

Sep 20, 2013, 3:52:07 AM
to clo...@googlegroups.com
Gary Verhaegen <gary.ve...@gmail.com> writes:

> As Meikel said in his previous mail, 'do' at the top-level is treated
> specially: each form is treated as a separate top-level form. This is, for
> example, useful for defining a macro that defines multiple functions.


I'd be interested to see an example where this behaviour is important.
It is certainly producing a negative effect here. Or perhaps I should
say it is covering up a negative effect; given that the compiler is
capable of understanding that

(def bob3 3)

introduces a new symbol, I am struggling to see why

(intern 'user 'bob3 3)

cannot be recognised similarly.


> So what Meikel was really trying to say is that the reason (do (intern
> 'user 'bob3 3) bob3) works is that it is treated the same as
> (intern 'user 'bob3 3)
> bob3
>
> This special handling only occurs when do appears as a top-level form,
> which is the reason why your other examples fail.


I did manage to work out why they failed (although only by inference),
but the exceptional behaviour of do confused me, as it confounds the
explanation. I'm happy that I now understand the situation. I still think
it's a bug.

Phil

Meikel Brandmeyer

Sep 20, 2013, 4:03:25 AM
to clo...@googlegroups.com
Hi,

2013/9/20 Phillip Lord <philli...@newcastle.ac.uk>
> (def bob3 3)
>
> introduces a new symbol, I am struggling to see why
>
> (intern 'user 'bob3 3)
>
> cannot be recognised similarly.

Because intern happens at runtime. It's a normal function. The above intern call is easily translated to the def. intern is only interesting when the name (here bob3) is computed from runtime information, which is not available at compile time. But then the compiler cannot special-case it anyway.
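
A minimal sketch of the runtime case, with made-up data: the names exist only as strings until the code runs, so there is nothing for the compiler to see, and the lookup has to happen at runtime as well.

```clojure
;; Names that are only known at run time (hypothetical data).
(def entity-names ["alpha" "beta"])

;; intern is an ordinary function, so its arguments are evaluated.
(doseq [[idx nm] (map-indexed vector entity-names)]
  (intern *ns* (symbol nm) idx))

;; No symbol named beta was ever compiled as a reference,
;; so the Var is looked up at run time instead.
(def beta-value (var-get (ns-resolve *ns* 'beta)))
```

With the data above, beta-value is 1.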

Meikel

Phillip Lord

Sep 20, 2013, 6:30:34 AM
to clo...@googlegroups.com
From a user perspective, I think that this is very messy. I can see your
example about "do", but then I think using a common form like do is a
mistake; having "do" behave differently depending on whether it is top
level or not is exposing a lot of detail that should not be.

Phil

Gary Verhaegen

Sep 21, 2013, 12:00:56 PM
to clo...@googlegroups.com
Well, do actually means "execute all of the following forms, sequentially, at this point". It seems sensible to me that this also holds when "this point" is the top level.

I do not know what your actual use-case is, but, as Meikel said, it seems a bit strange to use intern if you know the symbol at compile time.

Phillip Lord

Sep 23, 2013, 4:09:47 AM
to clo...@googlegroups.com


Gary Verhaegen <gary.ve...@gmail.com> writes:
> Well, do actually means "execute all of the following forms, sequentially,
> at this point". It seems to me that the position of saying this also works
> when "this point" is the top-level is sensible.

Perhaps; what does not seem sensible is that it *only* works at
top-level. As I gave in my original email

(do (intern 'user 'x 1)
    x)

works, but

(try
  (do (intern 'user 'y 1)
      y))

fails.


> I do not know what your actual use-case is, but, as Meikel said, it seems a
> bit strange to use intern if you know the symbol at compile time.

Yes, I didn't explain that.

So, I have a library which provides a load of "def" forms which all
intern. This is implemented through a macro which expands to a def form.


(defmacro intern-owl
  "Intern an OWL Entity. Compared to the clojure.core/intern function this
  signals a hook, and adds :owl true to the metadata. NAME must be a symbol"
  ([name entity]
     ;; we use def semantics here, rather than intern-owl-string, because
     ;; intern is not picked up by the compiler as defining a symbol
     `(tawny.owl/run-intern-hook
       (def ~(vary-meta name
                        merge
                        {:owl true})
         ~entity))))


As you can see, I add some metadata and run a hook function. This is my
main use case; it works because I know the symbol at compile time.

However, in a secondary use case, I do not know the symbol and have to
calculate it. It looks like this....

(defn intern-owl-string
  "Interns an OWL Entity. Compared to the clojure.core/intern function this
  signals a hook, and adds :owl true to the metadata. NAME must be a string"
  ([name entity]
     (tawny.owl/run-intern-hook
      (intern *ns*
              (with-meta
                (symbol name)
                {:owl true})
              entity))))

So, now we have two problems. The first is that I have two nearly
identical functions. The second is that intern-owl-string will fail under
circumstances that are difficult to predict (for someone who doesn't know
the guts of the compiler).

As it happens, the second is mostly a pain during testing -- and I can
work around it -- this form for instance works okay.

(try (let [sym 'b] (intern 'user sym 1)(var-get (get (ns-publics *ns*) sym))))

and does the same as this.

(try (do (intern 'user 'y 1) y))

Phil

Meikel Brandmeyer (kotarak)

Sep 23, 2013, 4:51:11 AM
to clo...@googlegroups.com
Hi,

don't get me wrong: I don't want to discuss away your use case! But I don't understand it yet. After calling intern, do you compile other code referencing the created Vars? If no: why do you need Vars? If yes: why don't you just generate code with a def and compile it?

Meikel

Phillip Lord

Sep 23, 2013, 5:37:10 AM
to clo...@googlegroups.com


"Meikel Brandmeyer (kotarak)" <m...@kotka.de> writes:
So, I have built a library which allows me to generate logic statements,
which can then be computed over. So:

(defclass A)
(defclass B :subclass A)

which means that all instances of B are also instances of A. defclass is
implemented with a macro which underneath expands to a def form. The
vars in this case (A and B) hold Java objects representing these
statements.

These statements can be saved in an XML representation called OWL, which
is a W3C standard. Most of the people generating OWL files are not using
my library, so I need to be able to read these OWL files and interact
with them in a manner similar to how I would if I had written them with
my library. So I parse the XML file, generate some Java objects, then
use intern to generate vars. So, now I can refer to an OWL file in
exactly the same way as if it were written in Clojure. All of this
works, except for the problem I have had with testing: I cannot load
vars on the fly.



I have a related problem when I want to test a renderer that I have
written which generates Clojure code (again, representing logical
statements). After rendering the Clojure, I then require the
rendered code. Nice idea, but it fails -- you can see an example of the
problem here:


user=> clojure.set/difference
CompilerException java.lang.ClassNotFoundException: clojure.set, compiling:(NO_SOURCE_PATH:0:0)
user=> (do (require 'clojure.set) clojure.set/difference)
#<set$difference clojure.set$difference@6ee76fcc>

(restart repl)


user=> clojure.set/difference
CompilerException java.lang.ClassNotFoundException: clojure.set, compiling:(NO_SOURCE_PATH:0:0)
user=> (try (require 'clojure.set) clojure.set/difference)
CompilerException java.lang.ClassNotFoundException: clojure.set, compiling:(NO_SOURCE_PATH:2:1)


In the end, my work around was equivalent to this...

(try (require 'clojure.set) (eval 'clojure.set/difference))
#<set$difference clojure.set$difference@63ae6098>

Tell me if all this is making you feel queasy. In either circumstance,
the code *should* run, although I understand why it does not. This
strong separation between static compilation and a dynamic eval seems
unnatural to me.
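
One alternative to eval, sketched here on the assumption that only the lookup needs to be deferred: resolve runs at runtime, after require has loaded the namespace, so the qualified symbol is never compiled as a reference.

```clojure
(def diff-result
  (try
    (require 'clojure.set)
    ;; The quoted symbol is plain data at compile time; resolve turns it
    ;; into a Var only after require has run.
    (let [difference (resolve 'clojure.set/difference)]
      (difference #{1 2 3} #{2}))
    (catch Throwable _ :failed)))
```

With these inputs diff-result is #{1 3}. Compared with eval this avoids invoking the compiler again, at the cost of going through a Var for every such reference.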

Anyway, I do have workarounds to be going on with, and I suspect that
this is not going to change, because it is deep in the guts of Clojure.

Phil

Meikel Brandmeyer (kotarak)

Sep 23, 2013, 7:02:32 AM
to clo...@googlegroups.com
Hi,


On Monday, 23 September 2013 11:37:10 UTC+2, Phillip Lord wrote:
> So, now I can refer to an OWL file in
> exactly the same way as if it were written in Clojure.

Ok. So, it is correct to think of this as a fancy require-owl, isn't it? Then this means you read the OWL files at compilation time of your code. But that means, that a macro expanding to a series of def will do the job. There is no need for a require. I could imagine something like this (a lot of pseudo-code there):

(defmacro defowl
  [sym init]
  `(do
     (def ~(with-meta sym {:owl true}) ~init)
     (tawny.owl/run-intern-hook (var ~sym))
     (var ~sym)))

(defn owl-entry-definition
  [{:keys [sym init]}]
  `(defowl ~(further-process sym) ~init))

(defmacro require-owl
  [file]
  (let [owl-entries (owl-seq (io/reader file))]
    `(do ~@(map owl-entry-definition owl-entries))))

; Usage
(require-owl "containing-A-and-B.owl")

(defn use-classes
  []
  (do-something-owly A B))


This could be further cleaned up, but I think you get the idea. (One could add for example namespace handling for owl files or provide it as an option to require-owl, use with-open &c. &c.)

For testing, resolve and ns-resolve are your friend:

(deftest loading-owls
  (require-owl "some-dummy-A.owl")
  (let [dummy-A (resolve 'A)]
    (is (not (nil? dummy-A)))
    (is (= @dummy-A some-reference-A))))
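
A self-contained variant of the same testing pattern, as a sketch only: load-dummy! stands in for a loader such as require-owl and is not part of any real library, and the namespace is captured at load time because *ns* may point elsewhere while tests run.

```clojure
(require '[clojure.test :refer [deftest is run-tests successful?]])

;; Capture the namespace at load time; *ns* may differ while tests run.
(def this-ns *ns*)

;; Hypothetical stand-in for a loader: defines a Var at run time.
(defn load-dummy! []
  (intern this-ns 'dummy-a 42))

(deftest loading-dummy
  (load-dummy!)
  ;; The test body is compiled before load-dummy! runs, so the Var
  ;; is looked up by name at run time instead of written as dummy-a.
  (let [v (ns-resolve this-ns 'dummy-a)]
    (is (some? v))
    (is (= 42 @v))))

(def tests-passed? (successful? (run-tests this-ns)))
```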



Hope that helps.

Meikel

Phillip Lord

Sep 23, 2013, 9:20:29 AM
to clo...@googlegroups.com
"Meikel Brandmeyer (kotarak)" <m...@kotka.de> writes:
> On Monday, 23 September 2013 11:37:10 UTC+2, Phillip Lord wrote:
>>
>> So, now I can refer to an OWL file in
>> exactly the same way as if it were written in Clojure.
>>
>
> Ok. So, it is correct to think of this as a fancy require-owl, isn't it?
> Then this means you read the OWL files at compilation time of your code.
> But that means, that a macro expanding to a series of def will do the job.
>
> (defmacro defowl
> [sym init]
> `(do
> (def ~(with-meta sym {:owl true}) ~init)
> (tawny.owl/run-intern-hook (var ~sym))
> (var ~sym)))
>
> (defn owl-entry-definition
> [{:keys [sym init]}]
> `(defowl ~(further-process sym) ~init))
>
> (defmacro require-owl
> [file]
> (let [owl-entries (owl-seq (io/reader file))]
> `(do ~@(map owl-entry-definition owl-entries))))
>
> ; Usage
> (require-owl "containing-A-and-B.owl")
>
> (defn use-classes
> []
> (do-something-owly A B))


Yes, this works, and it's a good idea. My only criticism would be that
it's forcing me to do rather more with macros than I want -- the
"require-owl" step actually filters entities from the OWL file (not all
of them get vars) and transforms them (the legal name restrictions for
OWL are not the same as clojure). This also has implications for how
require-owl works; so

(def x "containing-A-and-B.owl")
(require-owl x)

won't work, while it would with a function.

I was a bit concerned this might hit the limit of the size of forms that
you see when load'ing code, but this doesn't happen -- I've just done
1,000,000 def statements in a single macro and it all seems happy.

Thanks for the feedback!

Phil

Meikel Brandmeyer (kotarak)

Sep 23, 2013, 9:33:27 AM
to clo...@googlegroups.com
Hi,

you are right. require-owl also works as a function:

(defn require-owl
  [file & {nspace :namespace :or {nspace *ns*}}]
  (with-open [rdr (io/reader file)]
    (doseq [entry (owl-seq rdr)
            :let [entry-name (translate-name (:name entry))]]
      (intern nspace entry-name (:init entry))
      (alter-meta! (ns-resolve nspace entry-name) assoc :owl true))))


This should work as before. I think the important part for your question is actually the ns-resolve to fix the tests.

Meikel

Phillip Lord

Sep 23, 2013, 10:09:08 AM
to clo...@googlegroups.com
"Meikel Brandmeyer (kotarak)" <m...@kotka.de> writes:
Nope, that fails as well. intern already returns the var, so there is no
point looking it up with ns-resolve. And because you've used intern, this
will still fail in the original case I was complaining about...

(do (require-owl "bob.owl")
    bob)

(where bob is defined by the require-owl) unless the `do` is a
top-level.

(do (require-owl "bob.owl")
    (var-get (ns-resolve 'bob 'bob)))

would work.
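
Since intern returns the var, one way to sidestep the lookup is to hold on to that return value. A rough sketch, where define-entity is a hypothetical helper and not part of tawny.owl:

```clojure
;; define-entity is a hypothetical helper, not part of tawny.owl.
(defn define-entity [ns-sym nm value]
  (intern ns-sym (with-meta (symbol nm) {:owl true}) value))

;; Keep the Var that intern returns instead of naming the symbol in code.
(def bob-var (define-entity (ns-name *ns*) "bob" 1))

;; Both reads go through the Var, so nothing here depends on the
;; compiler having seen a symbol called bob.
(def bob-value (var-get bob-var))
(def bob-owl? (:owl (meta bob-var)))
```

Here bob-value is 1 and bob-owl? is true, because intern copies the metadata of the symbol onto the Var.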

See, it's counter-intuitive isn't it? I've tried this backwards and
forwards now. intern is much better when you don't know the symbol at
compile-time because the first parameter evals, unlike def, but then you
get these random compilation failures.

AFAICT, this is an inevitable outcome of Clojure's non-interning
compiler. At least, I think that the documentation of intern, do and
defn needs to be updated, even if it is an edge-case.

Phil