Inconsistent AOT ClassNotFoundException


Austin Haas

Sep 10, 2019, 10:44:47 PM
to Clojure
I have to use AOT for a project, because it uses Apache Beam/Google Dataflow.

There are two libraries: A and B. A depends on B. Both require AOT.

B uses :aot :all in the project.clj :dev profile, so that it will AOT for tests, but not for the jar.

A uses :aot :all at the top level of its project.clj.
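
A minimal sketch of the two configurations described above. The project names, versions, and dependency coordinates are hypothetical placeholders, not taken from the actual projects:

```clojure
;; B/project.clj (hypothetical name and version)
(defproject example/b "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.1"]]
  ;; :aot only inside the :dev profile: namespaces are compiled when
  ;; running tests, while the intent is that the packaged jar is built
  ;; without AOT.
  :profiles {:dev {:aot :all}})

;; A/project.clj (hypothetical name and version)
(defproject example/a "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.1"]
                 [example/b "0.1.0-SNAPSHOT"]]
  ;; Top-level :aot: every namespace in A is compiled for all tasks,
  ;; including the jar.
  :aot :all)
```

One thing to keep in mind with this layout: Clojure's AOT compilation is transitive, so compiling A can also emit class files for any of B's namespaces that A loads from source.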

Sometimes, when the program runs, I will get a ClassNotFoundException. The file name for the missing class has a number appended to it (same number every time). The corresponding class file is in the target directory, but the appended number is much higher.

What is going on here? Are my classes being compiled multiple times? Are they being compiled in a nondeterministic order?

Any clues would be appreciated.

So far, this code always works when I build and install locally. It seemed to fail every time after I built it on our build server and served it through a jar repository, but it started working a few hours later when I got on VPN. And then it started throwing the exception again when I brought it in as a dependency of a new project.

Thanks.

Kimmo Koskinen

Sep 12, 2019, 1:37:24 AM
to Clojure
Hi!

Not a direct answer, but have you looked at clj-headlights (https://github.com/logrhythm-oss/clj-headlights), an Apache Beam wrapper for Clojure? It might have pointers related to Beam/AOT specifically.

- Kimmo

atdixon

Sep 12, 2019, 5:41:36 PM
to Clojure
Interesting! I had not seen clj-headlights, but my org is using Beam + Clojure and we've made some decisions similar to those of clj-headlights. Specifically, we are avoiding AOT and its headaches.

Austin Haas

Sep 17, 2019, 12:38:45 AM
to Clojure
Thanks for your replies.

I've looked at clj-headlights a bunch, and datasplash, too. I was mistaken to think that AOT was necessary. Earlier in the project, AOT simplified a few things, like affording the use of anonymous functions (in ParDo implementations), and I don't think I realized until now that we had made a severe tradeoff.

I still don't know what is going on with the AOT and the loading, but I'm optimistic that I can bypass the issue entirely by avoiding AOT.

-austin

Austin Haas

Oct 22, 2019, 12:02:09 AM
to Clojure

We ended up sticking with AOT (for now, anyway), because it seems easier to manage in the codebase. The alternative is to use data structures that can be eval'd, like you would use in the body of a macro. I like how that clearly separates the code that runs on the local machine from that which runs on the server, but our pipeline is dynamic and there is a lot of function composition, and it seems unwieldy to work with code-as-data on such a large scale. It is like having several 500-line macros that each depend on a bunch of "code fragment" emitting functions.

Regarding our compilation issue, we've narrowed the problem to the timestamps inside our library jars. Our build server is using GMT/UTC and our local workstations are using PT (currently 7 hours behind GMT). If I understand correctly, Lein checks timestamps to determine which files need to be recompiled, so for 7 hours after a server build our library jars will have later timestamps than anything newly built locally. I don't understand exactly what is happening, but the result is that we get unbound function or class not found exceptions. It may have something to do with how timestamps are represented in jars. We have confirmed that the clock and timezone are set correctly on the server.
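
One way to inspect the timestamps in question is to list the last-modified time of each source and class entry in a jar. The sketch below is a diagnostic only, not a fix. The namespace `jar-times` and the function `entry-times` are hypothetical names, and it uses only `java.util.jar.JarFile` from the JDK. Two background details it relies on: classic zip/jar entries store a local date and time with no timezone, which Java interprets in the JVM's default timezone when reading, and Clojure's loader prefers a `.class` file over the matching `.clj` source when the class file is newer.

```clojure
(ns jar-times
  (:import (java.util Date)
           (java.util.jar JarEntry JarFile)))

(defn entry-times
  "Returns a vector of [entry-name last-modified-date] pairs for the
  .clj, .cljc and .class entries of the jar at jar-path."
  [jar-path]
  (with-open [jar (JarFile. (str jar-path))]
    (->> (enumeration-seq (.entries jar))
         ;; keep only Clojure sources and compiled classes
         (filter (fn [^JarEntry e]
                   (re-find #"\.(clj|cljc|class)$" (.getName e))))
         ;; mapv is eager, so everything is read before the jar closes
         (mapv (fn [^JarEntry e]
                 [(.getName e) (Date. (.getTime e))])))))
```

Running `(entry-times "path/to/library.jar")` on both the build server and a workstation, and comparing the results against the files under `target/classes`, would show whether the same jar is read with different apparent timestamps in the two timezones.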

Any clues?