Help sought on issue with AOT, hadoop, classloaders, and consistency of Clojure fn classes


Jason Wolfe

Jan 29, 2015, 2:39:54 AM
to clo...@googlegroups.com
First off, I apologize in advance for not having a reduced test case, and express my sincere gratitude in advance for any assistance.  I've been tearing my hair out for a day or so and not making headway, and figured someone here might recognize some keywords and have a pointer in the right direction. (I'm admittedly pretty green when it comes to class loading, and have largely exhausted my google fu).  

Problem: 

I'm submitting a hadoop job using clojure-hadoop.  All is well with a simple job, but once I require something that transitively depends on Schema, I end up with: 

clojure.lang.Compiler$CompilerException: java.lang.IllegalArgumentException: No implementation of method: :walker of protocol: #'schema.core/Schema found for class: clojure.core$long, compiling:(crane/config.clj:33:4)

It works fine when run in-process with hadoop-mapreduce-client-jobclient, but not with bin/hadoop -jar.  This smelled like a classloader issue, and after digging in it seems that there are multiple versions of clojure.core$long floating around.  The class on which the protocol is extended is not the same class as that of the fn that the symbol 'long resolves to in client code.

Context: 

clojure-hadoop is AOT-compiled, and after being loaded by hadoop it dynamically loads the target namespace (which is not AOT-compiled, nor is any of the other code in question) using https://github.com/alexott/clojure-hadoop/blob/master/src/clojure_hadoop/load.clj#L3

From here, schema is transitively required, and then client namespaces attempt to use the Schema protocol to generate validators, and when the schema 'long is used (which resolves to the fn with class clojure.core$long), it fails to find the appropriate method.  

After repeated head-bashing, I've determined that there are at least two versions of the clojure.core$long class floating around -- the one used to extend the protocol, which stems from a DynamicClassLoader, and the one that 'long resolves to in client code, which stems from a URLClassLoader.  The URLClassLoader is the loader of the current thread and Compiler, but not @(clojure.lang.Compiler/LOADER).
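[Editor's sketch] The kind of inspection described above can be done from a REPL or nrepl session inside the job. The helper names below (loader-chain, describe-class, loader-report) are made up for illustration and appear nowhere in the thread; the interop calls are standard JVM and Clojure runtime methods. Two Class objects with the same name but different defining loaders are distinct classes as far as protocol dispatch is concerned.

```clojure
(defn loader-chain
  "Seq of classloaders from c's defining loader upward.
  Empty for classes defined by the bootstrap loader."
  [^Class c]
  (take-while some?
              (iterate #(.getParent ^ClassLoader %) (.getClassLoader c))))

(defn describe-class
  "Name, identity hash, and loader chain of a class, for side-by-side comparison."
  [^Class c]
  {:name     (.getName c)
   :identity (System/identityHashCode c)
   :loaders  (mapv #(.getName (class %)) (loader-chain c))})

(defn loader-report
  "Loaders the runtime may consult when loading code on the current thread."
  []
  {:context                (.getContextClassLoader (Thread/currentThread))
   :base                   (clojure.lang.RT/baseLoader)
   :compiler-loader-bound? (.isBound clojure.lang.Compiler/LOADER)})

;; Compare the class that 'long resolves to in client code with the class
;; seen at the point where the protocol is extended.
(describe-class (class long))
(loader-report)
```

If the two maps returned by describe-class in the two places differ in :identity or :loaders, the same class name has been defined more than once.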

Attempts:

I've tried wrapping the clojure-hadoop loading code with .setContextClassLoader on some obvious candidates and binding *use-context-classloader* around the code doing the loading, to no avail.  I've tried changing the schema code to reference the class in different ways ((class (resolve 'long)), (class 'long), etc.) and that hasn't made a difference.  I've checked and the clojure-hadoop jar doesn't contain any .class files for clojure, schema, or other offending code.  
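[Editor's sketch] For readers unfamiliar with the first attempt, a wrapper of roughly this shape is what is meant by setting the context classloader around the load. The macro name with-context-loader is hypothetical, and the thread reports that this approach did not fix the problem, so treat it as an illustration of the attempt rather than a workaround.

```clojure
(defmacro with-context-loader
  "Runs body with the current thread's context classloader set to loader,
  restoring the previous loader afterwards."
  [loader & body]
  `(let [thread# (Thread/currentThread)
         saved#  (.getContextClassLoader thread#)]
     (.setContextClassLoader thread# ~loader)
     (try
       ~@body
       (finally
         (.setContextClassLoader thread# saved#)))))

;; Hypothetical usage; my.job.ns stands in for the namespace being loaded:
;; (with-context-loader (clojure.lang.RT/baseLoader)
;;   (binding [*use-context-classloader* true]
;;     (require 'my.job.ns)))
```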

Plea:

I suspect there's something obvious I'm missing.  (In retrospect it seems like the design of Schema may be suboptimal in light of this, but if possible I'd like to figure out a workaround without changing that substantially). Thanks in advance for your help -- any and all pointers are welcome.  

-Jason



 

Sean Corfield

Jan 29, 2015, 12:45:25 PM
to clo...@googlegroups.com
Which version of Clojure are you using?

Does clojure-hadoop or Schema include AOT-compiled versions of other libraries and/or core namespaces?

If the answers are 1.7.0 Alpha 5 and "yes" then you've run into the same problem I and a few others did: the previously undefined behavior of loading both AOT and JIT versions of the same code now has a defined behavior (preferring AOT) - and you get this exception.

Sean

Jason Wolfe

Jan 29, 2015, 1:26:58 PM
to clo...@googlegroups.com
On Thursday, January 29, 2015 at 9:45:25 AM UTC-8, Sean Corfield wrote:
Which version of Clojure are you using?

1.6.0, both for AOT and at runtime.
 

Does clojure-hadoop or Schema include AOT-compiled versions of other libraries and/or core namespaces?

No, as far as I know the only AOT compiled code present is clojure-hadoop and possibly the Clojure jar itself.  We may depend on other libs that are AOT as well, but none that live above schema.  
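[Editor's sketch] One way to double-check which namespaces are visible in AOT-compiled form is to ask the classloader for the namespace's __init.class resource. The function name aot-class-resource is invented for this note; it assumes the usual AOT layout in which namespace a.b-c compiles to a/b_c__init.class, and it only sees what the current context classloader can see.

```clojure
(require '[clojure.java.io :as io])

(defn aot-class-resource
  "URL of the AOT-compiled init class for ns-sym if one is visible to the
  context classloader, else nil."
  [ns-sym]
  (let [path (-> (name ns-sym)
                 (.replace \- \_)
                 (.replace \. \/)
                 (str "__init.class"))]
    (io/resource path)))

;; A non-nil result means compiled classes for that namespace are on the classpath.
(aot-class-resource 'clojure.core)
(aot-class-resource 'schema.core)
```

The returned URL also shows which jar the compiled class comes from, which helps when the same namespace is reachable through more than one jar.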
 

If the answers are 1.7.0 Alpha 5 and "yes" then you've run into the same problem I and a few others did: the previously undefined behavior of loading both AOT and JIT versions of the same code now has a defined behavior (preferring AOT) - and you get this exception.

I guess this is probably a different issue then, but maybe there's a common workaround?  

Thanks for the quick reply!

shlomi...@gmail.com

Jan 29, 2015, 7:18:15 PM
to clo...@googlegroups.com
Hey, 
Just to be sure, are you loading the full uberjar to hadoop? 

Jason Wolfe

Jan 29, 2015, 11:56:31 PM
to clo...@googlegroups.com


On Thursday, January 29, 2015 at 4:18:15 PM UTC-8, shlomi...@gmail.com wrote:
Hey, 
Just to be sure, are you loading the full uberjar to hadoop? 

Yes.  The issue isn't too few classes being found, it's too many :).  

If I start an nrepl server inside the job and futz around with loading and reloading a few times I can get all the code to load, but haven't figured out yet if it's possible to turn this into a klutzy workaround...

Jason Wolfe

Jan 30, 2015, 1:59:17 AM
to clo...@googlegroups.com
Thanks for the help everyone.  After some more fiddling, with two changes I can hack things to work but it's not pretty.


Second, if I just 
(require 'top-level-ns) 

things still break, but if I execute a careful sequence of steps like
(require 'schema.core 'another-ns 'yet-another-ns 'top-level-ns) where the intermediate ones are the place things were breaking, then everything works fine.  
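[Editor's sketch] The explicit load order above could be wrapped in a small helper so the sequence lives in one place. require-in-order is a made-up name, and the namespace symbols in the usage comment are the placeholders from the message, not real namespaces.

```clojure
(defn require-in-order
  "Requires each namespace in ns-syms from the top level, one at a time and
  in the given order, instead of letting nested requires pull them in
  transitively. Returns the vector of symbols that were required."
  [ns-syms]
  (mapv (fn [ns-sym]
          (require ns-sym)
          ns-sym)
        ns-syms))

;; Usage with the placeholder names from the message:
;; (require-in-order '[schema.core another-ns yet-another-ns top-level-ns])
```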

I'm guessing this is something to do with a nested sequence of classloaders created along the namespace loading chain, which this effectively flattens out so that the problem namespaces are loaded with the root classloader.  But I'm admittedly out of my depth here.  We'd obviously like to avoid such hackery; does this extra info give enough context for anyone to offer a better solution?

Thanks!
Jason 



Marshall Bockrath-Vandegrift

Jan 30, 2015, 10:27:12 AM
to clo...@googlegroups.com
Not a solution to your immediate problem, but if this is for new development (not an existing mass of clojure-hadoop code), I'd suggest looking at Parkour instead.  As the main Parkour developer I'm obviously biased, but Parkour exists in part because the compilation model used by clojure-hadoop in order to meet Hadoop's expectations is very much at odds with typical Clojure development. In particular, Parkour does not require AOT compilation.

Jason Wolfe

Jan 30, 2015, 5:00:31 PM
to clo...@googlegroups.com
On Friday, January 30, 2015 at 7:27:12 AM UTC-8, Marshall Bockrath-Vandegrift wrote:
Not a solution to your immediate problem, but if this is for new development (not an existing mass of clojure-hadoop code), I'd suggest looking at Parkour instead.  As the main Parkour developer I'm obviously biased, but Parkour exists in part because the compilation model used by clojure-hadoop in order to meet Hadoop's expectations is very much at odds with typical Clojure development. In particular, Parkour does not require AOT compilation.


Thanks for the recommendation.  For now we're looking for a simple low-level interface to MR, but we're also keeping an eye on parkour and pigpen for more complex tasks down the road.  Can you explain why I might prefer parkour to pigpen or vice-versa? 

Marshall Bockrath-Vandegrift

Feb 2, 2015, 10:12:14 AM
to clo...@googlegroups.com
On Friday, January 30, 2015 at 5:00:31 PM UTC-5, Jason Wolfe wrote:
Thanks for the recommendation.  For now we're looking for a simple low-level interface to MR, but we're also keeping an eye on parkour and pigpen for more complex tasks down the road.  Can you explain why I might prefer parkour to pigpen or vice-versa? 

Parkour actually is a low-level interface to MR.  It just exposes that interface through relatively Clojure-idiomatic and composable abstractions, which can make it look higher-level than it really is.  At Parkour's core is the support required for MR tasks to invoke a regular Clojure var-bound function in place of the `.run` method of a `Mapper` or `Reducer` class.  Everything else in Parkour is built to make using that primitive, low-level interface more composable, convenient, and pleasant; but ultimately nothing replaces that interface -- your Parkour MR Clojure task code runs in exactly the way equivalent raw Hadoop MR Java task code would.
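[Editor's sketch] The general idea of running a var-bound function that is located by name at runtime, with no AOT compilation of the task code, can be illustrated in a few lines. This is not Parkour's API; invoke-var is a hypothetical helper that only shows the underlying mechanism of loading a namespace and calling a var found by its qualified symbol.

```clojure
(defn invoke-var
  "Loads the namespace of the fully qualified symbol var-sym, resolves the
  var, and applies it to args."
  [var-sym & args]
  (require (symbol (namespace var-sym)))
  (if-let [v (resolve var-sym)]
    (apply v args)
    (throw (IllegalArgumentException. (str "No such var: " var-sym)))))

;; The task function is named as data and found at runtime:
(invoke-var 'clojure.string/upper-case "abc")
```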

Parkour's documentation includes a "motivation" document describing the project motivation in the face of the Clojure-Hadoop integration projects which existed when I started Parkour (including clojure-hadoop): https://github.com/damballa/parkour/blob/master/doc/motivation.md . It doesn't yet cover PigPen, although I certainly should add a section. I honestly haven't evaluated PigPen in detail, but the approach of compiling Clojure code to Pig seems excessively complex to me, to the point of only being worth it for organizations which have already made a significant investment in Pig.

-Marshall
 

Jason Wolfe

Feb 2, 2015, 1:42:28 PM
to clo...@googlegroups.com
Ah, that's very interesting -- thanks for the explanation.  I think when I saw the similarity to pigpen syntax I just assumed parkour was also compiling to Pig.  I'll definitely have to look into parkour further.  

Cheers,
Jason