Confusing ClassLoader problem running Cascading 3.1.0/Tez 0.8.3 on EMR 4.7.0


Luis Casillas

Jun 10, 2016, 10:16:59 PM
to cascading-user
(This is less of a question and more of a postmortem infodump on a hard-to-diagnose issue I just ran into.  But suggestions are certainly welcome!)

Recent EMR versions support Tez, so I just spent a very confusing while trying to get a moderately complex in-house application to run with Cascading 3.1.0/Tez on EMR 4.7.0 (Tez 0.8.3).

The first obstacle I ran into: my application uses the Hadoop 2.6.0+ `HADOOP_USE_CLIENT_CLASSLOADER=true` feature.  This environment variable instructs the `hadoop jar` command to place the contents of the application JAR file, its `lib/` directory and the `HADOOP_CLASSPATH` environment variable ahead of Hadoop's own jars.  I've found this feature necessary with my application because Hadoop ships with old, buggy versions of Avro.
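To illustrate the general idea of putting application classes ahead of the platform's, here is a minimal sketch of a child-first class loader. It is not Hadoop's actual implementation; the class name `ChildFirstClassLoader` and the `systemPrefixes` parameter are hypothetical names chosen for this example, and the assumption is simply that a short list of "system" package prefixes is always delegated to the parent.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative child-first ("application first") class loader.
// Hypothetical names throughout; this is NOT Hadoop's implementation.
public class ChildFirstClassLoader extends URLClassLoader {
    private final String[] systemPrefixes;

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent, String... systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes.clone();
    }

    // Classes matching a system prefix are always delegated to the parent.
    public boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Look in the application's own URLs before asking the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the application jars; fall back to the parent below.
                }
            }
            if (c == null) {
                // Standard parent-first delegation as the fallback.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // No application URLs here, so every lookup ends up at the parent.
        try (ChildFirstClassLoader loader = new ChildFirstClassLoader(
                new URL[0], ChildFirstClassLoader.class.getClassLoader(), "java.", "javax.")) {
            Class<?> c = loader.loadClass("java.util.ArrayList");
            System.out.println(c.getName() + " loaded by " + c.getClassLoader());
        }
    }
}
```

With a loader like this, an application jar that bundles a newer Avro would win over an older copy visible to the parent, which is the effect I want. The catch is that the same class name can now be defined by two different loaders in one JVM.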

But this feature and Tez don't play well together at all.  I've filed a bug against Tez that details the issue.
To make a long story short:
  • I got NoClassDefFoundErrors, which at first suggest that my application's classpath is wrong;
  • But after troubleshooting, EMR 4.7.0 appears to set up the Tez classpath correctly, as far as I can tell;
  • Moreover, the application was successfully loading other classes from the same jar file as the (supposedly) missing class.
So the cause actually turns out to be ClassLoader shenanigans.  I found a way to get past them (see my comment on the ticket).
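For anyone chasing a similar symptom, a small diagnostic along these lines can show which loader defined a class and which jar it came from. This is a generic sketch using only standard JDK calls, not the workaround from the ticket; the class name `ClassOrigin` and its methods are hypothetical.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports where a class was loaded from.
public class ClassOrigin {

    // Describe the defining loader and code source (usually a jar) of a class.
    public static String describe(Class<?> c) {
        ClassLoader loader = c.getClassLoader();
        CodeSource source = c.getProtectionDomain().getCodeSource();
        return c.getName()
            + " loader=" + (loader == null ? "bootstrap" : loader.getClass().getName())
            + " source=" + (source == null ? "unknown" : String.valueOf(source.getLocation()));
    }

    // Walk the parent chain of a loader, ending at the bootstrap loader.
    public static String loaderChain(ClassLoader loader) {
        StringBuilder sb = new StringBuilder();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            sb.append(cl.getClass().getName()).append(" -> ");
        }
        return sb.append("bootstrap").toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(ClassOrigin.class));
        // The thread context loader and the defining loader can differ,
        // which is one common source of confusing NoClassDefFoundErrors.
        System.out.println("context:  "
            + loaderChain(Thread.currentThread().getContextClassLoader()));
        System.out.println("defining: "
            + loaderChain(ClassOrigin.class.getClassLoader()));
    }
}
```

Calling `describe` on a class that does load and comparing its loader with the thread context loader at the point of failure is a quick way to tell a genuinely missing jar from a loader mismatch.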

Andre Kelpe

Jun 13, 2016, 6:15:01 AM
to cascading-user
Thanks for sharing this. I can imagine how much "fun" it was to debug this one.

- André

--
André Kelpe
an...@concurrentinc.com
http://concurrentinc.com