trying to run cascading on hadoop


Marton Trencseni

Nov 10, 2012, 12:12:24 PM
to cascadi...@googlegroups.com
After my previous post, where I was unsuccessful in getting Scalding to run, I downloaded Cascading and tried to get it to run with one of its sample files. I got the same exception.

$ hadoop version
Hadoop 2.0.0-cdh4.1.2

$ hadoop jar MyJob.jar Main hdfs://hostname/user/mtrencseni/xxx.csv hdfs://hostname/user/mtrencseni/xxx2.csv

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/scalding/target/scalding-assembly-0.8.2-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/11/10 18:00:54 INFO util.HadoopUtil: resolving application jar from found main method on: Main
12/11/10 18:00:54 INFO planner.HadoopPlanner: using application jar: /home/mtrencseni/cascading-2.0.6/MyJob.jar
12/11/10 18:00:54 INFO property.AppProps: using app.id: DCCA7F3875931F9872FAA96DCB39A3CD
Exception in thread "main" cascading.flow.planner.PlannerException: could not build flow from assembly: [Not implemented by the DistributedFileSystem FileSystem implementation]
    at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:503)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:230)
    at cascading.flow.FlowConnector.connect(FlowConnector.java:454)
    at Main.main(Main.java:63)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.UnsupportedOperationException: Not implemented by the DistributedFileSystem FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:200)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2186)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2196)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2213)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
    at cascading.tap.hadoop.Hfs.getFileSystem(Hfs.java:321)
    at cascading.tap.hadoop.Hfs.getFullIdentifier(Hfs.java:351)
    at cascading.tap.hadoop.Hfs.getFullIdentifier(Hfs.java:78)
    at cascading.scheme.hadoop.TextDelimited.retrieveSourceFields(TextDelimited.java:733)
    at cascading.tap.Tap.retrieveSourceFields(Tap.java:343)
    at cascading.flow.BaseFlow.retrieveSourceFields(BaseFlow.java:202)
    at cascading.flow.BaseFlow.<init>(BaseFlow.java:171)
    at cascading.flow.hadoop.HadoopFlow.<init>(HadoopFlow.java:87)
    at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:193)
    ... 7 more

Ken Krugler

Nov 10, 2012, 2:10:58 PM
to cascadi...@googlegroups.com
Hi Marton,

On Nov 10, 2012, at 9:12am, Marton Trencseni wrote:

After my previous post, where I was unsuccessful in getting Scalding to run, I downloaded Cascading and tried to get it to run with one of its sample files. I got the same exception.

CDH4 is based on Hadoop 2.0, but Cascading/Scalding is built against Hadoop 0.20 through Hadoop 1.0.

See https://issues.apache.org/jira/browse/SQOOP-541 for a similar issue that a user ran into with Sqoop and CDH4.

Based on past list discussions (see https://groups.google.com/forum/#!msg/cascading-user/TO2cWAhkjw8/p8ctkbGYeuoJ), it looks like it should work if you build Cascading 2.0 against CDH4 and avoid mixing Lfs and Hfs in your flows.
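
If it helps, the version mismatch above can be caught before a flow is planned. What follows is only a minimal sketch, not anything that ships with Cascading or Hadoop: the class `HadoopVersionCheck` and its method `isSupported` are hypothetical names, and treating only 0.20.x and 1.0.x as supported is my strict reading of the range mentioned above, not an official compatibility matrix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Cascading or Hadoop.
public class HadoopVersionCheck {
    // Captures the leading "major.minor" of strings such as
    // "2.0.0-cdh4.1.2" or "0.20.2".
    private static final Pattern MAJOR_MINOR = Pattern.compile("^(\\d+)\\.(\\d+)");

    // Assumption: only the 0.20.x and 1.0.x lines count as supported.
    public static boolean isSupported(String version) {
        Matcher m = MAJOR_MINOR.matcher(version.trim());
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized Hadoop version: " + version);
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        return (major == 0 && minor == 20) || (major == 1 && minor == 0);
    }

    public static void main(String[] args) {
        // Defaults to the version string reported by "hadoop version" in this thread.
        String version = args.length > 0 ? args[0] : "2.0.0-cdh4.1.2";
        System.out.println(version + " supported: " + isSupported(version));
    }
}
```

The version string can be copied from the `hadoop version` output, or read at runtime from `org.apache.hadoop.util.VersionInfo.getVersion()` when the Hadoop jars are on the classpath. Failing early this way gives a clearer message than the `UnsupportedOperationException` raised from inside `FileSystem`.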

-- Ken

Chris K Wensel

Nov 10, 2012, 4:18:54 PM
to cascadi...@googlegroups.com
Just want to point out that we are working with companies, where we can, to manage incompatibilities.

FWIW, there is no Apache Hadoop 2.0, so expect trouble if you are trying to use a "2.0" distribution.

ckw