Question about running a Java job on Spark with Maven

WangHua

Jun 5, 2013, 9:15:53 PM

to spark-us...@googlegroups.com

Matei,

I have a problem running a Java job with Maven.

The command I use is: mvn exec:java -Dexec.mainClass="com.tseg.spark_maven.SimpleJob"

The key parameters I use are shown below:

String logFile = "hdfs://compute-20-01.local:9000/user/hbase/wanghua/simple.txt";

JavaSparkContext sc = new JavaSparkContext("spark://192.168.34.89:7077", "Simple Job", "$SPARK_HOME", new String[]{"target/spark_maven-0.0.1-SNAPSHOT.jar"});
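
One detail in the constructor call above may be worth checking separately: Java does not expand shell-style variables inside string literals, so "$SPARK_HOME" is passed as that literal text rather than as an installation path. The following is a minimal sketch of reading the value from the process environment instead. The class name `SparkHomeResolver`, the method `resolve`, and the fallback path `/opt/spark` are hypothetical and are not part of Spark; nothing here is claimed to be the cause of the exception reported in this post.

```java
import java.util.Map;

/** Hypothetical helper (not part of Spark): picks a Spark install path from an environment map. */
public class SparkHomeResolver {

    /**
     * Returns the SPARK_HOME entry of env when it is set and non-blank,
     * otherwise the supplied fallback path.
     */
    public static String resolve(Map<String, String> env, String fallback) {
        String value = env.get("SPARK_HOME");
        return (value == null || value.trim().isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // System.getenv() returns the real process environment as a Map.
        // "/opt/spark" is an assumed fallback path, used only for illustration.
        String sparkHome = resolve(System.getenv(), "/opt/spark");
        System.out.println("sparkHome = " + sparkHome);
    }
}
```

The resulting string could then be passed as the third constructor argument in place of the "$SPARK_HOME" literal.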

However, the job fails with the exception shown below.

13/06/05 19:16:04 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID job-20130605191604-0024/7 on host 192.168.34.91 with 14 cores, 512.0 MB RAM

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/1 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/0 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/2 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/4 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/6 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/5 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/7 is now LOADING

13/06/05 19:16:04 INFO client.Client$ClientActor: Executor updated: job-20130605191604-0024/3 is now LOADING

13/06/05 19:16:04 INFO storage.MemoryStore: ensureFreeSpace(44842) called with curMem=0, maxMem=7422604738

13/06/05 19:16:04 INFO storage.MemoryStore: Block broadcast_0 stored as values to memory (estimated size 43.8 KB, free 6.9 GB)

[WARNING]

java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)

at java.lang.Thread.run(Thread.java:619)

Caused by: java.io.IOException: Call to compute-20-01.local/192.168.34.89:9000 failed on local exception: java.io.EOFException

at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)

at org.apache.hadoop.ipc.Client.call(Client.java:1071)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)

at $Proxy18.getProtocolVersion(Unknown Source)

at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)

at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)

at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:118)

at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:222)

at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:187)

at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)

at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1328)

at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:65)

at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1346)

at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:244)

at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)

at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)

at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)

at spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:57)

at spark.SparkContext.hadoopFile(SparkContext.scala:238)

at spark.SparkContext.textFile(SparkContext.scala:207)

at spark.api.java.JavaSparkContext.textFile(JavaSparkContext.scala:102)

at com.tseg.spark_maven.SimpleJob.main(SimpleJob.java:18)

... 6 more

Caused by: java.io.EOFException

at java.io.DataInputStream.readInt(DataInputStream.java:375)

at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)

at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)

[INFO] ------------------------------------------------------------------------

[INFO] BUILD FAILURE

[INFO] ------------------------------------------------------------------------

[INFO] Total time: 2.536s

[INFO] Finished at: Wed Jun 05 19:16:04 CST 2013

[INFO] Final Memory: 14M/722M

[INFO] ------------------------------------------------------------------------

[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project spark_maven: An exception occured while executing the Java class. null: InvocationTargetException: Call to compute-20-01.local/192.168.34.89:9000 failed on local exception: java.io.EOFException -> [Help 1]

[ERROR]

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

[ERROR]

[ERROR] For more information about the errors and possible solutions, please read the following articles:

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

When I use "local" and a local file, rather than "spark://192.168.34.89:7077" and an HDFS file, the job runs well.

Thanks,

WangHua
