Spark Cassandra Connector doesn't work in standalone Spark cluster


Zoran Jeremic

unread,
Feb 16, 2017, 1:44:11 AM2/16/17
to DataStax Spark Connector for Apache Cassandra
Hi,

I've created a simple Scala application in order to reproduce the problem I have. The application just uses SparkContext to read data from Apache Cassandra 3.9 hosted on an Amazon EC2 instance.
This works fine if I'm running Spark locally, but when I deploy the application to a standalone one-node Spark cluster, it connects to the Cassandra cluster and then disconnects after some time without reading any data.

I'm using:

Scala 2.11.6
Spark 2.1.0 (both for standalone Spark and the dependency in the application)
spark-cassandra-connector 2.0.0-M3
Cassandra Java driver 3.0.0
Apache Cassandra 3.9

My Spark configuration is:

import org.apache.spark.{SparkConf, SparkContext}

object SparkContextLoader {
  val sparkConf = new SparkConf()
  sparkConf.setMaster("spark://127.0.1.1:7077")

  sparkConf.set("spark.cores.max", "2") // note: the standard property is spark.cores.max; "spark.cores_max" would be ignored
  sparkConf.set("spark.executor.memory", "2g")
  sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  sparkConf.setAppName("Test application")
  sparkConf.set("spark.cassandra.connection.host", "xxx.xxx.xxx.xxx")
  sparkConf.set("spark.cassandra.connection.port", "9042")
  sparkConf.set("spark.ui.port", "4041")

  val oneJar = "/samplesparkmaven/target/samplesparkmaven-jar.jar"
  sparkConf.setJars(List(oneJar))

  @transient val sc = new SparkContext(sparkConf)
}

Any idea what could be the problem?

Thanks,
Zoran

Russell Spitzer

unread,
Feb 16, 2017, 1:52:25 AM2/16/17
to DataStax Spark Connector for Apache Cassandra

I see two things:
1. Guava conflicts with the driver; see the SCC FAQ. TL;DR: don't include the Cassandra Java driver as a dependency.
2. Issues with not using spark-submit and manually adding your app jar.

The code listed also has no Cassandra interaction, but I assume this was just skipped.

There should be some error somewhere with most of these problems; check your executor logs for them if nothing pops up in the Spark driver.
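As a concrete sketch of the dependency advice above (a hypothetical sbt build; the poster uses Maven, but the principle is the same): declare only the connector and let it pull in a compatible Cassandra Java driver transitively.

// Hypothetical build.sbt sketch, using the versions mentioned in this thread.
// Only the connector is declared; it brings in its own Java driver.
scalaVersion := "2.11.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.1.0" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.0-M3"
)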



Zoran Jeremic

unread,
Feb 16, 2017, 2:09:37 AM2/16/17
to spark-conn...@lists.datastax.com
Hi Russell,

Thank you for the quick response.


The code listed also has no Cassandra interaction, but I assume this was just skipped.
Yes, I skipped it as it's quite simple:

import com.datastax.spark.connector._ // needed for the sc.cassandraTable syntax

val sc = SparkContextLoader.getSC
def runSparkJob(): Unit = {
  val table = sc.cassandraTable("prosolo_logs_zj", "logevents")
  println(table.collect().mkString("\n"))
}


I just removed the Java driver as a dependency, as I found in some previous posts that it's already included in spark-cassandra-connector.
 
I don't have any errors either in the Spark driver or in the Spark executor logs.
The output from the Spark driver is this:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/02/14 23:11:25 INFO SparkContext: Running Spark version 2.1.0
17/02/14 23:11:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/14 23:11:27 WARN Utils: Your hostname, zoran-Latitude-E5420 resolves to a loopback address: 127.0.1.1; using 192.168.2.68 instead (on interface wlp2s0)
17/02/14 23:11:27 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/02/14 23:11:27 INFO SecurityManager: Changing view acls to: zoran
17/02/14 23:11:27 INFO SecurityManager: Changing modify acls to: zoran
17/02/14 23:11:27 INFO SecurityManager: Changing view acls groups to: 
17/02/14 23:11:27 INFO SecurityManager: Changing modify acls groups to: 
17/02/14 23:11:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(zoran); groups with view permissions: Set(); users  with modify permissions: Set(zoran); groups with modify permissions: Set()
17/02/14 23:11:28 INFO Utils: Successfully started service 'sparkDriver' on port 33995.
17/02/14 23:11:28 INFO SparkEnv: Registering MapOutputTracker
17/02/14 23:11:28 INFO SparkEnv: Registering BlockManagerMaster
17/02/14 23:11:28 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/02/14 23:11:28 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/02/14 23:11:28 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-7b25a4cc-cb37-4332-a59b-e36fa45511cd
17/02/14 23:11:28 INFO MemoryStore: MemoryStore started with capacity 870.9 MB
17/02/14 23:11:28 INFO SparkEnv: Registering OutputCommitCoordinator
17/02/14 23:11:28 INFO Utils: Successfully started service 'SparkUI' on port 4041.
17/02/14 23:11:28 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.2.68:4041
17/02/14 23:11:28 INFO SparkContext: Added JAR /samplesparkmaven/target/samplesparkmaven-jar.jar at spark://192.168.2.68:33995/jars/samplesparkmaven-jar.jar with timestamp 1487142688817
17/02/14 23:11:28 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://127.0.1.1:7077...
17/02/14 23:11:28 INFO TransportClientFactory: Successfully created connection to /127.0.1.1:7077 after 62 ms (0 ms spent in bootstraps)
17/02/14 23:11:29 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20170214231129-0016
17/02/14 23:11:29 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 36901.
17/02/14 23:11:29 INFO NettyBlockTransferService: Server created on 192.168.2.68:36901
17/02/14 23:11:29 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/02/14 23:11:29 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.2.68, 36901, None)
17/02/14 23:11:29 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.2.68:36901 with 870.9 MB RAM, BlockManagerId(driver, 192.168.2.68, 36901, None)
17/02/14 23:11:29 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.2.68, 36901, None)
17/02/14 23:11:29 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.2.68, 36901, None)
17/02/14 23:11:29 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
17/02/14 23:11:29 INFO NettyUtil: Found Netty's native epoll transport in the classpath, using it
17/02/14 23:11:31 INFO Cluster: New Cassandra host /xxx.xxx.xxx.xxx:9042 added
17/02/14 23:11:31 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
17/02/14 23:11:32 INFO SparkContext: Starting job: collect at SparkConnector.scala:28
17/02/14 23:11:32 INFO DAGScheduler: Got job 0 (collect at SparkConnector.scala:28) with 6 output partitions
17/02/14 23:11:32 INFO DAGScheduler: Final stage: ResultStage 0 (collect at SparkConnector.scala:28)
17/02/14 23:11:32 INFO DAGScheduler: Parents of final stage: List()
17/02/14 23:11:32 INFO DAGScheduler: Missing parents: List()
17/02/14 23:11:32 INFO DAGScheduler: Submitting ResultStage 0 (CassandraTableScanRDD[0] at RDD at CassandraRDD.scala:18), which has no missing parents
17/02/14 23:11:32 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 8.4 KB, free 870.9 MB)
17/02/14 23:11:32 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 4.4 KB, free 870.9 MB)
17/02/14 23:11:32 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.2.68:36901 (size: 4.4 KB, free: 870.9 MB)
17/02/14 23:11:32 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
17/02/14 23:11:32 INFO DAGScheduler: Submitting 6 missing tasks from ResultStage 0 (CassandraTableScanRDD[0] at RDD at CassandraRDD.scala:18)
17/02/14 23:11:32 INFO TaskSchedulerImpl: Adding task set 0.0 with 6 tasks
17/02/14 23:11:39 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster

And in the Spark executor log, this is all I have:

Spark Command: /usr/lib/jvm/java-8-oracle/bin/java -cp /home/zoran/app/spark-2.1.0-bin-hadoop2.7//conf/:/home/zoran/app/spark-2.1.0-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host zoran-Latitude-E5420 --port 7077 --webui-port 8080
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/02/15 22:07:48 INFO Master: Started daemon with process name: 3361@zoran-Latitude-E5420
17/02/15 22:07:49 INFO SignalUtils: Registered signal handler for TERM
17/02/15 22:07:49 INFO SignalUtils: Registered signal handler for HUP
17/02/15 22:07:49 INFO SignalUtils: Registered signal handler for INT
17/02/15 22:07:53 WARN Utils: Your hostname, zoran-Latitude-E5420 resolves to a loopback address: 127.0.1.1; using 192.168.2.68 instead (on interface wlp2s0)
17/02/15 22:07:53 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/02/15 22:08:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/15 22:08:28 INFO SecurityManager: Changing view acls to: zoran
17/02/15 22:08:29 INFO SecurityManager: Changing modify acls to: zoran
17/02/15 22:08:29 INFO SecurityManager: Changing view acls groups to:
17/02/15 22:08:29 INFO SecurityManager: Changing modify acls groups to:
17/02/15 22:08:29 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(zoran); groups with view permissions: Set(); users  with modify permissions: Set(zoran); groups with modify permissions: Set()
17/02/15 22:08:47 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
17/02/15 22:08:49 INFO Master: Starting Spark master at spark://zoran-Latitude-E5420:7077
17/02/15 22:08:50 INFO Master: Running Spark version 2.1.0
17/02/15 22:09:02 WARN Utils: Service 'MasterUI' could not bind on port 8080. Attempting port 8081.
17/02/15 22:09:02 INFO Utils: Successfully started service 'MasterUI' on port 8081.
17/02/15 22:09:02 INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://192.168.2.68:8081
17/02/15 22:09:03 INFO Utils: Successfully started service on port 6066.
17/02/15 22:09:03 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066
17/02/15 22:09:08 INFO Master: I have been elected leader! New state: ALIVE
17/02/15 22:14:41 INFO Master: Registering app Test application
17/02/15 22:14:42 INFO Master: Registered app Test application with ID app-20170215221441-0000
17/02/15 22:47:36 INFO Master: Received unregister request from application app-20170215221441-0000
17/02/15 22:47:36 INFO Master: Removing app app-20170215221441-0000
17/02/15 22:47:39 INFO Master: 127.0.0.1:40684 got disassociated, removing it.
17/02/15 22:47:39 INFO Master: 192.168.2.68:44426 got disassociated, removing it.
17/02/15 22:48:31 INFO Master: Registering app Test application
17/02/15 22:48:31 INFO Master: Registered app Test application with ID app-20170215224831-0001
17/02/15 22:49:42 INFO Master: Received unregister request from application app-20170215224831-0001
17/02/15 22:49:42 INFO Master: Removing app app-20170215224831-0001
17/02/15 22:49:42 INFO Master: 127.0.0.1:42034 got disassociated, removing it.
17/02/15 22:49:42 INFO Master: 192.168.2.68:38720 got disassociated, removing it.
17/02/15 22:50:33 INFO Master: Registering app Test application
17/02/15 22:50:33 INFO Master: Registered app Test application with ID app-20170215225033-0002
17/02/15 22:51:23 INFO Master: Received unregister request from application app-20170215225033-0002
17/02/15 22:51:23 INFO Master: Removing app app-20170215225033-0002
17/02/15 22:51:24 INFO Master: 127.0.0.1:42092 got disassociated, removing it.
17/02/15 22:51:24 INFO Master: 192.168.2.68:38721 got disassociated, removing it.

As you can see, there is nothing useful there that would indicate what the problem is. At least not to me.

Zoran


Russell Spitzer

unread,
Feb 16, 2017, 2:19:27 AM2/16/17
to spark-conn...@lists.datastax.com

Your executor log looks very strange. Did you merge the logs of several applications together (/var/lib/spark/worker/appid/#/)? It doesn't seem to even have a Cassandra connection initialization. Also, are stdout and stderr merged together? Does the driver UI show anything when running? How about the worker logs?

I'm going to sleep now but I'll check this out tomorrow if no one else gets back to you.



Zoran Jeremic

unread,
Feb 16, 2017, 2:56:23 AM2/16/17
to spark-conn...@lists.datastax.com
Hi Russell,

This is only one stdout file. I don't have an error file, and the reason you see several initializations of the Test application is that I initialized my application multiple times. I don't think it's related to the executor, because I tried with two different executors and it was the same. I initially used Spark in a Docker container, and now I'm using the Spark binary distribution and just initialize the master node.

I've increased the log level to ALL and still can't see anything useful. It doesn't show that it's connected to Cassandra, but as you can see from the driver logs, the connection is established:


17/02/14 23:11:31 INFO Cluster: New Cassandra host /xxx.xxx.xxx.xxx:9042 added
17/02/14 23:11:31 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster

Thanks,
Zoran


Russell Spitzer

unread,
Feb 16, 2017, 11:21:37 AM2/16/17
to spark-conn...@lists.datastax.com
That is the wrong log file, then.

Executor logs for standalone look like:

8:18:53 ➜  ~/repos/spark-cassandra-connector git:(master) ✗ ls /var/lib/spark/worker/app-20170127073119-0001/0
stderr stdout

You should get two files per application. Multiple applications wouldn't log to the same set of files, since each application starts its own executor JVM.

And stdout should look like:

INFO  2017-01-27 07:31:20,593 org.apache.spark.executor.DseExecutorBackend: Started daemon with process name: 50...@rspitzer-rmbp15.local
INFO  2017-01-27 07:31:20,599 org.apache.spark.util.SignalUtils: Registered signal handler for TERM
INFO  2017-01-27 07:31:20,600 org.apache.spark.util.SignalUtils: Registered signal handler for HUP
INFO  2017-01-27 07:31:20,600 org.apache.spark.util.SignalUtils: Registered signal handler for INT
INFO  2017-01-27 07:31:21,049 org.apache.spark.SecurityManager: Changing view acls to: russellspitzer
INFO  2017-01-27 07:31:21,049 org.apache.spark.SecurityManager: Changing modify acls to: russellspitzer
INFO  2017-01-27 07:31:21,050 org.apache.spark.SecurityManager: Changing view acls groups to:
INFO  2017-01-27 07:31:21,050 org.apache.spark.SecurityManager: Changing modify acls groups to:
INFO  2017-01-27 07:31:21,051 org.apache.spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(russellspitzer); groups with view permissions: Set(); users  with modify permissions: Set(russellspitzer); groups with modify 
...


And stderr could have anything in it.

Zoran Jeremic

unread,
Feb 16, 2017, 3:09:21 PM2/16/17
to spark-conn...@lists.datastax.com
Hi Russell,

The worker directory with logs was not generated until I started a worker along with the master. I have it now, and there is an error indicating that my jar file was not found. Could it be related to the fact that I didn't install Hadoop, or is it something else?

17/02/16 09:34:08 TRACE NioEventLoop: instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@5ec247c
17/02/16 09:34:08 TRACE TransportClientFactory: DNS resolution for /192.168.2.68:41520 took 0 ms
17/02/16 09:34:08 DEBUG TransportClientFactory: Creating new connection to /192.168.2.68:41520
17/02/16 09:34:08 DEBUG TransportClientFactory: Connection to /192.168.2.68:41520 successful, running bootstraps...
17/02/16 09:34:08 INFO TransportClientFactory: Successfully created connection to /192.168.2.68:41520 after 5 ms (0 ms spent in bootstraps)
17/02/16 09:34:08 DEBUG TransportClient: Sending stream request for /jars/sample-spark-maven-one-jar.jar to /192.168.2.68:41520
17/02/16 09:34:08 INFO Utils: Fetching spark://192.168.2.68:41520/jars/sample-spark-maven-one-jar.jar to /tmp/spark-9e0576d0-7edc-4be6-a036-cc9a019c0436/executor-00937274-f939-4482-892e-9b8b8334e075/spark-34e8ea36-7b7b-4f3d-a3ec-42641a28463e/fetchFileTemp9117758371361815789.tmp
17/02/16 09:34:08 TRACE TransportClient: Sending request for /jars/sample-spark-maven-one-jar.jar to /192.168.2.68:41520 took 31 ms
17/02/16 09:34:08 TRACE MessageDecoder: Received message StreamFailure: StreamFailure{streamId=/jars/sample-spark-maven-one-jar.jar, error=Stream '/jars/sample-spark-maven-one-jar.jar' was not found.}
17/02/16 09:34:08 DEBUG NettyRpcEnv: Error downloading stream /jars/sample-spark-maven-one-jar.jar.
java.lang.RuntimeException: Stream '/jars/sample-spark-maven-one-jar.jar' was not found.
    at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:222)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:121)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:745)
17/02/16 09:34:08 INFO Executor: Fetching spark://192.168.2.68:41520/jars/sample-spark-maven-one-jar.jar with timestamp 1487266433215
17/02/16 09:34:08 ERROR Executor: Exception in task 2.0 in stage 0.0 (TID 2)
java.lang.RuntimeException: Stream '/jars/sample-spark-maven-one-jar.jar' was not found.
    at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:222)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:121)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:745)
17/02/16 09:34:08 TRACE TransportClientFactory: Returning cached connection to /192.168.2.68:41520: TransportClient{remoteAdress=/192.168.2.68:41520, clientId=null, isActive=true}
17/02/16 09:34:08 DEBUG TransportClient: Sending stream request for /jars/sample-spark-maven-one-jar.jar to /192.168.2.68:41520
17/02/16 09:34:08 INFO Utils: Fetching spark://192.168.2.68:41520/jars/sample-spark-maven-one-jar.jar to /tmp/spark-9e0576d0-7edc-4be6-a036-cc9a019c0436/executor-00937274-f939-4482-892e-9b8b8334e075/spark-34e8ea36-7b7b-4f3d-a3ec-42641a28463e/fetchFileTemp6019744207386703435.tmp
17/02/16 09:34:08 TRACE MessageDecoder: Received message OneWayMessage: OneWayMessage{body=NettyManagedBuffer{buf=PooledUnsafeDirectByteBuf(ridx: 13, widx: 11613, cap: 65536)}}
17/02/16 09:34:08 TRACE TransportClient: Sending request for /jars/sample-spark-maven-one-jar.jar to /192.168.2.68:41520 took 14 ms
17/02/16 09:34:08 TRACE MessageDecoder: Received message StreamFailure: StreamFailure{streamId=/jars/sample-spark-maven-one-jar.jar, error=Stream '/jars/sample-spark-maven-one-jar.jar' was not found.}
17/02/16 09:34:08 DEBUG NettyRpcEnv: Error downloading stream /jars/sample-spark-maven-one-jar.jar.
java.lang.RuntimeException: Stream '/jars/sample-spark-maven-one-jar.jar' was not found.
    at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:222)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:121)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:745)
17/02/16 09:34:08 INFO CoarseGrainedExecutorBackend: Got assigned task 4
17/02/16 09:34:08 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
17/02/16 09:34:08 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.RuntimeException: Stream '/jars/sample-spark-maven-one-jar.jar' was not found.
    at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:222)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:121)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:745)



Russell Spitzer

unread,
Feb 16, 2017, 3:35:21 PM2/16/17
to spark-conn...@lists.datastax.com
It's because, like I said in my first email, you aren't using spark-submit. You're attempting to add the jar to the context programmatically and it's failing. You should use spark-submit and add the SCC via --packages, as in:

https://github.com/datastax/SparkBuildExamples
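For example, a sketch of such an invocation, using the versions and addresses already mentioned in this thread (the main class name is a placeholder):

./bin/spark-submit \
  --master spark://127.0.1.1:7077 \
  --packages com.datastax.spark:spark-cassandra-connector_2.11:2.0.0-M3 \
  --conf spark.cassandra.connection.host=xxx.xxx.xxx.xxx \
  --class your.main.Class \
  /samplesparkmaven/target/samplesparkmaven-jar.jar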

Zoran Jeremic

unread,
Feb 16, 2017, 3:57:46 PM2/16/17
to spark-conn...@lists.datastax.com
Hi Russell,

Isn't it possible to add the jar programmatically? I'm using the maven-shade-plugin to package all dependencies into one jar and deploy it to the server at once.
Are you saying that's not possible, and that I would need to deploy all these dependencies manually each time I change something in my code?

Sorry for such naive questions. This is the first time I'm using Spark standalone.

Thanks,
Zoran

Russell Spitzer

unread,
Feb 16, 2017, 4:00:02 PM2/16/17
to spark-conn...@lists.datastax.com
It's possible, but you are getting the path wrong or something like that. It is far easier to just let spark-submit do it. Pretty much everyone attempting to avoid spark-submit ends up with these kinds of issues.

You would just be changing your launch command to:

./bin/spark-submit yourjarname

Zoran Jeremic

unread,
Feb 17, 2017, 1:06:04 AM2/17/17
to spark-conn...@lists.datastax.com
Hi Russell,

Thanks a lot for your help. You're right: the problem was a wrong path, but unfortunately those errors turned my attention in a totally wrong direction and I didn't see the obvious thing. My SparkConf.setJars was using a relative path, but I missed putting the leading "." so it was treated as an absolute path. I'm wondering why I didn't get an error indicating that the jar file was not found.
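For the record, a minimal sketch of the kind of fix involved (assuming resolving the path up front is acceptable):

// Sketch: resolve the jar path to an absolute path explicitly, so a
// relative path can never be silently misread as an absolute one.
import java.nio.file.Paths

val oneJar = Paths.get("samplesparkmaven/target/samplesparkmaven-jar.jar")
  .toAbsolutePath
  .toString
sparkConf.setJars(List(oneJar))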

I have one more question, but I'll start another thread for that.

Thanks a lot.
Zoran
