SparkContext cannot be passed as parameter?

CodingCat

Nov 18, 2012, 2:25:54 AM
to spark...@googlegroups.com
Hi, 

I'm stuck on a weird problem in my application.

I construct a SparkContext in the main function and pass it as a parameter to actors. When I try to call sc.textFile("input_path", splitnum) in an actor, Spark throws the following exception:

proj.units.InputLayer@307c824e: caught java.lang.NullPointerException
java.lang.NullPointerException
at spark.broadcast.HttpBroadcast.<init>(HttpBroadcast.scala:23)
at spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcast.scala:53)
at spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcast.scala:49)
at spark.broadcast.BroadcastManager.newBroadcast(Broadcast.scala:50)
at spark.SparkContext.broadcast(SparkContext.scala:407)
at spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:52)
at spark.SparkContext.hadoopFile(SparkContext.scala:238)
at spark.SparkContext.textFile(SparkContext.scala:207)

If I construct a SparkContext object right where I call sc.textFile(), everything is fine. However, I have to call sc.textFile more than once, so I would have to create multiple SparkContext objects, which unfortunately does not seem to be allowed by Spark (I hit an exception like "port occupied").

What should I do in this situation?

Best,

Nan

Reynold Xin

Nov 18, 2012, 2:28:13 AM
to spark...@googlegroups.com
A workaround is to have a global, static variable. It's not the best programming practice, but it really works.

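import spark.SparkContext

object MyApplicationEnv {
  // Must be a var: it starts out as null and is assigned once in init().
  var sc: SparkContext = _

  // Call once at startup; later calls are no-ops.
  def init() {
    if (sc == null) {
      sc = new SparkContext(
          if (System.getenv("MASTER") == null) "local" else System.getenv("MASTER"),
          "my application name",
          null,
          Nil,
          executorEnvVars)  // not defined in this snippet; assumed to be provided by the application
    }
  }
}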


--
Reynold Xin

Nan Zhu

Nov 18, 2012, 2:33:03 AM
to spark...@googlegroups.com
Thank you Reynold, I will try it.

So you mean it's an already known issue in the current Spark release?

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University


Nan Zhu

Nov 18, 2012, 3:19:18 AM
to spark...@googlegroups.com
I still cannot fix this problem. It's really weird: I referred to Shark's implementation, and I think its logic for passing sc is nearly the same as in my application.

Can anyone give more suggestions?

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University

Nan Zhu

Nov 18, 2012, 3:46:40 AM
to spark...@googlegroups.com
Fixed it!

It turns out that I cannot construct a SparkContext right after my program starts and then pass it to other objects to use.

The only thing that works is to construct a SparkContext when the global variable sc == null, right at the point where I need to call something like textFile().

I don't understand the difference between these two approaches.

Still thinking about it…

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University

Reynold Xin

Nov 18, 2012, 3:46:18 AM
to spark...@googlegroups.com
It's probably because your spark context hasn't been constructed yet.

The other solution is to make the sc variable a lazy val, and assign a value to it.

e.g.

lazy val sc = new SparkContext(...)
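
To illustrate what the lazy val buys you, here is a minimal sketch that runs without Spark on the classpath. FakeContext, ConstructionCount and AppEnv are hypothetical stand-ins invented for this example, not Spark classes; the point is only that the initializer runs on first access, and only once.

```scala
// Hypothetical stand-ins so the sketch runs without Spark.
object ConstructionCount {
  var value = 0 // how many FakeContext instances have been constructed
}

class FakeContext(val master: String) {
  ConstructionCount.value += 1
}

object AppEnv {
  // Nothing is constructed until the first access to sc;
  // every later access returns the same instance.
  lazy val sc: FakeContext =
    new FakeContext(Option(System.getenv("MASTER")).getOrElse("local"))
}
```

With a real SparkContext in place of FakeContext, the context would be created the first time any code touches AppEnv.sc, on whichever thread does so.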

Nan Zhu

Nov 18, 2012, 4:21:45 AM
to spark...@googlegroups.com
I checked my code: when I try to use sc.textFile, sc is always non-null.

Also, as I continued development I found that the sc object cannot be referred to in the same file, i.e. if I construct and use it only in A.scala, everything is OK; if I pass it to some class in B.scala, it throws that exception. But this issue doesn't appear in Shark…

So I reviewed my code, and the most suspicious thing is that I always use sc in an actor…


create and use sc in the same actor - passed
create in main, and use in actor A - crash
create in actor A, and use in actor B - crash

So I think the main cause may be the actor model? Can anyone familiar with Scala and Spark explain the reason?

Could Spark support passing sc between different actors in the future? Actors are so commonly used in systems developed in Scala.

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University
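
The pattern reported above (same actor works, a different actor crashes even though sc is non-null) is what you would see if part of the runtime environment were kept in thread-local storage. That is an assumption; nothing in this thread confirms it. The sketch below uses only java.lang.ThreadLocal, with Env and ThreadLocalDemo as hypothetical names that are not Spark code, to show how a value set on one thread reads back as null on another.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical stand-in for per-thread environment state; not Spark code.
object Env {
  // Each thread sees its own slot; the slot is null until that thread calls set.
  private val current = new ThreadLocal[String]
  def set(v: String): Unit = current.set(v)
  def get: String = current.get()
}

object ThreadLocalDemo {
  // Starts a fresh thread, reads Env.get there, and returns what that thread saw.
  def readFromNewThread(): String = {
    val seen = new AtomicReference[String]("unset")
    val t = new Thread(new Runnable {
      def run(): Unit = seen.set(Env.get)
    })
    t.start()
    t.join()
    seen.get()
  }
}
```

If something like this is the cause, an object created on the main thread would stay non-null when passed to an actor, but any code path that looks up the per-thread state from the actor's thread would find nothing there.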

Nan Zhu

Nov 18, 2012, 8:58:24 AM
to spark...@googlegroups.com
Sorry, in my last email it should be "the sc object can only be referred to in the same file".

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University
