Issue with Livy - Pyspark


William Kupersanin

Mar 8, 2017, 5:48:37 PM
to jup...@googlegroups.com
Hello All, 

I am trying to debug an issue in a notebook running the Pyspark kernel: the notebook executes some cells but then freezes at a certain point. I think the messaging between Pyspark and Livy is getting screwed up. When the last cell is executed, I see this on the client side:


2017-03-08 22:24:48,505 INFO    EventsHandler   InstanceId: 0e1c8fd2-047e-4337-b264-5b64ba74de5a,EventName: notebookStatementExecutionStart,Timestamp: 2017-03-08 22:24:48.504920,SessionGuid: 03d14478-6adc-4b34-abef-b9b6fd400543,LivyKind: pyspark,SessionId: 8,StatementGuid: f1933b11-b767-4a18-b311-c48901ad8369
2017-03-08 22:24:48,788 DEBUG   Command Status of statement 8 is running.
2017-03-08 22:24:50,920 DEBUG   Command Status of statement 8 is running.

 ...and it never comes back.
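
One way to narrow this down is to ask Livy directly for the statement state, bypassing the notebook client. The following is a minimal sketch, not a definitive tool. It assumes Livy's REST endpoint `GET /sessions/{sessionId}/statements/{statementId}` returns JSON with a `state` field, and that the server is reachable at `http://localhost:8998` (a hypothetical address; substitute your own). The helper names `statement_url`, `fetch_json`, and `poll_statement` are made up for illustration.

```python
import json
import time
import urllib.request


def statement_url(base_url, session_id, statement_id):
    """Build the URL of one statement on a Livy server."""
    return f"{base_url.rstrip('/')}/sessions/{session_id}/statements/{statement_id}"


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_statement(url, fetch=fetch_json, interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll a statement until it leaves 'waiting'/'running' or max_polls is reached.

    Returns the list of states observed, in order. The state names are an
    assumption about what the Livy server reports.
    """
    states = []
    for _ in range(max_polls):
        state = fetch(url).get("state")
        states.append(state)
        if state not in ("waiting", "running"):
            break
        sleep(interval)
    return states


if __name__ == "__main__":
    # Session 8 / statement 8 are the ids from the client log above;
    # the host and port are an assumption.
    url = statement_url("http://localhost:8998", 8, 8)
    print(poll_statement(url))
```

If Livy itself keeps reporting `running` here, the hang is probably between Livy and the Spark REPL rather than in the notebook client; if it reports a finished state, the client side is the more likely suspect.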

On the Livy end, I see:

17/03/08 17:26:26 INFO ContextLauncher: 17/03/08 17:26:26 INFO scheduler.DAGScheduler: ResultStage 17 (collect at <stdin>:5) finished in 1.521 s
17/03/08 17:26:26 INFO ContextLauncher: 17/03/08 17:26:26 INFO scheduler.DAGScheduler: Job 8 finished: collect at <stdin>:5, took 3.729078 s
17/03/08 17:26:27 DEBUG RpcDispatcher: [ClientProtocol] Registered outstanding rpc 230 (com.cloudera.livy.rsc.BaseProtocol$GetReplJobResult).
17/03/08 17:26:27 DEBUG KryoMessageCodec: Encoded message of type com.cloudera.livy.rsc.rpc.Rpc$MessageHeader (6 bytes)
17/03/08 17:26:27 DEBUG KryoMessageCodec: Encoded message of type com.cloudera.livy.rsc.BaseProtocol$GetReplJobResult (91 bytes)
17/03/08 17:26:27 DEBUG KryoMessageCodec: Decoded message of type com.cloudera.livy.rsc.rpc.Rpc$MessageHeader (6 bytes)
17/03/08 17:26:27 DEBUG KryoMessageCodec: Decoded message of type com.cloudera.livy.rsc.rpc.Rpc$NullMessage (2 bytes)
17/03/08 17:26:27 DEBUG RpcDispatcher: [ClientProtocol] Received RPC message: type=REPLY id=230 payload=com.cloudera.livy.rsc.rpc.Rpc$NullMessage
17/03/08 17:26:28 DEBUG RpcDispatcher: [ClientProtocol] Registered outstanding rpc 231 (com.cloudera.livy.rsc.BaseProtocol$GetReplJobResult).

ad infinitum

So, with my limited knowledge, it looks to me like Livy thinks it has sent the result for a finished job, but pyspark hasn't received it.
Anyone seen this before? Any thoughts?

Thanks!
--Willie

Alejandro Guerrero

Mar 8, 2017, 7:08:09 PM
to Project Jupyter
Hi Willie,

Can you please continue the discussion on Github? Here's a link: https://github.com/jupyter-incubator/sparkmagic/issues/339

Thanks!