Problems on the cluster


estev...@gmail.com

May 8, 2019, 12:21:34 PM
to Hops
Hello,
We have been facing problems on the Hopsworks cluster since yesterday.
Could you please give us feedback about what is going on?

When we try to run a notebook, the Spark application fails with the following message:

-------------------------------------------------------
The code failed because of a fatal error:
Session 2109 unexpectedly reached final status 'dead'. See logs:
stdout:
2019-05-08 18:10:46,047 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-05-08 18:10:46,321 WARN UsersGroups: UsersGroups was not initialized.
2019-05-08 18:10:46,470 INFO RMProxy: Connecting to ResourceManager at /10.0.104.193:8032
2019-05-08 18:10:47,063 INFO Client: Requesting a new application from cluster with 21 NodeManagers
2019-05-08 18:10:47,168 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (216000 MB per container)
2019-05-08 18:10:47,168 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
2019-05-08 18:10:47,169 INFO Client: Setting up container launch context for our AM
2019-05-08 18:10:47,173 INFO Client: Setting up the launch environment for our AM container
2019-05-08 18:10:47,183 INFO Client: Preparing resources for our AM container
2019-05-08 18:10:47,689 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2019-05-08 18:10:51,664 INFO Client: Uploading resource file:/tmp/spark-84b6a805-dc3d-43db-a34f-413432104db2/__spark_libs__4816877042688956613.zip -> hdfs:/Projects/TFdistributed/Resources/.sparkStaging/application_1553861944920_1448/__spark_libs__4816877042688956613.zip

stderr:

YARN Diagnostics:
No YARN application is found with tag livy-session-2109-k0vnhmbq in 120 seconds. This may be because 1) spark-submit fail to submit application to YARN; or 2) YARN cluster doesn't have enough resources to start the application in time. Please check Livy log and YARN log to know the details..

Some things to try:
a) Make sure Spark has enough available resources for Jupyter to create a Spark context.
b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.
c) Restart the kernel.
-------------------------------------------------------
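
In case it helps with diagnosis: the second cause Livy suggests (the YARN cluster not having enough resources) can be checked against the ResourceManager REST API. This is only a minimal sketch, assuming the ResourceManager web UI is reachable on the default port 8088; the log above only shows the RPC address 10.0.104.193:8032, so the actual web UI port may differ on this setup:

import json
import urllib.request

# Assumed endpoint: the Hadoop ResourceManager REST API exposes
# cluster-wide metrics at /ws/v1/cluster/metrics on the web UI port
# (8088 by default; the RPC port 8032 seen in the log is not the same).
RM_METRICS = "http://10.0.104.193:8088/ws/v1/cluster/metrics"

with urllib.request.urlopen(RM_METRICS, timeout=10) as resp:
    m = json.load(resp)["clusterMetrics"]

# The AM alone needs 1408 MB according to the log above, plus whatever
# the executors request, so compare that against what is actually free.
print("available MB:    ", m["availableMB"])
print("available vcores:", m["availableVirtualCores"])
print("apps pending:    ", m["appsPending"])

If availableMB is close to zero or appsPending keeps growing, the session most likely timed out waiting for a container rather than failing at spark-submit.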

Antonio

Theofilos Kakantousis

May 8, 2019, 12:39:27 PM
to Hops
Hi Antonio,

hops.site is facing an overload at the moment, causing timeouts. Will report back when it's OK to run notebooks again.

Theo  

Theofilos Kakantousis

May 8, 2019, 2:30:40 PM
to Hops
The issue has been resolved. We might need to do some maintenance within the next few days.

estev...@gmail.com

May 8, 2019, 2:55:57 PM
to Hops
Thank you Theofilos.

Antonio