Unable to run Spark job because Kerberos security for Hadoop is ON

Gaurav Dasgupta

Apr 2, 2013, 3:39:27 AM
to spark...@googlegroups.com
Hi,

My CDH4 cluster has Kerberos authentication enabled, and I have tested this successfully. But when I try to submit a Spark job which uses HDFS for input/output, I get the following error:

org.apache.hadoop.security.AccessControlException: Authorization (hadoop.security.authorization) is enabled but authentication (hadoop.security.authentication) is configured as simple. Please configure another method like kerberos or digest.

hadoop.security.authentication is configured as kerberos. The user trying to submit the Spark job has Kerberos credentials, and I have verified this by running a Hadoop job as the same user against the same input and output directories.

Do we need to specify something in the Spark conf so that it uses Kerberos authentication? How do I resolve this problem?

Thanks,
Gaurav

Gaurav Dasgupta

Apr 2, 2013, 5:44:43 AM
to spark...@googlegroups.com
Hi,

Setting "/etc/hadoop/conf/" to Sparks classpath resolved that error. But now, I am getting the following error message while running the job:

13/04/02 03:54:59 INFO cluster.TaskSetManager: Starting task 0.0:3 as TID 24 on executor 2: babar8.musigma.com (preferred)
13/04/02 03:54:59 INFO cluster.TaskSetManager: Serialized task 0.0:3 as 1493 bytes in 1 ms
13/04/02 03:54:59 INFO cluster.TaskSetManager: Lost TID 7 (task 0.0:7)
13/04/02 03:54:59 INFO cluster.TaskSetManager: Loss was due to java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "babar8.musigma.com/192.168.200.18"; destination host is: "br10":8020;  [duplicate 1]
13/04/02 03:54:59 INFO cluster.TaskSetManager: Starting task 0.0:7 as TID 25 on executor 2: babar8.musigma.com (preferred)
13/04/02 03:54:59 INFO cluster.TaskSetManager: Serialized task 0.0:7 as 1493 bytes in 1 ms
13/04/02 03:54:59 INFO cluster.TaskSetManager: Lost TID 1 (task 0.0:1)
13/04/02 03:54:59 INFO cluster.TaskSetManager: Loss was due to java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "babar8.musigma.com/192.168.200.18"; destination host is: "br10":8020;  [duplicate 2]
13/04/02 03:54:59 INFO cluster.TaskSetManager: Starting task 0.0:1 as TID 26 on executor 2: babar8.musigma.com (preferred)
13/04/02 03:54:59 INFO cluster.TaskSetManager: Serialized task 0.0:1 as 1493 bytes in 0 ms
13/04/02 03:54:59 INFO cluster.TaskSetManager: Lost TID 20 (task 0.0:4)
13/04/02 03:54:59 INFO cluster.TaskSetManager: Loss was due to java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "babar3.musigma.com/192.168.200.13"; destination host is: "br10":8020;  [duplicate 16]

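For reference, one way to put the Hadoop conf directory on Spark's classpath (a sketch only; it assumes a standalone Spark deployment of that era where conf/spark-env.sh is sourced on each node and the SPARK_CLASSPATH variable is honoured, which may not match every setup):

    # conf/spark-env.sh on every node (assumption: SPARK_CLASSPATH is read by the daemons)
    export SPARK_CLASSPATH=/etc/hadoop/conf:$SPARK_CLASSPATH
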

Any idea how to resolve this? The Kerberos TGT for the hdfs service is auto-generated and renewed by Cloudera Manager, and I can run distributed Hadoop jobs without any Kerberos errors.
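
To sanity-check whether the executor hosts actually hold a usable ticket, something like this on a worker node may help (a sketch; run it as whatever Unix user the Spark workers execute tasks as):

    klist
    # a valid, unexpired krbtgt/<REALM> entry should be listed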

Thanks,

Gaurav Dasgupta

Apr 3, 2013, 4:56:10 AM
to spark...@googlegroups.com
Hi,

The issue is solved: I had to kinit locally for all of the host principals.
No further configuration was required on the Spark side.
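
For anyone hitting the same thing, the shape of that step is roughly the following (a sketch; the keytab path, principal, and realm below are placeholders, not the actual values from this cluster):

    # on each host, obtain a ticket for that host's principal from its keytab
    kinit -kt /path/to/host.keytab host/$(hostname -f)@EXAMPLE.COM
    klist    # confirm the ticket cache now holds a valid TGT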

Thanks,
Gaurav