Hive Authorization Issue using waggle dance

sreshta...@gmail.com

Jan 19, 2018, 2:12:17 AM
to Waggle Dance User
Hi

We have created a separate user to set up the SSH tunnel to the federated metastore, and that user has access to the tables on the federated Hive metastore. On our user cluster, some users have read access to a few databases on our local Hive metastore and write access to the remaining databases. We have enabled Hive authorization on the user cluster to enforce these per-user access checks. This is causing issues when running queries on tables accessed through the federated Hive metastore; they fail with the error below:
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: Principal [name=skamatham, type=USER] does not have following privileges for operation SHOWPARTITIONS [[SELECT] on Object [type=TABLE_OR_VIEW, name=dsp_dm.order_trans_fact]] (state=42000,code=40000)


After doing some research, we found that the query on the federated metastore checks whether the user "skamatham" has access to the table, whereas access to the federated metastore was granted to the SSH tunnel user, which is a service account.

Is it possible to bypass Hive authorization for queries on the federated Hive metastore?
If not, can you please suggest a workaround?

Please let me know if you need more information on it.

Thanks
Sreshta

Patrick Duin

Jan 22, 2018, 9:40:18 AM
to Waggle Dance User
Hi,

The user who initiates the query is the user that is passed along via Hive/Waggle Dance; the SSH tunnel user is irrelevant to that. Waggle Dance really just acts as a proxy here and passes the user along. So for Hive authorization to work you'll have to set it up for every user that can run queries (this would be the same with or without Waggle Dance).

Waggle Dance does provide some crude (database-level) read/write protection, which is explained in the README.

For your specific use case you could try setting HADOOP_USER_NAME (to the tunnel user) in the client (Hive/Spark/Qubole). Each client will have to do this. If that solves your issue, we might be able to accommodate it a bit better and set up Waggle Dance to issue all requests as a certain user; that would need some testing to see whether it works. Alternatively, set up the authorization for the user doing the request, "skamatham".
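
For reference, here is a minimal, untested sketch of that approach for a PySpark client. The service-account name "svc_tunnel_user" is just a placeholder for your tunnel user, and HADOOP_USER_NAME is only honoured with simple (non-Kerberos) Hadoop authentication:

import os

# Placeholder for the ssh tunnel service account - replace with your own.
# Must be set before the Spark JVM starts so Hadoop picks it up.
os.environ["HADOOP_USER_NAME"] = "svc_tunnel_user"

from pyspark.sql import SparkSession

# Hive support so the session talks to the metastore fronted by Waggle Dance.
spark = (
    SparkSession.builder
    .appName("waggle-dance-federated-read")
    .enableHiveSupport()
    .getOrCreate()
)

# Query a table resolved through the federated metastore.
spark.sql("SHOW PARTITIONS dsp_dm.order_trans_fact").show()

Keep in mind this is effectively impersonation: the authorization checks would then apply to the service account rather than to the end user running the query.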

Cheers,
 Patrick