Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2

Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/14/13 3:48 PM
Does Sentry work with Cloudera Standard (free edition), CDH 4.3.0, and Cloudera Manager 4.6.2?
thanks
sanjay
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Harsh J 8/14/13 11:15 PM
Yes it does. Are you having specific issues in trying to set it up
while following the docs? If so, which step and what's the blocker?



--
Harsh J
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/15/13 1:08 PM
Hi Harsh 

Sorry, don't mind my saying so, but I actually found the Sentry guide very cryptic :-( and I have many questions.

I will ask my questions based on the Sentry PDF, and it would be great if I could get some guidance on them.

To give you some background on what we have in production, and on what I am attempting to evaluate and upgrade with Sentry for Hive, I have posted the version info below.


[1] Prerequisite section: I am not sure how to set the minimum user ID for job submission. We are using YARN, and I could not find this property in Cloudera Manager under "MapReduce Properties".


[2] Since I am evaluating Cloudera Manager 4.6.2, how do I set the following? (I am not using parcels in either production or the dev evaluation.)
     - HIVE_AUX_JARS_PATH

[3] Roles and Privileges section: What user interface does Sentry have for setting roles and privileges? This is not clear from the section. Or am I supposed to use the Hive CLI to define ROLES and assign users to the ROLES?

[4] Users and Groups section: Is the "Group" referred to here a Linux group?

[6] Is Sentry.xml edited through a safety valve, or do we need to edit it the old-fashioned way?

[7] Policy File: What is this? I am not sure I understand what it is and how it fits into the scheme of things.

[8] Defining Roles: In which XML, CFG, or INI file do I configure roles?

[9] Defining URIs: What is this, and why is it required?

[10] Can Sentry work without securing hive-metastore?

[11] Can Sentry work without HDFS Kerberos security configured? The reason I ask is that Kerberos will make HDFS slow.

[12] I have already configured HiveServer 2 with LDAP for authentication. Is this not enough? Can I not use Sentry to start defining roles and privileges?

thanks

sanjay




What I have in production is as follows
Group 1 (CDH4)
Hosts
thv-nn1.pv.sv.nextag.com, thv01.pv.sv.nextag.com, thv02.pv.sv.nextag.com, thv03.pv.sv.nextag.com, thv04.pv.sv.nextag.com, thv05.pv.sv.nextag.com, thv06.pv.sv.nextag.com, thv07.pv.sv.nextag.com, thv08.pv.sv.nextag.com, thv09.pv.sv.nextag.com, thv10.pv.sv.nextag.com, thv11.pv.sv.nextag.com, thv12.pv.sv.nextag.com, thv13.pv.sv.nextag.com, thv14.pv.sv.nextag.com, thv15.pv.sv.nextag.com, thv16.pv.sv.nextag.com, thv17.pv.sv.nextag.com, thv18.pv.sv.nextag.com, thv19.pv.sv.nextag.com, thv20.pv.sv.nextag.com, thv21.pv.sv.nextag.com, thv22.pv.sv.nextag.com, thv23.pv.sv.nextag.com, thv24.pv.sv.nextag.com, thv25.pv.sv.nextag.com
Component                            Version      CDH Version
Impala                               Unavailable  Not installed or path incorrect
HDFS (CDH4 only)                     2.0.0+552    CDH4
Hue Plugins                          2.1.0+221    CDH4
MapReduce 2 (CDH4 only)              2.0.0+552    CDH4
HBase                                0.92.1+160   CDH4
Oozie                                3.2.0+126    CDH4
Yarn (CDH4 only)                     2.0.0+552    CDH4
Zookeeper                            3.4.3+28     CDH4
Hue                                  2.1.0+221    CDH4
MapReduce 1 (CDH4 only)              0.20.2+1265  CDH4
HttpFS (CDH4 only)                   2.0.0+552    CDH4
Hadoop                               2.0.0+552    CDH4
Hive                                 0.9.0+155    CDH4
Flume NG                             1.2.0+122    CDH4
Cloudera Manager Management Daemons  4.1.2        Not applicable
Cloudera Manager Agent               4.1.2        Not applicable


What I have done till now in my development sandbox
Group 1 (CDH4)
Hosts
dev-thdp5.corp.nextag.com
Component                Version     CDH Version
Impala                   1.1         Not applicable
Yarn (CDH4 only)         2.0.0+1357  CDH4
HDFS (CDH4 only)         2.0.0+1357  CDH4
Hue Plugins              2.3.0+136   CDH4
Sqoop2 (CDH4 only)       1.99.1+115  CDH4
MapReduce 2 (CDH4 only)  2.0.0+1357  CDH4
HBase                    0.94.6+96   CDH4
HCatalog (CDH4 only)     0.5.0+9     CDH4
Oozie                    3.3.2+49    CDH4
Zookeeper                3.4.5+19    CDH4
Hue                      2.3.0+136   CDH4
MapReduce 1 (CDH4 only)  2.0.0+1357  CDH4
Pig                      0.11.0+28   CDH4
HttpFS (CDH4 only)       2.0.0+1357  CDH4
Hadoop                   2.0.0+1357  CDH4
Hive                     0.10.0+121  CDH4
Solr (CDH4 only)         4.3.0+61    CDH4
Flume NG                 1.3.0+159   CDH4
Java                     java version "1.6.0_31" Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)  Not applicable
Cloudera Manager Agent   4.6.2       Not applicable
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Shreepadma Venugopalan 8/15/13 10:28 PM
Hi Sanjay,

Thank you for your interest in Sentry. My responses to your questions are inline. 

--Shreepadma


On Thu, Aug 15, 2013 at 1:08 PM, Sanjay Subramanian <sanjaysu...@gmail.com> wrote:

[1] Prerequisite section : Not sure how to set minimum user ID for job submission. We are using YARN and I could not find this property in Cloudera Manager "MapReduce Properties"


[2] Since I am evaluating 4.6.2 Cloudera Manager how do I set  the following (And in production and dev evaluation I am not using parcels)
     - HIVE_AUX_JARS_PATH

I believe that with CM 4.6.2 it's not necessary to set HIVE_AUX_JARS_PATH.
 

[3] Roles and Privileges section : What user interface does Sentry have for setting roles and privileges ? Not clear from this section. Or am I supposed to use Hive CLI to define ROLES and assign users to the ROLES ?

Sentry roles and privileges are defined in a policy file whose location is specified in sentry-site.xml. Please see page 9 of the documentation for an example policy file and page 10 for sentry-site.xml. Note that you need to make changes to hive-site.xml as well; those changes are on page 11.


[4] Users and Groups section : Is this "Group" that is being referred to here - Linux Group ?

The groups section maps groups to roles. Sentry supports reusing existing OS groups as well as defining new groups under the "users" section of the policy file. Note that a particular Sentry deployment can't use both groups defined locally in the policy file and OS groups. To use OS groups, you need to set hive.sentry.provider to org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider.
 

[6] Is Sentry.xml edited through a safety valve or we need to edit it the old fashioned way ? 

In CM 4.6.*, sentry-site.xml has to be edited manually. However, note that sentry-site.xml has only a few configurations. We're working on adding support for editing sentry-site.xml in CM in a future release.
 

[7] Policy File - What is this ? I am not sure I understand what this is and how it fits into the scheme of things

The policy file contains the mappings between privileges and roles, and between roles and groups. Additionally, it supports specifying user-to-group mappings in case you do not wish to use OS groups.
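
To make the chain of mappings concrete, here is a minimal Python sketch (not part of Sentry) that resolves user -> groups -> roles -> privileges from a policy file held in a string. The section names follow the ones used in this thread; the server name `server1`, the `admin_group` entry, and the helper functions are made up for illustration, and the sketch assumes comma-separated lists in each value. With OS groups, the user-to-group step would come from the operating system rather than from a [users] section.

```python
import configparser

# Hypothetical policy text; "server1" and "admin_group" are illustrative only.
POLICY_TEXT = """
[groups]
sas_group = sas_role
admin_group = admin_role

[users]
hiveuser1 = sas_group

[roles]
sas_role = server=server1->db=default->table=keyword_impressions_log->action=select
admin_role = server=server1
"""


def load_policy(text):
    """Parse the policy text; only '=' separates a name from its value."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep names exactly as written
    parser.read_string(text)
    return parser


def split_list(value):
    """Split a comma-separated value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def privileges_for_user(policy, user):
    """Follow user -> groups -> roles -> privileges (assumed list semantics)."""
    privileges = []
    for group in split_list(policy.get("users", user, fallback="")):
        for role in split_list(policy.get("groups", group, fallback="")):
            privileges.extend(split_list(policy.get("roles", role, fallback="")))
    return privileges


policy = load_policy(POLICY_TEXT)
print(privileges_for_user(policy, "hiveuser1"))
```

A user who is not listed anywhere simply resolves to an empty list of privileges in this sketch.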
 

[8] Defining Roles - which XML or CFG or INI do I configure roles ?

Roles are defined in the policy file. Please refer to the sample on page 9 of the documentation.
 

[9] Defining URIs - What is this and why is it required ? 

Sentry provides authorization on URIs in addition to databases, tables, etc. This is needed to safeguard against a malicious user trying to access data (s)he doesn't have access to via external tables, LOAD, etc.
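
As a rough illustration of the idea only, the Python sketch below treats each granted URI as a location prefix and rejects any requested location that falls outside every granted prefix. This is a simplification under stated assumptions, not Sentry's actual matching logic; the function name, the `namenode:8020` host, and the paths are all hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def uri_allowed(requested, granted_uris):
    """Return True if `requested` lies at or under one of the granted URIs."""
    req = urlparse(requested)
    req_path = posixpath.normpath(req.path)  # collapses ".." and duplicate "/"
    for granted in granted_uris:
        base = urlparse(granted)
        # Scheme and host:port must match exactly in this simplified check.
        if (base.scheme, base.netloc) != (req.scheme, req.netloc):
            continue
        base_path = posixpath.normpath(base.path)
        if req_path == base_path or req_path.startswith(base_path.rstrip("/") + "/"):
            return True
    return False


# Hypothetical grant: the user may reference data under /data/sas only.
grants = ["hdfs://namenode:8020/data/sas"]
print(uri_allowed("hdfs://namenode:8020/data/sas/part-0", grants))
print(uri_allowed("hdfs://namenode:8020/user/hive/warehouse/other.db/t", grants))
```

Without a check of this kind, a user could point an external table at another table's files and read them even though no table privilege allows it.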

[10] Can Sentry work without securing hive-metastore ?

Sentry will work without securing the metastore, but it's definitely not recommended. Without securing the metastore through IPTables, Sentry can't provide secure access.
 

[11] Can Sentry work without HDFS kerberos security configured ? The reason is it will make HDFS slow.

Sentry will work without Kerberos. However, if the underlying cluster is not secure, Sentry alone can't guarantee secure access to data on such a cluster.

[12] I have already configured HiveServer 2 with LDAP for authentication...Is this not enough ? Can I not use Sentry to start defining roles and privileges ? 

Yes, Sentry accepts both LDAP and Kerberos credentials. While this can be used to start defining roles and privileges, as noted earlier, the underlying cluster has to be secure to guarantee secure access to data.
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/16/13 12:04 PM
Awesome, thanks so much for your patience and for taking the time to answer my questions. Let me go back to the doc and work through it based on your mail.

Also, can you please answer the following?

[1] Prerequisite section: I am not sure how to set the minimum user ID for job submission. We are using YARN, and I could not find this property in Cloudera Manager under "MapReduce Properties".

thanks

sanjay
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Santhosh Srinivasan 8/16/13 1:13 PM
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/19/13 3:51 PM
Another question.

Just to clarify, I have not configured Sentry yet, but this question came to mind: does Sentry only work for users who have SSHed into a server that runs the Hive CLI?

Here is why I have this doubt.

My question
===========
After Sentry is successfully configured, can ROLES and permissions be enforced if a user logs in through a Beeline Hive client?

As of today, I find that HiveServer2 supports LDAP authentication. I got that working, so users accessing Hive from client machines have to be LDAP authenticated.

But even after defining ROLES, USERS, and permissions, I could not prevent a user connecting from a Beeline client from doing more than the ROLES and permissions allow.

sanjay




On Thursday, August 15, 2013 10:28:38 PM UTC-7, Shreepadma Venugopalan wrote:
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Shreepadma Venugopalan 8/19/13 8:53 PM
Hi Sanjay,

How are you doing the role grant? Are you by any chance using the grant statement on the Hive CLI/Beeline? Sentry is supported only with HS2 + Beeline. We don't recommend you use Sentry with the Hive CLI.

--Shreepadma
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/20/13 11:21 AM
Yeah, I was doing the ROLE GRANTS with the Hive CLI (there is no documentation that said you could not), but I realised that configuring the roles through the Hive CLI does not work with HiveServer 2 either.

So yes, today I am working on getting Sentry up and running on my sandbox.

As of now I am not using Kerberos, because I could not convince the systems guys that it's a good thing and that we need it. But based on an earlier mail from you, I can still run Sentry without Kerberos, so I am going ahead with this route.

Our main business use case is that our analytics folks have to connect to HiveServer2 through SAS connectors and run queries, and I don't want them doing any CRUD operations.

I HAVE to make this Sentry work !!! Else I might be forced to look for a job elsewhere :-)

regards

sanjay
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Shreepadma Venugopalan 8/20/13 2:06 PM
Hi Sanjay,

The roles and privileges are from Hive's baked-in authorization. Note that it's advisory only, and we don't recommend using it for true security. Please look at the Sentry documentation and let me know if you have questions.

Thanks.
Shreepadma
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/20/13 4:16 PM
Thanks Shreepadma

I am still getting errors. Can you please provide guidance? My files are as follows.

I am NOT running the hive-server service.
Only the following services are running:
- hive-metastore
- hive-server2

/etc/sentry/default.ini  - local file (NOT HDFS)
[roles]
sas_role = server=dev-thdp5.corp.nextag.com->db=default,table=keyword_impressions_log->action=select
admin_role = server=dev-thdp5.corp.nextag.com

/etc/sentry/sentry-provider.ini  - local file (NOT HDFS)
[databases]
default = file:////etc/sentry/default.ini

[groups]
sas_group = sas_role

[users]
hiveuser1 = sas_group

[roles]
sas_role = server=dev-thdp5.corp.nextag.com->db=default,table=keyword_impressions_log->action=select
admin_role = server=dev-thdp5.corp.nextag.com
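
Editorial sketch: one way to sanity-check [roles] values like the ones above before restarting HiveServer2. It assumes, as in the documented Sentry policy-file examples, that commas separate the privileges of a role, that `->` separates the parts of one privilege, and that each privilege is written out in full starting from `server=`. The function name is hypothetical and this is not a Sentry tool.

```python
def check_role(value):
    """Return a list of problems found in one [roles] value."""
    problems = []
    for privilege in (p.strip() for p in value.split(",")):
        if not privilege:
            continue
        parts = [part.strip() for part in privilege.split("->")]
        for part in parts:
            key, sep, val = part.partition("=")
            if not sep or not key.strip() or not val.strip():
                problems.append(f"{privilege!r}: segment {part!r} is not key=value")
        # Assumption: every privilege starts at the server level.
        if not parts[0].lower().startswith("server="):
            problems.append(f"{privilege!r}: does not start with server=")
    return problems


# The sas_role value exactly as posted in this message.
posted = ("server=dev-thdp5.corp.nextag.com->db=default,"
          "table=keyword_impressions_log->action=select")
for problem in check_role(posted):
    print(problem)
```

Under these assumptions, the comma between `db=default` and `table=keyword_impressions_log` splits the posted value into two privileges, and the second one no longer begins with `server=`; writing `->` there instead would keep it a single privilege.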

/etc/sentry/sentry-site.xml  - local file (NOT HDFS)
<configuration>
   <property>
      <name>hive.sentry.provider</name>
      <value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
   </property>
   <property>
     <name>hive.sentry.provider.resource</name>
     <value>file:////etc/sentry/sentry-provider.ini</value>
   </property>
   <property>
     <name>hive.sentry.server</name>
     <value>dev-thdp5.corp.nextag.com</value>
   </property>
</configuration>

/etc/hive/conf/hive-site.xml (local NOT HDFS)
<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera CM on 2013-08-20T02:46:09.113Z-->
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://dev-thdp5.corp.nextag.com:9083</value>
  </property>
  <!--'hive.metastore.warehouse.dir', originally set to '/user/hive/warehouse' (non-final), is overridden below by a safety valve-->
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>-1</value>
  </property>
  <property>
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>1073741824</value>
  </property>
  <property>
    <name>hive.exec.reducers.max</name>
    <value>999</value>
  </property>
  <property>
    <name>hive.metastore.execute.setugi</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>dev-thdp5.corp.nextag.com</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
  <property>
    <name>sentry.allow.hive.impersonation</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.session.hook</name>
    <value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook </value>
  </property>
  <property>
    <name>hive.sentry.conf.url</name>
    <value>file:////etc/sentry/sentry-site.xml</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://dev-thdp5.corp.nextag.com/hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>************</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>**********</value>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>false</value>
  </property>
  <property>
    <name>datanucleus.fixedDatastore</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>LDAP</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10786</value>
  </property>
  <property>
    <name>hive.server.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.authentication.ldap.url</name>
    <value>ldap://corp.nextag.com</value>
  </property>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.security.authorization.createtable.owner.grants</name>
    <value>ALL</value>
  </property>
  <property>
    <name>hive.security.authorization.createtable.role.grants</name>
    <value>sas_role:select</value>
  </property>
  <property>
    <name>hive.security.authorization.createtable.group.grants</name>
    <value>hiveuser1:select</value>
  </property>
  <property>
    <name>hive.security.authorization.createtable.user.grants</name>
    <value>hiveuser1:select</value>
  </property>
</configuration>

Thanks
sanjay
...
Re: Does Sentry work with Cloudera Standard(free edition) CDH 4.3.0 Cloudera Manager 4.6.2 Sanjay Subramanian 8/20/13 7:24 PM
attaching hive-server2 logs
...