We are trying to enable Kerberos authentication for our Presto cluster (version 0.143) on Amazon EMR (version emr-4.6.0) based on this guide:
(The guide is from Teradata, but the corresponding code seems to have been merged back into core Presto already. There is also a recent PR [1] to pull this documentation into the presto repo.)
So far, we successfully managed to:
(1) set a static hostname for the Presto coordinator node (MASTER node in EMR cluster)
(2) update the DNS configuration to enable lookup of our internal KDC server (which is accessible from the Presto master node)
(3) bind the Presto coordinator to a Windows domain, by running the "domainjoin-cli" command (provided by PBIS Open) on the coordinator node.
However, at this point we are stuck. If our presto coordinator node is joined to our Windows domain (see step (3) above), how do we configure Kerberos auth for Presto? In particular, we are struggling to understand the following questions:
(A) As we understand it, the following steps in the guide are to create 1) a computer principal "presto/presto-coordina...@EXAMPLE.COM" for the machine on which the coordinator is executing, and 2) a user principal "pre...@EXAMPLE.COM" for the user account under which the Presto server is executing on that machine. Are both accounts strictly required, or would it be sufficient to create the computer principal only? What are the roles and responsibilities of both principals?
kadmin
> addprinc -randkey pre...@EXAMPLE.COM
> addprinc -randkey presto/presto-coordina...@EXAMPLE.COM
(B) Do we need to run the Presto server (i.e., the com.facebook.presto.server.PrestoServer Java process) under a Domain Account user? In Amazon EMR, by default, Presto runs as the local user "presto". Can we continue to run the process under the local "presto" user, or does it have to run under a Domain Account user (i.e., the user principal created in (A))?
(C) Where should the Service Principal Name (SPN) [2] be registered, on the machine account or on the user account?
If anyone has managed to run Presto with Kerberos auth successfully, we would greatly appreciate it if you could point us in the right direction to get this set up.
Many thanks,
Waldemar
[1] https://github.com/prestodb/presto/pull/5217
[2] https://msdn.microsoft.com/en-us/library/ms677949(v=vs.85).aspx
Hi Christina,
I am following the same documentation from Teradata to set up Kerberos authentication for an EMR Presto cluster. Right now I am just testing with a single-node Presto server.
Create a service principal for the Presto server -- I believe in the AD world, this translates to setting up the service principal name. For Presto, it can be called presto/presto-coordinator.example.com (though for the JDBC/ODBC drivers, it must be called HTTP/presto-coordinator.example.com). The presto@REALM principal is not necessary.
I created the following service principal:
presto/<fqn or private ip address of the master node>@REALM (presto-coordinator.example.com in the guide is only an example).
I added these to config.properties:
http.server.authentication.enabled=true
http.server.authentication.krb5.service-name=presto/<fqn or private ip address of the master node>
http.server.authentication.krb5.keytab=/etc/presto/conf/presto.keytab
http.authentication.krb5.config=/etc/krb5.conf
http-server.https.enabled=true
http-server.https.port=8888
http-server.https.keystore.path=/etc/presto/conf/keystore.jks
http-server.https.keystore.key=password
I set up SSL and the keystore as per the documentation.
Executing the Presto CLI:
export JAVA8_HOME=/usr/lib/jvm/java-1.8.0
PRESTO_HOME=/usr/lib/presto
export PATH=$JAVA8_HOME/bin:$PATH
$PRESTO_HOME/bin/presto-cli-0.143-executable \
--server https://ip-<ip address>.ec2.internal:8888 \
--enable-authentication \
--krb5-config-path /etc/krb5.conf \
--krb5-principal user@REALM \
--krb5-keytab-path ./presto.keytab \
--krb5-remote-service-name presto/<fqn or private ip address of the master node> \
--keystore-path ./keystore.jks \
--keystore-password password \
--catalog hive
But I get the following error:
"Error running command: javax.net.ssl.SSLHandshakeException: General SSLEngine problem"
I can't proceed further. Any suggestions?
Thanks
Jothikumar
Your error is with how you have SSL set up (that's why you have "Error running command: javax.net.ssl.SSLHandshakeException: General SSLEngine problem"). Make sure that
1. You have ip-<ip address>.ec2.internal added to keystore.jks
2. keystore.jks is accessible by the presto user (or whatever user Presto is running as)
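Both checks can be run from a shell on the coordinator. This is a minimal sketch, assuming keytool and sudo are available and that the server runs as the local user presto. The helper name readable_by_current_user, the paths, and the password are placeholders, and the exact labels printed by keytool may differ between JDK versions.

```shell
# Hypothetical helper: succeed only if the given file exists and is
# readable by the user running this script.
readable_by_current_user() {
    [ -f "$1" ] && [ -r "$1" ]
}

# Typical use on the coordinator (commented out because it needs sudo,
# keytool, and a real keystore; path and password are placeholders):
#
#   # 1. Inspect which names the certificate in the keystore was issued for.
#   keytool -list -v -keystore /etc/presto/conf/keystore.jks -storepass password \
#       | grep -E 'Owner:|DNSName|IPAddress'
#
#   # 2. Confirm that the user running Presto can read the keystore.
#   sudo -u presto sh -c '[ -r /etc/presto/conf/keystore.jks ]' \
#       && echo "presto can read the keystore"
```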
However, you are going to soon have a Kerberos issue, once you get SSL working:
http.server.authentication.krb5.service-name should be JUST presto, it shouldn't have the host. Same with the --krb5-remote-service-name argument for the CLI.
While you're double checking things, make sure that the Kerberos keytab for the server (presto.keytab) is accessible by the Presto user.
And finally, in the CLI command, the keytab given must match the krb5 principal given. Unless presto.keytab has both the credentials for presto and user, the following is probably wrong:
--krb5-principal user@REALM \
--krb5-keytab-path ./presto.keytab \
I recommend adding extra debug output when running the CLI: http://teradata.github.io/presto/docs/current/security/server.html#additional-kerberos-debugging-info.
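To illustrate the two Kerberos points above, here is a small sketch. It assumes, as described in this thread, that the configured service name is combined with the coordinator host and the realm to form the full principal. The helper names full_principal and keytab_lists_principal are made up for this example, and the second helper expects the output of "klist -kt <keytab>" on stdin, where the principal is the last field of each entry.

```shell
# Compose a principal from its parts (illustrative helper, not part of Presto).
# $1 = service name, $2 = coordinator host, $3 = realm
full_principal() {
    printf '%s/%s@%s\n' "$1" "$2" "$3"
}

# Read `klist -kt <keytab>` output on stdin and succeed if the given
# principal appears as the last field of some line.
keytab_lists_principal() {
    awk -v p="$1" '$NF == p { found = 1 } END { exit found ? 0 : 1 }'
}

# With the bare service name, the principal comes out as expected:
full_principal presto presto-coordinator.example.com EXAMPLE.COM

# With a host-qualified service name, the host ends up in the name twice:
full_principal presto/presto-coordinator.example.com presto-coordinator.example.com EXAMPLE.COM

# Usage against a real keytab (needs the Kerberos client tools):
#   klist -kt ./presto.keytab | keytab_lists_principal user@REALM \
#       || echo "keytab does not hold user@REALM"
```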
Christina
________________________________________
From: Jothikumar [kbmk...@gmail.com]
Sent: Friday, May 20, 2016 1:21 PM
To: Presto
Cc: Wallin, Christina A
Subject: Re: Kerberos based authentication for Presto
Thanks Christina,
I think I might have made a mistake while creating the keystore:
keytool -genkeypair -alias presto -keyalg RSA -keystore keystore.jks
Instead of "presto", I think I actually need to give the principal name here,
which is presto/<fqn or private ip address of the master node>@REALM. This is the principal for the Presto service that I created in the KDC.
Also, when giving the first name in the keystore, I should give the unqualified hostname (IP address) of the Presto coordinator. I am using the same keystore for both the coordinator and the CLI. The coordinator (Presto server) is run by the "presto" user and the CLI is executed by the hadoop user; both have access to the keystore.
I am slightly confused by the documentation, which says:
http.server.authentication.krb5.service-name = The Kerberos server name for the Presto coordinator. Must match the Kerberos principal
Then why is it just presto, when my principal is presto/<fqn or private ip address of the master node>@REALM?
Please let me know whether my understanding is correct. I will test it and update.
Thanks
Jothikumar
The keystore has nothing to do with Kerberos; it's to set up HTTPS on the Presto server so that the connection is secure. The alias you use doesn't matter; it just serves as an identifier for which certificate it is. The thing that matters is that for the field "What is your first and last name", you give the FQDN of the Presto coordinator. It's significantly easier to use a FQDN and not an IP address, but if you're going to use an IP address, you can do something like this:
keytool -genkeypair -alias presto -keyalg RSA -keystore {keystore_path} -keypass {keystore_password} -storepass {keystore_password} -dname "CN={master_fqdn}, OU=, O=, L=, S=, C=" -ext san=ip:{ip}
Note that the above command doesn't prompt you for input. You can substitute either the coordinator FQDN or the ip instead of {master_fqdn}, namely the CN (aka Common Name). See here for more information: http://stackoverflow.com/questions/8443081/how-are-ssl-certificate-server-names-resolved-can-i-add-alternative-names-using/8444863#8444863.
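The choice between a hostname and an IP address can be sketched as a small helper that prints the matching keytool command. This is only an illustration: the function name keystore_command is made up, the host names and the address 10.0.0.12 are placeholders, and the printed command omits -keypass and -storepass, so keytool would prompt for the passwords.

```shell
# Print a keytool command for a self-signed coordinator certificate.
# The SAN type is "ip" when the name consists only of digits and dots,
# and "dns" otherwise. (Illustrative helper; it only prints the command.)
# $1 = name that clients pass to --server, $2 = keystore path
keystore_command() {
    case "$1" in
        *[!0-9.]*) san="dns:$1" ;;  # contains something other than digits and dots
        *)         san="ip:$1"  ;;
    esac
    printf 'keytool -genkeypair -alias presto -keyalg RSA -keystore %s -dname "CN=%s" -ext san=%s\n' \
        "$2" "$1" "$san"
}

keystore_command presto-coordinator.example.com keystore.jks
keystore_command 10.0.0.12 keystore.jks
```

Whichever name is used, it should be the same name that clients later put in the --server URL.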
The docs might have been a little unclear on the point about http.server.authentication.krb5.service-name, but it definitely should not be qualified by the hostname.
Happy Kerberizing!
Christina
________________________________________
From: Jothikumar [kbmk...@gmail.com]
Sent: Friday, May 20, 2016 2:05 PM
--krb5-remote-service-name presto
If you qualify the servicename there, you'll get errors authenticating presto/server.example.com/server.ex...@EXAMPLE.COM.
Again, we'll be happy to take pull requests on the docs. Or, if you'd rather just close the loop here, we'll update the docs ourselves.
Eric
Thanks for the detailed explanation; it makes a lot of sense now. I was able to set up Kerberos successfully between the Presto coordinator (single-node EMR) and the Presto CLI.
I had made the mistake of giving the wrong name in the CN part of the keystore; Christina explained this in detail, and I was able to fix it and make progress.
Now comes the part where I want to connect the Kerberized coordinator to a JDBC-compliant SQL client, and to ODBC as well. The previous discussion mentioned that the principal name is HTTP/FQDN@REALM.
What about the Java keystore in the case of JDBC or ODBC? I would appreciate any steps you have for a JDBC connection to the Presto coordinator.
Thanks for all your help
Jothikumar
The steps I see are as follows.
In the /etc/presto/config.properties file, set the following properties:
http.server.authentication.krb5.service-name=HTTP
http.server.authentication.krb5.keytab=HTTP.keytab
On your client machine, in the java.policy file for your Java environment, include the following line:
permission java.util.PropertyPermission "javax.security.auth.useSubjectCredsOnly", "write";
Connect to the Presto server using a connection URL written in the following format:
jdbc:presto://[Host]:[Port];enableKerberos=true;SSLCertificate=[SSLCert]
The variables are defined as follows:
[Host] is the DNS or IP address of the server.
[Port] is the number of the TCP port to connect to.
[SSLCert] is the full path to a .jks file that contains the SSL certificate for the server.
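As a quick illustration of the URL format above, the three variables can be filled in with a small helper. The function name presto_jdbc_url and the host, port, and keystore path are placeholders chosen for this sketch; only the URL layout comes from the format quoted above.

```shell
# Build a connection URL in the format quoted above (illustrative helper).
# $1 = host, $2 = port, $3 = full path to the .jks file
presto_jdbc_url() {
    printf 'jdbc:presto://%s:%s;enableKerberos=true;SSLCertificate=%s\n' "$1" "$2" "$3"
}

presto_jdbc_url presto-coordinator.example.com 8888 /home/hadoop/presto_keystore.jks
```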
Reference doc : http://www.simba.com/products/Presto/doc/v1/JDBC_InstallGuide/content/jdbc/pr/authenticating/kerberos.htm
I am going to try this and will update. If you see any problems with this approach, let me know.
Thanks
Jothikumar
Connection connection = DriverManager.getConnection("jdbc:presto://ip-<ipaddress>.ec2.internal:8888/hive/schema_name;enableKerberos=true;SSLCertificate=/home/hadoop/presto_keystore.jks","test",null);
If I use the Teradata driver, I get the error below:
Exception in thread "main" java.sql.SQLException: [Teradata][Presto](100073) Error fetching JSON content: No content to map due to end-of-input
at [Source: ; line: 1, column: 1].
at com.teradata.presto.client.PRClientUtil.toJsonNode(Unknown Source)
at com.teradata.presto.client.PrestoAPI.execute(Unknown Source)
at com.teradata.presto.dataengine.PRResultSet.execute(Unknown Source)
at com.teradata.presto.dataengine.PRQueryExecutor.execute(Unknown Source)
at com.teradata.presto.dataengine.metadata.PRQueryMetadataSource.executeQuery(Unknown Source)
at com.teradata.presto.dataengine.metadata.PRTablesMetadataSource.<init>(Unknown Source)
at com.teradata.presto.dataengine.PRDataEngine.makeNewMetadataSource(Unknown Source)
at com.teradata.dsi.dataengine.impl.DSIDataEngine.makeNewMetadataResult(Unknown Source)
at com.teradata.dsi.dataengine.impl.DSIDataEngine.makeNewMetadataResult(Unknown Source)
at com.teradata.presto.core.PRConnection.doConnectionTest(Unknown Source)
at com.teradata.presto.core.PRConnection.connect(Unknown Source)
at com.teradata.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.teradata.jdbc.common.AbstractDriver.connect(Unknown Source)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
Caused by: com.teradata.support.exceptions.GeneralException: [Teradata][Presto](100073) Error fetching JSON content: No content to map due to end-of-input
at [Source: ; line: 1, column: 1].
... 15 more
If I use the open-source Presto JDBC driver, it doesn't even recognize the additional Kerberos parameters.
Any help is appreciated.
Thanks
Jothikumar
Thanks for the details.
I got the Presto CLI working (Kerberized) by following the documentation below:
http://teradata.github.io/presto/docs/current/security/server.html
For JDBC, the Teradata JDBCPrestoInstallGuide.pdf says to set the following properties
in the /etc/presto/config.properties file:
http.server.authentication.krb5.service-name=HTTP
http.server.authentication.krb5.keytab=HTTP.keytab
This is different from the CLI settings.
For the CLI, we set the property below:
http.server.authentication.krb5.service-name=presto, and we create a principal presto/{FQDN}@REALM and export it into a keytab.
So for JDBC, do we need to create a principal HTTP/{FQDN}@REALM, export it into a keytab, and use that? And if I want both the CLI and JDBC to work together with Kerberos, how should the property below be set?
http.server.authentication.krb5.service-name=HTTP or presto?
Thanks
Jothikumar
Thanks for all your support, JDBC access works fine with the Teradata driver.
Thanks
Jothikumar
I got confused while creating the CN for the Java keystore used for TLS. The CN should match the name we use to access the Presto coordinator; if that is an IP address, then the certificate needs to contain the IP.
presto_host_name below should match the CN in the certificate:
--server https://presto_host_name:<port> \
keytool -genkeypair -alias presto -keyalg RSA -keystore presto_keystore.jks -keypass password123 -storepass password123 -dname "CN=<presto_host_name>, OU=, O=, L=, S=, C=" -ext san=ip:<presto host ip>
The above command helped me.
Thanks
Jothikumar
http.server.authentication.enabled=true
http.server.authentication.krb5.service-name=HTTP
http.server.authentication.krb5.keytab=/etc/presto/presto.keytab
http.authentication.krb5.config=/etc/krb5.conf
http-server.https.enabled=true
http-server.https.port=7778
http-server.https.keystore.path=/etc/presto/presto.jks
http-server.https.keystore.key=<password>
java -Dsun.security.krb5.debug=true -jar /Users/hye/Downloads/presto-cli-0.147-executable.jar --debug --server https://socrates-auth-dev.data.internal.atlassian.com:7778 --enable-authentication --krb5-config-path /Users/hye/Downloads/krb5.conf --krb5-principal hye@<REALM> --krb5-keytab-path /Users/hye/Downloads/hye.keytab --krb5-remote-service-name HTTP --catalog hive --schema default --keystore-path=/Users/hye/Downloads/presto.jks --keystore-password=<password>
jdbc:presto://socrates-auth-dev.data.internal.atlassian.com:7778/hive/default;User=hye@<REALM>;enableKerberos=true;TimeZoneID=UTC;SSLCertificate=/Users/hye/Downloads/presto.jks;LogLevel=6
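The server configuration and the CLI invocation above use the same service name (HTTP). As noted earlier in the thread, the CLI's --krb5-remote-service-name has to agree with the server's http.server.authentication.krb5.service-name. A minimal sketch of that consistency check follows; the helper names and the config path in the usage comment are assumptions made for this example.

```shell
# Print the value of http.server.authentication.krb5.service-name from a
# Presto config.properties file (illustrative helper).
configured_service_name() {
    sed -n 's/^http\.server\.authentication\.krb5\.service-name=//p' "$1"
}

# Succeed if the server-side service name equals the value that will be
# passed to the CLI as --krb5-remote-service-name.
# $1 = config.properties path, $2 = CLI value
service_names_match() {
    [ "$(configured_service_name "$1")" = "$2" ]
}

# Usage on the coordinator (path is a placeholder):
#   service_names_match /etc/presto/config.properties HTTP \
#       || echo "server and CLI service names differ"
```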
[Teradata][Presto](100073) Error fetching JSON content: No content to map due to end-of-input
at [Source: ; line: 1, column: 1].
2016-07-13 16:35:00.392 FINE 251 [pool-4-thread-39 - Q.flush] Jul 13 06:35:00.392 TRACE 39045 com.teradata.presto.client.PrestoAPI.execute("SELECT * FROM "system"."information_schema"."tables" WHERE table_schema LIKE '' AND table_name LIKE 'default'"): +++++ enter +++++
2016-07-13 16:35:00.392 FINE 251 [pool-4-thread-39 - Q.flush] Jul 13 06:35:00.392 TRACE 39045 com.teradata.presto.client.PRClientUtil.POST("SELECT * FROM "system"."information_schema"."tables" WHERE table_schema LIKE '' AND table_name LIKE 'default'"): +++++ enter +++++
2016-07-13 16:35:01.749 FINE 251 [pool-4-thread-39 - Q.flush] Jul 13 06:35:01.749 TRACE 39045 com.teradata.presto.client.PRClientUtil.POST("Error Detected during POST operation"): +++++ enter +++++
2016-07-14T01:25:26.699Z INFO http-worker-153 stdout >>> KrbApReq: authenticate succeed.
2016-07-14T01:25:26.699Z INFO http-worker-153 stdout Krb5Context setting peerSeqNumber to: 204003340
2016-07-14T01:25:26.699Z INFO http-worker-153 stdout >>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
2016-07-14T01:25:26.700Z INFO http-worker-153 stdout Krb5Context setting mySeqNumber to: 315418662
2016-07-14T01:25:26.700Z INFO http-worker-153 stdout >>> Constrained deleg from GSSCaller{UNKNOWN}
2016-07-14T01:25:26.724Z INFO query-execution-3 com.facebook.presto.event.query.QueryMonitor TIMELINE: Query 20160714_012525_00002_8pg7g :: Transaction:[63614139-2f6a-49e1-9fc4-4d7c12cfc972] :: elapsed 849.00ms :: planning 9.24ms :: scheduling 23.00ms :: running 48.00ms :: finishing 769.00ms :: begin 2016-07-14T01:25:25.861Z :: end 2016-07-14T01:25:26.710Z
Jul 14, 2016 4:56:16 PM org.apache.http.impl.auth.HttpAuthenticator generateAuthResponse
WARNING: NEGOTIATE authentication error: Invalid name provided (Mechanism level: KrbException: Cannot locate default realm)
Exception in thread "main" java.sql.SQLException: [Teradata][Presto](100073) Error fetching JSON content: No content to map due to end-of-input
at [Source: ; line: 1, column: 1].
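The "Cannot locate default realm" warning above usually points to the client JVM not having loaded a krb5.conf, or to a krb5.conf without a default_realm entry under [libdefaults]. That is a common cause rather than a confirmed diagnosis for this setup. A small sketch for checking the file follows; the helper name default_realm and the client class name in the usage comment are hypothetical, while java.security.krb5.conf is the standard JDK system property for pointing the JVM at a specific file.

```shell
# Print the default_realm value from a krb5.conf file; prints nothing if
# the setting is absent. (Illustrative helper; simple line-based parse.)
default_realm() {
    sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*//p' "$1"
}

# Usage (paths and class name are placeholders):
#   default_realm /Users/hye/Downloads/krb5.conf
#   java -Djava.security.krb5.conf=/Users/hye/Downloads/krb5.conf \
#        -cp <driver jar>:. MyJdbcClient
```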