Configuration of FileHiveMetastore


George Papageorgiou

Nov 14, 2016, 2:57:57 PM
to Presto
Hi all,

I'm interested in using the file-based metastore in the Hive connector after seeing that some progress has been made in pull request #6525 (Add FileHiveMetastore https://github.com/prestodb/presto/pull/6525/).

I've cloned Dain's repository and created a new HiveFilePlugin in the file-metastore branch by extending the HivePlugin and assigning it the "hive-file" name. I added the new plugin in plugin/hive-file and created a properties file under etc/catalog. The new properties file contains the "connector.name", "hive.metastore.uri" and "hive.metastore.catalog.dir" configuration properties.

If the "hive.metastore.uri" configuration property is not set, PrestoServer crashes with the following error:

Error: Invalid configuration property hive.metastore.uri: may not be null (for class com.facebook.presto.hive.StaticMetastoreConfig.metastoreUris)
      at com.facebook.presto.hive.HiveClientModule.configure(HiveClientModule.java:84)

If I assign "hive.metastore.uri" a valid URI and restart the server, it crashes with:

Configuration property 'hive.metastore.catalog.dir=/filemetastore' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

However, that configuration property is one of those defined in presto-hive/src/main/java/com/facebook/presto/hive/metastore/file/FileHiveMetastoreConfig.java.

Does anyone have any suggestions/ideas about what I'm doing wrong?

Thanks,

George


P.S. The last few lines of var/log/server.log contain the following:

2016-11-14T14:53:52.013-0500    INFO    main    Bootstrap
2016-11-14T14:53:52.013-0500    WARN    main    Bootstrap   UNUSED PROPERTIES
2016-11-14T14:53:52.013-0500    WARN    main    Bootstrap   hive.metastore.catalog.dir=/filemetastore
2016-11-14T14:53:52.014-0500    WARN    main    Bootstrap
2016-11-14T14:53:52.230-0500    ERROR   main    com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:

1) Configuration property 'hive.metastore.catalog.dir=/filemetastore' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

1 error
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Configuration property 'hive.metastore.catalog.dir=/filemetastore' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

1 error
    at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:466)
    at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155)
    at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107)
    at com.google.inject.Guice.createInjector(Guice.java:96)
    at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:242)
    at com.facebook.presto.hive.HiveConnectorFactory.create(HiveConnectorFactory.java:106)
    at com.facebook.presto.connector.ConnectorManager.createConnector(ConnectorManager.java:304)
    at com.facebook.presto.connector.ConnectorManager.addCatalogConnector(ConnectorManager.java:193)
    at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:185)
    at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:171)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalog(StaticCatalogStore.java:99)
    at com.facebook.presto.metadata.StaticCatalogStore.loadCatalogs(StaticCatalogStore.java:77)
    at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:120)
    at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:67)


2016-11-14T14:53:52.231-0500    INFO    Thread-39   io.airlift.bootstrap.LifeCycleManager   Life cycle stopping...

Dain Sundstrom

Nov 14, 2016, 4:38:16 PM
to presto...@googlegroups.com
There is a bug in my branch here:

https://github.com/prestodb/presto/pull/6525/commits/919b2ef6aa3a4035e83f3399162c0039a1708bfe#diff-b743769f5302d024c2bd3ffc5426f124R40

That line should be:

configBinder(binder).bindConfig(FileHiveMetastoreConfig.class);
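
One way to read this fix: a catalog property only counts as "used" if some bound config class declares it, and the bootstrap fails on any leftovers. The following is a stand-alone sketch of that kind of strict check. It is not airlift's actual implementation; the class name, method name, and thrift URI are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of a strict "no unused properties" check.
final class UnusedPropertyCheck {
    // Returns the supplied property names that no bound config class declared.
    public static Set<String> unusedProperties(Map<String, String> supplied, Set<String> declared) {
        Set<String> unused = new TreeSet<>(supplied.keySet());
        unused.removeAll(declared);
        return unused;
    }

    public static void main(String[] args) {
        // Hypothetical catalog properties, modeled on the ones in this thread.
        Map<String, String> catalog = Map.of(
                "hive.metastore.uri", "thrift://example:9083",
                "hive.metastore.catalog.dir", "/filemetastore");

        // Only the thrift metastore config is bound, so catalog.dir has no consumer.
        Set<String> thriftOnly = Set.of("hive.metastore.uri");
        System.out.println(unusedProperties(catalog, thriftOnly));

        // With a config class that declares catalog.dir also bound, nothing is left over.
        Set<String> withFileConfig = Set.of("hive.metastore.uri", "hive.metastore.catalog.dir");
        System.out.println(unusedProperties(catalog, withFileConfig));
    }
}
```

In this sketch the first call reports hive.metastore.catalog.dir as unused, which mirrors the "was not used" error above; the second call reports nothing.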

George Papageorgiou

Nov 14, 2016, 6:32:23 PM
to Presto
Hi Dain,

Thanks for the quick reply.

I tried to follow your suggestion by using:

ConfigBinder.configBinder(binder).bindConfig(FileHiveMetastoreConfig.class);

instead of what you mention in your reply:

configBinder(binder).bindConfig(FileHiveMetastoreConfig.class);

which hopefully is what you meant in your comment above (I'm not familiar with airlift and its usage).

With a properties file that includes "connector.name", "hive.metastore.uri" and "hive.metastore.catalog.dir" as before, restarting the server crashes it with the same error (Unable to create injector) shown in my previous message.

Any further ideas/suggestions would be helpful. Hopefully, I'm not missing something obvious in my configuration. I assume that using the "hive.metastore.catalog.dir" configuration parameter would be enough to force the connector to use the "file" metastore (and by that I mean HDFS) instead of the hive metastore.

Thanks,

George

Dain Sundstrom

Nov 14, 2016, 7:19:56 PM
to presto...@googlegroups.com
For my branch you need these properties:

hive.metastore=file
hive.metastore.catalog.dir=<some_hadoop_fs_url>

The hive.metastore.uri property only applies when hive.metastore=thrift is set (which is the default).

-dain
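
For readers following along, a catalog file built from these two properties might look like the sketch below. The connector name "hive-file" is the custom plugin name from the first message, the file name assumes a catalog called "file", and the HDFS host and path are hypothetical placeholders.

```properties
# etc/catalog/file.properties (assumed file name; it determines the catalog name)
# "hive-file" is the custom plugin name from the first message, not a stock connector.
connector.name=hive-file
# Select the file-based metastore instead of the default thrift one.
hive.metastore=file
# Hypothetical Hadoop file system URL; replace with your own.
hive.metastore.catalog.dir=hdfs://namenode.example.com:8020/filemetastore
```

Whether hive.metastore.uri can be left out entirely depends on the state of the branch being built.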

George Papageorgiou

Nov 14, 2016, 10:50:16 PM
to Presto
Hi Dain,

Setting 

hive.metastore=file
hive.metastore.catalog.dir=<hdfs_url> 

seems to help, although I still have to set the hive.metastore.uri property; otherwise the server crashes during startup. I can see the catalog corresponding to the connector and the information_schema within that catalog. I also see some tables in information_schema:

presto> show tables from file.information_schema;
          Table
-------------------------
 __internal_partitions__
 columns
 schemata
 tables
 views
(5 rows)

I've created a .prestoSchema JSON file with ownerName (set to the owner of the HDFS file system) and ownerType (set to USER), and uploaded it to the directory specified in hive.metastore.catalog.dir, hoping that this would be enough to make that directory visible as a database in Presto. (I have not created a .prestoPermissions file yet, as it is not clear to me what its content should be.) This does not seem to work; any suggestions would be helpful.

I will keep experimenting and if I have any specific questions I will post on this thread. Thanks again for your help and quick responses.

George

Dain Sundstrom

Nov 14, 2016, 10:56:53 PM
to presto...@googlegroups.com

> On Nov 14, 2016, at 7:50 PM, George Papageorgiou <gpa...@gmail.com> wrote:
>
> I've created a .prestoSchema (no .prestoPermissions file yet as it is not clear to me what the content of that file should be) Json file with ownerName (set to the owner of the HDFS file system) and ownerType (set to USER) which I uploaded to the directory specified in hive.metastore.catalog.dir hoping that this would be enough to make that directory visible as a database in Presto. This does not seem to work, if you have any suggestion it would be helpful.
>
> I will keep experimenting and if I have any specific questions I will post on this thread. Thanks again for your help and quick responses.

This system is not really designed for users to create .prestoSchema files manually. Instead you should use the Presto DDL commands to create the schemas and tables you want.

-dain
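
As a rough illustration of that advice, the DDL route could look like the following from the Presto CLI. The schema and table names are hypothetical, "file" is the catalog name used earlier in the thread, and the statements are standard Presto SQL rather than anything specific to this branch.

```sql
-- Hypothetical schema and table names; "file" is the catalog from this thread.
CREATE SCHEMA file.demo;

-- Let Presto write the metastore metadata instead of hand-editing .prestoSchema.
CREATE TABLE file.demo.events (
    id bigint,
    name varchar
);

INSERT INTO file.demo.events VALUES (1, 'first'), (2, 'second');

SHOW TABLES FROM file.demo;
```

With this approach Presto itself manages whatever metadata files the file metastore keeps under hive.metastore.catalog.dir.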

George Papageorgiou

Nov 15, 2016, 12:46:11 AM
to Presto
Things seem to be working now. I'm able to create schemas and tables and also insert data into these tables (using Presto DDL).

Thanks again for all your help. Do you have an idea of when this might be merged into the main branch and be part of the distribution?

George