Hive on Alluxio (No FileSystem for scheme: alluxio)

Christian Rehm

Sep 12, 2017, 9:57:50 AM
to Alluxio Users

Hi guys,

I have alluxio running on the master node of my AWS EMR cluster. My current task is to create Hive External Tables from data in Alluxio.

I have followed the Alluxio documentation and have:

  • added export HIVE_AUX_JARS_PATH=/home/hadoop/alluxio-1.5.0-hadoop-2.7/client/default/alluxio-1.5.0-default-client.jar to hive-env.sh in /etc/hive/conf
  • added <property><name>fs.defaultFS</name><value>alluxio://hostname:19998</value></property> to hive-site.xml in /etc/hive/conf
  • as well as <property><name>fs.hdfs.impl</name><value>org.apache.hadoop.hdfs.DistributedFileSystem</value></property>
  • changed the fs.defaultFS property to <value>alluxio://hostname:19998</value> and added fs.alluxio.impl, fs.AbstractFileSystem.alluxio.impl and fs.alluxio-ft.impl (as stated somewhere in the docs) in core-site.xml in /etc/hadoop/conf
  • added export HADOOP_CLASSPATH=/home/hadoop/alluxio-1.5.0-hadoop-2.7/core/client/target/alluxio-1.5.0-hadoop-client.jar:${HADOOP_CLASSPATH} to hadoop-env.sh in /etc/hadoop/conf
  • removed <property> <name>hive.execution.engine</name> <value>tez</value> </property> from hive-site.xml in /etc/hive/conf
  • copied alluxio-1.5.0-default-client.jar to /usr/lib/hive/lib
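
A quick way to confirm that the three Alluxio keys listed above really ended up in core-site.xml is to parse the file and report whatever is missing. The following is only a minimal Python sketch under assumptions: the helper names and the sample XML are invented for illustration, and the value alluxio.hadoop.FileSystem is the class name the Alluxio 1.x docs appear to use for fs.alluxio.impl, so verify it against your version.

```python
import xml.etree.ElementTree as ET

# Keys the post says were added to core-site.xml.
REQUIRED_KEYS = (
    "fs.alluxio.impl",
    "fs.alluxio-ft.impl",
    "fs.AbstractFileSystem.alluxio.impl",
)


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_keys(xml_text, required=REQUIRED_KEYS):
    """List the required keys that are absent or have an empty value."""
    props = read_hadoop_properties(xml_text)
    return [key for key in required if not props.get(key)]


# Illustrative sample only; the class name is an assumption (see above).
SAMPLE = """<configuration>
  <property><name>fs.defaultFS</name><value>alluxio://hostname:19998</value></property>
  <property><name>fs.alluxio.impl</name><value>alluxio.hadoop.FileSystem</value></property>
</configuration>"""

if __name__ == "__main__":
    # Prints the keys that the sample config does not define.
    print(missing_keys(SAMPLE))
```

For a real file, the text could be read with open("/etc/hadoop/conf/core-site.xml").read() and passed to missing_keys. An empty list only shows that the keys are present in that file, not that every Hive process has reloaded it.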

I can start Hive and it seems to sync properly with Alluxio, but once I try to execute something in Hive, for example create a table:

from docs:
hive> CREATE TABLE u_user (
userid INT,
age INT,
gender CHAR(1),
occupation STRING,
zipcode STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
I get the following error:

$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/alluxio-1.5.0-default-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/alluxio-1.5.0-hadoop-2.7/client/default/alluxio-1.5.0-default-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/09/12 13:54:27 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.dist/hive-site.xml

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: false
17/09/12 13:54:30 INFO SessionState:
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: false
17/09/12 13:54:31 INFO hive.metastore: Trying to connect to metastore with URI thrift://ip-172-26-139-20.eu-central-1.compute.internal:9083
17/09/12 13:54:31 INFO hive.metastore: Opened a connection to metastore, current connections: 1
17/09/12 13:54:31 INFO hive.metastore: Connected to metastore.
17/09/12 13:54:31 INFO metrics.MetricsSystem: Starting sinks with config: {}.
17/09/12 13:54:31 INFO hadoop.HadoopConfigurationUtils: Loading Alluxio properties from Hadoop configuration: {}
17/09/12 13:54:31 INFO alluxio.AbstractClient: Alluxio client (version 1.5.0) is trying to connect with FileSystemMasterClient @ /172.26.139.20:19998
17/09/12 13:54:31 INFO alluxio.AbstractClient: Client registered with FileSystemMasterClient @ ip-172-26-139-20.eu-central-1.compute.internal/172.26.139.20:19998
17/09/12 13:54:31 INFO alluxio.AbstractClient: Alluxio client (version 1.5.0) is trying to connect with FileSystemMasterClient @ /172.26.139.20:19998
17/09/12 13:54:31 INFO alluxio.AbstractClient: Client registered with FileSystemMasterClient @ ip-172-26-139-20.eu-central-1.compute.internal/172.26.139.20:19998
17/09/12 13:54:31 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/e017f23a-62db-46bb-95d7-2ba6370a0999
17/09/12 13:54:31 INFO session.SessionState: Created local directory: /mnt/tmp/hadoop/e017f23a-62db-46bb-95d7-2ba6370a0999
17/09/12 13:54:31 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/e017f23a-62db-46bb-95d7-2ba6370a0999/_tmp_space.db
17/09/12 13:54:31 INFO conf.HiveConf: Using the default value passed in for log id: e017f23a-62db-46bb-95d7-2ba6370a0999
17/09/12 13:54:31 INFO session.SessionState: Updating thread name to e017f23a-62db-46bb-95d7-2ba6370a0999 main
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
17/09/12 13:54:31 INFO CliDriver: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> CREATE TABLE u_user (
   > userid INT,
   > age INT,
   > gender CHAR(1),
   > occupation STRING,
   > zipcode STRING)
   > ROW FORMAT DELIMITED
   > FIELDS TERMINATED BY '|'
   > STORED AS TEXTFILE;
17/09/12 13:54:39 INFO conf.HiveConf: Using the default value passed in for log id: e017f23a-62db-46bb-95d7-2ba6370a0999
17/09/12 13:54:39 INFO ql.Driver: Compiling command(queryId=hadoop_20170912135439_a79fabb5-fefe-4722-8d99-70c2be33d45d): CREATE TABLE u_user (
userid INT,
age INT,
gender CHAR(1),
occupation STRING,
zipcode STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
17/09/12 13:54:40 INFO parse.CalcitePlanner: Starting Semantic Analysis
17/09/12 13:54:40 INFO parse.CalcitePlanner: Creating table default.u_user position=13
17/09/12 13:54:40 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=e017f23a-62db-46bb-95d7-2ba6370a0999, clientType=HIVECLI]
17/09/12 13:54:40 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
17/09/12 13:54:40 INFO hive.metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
17/09/12 13:54:40 INFO hive.metastore: Closed a connection to metastore, current connections: 0
17/09/12 13:54:40 INFO hive.metastore: Trying to connect to metastore with URI thrift://ip-172-26-139-20.eu-central-1.compute.internal:9083
17/09/12 13:54:40 INFO hive.metastore: Opened a connection to metastore, current connections: 1
17/09/12 13:54:40 INFO hive.metastore: Connected to metastore.
17/09/12 13:54:40 INFO ql.Driver: Semantic Analysis Completed
17/09/12 13:54:40 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
17/09/12 13:54:40 INFO ql.Driver: Completed compiling command(queryId=hadoop_20170912135439_a79fabb5-fefe-4722-8d99-70c2be33d45d); Time taken: 0.99 seconds
17/09/12 13:54:40 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
17/09/12 13:54:40 INFO ql.Driver: Executing command(queryId=hadoop_20170912135439_a79fabb5-fefe-4722-8d99-70c2be33d45d): CREATE TABLE u_user (
userid INT,
age INT,
gender CHAR(1),
occupation STRING,
zipcode STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
17/09/12 13:54:40 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
17/09/12 13:54:40 INFO exec.DDLTask: creating table default.u_user on alluxio://172.26.139.20:19998/user/hive/warehouse/u_user
17/09/12 13:54:40 ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: java.io.IOException No FileSystem for scheme: alluxio)
       at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:842)
       at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:847)
       at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3992)
       at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:332)
       at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
       at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
       at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
       at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
       at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
       at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: MetaException(message:Got exception: java.io.IOException No FileSystem for scheme: alluxio)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:41498)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:41466)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:41392)
       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1183)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1169)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2334)
       at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:93)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:747)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:735)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
       at com.sun.proxy.$Proxy21.createTable(Unknown Source)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2265)
       at com.sun.proxy.$Proxy21.createTable(Unknown Source)
       at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:832)
       ... 22 more

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.io.IOException No FileSystem for scheme: alluxio)
17/09/12 13:54:40 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.io.IOException No FileSystem for scheme: alluxio)
17/09/12 13:54:40 INFO ql.Driver: Completed executing command(queryId=hadoop_20170912135439_a79fabb5-fefe-4722-8d99-70c2be33d45d); Time taken: 0.136 seconds
17/09/12 13:54:40 INFO conf.HiveConf: Using the default value passed in for log id: e017f23a-62db-46bb-95d7-2ba6370a0999
17/09/12 13:54:40 INFO session.SessionState: Resetting thread name to  main

I really hope someone can help me fix this issue. It's probably something ridiculous, as it is most of the time.

Thanks in advance!


Best
Chris

Christian Rehm

Sep 12, 2017, 10:34:01 AM
to Alluxio Users
Fixed it! The problem was that I had only been restarting the hive-server2 process after each change. Now I have restarted hive-hcatalog-server as well, and it finally works! :)
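
This matches the stack trace in the first post: the MetaException arrives through the Thrift metastore client, which suggests the failed lookup of the alluxio scheme happened inside the metastore process and not in the Hive CLI. That process therefore needs the client jar on its classpath and a restart after the change. As a hedged aid, the Python sketch below checks whether a given jar contains the Hadoop-compatible Alluxio class. The entry name alluxio/hadoop/FileSystem.class is an assumption based on the usual fs.alluxio.impl value, and the demo jars are throwaway files created only for illustration.

```python
import os
import tempfile
import zipfile

# Assumed entry for the class behind fs.alluxio.impl; verify for your version.
CLASS_ENTRY = "alluxio/hadoop/FileSystem.class"


def jar_has_class(jar_path, class_entry=CLASS_ENTRY):
    """Return True if the jar (a zip archive) contains the given class entry."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry in jar.namelist()


def demo():
    """Build two throwaway jars and check each one for the class entry."""
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "with-client.jar")
        bad = os.path.join(tmp, "without-client.jar")
        with zipfile.ZipFile(good, "w") as jar:
            jar.writestr(CLASS_ENTRY, b"")  # empty placeholder, not a real class
        with zipfile.ZipFile(bad, "w") as jar:
            jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        return jar_has_class(good), jar_has_class(bad)


if __name__ == "__main__":
    print(demo())
```

On the cluster, jar_has_class could be pointed at the jar copied to /usr/lib/hive/lib. A True result only confirms the jar's contents; the metastore still has to be restarted to load it.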

Gene Pang

Sep 12, 2017, 11:05:58 AM
to Alluxio Users
Hi Chris,

Glad you got it resolved!

Thanks,
Gene

Weizhan Zeng

Sep 13, 2017, 9:38:03 PM
to Alluxio Users
Add the conf and the jar on the Hive metastore as well.

On Tuesday, September 12, 2017 at 9:57:50 PM UTC+8, Christian Rehm wrote:

Bin Fan

Sep 15, 2017, 2:21:51 PM
to Alluxio Users
One possible way to solve this is to add
--conf spark.sql.hive.metastore.sharedPrefixes=com.mysql.jdbc,org.postgresql,com.microsoft.sqlserver,oracle.jdbc,alluxio
when launching spark-shell.

More details can be found in my doc fix: https://github.com/Alluxio/alluxio/pull/6155
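
Since the option is easy to mistype, here is a small Python sketch that assembles the spark-shell command line so the configuration key appears exactly once. It is only an illustration: the function name is hypothetical, the prefix list is copied from the message above, and nothing here is taken from the linked pull request.

```python
import shlex

# Prefix list quoted from the message; "alluxio" is the added entry.
SHARED_PREFIXES = [
    "com.mysql.jdbc",
    "org.postgresql",
    "com.microsoft.sqlserver",
    "oracle.jdbc",
    "alluxio",
]


def spark_shell_command(prefixes=SHARED_PREFIXES):
    """Return the argv list for spark-shell with the shared-prefixes setting."""
    conf = "spark.sql.hive.metastore.sharedPrefixes=" + ",".join(prefixes)
    return ["spark-shell", "--conf", conf]


if __name__ == "__main__":
    # Print a shell-ready command line (shlex.join needs Python 3.8+).
    print(shlex.join(spark_shell_command()))
```

Keeping the prefixes in a list makes it harder to drop one of the JDBC entries by accident when adding alluxio.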