Cannot insert into Hive Partitioned Table from Presto


Martin Ciruzzi

6 Oct 2017, 16:26:56
to Presto

Hi

Do you know of any issue with inserting data into a partitioned Hive table? Inserting into a non-partitioned table works without problems, but inserting into a partitioned one raises a "Failed connecting to Hive metastore" exception. Maybe the nodes are trying to connect to the metastore to write their data, and only the master has the metastore running?


hive -e "CREATE TABLE melidata.mciruzzi_test_presto_partitions ( usr string, path string, fecha string)   PARTITIONED BY (ds string) row format delimited fields terminated by ',' lines terminated by '\n'  STORED AS TEXTFILE LOCATION 's3n://melidata-results-batch/mciruzzi/tmp/mciruzzi_test_presto_partitions/'"

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: true
OK
Time taken: 3.231 seconds
[hadoop@ip-10-64-61-29 ~]$ hive -e "CREATE TABLE melidata.mciruzzi_test_presto ( usr string, path string, fecha string, ds string)    row format delimited fields terminated by ',' lines terminated by '\n'  STORED AS TEXTFILE LOCATION 's3n://melidata-results-batch/mciruzzi/tmp/mciruzzi_test_presto/'"

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: true
OK
Time taken: 4.039 seconds
[hadoop@ip-10-64-61-29 ~]$ presto-cli --server localhost:8889 --catalog hive --schema melidata
presto:melidata>
presto:melidata> insert into melidata.mciruzzi_test_presto select 'usr','path','fecha','ds';
INSERT: 1 row

Query 20171006_201541_00108_242yz, FINISHED, 101 nodes
Splits: 1,734 total, 1,734 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

presto:melidata> insert into melidata.mciruzzi_test_presto_partitions select 'usr','path','fecha','ds';

Query 20171006_201628_00112_242yz, FAILED, 102 nodes
Splits: 1,733 total, 1,239 done (71.49%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20171006_201628_00112_242yz failed: Failed connecting to Hive metastore: [localhost:9083]



Exception:

2017-10-06T19:57:36.142Z ERROR remote-task-callback-1179 com.facebook.presto.execution.StageStateMachine Stage 20171006_195735_00069_242yz.1 failed
com.facebook.presto.spi.PrestoException: Failed connecting to Hive metastore: [localhost:9083]
at com.facebook.presto.hive.StaticHiveCluster.createMetastoreClient(StaticHiveCluster.java:81)
at com.facebook.presto.hive.metastore.ThriftHiveMetastore.lambda$getPartition$18(ThriftHiveMetastore.java:552)
at com.facebook.presto.hive.metastore.HiveMetastoreApiStats.lambda$wrap$0(HiveMetastoreApiStats.java:42)
at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138)
at com.facebook.presto.hive.metastore.ThriftHiveMetastore.getPartition(ThriftHiveMetastore.java:551)
at com.facebook.presto.hive.metastore.BridgingHiveMetastore.getPartition(BridgingHiveMetastore.java:188)
at com.facebook.presto.hive.metastore.CachingHiveMetastore.loadPartitionByName(CachingHiveMetastore.java:486)
at com.facebook.presto.hive.metastore.CachingHiveMetastore.access$700(CachingHiveMetastore.java:58)
at com.facebook.presto.hive.metastore.CachingHiveMetastore$8.load(CachingHiveMetastore.java:189)
at com.facebook.presto.hive.metastore.CachingHiveMetastore$8.load(CachingHiveMetastore.java:184)
at com.google.common.cache.CacheLoader$1.load(CacheLoader.java:189)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
at com.google.common.cache.LocalCache.get(LocalCache.java:3937)

Martin Ciruzzi

9 Oct 2017, 15:15:05
to Presto

When inserting into a partitioned table, it seems every node writes part of the results. So it's important to configure the hive.metastore.uri property on every node so that it points to the metastore node. (In my case EMR installs the metastore on our master node, and the master was the only node that could reach it while the property was set to localhost:9083.) :P
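
For anyone hitting the same error, here is a minimal sketch of a check that could be run on each node to see where hive.metastore.uri points. The file path is an assumption (adjust it to wherever your Presto catalog files live), and the helper names metastore_uri and uri_is_remote are made up for illustration; they are not part of Presto or EMR.

```shell
# Hypothetical helper: print the hive.metastore.uri value from a catalog file.
metastore_uri() {
  # $1: path to the Hive catalog properties file
  sed -n 's/^[[:space:]]*hive\.metastore\.uri[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Hypothetical helper: succeed only if the URI is set and does not point at the local host.
uri_is_remote() {
  case "$1" in
    ''|*://localhost:*|*://127.0.0.1:*) return 1 ;;
    *) return 0 ;;
  esac
}

# Assumed location of the Hive catalog file; change it for your installation.
conf=/etc/presto/conf/catalog/hive.properties
if [ -f "$conf" ]; then
  uri=$(metastore_uri "$conf")
  if uri_is_remote "$uri"; then
    echo "OK: $uri"
  else
    echo "WARNING: hive.metastore.uri is '$uri'; this only works on the node that runs the metastore"
  fi
fi
```

On a worker that prints the warning, the property would need to name the metastore host explicitly, for example thrift://<metastore-host>:9083, where <metastore-host> is a placeholder for your master node's address.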

David Phillips

9 Oct 2017, 22:20:16
to presto...@googlegroups.com
Yes, every machine needs to check whether the target partition exists so that it can write files in the correct format. The partitions aren’t known in advance, since they are based on the data being written.