Carol Chapman
Oct 11, 2022, 8:06:54 AM
to MR3
Hi.
We currently need to use EsStorageHandler to write data from Hive to Elasticsearch. When I use EsStorageHandler with Hive on MR3, I see some very strange behavior.
Here are the SQL statements I execute:
add jar hdfs:///user/hive/resource/jars/elasticsearch-hadoop-6.3.2.jar;
add jar hdfs:///user/hive/resource/jars/commons-httpclient-3.1.jar;
CREATE EXTERNAL TABLE test.test01
(
c1 string,
c2 string,
c3 string,
c4 string,
c5 string,
c6 string,
c7 string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
'es.nodes' = '192.168.xx,192.168.xx',
'es.port' = '9200',
'es.index.read.missing.as.empty' = 'true',
'es.resource' = 'index_crowd_uniid_mapping/_doc',
'es.nodes.wan.only' = 'true',
'es.index.auto.create' = 'false',
'es.read.metadata' = 'true',
'es.mapping.names' = 'c1:c1,c2:c2, c3:c3, c4:c4, c5:c5,c6:c6,c7:c7',
'es.scroll.size' = '5000'
);
insert overwrite table test.test01
select cd.c1, 'static1' as c2, cd.c3 as c3, mp.c4, mp.c5, mp.c6, from_unixtime(unix_timestamp(),'yyyy-MM-dd HH:mm:ss') as c7
from test.test02 as cd
inner join test.test03 as mp on cd.c1=mp.c2
where cd.c1='008185' and mp.c3='qiushi6' and length(mp.s_uni_id)>0 and length(mp.c4)>0;
The first time I execute the INSERT statement, the data is written successfully.
But when I execute the same statement a second time, I get the following exception:
Caused by: java.lang.NoClassDefFoundError: org/elasticsearch/hadoop/hive/HiveValueWriter
at org.elasticsearch.hadoop.hive.EsHiveOutputFormat.getHiveRecordWriter(EsHiveOutputFormat.java:88)
at org.elasticsearch.hadoop.hive.EsHiveOutputFormat.getHiveRecordWriter(EsHiveOutputFormat.java:42)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:772)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:723)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:889)
at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:968)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:968)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.flushOutput(VectorGroupByOperator.java:1176)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:1184)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:735)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:389)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:353)
... 10 more
When I execute the same statements on Apache Hive (HDP 3.1.4), they always succeed.
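One way to narrow down what changes between the first and second run is to check which resources the session has registered before each INSERT. This is only a minimal sketch. It assumes both runs are issued from the same session, and that Hive on MR3 handles the standard Hive `list jars` and `add jar` resource commands the same way Apache Hive does, which is not confirmed here:

```sql
-- Show the jars currently registered in this session.
-- Both elasticsearch-hadoop-6.3.2.jar and commons-httpclient-3.1.jar should be listed.
list jars;

-- If a jar is missing before the second run, register it again and retry the INSERT.
add jar hdfs:///user/hive/resource/jars/elasticsearch-hadoop-6.3.2.jar;
add jar hdfs:///user/hive/resource/jars/commons-httpclient-3.1.jar;
```

If both jars are still listed before the failing run, the session resources are probably not the cause, and the problem is more likely in how the classes are loaded on the worker side.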