Issue while writing data to Hive

chaitanya ekre

Feb 9, 2018, 12:22:20 AM
to cascading-user
Hello,
I am facing an issue when my Cascading program tries to write data from a CSV file to a HivePartitionTap. The program runs fine but logs the following exception.
One more question: how can Kerberos authentication for the Hive metastore be used with Cascading?

My code -
                    

HiveTableDescriptor partitionedDescriptor = new HiveTableDescriptor(
        databaseName, tableName, columnNames, columnTypes,
        new String[]{"partitionKey"}, "\t",
        HiveTableDescriptor.HIVE_DEFAULT_SERIALIZATION_LIB_NAME,
        new Path("/hive_data/" + databaseName + ".db/" + tableName + "/"));

HiveTap hiveTap = new HiveTap(partitionedDescriptor, partitionedDescriptor.toScheme(), SinkMode.REPLACE, false);

Tap partitionTap = new HivePartitionTap(hiveTap, SinkMode.UPDATE);

FlowDef flowDef = FlowDef.flowDef()
        .addSource(processPipe, inputtap)
        .addTailSink(processPipe, partitionTap);

Properties properties = AppProps.appProps().setName("DataProcessing").buildProperties();
properties.setProperty("mapred.reduce.tasks", "4");
properties.setProperty("mapred.map.tasks", "4");
properties.setProperty("hive.metastore.uris", "thrift://slave1:9083");

Flow flow = new Hadoop2MR1FlowConnector(properties).connect(flowDef);
flow.complete();
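
On the Kerberos question, a minimal sketch under stated assumptions: Hive's client-side settings for a secured metastore are hive.metastore.sasl.enabled and hive.metastore.kerberos.principal, and the properties object built above looks like the natural place to carry them. Whether cascading-hive forwards properties set this way to its metastore client is an assumption that has not been verified here. The class and method names are hypothetical, and the principal is a placeholder.

```java
import java.util.Properties;

// Hypothetical helper, not part of Cascading or Hive.
class SecureMetastoreProps {

    // Returns a copy of `base` with the Hive client settings for a
    // Kerberos-secured metastore added. `metastorePrincipal` is the service
    // principal of the metastore; the value used in main() is a placeholder.
    static Properties withKerberos(Properties base, String metastorePrincipal) {
        Properties result = new Properties();
        result.putAll(base);
        result.setProperty("hive.metastore.sasl.enabled", "true");
        result.setProperty("hive.metastore.kerberos.principal", metastorePrincipal);
        return result;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("hive.metastore.uris", "thrift://slave1:9083");

        // Assumption: these properties would then be passed to the flow connector.
        Properties secured = withKerberos(base, "hive/_HOST@EXAMPLE.COM");
        secured.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The submitting process would also need valid Kerberos credentials, for example from kinit or a keytab login through Hadoop's UserGroupInformation; that part is not shown.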

Exception -



2018-02-09 00:07:39,713 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://slave1:9083
2018-02-09 00:07:39,744 INFO [main] hive.metastore: Connected to metastore.
2018-02-09 00:07:40,028 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://slave1:9083
2018-02-09 00:07:40,040 INFO [main] hive.metastore: Connected to metastore.
2018-02-09 00:07:40,196 INFO [main] cascading.tap.hive.HiveTap: creating table 'r_45903_1711' at '/hive_data/dev_io.db/R_45903_1711' 
2018-02-09 00:07:40,318 INFO [main] cascading.tap.hadoop.io.TapOutputCollector: closing tap collector for: /hive_data/dev_io.db/R_45903_1711/groupkey=A9/part-00008-00000
2018-02-09 00:07:40,334 INFO [main] cascading.tap.hadoop.util.Hadoop18TapUtil: committing task: 'attempt_1517987050954_0052_r_000008_0' - hdfs://slave2:8020/hive_data/dev_io.db/R_45903_1711/_temporary/_attempt_1517987050954_0052_r_000008_0
2018-02-09 00:07:40,381 INFO [main] cascading.tap.hadoop.util.Hadoop18TapUtil: saved output of task 'attempt_1517987050954_0052_r_000008_0' to hdfs://slave2:8020/hive_data/dev_io.db/R_45903_1711
2018-02-09 00:07:40,382 INFO [main] cascading.flow.hadoop.FlowReducer: flow node id: E7B178794ACA4AE9965E91FE37ED90CD, mem on close (mb), free: 2243, total: 4333, max: 8085
2018-02-09 00:07:40,382 INFO [main] cascading.tap.hadoop.io.TapOutputCollector: closing tap collector for: hdfs://slave2:8020/dev_io/e_45903_1711/part-00008
2018-02-09 00:07:40,386 INFO [main] cascading.tap.hadoop.io.TapOutputCollector: closing tap collector for: hdfs://slave2:8020/dev_io/e_logs_45903_1711/part-00008
2018-02-09 00:07:40,389 INFO [main] cascading.flow.hadoop.FlowReducer: flow node id: E7B178794ACA4AE9965E91FE37ED90CD, mem on close (mb), free: 2243, total: 4333, max: 8085
2018-02-09 00:07:40,390 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : cascading.CascadingException: java.io.IOException: AlreadyExistsException(message:Table r_45903_1711 already exists)
    at cascading.tap.hive.HivePartitionTap$HivePartitionCollector.closeCollector(HivePartitionTap.java:156)
    at cascading.tap.partition.BasePartitionTap$PartitionCollector.close(BasePartitionTap.java:189)
    at cascading.flow.stream.element.SinkStage.cleanup(SinkStage.java:129)
    at cascading.flow.stream.graph.StreamGraph.cleanup(StreamGraph.java:187)
    at cascading.flow.hadoop.FlowReducer.close(FlowReducer.java:172)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.io.IOException: AlreadyExistsException(message:Table r_45903_1711 already exists)
    at cascading.tap.hive.HiveTap.createHiveTable(HiveTap.java:187)
    at cascading.tap.hive.HiveTap.registerPartition(HiveTap.java:347)
    at cascading.tap.hive.HivePartitionTap$HivePartitionCollector.closeCollector(HivePartitionTap.java:152)
    ... 11 more
Caused by: AlreadyExistsException(message:Table r_45903_1711 already exists)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:26984)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:26970)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:26896)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:991)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:977)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:1968)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:658)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:646)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:91)
    at com.sun.proxy.$Proxy16.createTable(Unknown Source)
    at cascading.tap.hive.HiveTap.createHiveTable(HiveTap.java:177)
    ... 13 more
2018-02-09 00:07:40,392 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
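
One possible reading of the trace, not confirmed: the reduce task itself logs "creating table" and then fails inside HiveTap.createHiveTable, called from HiveTap.registerPartition. That would be consistent with several reduce tasks each finding the table absent and then racing to create it, so every task except the first gets AlreadyExistsException. The sketch below reproduces that interleaving with a hypothetical in-memory FakeMetastore (not the Hive API) and shows a tolerant variant that treats "already exists" as success.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a metastore; not the Hive API.
class FakeMetastore {
    static class AlreadyExists extends Exception {
        AlreadyExists(String table) {
            super("Table " + table + " already exists");
        }
    }

    private final Set<String> tables = new HashSet<>();

    boolean tableExists(String name) {
        return tables.contains(name);
    }

    // Like a metastore create call, this fails if the table is already present.
    void createTable(String name) throws AlreadyExists {
        if (!tables.add(name)) {
            throw new AlreadyExists(name);
        }
    }
}

class CreateTableRace {

    // Replays the suspected interleaving of two tasks A and B in sequence:
    // both check before either creates, so the second create fails.
    static boolean interleavedNaiveFails(FakeMetastore ms, String table) {
        boolean aSeesMissing = !ms.tableExists(table);
        boolean bSeesMissing = !ms.tableExists(table);
        try {
            if (aSeesMissing) ms.createTable(table);
            if (bSeesMissing) ms.createTable(table);
            return false;
        } catch (FakeMetastore.AlreadyExists e) {
            return true;
        }
    }

    // Tolerant variant: losing the race is treated as success.
    static void tolerantRegister(FakeMetastore ms, String table) {
        try {
            ms.createTable(table);
        } catch (FakeMetastore.AlreadyExists e) {
            // Another task created the table first; nothing left to do.
        }
    }

    public static void main(String[] args) {
        System.out.println("naive check-then-create fails: "
                + interleavedNaiveFails(new FakeMetastore(), "r_45903_1711"));

        FakeMetastore ms = new FakeMetastore();
        tolerantRegister(ms, "r_45903_1711");
        tolerantRegister(ms, "r_45903_1711");
        System.out.println("table exists after tolerant calls: "
                + ms.tableExists("r_45903_1711"));
    }
}
```

If this reading is right, making sure the table exists before the flow starts, so that no reduce task has to create it, would be one thing to try.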

