Hi all,
I'm learning CDAP by trying out the Elasticsearch plugin, since I've been working with Elasticsearch for quite a while. I installed a standalone CDAP instance in a Linux environment.
I set up a simple pipeline in Hydrator that reads a sample text data file and writes it to two sinks: snapshotText and Elasticsearch. The snapshotText sink works, but as soon as I add the Elasticsearch sink the pipeline fails.
The log file is attached; the relevant error reads as follows:
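In case it helps, the Elasticsearch sink in my pipeline is configured roughly like the sketch below. The property names are what I understand from the plugin documentation, and the host, index, and type values here are placeholders rather than my exact setup:

```json
{
  "name": "Elasticsearch",
  "plugin": {
    "name": "Elasticsearch",
    "type": "batchsink",
    "properties": {
      "referenceName": "esSink",
      "es.host": "localhost:9200",
      "es.index": "sample_index",
      "es.type": "sample_type",
      "es.idField": "id"
    }
  }
}
```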
2017-03-16 20:06:48,060 - ERROR [Executor task launch worker-0:o.a.s.e.Executor@95] - Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1967) ~[na:1.8.0_111]
at org.elasticsearch.hadoop.rest.RestClient.discoverNodes(RestClient.java:110) ~[na:na]
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:58) ~[na:na]
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:372) ~[na:na]
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173) ~[elasticsearch-hadoop-mr-2.1.0.jar:2.1.0]
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:149) ~[elasticsearch-hadoop-mr-2.1.0.jar:2.1.0]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1113) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1091) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.scheduler.Task.run(Task.scala:89) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Any help would be appreciated. Thanks!