Using elasticsearch plugin


Ray C

Mar 17, 2017, 11:13:39 AM
to CDAP User
Hi guys,

I'm learning CDAP by trying out the Elasticsearch plugin, since I have been working with Elasticsearch for quite some time. I installed the standalone version in a Linux environment.
I set up a simple pipeline using Hydrator that reads a sample text data file and writes it to two sinks: snapshotText and Elasticsearch. The snapshotText sink works, but as soon as I add the Elasticsearch sink the pipeline fails.
The log file is attached. The relevant error is:

2017-03-16 20:06:48,060 - ERROR [Executor task launch worker-0:o.a.s.e.Executor@95] - Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
	at java.lang.String.substring(String.java:1967) ~[na:1.8.0_111]
	at org.elasticsearch.hadoop.rest.RestClient.discoverNodes(RestClient.java:110) ~[na:na]
	at org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:58) ~[na:na]
	at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:372) ~[na:na]
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173) ~[elasticsearch-hadoop-mr-2.1.0.jar:2.1.0]
	at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:149) ~[elasticsearch-hadoop-mr-2.1.0.jar:2.1.0]
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1113) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1091) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.scheduler.Task.run(Task.scala:89) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) ~[co.cask.cdap.spark-assembly-1.6.1.jar:na]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]

Any help will be appreciated. Thanks!
pipeline_es.log

Ray C

Mar 17, 2017, 11:22:38 AM
to CDAP User

FYI, Elasticsearch is running and working on the same server, and the configuration of the Elasticsearch plugin is below.


Vinisha Vyasa

Mar 17, 2017, 3:10:02 PM
to CDAP User
Hey Ray,

From the pipeline logs, it looks like you are using Elasticsearch 2.3.2. We support Elasticsearch 1.6.0 and 5.2.2. Could you please try one of these two versions?

Thanks,
Vinisha
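
One way to confirm which Elasticsearch version a cluster is running is to read the `version.number` field from the JSON body that a node returns for `GET /`. The sketch below only parses such a body and compares it with the two versions named in the reply above; the sample body is hypothetical, and fetching it from a live node (for example from `http://localhost:9200/`, an assumed address) is left out.

```python
import json

# Versions named as supported in the reply above.
SUPPORTED_VERSIONS = {"1.6.0", "5.2.2"}


def cluster_version(root_response: str) -> str:
    """Return version.number from the JSON body of GET / on an Elasticsearch node."""
    return json.loads(root_response)["version"]["number"]


# Hypothetical, trimmed-down response body; a real node returns more fields.
sample_body = '{"name": "node-1", "version": {"number": "2.3.2"}}'

version = cluster_version(sample_body)
status = "listed as supported" if version in SUPPORTED_VERSIONS else "not listed as supported"
print(version, status)
```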

--
You received this message because you are subscribed to the Google Groups "CDAP User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cdap-user+unsubscribe@googlegroups.com.
To post to this group, send email to cdap...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cdap-user/54700903-8bc4-4dc9-9ba6-2233ebebc064%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Vinisha Vyasa

Mar 17, 2017, 4:23:33 PM
to CDAP User
Hey Ray,

To add to my previous reply: the Elasticsearch-1.6.0 plugin is compatible with Elasticsearch 1.6.0. In your case, you can instead use the Elasticsearch-1.6.0-5.2.2 plugin (which can be installed through Cask Market) to connect to Elasticsearch 2.3.2. I have verified this by running Elasticsearch 2.3.2 on localhost with the Elasticsearch-1.6.0-5.2.2 plugin in the pipeline. Could you please try it and let us know if you see any issues?


Inline image 3


Inline image 1
Thanks,
Vinisha
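
The plugin choice described in this reply can be sketched as a small lookup. The plugin names come from the thread; the helper names are hypothetical, and treating the "1.6.0-5.2.2" in the plugin name as a version range is an assumption, since only Elasticsearch 2.3.2 was reported as verified here.

```python
def parse_version(text):
    """Turn a dotted version such as '2.3.2' into a comparable tuple (2, 3, 2)."""
    return tuple(int(part) for part in text.split("."))


def pick_plugin(es_version):
    """Suggest a plugin name for the given Elasticsearch version, or None if unknown."""
    version = parse_version(es_version)
    if version == (1, 6, 0):
        # Stated in the thread: this plugin is compatible with Elasticsearch 1.6.0.
        return "Elasticsearch-1.6.0"
    if (1, 6, 0) < version <= (5, 2, 2):
        # Assumption: the plugin name denotes a supported range.
        # Only 2.3.2 was confirmed working in the thread.
        return "Elasticsearch-1.6.0-5.2.2"
    return None


print(pick_plugin("2.3.2"))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "10.0.0" sorting before "2.3.2".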

Ray C

Mar 17, 2017, 5:29:54 PM
to CDAP User
Hi Vinisha,

Yes it works like a charm with that version. Thanks a lot! 