Stormcrawler ElasticSearch 0.8 Example Problems

ad...@adamseo.co.uk

Nov 26, 2015, 3:17:19 AM
to DigitalPebble
Hi guys, I'm having a few problems getting Storm Crawler working with ES. I've followed the readme here, seeded the status index and copied the default crawler-conf.yaml file. I have also updated es-conf.yaml with my ES details and am running it with:

storm jar target/storm-crawler-elasticsearch-0.8-SNAPSHOT.jar com.digitalpebble.storm.crawler.elasticsearch.ESCrawlTopology -local -conf es-conf.yaml -conf crawler-conf.yaml

It starts fine but dies right after the http.robots.agents messages:


27126 [Thread-34-fetch] INFO  c.d.s.c.p.RobotRulesParser - No agents listed in 'http.robots.agents' property! Using http.agent.name [anonymous coward]
27220 [Thread-34-fetch] INFO  c.d.s.c.p.RobotRulesParser - No agents listed in 'http.robots.agents' property! Using http.agent.name [anonymous coward]
27226 [Thread-34-fetch] ERROR b.s.util - Async loop died!
java.lang.NoClassDefFoundError: org/apache/storm/guava/collect/Iterables
    at com.digitalpebble.storm.crawler.bolt.FetcherBolt$FetchItemQueues.<init>(FetcherBolt.java:270) ~[storm-crawler-elasticsearch-0.8-SNAPSHOT.jar:?]
    at com.digitalpebble.storm.crawler.bolt.FetcherBolt.prepare(FetcherBolt.java:702) ~[storm-crawler-elasticsearch-0.8-SNAPSHOT.jar:?]
    at backtype.storm.daemon.executor$fn__5694$fn__5707.invoke(executor.clj:757) ~[storm-core-0.10.0.jar:0.10.0]
    at backtype.storm.util$async_loop$fn__545.invoke(util.clj:477) [storm-core-0.10.0.jar:0.10.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_85]
Caused by: java.lang.ClassNotFoundException: org.apache.storm.guava.collect.Iterables
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[?:1.7.0_85]
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[?:1.7.0_85]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.7.0_85]
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[?:1.7.0_85]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[?:1.7.0_85]
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) ~[?:1.7.0_85]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[?:1.7.0_85]
    ... 6 more

27235 [Thread-34-fetch] ERROR b.s.d.executor -
java.lang.NoClassDefFoundError: org/apache/storm/guava/collect/Iterables
    at com.digitalpebble.storm.crawler.bolt.FetcherBolt$FetchItemQueues.<init>(FetcherBolt.java:270) ~[storm-crawler-elasticsearch-0.8-SNAPSHOT.jar:?]
    at com.digitalpebble.storm.crawler.bolt.FetcherBolt.prepare(FetcherBolt.java:702) ~[storm-crawler-elasticsearch-0.8-SNAPSHOT.jar:?]
    at backtype.storm.daemon.executor$fn__5694$fn__5707.invoke(executor.clj:757) ~[storm-core-0.10.0.jar:0.10.0]
    at backtype.storm.util$async_loop$fn__545.invoke(util.clj:477) [storm-core-0.10.0.jar:0.10.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_85]
Caused by: java.lang.ClassNotFoundException: org.apache.storm.guava.collect.Iterables
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[?:1.7.0_85]
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[?:1.7.0_85]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.7.0_85]
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[?:1.7.0_85]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[?:1.7.0_85]
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) ~[?:1.7.0_85]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[?:1.7.0_85]
    ... 6 more

27331 [Thread-34-fetch] ERROR b.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
    at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:336) [storm-core-0.10.0.jar:0.10.0]
    at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.6.0.jar:?]
    at backtype.storm.daemon.worker$fn__7184$fn__7185.invoke(worker.clj:532) [storm-core-0.10.0.jar:0.10.0]
    at backtype.storm.daemon.executor$mk_executor_data$fn__5523$fn__5524.invoke(executor.clj:261) [storm-core-0.10.0.jar:0.10.0]
    at backtype.storm.util$async_loop$fn__545.invoke(util.clj:489) [storm-core-0.10.0.jar:0.10.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_85]

27345 [Thread-21-indexer] INFO  o.e.plugins - [Asp] loaded [], sites []
27349 [Thread-32-status] INFO  o.e.plugins - [Ardina] loaded [], sites []
27347 [Thread-12-__metricscom.digitalpebble.storm.crawler.elasticsearch.metrics.MetricsConsumer] INFO  o.e.plugins - [Chan Luichow] loaded [], sites []
27346 [Thread-16-spout] INFO  o.e.plugins - [Sigmar] loaded [], sites []


Any ideas what could be causing this?


Thanks


DigitalPebble

Nov 26, 2015, 4:13:13 AM
to digita...@googlegroups.com
Hi Adam

Did you declare the dependency on Storm 0.10 in the pom? The master branch is still on 0.9.5; we will move to 0.10 soon, I think.

Hope this helps

Julien
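
For anyone hitting the same trace: the log shows the topology jar asking for org.apache.storm.guava.collect.Iterables while running on storm-core-0.10.0.jar, which suggests the jar was built against a different Storm version than the one executing it. A rough way to confirm this is to check whether the class is loadable from a given classpath. This is only a sketch; the ClasspathCheck class and its isLoadable helper are made up for illustration and are not part of Storm or StormCrawler.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by this class's
     * class loader. The class is looked up without being initialized.
     */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but one of its own dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.storm.guava.collect.Iterables";
        System.out.println(name
                + (isLoadable(name) ? " is on the classpath" : " is NOT on the classpath"));
    }
}
```

Compile it, then run it with the Storm jar you actually execute against on the classpath (for example java -cp <path to your storm-core jar>:. ClasspathCheck, where the jar location depends on your install). If the class is reported missing, the pom and the runtime Storm version probably disagree.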



ad...@adamseo.co.uk

Nov 26, 2015, 4:23:23 PM
to DigitalPebble, jul...@digitalpebble.com
Thanks Julien. Started again with the 0.10 branch and no problems here.

Regards
-Adam