YARN ingestion plugins, deployed but can't get them to work


Eric Therrien

Jan 19, 2015, 2:47:42 PM
to lum...@googlegroups.com
I followed the instructions in the YARN Plugin Build Instructions and deployed the following plugins:

csv
drewnoakes-image-metadata-extractor
email-extractor
java-code
known-entity-extractor
mime-type-ontology-mapper
opennlp-dictionary-extractor
opennlp-me-extractor
phone-number-extractor
reindex
subrip-parser
subrip-transcript
tika-mime-type
tika-text-extractor
youtube-transcript
zipcode-extractor
zipcode-resolver

The plugins are deployed inside docker->hadoop->folder /lumify/libcache. When I import a sample text file or an image into a workspace, I expect the web interface to at least recognize the MIME type. That is not the case: the file shows up in the workspace with a question mark (?) icon. I looked in the Hadoop YARN Resource Manager (http://lumify-dev:8088), but I don't see any of the plugins there. How can I verify that the YARN plugins are properly in place?

Thank you in advance.

Joe Ferner

Jan 20, 2015, 11:10:04 AM
to lum...@googlegroups.com
You can run the Graph Property Workers in two different ways: in a YARN process or in the web container.

To run in the web container, make sure you have the following config set: webAppEmbedded.graphPropertyWorkerRunner.enabled=true
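To double-check that setting, a small script along these lines could read the properties file and report whether the flag is on. This is only a sketch: the function name `embedded_gpw_enabled` is made up for illustration, and the parsing is a simplification that handles plain `key=value` lines and `#` or `!` comments, not the full Java properties syntax.

```python
KEY = "webAppEmbedded.graphPropertyWorkerRunner.enabled"


def embedded_gpw_enabled(properties_text: str) -> bool:
    """Return True if the embedded Graph Property Worker runner flag is 'true'.

    Hypothetical helper: only plain key=value lines are understood.
    If the key appears more than once, the last occurrence wins.
    """
    enabled = False
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == KEY:
            enabled = value.strip().lower() == "true"
    return enabled
```

For example, passing the contents of your lumify.properties file (`embedded_gpw_enabled(open(path).read())`, where `path` is wherever your config lives) should return True if the line above is present and not commented out.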

To run in a YARN process, you'll need to build the graph-property-worker/graph-property-worker-yarn jar and deploy it as the yarn user using "yarn jar graph-property-worker-yarn.jar -jar graph-property-worker-yarn.jar"

In both cases, your log files should contain a line like this for each of the Graph Property Workers:

"Loading io.lumify.gpw.audio.AudioMp4EncodingWorker from jar:file:/data0/yarn/local/usercache/yarn/appcache/application_1421512055477_0001/filecache/10/graph-property-worker.jar!/META-INF/services/io.lumify.core.ingest.graphProperty.GraphPropertyWorker"

After that, you'll see "Begin work on element" for each property.
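If the log is long, a rough way to check for those two kinds of lines is to scan it with a short script. The sketch below assumes the log lines look like the example quoted above. The function name `summarize_gpw_log` and the regular expression are my own illustration, not part of Lumify, and where the log file lives depends on your setup.

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as:
#   Loading <worker class> from jar:file:...!/META-INF/services/...GraphPropertyWorker
# The pattern is an assumption based on the sample log line in this thread.
LOADING_RE = re.compile(r"Loading (\S+) from \S+/META-INF/services/")
BEGIN_WORK = "Begin work on element"


def summarize_gpw_log(lines: Iterable[str]) -> Tuple[List[str], int]:
    """Return (worker class names that were loaded, number of 'Begin work' lines)."""
    workers: List[str] = []
    begin_count = 0
    for line in lines:
        match = LOADING_RE.search(line)
        if match:
            workers.append(match.group(1))
        if BEGIN_WORK in line:
            begin_count += 1
    return workers, begin_count
```

Calling it with an open log file (`summarize_gpw_log(open(log_path))`, with `log_path` pointing at your own log) gives the list of loaded workers and how many times work began on an element. An empty list suggests the workers were never loaded. A zero count with workers loaded suggests nothing was picked up for processing.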

vincenzo algieri

Jul 12, 2016, 1:25:28 PM
to Lumify
Hi Joe,

I'm trying to run the Graph Property Workers, but it doesn't work. First, I don't have the folder docker->hadoop->folder /lumify/libcache. The only libcache folders I can find are /docker/dev/lumify-dev-persistent/tmp/lumify-hdfslibcache and /docker/dev/lumify-dev-persistent/var/lib/hadoop-hdfss. I deployed the plugins to both of them. The nice thing is that the list of plugins is available in the admin menu of the web application. But when I drag and drop a document into the workspace, it is labeled with a question mark (?). I also have webAppEmbedded.graphPropertyWorkerRunner.enabled=true set in the config file.

So why doesn't it work? Have I missed any configuration? Do I need to add configuration for each plugin in the /config/lumify.properties file?
Can you explain, step by step, how to do this?

Thanks!!!

Joe Ferner

Jul 13, 2016, 8:32:58 AM
to Lumify
I no longer work on Lumify. I work at V5 Analytics developing a fork of Lumify.