What happened to http://lumify.io/wikipedia ?


Szymon Smakolski

Apr 7, 2015, 4:38:41 PM
to lum...@googlegroups.com
My previous post seems to have disappeared, so I apologize if this is a duplicate.

I'm trying to follow the readme for:


and when I go to submit the MR job, I get:

Exception in thread "main" java.lang.RuntimeException: http://lumify.io/wikipedia#wikipediaPage concept not found
        at io.lumify.wikipedia.mapreduce.ImportMR.verifyWikipediaPageConcept(ImportMR.java:57)
        at io.lumify.wikipedia.mapreduce.ImportMR.setupJob(ImportMR.java:31)
        at io.lumify.core.mapreduce.LumifyMRBase.run(LumifyMRBase.java:58)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at io.lumify.wikipedia.mapreduce.ImportMR.main(ImportMR.java:73)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Indeed, I can verify that http://lumify.io/wikipedia gives me a 404. Has this resource simply been moved or is it now removed entirely?
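(For reference, nothing Lumify-specific is needed to see this; a plain header check shows the same 404:)

curl -I http://lumify.io/wikipedia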

Gordon Shankman

Apr 10, 2015, 10:29:07 AM
to lum...@googlegroups.com
http://lumify.io/wikipedia is just the namespace for the ontology. You're getting that error because both the dev and wikipedia ontologies need to be imported before the MapReduce job is run. We'll be updating the instructions page to add that step to the MapReduce instructions; it is listed in the IDE instructions below.

This page provides details on how to import the ontologies. https://github.com/lumifyio/lumify/blob/master/docs/ontology.md

vincenzo algieri

Sep 20, 2016, 12:09:50 PM
to Lumify, gsha...@gmail.com
Hi Gordon,

I have some issues with MapReduce in Lumify. In particular, when I try to submit a file with the MR command, as explained in the official documentation, the execution seems to get stuck in a loop: it never stops and there are no errors on the console. Is it possible that the MR command needs more memory?
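If it does turn out to be a memory problem, I assume the usual Hadoop settings can be passed when the job is submitted, since ImportMR runs through ToolRunner (visible in the stack trace earlier in this thread). Roughly something like the following, where <wikipedia-mr-jar> and the input path are placeholders for whatever the docs give, and the -D properties are the standard Hadoop 2.x MapReduce memory settings:

hadoop jar <wikipedia-mr-jar> \
    -D mapreduce.map.memory.mb=4096 \
    -D mapreduce.map.java.opts=-Xmx3584m \
    -D mapreduce.reduce.memory.mb=4096 \
    /lumify/<dataset>

(The container sizes and the -Xmx would of course need to be adjusted to whatever the cluster can actually give.)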

The steps that I followed are these:

1) Copy the MR input file to HDFS:

hadoop fs -mkdir -p /lumify
hadoop fs -put <dataset> /lumify

4) Pre-split destination Accumulo tables:

bin/configure_splits.sh

5) Submit the MR job:


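(For completeness, I assume step 1 can be double-checked with a plain listing, to confirm the dataset actually landed in HDFS:)

hadoop fs -ls /lumify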
Thanks in advance.