CumulusRDF benchmark


diego delavega

May 20, 2014, 1:25:48 PM
to cumulus...@googlegroups.com
I'm moving this to the users group (it is probably more appropriate here).


-(a little explanation of what I want to do)
I want to run the SP2Bench and LUBM benchmarks with CumulusRDF.
I want to measure the time required to load the data (excluding parameter initialization) and the time needed to return the query results (again excluding parameter initialization). I want to leave out the printing of the results, since it is time-consuming and would bias the measurements.

I saw the suggestions on my previous post (thanks a lot, by the way), but I don't want to intervene "heavily" in the code, in the sense that I don't want to add procedures that would increase the execution time. I would like to keep it simple, and with that in mind I would prefer to use the command line.
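
One way to time the CLI from the outside, without touching the code, is a small wrapper that discards the program's output and records wall-clock time. This is only a sketch: `time_command` is a hypothetical helper, not part of CumulusRDF, and a measurement taken this way still includes JVM start-up and connection set-up, so it is an upper bound on the pure load or query time.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, discard its stdout/stderr, and return (seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,  # do not pay for printing the results
        stderr=subprocess.DEVNULL,
    )
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode
```

The java command line would be passed as a list of arguments, one element per token. Running the same command several times and comparing the timings gives a rough idea of how much of each run is fixed start-up cost.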

-(here it comes)
I'm guessing that I should use the .jar produced by compiling the sources rather than the .jar offered for download on the site.
Also, I am a little unclear on how to use the command line, as it always returns a class-not-found exception,
for example:
java -cp target/cumulusrdf-1.0.1.jar edu.kit.aifb.cumulus.cli.Main Load -i ~/Desktop/sp2b/bin/sp2b.n3 

java.lang.NoClassDefFoundError: org/apache/commons/cli/ParseException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at edu.kit.aifb.cumulus.cli.Main.main(Main.java:29)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli.ParseException
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
... 3 more
USAGE: edu.kit.aifb.cumulus.cli.Main <utility> [options...]
java.lang.NoClassDefFoundError: org/apache/commons/cli/ParseException


So my question is: how do I use the command line correctly?

Also, if you have any other suggestions that could help or orient me, I would appreciate them.


Andrea Gazzarini

May 22, 2014, 3:14:29 PM
to cumulus...@googlegroups.com

Hi Diego,
Sorry for the delay in responding; summer and sun are coming out here in Italy, and bugs are popping up like mushrooms these days :)

About your email: very interesting stuff. We are also interested in those numbers, so I would ask you a favour, which I think will also simplify your life: could you try that with 1.1.0? We made a lot of improvements and there's a better command line interface. 1.1.0 is going to be released very soon, but if you need those numbers in a hurry you can build it directly from source; it is stable and supports both Cassandra 1.2.x and 2.x.

However, about the stack trace you got: you must not use that jar, because it is the plain Maven artifact and therefore doesn't contain the required dependencies (which is the reason for the exception you got).
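
A quick way to tell whether a given jar bundles its dependencies is to look for the missing class inside it. The sketch below uses only Python's standard `zipfile` module; `jar_contains_class` is a hypothetical helper name, not something shipped with CumulusRDF.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has an entry for the given fully qualified class."""
    # A class a.b.C is stored in the archive as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For instance, `jar_contains_class("target/cumulusrdf-1.0.1.jar", "org.apache.commons.cli.ParseException")` returning False would be consistent with the NoClassDefFoundError in the stack trace.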

Let me know if I can help you in some way.

Best,
Andrea

--
You received this message because you are subscribed to the Google Groups "cumulusrdf" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cumulusrdf-li...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

diego delavega

May 24, 2014, 5:43:35 AM
to cumulus...@googlegroups.com
Hi Andrea.
Version 1.1.0 sounds great, but the fact is that I have a deadline. When do you estimate it will be released?

Another question about the usage of the command line.
I downloaded Cassandra 1.2.16 and the CLI (.jar) from your site.

I ran Cassandra and created a keyspace (specifically "mykey").
While Cassandra is running, I execute
java -cp cumulusrdf-1.0.1.jar edu.kit.aifb.cumulus.cli.Main Load -i ./sp2b_/bin/sp2b.n3 -k mykey -s triple

and I get the following error:

storage layout: triple
log4j:WARN No appenders could be found for logger (me.prettyprint.cassandra.connection.CassandraHostRetryService).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at edu.kit.aifb.cumulus.cli.Main.main(Main.java:38)
Caused by: org.openrdf.rio.UnsupportedRDFormatException: No parser factory available for RDF format N3 (mimeTypes=text/n3, text/rdf+n3; ext=n3)
    at org.openrdf.rio.Rio.createParser(Rio.java:198)
    at org.openrdf.rio.Rio.createParser(Rio.java:213)
    at org.openrdf.repository.util.RDFLoader.loadInputStreamOrReader(RDFLoader.java:318)
    at org.openrdf.repository.util.RDFLoader.load(RDFLoader.java:222)
    at org.openrdf.repository.util.RDFLoader.load(RDFLoader.java:104)
    at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:217)
    at edu.kit.aifb.cumulus.cli.Load.main(Load.java:161)
    ... 5 more
USAGE: edu.kit.aifb.cumulus.cli.Main <utility> [options...]
java.lang.reflect.InvocationTargetException


Am I missing something, or am I doing it wrong?
Thank you!!

Andrea Gazzarini

May 24, 2014, 7:52:25 AM
to cumulus...@googlegroups.com
Hi Diego,
I don't remember the release date (@Andreas, please help); in the last call we said something about the end of the month.

I proposed the 1.1.0 route because your first email (on the dev list) had the subject "change src and recompile", so I assume you're comfortable building 1.1.0 yourself. That's also why I told you that I'm here if you need help (BTW, Skype: gazzax72). As for your deadline: unless it is today (sorry, I can't), we can do that tomorrow or some day next week if you want. Just give me a shout.

Note that in this case you don't need to change the sources; you only need to check out the 1.1.0 branch and run a Maven build with the following command (from the top-level directory):

> mvn -DskipTests clean install -Pcassandra12x-hector-full-tp-index

At the end you will find all CumulusRDF artifacts for Cassandra 1.2.x. For the CLI specifically, you will find a complete module with a more usable structure, like:

- bin
- etc (configuration files)
- lib (jars)

In the bin directory you will find a script (cirrus); run

> cirrus 

in order to get a help screen

or

> cirrus load -i <file>

to have your file loaded.

--------------------------------------------------------------------------------

About your error, there are two things:

1) I see Hector (retry service) is trying to tell you something: make sure Cassandra is running on localhost:9160; it is probably running on 9161. Sorry, this is something you cannot change in 1.0.1.

No appenders could be found for logger (me.prettyprint.cassandra.connection.CassandraHostRetryService).
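
Before retrying, it can help to confirm that something is actually listening on the expected port. A minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper, not part of CumulusRDF or Hector.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or host not resolvable.
        return False
```

`port_is_open("localhost", 9160)` returning False while `port_is_open("localhost", 9161)` returns True would point to the port mismatch described in point 1. A successful connection only shows that some process accepts TCP connections there, not that it is Cassandra.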


2) N3 format 
Not sure whether this is a CumulusRDF issue or not: it seems the Sesame version we used for 1.0.1 has some issue with N3 files. Let me check further; in the meantime, if you can, could you please try to load a .nt file?

Best,
Andrea



Andreas Wagner

May 24, 2014, 9:40:05 AM
to cumulus...@googlegroups.com
Hi guys,

thanks @ Andrea for the quick answers ;)

First, the cumulusrdf-1.0.1.jar does contain all dependencies. I compiled it this way ;) As I can see below, the ClassNotFoundException seems to have disappeared.

The exception below is, unfortunately, a bug. The quick fix is as follows:

* open the jar and go to /META-INF/services/
* there is a file named org.openrdf.rio.RDFParserFactory
* add the line:

org.openrdf.rio.n3.N3ParserFactory

This should do the trick. I will create an issue for that ...
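
For anyone who prefers to script the quick fix, here is a minimal sketch using Python's standard `zipfile` module. It writes a patched copy instead of editing the jar in place; the function name `add_n3_parser_factory` and the output path are assumptions for illustration, and any zip tool (or the JDK's `jar` command) does the same job by hand.

```python
import zipfile

SERVICE_ENTRY = "META-INF/services/org.openrdf.rio.RDFParserFactory"
N3_FACTORY = "org.openrdf.rio.n3.N3ParserFactory"


def add_n3_parser_factory(jar_path, out_path):
    """Copy jar_path to out_path, registering the N3 parser factory.

    Returns True if the line was appended to the services file, False if
    it was already there (or if the jar has no such services file).
    """
    added = False
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == SERVICE_ENTRY:
                lines = data.decode("utf-8").splitlines()
                if N3_FACTORY not in (line.strip() for line in lines):
                    lines.append(N3_FACTORY)
                    added = True
                data = ("\n".join(lines) + "\n").encode("utf-8")
            # Reuse the original entry metadata; the size is recomputed.
            dst.writestr(info, data)
    return added
```

After patching, the java command would point at the new file, for example `-cp cumulusrdf-1.0.1-patched.jar` (a hypothetical name).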

HTH
Andreas

Andreas Wagner

May 24, 2014, 9:59:14 AM
to cumulus...@googlegroups.com
Hi guys,


On 05/24/2014 01:52 PM, Andrea Gazzarini wrote:
> Hi Diego,
> I don't remember the release date (@Andreas, please help); in the last call we said something about the end of the month.

We are trying to release v1.1 by the end of next week ;)

> I proposed the 1.1.0 route because your first email (on the dev list) had the subject "change src and recompile", so I assume you're comfortable building 1.1.0 yourself. That's also why I told you that I'm here if you need help (BTW, Skype: gazzax72). As for your deadline: unless it is today (sorry, I can't), we can do that tomorrow or some day next week if you want. Just give me a shout.

+1 ... thanks @ Andrea :)

> Note that in this case you don't need to change the sources; you only need to check out the 1.1.0 branch and run a Maven build with the following command (from the top-level directory):
>
> > mvn -DskipTests clean install -Pcassandra12x-hector-full-tp-index

+1 ... Note, you can also use Cassandra 2.x:

mvn -DskipTests clean install -Pcassandra2x-cql-full-tp-index

> At the end you will find all CumulusRDF artifacts for Cassandra 1.2.x. For the CLI specifically, you will find a complete module with a more usable structure, like:
>
> - bin
> - etc (configuration files)
> - lib (jars)
>
> In the bin directory you will find a script (cirrus); run
>
> > cirrus
>
> in order to get a help screen
>
> or
>
> > cirrus load -i <file>

+1 ... very cool :) I have to try that ... :D

> to have your file loaded.
>
> --------------------------------------------------------------------------------
>
> About your error, there are two things:
>
> 1) I see Hector (retry service) is trying to tell you something: make sure Cassandra is running on localhost:9160; it is probably running on 9161. Sorry, this is something you cannot change in 1.0.1.
>
> No appenders could be found for logger (me.prettyprint.cassandra.connection.CassandraHostRetryService).
>
> 2) N3 format
> Not sure whether this is a CumulusRDF issue or not: it seems the Sesame version we used for 1.0.1 has some issue with N3 files. Let me check further; in the meantime, if you can, could you please try to load a .nt file?

+1 ... yes, N-Triples should work without changing the jar ....

> Best,
> Andrea

HTH
Andreas

diego delavega

May 25, 2014, 2:26:26 PM
to cumulus...@googlegroups.com, andreas.jo...@googlemail.com
Hi to everyone.

The trick with "org.openrdf.rio.n3.N3ParserFactory" did the job and the data loaded. In that regard, I have a question about querying.
I tried -q with the query string, but the prefixes could not be parsed. Does CumulusRDF have any specific requirements about that?

----------------------------------------

As for 1.1.0, I got somewhat confused going from getting the code to building and running it.
If I'm not asking too much, could I have some bullet points (the steps for each action), starting from getting the code?
Thanks a lot.


Andrea Gazzarini

May 25, 2014, 3:41:17 PM
to cumulus...@googlegroups.com
Hi Diego,
about the first part of your email, Andreas is definitely more of an expert than me.

About 1.1.0, here's the list of steps:

> svn checkout http://cumulusrdf.googlecode.com/svn/branches/1.1.0 cumulusrdf-1.1.0
> cd cumulusrdf-1.1.0

Now, if you are using Cassandra 1.2.x as the backend

> mvn -DskipTests clean install -Pcassandra12x-hector-full-tp-index

Or, if you want to use Cassandra 2.x


> mvn -DskipTests clean install -Pcassandra2x-cql-full-tp-index

Once completed

> cd cumulusrdf-standalone/target

You will find a zip file

cumulusrdf-standalone-1.1.0-SNAPSHOT-with-cassandra-12x.zip (Note that the Cassandra version could be different depending on the profile you activated in the previous step)

That zip is all you need: a complete CumulusRDF standalone. Just extract it somewhere and you will find the structure (etc / bin / lib) I mentioned in my previous email.
Under the bin folder you will find the cirrus script.
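
The checkout and build steps above can also be sketched as a small script. The URL and Maven profiles are the ones listed in this message; `build_steps` and `run_steps` are hypothetical helper names, and the script assumes `svn` and `mvn` are on the PATH.

```python
import subprocess

SVN_URL = "http://cumulusrdf.googlecode.com/svn/branches/1.1.0"

# Maven profile for each supported Cassandra line, as given in the steps.
PROFILES = {
    "1.2": "cassandra12x-hector-full-tp-index",
    "2": "cassandra2x-cql-full-tp-index",
}


def build_steps(cassandra="1.2", checkout_dir="cumulusrdf-1.1.0"):
    """Return the checkout and build commands as (argv, working_dir) pairs."""
    profile = PROFILES[cassandra]
    return [
        (["svn", "checkout", SVN_URL, checkout_dir], None),
        (["mvn", "-DskipTests", "clean", "install", "-P" + profile],
         checkout_dir),
    ]


def run_steps(steps):
    """Run each command in order, stopping at the first failure."""
    for argv, cwd in steps:
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_steps(build_steps("2"))` would build against Cassandra 2.x; the zip then appears under cumulusrdf-standalone/target inside the checkout directory, as described above.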

Let me know if you need more details.

Best,
Andrea




diego delavega

May 29, 2014, 2:16:37 AM
to cumulus...@googlegroups.com
Sorry for taking so long, but it works great.

Thank you very much!!

Andrea Gazzarini

May 29, 2014, 3:01:46 AM
to cumulus...@googlegroups.com
Great!

you're welcome

Best,
Andrea



Nicola Ghirardi

Jul 17, 2014, 8:42:01 AM
to cumulus...@googlegroups.com, andreas.jo...@googlemail.com
Do you have any results?

Andrea Gazzarini

Jul 21, 2014, 1:51:42 AM
to cumulus...@googlegroups.com
Hi Nicola,
sorry for the delay in responding. As far as I remember, Diego sent some results in another thread; on top of that, Andreas (Wagner) is doing benchmarks for the 1.1.0 release, so maybe he has collected some other useful data.

@Andreas: I know we have already talked a lot about this... but by now you probably know what kind of "head" I have, so could you please refresh our (my) memory?

Best,
Andrea