VOS vs Blazegraph (was: Experience using bio2rdf datasets on blazegraph)

Egon Willighagen

Sep 29, 2016, 9:26:26 AM

to bio2rdf, Patrick van Kleef, Hugh Williams, Tim Haynes

On Thu, Sep 29, 2016 at 3:18 PM, Kingsley Idehen <kid...@openlinksw.com> wrote:
> Which is actually for me too a reason to like Blazegraph, even being a
> long term VOS user!
>
> What is the problem though?
>
> 1. Build instructions unclear?
> 2. Build process unpredictable?

Both are somewhat OK. But it's a pain to have to go through this for
every machine you want to fire up a SPARQL endpoint on.

In contrast: 1. download jar, 2. start jar. And that works for
basically anyone I want to teach SPARQL and demo some queries, data
sets, etc.

Egon

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen

Jim McCusker

Sep 29, 2016, 9:46:39 AM

to bio2rdf, Patrick van Kleef, Hugh Williams, Tim Haynes

I've used Blazegraph enough to write my own Puppet script, which I run on development VMs and in production. Feel free to use and tweak as needed (no warranties, express or implied).

https://gist.github.com/jimmccusker/c9b5e4d90674f363ce58da8ea3275abc

Jim

--
You received this message because you are subscribed to the Google Groups "bio2rdf" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bio2rdf+u...@googlegroups.com.
To post to this group, send email to bio...@googlegroups.com.
Visit this group at https://groups.google.com/group/bio2rdf.
For more options, visit https://groups.google.com/d/optout.

--

James P. McCusker III, Ph.D.

http://tw.rpi.edu/web/JamesMcCusker

Kingsley Idehen

Sep 29, 2016, 10:33:43 AM

to bio...@googlegroups.com

On 9/29/16 9:26 AM, Egon Willighagen wrote:
> On Thu, Sep 29, 2016 at 3:18 PM, Kingsley Idehen <kid...@openlinksw.com> wrote:
>> Which is actually for me too a reason to like Blazegraph, even being a
>> long term VOS user!
>>
>> What is the problem though?
>>
>> 1. Build instructions unclear?
>> 2. Build process unpredictable?
> Both are somewhat OK. But it's a pain to have to go through this for
> every machine you want to fire up an SPARQL end point on.
>
> In contrast, 1. download jar, 2. start jar. And that works on
> basically anyone I want to teach SPARQL and demo some queries, data
> sets, etc.
>
> Egon
>

Egon,

We have a commercial edition that gives you the download and go
experience, as would be expected.

We have an open source edition that requires building a binary prior to use.

I sense you are now requesting an installer for the open source edition,
since Virtuoso is not a Java application that can be deployed using a
JAR file?

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
: http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this

Jim McCusker

Sep 29, 2016, 12:55:50 PM

to bio...@googlegroups.com

I think he's just saying that the deployment for blazegraph is much simpler at the moment than for Virtuoso. I usually script both, and have my own reasons for preferring blazegraph at the moment. I've crashed virtuoso in an attempt to load large TRiG files in the past, and have been able to crash virtuoso on certain queries. I wasn't able to put time into paring those down into bug report-sized tasks to test with, but having blazegraph in Java means that it's very hard to crash due to unexpected input, and I don't need to configure hard limits on result set size or query execution time.

I don't fault you for the problems, since I haven't filed the bug, but it was holding me back and blazegraph worked out of the box.

Jim

Kingsley Idehen

Sep 29, 2016, 2:12:30 PM

to bio...@googlegroups.com

On 9/29/16 12:55 PM, Jim McCusker wrote:

> I think he's just saying that the deployment for blazegraph is much simpler at the moment than for Virtuoso.

Yes, and I would like to address that perception, naturally. Thus, I am simply asking for some clarity about what makes Virtuoso on-boarding harder. Of course, if this is a Java-centric realm then you can't be easier to use than a JAR that's already built.

> I usually script both, and have my own reasons for preferring blazegraph at the moment. I've crashed virtuoso in an attempt to load large TRiG files in the past, and have been able to crash virtuoso on certain queries.

Yes, that can sometimes happen due to configuration and memory management, which is tougher when dealing with C. That said, there are a number of alternative memory management libraries that can be swapped in when memory fragmentation is the issue at hand.

> I wasn't able to put time into paring those down into bug report-sized tasks to test with, but having blazegraph in Java means that it's very hard to crash due to unexpected input, and I don't need to configure hard limits on result set size or query execution time.

Limits on query result set size and execution time are settings that let you address a variety of challenges, e.g., running live instances that support ad-hoc queries by anyone, from anywhere (as per DBpedia, Bio2RDF, etc.).

> I don't fault you for the problems, since I haven't filed the bug, but it was holding me back and blazegraph worked out of the box.

I understand. My quest is to acquire feedback with regard to on-boarding re. Virtuoso (commercial or open source editions).
I believe you did post something about TriG; I just don't recall if it was on the mailing list or via GitHub?

Thanks.

Kingsley

Egon Willighagen

Sep 29, 2016, 2:16:26 PM

to bio2rdf

On Thu, Sep 29, 2016 at 8:12 PM, Kingsley Idehen <kid...@openlinksw.com> wrote:
> On 9/29/16 12:55 PM, Jim McCusker wrote:
> I think he's just saying that the deployment for blazegraph is much simpler
> at the moment than for Virtuoso.
>
> Yes, and I would like to address that perception, naturally. Thus, I am
> simply asking for some clarity about what makes Virtuoso on-boarding harder.
> Of course, if this is a Java-centric realm then you can't be easier to use
> than a JAR that's already built.

I would just like to point a student on Windows to a .exe file (and
similarly on OS X) that, when started, boots the VOS environment.
That's the only way to get a group of students started in 10 minutes
and then actually focus on data, SPARQL, and answering biological
questions, rather than teaching them to be sysadmins.

Jim McCusker

Sep 29, 2016, 2:20:38 PM

to bio...@googlegroups.com

On Thu, Sep 29, 2016 at 2:12 PM Kingsley Idehen <kid...@openlinksw.com> wrote:

> I understand. My quest is to acquire feedback with regards to on-boarding re. Virtuoso (commercial or open source editions).
> I believe you did post something about TriG, I just don't recall if it was on the mailing list or via github?

It was SPARQL UPDATE related, but it was about the query size (I was loading an ontology via RDFlib graphs). I'm glad you're not neglecting the deployment issues. Some simple scripts (like the puppet script I showed before) can often make a big difference for people who are unfamiliar with setting up software in Linux. Updating the Debian/Ubuntu packages to 7.x would probably be popular as well. The easiest thing could be to set up a CI server that builds and deploys releases to apt and yum repositories as things are released.

Jim

Kingsley Idehen

Sep 29, 2016, 3:50:29 PM

to bio...@googlegroups.com

On 9/29/16 2:16 PM, Egon Willighagen wrote:
> On Thu, Sep 29, 2016 at 8:12 PM, Kingsley Idehen <kid...@openlinksw.com> wrote:
>> On 9/29/16 12:55 PM, Jim McCusker wrote:
>> I think he's just saying that the deployment for blazegraph is much simpler
>> at the moment than for Virtuoso.
>>
>> Yes, and I would like to address that perception, naturally. Thus, I am
>> simply asking for some clarity about what makes Virtuoso on-boarding harder.
>> Of course, if this is a Java-centric realm then you can't be easier to use
>> than a JAR that's already built.
> A student on Windows I would just like to point to a .exe file (and
> similarly on OS/X), which when started boots the VOS environment.
> That's the only way to get a group of students started in 10 minutes
> and then actually focus on data, SPARQL, and answering biological
> questions. Rather than teaching them to be sysadmins.
>
> Egon
>

Egon,

Okay, that simply amounts to:

Please could we have a binary for Linux, Windows, and Mac OS X as part
of the VOS experience :)

Those binaries will be easier to use than a JAR since they aren't
Java-specific.

Kingsley Idehen

Sep 29, 2016, 3:52:33 PM

to bio...@googlegroups.com

On 9/29/16 2:20 PM, Jim McCusker wrote:

> On Thu, Sep 29, 2016 at 2:12 PM Kingsley Idehen <kid...@openlinksw.com> wrote:
>
>> I understand. My quest is to acquire feedback with regards to on-boarding re. Virtuoso (commercial or open source editions).
>>
>> I believe you did post something about TriG, I just don't recall if it was on the mailing list or via github?
>
> It was SPARQL UPDATE related, but it was about the query size (I was loading an ontology via RDFlib graphs).

Do you have a GitHub or mailing list post URI? I know I saw something about this wander by recently.

> I'm glad you're not neglecting the deployment issues. Some simple scripts (like the puppet script I showed before) can often make a big difference for people who are unfamiliar with setting up software in Linux. Updating the Debian/Ubuntu packages to 7.x would probably be popular as well. The easiest thing could be to set up a CI server that builds and deploys releases to apt and yum repositories as things are released.

Sure, we can do that.

Jim McCusker

Sep 29, 2016, 5:10:37 PM

to bio...@googlegroups.com

On Thu, Sep 29, 2016 at 3:52 PM Kingsley Idehen <kid...@openlinksw.com> wrote:

> Do you have a GitHub or mailing list post URI? I know I saw something about this wander by recently.

https://github.com/openlink/virtuoso-opensource/issues/581
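
For readers hitting the same query-size problem Jim mentions (one large SPARQL UPDATE generated while loading an ontology from RDFlib graphs), a common workaround is to split the load into several smaller INSERT DATA requests. The following is only a minimal sketch of that idea, not what Jim or the linked issue used: the function names, the default batch size of 1000, and the example graph IRI are all assumptions made for illustration, and the terms are expected to be already serialized in N-Triples syntax.

```python
from typing import Iterable, Iterator, List, Tuple

# A triple whose terms are already serialized, e.g. "<http://example.org/s>".
Triple = Tuple[str, str, str]


def _insert_data(graph_iri: str, lines: List[str]) -> str:
    """Wrap serialized triple lines in one INSERT DATA request (hypothetical helper)."""
    body = "\n    ".join(lines)
    return f"INSERT DATA {{\n  GRAPH <{graph_iri}> {{\n    {body}\n  }}\n}}"


def batched_insert_updates(
    triples: Iterable[Triple],
    graph_iri: str,
    batch_size: int = 1000,  # assumed default; tune to what the store accepts
) -> Iterator[str]:
    """Yield SPARQL UPDATE strings, each inserting at most batch_size triples."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for s, p, o in triples:
        batch.append(f"{s} {p} {o} .")
        if len(batch) == batch_size:
            yield _insert_data(graph_iri, batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield _insert_data(graph_iri, batch)
```

With RDFlib, the input could be produced as `((s.n3(), p.n3(), o.n3()) for s, p, o in graph)`, and each yielded string sent to the store's SPARQL UPDATE endpoint as a separate request. Note that blank node labels are scoped to a single request, so a graph that relies on shared blank nodes would need them kept in the same batch or skolemized first.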
