SKOSMOS with Virtuoso


Katerina Gkirtzou

Nov 28, 2016, 10:44:50 AM
to Skosmos Users
Dear SKOSMOS developers,
I am interested in using SKOSMOS together with the Virtuoso RDF store. I was able to set up SKOSMOS correctly with Jena Fuseki locally, and everything works well. But when I switch to a SPARQL endpoint provided by a Virtuoso RDF store (since it supports SPARQL 1.1), SKOSMOS does not seem to be able to retrieve the concepts. Do I have to configure anything in Virtuoso, for example? Does SKOSMOS work with Virtuoso's SPARQL endpoint in general?

Thank you in advance for your help.
Katerina Gkirtzou

Osma Suominen

Nov 29, 2016, 3:07:33 AM
to skosmo...@googlegroups.com
Dear Katerina,

Thank you for trying out Skosmos!

While Skosmos should work with any SPARQL 1.1 compliant triple store, in
practice we only use it with Fuseki. So I'm unfortunately not able to
say whether it works with Virtuoso or not.

One thing we definitely don't have is support for the Virtuoso text
index, so you must set the SPARQL dialect to "Generic". Have you done
that? You can either do it per vocabulary in vocabularies.ttl, or set
the default dialect in config.inc.

If that doesn't help, can you describe a bit more what the problem is?
Are you getting any vocabulary data from the SPARQL endpoint? For
example, do you get information on the vocabulary front page (metadata,
concept and label statistics)? Does the alphabetical index show any
concepts?

We originally chose Fuseki with jena-text as the main backend for
Skosmos. We also considered Blazegraph (then called Bigdata). I also
looked at Virtuoso, but at least at the time (around 2012-2013), its
text index didn't seem suitable for Skosmos since it required four
leading characters to be specified in queries. The alphabetical index in
Skosmos uses one-character prefix queries (e.g. every concept whose
label starts with "A"). This can apparently be changed only by
recompiling Virtuoso, and I'm unsure about the side effects. Looking at
the documentation now, I see no indication that this has changed. So it
doesn't seem very likely that Virtuoso could be used as an efficient
backend for large vocabularies where Skosmos in practice needs a text
index to work with reasonable performance. It could be used for small
SKOS vocabularies (up to a few thousand concepts perhaps) using the
Generic SPARQL 1.1 dialect, but there may be compatibility problems
since we haven't really tested it together with Virtuoso.
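
As an illustration of the one-character prefix lookup mentioned above: with the Generic dialect it has to be expressed as a plain SPARQL 1.1 string filter instead of a text-index call. The Python sketch below builds such a query. It is only a sketch under assumptions: the function name build_prefix_query is hypothetical, the filter shape is modelled on the query visible in the Apache error log further down this thread rather than copied from the Skosmos source, and the language tag and graph URI are inserted without validation.

```python
def build_prefix_query(letter: str, lang: str, graph: str) -> str:
    """Return a SPARQL 1.1 query listing concepts whose prefLabel starts with `letter`.

    Hypothetical helper for illustration; not part of Skosmos.
    """
    if len(letter) != 1:
        raise ValueError("expected a single character")
    # Escape characters that would break out of the single-quoted SPARQL literal.
    safe = letter.lower().replace("\\", "\\\\").replace("'", "\\'")
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT DISTINCT ?s ?label\n"
        "WHERE {\n"
        f"  GRAPH <{graph}> {{\n"
        "    ?s skos:prefLabel ?label .\n"
        f"    FILTER (strstarts(lcase(str(?label)), '{safe}')\n"
        f"            && langMatches(lang(?label), '{lang}'))\n"
        "  }\n"
        "}\n"
        "ORDER BY lcase(str(?label))\n"
    )
```

For example, build_prefix_query("A", "en", "http://zbw.eu/stw/") produces a query whose filter is strstarts(lcase(str(?label)), 'a'). Without a text index the store generally has to test that filter against every label in the graph, which is presumably why this approach is only practical for small vocabularies.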

-Osma

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.s...@helsinki.fi
http://www.nationallibrary.fi

Katerina Gkirtzou

Nov 29, 2016, 5:07:41 AM
to Skosmos Users
Dear Osma,
thank you very much for your quick response. For more details, see inline.


On Tuesday, 29 November 2016 10:07:33 UTC+2, Osma Suominen wrote:
> One thing we definitely don't have is support for the Virtuoso text
> index, so you must set the SPARQL dialect to "Generic". Have you done
> that? You can either do it per vocabulary in vocabularies.ttl, or set
> the default dialect in config.inc.


Yes, I have set the SPARQL dialect to "Generic" in config.inc, but it doesn't seem to work.
 
> If that doesn't help, can you describe a bit more what the problem is?
> Are you getting any vocabulary data from the SPARQL endpoint? For
> example, do you get information on the vocabulary front page (metadata,
> concept and label statistics)? Does the alphabetical index show any
> concepts?


When I select a vocabulary which is available via the SPARQL endpoint of a Virtuoso
store, SKOSMOS shows me the initial information given in vocabularies.ttl, but it fails
to show any statistics (e.g. term counts by language, resource counts by type) and to
create the alphabetical and hierarchical indices. Instead, it shows a
"Loading more items" indicator, but it fails to load them.
 
You mentioned that SKOSMOS requires a good text index in order to work
efficiently, especially for big vocabularies. Do you believe that Lucene
can help in that direction? Have you tried that?

Thanks again for your help.

Katerina

Osma Suominen

Nov 29, 2016, 11:38:30 AM
to skosmo...@googlegroups.com
29.11.2016, 12:07, Katerina Gkirtzou wrote:

> Yes, I have set the SPARQL dialect to "Generic" in config.inc, but it
> doesn't seem to work.

Okay. So the problem is more general than that.

> When I select a vocabulary which is available via the SPARQL endpoint of
> a Virtuoso store, SKOSMOS shows me the initial information given in
> vocabularies.ttl, but it fails to show any statistics (e.g. term counts
> by language, resource counts by type) and to create the alphabetical and
> hierarchical indices. Instead, it shows a "Loading more items"
> indicator, but it fails to load them.

It sounds like no SPARQL queries are succeeding then. Can you tell from
the Virtuoso or Apache logs whether there are any errors? Does Virtuoso
see any incoming SPARQL queries?

> You mentioned that SKOSMOS requires a good text index in order to work
> efficiently, especially for big vocabularies. Do you believe that Lucene
> can help in that direction? Have you tried that?

Jena-text is based on Lucene, built into Fuseki. So we are already using
it that way.

From the Skosmos perspective, the text index should be part of the
SPARQL endpoint, not a separate system. Many RDF triple stores have text
index functionality (not just Fuseki and Virtuoso, but also Blazegraph,
AllegroGraph, 4store and probably others too), but unfortunately there
is no standardization so all of them have different SPARQL query syntax
and thus require specialized support.

Thus far we have implemented only support for jena-text. There used to
be support for the Blazegraph/Bigdata text index as well, but it bitrotted
and was dropped from the Skosmos codebase some time ago.

-Osma

Katerina Gkirtzou

Dec 1, 2016, 6:34:32 AM
to Skosmos Users


On Tuesday, 29 November 2016 18:38:30 UTC+2, Osma Suominen wrote:
> It sounds like no SPARQL queries are succeeding then. Can you tell from
> the Virtuoso or Apache logs whether there are any errors? Does Virtuoso
> see any incoming SPARQL queries?



Apache reports the following error when trying to connect to Virtuoso:

   [Thu Dec 01 13:10:24.744945 2016] [:error] [pid 17271] [client 127.0.0.1:33622]
   PHP Fatal error:  Uncaught EasyRdf_Exception: HTTP request for SPARQL query failed:
   Virtuoso 37000 Error SP030: SPARQL compiler, line 29: syntax error at 'VALUES' before
   '('\n\nSPARQL query:\ndefine sql:big-data-const 0 define output:format "HTTP+XML
   application/sparql-results+xml" define output:dict-format "HTTP+TTL text/turtle"
   PREFIX owl: <http://www.w3.org/2002/07/owl#>\n
   PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n
   SELECT DISTINCT ?s ?label ?alabel\nWHERE {\n 
   GRAPH <http://zbw.eu/stw/> {\n  
   {\n      ?s skos:prefLabel ?label .\n    
   FILTER (\n        strstarts(lcase(str(?label)), 'a')\n       
   && langMatches(lang(?label), 'en')\n      )\n    }\n  
   UNION\n    {\n      {\n        ?s skos:altLabel ?alabel .\n       
   FILTER (\n          strstarts(lcase(str(?alabel)), 'a')\n         
   && langMatches(lang(?alabel), 'en')\n        )\n      }\n   
   {\n        ?s skos:prefLabel ?label .\n      
   FILTER (langMatches(lang(?label), 'en'))\n      }\n    }\n   ?s a ?type .\n
   FILTER NOT EXISTS { ?s owl:deprecated true }\n  }
   VALUES (?type) { (<http://w in /var/www/html/skosmos/vendor/easyrdf/easyrdf/lib/EasyRdf/Sparql/Client.php on line 290,
   referer: http://localhost/skosmos/stw/en/

I have installed EasyRdf 0.9.*, and I don't see anything in the Virtuoso logs.
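
To make a log entry like the one above easier to replay, the SPARQL text can be pulled out of the error message and its escaped line breaks restored. The Python sketch below does that, assuming (as in the log above) that the query follows the marker "SPARQL query:" and that newlines were written as the two characters backslash and n. The function name extract_sparql is hypothetical. Note that a truncated log line, like the one above, yields a truncated query that still needs to be completed by hand.

```python
def extract_sparql(log_message: str) -> str:
    """Return the query text that follows 'SPARQL query:' in an error message.

    Hypothetical helper for illustration; it does not repair truncated queries.
    """
    marker = "SPARQL query:"
    _, found, tail = log_message.partition(marker)
    if not found:
        raise ValueError("no 'SPARQL query:' marker in log message")
    # The log shows newlines as a literal backslash followed by 'n';
    # turn them back into real line breaks.
    return tail.replace("\\n", "\n").strip()
```

The result may still start with endpoint-specific "define ..." pragmas, as it does in the log above; whether to keep them when replaying the query is a judgment call.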
 

I mainly asked whether Skosmos supports Lucene because I was thinking it might be
possible to set up Lucene or Solr on top of Virtuoso, in order to have another type of
text index, as an idea to solve the problem. However, I am not sure that's easy to do.

Katerina
 

Osma Suominen

Dec 2, 2016, 6:51:08 AM
to skosmo...@googlegroups.com
I see. This appears to be the SPARQL query for the alphabetical index,
and Virtuoso for some reason gives a syntax error possibly involving the
VALUES clause.

Which version of Virtuoso are you using?

Debugging this would require having an easily repeatable test case. For
example, the problematic SPARQL query that Skosmos uses could be
extracted and run from within the Virtuoso UI. If it doesn't work there,
there is either a problem in the query or in Virtuoso.
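
One way to get such a repeatable test case outside the web UI is to send the extracted query straight to the endpoint over the standard SPARQL 1.1 Protocol (a POST with a form-encoded "query" parameter). The Python sketch below uses only the standard library. The function names are hypothetical, and the endpoint URL in the usage note is a placeholder that has to be replaced with the real one.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol POST request with a form-encoded query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_query(endpoint: str, query: str) -> str:
    """Send the query and return the response body, or the error body on failure."""
    request = build_sparql_request(endpoint, query)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Endpoints typically put the parser's error message in the response body.
        return f"HTTP {err.code}: " + err.read().decode("utf-8", errors="replace")
```

A call such as print(run_query("http://localhost:8890/sparql", query_text)) would then show either the results or the endpoint's error message, which can be compared with what Skosmos reports.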

May I ask why you want to use Skosmos with Virtuoso? Is it important for
your use case that the triple store has to be Virtuoso?

> I mainly asked whether Skosmos supports Lucene because I was thinking it
> might be possible to set up Lucene or Solr on top of Virtuoso, in order
> to have another type of text index, as an idea to solve the problem.
> However, I am not sure that's easy to do.

That would require a process for keeping the text index synchronized
with the RDF data, which in principle can change at any time. Additionally,
with a separate text index, queries from the application (e.g. Skosmos)
would have to be targeted separately to the text index and the triple
store. This could also cause performance problems.

It makes more sense to do text indexing from within the triple store,
which is why many triple stores support it. Then changes in the RDF
triples can be immediately reflected in the text index (at least
jena-text does this, but other text index implementations may not work
that way) and the text index functions are usable from within SPARQL,
which makes it easy for the application (i.e. Skosmos) to perform queries
that combine text index functionality with regular SPARQL.
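
The synchronization concern can be made concrete with a toy model. The Python sketch below is purely illustrative: a dict stands in for the triple store, the class ExternalLabelIndex is a hypothetical stand-in for a Lucene or Solr index kept outside the store, and nothing here reflects how any real product works. It shows the two-step lookup and how the index goes stale when the store changes without a rebuild.

```python
class ExternalLabelIndex:
    """Toy stand-in for a text index kept outside the triple store."""

    def __init__(self):
        self._by_prefix = {}

    def rebuild(self, labels):
        """Re-index every label by its lowercased first character."""
        self._by_prefix = {}
        for uri, label in labels.items():
            self._by_prefix.setdefault(label[:1].lower(), set()).add(uri)

    def lookup(self, letter):
        """Return the URIs whose label started with `letter` at the last rebuild."""
        return set(self._by_prefix.get(letter.lower(), set()))


def alphabetical_list(index, store_labels, letter):
    """Two-step lookup: ask the index for URIs, then fetch labels from the store."""
    uris = index.lookup(letter)  # request 1: the separate text index
    # request 2: the triple store (here just a dict of URI -> label)
    return sorted(store_labels[u] for u in uris if u in store_labels)


store = {"ex:1": "Apple", "ex:2": "Banana"}
index = ExternalLabelIndex()
index.rebuild(store)

store["ex:3"] = "Apricot"  # the store changes, the index is not told
stale = alphabetical_list(index, store, "a")

index.rebuild(store)  # only an explicit sync step brings the index up to date
fresh = alphabetical_list(index, store, "a")
```

Here stale misses the new concept until the index is rebuilt, and every lookup costs two round trips instead of one combined SPARQL query.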

Katerina Gkirtzou

Dec 6, 2016, 10:29:08 AM
to Skosmos Users

I tried running the above SPARQL query, slightly modified, in Virtuoso's web SPARQL endpoint. By modified I mean that I removed the VALUES clause, since I didn't know which values it requires or where to retrieve them in order to test with them as well, and I closed the brackets to make the query syntactically correct. The modified query ran without problems. I don't know whether the values themselves, or the way they are concatenated with the SPARQL query, cause the syntax error.
 

> Which version of Virtuoso are you using?

We have version 06.01.3127 of Virtuoso
 

> May I ask why you want to use Skosmos with Virtuoso? Is it important for
> your use case that the triple store has to be Virtuoso?

The main reason is that we already use Virtuoso as the triple store for our RDF data, since it is more suitable for the size of our data. So ideally we would like to upload our controlled vocabularies there as well and provide a nice documentation page for them, such as SKOSMOS.

> That would require a process for keeping the text index synchronized
> with the RDF data, which in principle can change any time. Additionally,
> with a separate text index, queries from the application (eg Skosmos)
> would have to be targeted separately to the text index and the triple
> store. This could also cause performance problems.


Yes, it is true that synchronization would add overhead, but I thought I would ask.

Katerina

Osma Suominen

Dec 7, 2016, 5:48:57 AM
to skosmo...@googlegroups.com
06.12.2016, 17:29, Katerina Gkirtzou wrote:

> I have tried running the above SPARQL query with a little modification
> in the web sparql endpoint of virtuoso. By modification I mean removing
> the value clause as I didn't know what are the values it requires or
> from where can I retrieve them in order to check with them as well and
> of course closing the brackets to make it the SPARQL query syntactically
> correct. When I ran that modified query in the web sparql endpoint of
> virtuoso, I had no problem. I don't know if the values or how it
> concatenates them with the sparql query generates syntactical errors.

The VALUES block normally contains the URI
<http://www.w3.org/2004/02/skos/core#Concept> as the only value. You can
see the full query template here:
https://github.com/NatLibFi/Skosmos/blob/master/model/sparql/GenericSparql.php#L1144
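
To isolate whether the VALUES clause itself is what the endpoint rejects, a much smaller pair of queries can be tried: one with a trailing VALUES block holding skos:Concept as its only value, and one that expresses the same restriction with a FILTER. The Python sketch below builds both. The function name build_probe_queries is hypothetical and the queries are a simplification for testing, not the actual Skosmos query template.

```python
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"


def build_probe_queries(graph: str) -> dict:
    """Return two small queries that differ only in how ?type is restricted."""
    # In SPARQL 1.1 a trailing VALUES clause comes after solution modifiers
    # such as LIMIT.
    values_query = (
        "SELECT ?s\n"
        "WHERE {\n"
        f"  GRAPH <{graph}> {{ ?s a ?type . }}\n"
        "}\n"
        "LIMIT 5\n"
        f"VALUES (?type) {{ (<{SKOS_CONCEPT}>) }}\n"
    )
    # Same restriction written as a FILTER, without the VALUES keyword.
    filter_query = (
        "SELECT ?s\n"
        "WHERE {\n"
        f"  GRAPH <{graph}> {{ ?s a ?type . FILTER (?type = <{SKOS_CONCEPT}>) }}\n"
        "}\n"
        "LIMIT 5\n"
    )
    return {"values": values_query, "filter": filter_query}
```

If the "values" query fails with the same SP030 syntax error while the "filter" query succeeds, that would suggest the endpoint's SPARQL parser does not accept the VALUES keyword, rather than a problem in how the value is concatenated into the query.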

It is currently needlessly difficult to find out a particular SPARQL
query that Skosmos is performing. I just made a new GitHub issue (#566)
for adding a debug log facility, which would make this easier when it
gets implemented in a future version of Skosmos.

If you want to see the actual query, the easiest way currently is to add
a line such as

echo "<!-- $query -->\n";

to the GenericSparql->generateAlphabeticalListQuery method just before
"return $query;" which is on or around (depending on exact Skosmos
version) line 1172 in the file model/sparql/GenericSparql.php.
Then just reload the page and check the HTML source code to see the
SPARQL query that Skosmos tried to execute.

> We have version 06.01.3127 of Virtuoso

OK. It would be good if you could open an issue on GitHub about this
problem, so that it gets tracked somewhere. Generally we would like to
support all SPARQL 1.1 compliant triple stores, but we don't have many
resources to investigate triple stores other than Fuseki ourselves.

> The main reason is that we already use Virtuoso as a triple store for
> our RDF data, it is more suitable for the size of our data. So ideally
> we would like to also upload our controlled vocabularies as well and to
> provide an nice documentation page, such as SKOSMOS.

I see. I think it's likely that other people/organizations are in the
same situation. As I said, Virtuoso was not an option for us because the
text index is not very suitable for this kind of use.

Katerina Gkirtzou

Dec 7, 2016, 6:10:35 AM
to Skosmos Users

Perfect! I will definitely try it and see if I can get a better grasp of what's going on :)
Thanks! I will let you know if I find out anything.

Katerina
