Base URI for vocabulary // graphdb for vocbench & showvoc


Steffen Franke

Jun 7, 2024, 3:38:15 AM
to vocbench-user

Hi there, I haven’t read all the manuals or watched all the videos, but I have gone through at least a couple of them. Still, I have some basic questions that I couldn’t answer from the docs, although everybody except me seems to know the answers.

Background:

I want to manage some controlled vocabulary for my institutional research data repository with vocbench3. I managed to install showvoc, vocbench3 and graphdb in docker containers behind a proxy. I created my first project and started my first vocabulary.

Questions:

1.) I don’t understand how to publish a controlled vocabulary. How do I set a meaningful URI? Of course I could enter a fictitious URI/URL when creating a project, but in the end I want to have something like this, don’t I?

https://agrovoc.fao.org/browse/agrovoc/en/page/c_3224

I expected showvoc to do the job (assuming it is an equivalent of the skosmos used by agrovoc). But showvoc does not suggest any URI/URL under which it might present the vocabulary to the internet/intranet either. If this is written in the manuals, just point me to the corresponding page. I didn’t find it.

2.) Why should I set up one graphdb container for showvoc and one graphdb for vocbench as suggested here:

https://bitbucket.org/art-uniroma2/showvoc-docker/src/master/

Why not use the same graphdb container for both apps? Did I miss something? Is there some automation tool to export the vocabulary from vocbench3 and import it into showvoc from time to time? Is it really meant to be done manually? Why not use the same database in both apps?

If I could understand those two points, the rest should be mainly learning SKOSXL. Thank you for your support.

Steffen.

Roland Wingerter

Jun 8, 2024, 10:17:30 AM
to vocbench-user
Hi Steffen,

welcome to the VocBench group!

1. How to set up a meaningful URI
When you create a new project in VB3, you need to provide a baseuri. You can choose any URI (like "http://example.com" or "http://meintest.de/"). But a "meaningful" URI is one that you can use to publish your data on the web, so you will want to choose a URI that is under your control. For example, if you own the domain steffenfranke.de, you could use "http://steffenfranke.de/project1" as the baseuri of your project.
Let's assume you create a SKOS project and go with the standard configuration. When you start entering concepts, VB3 will create a unique URI for each concept by appending a randomly generated local name to the baseuri, so you will get something like "http://steffenfranke.de/project1/c_47581ea9", where "47581ea9" is a sequence of randomly generated characters.
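
To illustrate the naming scheme described above, here is a minimal Python sketch. It is not VB3's actual generator: the function name `mint_concept_uri`, the "c_" prefix default, the 8 hexadecimal characters, and the way the separator is chosen are all assumptions modelled on the example URI in this message.

```python
import secrets


def mint_concept_uri(base_uri: str, prefix: str = "c_", hex_chars: int = 8) -> str:
    """Append a random local name such as 'c_47581ea9' to the base URI.

    Hypothetical helper: it only mimics the shape of the URIs shown in the
    thread, not the real generator used by VocBench.
    """
    # token_hex(n) returns 2*n hexadecimal characters.
    local_name = prefix + secrets.token_hex(hex_chars // 2)
    # Assumption: join with "/" unless the base URI already ends in "/" or "#".
    separator = "" if base_uri.endswith(("/", "#")) else "/"
    return base_uri + separator + local_name


print(mint_concept_uri("http://steffenfranke.de/project1"))
```

Each call prints a different URI of the form "http://steffenfranke.de/project1/c_xxxxxxxx".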

2. Your second question is more difficult for me to answer, but I hope this message might help: https://groups.google.com/g/vocbench-user/c/FeZb6lCzoWc/m/nVSchtU3AQAJ

Kind regards
Roland

Armando Stellato

Jun 9, 2024, 8:28:52 AM
to Steffen Franke, vocbench-user

Dear Steffen,

 

thanks for reaching out. I hope to give a satisfactory reply to both your questions, even though I’m not 100% sure I got the first one right.

 

So, question 1.

 

I got confused because you mention a “meaningful URI”, so I guess you were referring to the URIs of the datasets you publish. However, you then mention a URL from the SKOSMOS installation of Agrovoc. That is the full address for reaching the page of a concept, which is however not the URI of the concept ;-)

This is the true URI of that concept:

 

http://aims.fao.org/aos/agrovoc/c_3224

 

 

So, instead of trying to be sure what was exactly meant there, I’ll try to reply more extensively (and apologies if I say something, or more than something, trivial to you).

 

First of all, ShowVoc also provides URIs for directly accessing concepts. See, for instance, this one:

 

https://stats.fao.org/caliper/browse/showvoc/#/datasets/CPC2.1/data?resId=https:%2F%2Funstats.un.org%2Fclassifications%2FCPC%2Fv2.1%2F2431

 

Now, you could say that the URI you presented is nicer than the one above. I’ll explain why.

 

This one you reported: https://agrovoc.fao.org/browse/agrovoc/en/page/c_3224, configured in SKOSMOS, assumes that by solely indicating c_3224 and knowing the baseuri of the hosted resource (Agrovoc, with URI: http://aims.fao.org/aos/agrovoc/ ) you can resolve the resource c_3224 with URI: http://aims.fao.org/aos/agrovoc/c_3224

 

In ShowVoc, you can access any single URI of any hosted resource, even dependencies, so the full URI has to be specified in order not to be ambiguous. Now, the full URI contains characters that need to be escaped, hence the long name with special escaped characters that you see in the one I reported here above.
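
The difference between the two addressing styles can be sketched in a few lines of Python. The helper names are hypothetical, and the escaping rule (keep ":" literal, escape "/") is only inferred from the ShowVoc link quoted above, not taken from ShowVoc's code.

```python
from urllib.parse import quote, unquote

AGROVOC_BASE = "http://aims.fao.org/aos/agrovoc/"


def resolve_local_name(base_uri: str, local_name: str) -> str:
    """SKOSMOS-style: one known base URI plus a short local name."""
    return base_uri + local_name


def escape_resource_uri(resource_uri: str) -> str:
    """ShowVoc-style: the full URI travels in the link, so "/" must be escaped."""
    return quote(resource_uri, safe=":")


print(resolve_local_name(AGROVOC_BASE, "c_3224"))
escaped = escape_resource_uri("https://unstats.un.org/classifications/CPC/v2.1/2431")
print(escaped)
# Unescaping recovers the original, unambiguous URI.
print(unquote(escaped))
```

The first approach gives short links but only works for resources under the configured base URI; the second works for any URI at the price of a longer link.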

 

Is this a problem? No, it is not, because in any case neither of them is the true URI of the resource (which, again, is: http://aims.fao.org/aos/agrovoc/c_3224 and in this specific case you will see it resolved through another tool, called Loddy).

 

Now, in order to publish a resource, you need to set up an architecture for LOD publication. Here is a good reading:

 

http://linkeddatabook.com/book

 

Besides the reading above, I can give you some more info in short. In order to publish your dataset, you need to be the owner of the domain, so choose your domain wisely. Then you, or your IT dept, need to be able to configure the services there so that requests are appropriately rerouted. There is no way around this: before ShowVoc, Skosmos, or any other system comes into play, the requests must be properly routed. I’m sorry, but hosting data is not just a matter of configuring a tool: the requests have to reach the tool first, and the tool cannot make that happen through its own configuration.

 

Now, there is another point: in linked open data publication, HTTP resolution (i.e. replying to the requests coming to the URIs of your dataset’s resources) also requires setting up content negotiation.

See this: http://linkeddatabook.com/editions/1.0/#htoc11

 

In short, content negotiation means that if you ask for the generic resource http://aims.fao.org/aos/agrovoc/c_3224

you will get different responses depending on what you ask for. E.g., if you ask for a web page (your web browser will configure the request in this way, you don’t have to do anything), you will be redirected to this page:

 

https://aims.fao.org/aos/agrovoc/c_3224.html

 

while if you ask for RDF triples, possibly serialized in a format of your choice, the system you are using must configure the request appropriately. E.g. VocBench or ShowVoc can access the same resource URI above, get the RDF triples instead of the web page, and then show their content in their user interface.

This is an example, serialized in turtle, just to give an idea:

 

https://aims.fao.org/aos/agrovoc/c_3224.ttl
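
From the client side, "configuring the request" boils down to setting the HTTP Accept header. A small Python sketch, using only the standard library, is given below. The helper name `build_request` is hypothetical, and the requests are only built here, not sent, so nothing is claimed about what the server returns.

```python
from urllib.request import Request

CONCEPT_URI = "http://aims.fao.org/aos/agrovoc/c_3224"


def build_request(uri: str, media_type: str) -> Request:
    """Build a GET request for the same URI, asking for a given media type."""
    return Request(uri, headers={"Accept": media_type})


# What a browser roughly asks for: a web page.
html_request = build_request(CONCEPT_URI, "text/html")
# What an RDF-aware client can ask for: Turtle-serialized triples.
turtle_request = build_request(CONCEPT_URI, "text/turtle")

# Sending is left out on purpose; with network access it would be e.g.
#   urllib.request.urlopen(turtle_request)
print(turtle_request.full_url, turtle_request.get_header("Accept"))
```

Both requests target the same URI; only the Accept header differs.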

 

This content negotiation also needs to be configured on the servers.
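
As a rough idea of what the server side has to decide, the following Python sketch picks a redirect target from the Accept header. It is a simplification under stated assumptions: only the two representations seen above (".html" and ".ttl") are supported, wildcards such as "*/*" are ignored, and an unsupported request falls back to HTML, where a real server might answer 406 instead. The names `SUPPORTED`, `parse_accept` and `redirect_target` are hypothetical and this is not how AGROVOC or ShowVoc implement it.

```python
# Media types this hypothetical server can deliver, mapped to file suffixes.
SUPPORTED = {
    "text/html": ".html",
    "text/turtle": ".ttl",
}


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def redirect_target(resource_uri: str, accept_header: str) -> str:
    """Choose the representation to redirect to for a generic resource URI."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in SUPPORTED:
            return resource_uri + SUPPORTED[media_type]
    return resource_uri + SUPPORTED["text/html"]  # assumed fallback


uri = "https://aims.fao.org/aos/agrovoc/c_3224"
print(redirect_target(uri, "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
print(redirect_target(uri, "text/turtle"))
```

In practice this logic usually lives in the web server or proxy configuration rather than in application code.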

 

Now, luckily 2 years ago we introduced a feature in ShowVoc that facilitates this work by allowing system administrators to configure the content negotiation:

 

https://showvoc.uniroma2.it/doc/user/http_resolution.jsf

 

In that case, ShowVoc can act as both resolvers: the web page will be ShowVoc’s own page for the concept (with a nicer URI in this case ;-) ), and the RDF content can be provided by ShowVoc as well.

 

Please note that this cannot replace the first necessary step (you or your IT dept must still configure the servers responding to requests on your domain so that the requests are properly sent to ShowVoc), but it will spare you from configuring the content negotiation.

 

A final note: this is not an end-user/domain expert task; as you can imagine, it requires some IT skills to understand the various steps, even though the support from ShowVoc helps a lot, specifically for the content negotiation. This is the reason why, besides the page on HTTP resolution in the ShowVoc manual (which, as you may notice, is in the “administrator” section and not in the “user” one), there is no other material. Publishing linked open data is not something directly related to the tools; it concerns your whole data publication architecture and setup.

 

Question 2

 

This one is very easy to answer. Just using your same example: FAO gets (depending on the service) from tens to hundreds of millions of requests per year for accessing Agrovoc’s content. Comparatively, even though there is a worldwide community editing Agrovoc, there are far fewer requests for editing, as you can imagine, but each of them might be more cumbersome. I think this gives you an answer on why it might be better to have separate triple stores in a production environment ;-)

That said, you can use the same triple store for both. Technically, the only annoyance can be name clashes (the store’s repository names are created after the project’s name in the application), but you will be informed by the application if you stumble upon an existing one, and you can override the choice of name for the repository. Or you could even specify that you want to reuse the same repo, thus de facto sharing the same content between the two. With some hacking, you can actually use the same Semantic Turkey server with two user interfaces, but this is discouraged: it’s a hack and undocumented on purpose, and we cannot support it.

With respect to your question about deploying from VB to SV, here’s the doc page:

 

https://vocbench.uniroma2.it/doc/user/global_data_management.jsf#export_data

 

which mentions the existence of deployers.

 

Just for reference, this is the page that will be created for deployers (still work in progress)

https://vocbench.uniroma2.it/doc/user/ioext/deployers/

 

but both this one and the previous one link to the Semantic Turkey manual (which contains all the technical info and the API). That manual describes a ShowVoc deployer, which should be pretty easy to configure (in case of doubts, this page describes the meaning of each entry of the configuration: https://semanticturkey.uniroma2.it/doc/sys/deployer.jsf#showvoc_deployer )

 

Kind Regards,

 

Armando

 

 

 

 

 

 

 

 


Steffen Franke

Jun 12, 2024, 3:40:49 AM
to vocbench-user
Thank you, Roland and Armando, for your answers.

1. I will define a URI for my vocabulary that fulfills the following criteria:
   - The URI looks like a URL, is intuitive, and "nice".
   - The domain is owned by my organization.
   - We are potentially able to set up a service that redirects all requests to the URI/URL to a ShowVoc instance.
   - Until that service is set up at some point in the future, the URI is just a name.

2. For small installations, using one GraphDB for both VocBench and ShowVoc is not a problem (I have already done it).
   For production environments, I should rethink the concept.

Kind regards,
Steffen