VocBench questions -- compatibility with other triple stores & cloud install

Leia Dickerson

Apr 4, 2025, 10:47:04 AM

to vocbench-user

Hello all--

I understand that VocBench has RDF4J embedded, and that GraphDB is the recommended external triple store to use.

I am interested to know if anyone has used another external triple store with VocBench successfully, and what your experiences were with implementation.

Also, if anyone has implemented VocBench in an Azure cloud instance, please comment on any specific challenges you had or items to keep in mind.

Many thanks.

Leia Dickerson

US Government Accountability Office

stel...@uniroma2.it

Apr 4, 2025, 11:17:30 AM

to Leia Dickerson, vocbench-user

Dear Leia,

Thanks for your email. There’s a general story in the FAQ encompassing several aspects of the triple store interoperability challenges.

Here’s the link:

https://vocbench.uniroma2.it/doc/faq/#general

It’s the first Q&A under that section.

That said, nothing is impossible, but that story gives an idea of the challenges and tradeoffs that integrating other triple stores would imply. As a first attempt, I would say that, given a triple store X, examining an extension for X is a possibility if:

- it is not a problem to lose (at least in a first implementation) the features associated with native plugins deployed on the store (e.g. history and validation);
- X has an RDF4J client that can be deployed within VB;
- it is OK to invest in testing the whole API and possibly revising those services that do not perform well on X;
- X implements the notion of repository and supports named graphs (again, in compliance with the RDF4J API).

Kind Regards,

Armando

--
You received this message because you are subscribed to the Google Groups "vocbench-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vocbench-use...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/vocbench-user/041a3ecb-86fa-454b-ba1d-640347bf36c3n%40googlegroups.com.

Dickerson, Leia J

Apr 15, 2025, 10:33:43 AM

to stel...@uniroma2.it, vocbench-user

Please forgive the delayed reply. Many thanks, Armando.

Leia

Leia Dickerson | Taxonomy + Research Lead | ARM CLS | US GAO



Emidio Stani

Apr 16, 2025, 2:16:50 AM

to Dickerson, Leia J, Armando Stellato, vocbench-user

Hello all,

I believe it is an interesting discussion.

I have set up VocBench with GraphDB for a client, and I am wondering whether they could switch to a different repository in the future.

I would separate the history and validation functionalities from the storage. For example, validation could be handled by an external service such as the ITB SHACL validator (https://github.com/ISAITB/shacl-validator), which I use anyway to validate RDF data in a data-flow process.

For the history, I see similarities with the skos-history approach of STW (https://github.com/jneubert/skos-history/wiki/Versions-and-Deltas-as-Named-Graphs); of course, reimplementing it might take some time.

I see the advantages of using RDF4J (and, indirectly, GraphDB) for its transactions. Indeed, one could use any SPARQL repository thanks to its API (https://rdf4j.org/documentation/programming/repository/#access-over-http), as Metaphacts does (https://help.metaphacts.com/resource/Help:RepositoryManager). However, one needs to know the limitations that come with each triple store; for example, Neptune doesn't support SHACL validation.
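
A generic SPARQL repository is possible because the SPARQL 1.1 Protocol is plain HTTP: a query can be sent as a GET request with a URL-encoded `query` parameter. The sketch below only builds such a request with the Python standard library and does not send it. The endpoint URL and the helper name `build_sparql_request` are assumptions made for illustration; this is not VocBench or RDF4J code.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a hypothetical endpoint URL, used purely for illustration.
ENDPOINT = "http://localhost:8080/rdf4j-server/repositories/example"


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol query via HTTP GET."""
    # The protocol carries the query text in the URL-encoded "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for SELECT results in the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
```

Passing `request` to `urllib.request.urlopen` would execute the query against a live endpoint. Nothing in this exchange is store-specific, which is also why features such as transactions or SHACL validation are not covered by it.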

Cheers,

Emidio


Emidio Stani

stel...@uniroma2.it

Apr 16, 2025, 6:04:43 AM

to Emidio Stani, Dickerson, Leia J, vocbench-user

Dear Emidio,

Thanks for contributing to the discussion. Without being critical of your points (which are positive suggestions, and I appreciate that), I just want to highlight a number of aspects, with respect to what you said, that support the current choices:

Separation of the history and validation features. The way they are implemented (not just a “narrative” of what happened, but a very precise list of changes) requires knowing exactly which triples have been changed by an operation. You can’t know that unless you are sitting on top of the triple store, so you can’t decouple them. I can go into more detail if needed but, in short: if I ask to write triple <a,b,c> and it already exists, all I get as a client of the triple store is a positive response to the operation, whereas I need to know that the triple has not actually been written because it was already there.
What we have done is keep them separated enough (e.g. we are not using the analogous services provided specifically by GraphDB), in that they are based on the SAIL architecture, which is part of our required architecture for the triple store (if all advanced functionalities are to be used). From there, other possibilities can obviously be explored, but knowing what you are losing as well.
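
The <a,b,c> case can be illustrated with a toy model. This is a sketch in plain Python, not RDF4J or VocBench code, and the class names `PlainStore` and `TrackingStore` are hypothetical. The first class stands for what a client sees (the write always succeeds); the second stands for a layer inside the store, which can tell whether the triple was actually new and log only real changes.

```python
class PlainStore:
    """Toy triple store as seen by a client: add() only reports success."""

    def __init__(self):
        self._triples = set()

    def add(self, triple):
        self._triples.add(triple)  # a duplicate is silently absorbed
        return True                # the client only ever sees "OK"


class TrackingStore(PlainStore):
    """Toy layer inside the store: it can see whether the triple was new."""

    def __init__(self):
        super().__init__()
        self.changelog = []

    def add(self, triple):
        is_new = triple not in self._triples
        super().add(triple)
        if is_new:
            # Only effective changes end up in the history.
            self.changelog.append(("added", triple))
        return True


plain = PlainStore()
first = plain.add(("a", "b", "c"))
second = plain.add(("a", "b", "c"))  # same answer as the first call

tracked = TrackingStore()
tracked.add(("a", "b", "c"))
tracked.add(("a", "b", "c"))         # no second changelog entry
```

Both calls on `plain` return the same answer, so a client cannot tell them apart, while `tracked.changelog` holds a single entry.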

VB exploits the RDF4J support for SHACL, which can be used both for inline validation (i.e. rejecting incompatible commits) and for batch validation (i.e. you edit everything and then check your data every once in a while). However, SHACL implements syntax validation; content is another matter. The validation feature of VB is mainly intended to address that aspect, allowing domain experts to validate actions.

I don’t know Metaphacts in detail, so maybe it does use a generic HTTP repository for the main data repository, but what I see is different: it has specific repository connectors for various technologies (Stardog, GraphDB, RDFox, AllegroGraph, etc.), which suggests the opposite, and then it has a generic SPARQL Repository for connecting to any triple store. Note this:
“By default, the platform works with one specific repository, but it is able to access data from as many repositories as needed”
which is also what VB does. We have different connectors, and in the MDR (Metadata Registry) of VocBench or ShowVoc you can see that specific technologies are acknowledged so that access to them can be optimized. However, SPARQLRepository is used for accessing generic SPARQL endpoints. VB and SV are not only used to edit/show their own data; they are fully fledged semantic web data browsers. By default, if a dataset is not known, they exploit linked data publishing best practices to get access to its metadata, so they can know whether a SPARQL endpoint is available for any URI they stumble upon. If they manage to find one, they will use it, possibly optimizing the query on the basis of the metadata they found. If not, they HTTP-resolve the URI. In all cases, the content is shown in the resource view seamlessly, as if it were local data.
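
The lookup order described above can be sketched as a small decision function. This is only an illustration of the fallback logic, not how VocBench or ShowVoc implement it: the function name `choose_access`, the dictionary standing in for the Metadata Registry, and the `discover` callback standing in for linked-data metadata discovery are all hypothetical.

```python
def choose_access(uri, registry, discover=lambda uri: None):
    """Decide how to fetch a resource: SPARQL endpoint if one is known or
    can be discovered, plain HTTP resolution of the URI otherwise."""
    # 1. Known dataset: the registry maps a namespace to its SPARQL endpoint.
    for namespace, endpoint in registry.items():
        if uri.startswith(namespace) and endpoint:
            return ("sparql", endpoint)
    # 2. Unknown dataset: try to discover an endpoint from published metadata.
    endpoint = discover(uri)
    if endpoint:
        return ("sparql", endpoint)
    # 3. Nothing found: fall back to resolving the URI over HTTP.
    return ("http", uri)


# Assumption: example namespaces and endpoints, made up for this sketch.
registry = {"http://example.org/thesaurus/": "http://example.org/sparql"}

known = choose_access("http://example.org/thesaurus/c_1", registry)
unknown = choose_access("http://other.example/c_9", registry)
found = choose_access(
    "http://other.example/c_9",
    registry,
    discover=lambda uri: "http://other.example/sparql",
)
```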

Indeed, in reference to the third point (on Metaphacts) but also more in general, one good thing could certainly be to have dedicated connectors for some other technologies, so that access is optimized for them. These would be meant for accessing external data but, who knows, in some cases (and with due limitations, see also my previous email on this) they could be considered as connectors for main data storage.