Write NTriples with blank nodes to remote repository


F Amer

Dec 18, 2020, 4:20:58 AM
to RDF4J Users
I have a JSON-LD document that results in triples with blank nodes.
If there were no blank nodes, I could simply use a StatementCollector or the Model API to load the statements into a named context.

What is the best way to load these statements into a named graph using a StatementCollector or the Model API?

Sample set of statements

_:genidxxxxxxxx <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example.com#3333> .
_:genidxxxxxxxx <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:genidxxxxxxxx <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#List> .
_:genidxxxxxxxx <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#List> .
<http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://example.com#1111> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://example.com#2222> <http://purl.org/dc/terms/hasPart> _:genidxxxxxxxx .
<http://example.com#2222> <http://purl.org/dc/terms/hasVersion> <http://example.com#1111> .
<http://example.com#2222> <http://purl.org/dc/terms/relation> _:genidxxxxxxxx .
<http://example.com#2222> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://example.com#2222> <urn:acme:com#avId> <http://example.com#1111> .
<http://example.com#2222> <urn:acme:com#part2> _:genidxxxxxxxx .
<http://example.com#2222> <urn:acme:com#type> <urn:acme:com#Main> .
<http://example.com#3333> <http://purl.org/dc/terms/hasVersion> <http://example.com#4444> .
<http://example.com#3333> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://example.com#3333> <urn:acme:com#type> <urn:acme:com#part1> .
<http://example.com#3333> <urn:acme:com#version> <http://example.com#4444> .
<http://example.com#4444> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<urn:acme:com#Main> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<urn:acme:com#part1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Resource> .

Bart Hanssens (BOSA)

Dec 18, 2020, 4:31:20 AM
to RDF4J Users
Hi,

Well, one option would be to turn the blank nodes into Skolem IRIs first.

See also

So that would be something like:

rdfParser.getParserConfig().set(BasicParserSettings.SKOLEMIZE_ORIGIN, "http://example.com");


Best regards

Bart


F Amer

Dec 20, 2020, 3:05:04 AM
to RDF4J Users
Thank you, that does solve part of the problem.

However, I do have a follow-up question.
The numeric part of the identifier for the blank node keeps changing even if the input JSON-LD does not change. This is the hexadecimal string before "-b0" in the examples below. It makes the same b0 node appear to be different between two runs through the parser, even though it is still b0.

First trial
<http://example.com/.well-known/genid/15606f030e93444c91af74944f190d51-b0> <urn:acme:com#id> <urn:acme:com:2222>  .

Second trial
<http://example.com/.well-known/genid/0d3457fda5f34f219b20feec65c3902c-b0> <urn:acme:com#id> <urn:acme:com:2222>  .

Is there a way to avoid this?

Jeen Broekstra

Dec 20, 2020, 4:55:44 PM
to RDF4J Users
On Sun, Dec 20, 2020, at 19:05, F Amer wrote:
> Thank you, that does solve part of the problem.
>
> However, I do have a follow-up question.
> The numeric part of the identifier for the blank node keeps changing even if the input JSON-LD does not change. This is the hexadecimal string before "-b0" in the examples below. It makes the same b0 node appear to be different between two runs through the parser, even though it is still b0.

Yes. The parser generates a unique id for parsed blank nodes deliberately, to avoid clashes. There is a configuration option in the parser to preserve blank node identifiers, but it will not work in combination with SKOLEMIZE_ORIGIN.
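
The configuration option referred to here is presumably BasicParserSettings.PRESERVE_BNODE_IDS. A minimal sketch of using it on its own, without SKOLEMIZE_ORIGIN, might look like this. The class name PreserveBNodeIds, the N-Triples input format, and the base URI are illustrative assumptions and do not come from the thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collection;

import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.RDFParser;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.helpers.BasicParserSettings;
import org.eclipse.rdf4j.rio.helpers.StatementCollector;

class PreserveBNodeIds {

    /** Parses N-Triples text and keeps blank node labels exactly as written in the input. */
    static Collection<Statement> parse(String ntriples) throws IOException {
        RDFParser parser = Rio.createParser(RDFFormat.NTRIPLES);
        // Without this setting the parser replaces each label with a fresh unique id.
        parser.getParserConfig().set(BasicParserSettings.PRESERVE_BNODE_IDS, true);

        StatementCollector collector = new StatementCollector();
        parser.setRDFHandler(collector);
        // The base URI is an assumption; N-Triples contains only absolute IRIs.
        parser.parse(new StringReader(ntriples), "http://example.com/");
        return collector.getStatements();
    }

    public static void main(String[] args) throws IOException {
        String data = "_:b0 <urn:acme:com#id> <urn:acme:com:2222> .\n";
        for (Statement st : parse(data)) {
            // With the setting enabled, the subject keeps the label from the input.
            System.out.println(st.getSubject().stringValue());
        }
    }
}
```

Preserved labels are only as stable as the labels in the input, and identical labels from unrelated documents can clash, which is the situation the default behaviour is meant to avoid.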

We could take that on board as a feature request (it wouldn't be hard to fix I think), but I'm a little confused about what you're trying to achieve to be honest.

Your first post said something about wanting to get rid of blank nodes in parsed JSON-LD because you want to put things in a named graph using a Model - what makes you think you need to get rid of blank nodes to do that?

Jeen

F Amer

Dec 22, 2020, 9:12:14 AM
to RDF4J Users
I was trying to use the @list container type instead of @set in my JSON-LD context, since I need to preserve order information. This results in blank nodes in the RDF. Without the blank nodes it was simple: the Rio parser would create triples or quads that could be loaded directly into a remote repository.

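        import java.util.Collection;
        import org.eclipse.rdf4j.model.Statement;
        import org.eclipse.rdf4j.rio.RDFFormat;
        import org.eclipse.rdf4j.rio.RDFParser;
        import org.eclipse.rdf4j.rio.Rio;
        import org.eclipse.rdf4j.rio.helpers.StatementCollector;

        // is: InputStream with the JSON-LD, path: base URI string,
        // connection: open RepositoryConnection, namedGraph: target context
        RDFParser rdfParser = Rio.createParser(RDFFormat.JSONLD);
        StatementCollector statementCollector = new StatementCollector();
        rdfParser.setRDFHandler(statementCollector);
        rdfParser.parse(is, path);

        // get the statements from the collector
        Collection<Statement> statements = statementCollector.getStatements();

        // add the statements to the specified named graph using a RepositoryConnection
        connection.add(statements, namedGraph);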

With blank nodes this results in an error unless I use Skolem IRIs.

However, in my case I also need to track statements that may have changed for a named graph in subsequent requests. The Skolem IRIs for all blank nodes are flagged as changes even if none of the container items have changed.

So it appears to me that either I should not use @list and should find some other way to preserve order information, or the comparison logic should change.
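
One way to keep a statement-level comparison while still using Skolem IRIs, which is not something proposed in this thread, is to replace the per-run hexadecimal part with a stable key before comparing. The sketch below works on N-Triples lines with plain string handling. The class SkolemDiff, its methods, and the key "graph1" are hypothetical, and it assumes the parser assigns the same local labels (b0, b1, ...) when the input is unchanged, as in the two trials shown earlier.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SkolemDiff {

    // Matches the per-run part of a Skolem IRI of the form seen in the thread:
    // .../.well-known/genid/<32 hex characters>-<label>
    private static final Pattern RUN_ID =
            Pattern.compile("(?<prefix>/\\.well-known/genid/)[0-9a-f]{32}-");

    /** Replaces the per-run id in one N-Triples line with a caller-chosen stable key. */
    static String normalize(String line, String stableKey) {
        String replacement = "${prefix}" + Matcher.quoteReplacement(stableKey) + "-";
        return RUN_ID.matcher(line).replaceAll(replacement);
    }

    /** Normalizes every line and returns the result as a sorted set. */
    static Set<String> normalizeAll(List<String> lines, String stableKey) {
        return lines.stream()
                .map(line -> normalize(line, stableKey))
                .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Lines present in 'after' but not in 'before' once run ids are normalized. */
    static Set<String> added(List<String> before, List<String> after, String stableKey) {
        Set<String> result = normalizeAll(after, stableKey);
        result.removeAll(normalizeAll(before, stableKey));
        return result;
    }

    public static void main(String[] args) {
        List<String> first = List.of(
                "<http://example.com/.well-known/genid/15606f030e93444c91af74944f190d51-b0> <urn:acme:com#id> <urn:acme:com:2222>  .");
        List<String> second = List.of(
                "<http://example.com/.well-known/genid/0d3457fda5f34f219b20feec65c3902c-b0> <urn:acme:com#id> <urn:acme:com:2222>  .");
        // The two trials differ only in the run id, so nothing is reported as added.
        System.out.println(added(first, second, "graph1").isEmpty()); // prints true
    }
}
```

Swapping the arguments of added gives the removed statements. If list items are inserted or reordered, the local labels may shift and unchanged list cells can still show up as differences.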

Jeen Broekstra

Dec 22, 2020, 4:35:48 PM
to RDF4J Users

On Wed, Dec 23, 2020, at 01:12, F Amer wrote:
> I was trying to use the @list container type instead of @set in my JSON-LD context, since I need to preserve order information. This results in blank nodes in the RDF.
>
> Without the blank nodes it was simple: the Rio parser would create triples or quads that could be loaded directly into a remote repository.
>
>         RDFParser rdfParser = Rio.createParser(RDFFormat.JSONLD);
>         StatementCollector statementCollector = new StatementCollector();
>         rdfParser.setRDFHandler(statementCollector);
>         rdfParser.parse(is, path);
>
>         // get the statements from the collector
>         Collection<Statement> statements = statementCollector.getStatements();
>
>         // add the statements to the specified named graph using a RepositoryConnection
>         connection.add(statements, namedGraph);
>
> With blank nodes this results in an error unless I use Skolem IRIs.

Ok, that isn't right. An RDF repository should be able to load the model with no problems, even if it has blank nodes in it.

What type of repository are you using, and what is the exact error you are getting?

> However, in my case I also need to track statements that may have changed for a named graph in subsequent requests. The Skolem IRIs for all blank nodes are flagged as changes even if none of the container items have changed.
>
> So it appears to me that either I should not use @list and should find some other way to preserve order information, or the comparison logic should change.

How are you comparing models?

Jeen


F Amer

Dec 24, 2020, 1:33:14 PM
to rdf4j...@googlegroups.com
Regarding the error: I realized there was a bug in my code that added an extra "_:" to the blank node ID, which resulted in the exception "Malformed query: Default namespace used but not defined". This is fixed now. Thank you for your input.

Regarding comparison: I was going to compare using the values of subject, predicate and object, but this does not seem feasible with blank nodes. Most likely I need to use a different approach.
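
For a yes/no answer on whether two parsed models differ only in their blank node labels, RDF4J provides Models.isomorphic. A minimal sketch, assuming the models are built in memory rather than parsed: the class IsomorphicCheck and the helper oneItemList are illustrative names, and the data mirrors the one-element list from the first post.

```java
import org.eclipse.rdf4j.model.BNode;
import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.model.ValueFactory;
import org.eclipse.rdf4j.model.impl.LinkedHashModel;
import org.eclipse.rdf4j.model.impl.SimpleValueFactory;
import org.eclipse.rdf4j.model.util.Models;
import org.eclipse.rdf4j.model.vocabulary.RDF;

class IsomorphicCheck {

    private static final ValueFactory VF = SimpleValueFactory.getInstance();

    /** Builds a one-element RDF list attached to the sample subject from the thread. */
    static Model oneItemList(String itemIri) {
        Model model = new LinkedHashModel();
        IRI subject = VF.createIRI("http://example.com#2222");
        IRI hasPart = VF.createIRI("http://purl.org/dc/terms/hasPart");
        BNode head = VF.createBNode(); // fresh label on every call, like a new parser run

        model.add(subject, hasPart, head);
        model.add(head, RDF.FIRST, VF.createIRI(itemIri));
        model.add(head, RDF.REST, RDF.NIL);
        return model;
    }

    public static void main(String[] args) {
        Model first = oneItemList("http://example.com#3333");
        Model second = oneItemList("http://example.com#3333");
        Model changed = oneItemList("http://example.com#4444");

        // Same structure, different blank node labels.
        System.out.println(Models.isomorphic(first, second));
        // The list item differs, so the graphs are not isomorphic.
        System.out.println(Models.isomorphic(first, changed));
    }
}
```

This only reports whether the two models match as a whole; it does not list which statements changed.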



Bart Hanssens (BOSA)

Dec 26, 2020, 4:18:50 AM
to rdf4j...@googlegroups.com

F Amer

Dec 29, 2020, 11:28:02 AM
to rdf4j...@googlegroups.com