On marshalling objects with aggregation relationships


Ray Rashif

Mar 30, 2017, 6:05:25 PM
to Empire
Hi

This is in reference to point 4 of https://groups.google.com/forum/#!topic/empire-rdf/wsQiSwtnqYE; I thought it would be better to create a separate thread:

4. For creating something new, use Pinto, and additionally place the new triples in a named graph

So, the problem is, I want to create a new House with an existing inhabitant:

House
--| hasMember Person1
----| id "person1"

This Person1 already exists in the store, so I need to identify it somehow, and I use a literal ID property. If I simply go ahead and create that House object with Pinto, I get new Person1 objects with the same ID but different URIs. How does one go about marshalling this House object, then?

There is no point at which I set the ID of the Person inside House -- I simply want to serialize House into relevant triples. If there is an existing person matching the ID, no new statements should be created. Perhaps I am misunderstanding something somewhere entirely.

I do not see an aggregation example in the Pinto test cases (only composition examples), or did I miss it? Otherwise, is this not something Pinto can do, but Empire will?

Workaround:

If I hack the identifiable nature of the class with:

    @Override
    public Resource id() {
        // this allows us to skip setting the rdf id explicitly, but not sure if correct approach
        return ValueFactoryImpl.getInstance().createURI(NS + "person_" + getId());
        //return mIdentifiable.id();
    }

    // so then this perhaps becomes useless
    @Override
    public void id(Resource theResource) {
        mIdentifiable.id(ValueFactoryImpl.getInstance().createURI(NS + "person_" + getId()));
    }

I am able to work around the issue of "new objects with different URIs". However, I get duplicate `:person_ID rdf:type :Person` and `:person_ID :id "ID"` triples (because the id is a separate field that is set).

And since I create a named graph for every new object, the triple store does not treat them as duplicates (so I cannot just be done with it by deleting duplicate triples). I then need to ensure I show only distinct results anywhere I am using the information.
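
To make the named-graph point concrete, here is a minimal sketch, assuming a store that keeps statements as quads (graph, subject, predicate, object). `QuadDuplicates` and its methods are hypothetical stand-ins written for illustration; they are not part of Pinto, Empire, or any triple store API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a quad store; not a real triple store API.
class QuadDuplicates {
    // A statement is (graph, subject, predicate, object).
    static List<String> quad(String graph, String s, String p, String o) {
        return Arrays.asList(graph, s, p, o);
    }

    // The same triple asserted twice in one graph and once in another graph.
    static Set<List<String>> example() {
        Set<List<String>> quads = new HashSet<>();
        quads.add(quad(":graphA", ":person_1", "rdf:type", ":Person"));
        quads.add(quad(":graphA", ":person_1", "rdf:type", ":Person")); // identical quad: ignored
        quads.add(quad(":graphB", ":person_1", "rdf:type", ":Person")); // same triple, other graph: kept
        return quads;
    }

    // What the store holds: the graph is part of each statement's identity.
    static int storedStatements(Set<List<String>> quads) {
        return quads.size();
    }

    // What a "distinct" view over all graphs would report.
    static int distinctTriples(Set<List<String>> quads) {
        Set<List<String>> triples = new HashSet<>();
        for (List<String> q : quads) {
            triples.add(new ArrayList<>(q.subList(1, 4))); // drop the graph component
        }
        return triples.size();
    }
}
```

Under these assumptions the store keeps two statements for one logical triple, which is why results have to be made distinct at query time.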

Would really appreciate some help with this, as presently this is the only issue left. If there's no other choice, I can try to integrate Empire, if that will solve these problems.

Ray Rashif

Mar 30, 2017, 6:06:09 PM
to Empire
Sorry, this should have been tagged with [Pinto].

Michael Grove

Mar 31, 2017, 1:43:42 PM
to empir...@googlegroups.com
On Thu, Mar 30, 2017 at 6:05 PM, Ray Rashif <schivm...@gmail.com> wrote:
Hi

This is in reference to point 4 of https://groups.google.com/forum/#!topic/empire-rdf/wsQiSwtnqYE; I thought it would be better to create a separate thread:

4. For creating something new, use Pinto, and additionally place the new triples in a named graph

So, the problem is, I want to create a new House with an existing inhabitant:

House
--| hasMember Person1
----| id "person1"

This Person1 already exists in the store, so I need to identify it somehow, and I use a literal ID property. If I simply go ahead and create that House object with Pinto, I get new Person1 objects with the same ID but different URIs. How does one go about marshalling this House object, then?

Maybe a lightweight Person object which has the established IRI of the known person that Pinto can use purely for serialization purposes.
 

There is no point at which I set the ID of the Person inside House -- I simply want to serialize House into relevant triples. If there is an existing person matching the ID, no new statements should be created. Perhaps I am misunderstanding something somewhere entirely.

That's up to the backend. Pinto will serialize the data exactly as-is, and if triples it generates already exist, then they should be ignored on insert.
 

I do not see an aggregation example in the Pinto test cases (only composition examples), or did I miss it? Otherwise, is this not something Pinto can do, but Empire will?

No, probably not a test case exactly like this.
 

Workaround:

If I hack the identifiable nature of the class with:

@Override
public Resource id() {
// this allows us to skip setting the rdf id explicitly, but not sure if correct approach
return ValueFactoryImpl.getInstance().createURI(NS + "person_" + getId());
//return mIdentifiable.id();
}

        // so then this perhaps becomes useless
@Override
public void id(Resource theResource) {
mIdentifiable.id(ValueFactoryImpl.getInstance().createURI(NS + "person_" + getId()));
}

I am able to work around the issue of "new objects with different URIs". However, I get duplicate `:person_ID rdf:type :Person` and `:person_ID :id "ID"` triples (because the id is a separate field that is set).

Maybe don't include the ID field? I'd wanted to add the ability to mark something as transient so you could avoid serializing it in cases where you don't care about the field(s).
 

And since I create a named graph for every new object, the triple store does not treat them as duplicates (so I cannot just be done with it by deleting duplicate triples). I then need to ensure I show only distinct results anywhere I am using the information.

Would really appreciate some help with this, as presently this is the only issue left. If there's no other choice, I can try to integrate Empire, if that will solve these problems.

I guess where Empire could help is that you could look up the known `Person` object by its IRI, or with a query, assert it to be a member of the House, and then persist the House.

Cheers,

Mike
 

--
You received this message because you are subscribed to the Google Groups "Empire" group.
To unsubscribe from this group and stop receiving emails from it, send an email to empire-rdf+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ray Rashif

Apr 1, 2017, 1:52:01 PM
to Empire
Hi Michael

Thanks a lot for responding, you and the tools you have written are a great help.


On Friday, 31 March 2017 23:43:42 UTC+6, mhgrove wrote:



Maybe a lightweight Person object which has the established IRI of the known person that Pinto can use purely for serialization purposes.

Sure, I could have a `personUri` string, but that would not matter to Pinto, right? I would have to act on it myself in the back-end.

And if I typed House's `hasMember` field as an IRI/URI and had the URI as the value for that field, Pinto would not know to create a triple of the form `:house1 :hasMember :person1`, or would it? I think this would be the best approach. I tried this once but failed to marshal IRI/URI types.
 
 

There is no point at which I set the ID of the Person inside House -- I simply want to serialize House into relevant triples. If there is an existing person matching the ID, no new statements should be created. Perhaps I am misunderstanding something somewhere entirely.

That's up to the backend. Pinto will serialize the data exactly as-is, and if triples it generates already exist, then they should be ignored on insert.

Ahh no you are absolutely right, Pinto does its job, and what I'm thinking of is not exactly Pinto's job, perhaps.
 
 

 

Workaround:

If I hack the identifiable nature of the class with:

@Override
public Resource id() {
// this allows us to skip setting the rdf id explicitly, but not sure if correct approach
return ValueFactoryImpl.getInstance().createURI(NS + "person_" + getId());
//return mIdentifiable.id();
}

        // so then this perhaps becomes useless
@Override
public void id(Resource theResource) {
mIdentifiable.id(ValueFactoryImpl.getInstance().createURI(NS + "person_" + getId()));
}

I am able to work around the issue of "new objects with different URIs". However, I get duplicate `:person_ID rdf:type :Person` and `:person_ID :id "ID"` triples (because the id is a separate field that is set).

Maybe don't include the ID field? I'd wanted to add the ability to mark something as transient so you could avoid serializing it in cases where you don't care about the field(s).
 

As I realized, this is fine and correct behaviour -- Pinto is serializing this because the POJO contains that field (it makes no difference why the field was included). I was expecting something else entirely (see below).
 

And since I create a named graph for every new object, the triple store does not treat them as duplicates (so I cannot just be done with it by deleting duplicate triples). I then need to ensure I show only distinct results anywhere I am using the information.

Would really appreciate some help with this, as presently this is the only issue left. If there's no other choice, I can try to integrate Empire, if that will solve these problems.

I guess where Empire could help is that you could look up the known `Person` object by its IRI, or with a query, assert it to be a member of the House, and then persist the House.

I will have to try Empire to see what it can help with, but I think I was not able to frame this particular problem well. I now see that it could be more of a design pattern issue than anything else.

I am trying to abstract the RDF away with an HTTP JSON/XML web service. Therefore, the POJOs are created from the submitted content. I am not sure what the practice is for nested relationships, but I think I would need to accept a field for the ID or even the URI and then produce the relationships in the back-end based on the POJO content.

In JSON at least there is no concept of aggregation or composition, as it is simply an association or relationship. HTTP recommendations such as HATEOAS do suggest including relationship links and allowing for "expansion" of those links, so perhaps what I need to find out is whether Empire could help me do that expansion.

Thanks once again, but let me know if you have any more suggestions.
 


Michael Grove

Apr 4, 2017, 8:10:39 AM
to empir...@googlegroups.com
On Sat, Apr 1, 2017 at 1:52 PM, Ray Rashif <schivm...@gmail.com> wrote:
Hi Michael

Thanks a lot for responding, you and the tools you have written are a great help.

On Friday, 31 March 2017 23:43:42 UTC+6, mhgrove wrote:



Maybe a lightweight Person object which has the established IRI of the known person that Pinto can use purely for serialization purposes.

Sure, I could have a `personUri` string, but that would not matter to Pinto, right? I would have to act on it myself in the back-end.

And if I typed House's `hasMember` field as an IRI/URI and had the URI as the value for that field, Pinto would not know to create a triple of the form `:house1 :hasMember :person1`, or would it? I think this would be the best approach. I tried this once but failed to marshal IRI/URI types.

I didn't make Pinto aware of "native" RDF values, though that would be fairly easy to add.

Something like:
 
interface Person extends Identifiable {
}

class SimplePerson implements Person {
  private final IRI mId;
  // constructor omitted
  public Resource id() {
    return mId;
  }
}

That ought to get you the serialization you're expecting. House would have a collection of members, which you can populate with SimplePerson instances using known IRIs.
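
The effect of such a reference-only member can be sketched without any library. The following is a minimal illustration under stated assumptions: `Node`, `PersonRef`, and `HouseSketch` are hypothetical stand-ins, triples are plain string lists, and the serializer is a toy, so it does not reproduce Pinto's actual output (which may, for example, add type statements).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; none of these types come from Pinto or RDF4J.
interface Node {
    String id();                       // the IRI identifying this object
    Map<String, String> properties();  // predicate -> literal value
}

// Carries only a known IRI, so it contributes no statements of its own.
class PersonRef implements Node {
    private final String iri;

    PersonRef(String iri) {
        this.iri = iri;
    }

    public String id() {
        return iri;
    }

    public Map<String, String> properties() {
        return new LinkedHashMap<>(); // deliberately empty: nothing to re-assert
    }
}

class HouseSketch {
    // Emits one link triple per member, plus whatever properties the member carries.
    static List<List<String>> serialize(String houseIri, List<Node> members) {
        List<List<String>> triples = new ArrayList<>();
        for (Node member : members) {
            triples.add(Arrays.asList(houseIri, ":hasMember", member.id()));
            for (Map.Entry<String, String> e : member.properties().entrySet()) {
                triples.add(Arrays.asList(member.id(), e.getKey(), e.getValue()));
            }
        }
        return triples;
    }
}
```

With a single `PersonRef(":person1")` member, this toy serializer yields only the `:house1 :hasMember :person1` link, which is the shape of output the aggregation case asks for.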
 
 
 


I am trying to abstract the RDF away with an HTTP JSON/XML web service. Therefore, the POJOs are created from the submitted content.

This was my original use case as well.
 
I am not sure what the practice is for nested relationships, but I think I would need to accept a field for the ID or even the URI and then produce the relationships in the back-end based on the POJO content.

The most relevant tests are RDFMapperTests#testReadMixed and RDFMapperTests#testWriteMixed. The small app I originally created Pinto for used two levels of nested objects, but on the backend I had to do lookups into the database to populate objects with an appropriate IRI if they already existed in the db.
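
That lookup step might look roughly like the sketch below. It is only an illustration: `IriResolver` is a hypothetical helper, the in-memory map stands in for a query against the database, and the `urn:example:person:` prefix is an assumed naming scheme rather than anything from the original app.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical resolver; the map stands in for a lookup against the triple store.
class IriResolver {
    private final Map<String, String> knownIrisById = new HashMap<>();

    // Record an IRI that already exists in the store for a given literal ID.
    void register(String literalId, String iri) {
        knownIrisById.put(literalId, iri);
    }

    // Reuse the stored IRI when the literal ID is known; otherwise mint a new one
    // and remember it so later calls for the same ID return the same IRI.
    String resolve(String literalId) {
        String existing = knownIrisById.get(literalId);
        if (existing != null) {
            return existing;
        }
        String minted = "urn:example:person:" + UUID.randomUUID();
        knownIrisById.put(literalId, minted);
        return minted;
    }
}
```

The resolved IRI is then set on the object before it is handed to the serializer, so an existing person keeps its IRI instead of receiving a fresh one.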
 

In JSON at least there is no concept of aggregation or composition, as it is simply an association or relationship. HTTP recommendations such as HATEOAS do suggest including relationship links and allowing for "expansion" of those links, so perhaps what I need to find out is whether Empire could help me do that expansion.

Pinto is just the stand-alone core of Empire, albeit cleaned up a bit. Empire adds in lightweight JPA support, so while it can do a lot more, most of that functionality comes at the cost of having to work through EntityManager.

It doesn't seem like you're doing anything wholly different from what I am currently using Pinto for, so I'm not sure where it's falling down for you. Perhaps I'm not understanding the issue very well. A standalone test case that demonstrates what Pinto is doing wrong would be helpful.

Cheers,

Mike
 
