Serialization of any object with parent-child relationships without outputting the ID value

David Hayek

Feb 16, 2017, 5:26:09 PM
to jackson-user
If I have to serialize an object that I cannot annotate and that contains a parent-child relationship, such as a DefaultMutableTreeNode, I can avoid the infinite recursion/stack overflow problem by supplying an ObjectIdGenerator from the findObjectIdInfo() method in a subclass of JacksonAnnotationIntrospector. The ID value is in the resulting JSON object.

There's an example here: http://stackoverflow.com/questions/27316485/using-jsonidentityinfo-without-annotations

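For readers following along, here is a minimal sketch of the introspector approach described above, assuming Jackson 2.4 or later is on the classpath. `TreeItem`, `IdentityIntrospector`, `IdentityDemo` and the `"@id"` property name are hypothetical names chosen for illustration; only the Jackson types are real.

```java
import com.fasterxml.jackson.annotation.ObjectIdGenerators;
import com.fasterxml.jackson.annotation.SimpleObjectIdResolver;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.PropertyName;
import com.fasterxml.jackson.databind.introspect.Annotated;
import com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector;
import com.fasterxml.jackson.databind.introspect.ObjectIdInfo;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with a parent-child cycle that we pretend we cannot annotate.
class TreeItem {
    public String name;
    public TreeItem parent;
    public List<TreeItem> children = new ArrayList<TreeItem>();

    TreeItem(String name) {
        this.name = name;
    }

    TreeItem add(TreeItem child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

// Reports Object Id settings for TreeItem as if it carried @JsonIdentityInfo.
class IdentityIntrospector extends JacksonAnnotationIntrospector {
    @Override
    public ObjectIdInfo findObjectIdInfo(Annotated ann) {
        if (ann.getRawType() == TreeItem.class) {
            return new ObjectIdInfo(new PropertyName("@id"), Object.class,
                    ObjectIdGenerators.IntSequenceGenerator.class,
                    SimpleObjectIdResolver.class);
        }
        return super.findObjectIdInfo(ann);
    }
}

class IdentityDemo {
    static String toJson(TreeItem root) {
        ObjectMapper mapper = new ObjectMapper();
        mapper.setAnnotationIntrospector(new IdentityIntrospector());
        try {
            return mapper.writeValueAsString(root);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place the first occurrence of each `TreeItem` is written in full with an `@id` property, and a later reference to the same instance (here, a child's `parent`) is written as just the id value. Dropping the type check would apply the same settings to every bean type, which is the "use it for all objects" variant.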

(In addition, with the DefaultMutableTreeNode class I have to supply a custom BeanPropertyWriter to handle exceptions thrown when serializing objects like this. See: http://stackoverflow.com/questions/35359430/how-to-make-jackson-ignore-properties-if-the-getters-throw-exceptions/35416682)
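
A rough sketch of that kind of custom BeanPropertyWriter, assuming Jackson 2.x. `LenientPropertyWriter`, `LenientModifier`, `Flaky` and `LenientDemo` are hypothetical names. It relies on the assumption that the failure happens in the getter, before anything has been written for the property; an exception thrown halfway through a nested value could still leave partial output behind.

```java
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.BeanDescription;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationConfig;
import com.fasterxml.jackson.databind.SerializerProvider;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.fasterxml.jackson.databind.ser.BeanPropertyWriter;
import com.fasterxml.jackson.databind.ser.BeanSerializerModifier;
import java.util.List;

// Wraps an existing property writer and skips the property if writing it fails.
class LenientPropertyWriter extends BeanPropertyWriter {
    LenientPropertyWriter(BeanPropertyWriter base) {
        super(base);
    }

    @Override
    public void serializeAsField(Object bean, JsonGenerator gen, SerializerProvider prov)
            throws Exception {
        try {
            super.serializeAsField(bean, gen, prov);
        } catch (Exception e) {
            // Assumption: the getter threw before the field name was written,
            // so the property can simply be left out of the output.
        }
    }
}

// Replaces every discovered property writer with the lenient wrapper.
class LenientModifier extends BeanSerializerModifier {
    @Override
    public List<BeanPropertyWriter> changeProperties(SerializationConfig config,
            BeanDescription beanDesc, List<BeanPropertyWriter> beanProperties) {
        for (int i = 0; i < beanProperties.size(); i++) {
            beanProperties.set(i, new LenientPropertyWriter(beanProperties.get(i)));
        }
        return beanProperties;
    }
}

// Hypothetical bean with one getter that always fails.
class Flaky {
    public String getOk() {
        return "fine";
    }

    public String getBroken() {
        throw new UnsupportedOperationException("no value");
    }
}

class LenientDemo {
    static String toJson(Object value) {
        SimpleModule module = new SimpleModule();
        module.setSerializerModifier(new LenientModifier());
        ObjectMapper mapper = new ObjectMapper();
        mapper.registerModule(module);
        try {
            return mapper.writeValueAsString(value);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, `LenientDemo.toJson(new Flaky())` writes the `ok` property and silently omits `broken`.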

I have 3 constraints I'm trying to satisfy:

1. I cannot annotate objects in various dependent libraries.
2. I cannot enumerate ahead of time all of the possible objects that have parent-child relationships.
3. I do not want the virtual ID value from an ObjectIdGenerator in the serialized JSON object, although I still need a way to keep track of them.

To satisfy 1 and 2 I was thinking of using an ObjectIdGenerator for all objects, regardless of whether they contain parent-child relationships or whether I had access to the source files. This would simplify development, but are there performance penalties for doing this?

I'm not sure how to proceed with #3. What objects do I need to sub-class in order to get this to work? It looks like the actual output is done by the WritableObjectId object, which is returned from the DefaultSerializerProvider.findObjectId() method. But the WritableObjectId is declared final so I can't override its writeAsField() method. Is there another way to do this?

Thanks for your help.



This message and any attachment are confidential and may be privileged or otherwise protected from disclosure. If you are not the intended recipient, you must not copy this message or attachment or disclose the contents to any other person. If you have received this transmission in error, please notify the sender immediately and delete the message and any attachment from your system. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not accept liability for any omissions or errors in this message which may arise as a result of E-Mail-transmission or for damages resulting from any unauthorized changes of the content of this message and any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not guarantee that this message is free of viruses and does not accept liability for any damages caused by any virus transmitted therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, Spanish and Portuguese versions of this disclaimer.

Tatu Saloranta

Feb 16, 2017, 5:40:19 PM
to jackson-user
On Thu, Feb 16, 2017 at 1:43 PM, David Hayek <david...@sial.com> wrote:
> If I have to serialize an object that I can not annotate and that contains
> parent-child relationship, such as a DefaultMutableTreeNode, I can avoid the
> infinite recursion/stack overflow problem by supplying an ObjectIdGenerator
> from the findObjectIdInfo() method in a subclass of the
> JacksonAnnotationIntrospector. The ID value is in the resulting JSON object.
>
> There's an example here:
> http://stackoverflow.com/questions/27316485/using-jsonidentityinfo-without-annotations
>
> (In addition, with the DefaultMutableTreeNode class I have to supply a
> custom BeanPropertyWriter to handle exceptions thrown when serializing
> objects like this. See:
> http://stackoverflow.com/questions/35359430/how-to-make-jackson-ignore-properties-if-the-getters-throw-exceptions/35416682)
>
> I have 3 constraints I'm trying to satisfy:
>
> 1. I can not annotate objects in various dependent libraries.
> 2. I can not enumerate ahead-of-time all of the possible objects that have
> parent-child relationships
> 3. I do not want the virtual ID value from an ObjectIdGenerator in the
> serialized JSON object, although I still need a way to keep track of them.

How would the id be known on deserialization then, if it is not
written as part of data?
Is it calculated from other fields?

> To satisfy 1 and 2 I was thinking of using an ObjectIdGenerator for all
> objects, regardless of whether they contain parent-child relationships or
> whether I had access to the source files. This would simplify development,
> but are there performance penalties for doing this?
>
> I'm not sure how to proceed with #3. What objects do I need to sub-class in
> order to get this to work? It looks like the actual output is done by the
> WritableObjectId object, which is returned from the
> DefaultSerializerProvider.findObjectId() method. But the WritableObjectId is
> declared final so I can't override its writeAsField() method. Is there
> another way to do this?

Current system is designed for Object Id to be included in data, so
even if sub-classing was allowed, changing
behavior in sub-class would be a fragile solution.
I am not necessarily against removing `final` keyword, but just
suggesting that it might be better to add explicit support which could
(and should) be tested via unit tests, to avoid accidental breakages.

So I guess I would be open to improvement ideas that would achieve
this with less work than many overrides; but also if you want, please
file an RFE for just making `WritableObjectId` non-final. Ideally this
should go in 2.9, but I think this is one of cases where change in
patch would be acceptable (reverse of making method final would not be
-- but this change can not break anything wrt code that worked with
earlier patches).

-+ Tatu +-


David Hayek

Feb 17, 2017, 10:57:17 AM
to jackson-user
Tatu

Thanks for your quick reply. For some background on what I'm trying to do, I want to log the objects that are passed as method arguments, either during trace logging or when an exception occurs. Right now we have very little insight from trace logging, other than method flow, or what happens during exceptions or why, beyond the usual stack traces. So I only need to worry about serialization, not deserialization, of the objects (and omitting the ID, as you pointed out, means that such objects effectively couldn't be deserialized). 

It is preferable that the output be in JSON format, since this is the same format used by most of the application's RESTful services, so it would be relatively easy to replay the data that was captured. Using an object's toString() method was just too unreliable.

I don't know that my use case (as a one-directional application of the Jackson libraries for the purposes of logging) is very common. I'm open to suggestions if there's a better way of doing this. 
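
To make the use case concrete, here is a minimal sketch of such a serialization-only logging helper, assuming Jackson 2.x and Java 7 or later. `ArgumentLogger`, `describe` and `Unloggable` are hypothetical names, and the fallback text is an arbitrary choice; the point is only that a failure to serialize an argument should never break the logging call itself.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical helper that renders a method argument as JSON for trace logging.
final class ArgumentLogger {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    private ArgumentLogger() {
    }

    static String describe(Object arg) {
        try {
            return MAPPER.writeValueAsString(arg);
        } catch (JsonProcessingException | RuntimeException e) {
            // Fall back to a short marker instead of propagating the failure.
            String type = (arg == null) ? "null" : arg.getClass().getName();
            return "<unserializable " + type + ": " + e.getClass().getSimpleName() + ">";
        }
    }
}

// Hypothetical argument type whose getter always fails.
class Unloggable {
    public String getValue() {
        throw new IllegalStateException("boom");
    }
}
```

The mapper here is left at its defaults; in practice it would carry whatever introspector and modifiers are needed for the problem classes.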

Unfortunately, we're using Wildfly 8.2 and its built-in version of the Jackson libraries, which are at 2.4.1. I'm not sure how much wiggle room we have to upgrade.

Thanks

David

Tatu Saloranta

Feb 17, 2017, 2:49:47 PM
to jackson-user
On Fri, Feb 17, 2017 at 7:57 AM, David Hayek <david...@sial.com> wrote:
> Tatu
>
> Thanks for your quick reply. For some background on what I'm trying to do, I
> want to log the objects that are passed as method arguments, either during
> trace logging or when an exception occurs. Right now we have very little
> insight from trace logging, other than method flow, or what happens during
> exceptions or why, beyond the usual stack traces. So I only need to worry
> about serialization, not deserialization, of the objects (and omitting the
> ID, as you pointed out, means that such objects effectively couldn't be
> deserialized).
>
> It is preferable that the output be in JSON format, since this is the same
> format used by most of the application's RESTful services, so it would be
> relatively easy to replay the data that was captured. Using an object's
> toString() method was just too unreliable.
>
> I don't know that my use case (as a one-directional application of the
> Jackson libraries for the purposes of logging) is very common. I'm open to
> suggestions if there's a better way of doing this.
>
> Unfortunately, we're using Wildfly 8.2 and its built-in version of the
> Jackson libraries, which are at 2.4.1. I'm not sure how much wiggle room we
> have to upgrade.

Ok. Well, no changes will be released for branches older than 2.7
(well, never say never; critical bug fixes could be backported), so
much of the discussion regarding changes/fixes could be theoretical.
(btw: I would strongly suggest that the Wildfly maintainers upgrade to
2.4.6 -- there's absolutely no benefit to using an obsolete patch
version, and very low risk in upgrading to the latest patch -- but I
realize that's not something you can influence)

Still: just to make sure I understand the use case -- and your
explanation helps a lot, getting close! -- are you using Object Ids
just to avoid duplication? Or what is the benefit? If all you want to
do is just to serialize some additional property, that can be handled
using other mechanisms; Object Id is used for de-duplication and may
be a bit of a heavy mechanism for simple (?) content augmentation.

Specifically "virtual" bean properties via `@JsonAppend` (added in ...
2.5) -- and more importantly, underlying machinery that may be used
without annotations -- would probably work better, if I understand use
case.
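
For reference, a small sketch of what an attribute-backed virtual property looks like, assuming Jackson 2.5 or later (so it would not run on 2.4.1). `Order`, `OrderMixin`, `AppendDemo` and the `traceId` attribute are hypothetical names; a mix-in is used here only as one way to keep the annotation off the target class, and it is not necessarily the annotation-free machinery referred to above.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.annotation.JsonAppend;

// Hypothetical third-party class that cannot be annotated directly.
class Order {
    public String item = "widget";
}

// Mix-in declaring one virtual property whose value comes from a writer attribute.
@JsonAppend(attrs = @JsonAppend.Attr(value = "traceId"))
abstract class OrderMixin {
}

class AppendDemo {
    static String toJson(Order order, String traceId) {
        ObjectMapper mapper = new ObjectMapper();
        mapper.addMixIn(Order.class, OrderMixin.class);
        try {
            // The attribute value is supplied per call, not stored on the object.
            return mapper.writer()
                    .withAttribute("traceId", traceId)
                    .writeValueAsString(order);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The extra property is added to the output after the regular ones without touching `Order` itself.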

-+ Tatu +-

David Hayek

Feb 17, 2017, 3:59:25 PM
to jackson-user
Tatu

I am using Object Ids to avoid duplication for objects that we can't modify and whose circular references we can't identify in advance. Without the Object Id, for example, printing an object such as the DefaultMutableTreeNode would always result in the following exception:

com.fasterxml.jackson.databind.JsonMappingException: Infinite recursion (StackOverflowError) (through reference chain: javax.swing.tree.DefaultMutableTreeNode["children"])

There are a number of such objects that we encounter, whether from Hibernate entity objects with one-to-many or one-to-one mappings or from out-of-the-box JAX-RS classes such as UriInfo.

I agree that it does seem a bit heavy-weight to use virtual Ids, but our use case for this is isolated strictly to logging when the TRACE level is enabled or when exceptions occur, so not very often. We also use asynchronous logging to reduce the performance hit.

Thanks

David

Tatu Saloranta

Feb 17, 2017, 4:57:02 PM
to jackson-user
On Fri, Feb 17, 2017 at 12:59 PM, David Hayek <david...@sial.com> wrote:
> Tatu
>
> I am using Object Ids to avoid duplication for objects that we can't modify
> and can't identify in advance that would have circular references. Without
> the Object Id, for example, printing an object such as the
> DefaultMutableTreeNode would always result in the following exception:
>
> com.fasterxml.jackson.databind.JsonMappingException: Infinite recursion
> (StackOverflowError) (through reference chain:
> javax.swing.tree.DefaultMutableTreeNode["children"])
>
> There are a number of such objects that we encounter, whether from Hibernate
> entity objects with one-to-many or one-to-one mappings or from
> out-of-the-box JAX-RS classes such as UriInfo.

Ok that makes sense.

> I agree that it does seem a bit heavy-weight to use virtual Ids, but our use
> case for this is isolated strictly to logging when the TRACE level is
> enabled or when exceptions occur, so not very often. We also use
> asynchronous logging to reduce the performance hit.

Part that I do not yet understand is this: when encountering already
serialized Object, how is that handled? Would you serialize it using
Object Id generated (using whatever mechanism), but that just wasn't
included in originally serialized full object? Or replaced with
something like `null`?
Something has to be serialized in general since at the time of
serialization it may not be possible to just omit value (f.ex. in case
property is to be serialized, name is already written); except if
property itself is to be excluded (using `@JsonInclude` mechanism for
example).

David Hayek

Feb 17, 2017, 6:29:25 PM
to jackson-user
> Part that I do not yet understand is this: when encountering already
> serialized Object, how is that handled? Would you serialize it using
> Object Id generated (using whatever mechanism), but that just wasn't
> included in originally serialized full object? Or replaced with
> something like `null`?
> Something has to be serialized in general since at the time of
> serialization it may not be possible to just omit value (f.ex. in case
> property is to be serialized, name is already written); except if
> property itself is to be excluded (using `@JsonInclude` mechanism for
> example).

If I understand your question correctly, I would never have an already serialized object. The method args I'm logging would always be non-serialized POJOs. 

Actually, I may have been too focused on this idea of using a virtual Object Id. The main thing is for the serializer to keep track of which objects it's already seen in order to avoid the infinite recursion problem. The only current way to do that is by associating each object with a virtual Id if it doesn't have a natural Id.

I noticed this in the implementation of the DefaultSerializerProvider method: public WritableObjectId findObjectId()

Rather than use an Id generator, you could just set a boolean value in the _seenObjectIds map (or similar) and return a subclass of WritableObjectId with a one-line implementation of the writeAsField() method: idWritten = true; This would omit both the property name and the value in the serialized output.

Then maybe add a builder method for the ObjectMapper class that would allow clients to specify this sort of protection - something like: withInfiniteRecursionCheckEnabled()
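
The bookkeeping itself can be sketched without any Jackson internals. The following is a minimal, JDK-only illustration of "remember what has been seen, write no id"; it is not how Jackson works today, and `SeenTrackingWriter` and the `"(already seen)"` placeholder are hypothetical. Some value still has to be written for a repeated object once its property name is out, so the sketch writes a fixed marker. String escaping is deliberately left out.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.TreeNode;

// Writes a TreeNode graph as JSON-like text, tracking visited nodes by identity.
final class SeenTrackingWriter {
    // Placeholder emitted for a node that was already written (an assumption).
    static final String SEEN_MARKER = "\"(already seen)\"";

    // Identity-based set: equals()/hashCode() of the nodes are never consulted.
    private final Set<Object> seen =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    String write(TreeNode node) {
        if (node == null) {
            return "null";
        }
        if (!seen.add(node)) {
            return SEEN_MARKER; // breaks the parent -> child -> parent cycle
        }
        StringBuilder out = new StringBuilder();
        out.append("{\"node\":\"").append(node).append("\"");
        out.append(",\"parent\":").append(write(node.getParent()));
        out.append(",\"children\":[");
        for (int i = 0; i < node.getChildCount(); i++) {
            if (i > 0) {
                out.append(',');
            }
            out.append(write(node.getChildAt(i)));
        }
        return out.append("]}").toString();
    }

    static String demo() {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("root");
        root.add(new DefaultMutableTreeNode("a"));
        return new SeenTrackingWriter().write(root);
    }
}
```

Here `demo()` produces `{"node":"root","parent":null,"children":[{"node":"a","parent":"(already seen)","children":[]}]}`: the child's back-reference to its parent is replaced by the marker instead of recursing.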

I'm not sure if my answer was helpful or just more confusing. In any case, thanks for your help. The Jackson libraries have been of tremendous use to us.

David