Modify ObjectId AVRO serialization


Amit Singh

Oct 22, 2019, 10:44:18 AM
to jackson-user
I am trying to change how ObjectId is serialized by the Jackson Avro library. The default serializer generates an Avro schema with the following structure for it:

{
  "name" : "id",
  "type" : [ "null", {
    "type" : "record",
    "name" : "ObjectId",
    "namespace" : "org.bson.types",
    "fields" : [ {
      "name" : "counter",
      "type" : {
        "type" : "int",
        "java-class" : "java.lang.Integer"
      }
    }, {
      "name" : "date",
      "type" : [ "null", {
        "type" : "long",
        "java-class" : "java.util.Date"
      } ]
    }, {
      "name" : "machineIdentifier",
      "type" : {
        "type" : "int",
        "java-class" : "java.lang.Integer"
      }
    }, {
      "name" : "processIdentifier",
      "type" : {
        "type" : "int",
        "java-class" : "java.lang.Short"
      }
    }, {
      "name" : "time",
      "type" : {
        "type" : "long",
        "java-class" : "java.lang.Long"
      }
    }, {
      "name" : "timeSecond",
      "type" : {
        "type" : "int",
        "java-class" : "java.lang.Integer"
      }
    }, {
      "name" : "timestamp",
      "type" : {
        "type" : "int",
        "java-class" : "java.lang.Integer"
      }
    } ]
  } ]
}

However, according to the official MongoDB documentation, an ObjectId has the following format:

{"$oid": <ObjectId bytes as 24-character, big-endian hex string>}
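
To make the "24-character, big-endian hex string" concrete, here is a minimal sketch that uses only the JDK. The class name ObjectIdHex and the sample bytes are made up for illustration; with the MongoDB driver on the classpath, ObjectId.toHexString() should give the same kind of string.

```java
public class ObjectIdHex {

    /**
     * Encodes the 12 raw ObjectId bytes as a 24-character lowercase hex
     * string, most significant byte first.
     */
    static String toHex(byte[] bytes) {
        if (bytes.length != 12) {
            throw new IllegalArgumentException("an ObjectId has exactly 12 bytes");
        }
        StringBuilder sb = new StringBuilder(24);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes do not print as 8 hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample bytes, not taken from a real document.
        byte[] raw = {
            0x50, 0x7f, 0x1f, 0x77, (byte) 0xbc, (byte) 0xf8,
            0x6c, (byte) 0xd7, (byte) 0x99, 0x43, (byte) 0x90, 0x11
        };
        System.out.println("{\"$oid\": \"" + toHex(raw) + "\"}");
        // prints {"$oid": "507f1f77bcf86cd799439011"}
    }
}
```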

So ideally, the Avro schema I want to generate for this field should have the following structure:

{
  "name" : "oid",
  "type" : {
    "type" : "string"
  }
}

What I have managed so far is to extend BeanSerializerModifier and override changeProperties to remove all the properties discovered for ObjectId, after first checking the bean class. With this approach, however, I cannot create my own property to add to the List<BeanPropertyWriter> that is later serialized.

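import java.util.List;

import com.fasterxml.jackson.databind.BeanDescription;
import com.fasterxml.jackson.databind.SerializationConfig;
import com.fasterxml.jackson.databind.ser.BeanPropertyWriter;
import com.fasterxml.jackson.databind.ser.BeanSerializerModifier;
import org.bson.types.ObjectId;

public class ObjectIdSerializerModifier extends BeanSerializerModifier {

    @Override
    public List<BeanPropertyWriter> changeProperties(SerializationConfig config, BeanDescription beanDesc, List<BeanPropertyWriter> beanProperties) {
        if (beanDesc.getBeanClass().equals(ObjectId.class)) {
            beanProperties.clear(); // I need to add the 'oid' string element here.
        }
        return beanProperties;
    }
}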

The code above fails with the following error:

com.fasterxml.jackson.databind.exc.InvalidDefinitionException: "Any" type (usually for `java.lang.Object`) not supported: `expectAnyFormat` called with type [simple type, class org.bson.types.ObjectId]
    at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
    at com.fasterxml.jackson.dataformat.avro.schema.VisitorFormatWrapperImpl.expectAnyFormat(VisitorFormatWrapperImpl.java:174)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.acceptJsonFormatVisitor(UnknownSerializer.java:66)
    at com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor.schemaFieldForWriter(RecordVisitor.java:178)
    at com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor.optionalProperty(RecordVisitor.java:121)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.depositSchemaProperty(BeanPropertyWriter.java:839)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.acceptJsonFormatVisitor(BeanSerializerBase.java:863)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.acceptJsonFormatVisitor(DefaultSerializerProvider.java:566)
    at com.fasterxml.jackson.databind.ObjectMapper.acceptJsonFormatVisitor(ObjectMapper.java:4046)
    at com.fasterxml.jackson.databind.ObjectMapper.acceptJsonFormatVisitor(ObjectMapper.java:4025)
    at com.amitds1997.avroserializer.Main.main(Main.java:55)

This is probably because I have emptied the list, so it no longer contains any elements. How can I do this? And is this the right approach, or should I be customizing something else?

Amit Singh

Oct 23, 2019, 3:26:26 AM
to jackson-user
I have traced the problem to expectAnyFormat in VisitorFormatWrapperImpl.java, and the solution seems to involve overriding it somehow, but I have no idea how to do that. Is there some way I can extend that class?

Tatu Saloranta

Oct 23, 2019, 7:49:37 PM
to jackson-user
Ok, so: if you do want the `ObjectId` type handled, you just need to provide a custom serializer (`JsonSerializer`) registered for that type; otherwise it is assumed to be a Bean, and its properties are discovered. This is probably the easiest way.

If you just want to ignore all properties whose value is an `ObjectId`, there are a couple of ways to do that. The `@JsonIgnoreType` annotation on the class does it; in this case you would want to attach it via a mix-in.
Alternatively, Jackson 2.9 adds "config override" support too, something like:

   mapper.configOverride(ObjectId.class).setIsIgnoredType(true);

I hope this helps,

-+ Tatu +-
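
For readers following along, the two "ignore" options described above might look roughly like the sketch below. It assumes jackson-databind 2.9 or later and the MongoDB bson library on the classpath; IgnoreObjectIdExample, IgnoreTypeMixIn and Doc are hypothetical names used only for illustration.

```java
import com.fasterxml.jackson.annotation.JsonIgnoreType;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.bson.types.ObjectId;

public class IgnoreObjectIdExample {

    // Mix-in that carries the annotation, since ObjectId itself cannot be edited.
    @JsonIgnoreType
    abstract static class IgnoreTypeMixIn { }

    // Hypothetical POJO with an ObjectId property.
    public static class Doc {
        public ObjectId id = new ObjectId();
        public String name = "example";
    }

    // Option 1: attach @JsonIgnoreType to ObjectId through a mix-in.
    static ObjectMapper viaMixIn() {
        return new ObjectMapper().addMixIn(ObjectId.class, IgnoreTypeMixIn.class);
    }

    // Option 2: the config override mentioned in the message above.
    static ObjectMapper viaConfigOverride() {
        ObjectMapper mapper = new ObjectMapper();
        mapper.configOverride(ObjectId.class).setIsIgnoredType(true);
        return mapper;
    }

    public static void main(String[] args) throws Exception {
        // In both cases the "id" property should be left out of the output.
        System.out.println(viaMixIn().writeValueAsString(new Doc()));
        System.out.println(viaConfigOverride().writeValueAsString(new Doc()));
    }
}
```

Both variants drop the property entirely, so they only fit when the ObjectId is not needed in the output or in the generated schema.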



Amit Singh

Oct 25, 2019, 11:33:11 PM
to jackson-user
Correct me if I am wrong, but the approach you described works well when I want to serialize the POJO into JSON (by writing a custom serializer and overriding its serialize method). What I actually want to change, however, is how the Avro schema is generated from the POJO. I want to generate an Avro schema that can parse Mongo-generated JSON. As I said, Mongo stores ObjectId as:
'id' : {
  'oid' : '24-character-hex-string';
}
So, is there any way to override some methods to get that? It is similar to the JSON serialization you described, but for Avro schema generation.

Tatu Saloranta

Oct 25, 2019, 11:47:22 PM
to jackson-user
Yes, schema generation (used for JSON Schema as well as Avro schema)
relies on the `JsonSerializer` that the type uses, in particular its
`acceptJsonFormatVisitor()` method. You would need to register a custom
serializer for `ObjectId`. You can have a look at how the existing
standard serializers implement this method. It's not a very simple
interface, but it's the way to do it.

Hope this helps,

-+ Tatu +-
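
A minimal sketch of the approach described above, assuming jackson-databind, jackson-dataformat-avro and the MongoDB bson library are on the classpath. ObjectIdSchemaExample, ObjectIdAsStringSerializer and Doc are hypothetical names. The serializer reports ObjectId as a plain string to the format visitor, so the `id` field should come out as a string type in the generated schema. It does not produce a nested record with an `oid` field; that would presumably need `expectObjectFormat()` and more work.

```java
import java.io.IOException;

import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.JsonMappingException;
import com.fasterxml.jackson.databind.SerializerProvider;
import com.fasterxml.jackson.databind.jsonFormatVisitors.JsonFormatVisitorWrapper;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.fasterxml.jackson.databind.ser.std.StdSerializer;
import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.dataformat.avro.schema.AvroSchemaGenerator;
import org.bson.types.ObjectId;

public class ObjectIdSchemaExample {

    // Custom serializer: writes the 24-character hex form and tells the
    // schema visitor that the type is a string.
    static class ObjectIdAsStringSerializer extends StdSerializer<ObjectId> {
        private static final long serialVersionUID = 1L;

        ObjectIdAsStringSerializer() {
            super(ObjectId.class);
        }

        @Override
        public void serialize(ObjectId value, JsonGenerator gen, SerializerProvider provider)
                throws IOException {
            gen.writeString(value.toHexString());
        }

        @Override
        public void acceptJsonFormatVisitor(JsonFormatVisitorWrapper visitor, JavaType typeHint)
                throws JsonMappingException {
            if (visitor != null) {
                visitor.expectStringFormat(typeHint);
            }
        }
    }

    // Hypothetical POJO used only to drive schema generation.
    public static class Doc {
        public ObjectId id;
        public String name;
    }

    static String generateSchema() throws JsonMappingException {
        AvroMapper mapper = new AvroMapper();
        SimpleModule module = new SimpleModule();
        module.addSerializer(ObjectId.class, new ObjectIdAsStringSerializer());
        mapper.registerModule(module);

        AvroSchemaGenerator generator = new AvroSchemaGenerator();
        mapper.acceptJsonFormatVisitor(Doc.class, generator);
        return generator.getGeneratedSchema().getAvroSchema().toString(true);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(generateSchema());
    }
}
```

Because the serializer is registered for the type, the bean introspection that produced counter, machineIdentifier and the other fields is skipped for ObjectId.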