optimizing serializer


Jimit Shah

Apr 2, 2015, 1:19:08 PM
to netfli...@googlegroups.com

I am writing an application that uses Zeno. I have a system with around 25 different object types. All the objects have a flat structure, i.e. no object has a reference to another object. I have around half a million such objects. This is an example serializer that I have written.


public class ContactSerializer extends NFTypeSerializer<Contact> {

    public ContactSerializer() {
        super("Contact");
    }

    // Field names written here must match the names declared in createSchema().
    @Override
    protected void doSerialize(Contact value, NFSerializationRecord rec) {
        serializePrimitive(rec, "ID", value.getID());
        serializePrimitive(rec, "ManageEntityId", value.getManageEntityId());
        serializePrimitive(rec, "Email", value.getEmail());
        serializeObject(rec, "Privilege", value.getPrivilege());
        serializePrimitive(rec, "Version", value.getVersion());
    }

    @SuppressWarnings("unchecked")
    @Override
    protected Contact doDeserialize(NFDeserializationRecord rec) {
        return Contact.newBuilder()
            .setID(deserializeLong(rec, "ID"))
            .setManageEntityId(deserializeLong(rec, "ManageEntityId"))
            .setEmail(deserializePrimitiveString(rec, "Email"))
            .setPrivilege((ContactPrivilege) deserializeObject(rec, "Privilege"))
            .setVersion(deserializeInteger(rec, "Version"))
            .build();
    }

    @Override
    protected FastBlobSchema createSchema() {
        return schema(
            field("ID", FieldType.LONG),
            field("ManageEntityId", FieldType.LONG),
            field("Email", FieldType.STRING),
            field("Privilege", "ContactPrivilege"),
            field("Version", FieldType.INT)
        );
    }

    // The enum field is the only nested type, so it is the only sub-serializer.
    @Override
    public Collection<NFTypeSerializer<?>> requiredSubSerializers() {
        return serializers(
            new EnumSerializer<ContactPrivilege>(ContactPrivilege.class)
        );
    }
}



When I try to load 0.5 million such objects on the client side from a file generated on the data server, it takes almost 20 minutes. My client-side server has around 16 GB of RAM and 4 cores. Do you think I need to optimize my serializer, or is this a reasonable load time?


Thanks,

Jimit

Jonathan Stockdill

Apr 2, 2015, 2:14:05 PM
to Jimit Shah, netfli...@googlegroups.com
  • What are the current heap settings?  Does increasing it help?
  • How are you reading the data in?
—jon
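
Both questions can be checked from inside the client process. The following is only a minimal sketch, assuming the snapshot is read from a local file; `LoadDiagnostics`, `maxHeapMiB`, and `openBuffered` are hypothetical names and not part of Zeno. An unbuffered file stream can turn many small reads into individual system calls, so wrapping it in a `BufferedInputStream` is one thing worth ruling out early.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LoadDiagnostics {

    /** Maximum heap the JVM will attempt to use (roughly the -Xmx value), in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Opens the file behind a 1 MiB buffer so small reads do not each hit the OS. */
    static DataInputStream openBuffered(Path file) throws IOException {
        InputStream raw = Files.newInputStream(file);
        return new DataInputStream(new BufferedInputStream(raw, 1 << 20));
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        if (args.length > 0) {
            try (DataInputStream in = openBuffered(Paths.get(args[0]))) {
                // Assumption: in the real client, `in` would be handed to the
                // code that reads the snapshot. Here we only count bytes to
                // measure raw read throughput.
                long total = 0;
                byte[] chunk = new byte[8192];
                for (int n; (n = in.read(chunk)) != -1; ) {
                    total += n;
                }
                System.out.println("Read " + total + " bytes");
            }
        }
    }
}
```

If the raw read of the file finishes in seconds while the full load still takes minutes, the bottleneck is unlikely to be disk I/O.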




--
You received this message because you are subscribed to the Google Groups "Netflix Zeno Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to netflix-zeno...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Drew Koszewnik

Apr 2, 2015, 3:29:58 PM
to netfli...@googlegroups.com, jimi...@gmail.com
Hi Jimit,

No, 20 minutes is far above the expected range for what you're describing.  

I have the same questions as Jon.

If increasing the heap size doesn't help (and you aren't I/O bound), I would suggest as a next step hooking up a profiler (e.g. Java Flight Recorder or YourKit) during the 20-minute load to determine where the time is being spent.

Thanks,
Drew.
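
Before attaching a full profiler, a rough first measurement can show whether the load is dominated by garbage collection. This is only a sketch using the standard `java.lang.management` API; `LoadTimer`, `gcMillis`, and `measure` are hypothetical names, and the workload in `main` is a placeholder for the real snapshot load.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class LoadTimer {

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    static long gcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the task and returns {elapsedMillis, gcMillisDuringTask}. */
    static long[] measure(Runnable task) {
        long gcBefore = gcMillis();
        long start = System.nanoTime();
        task.run();
        long elapsed = (System.nanoTime() - start) / 1_000_000L;
        return new long[] { elapsed, gcMillis() - gcBefore };
    }

    public static void main(String[] args) {
        // Placeholder workload: replace the body with the real snapshot load.
        long[] result = measure(() -> {
            byte[][] junk = new byte[1000][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[1024];
            }
        });
        System.out.println("elapsed ms = " + result[0] + ", GC ms = " + result[1]);
    }
}
```

If the GC figure is a large share of the elapsed time, heap sizing is the likelier culprit; if it is small, a profiler is the better way to see where the remaining time goes.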