Generating subclasses of a protocol buffer serialized class

Chris

Jan 17, 2009, 2:14:06 AM
to Protocol Buffers
Hi, sorry if this is a dumb question. I have a class A which I want to
serialize, but I also want to add logic to it (which I can't today
because it's generated). I was wondering if there was:

- A way to create a class B that extends A, where none of B's own
member variables are serialized.
- Correct handling of B during serialization.

Basically B contains my business logic, A contains all the serialized
fields.

This technique is not unknown in the similar problem of
object-relational mapping.

Currently it seems a little messy: we are either copying fields to and
from the serialized form into a richer object or using lots of
delegation, neither of which is that clean.

Best

ChRiS

Shane Green

Jan 17, 2009, 3:02:20 AM
to Chris, Protocol Buffers
I believe that extending one of the generated types would be problematic
from a maintainability standpoint, and may introduce unnecessary
limitations with regard to using and extending the business classes.
Have you considered associating behaviours with the data instances using
a wrapper-style approach?

The business classes could wrap instances of the generated data
structure classes, accessing and manipulating their properties as
needed. Serialization of these objects could amount to serializing
their data, perhaps wrapped in a message which includes the business
object specifier so the correct type could be determined dynamically
when parsing.

If the application is such that extending the generated message types
could have made sense, then it may make sense to view the wrapper based
solution in terms of object state.

# Marshalling
state = business_object.get_state()
data = state.SerializeToString()

# Unmarshalling
state = StateMessage()  # hypothetical name for the generated message class
state.ParseFromString(data)
business_object.set_state(state)
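
The wrapper idea above, including the business object specifier, could be sketched roughly as follows. This is only an illustration under assumptions: AccountState is a stand-in for a generated message class and uses JSON instead of the protobuf wire format, and Account, REGISTRY, marshal and unmarshal are hypothetical names, not part of any protobuf API. Only the method names SerializeToString and ParseFromString mirror the real Python API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AccountState:
    """Stand-in for a generated message (hypothetical); JSON replaces the wire format."""
    owner: str = ""
    balance: int = 0

    def SerializeToString(self) -> bytes:
        return json.dumps(asdict(self)).encode("utf-8")

    def ParseFromString(self, data: bytes) -> None:
        for name, value in json.loads(data.decode("utf-8")).items():
            setattr(self, name, value)


class Account:
    """Business object: behaviour lives here, serialized fields live in the state."""

    def __init__(self, state=None):
        self._state = state if state is not None else AccountState()

    def get_state(self):
        return self._state

    def set_state(self, state):
        self._state = state

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._state.balance += amount


# Maps a type specifier to (business class, state class) so the correct
# type can be chosen dynamically when parsing.
REGISTRY = {"account": (Account, AccountState)}


def marshal(kind, business_object):
    # Envelope: type specifier, a newline, then the serialized state.
    payload = business_object.get_state().SerializeToString()
    return kind.encode("utf-8") + b"\n" + payload


def unmarshal(data):
    kind, payload = data.split(b"\n", 1)
    business_cls, state_cls = REGISTRY[kind.decode("utf-8")]
    state = state_cls()
    state.ParseFromString(payload)
    business_object = business_cls()
    business_object.set_state(state)
    return business_object
```

With real protobuf the envelope would more likely be a message of its own holding the specifier and the serialized state; the newline-delimited form here just keeps the sketch short.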

Alek Storm

Jan 17, 2009, 3:20:42 AM
to Protocol Buffers
I agree with Shane. Wrapping the data is a great way to separate the
business logic from the data. Actually, if you're using Python,
you're not allowed to subclass generated message types (though there's
no 'sealed' modifier) - it screws with the metaclass machinery.

Cheers,
Alek

Chris

Jan 17, 2009, 3:30:45 AM
to Protocol Buffers
Thanks Steve and Alek. As I mentioned, wrappers delegating access to a
contained class and simple copying to and from a business class are what
we do today, and it's a mess: it's error-prone and totally manual. Am I
missing something? Every new attribute needs a few more lines added to
the business class to get and set its value. In
a prior job I did use a system where persisted classes were generated
as *Base classes where you were expected to implement the * extending
*Base. This worked very well and reduced maintainability. I can of
course imagine that for languages that are not OO it would prove to be
a challenge (yes, I don't use Python).

Thanks

C

Henner Zeller

Jan 17, 2009, 2:53:08 PM
to Chris, Protocol Buffers
Hi,
On Sat, Jan 17, 2009 at 9:30 AM, Chris <chrisj...@gmail.com> wrote:
>
> Thanks Steve and Alek. As I mentioned, wrappers delegating access to a
> contained class and simple copying to and from a business class are what
> we do today, and it's a mess: it's error-prone and totally manual. Am I
> missing something? Every new attribute needs a few more lines added to
> the business class to get and set its value.

Typically, a business class should not expose all the internal state
(i.e. the attributes of the underlying data) in the first place. You
should be very specific about which attributes you expose.

Anyway, having said that, you might consider just returning a
reference or pointer to the protocol buffer kept in the business class
in case you _really_ have to access the internal data:
business_object->data().get_foobar().
This way you don't have to modify anything when you add fields to
the data. However, this way you always expose all properties of the
internal data, which is a bad idea - but again, this would have
happened with the inheritance approach as well.
Another advantage is that the access code looks sufficiently ugly
that users will avoid using internal data fields directly :)

> In
> a prior job I did use a system where persisted classes were generated
> as *Base classes where you were expected to implement the * extending
> *Base. This worked very well and reduced maintainability.

You said the right thing here, but I guess you meant to say 'reduced
maintenance' .. ;)

-h

Chris

Jan 18, 2009, 1:52:04 AM
to Protocol Buffers
Yes, after I pressed the button I realized the error of my ways (re
"reduced maintainability"). I partially agree with you about the
exposing of the internal nastiness; however, that is a choice to be made
by the developer on a per-use basis. To not have that there is like saying
we don't offer you object-oriented programming because you will make
mistakes... well, kinda (not meant to be a deeply religious argument
open for internet flaming).

C

Marc Gravell

Jan 18, 2009, 5:17:02 AM
to Protocol Buffers
What language are you using? In C#, "partial classes" are a viable way
of adding extra logic into generated classes - the protobuf-net
generator allows this fairly well. In the more general sense, consider
encapsulation over inheritance, or simply keep the two separate (for
example, passing the generated object into static methods defined in
the business class, etc.).

Marc Gravell

Chris

Jan 18, 2009, 8:03:26 PM
to Protocol Buffers
Thanks Marc. It's Java. So far people keep recommending what I am
already doing (delegation), which is itself not maintainable. Sounds
like there is a need for a code generator to generate the delegation
of those methods you want to expose :-}
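
For comparison, in a dynamic language the delegation does not need a generator at all. The following is a minimal sketch in Python (used only to match the snippet earlier in the thread), assuming a hypothetical PersonData stand-in for a generated message and a hypothetical Person wrapper; in Java the same effect would need generated forwarding methods or a similar mechanism.

```python
from dataclasses import dataclass


@dataclass
class PersonData:
    """Stand-in for a generated message class (hypothetical)."""
    name: str = ""
    email: str = ""
    password_hash: str = ""


class Person:
    """Business wrapper that forwards only whitelisted fields to the message."""

    _EXPOSED = frozenset({"name", "email"})

    def __init__(self, data):
        # Bypass our own __setattr__ so the wrapped message is stored directly.
        object.__setattr__(self, "_data", data)

    def __getattr__(self, attr):
        # Called only when normal lookup fails, i.e. for delegated fields.
        if attr in self._EXPOSED:
            return getattr(self._data, attr)
        raise AttributeError(attr)

    def __setattr__(self, attr, value):
        if attr in self._EXPOSED:
            setattr(self._data, attr, value)
        else:
            raise AttributeError(f"{attr} is not exposed")

    def display_name(self):
        # Business logic uses the delegated fields like ordinary attributes.
        return f"{self.name} <{self.email}>"
```

Exposing a new field then means adding one name to _EXPOSED rather than writing a getter and a setter.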

Kenton Varda

Jan 20, 2009, 2:08:04 PM
to Chris, Protocol Buffers
Sorry, but we don't allow subclassing of protocol buffer objects (in any language).  Allowing this leads to too many "fragile base class" problems.

If you want to expose all of the fields of your protocol message (which is what you'd get from subclassing), you can always add an accessor to your wrapper which returns the wrapped message object.  Then you do not need to add a new accessor for every new field added.  If you only want to expose some fields but not others, you could separate your message into two messages like:

message MyMessage {
  message Public {
    // public fields here
  }
  required Public public_fields = 1;
  // private fields here
}

Then your wrapper class can provide a method to access the MyMessage.Public part but not the rest.
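
A wrapper along those lines might look like the sketch below. It is illustrative only: the two dataclasses are stand-ins for the classes that would be generated from MyMessage, and the field names title and internal_note, as well as MyBusinessObject itself, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Public:
    """Stand-in for the generated MyMessage.Public; 'title' is a hypothetical field."""
    title: str = ""


@dataclass
class MyMessage:
    """Stand-in for the generated MyMessage."""
    public_fields: Public = field(default_factory=Public)
    internal_note: str = ""  # hypothetical private field


class MyBusinessObject:
    def __init__(self, message):
        self._message = message

    @property
    def public(self):
        # Callers reach every public field through this one accessor, so
        # fields added to Public later need no new wrapper code.
        return self._message.public_fields

    def annotate(self, note):
        # Private fields are touched only through business methods.
        self._message.internal_note = note
```

Callers would then write something like obj.public.title, while the private fields stay behind business methods such as annotate.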

fedor.l...@gmail.com

Feb 4, 2009, 6:22:05 PM
to Protocol Buffers
> Sorry, but we don't allow subclassing of protocol buffer objects (in any
> language).  Allowing this leads to too many "fragile base class" problems.

Sorry to bring up an old topic again, and I know it's discouraged, but
I was wondering if there would be any incompatibilities if the
subclass would remain very simple, simply adding a couple of custom
member variables and otherwise only calling public functions in its
own functions. Are there any binary structure dependencies that I'm
not thinking of that would be broken (like the offsets table), or would
this be a conservative enough subclassing?

Kenton Varda

Feb 4, 2009, 7:51:26 PM
to fedor.l...@gmail.com, Protocol Buffers
The problem is that it's difficult to define what is allowed and what is not.  "Fragile base class" problems are quite diverse.  In the end it's up to you to use your own engineering judgment, but the recommendation from us is that you look for designs that don't require subclassing protobuf objects.

With respect to offset tables, I think the only way you could break those is if you used multiple inheritance.  But it could depend on your compiler.  A simple subclass that just adds new members *probably* won't break anything, but C++ is so horribly complicated that it's hard to say for certain.

If you just want to add some custom members, why not define a struct/class containing an instance of your message and those custom fields alongside?
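
As a rough illustration of that last suggestion (in Python rather than C++, to match the snippet earlier in the thread), the containing type can be as small as this. Reading stands in for a generated message, and TrackedReading, dirty and source are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    """Stand-in for a generated message class (hypothetical)."""
    sensor_id: int = 0
    value: float = 0.0


@dataclass
class TrackedReading:
    # The message instance holds everything that gets serialized.
    message: Reading = field(default_factory=Reading)
    # Custom members live alongside it and never reach the wire.
    dirty: bool = False
    source: str = ""

    def update(self, value):
        self.message.value = value
        self.dirty = True
```

Only tracked.message would be handed to the serializer; dirty and source stay local to the process.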