I think that map support would probably be useful. I've basically
created my own maps in protocol buffers a couple times, either by
using two repeated fields, or a repeated field of a custom "pair"
type. In these cases, it would have been nice to be able to use the
Protocol Buffer as a map directly, rather than needing to transfer the
data to some other object that actually implements the map. I would be
interested to hear the opinion of the Google maintainers. I'm assuming
that there are probably many applications inside Google that exchange
map-like messages.
This would be a big change, although I don't think it would be an
impossible one. It could be implemented as "syntactic sugar" over a
repeated Pair message. I think the biggest challenge is that maps are
a "higher level" abstraction than repeated fields, which leads to
many design challenges:
* Are the maps ordered or unordered?
* If ordered, how are keys compared? This needs to be consistent
  across programming languages.
* If unordered, how are hash values computed? This could result in a
  message being parsed and re-serialized differently, if different
  languages compute the hashes differently.
* For both, how are "unknown" fields handled?
* Do the maps support repeated keys?
* If not, what happens when parsing a message with repeated keys?
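To make a couple of these questions concrete, here is a rough C++ sketch of a map layered over repeated pairs. Everything in it is an assumption for illustration: Pair, ToMap and ToPairs are hypothetical names, not generated protobuf code, and the policy chosen (keys ordered by std::string comparison, last duplicate wins) is just one possible answer to the questions above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated "Pair" message (not a real protobuf type).
struct Pair {
  std::string key;
  int value;
};

// Builds a map view over the repeated pairs.
// Assumed policy: keys are ordered by std::string's lexicographic comparison
// (std::map), and when a key repeats, the last occurrence wins.
std::map<std::string, int> ToMap(const std::vector<Pair>& pairs) {
  std::map<std::string, int> result;
  for (const Pair& p : pairs) {
    result[p.key] = p.value;  // later duplicates overwrite earlier ones
  }
  return result;
}

// Converts the map back to repeated pairs in key order, so the output
// does not depend on any language-specific hash function.
std::vector<Pair> ToPairs(const std::map<std::string, int>& m) {
  std::vector<Pair> pairs;
  for (const auto& entry : m) {
    pairs.push_back(Pair{entry.first, entry.second});
  }
  return pairs;
}
```

Note that a round trip through ToMap and ToPairs silently drops earlier duplicates and reorders the entries, which is exactly the kind of behaviour every language implementation would have to agree on.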
Other message protocols contain map-like structures: JSON, Thrift, and
Avro. Avro only supports string keys. JSON object keys must also be
strings. Thrift has a similar note about maps:
http://wiki.apache.org/thrift/ThriftTypes
> For maximal compatibility, the key type for map should be a basic
> type rather than a struct or container type. There are some
> languages which do not support more complex key types in their
> native map types. In addition the JSON protocol only supports key
> types that are base types.
Evan
--
Evan Jones
http://evanjones.ca/
--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to prot...@googlegroups.com.
To unsubscribe from this group, send email to protobuf+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
On Oct 6, 2010, at 9:23, Igor Gatis wrote:
> It would be nice to have mapped fields, e.g. key-value pairs.
Indeed, maps have been brought up repeatedly. I forget the current state of the discussion, but I think it's generally agreed that it would be a good thing to add; it's just a matter of how to implement it (and finding the time to do it).
A couple of the major issues:
- backward compatibility with existing repeated fields
- ensuring that the client can't change the map key in the mapped value without also updating the map structure.
On Thu, Oct 7, 2010 at 6:19 PM, Jason Hsueh <jas...@google.com> wrote:
> A couple of the major issues:
> - backward compatibility with existing repeated fields
I believe this has been addressed by the proposal I have made. Maps are repeated fields, so they can be exposed as a list of pairs (string, message).
> - ensuring that the client can't change the map key in the mapped value without also updating the map structure.
Could you provide an example?
On Thu, Oct 7, 2010 at 7:02 PM, Igor Gatis <igor...@gmail.com> wrote:
> On Thu, Oct 7, 2010 at 6:19 PM, Jason Hsueh <jas...@google.com> wrote:
> > - backward compatibility with existing repeated fields
> I believe this has been addressed by the proposal I have made. Maps are repeated fields, so they can be exposed as a list of pairs (string, message).
For backwards compatibility, the idea was to allow current repeated message fields to be converted to maps.
You might expose these as pairs (string, message), or just as regular message accessors, except with a key parameter instead of an index.
On the wire though, the map field would be serialized as the messages only. Keys are not serialized separately; they're just a special field within the message, as in the example in the .proto file. That leads to ...
> > - ensuring that the client can't change the map key in the mapped value without also updating the map structure.
> Could you provide an example?
In the example from the .proto, suppose we generate key-based accessors for the items map field, and internally the field is represented as a hash_map<string, Item>.

    Item* mutable_items(const string& name);
    const Item& items(const string& name);

A client could do something like:

    const string key = "foo";
    foo.mutable_items(key)->set_name("bar");

That is, the item's logical key (and its key when serialized and reparsed) is "bar", but the hash_map still points to it using the old key "foo".
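
The stale-key problem can be reproduced with a small stand-alone sketch. The code below is not generated protobuf code: Item, Foo, has_items, Serialize and Parse are hypothetical stand-ins, std::unordered_map plays the role of hash_map, and "serialization" is modelled as a plain list of Items, on the assumption that only the messages go on the wire and the key lives in the name field.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the generated Item message; "name" is the key field.
struct Item {
  std::string name;
  void set_name(const std::string& n) { name = n; }
};

// Hypothetical message holding the "items" map field.
class Foo {
 public:
  // Returns the item stored under `key`, creating it (with name == key)
  // if it is absent.
  Item* mutable_items(const std::string& key) {
    auto it = items_.find(key);
    if (it == items_.end()) {
      it = items_.emplace(key, Item{key}).first;
    }
    return &it->second;
  }

  bool has_items(const std::string& key) const {
    return items_.count(key) > 0;
  }

  // Models the wire format: only the messages are emitted, the key travels
  // inside each Item rather than alongside it.
  std::vector<Item> Serialize() const {
    std::vector<Item> wire;
    for (const auto& entry : items_) wire.push_back(entry.second);
    return wire;
  }

  // Models parsing: each Item is re-indexed under its own name field.
  static Foo Parse(const std::vector<Item>& wire) {
    Foo foo;
    for (const Item& item : wire) foo.items_[item.name] = item;
    return foo;
  }

 private:
  std::unordered_map<std::string, Item> items_;
};
```

With this sketch, calling foo.mutable_items("foo")->set_name("bar") leaves the in-memory map answering has_items("foo"), while Foo::Parse(foo.Serialize()) answers has_items("bar") instead. A real design would presumably need to either make the key field immutable through the map accessor or re-index the entry when it changes.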