Issue 336 in protobuf: Support generic string type

61 views
Skip to first unread message

prot...@googlecode.com

unread,
Oct 6, 2011, 1:36:54 PM10/6/11
to prot...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 336 by olafvdspek: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

ATM 3 overloads are generated:
const ::std::string& value
const char* value
const char* value, size_t size

Could you add a 4th (or replace the std::string one) like this?

template<class T>
void set_field(const T& v)
{
set_field(v.data(), v.size());
}

This'd nicely support string types other than std::string.

prot...@googlecode.com

unread,
Nov 7, 2011, 4:28:11 AM11/7/11
to prot...@googlegroups.com
Updates:
Status: WontFix

Comment #1 on issue 336 by liuj...@google.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

We probably won't do this. It will make the API less clear and will
potentially be broken if the template type has data() and size() but
doesn't work like string. You can still use data() and size() to set the
field.

prot...@googlecode.com

unread,
Nov 7, 2011, 5:12:39 AM11/7/11
to prot...@googlegroups.com

Comment #2 on issue 336 by olafv...@gmail.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

That's only possible if the value is already in a variable. If it's the
return value of an expression, you can't 'just' call data() and size().

Not doing this makes protobuf significantly harder to use with custom
string types.

Eyal Farago

unread,
Nov 7, 2011, 6:37:24 AM11/7/11
to prot...@googlegroups.com, codesite...@google.com
you can always write your own plug-in that adds this kind of methods to your generated classes.
this plugin can be quite generic as it has access to the FileDecriptor (hence all message's descriptors), which allows it to identify string fields, and use the (well known and documented) insertion points, to add you own methods, which in turn van be built on top of the public API (you don't necessarily have to access the member variable itself).

eyal.

prot...@googlecode.com

unread,
Nov 9, 2011, 6:40:17 AM11/9/11
to prot...@googlegroups.com

Comment #3 on issue 336 by olafv...@gmail.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

Compare 1, 2 and 3. Which one looks cleaner?

array<char, 20> a();
string s();
vector<char> v();

1. { auto z = a(); b.set_field(z.data(), z.size()); }
2. b.set_field(s());
3. { auto z = v(); b.set_field(z.data(), z.size()); }

> and will potentially be broken if the template type has data() and size()
> but doesn't work like string.

Maybe you should remove the (const char*) overload too, as it might not be
a null terminated string... ;)
Seriously, what's the chance a developer uses such a class with set_field()
and doesn't get compile errors?


prot...@googlecode.com

unread,
Nov 10, 2011, 12:51:57 AM11/10/11
to prot...@googlegroups.com

Comment #4 on issue 336 by Eyal.Far...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

Hi,
As I replied in the group, there's always the possibility of achieving this
with a plugin.
GPB probably won't support custom strings classes as the actual members,
but it's quite easy to add 'setters' that takes some other user defined
classes and assign/convert them into a string, or even templated setters
like the one suggested here (not that I thinks this is a good idea).

I'd suggest writing a plugin that adds setters to the custom string classes
your code base is using, this is safer than adding the templated setter,
and quite easy to do; relying on the plugin API provided by GPB and the
well known (and halfway documented) insertion points of the c++ code
generator, you can add specific methods for each string member in your
messages. This way you get what you wanted and the GPB code-base remains
clean and solid.

Eyal.

prot...@googlecode.com

unread,
Nov 10, 2011, 4:03:50 AM11/10/11
to prot...@googlegroups.com

Comment #5 on issue 336 by olafv...@gmail.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

> As I replied in the group,

I didn't see your reply. Not everybody reads the group. ;)

I'll have a look.

prot...@googlecode.com

unread,
Nov 10, 2011, 6:15:31 AM11/10/11
to prot...@googlegroups.com

Comment #6 on issue 336 by olafv...@gmail.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

But it would be nice if you answered my questions.

prot...@googlecode.com

unread,
Nov 10, 2011, 7:09:48 AM11/10/11
to prot...@googlegroups.com

Comment #7 on issue 336 by Eyal.Far...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

assuming you refer to the '1 2 3' question in comment 3:
std::wstring also supports this kind of contract, it has data() and size()
methods,
however passing a wstring to the proposed methods will merely cause a
compile time warning, but may cause severe run-time issues, as the
wchar_t's in the wstring will be truncated to chars, hence causing data
loss, or even worth corruption (the resulting string may not be utf8
compatible as proto requires).

so yes, I do see some possible damage in adding this kind of overloads.
however if you do have classes that can be safely transformed into a string
(requiring more than one-liner as you stated before) you have several
options:
1. write a utility method to do this for you.
2. generate the setters for your specific classes using a plug-in.

either way you choose, I wouldn't recommend using function templates for
this, as it may introduce some dangers like the example I gave above, and
in any case I don't think the maintainers of GPB can support you on this
one as they are not aware of your classes and as I said this kind of
function template introduce some danger (and as the maintainer answered
you, some complexity).

eyal.

prot...@googlecode.com

unread,
Nov 10, 2011, 7:14:50 AM11/10/11
to prot...@googlegroups.com

Comment #8 on issue 336 by olafv...@gmail.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

> however passing a wstring to the proposed methods will merely cause a
> compile time warning

Really?

const char* a;
const wchar_t* b = a;

This does *not* compile. Why do you think this would merely generate an
error?

> 1. write a utility method to do this for you.

Can that be done without an unnecessary std::string construction/copy?

prot...@googlecode.com

unread,
Nov 10, 2011, 7:18:52 AM11/10/11
to prot...@googlegroups.com

Comment #9 on issue 336 by olafv...@gmail.com: Support generic string type
http://code.google.com/p/protobuf/issues/detail?id=336

> Why do you think this would merely generate an error?

Oops. I meant a warning, of course.

prot...@googlecode.com

unread,
Nov 10, 2011, 7:31:56 AM11/10/11
to prot...@googlegroups.com

Comment #10 on issue 336 by Eyal.Far...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

ok,
my bad :-), I was thinking about the iterators based c'tor of std::string,
this is not the one you are proposing to use.
...though your idea can be extended in such a way that does demonstrate the
warning:

template< typename It >
set_xxx( It b, It e ){
xxx_->assign( b, e );
}

ps. I steel think this is a bad idea.

prot...@googlecode.com

unread,
Nov 10, 2011, 7:35:58 AM11/10/11
to prot...@googlegroups.com

Comment #11 on issue 336 by olafv...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

Why would that compile with a warning?
Did you actually try it? If so, could you post the code and the warnings?

> ps. I steel think this is a bad idea.

Why?

prot...@googlecode.com

unread,
Nov 10, 2011, 8:04:17 AM11/10/11
to prot...@googlegroups.com

Comment #12 on issue 336 by Eyal.Far...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

the warning I'm referring to is regarding the truncation of wchar_t to
char, this will cause a loss of data in case your wchar_t contains a value
higher than 256, and will probably do the wrong thing for values in the
range [128..256).

Now, having that said, I couldn't reproduce this warning with
std::basic_string, but I do have a sample of the warning I was talking
about:

#include <string>


int main(int argc, char* argv[])
{
wchar_t w = 300;
std::wstring ww( 10, w );
//surprisingly enough, no warning here
std::string aa( ww.c_str(), ww.c_str() + ww.size() );
aa.assign( ww.c_str(), ww.c_str() + ww.size() );

//this does generate a warning
aa[ 2 ] = w;
//1>c:\users\eyal\documents\visual studio
2005\projects\sample_cpp\sample_cpp\sample_cpp.cpp(18) : warning
C4244: '=' : conversion from 'wchar_t' to 'char', possible loss of data

return 0;
}

prot...@googlecode.com

unread,
Nov 10, 2011, 12:02:43 PM11/10/11
to prot...@googlegroups.com

Comment #13 on issue 336 by olafv...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

Ah. But that's not what I requested.
So, why do you steel think it's a bad idea?

prot...@googlecode.com

unread,
Nov 10, 2011, 1:52:02 PM11/10/11
to prot...@googlegroups.com

Comment #14 on issue 336 by Eyal.Far...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

I'm glad i managed to explain the wstring issue...
As the maintainer answered your request, adding more and more kinds of
setters clutters the interface,this is basically what's bothering me about
it.
Regarding your question about imlementing the utility assignment function
without an additional string copy, this is trivial to imlement since gpb
provides the mutable_xxx() accessor, so the utility function can use any
trick in the book to assign your value into the message's field
(incluedingswap), this can be easily done using the plugin approach.

prot...@googlecode.com

unread,
Nov 10, 2011, 2:20:32 PM11/10/11
to prot...@googlegroups.com

Comment #15 on issue 336 by olafv...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

You could remove the (const char*) overload, because it might not be a null
terminated string...

prot...@googlecode.com

unread,
Nov 10, 2011, 3:39:35 PM11/10/11
to prot...@googlegroups.com

Comment #16 on issue 336 by Eyal.Far...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

Well, you could request overloads for all of string`s mutating method but
it doesn`t make too much sense, does it? Gpb provides the coammons and
alows you to use the rest throgh the mutable_xxx() overloads, i think
custom string types are an isoteric use case from the library point of view.
I think this discussion is exhausted, i can help you with the plugin if you
like(let`s take it to the group).

Eyal.

prot...@googlecode.com

unread,
Nov 12, 2011, 11:02:13 AM11/12/11
to prot...@googlegroups.com

Comment #17 on issue 336 by olafv...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

What's wrong with staying here?

prot...@googlecode.com

unread,
Nov 15, 2011, 8:29:26 AM11/15/11
to prot...@googlegroups.com

Comment #18 on issue 336 by olafv...@gmail.com: Support generic string
type
http://code.google.com/p/protobuf/issues/detail?id=336

liujisi, could you please reply to my comments (#3) and tell me what the
recommended way to do this is?

Reply all
Reply to author
Forward
0 new messages