Yes. They're the same on the wire.
> I have a few fields that will probably only contain ASCII, i.e. legal UTF-8,
> but I'm not 100% sure.
> I am tempted to just turn them all to "bytes".
> But this raises the question - what is the "string" type useful for, and why
> shouldn't I just always use "bytes" to be safe, all the time, and not bother
> with "string" at all?
> Does "string" add anything besides validation that only valid UTF8 is
> passing over the wire? Is there really a big benefit to this behavior? Or
> is there some other advantage that I'll miss out on by changing all my
> "string"s to "bytes"?
If you use the C++ API there is not much difference, since both types
are represented as std::string in the generated code. It makes a big
difference for the Java API (and Python?), which have a native string
type. In Java, if you declare a field with the protocol buffer
'string' type, the generated API will return a java.lang.String,
while a 'bytes' field will return a ByteString. A ByteString can hold
arbitrary bytes, whereas a 'string' field must contain valid UTF-8,
which is what lets the generated code hand it back as a native Java
String. So while ByteString is more flexible, String is more
convenient to deal with in Java code, because all the standard
string-manipulation libraries can handle it.
So the benefit is a more convenient API in the generated Java code.
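To make that concrete, here is a sketch of what the generated Java
code looks like (the message type 'LogEntry' and its field names are
made up for illustration):

  // Assumed .proto:
  //   message LogEntry {
  //     optional string message = 1;  // must be valid UTF-8
  //     optional bytes  payload = 2;  // arbitrary binary data
  //   }

  import com.google.protobuf.ByteString;

  public class StringVsBytes {
      public static void main(String[] args) {
          LogEntry entry = LogEntry.newBuilder()
              .setMessage("hello")  // a plain java.lang.String
              .setPayload(ByteString.copyFrom(
                  new byte[] {(byte) 0xFF, 0x00}))  // need not be UTF-8
              .build();

          // 'string' field: returned as java.lang.String, so the
          // standard string libraries just work.
          String upper = entry.getMessage().toUpperCase();

          // 'bytes' field: returned as a ByteString; you decode it
          // yourself if it happens to contain text.
          String decoded = entry.getPayload().toStringUtf8();

          System.out.println(upper + " / " + decoded.length());
      }
  }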
There is also a documentation benefit: if you use 'string' you
emphasize that a field contains only readable text, while 'bytes'
signals that it might contain an arbitrary binary blob.
-h