On Sunday, December 15, 2013 5:22:29 PM UTC-8, Brian Picciano wrote:
> My application that I have in mind (although this has been a problem in the past) is essentially re-implementing redis-cluster, but with some added capability.
Where does JSON fit in with redis-cluster? I would have expected requests and responses to be encoded using the "unified request protocol".
The data in args (and the key, for that matter) needs to be unescaped, because it won't necessarily be retrieved through a JSON interface later.
Because Redis command arguments are binary data, base64 or some other encoding is required to represent arguments as JSON strings.
On Sunday, December 15, 2013 6:51:48 PM UTC-8, Brian Picciano wrote:
> I assume it's already doing that for encoding/decoding strings, since those can just as well have binary data in them.

JSON strings are UTF-8; they cannot contain arbitrary binary data.

The json package coerces strings to valid UTF-8 by replacing invalid bytes with the Unicode replacement rune.
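To make the coercion concrete, here is a minimal sketch (not from the thread) that round-trips a Go string containing an invalid byte through encoding/json. The helper name `roundTrip` is made up for illustration; the only library calls are `json.Marshal` and `json.Unmarshal`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip (hypothetical helper) marshals s as a JSON string value
// and then unmarshals the result back into a Go string.
func roundTrip(s string) (string, error) {
	enc, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	var out string
	if err := json.Unmarshal(enc, &out); err != nil {
		return "", err
	}
	return out, nil
}

func main() {
	// 0xff can never appear in valid UTF-8, so this string is "binary".
	in := string([]byte{'a', 0xff, 'b'})
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	// The invalid byte does not survive: it comes back as U+FFFD.
	fmt.Printf("in=%q out=%q unchanged=%v\n", in, out, in == out)
}
```

The round trip succeeds without an error, but the data that comes out is not the data that went in, which is why arbitrary bytes cannot safely ride along in a JSON string.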
I'm going to compress my three responses to one.

> What I'm less clear on is exactly what your use case is for encoding it as a string if you're dealing with it exclusively as bytes. I mean, if you're using it as both, surely you're making the copy anyway at some point.

That's my point: I don't want to use string EVER in my application (I have no reason to for this particular one). But with encoding/json I have to, because I can't directly get []byte out of it for a JSON string value. So I have to convert.

> JSON defines strings to be UTF-8 encoded, and as such is not suitable for storing binary data. Encoding an unknown []byte with base64 eliminates problems

I think that's a decision the coder should make. If I am worried that my binary data can't be encoded into a UTF-8 string then I can encode it into hex or base64 or whatever I like. But if I KNOW that my data is coming in as a proper JSON string and going out the other end without being changed in between, there's no reason I should be forced to pay the penalty of four extra copies ([]byte -> string (inside encoding/json) -> []byte -> app -> string -> []byte (inside encoding/json)).

> If it's textual data, then string is the correct type.

That's true if I am actually interacting with the data. If I'm just carrying the data along and spitting it back out somewhere else then I don't really care what it is, and what I really need to optimize for is speed and memory. Four copies aren't helping.

> Alternatively you can make your own type that implements MarshalJSON and UnmarshalJSON.

The problem with doing this (and the RawMessage) is that you skip the unicode (un)escaping step which encoding/json does for strings (internally, it actually does it while they're still []byte, so it's pretty trivial to have it do it for []byte fields too). I could just pass along the []byte untouched, with the backslashes and all still in there, and send it out the other end as a JSON string and no one would be any the wiser. But what if that other end isn't JSON? What if it's some custom binary interface? They're going to be receiving different data than was passed in.
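The escaping concern can be sketched as follows. This is an illustration only: the struct names `rawMsg` and `strMsg` and the helper `decodeBoth` are invented here. It decodes the same JSON once into a json.RawMessage field and once into a string field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawMsg keeps field A as the raw JSON bytes of the value.
type rawMsg struct {
	A json.RawMessage
}

// strMsg lets encoding/json decode field A as a JSON string.
type strMsg struct {
	A string
}

// decodeBoth (hypothetical helper) decodes input both ways and returns
// the RawMessage bytes (as a string) and the decoded string value.
func decodeBoth(input []byte) (raw string, str string, err error) {
	var r rawMsg
	if err = json.Unmarshal(input, &r); err != nil {
		return "", "", err
	}
	var s strMsg
	if err = json.Unmarshal(input, &s); err != nil {
		return "", "", err
	}
	return string(r.A), s.A, nil
}

func main() {
	raw, str, err := decodeBoth([]byte(`{"A":"tab\there"}`))
	if err != nil {
		panic(err)
	}
	// raw still has the surrounding quotes and the two-character \t escape.
	fmt.Printf("RawMessage: %s (%d bytes)\n", raw, len(raw))
	// str holds a real tab character and no quotes.
	fmt.Printf("string:     %q (%d bytes)\n", str, len(str))
}
```

A consumer that is not JSON-aware would receive the quoted, escaped form from the RawMessage path, which is the mismatch described above.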
On Sun, Dec 15, 2013 at 7:32 AM, egon <egon...@gmail.com> wrote:
You can use:

    type MyStruct struct {
        A, B json.RawMessage
    }

json.RawMessage is defined as type RawMessage []byte. Alternatively you can make your own type that implements MarshalJSON and UnmarshalJSON.

The reason it does b64 is that it is "the correct way" to represent a byte array in JSON. In other words, by default it is safe, but you can override the behavior by using RawMessage or a custom Marshaler.

+ egon
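A minimal sketch of the custom-type route might look like the following. The type name `Text` and the helpers `encode` and `decode` are hypothetical, not anything from the thread or the standard library. Note the assumption baked in: this version delegates to the string encoder, so it gets correct (un)escaping but still pays for the []byte/string conversions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Text (hypothetical name) is a []byte that travels as a plain JSON
// string instead of base64.
type Text []byte

// MarshalJSON emits the bytes as an escaped JSON string.
func (t Text) MarshalJSON() ([]byte, error) {
	return json.Marshal(string(t))
}

// UnmarshalJSON unescapes a JSON string and stores its bytes.
func (t *Text) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	*t = Text(s)
	return nil
}

type MyStruct struct {
	A, B Text
}

// encode and decode are small illustrative wrappers.
func encode(m MyStruct) (string, error) {
	out, err := json.Marshal(m)
	return string(out), err
}

func decode(input string) (MyStruct, error) {
	var m MyStruct
	err := json.Unmarshal([]byte(input), &m)
	return m, err
}

func main() {
	m, err := decode(`{"A":"foo","B":"b\u0061r"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("A=%s B=%s\n", m.A, m.B)

	out, err := encode(m)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The same UTF-8 coercion that applies to ordinary strings applies here too, so this is only appropriate when the bytes are known to be text.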
On Saturday, December 14, 2013 10:25:18 PM UTC+2, Brian Picciano wrote:
> I'm sorry if this has been brought up already, I haven't been able to find anything on it in my searching. I also know this would be a fairly significant change and would break backwards compatibility, but it is a fairly annoying "feature" that I think is more of a hindrance than a help.
>
> Basically the current behavior is that if you have a struct with a []byte field that you pass into the json marshaler, it will represent that in the output json string as the base64 encoded version of what you put in, and if you're unmarshaling into a []byte it will try to base64 decode the json string first. I can understand why this might be thought to be "the right way", since it forces you to use string as a string and raw binary data as []byte. But it's a bit presumptuous to assume that there is no legitimate reason anyone should pass a string through to a []byte and work with them that way.
>
> Currently, if I want my destination struct to be something like:
>
>     type MyStruct struct {
>         A, B []byte
>     }
>
> And have that be filled by the json string: `{"A":"foo","B":"bar"}`, then I would first have to make a temporary struct like:
>
>     type MyStructStr struct {
>         A, B string
>     }
>
> And copy/convert each field over individually. Same goes if I want to convert from MyStruct back into json. This adds a lot of extra code and data copies. In encoding/json the data is initially passed in and (un)quoted as []byte, where it is then converted to string. So now I'm converting back to []byte. This is unnecessary and unoptimizes for the common case.
>
> I've hacked a version of encoding/json where I took out the b64 stuff. It works just fine and is actually less code than it used to be. So there's no technical reason it has to stay (to my knowledge). Again, I know this probably won't make it in for anything in versions 1.*, but for 2 I think it should be considered. Also, if this is the wrong place to post this please let me know, I'll happily move it.
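For readers who want to see the behavior described in the original post, here is a short sketch using only the standard encoding/json package. The helpers `encode` and `decode` are illustrative names, and the struct mirrors the MyStruct from the message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MyStruct mirrors the struct from the original post.
type MyStruct struct {
	A, B []byte
}

// encode (hypothetical helper) marshals m and returns the JSON text.
func encode(m MyStruct) (string, error) {
	out, err := json.Marshal(m)
	return string(out), err
}

// decode (hypothetical helper) unmarshals input into a MyStruct.
func decode(input string) (MyStruct, error) {
	var m MyStruct
	err := json.Unmarshal([]byte(input), &m)
	return m, err
}

func main() {
	// []byte fields are written as base64 strings.
	out, err := encode(MyStruct{A: []byte("foo"), B: []byte("bar")})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"A":"Zm9v","B":"YmFy"}

	// Plain text is not valid padded base64, so decoding reports an error.
	_, err = decode(`{"A":"foo","B":"bar"}`)
	fmt.Println("plain text input:", err)

	// Base64 input decodes back to the original bytes.
	m, err := decode(`{"A":"Zm9v","B":"YmFy"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("A=%s B=%s\n", m.A, m.B)
}
```

So the JSON `{"A":"foo","B":"bar"}` cannot be read straight into []byte fields, which is what motivates the temporary string struct in the post.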
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.