I think it would be nice to specify UTF-8 as the only encoding to use in JSON-RPC 2.0.
--
You received this message because you are subscribed to the Google Groups "JSON-RPC" group.
To post to this group, send email to json...@googlegroups.com.
To unsubscribe from this group, send email to json-rpc+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/json-rpc?hl=en.
Here is some context from other JSON technologies:
"This function only works with UTF-8 encoded data."
- http://php.net/manual/en/function.json-encode.php
"The character encoding of JSON text is always Unicode. UTF-8 is the
only encoding that makes sense on the wire, but UTF-16 and UTF-32 are
also permitted."
- http://www.json.org/fatfree.html
"The character encoding for JSONRequest is UTF-8."
- http://www.json.org/JSONRequest.html
"The application/json Media Type for JavaScript Object Notation
(JSON)"
- http://www.ietf.org/rfc/rfc4627.txt, which says:
"3. Encoding
JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
00 00 00 xx UTF-32BE
00 xx 00 xx UTF-16BE
xx 00 00 00 UTF-32LE
xx 00 xx 00 UTF-16LE
xx xx xx xx UTF-8
"
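The null-byte table quoted above from RFC 4627 can be turned directly into code. A rough sketch in Python (the function name is illustrative, not from any spec):

```python
def detect_json_encoding(octets: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its first four
    octets, using the null-byte patterns of RFC 4627 section 3.
    Assumes the text begins with two ASCII characters and has no BOM."""
    b = octets[:4]
    if len(b) < 4:
        return "utf-8"  # too short for the table; UTF-8 is the default
    if b[0] == 0 and b[1] == 0 and b[2] == 0:
        return "utf-32-be"   # 00 00 00 xx
    if b[0] == 0 and b[2] == 0:
        return "utf-16-be"   # 00 xx 00 xx
    if b[1] == 0 and b[2] == 0 and b[3] == 0:
        return "utf-32-le"   # xx 00 00 00
    if b[1] == 0 and b[3] == 0:
        return "utf-16-le"   # xx 00 xx 00
    return "utf-8"           # xx xx xx xx
```

Note the check order matters: the UTF-32 patterns are special cases of the UTF-16 patterns, so they must be tested first.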
--
Matt (MPCM)
Yes.
Further suggestion: add some predefined encoding error class to the set
of JSON-RPC 2.0 errors and allow the server to fill in the information
about supported encodings in the error object.
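The suggestion above might look something like the following sketch. JSON-RPC 2.0 reserves the range -32099 to -32000 for implementation-defined server errors; the code -32001 and the `supported_encodings` key are arbitrary illustrations, not anything the spec defines:

```python
import json

# Hypothetical code from the implementation-defined server-error range.
UNSUPPORTED_ENCODING = -32001

def encoding_error_response(request_id, supported=("UTF-8",)):
    """Build the suggested error object, advertising the encodings the
    server accepts in the free-form `data` member."""
    return json.dumps({
        "jsonrpc": "2.0",
        "error": {
            "code": UNSUPPORTED_ENCODING,
            "message": "Unsupported character encoding",
            "data": {"supported_encodings": list(supported)},
        },
        "id": request_id,
    })
```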
Best,
r
My interpretation would be that the actual encoding used is really
part of the *transport* layer, not the json-rpc protocol layer, and so
falls outside the scope of the spec.
As long as the transport knows how to convert a given set of bytes
into valid Unicode characters, that is all the json-rpc protocol layer
requires.
Actual implementations of course require the transport to be defined,
and if we want seamless interoperability between implementations then
we need to have documents that spec out the transport layer too. As
far as I know, the only transport for which such a document exists is
HTTP (http://groups.google.com/group/json-rpc/web/json-rpc-over-http),
although it doesn't look like it has been updated to the current spec.
A similar document for json-rpc over tcp would be useful, and in my
view is the 'right' layer to specify the Unicode encoding(s) to use.
I would be happy to take on writing up a draft 'json-rpc over tcp'
spec, although progress would likely be slow since my partner is due
to have a child within the next few weeks.
I would suggest that a byte that cannot be part of a UTF-8 JSON string be sent immediately after the JSON string, to act as an end-of-object marker. It seems that UTF-8 encoded strings can contain any byte except 0xfe and 0xff, so I would suggest requiring an 0xff byte immediately after a complete JSON string. Such an encoding scheme would allow one TCP stream to carry multiple JSON objects in a full-duplex fashion, mixing procedure calls and replies if required. Does anyone else have thoughts on this?
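The 0xFF framing proposed above can be sketched as follows (a minimal illustration; the function names are made up, and there is no error handling for malformed UTF-8):

```python
def frame(messages):
    """Encode each JSON text as UTF-8 and append 0xFF, a byte that can
    never occur inside valid UTF-8, as an end-of-object marker."""
    out = bytearray()
    for m in messages:
        out += m.encode("utf-8")
        out.append(0xFF)
    return bytes(out)

def unframe(stream: bytes):
    """Split a byte stream back into complete JSON texts; any trailing
    bytes without the 0xFF marker are an incomplete message."""
    parts = stream.split(b"\xff")
    complete = [p.decode("utf-8") for p in parts[:-1]]
    return complete, parts[-1]  # (messages, unread remainder)
```

A receiver would buffer incoming bytes, call `unframe`, keep the remainder, and hand each complete text to the JSON parser.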
Why not do the obvious and just use a line feed, 0x0A to mark the end of each message? According to the JSON spec this should never occur in a valid JSON message as it should be escaped into '\n'.
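A sketch of the newline-delimited framing described above, assuming the sender's serializer never emits a raw line feed between tokens (Python's `json.dumps` with default settings does not):

```python
import json

def frame_lines(objs):
    """Serialize each object as a single line, using 0x0A as the
    message delimiter."""
    return "".join(json.dumps(o) + "\n" for o in objs).encode("utf-8")

def unframe_lines(stream: bytes):
    """Split on 0x0A; the final chunk is an incomplete message."""
    lines = stream.split(b"\n")
    return [json.loads(l) for l in lines[:-1]], lines[-1]
```

Note this only works if both sides agree never to insert unescaped line feeds as inter-token whitespace, which is the point debated in the follow-ups.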
For every problem there is always a solution that is simple obvious and wrong.
On Wed, 2010-05-19 at 10:02 +1000, Rasjid Wilcox wrote:
> In many contexts a line feed would be counted as whitespace, and so I
> would not want to assume that there are no line feeds within the JSON
> message itself.
What you say is contradictory: you choose to follow the spec on
"Whitespace can be inserted between any pair of tokens", but you ignore
that it also states 'String: Any UNICODE character except " or \ or
control character', and that it also specifies that these have to be
escaped.