JSON escapes solidus (/) characters?

John Panzer

Jun 25, 2008, 7:12:44 PM
to opensocial-an...@googlegroups.com
Apparently, JSON requires escaping of solidus (/) characters, meaning some of the examples in the spec are invalid; for example "http:\/\/example.org" is the right way to encode a URL in a JSON string (I am very surprised by this, if a JSON guru can confirm/correct this it'd be much appreciated).  Assuming this is true, the spec text should probably be updated to be valid JSON.

John Panzer (http://abstractioneer.org)

Arne Roomann-Kurrik

Jun 25, 2008, 7:28:24 PM
to opensocial-an...@googlegroups.com
I don't really resemble a JSON guru, but I don't think I've seen this
in practice. From http://www.ietf.org/rfc/rfc4627.txt?number=4627:

    A string begins and ends with
    quotation marks. All Unicode characters may be placed within the
    quotation marks except for the characters that must be escaped:
    quotation mark, reverse solidus, and the control characters (U+0000
    through U+001F).

Which would seem to indicate that solidus should be OK.
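
A quick way to check this reading is to hand a strict parser both kinds of input. The sketch below uses Python's standard json module, which is only my choice for illustration; the helper name parses is made up. It suggests that a raw solidus is accepted, while the characters the RFC lists as "must be escaped" are rejected when left raw.

```python
import json

# An unescaped solidus inside a string is accepted as-is.
url = json.loads('"http://example.org"')

def parses(text):
    """Return True if text is accepted as a JSON document (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Characters the RFC says must be escaped, left raw on purpose.
raw_tab = '"a\tb"'         # literal U+0009 inside the string
raw_quote = '"a"b"'        # unescaped quotation mark ends the string early
lone_backslash = '"a\\"'   # reverse solidus escapes the closing quote
```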

~Arne


--
OpenSocial IRC - irc://irc.freenode.net/opensocial

Mike Samuel

Jun 25, 2008, 7:48:29 PM
to opensocial-an...@googlegroups.com
I'm not a JSON guru either but my take on it is below.

Reverse solidus has to be escaped because it is the escape prefix.

JSON only allows a select few characters to be escaped to avoid ambiguities in the way eval works to unpack JSON -- '\v' is a vertical tab in most interpreters but the letter 'v' in others.

The solidus is among the set of characters that MAY be escaped so that JSON can be embedded in other formats.  E.g. ["<\/script>"] is the same as ["</script>"] but can safely be embedded in an HTML script tag. Bash-style line continuations, on the other hand, are non-standard but widely supported by eval.
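
To make the embedding case concrete, here is a minimal sketch in Python (an assumption on my part; the standard json module does not escape the solidus by itself, so the helper dumps_for_script is hypothetical). It rewrites every "</" in the serialized text as "<\/", which decodes to the same data but no longer contains a literal "</script>". It only covers the case discussed here, not every HTML embedding concern.

```python
import json

def dumps_for_script(value):
    """Serialize value to JSON, escaping the solidus in every '</' sequence."""
    # '<\/' and '</' decode to the same characters, so the data is unchanged,
    # but the text no longer contains a literal '</script>'.
    # '<' never occurs outside a string in JSON, so the replace only touches strings.
    return json.dumps(value).replace("</", "<\\/")

payload = ["</script>"]
embedded = dumps_for_script(payload)
```

Decoding embedded with json.loads should give back the original payload.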

The relevant bit of the RFC is
    char = unescaped /
           escape (
               %x22 /          ; "    quotation mark  U+0022
               %x5C /          ; \    reverse solidus U+005C
               %x2F /          ; /    solidus         U+002F
               %x62 /          ; b    backspace       U+0008
               %x66 /          ; f    form feed       U+000C
               %x6E /          ; n    line feed       U+000A
               %x72 /          ; r    carriage return U+000D
               %x74 /          ; t    tab             U+0009
               %x75 4HEXDIG )  ; uXXXX                U+XXXX

    escape = %x5C              ; \

    quotation-mark = %x22      ; "

    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
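
As a rough check of that escape list, the sketch below asks Python's standard json parser (my assumption for illustration; escape_accepted is a made-up helper) whether it accepts a backslash followed by each single character. The one-character escapes from the grammar should pass, and something like \v should not.

```python
import json

def escape_accepted(ch):
    """Return True if the parser accepts backslash + ch inside a string."""
    try:
        json.loads('"\\' + ch + '"')
        return True
    except json.JSONDecodeError:
        return False

# The single-character escapes named in the grammar: " \ / b f n r t
rfc_escapes = '"\\/bfnrt'
accepted = [ch for ch in rfc_escapes if escape_accepted(ch)]

# \v is not in the grammar, so a strict parser is expected to refuse it.
vertical_tab_ok = escape_accepted("v")
```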


cheers,
mike



Kevin Brown

Jun 25, 2008, 7:48:41 PM
to opensocial-an...@googlegroups.com
On Wed, Jun 25, 2008 at 4:12 PM, John Panzer <jpa...@google.com> wrote:
> Apparently, JSON requires escaping of solidus (/) characters, meaning some of the examples in the spec are invalid; for example "http:\/\/example.org" is the right way to encode a URL in a JSON string (I am very surprised by this, if a JSON guru can confirm/correct this it'd be much appreciated).  Assuming this is true, the spec text should probably be updated to be valid JSON.

This is primarily due to buggy JavaScript parsers that treat // as a comment when it's in a string (IE5.5 has this issue, for instance). It's a recommendation, not a requirement, from what I remember.

json.org indicates that it should be escaped (see http://json.org/string.gif). The RFC has the following in section 2.5 as well:

    string = quotation-mark *char quotation-mark

    char = unescaped /
           escape (
               %x22 /          ; "    quotation mark  U+0022
               %x5C /          ; \    reverse solidus U+005C
               %x2F /          ; /    solidus         U+002F
               %x62 /          ; b    backspace       U+0008
               %x66 /          ; f    form feed       U+000C
               %x6E /          ; n    line feed       U+000A
               %x72 /          ; r    carriage return U+000D
               %x74 /          ; t    tab             U+0009
               %x75 4HEXDIG )  ; uXXXX                U+XXXX

    escape = %x5C              ; \

    quotation-mark = %x22      ; "

    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

So, it probably should be escaped. I've found that almost all JSON serializers do escape it as well.





Nick Thompson

Jun 25, 2008, 9:20:34 PM
to opensocial-an...@googlegroups.com
I thought the reasoning here had to do with embedding JSON into
HTML. To safely allow that case it was deemed wise to make
sure that "</script>" would be escaped as "<\/script>".

Bob Ippolito's simplejson library did this and we've used it for
quite some time. In that time we have:
1) never embedded generated JSON in HTML
2) suffered greatly from escaped slashes. "http:\/\/whatever\/"
   kills the legibility of URLs embedded in JSON. This is a huge
   issue for debugging.

So we were relieved when more recent releases of simplejson
abandoned this practice. I'd call it a failed experiment and
leave '/' unescaped.
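
For what it's worth, the json module in Python's standard library (which grew out of simplejson) also leaves the solidus alone. A small sketch, where the record contents are invented purely for illustration:

```python
import json

# Hypothetical data; any URL-bearing value would do.
record = {"profileUrl": "http://example.org/people/1"}

# Default serialization: the solidus is written unescaped, so the URL stays readable.
text = json.dumps(record)
```

The resulting text round-trips through json.loads and contains no "\/" sequences.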

nick



Nick Thompson

Jun 25, 2008, 9:50:13 PM
to opensocial-an...@googlegroups.com
Sorry, prematurely replied to that last one.

The grammar that Kevin refers to below is a parsing grammar
and is ambiguous as a generative grammar. So the JSON
spec in that respect doesn't really recommend "\/" over
"/" or "\u002f", it just says you have to parse all three.
There may be such a recommendation elsewhere in the JSON
spec though.
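
The "parse all three" point can be sketched like this, assuming Python's standard json module as the parser (my choice, not something the spec prescribes). Each spelling of the same two-letter path should decode to an identical string.

```python
import json

# Three spellings of the same string: raw solidus, \/ escape, \u002f escape.
spellings = ['"a/b"', '"a\\/b"', '"a\\u002fb"']

# A conforming parser has to accept every one of them.
decoded = [json.loads(s) for s in spellings]

# All three collapse to one value, so the choice is purely a serializer decision.
distinct_values = set(decoded)
```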

The issue with "//" in IE 5.5 is a new one to me. I didn't
find anything about it in a quick search, and I'd be a
little surprised if JScript of any variety couldn't handle
the string "http://example.com/". But there may be more to
it than that. If there are issues with unescaped / other
than the </script> problem, I'd like to know.

nick
