Escaping LINE SEPARATOR for JSON payload


pambrose

Apr 5, 2007, 11:45:59 AM
to Google Web Toolkit
When sending user data to the client, I am getting consistent
Javascript Exceptions:
"JavaScriptException: JavaScript SyntaxError exception: unterminated
string literal"

I looked at the string that was being sent and found the culprit: an
embedded LINE SEPARATOR char ('\u2028').
If I strip the character from the string, the problem goes away.

I would greatly appreciate it if someone on the GWT team could verify
that \u2028 is getting properly
escaped for JSON.

Thanks,
Paul

Reinier Zwitserloot

Apr 5, 2007, 9:36:56 PM
to Google Web Toolkit
\u escapes are direct Java source code escapes.

For example, the following two lines in Java code:

String helloWorld = "Hello World!";
String helloWorld = \u0022Hello World!\u0022;

are -exactly- the same thing, because \u0022 is the code for the
double-quote character.

And the following code will not compile:

String helloWithQuote = "Jack said, \u0022Hello, World!\u0022";

because it is exactly the same as:

String helloWithQuote = "Jack said, "Hello, World!""; // clearly a Java syntax error

Hence, \u operates at a completely different level compared to \", and
all other backslash escapes.
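
A small sketch of that point, using \u0022 (the double quote) and \u0061
(the letter a). The class and method names are made up for illustration.
It compiles because the compiler translates \u escapes before it splits
the source into tokens, so they can stand in for string delimiters and
even for parts of identifiers:

```java
public final class UnicodeEscapeDemo {
    private UnicodeEscapeDemo() {}

    public static String quotedWithEscapes() {
        // After escape translation the compiler sees: return "Hello World!";
        return \u0022Hello World!\u0022;
    }

    public static int identifierWithEscape() {
        // The escape on the next line spells the identifier "a".
        int \u0061 = 42;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(quotedWithEscapes().equals("Hello World!")); // true
        System.out.println(identifierWithEscape());                      // 42
    }
}
```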

I'm not sure this is your problem, though. I'm just guessing.

pambrose

Apr 6, 2007, 12:26:37 AM
to Google Web Toolkit
Hi Reinier,

The LINE SEPARATOR character is *not* escaped in the current
GWT marshalling to JSON. That is why the JSON parsing on
the client is choking on the bad character.

Paul


Reinier Zwitserloot

Apr 6, 2007, 2:20:54 AM
to Google Web Toolkit
Ah, yes, rereading I see I misinterpreted that one.

It should be possible to root through the GWT source to figure that
out, but just as a verification exercise, can you check and see what
happens if you actually use \u2028 in your Java source? Just to rule
out that the step of parsing the Java file itself bombs, e.g. because
you saved as UTF-8 but for some reason the GWT compiler is trying to
parse your 2028 as ASCII/ISO-8859 and thus bollockses it up.

Dan Morrill

Apr 6, 2007, 9:40:57 AM
to Google-We...@googlegroups.com
Hi, Paul!

To make sure I'm thinking of the correct library, can you clarify where the exception is occurring?  Is this in the JSONParser code?

- Dan Morrill

pambrose

Apr 9, 2007, 1:43:08 PM
to Google Web Toolkit
Hey Dan,

I am not sure where in the JavaScript the problem is happening. All I
know is that I am getting a JavaScriptException when the object
containing the offending character is unmarshalled in the client.

Another way of seeing the problem is to add a string to some object
that is to be sent from server to client and assign it a value with an
embedded LINE SEPARATOR char:

test_val = "This is a test " + Character.toString('\u2028') + " of a bad char";

Send the object to the client and have a look in firebug at the
response sent.
I do not know what the appropriate JSON escape sequence should be for
such a character, but I
can see that no encoding is taking place in the response sent.
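
For anyone who wants to confirm on the server side that a payload string
really contains the character, here is a rough diagnostic sketch. The
class and method names are hypothetical and it is not part of GWT; it
simply reports the positions of any U+2028 or U+2029 characters:

```java
import java.util.ArrayList;
import java.util.List;

public final class LineTerminatorFinder {
    private LineTerminatorFinder() {}

    /** Returns the index of every U+2028 or U+2029 character in the text. */
    public static List<Integer> find(String text) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\u2028' || c == '\u2029') {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String testVal = "This is a test " + Character.toString('\u2028') + " of a bad char";
        System.out.println(find(testVal)); // prints [15]
    }
}
```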

Cheers,
Paul


Miguel Méndez

Apr 17, 2007, 3:03:30 PM
to Google-We...@googlegroups.com
Hi Pambrose,

I'm not sure if this will help you or not.  ECMA-262, the specification for JavaScript, states that 0x2028 and 0x2029 are considered line endings, regardless of where they occur.  Another interesting thing I discovered is that Mozilla-based browsers, including Firefox, actually respect the specification.  IE and Safari do not.

Our JSONParser just calls the JS eval function with the string it is given.  It will see the line endings and terminate the string early which will cause a JS unterminated string literal error.

The original string needs to have these alternate line endings unicode escaped.  I have a fix for GWT-RPC that will address this since our server to client encoding is JSON and we use eval to rehydrate the objects.  You may need additional escaping if you then take the string from RPC and try to eval it again.
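
As an illustration only, and not the actual GWT-RPC fix, the escaping could be sketched on the server side roughly like this. The class and method names are hypothetical. It replaces each raw U+2028 or U+2029 character with its six-character backslash-u escape, so that eval sees an ordinary escape sequence instead of a line ending inside the string literal:

```java
public final class LineSeparatorEscaper {
    private LineSeparatorEscaper() {}

    /** Replaces raw U+2028 and U+2029 characters with their six-character escapes. */
    public static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\u2028') {
                out.append("\\u2028"); // backslash, 'u', then the four hex digits
            } else if (c == '\u2029') {
                out.append("\\u2029");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "This is a test " + '\u2028' + " of a bad char";
        // The printed text contains the escape sequence, not the raw character.
        System.out.println(escape(raw));
    }
}
```

This only covers the two alternate line endings; quotes, backslashes and control characters still need whatever escaping the JSON encoder already does.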

HTH,