Native JSON and safe parsing implementation


Brendan

Oct 23, 2009, 5:55:46 PM
to Google Web Toolkit
I realize this post is coming at the end of the day on friday, but
hopefully it will get a look =)

Going through the different JSON parsing implementations available for
GWT out there, almost all of them rely on a straight eval and note
that they should only be used for trusted code. The old JSONValue code
seems kind of quaint in light of JavaScriptObjects, and looking
through the trunk it looks like there's just a TODO in JsonUtils for
safe parsing.

Anyway, I don't have that much experience writing JSNI code, so I
wanted to run this small class past you all and see if this aligns
with GWT best practices, if I'm doing anything monumentally stupid, or
if I'm just plain missing something.

The following is a pretty basic adaptation of Douglas Crockford's
json2.js. When parse() is called, it checks to see if there is a
native JSON object defined (which exists in the latest IE, Firefox,
Chrome and Safari) and simply uses that. If not, it defines a
$wnd.JSON.parse in JavaScript using the json2 code, which is then
indistinguishable from the native functionality in future calls (this
tactic comes straight from json2.js).

It lazily initializes the JSON object, since defining an entry point
doesn't seem like a big win and would add a lot of supporting code
just to make it happen. It also defines it on the $wnd object. Scope
still occasionally confuses me in JavaScript, but it seemed to make
more sense to define it there than on window.

Any thoughts? Even drive by comments are useful if it gives me enough
to google. Thanks!

The code:
/**
 * Evaluates a JSON string safely
 *
 * @param <T> The type of JavaScriptObject that should be returned
 * @param jsonString The source JSON text
 * @return The evaluated object
 * @throws NullPointerException if <code>jsonString</code> is <code>null</code>
 * @throws IllegalArgumentException if <code>jsonString</code> is empty
 * @throws JavaScriptException if <code>jsonString</code> is non-parseable
 */
public static final <T extends JavaScriptObject> T parse(String jsonString) {
  if (jsonString == null) {
    throw new NullPointerException();
  }

  if (jsonString.length() == 0) {
    throw new IllegalArgumentException("empty argument");
  }

  // check if JSON variable is defined, if not, call init code
  if (isJsonUndefined()) {
    defineJSON();
  }

  return nativeParse(jsonString);
}

private static native <T extends JavaScriptObject> T nativeParse(String jsonString) /*-{
  return $wnd.JSON.parse(jsonString);
}-*/;

/**
 * simple check to see if JSON object is defined
 * will be defined in browsers with native implementation
 * otherwise, will need to init manually
 *
 * @return true if JSON object is undefined
 */
protected static final native boolean isJsonUndefined() /*-{
  return !$wnd.JSON;
}-*/;

/**
 * naive way to define the JSON object *if* not already
 * natively defined by browser. straight copy from json2.js.
 */
private static final native void defineJSON() /*-{
  //http://www.JSON.org/json2.js
  //2009-09-29
  //Public Domain.
  //NO WARRANTY EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK.
  //See http://www.JSON.org/js.html

  // Create a JSON object only if one does not already exist.
  if (!$wnd.JSON) {
    $wnd.JSON = {};
  }

  (function () {
    var cx = /[\u0000\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g;

    // create parse method if one does not exist
    // The parse method takes a text and returns
    // a JavaScript value if the text is a valid JSON text.
    if (typeof $wnd.JSON.parse !== 'function') {
      $wnd.JSON.parse = function (text) {

        var j;

        // first stage, escape sequences
        cx.lastIndex = 0;
        if (cx.test(text)) {
          text = text.replace(cx, function (a) {
            return '\\u' +
                ('0000' + a.charCodeAt(0).toString(16)).slice(-4);
          });
        }

        // second stage, remove non-JSON patterns
        if (/^[\],:{}\s]*$/.
            test(text.replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@').
            replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, ']').
            replace(/(?:^|:|,)(?:\s*\[)+/g, ''))) {

          // third stage, eval
          j = eval('(' + text + ')');

          return j;

        }

        // If the text is not JSON parseable, then a SyntaxError is thrown.
        throw new SyntaxError('JSON.parse');
      };
    }
  }());

}-*/;

Sripathi Krishnan

Oct 24, 2009, 11:28:08 PM
to google-we...@googlegroups.com
You should take a look at http://tools.ietf.org/html/rfc4627

GWT recommends using the regular expression defined in that RFC. Also, one of their classes (ExternalTextResourcePrototype.java) defines the regular expression and runs eval, so you could use the same strategy.

See method evalObject in http://code.google.com/p/google-web-toolkit/source/browse/changes/bobv/clientbundle/user/src/com/google/gwt/resources/client/impl/ExternalTextResourcePrototype.java?spec=svn4992&r=4992


Source Code -
  /**
   * Evaluate the JSON payload. The regular expression to validate the safety of
   * the payload is taken from RFC 4627 (D. Crockford).
   *
   * @param data the raw JSON-encapsulated string bundle
   * @return the evaluated JSON object, or <code>null</code> if there is an
   *         error.
   */
  private static native JavaScriptObject evalObject(String data) /*-{
    var safe = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
      data.replace(/"(\\.|[^"\\])*"/g, '')));

    if (!safe) {
      return null;
    }

    return eval('(' + data + ')') || null;
  }-*/;
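
To see what that check lets through, here is a rough plain-JavaScript version of the same test outside JSNI. The function name isSafeJson is made up for this sketch and is not part of GWT; the two regular expressions are copied from evalObject above.

```javascript
// Plain-JavaScript sketch of the RFC 4627 safety check quoted above.
// isSafeJson is a hypothetical name used only for illustration.
function isSafeJson(data) {
  // Remove string literals first, since their contents may be anything.
  var withoutStrings = data.replace(/"(\\.|[^"\\])*"/g, '');
  // What is left may only contain JSON punctuation, number characters,
  // whitespace, and the letters used by true, false and null.
  return !/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(withoutStrings);
}

isSafeJson('{"a": [1, 2.5e3, true, null]}'); // true
isSafeJson('alert(1)');                      // false, "(" is not allowed
```

Note that this only decides whether the text is safe to hand to eval; it does not prove the text is well-formed JSON.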



--Sri



Brendan

Oct 25, 2009, 6:22:27 AM
to Google Web Toolkit
Thanks for the reply and the source reference. I hadn't seen that
class...it would be nice to pull that method into (maybe) JsonUtils so
that it would be in the core and more visible.

The RFC and the json2.js code (here: http://www.json.org/json2.js )
were both written by Douglas Crockford. The main difference seems to
be that the json2 code has been updated since the RFC was written to
deal with how JavaScript handles certain Unicode characters (which will
cause erroneous parsing unless they are escaped) and to speed up the
regex testing in older versions of IE and Safari. These are the first
and second "stages," respectively, as referenced in the code above and
in the json2 code, with more explanation there.
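
As a rough illustration of that first stage on its own: the regex is the one from json2.js, while the wrapper name escapeProblemChars is made up for this sketch. A raw U+2028 inside a string is legal JSON, but JavaScript engines of this era treat it as a line terminator, so it has to become an escape sequence before the text reaches eval.

```javascript
// Characters that json2.js rewrites as \uXXXX escapes before eval.
var cx = /[\u0000\u00ad\u0600-\u0604\u070f\u17b4\u17b5\u200c-\u200f\u2028-\u202f\u2060-\u206f\ufeff\ufff0-\uffff]/g;

// escapeProblemChars is a hypothetical helper name for this sketch.
function escapeProblemChars(text) {
  cx.lastIndex = 0; // the regex is global, so reset its position first
  return text.replace(cx, function (a) {
    // Pad the hex code to four digits, e.g. "ad" becomes "00ad".
    return '\\u' + ('0000' + a.charCodeAt(0).toString(16)).slice(-4);
  });
}

escapeProblemChars('"a\u2028b"'); // '"a\\u2028b"'
```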

However, the more important difference is that if a browser can
natively parse the JSON string, the json2 code (and thus the class in
my OP) will default to the native implementation. Besides the obvious
speed advantage to doing the parsing in native code, there are some
other problems with pure-js parsing. Most of the points on this page
are still true: http://ejohn.org/blog/native-json-support-is-required/

Anyway, obviously all that doesn't invalidate the current
implementation. Mostly this seems like an interesting case where
deferred binding would be the usual approach, but the delineation of
browsers with native JSON support doesn't match the current way the
browser targets are defined, necessitating a different approach.

Is there anywhere else in the codebase where native features are
supported similarly? I can think of something like getElementById, but
that was added so long ago it's probably a pretty clear case when
native support can be used.

Thanks again.

Thomas Broyer

Oct 25, 2009, 7:13:55 AM
to Google Web Toolkit

On 23 oct, 22:55, Brendan <bcke...@gmail.com> wrote:
> I realize this post is coming at the end of the day on friday, but
> hopefully it will get a look =)
>
> Going through the different JSON parsing implementations available for
> GWT out there, almost all of them rely on a straight eval and note
> that they should only be used for trusted code. The old JSONValue code
> seems kind of quaint in light of JavaScriptObjects, and looking
> through the trunk it looks like there's just a TODO in JsonUtils for
> safe parsing.

Funny! I'm finishing a JsonUtils patch (was about to submit it this
night but finally chose to review it one more time, ensuring tests
pass, checkstyle, etc.) adding:
- JsonUtils.safeParse
- JsonUtils.stringify
- JsonUtils.isArray (equivalent of ECMAScript 5's Array.isArray)
- deferred binding implementations using either native support
(user.agent=ie8) or emulation (eval() for the parsing; user.agent=
{ie6,gecko,opera}), plus a fallback to emulation when native support
isn't available (user.agent={gecko1_8,safari}): recent versions of
Firefox, Chrome and Safari have native support but older versions do
not, so we detect whether native JSON is supported and fall back to
emulation otherwise
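
The detect-and-fall-back case in that last item could look roughly like this in plain JavaScript. This is only a sketch and not the code from the patch: pickParser and the stand-in emulation function are hypothetical names, and the real emulation would be the regex-checked eval discussed earlier in the thread.

```javascript
// Choose a parse function once: native JSON.parse when the host object
// has it, otherwise the supplied emulation.
// pickParser and emulatedParse are hypothetical names for this sketch.
function pickParser(host, emulatedParse) {
  var hasNative = !!host.JSON && typeof host.JSON.parse === 'function';
  if (hasNative) {
    return function (text) { return host.JSON.parse(text); };
  }
  return emulatedParse;
}

// Stand-in for the emulation path, just to show which branch was taken.
var emulation = function (text) { return { emulated: true, text: text }; };

var modern = pickParser({ JSON: JSON }, emulation); // host with native JSON
var legacy = pickParser({}, emulation);             // host without it

modern('{"a": 1}').a;        // 1
legacy('{"a": 1}').emulated; // true
```

In GWT the host would be $wnd; it is passed in here so both branches can be exercised in any JavaScript environment.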

I'm using similar code at work with no problems so far.

I'll ping when my patch is sent for review.

Brendan

Oct 25, 2009, 8:28:38 AM
to Google Web Toolkit
Great!

stringify seems like it would make things even messier, but I guess
(though I'd have to look) if JSON is natively supported then the
browser vendor also has e.g. Date.prototype.toJSON() defined as well.

I look forward to seeing your code! especially that last case.

Thomas Broyer

Oct 25, 2009, 9:47:54 AM
to Google Web Toolkit


On 25 oct, 13:28, Brendan <bcke...@gmail.com> wrote:
> Great!
>
> stringify seems like it would make things even messier, but I guess
> (though I'd have to look) if JSON is natively supported then the
> browser vendor also has e.g. Date.prototype.toJSON() defined as well.
>
> I look forward to seeing your code! especially that last case.

Patch sent for review at http://gwt-code-reviews.appspot.com/86803

stringify() is only implemented for JSOs, because an application/json
text can only be an object or array, contrary to what JSON.stringify()
accepts and produces per ECMAScript 5.
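
A small sketch of that restriction in plain JavaScript, assuming a native or emulated JSON.stringify is present. The name stringifyTopLevel is made up for illustration and is not the API from the patch.

```javascript
// Only serialize values whose top level is an object or an array.
// stringifyTopLevel is a hypothetical name for this sketch.
function stringifyTopLevel(value) {
  var tag = Object.prototype.toString.call(value);
  if (tag !== '[object Object]' && tag !== '[object Array]') {
    throw new TypeError('top-level JSON text must be an object or array');
  }
  return JSON.stringify(value);
}

stringifyTopLevel({ a: [1, 2] }); // '{"a":[1,2]}'
JSON.stringify('x');              // '"x"', a bare string, which ES5 allows
```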

eneveu

Oct 26, 2009, 2:34:18 PM
to Google Web Toolkit
Hi everyone :)

I'm on the go (movie theater time!) and didn't have the time to read
all the replies in detail, but it seems like nobody talked about it:

There are some nice things inside gwt-rpc-plus: http://code.google.com/p/gwt-rpc-plus

http://code.google.com/p/gwt-rpc-plus/source/browse/#svn/trunk/gwt-rpc-plus/gwt/com/dotspots/rpcplus/client/codec

You can create loose/strict JSON encoders/decoders using
http://code.google.com/p/gwt-rpc-plus/source/browse/trunk/gwt-rpc-plus/gwt/com/dotspots/rpcplus/client/codec/impl/JSONFactory.java



Regards,

-Etienne