A lighter, more easily deployable version of Caja

David Bruant

Apr 29, 2013, 9:19:14 AM

to google-ca...@googlegroups.com

Hi,

Caja enables secure confinement of standards-based code. The confined code sits on the host page while believing it is in its own page. By achieving this, Caja proves that it is possible at all, which is in itself kind of a big deal.

Now, to enable this, Caja comes at some cost, including a live server-side rewriter, some runtime checks, and certainly other constraints that seem to reduce its attractiveness to web developers [1], as it's hard to deploy and performance takes a hit. Caja makes it possible to safely run malicious code. However, this isn't everyone's use case (well... it depends on your definition of security [2]).

So I was wondering whether there is a good balance to be found that makes deployment easier while giving up some security guarantees that are necessary only in the face of intentionally malicious code.
For instance, what is the best that can be done without the live server-side rewriter? (Rewriting the widget once with a command-line tool and loading the rewritten version would be acceptable, for instance.)
What I am looking for is a solution that would be easier to deploy, with a smaller runtime footprint, but that would still mostly confine well-behaved standard code, giving it the impression that it lives on its own page while it only sits in a div. And, given the different possible solutions, how hard would it be to develop (I assume by stripping out the parts of Caja that wouldn't be necessary)?

Thanks,

David

[1] http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2013-April/039448.html
[2] https://mail.mozilla.org/pipermail/es-discuss/2013-February/028804.html

๏̯͡๏ Jasvir Nagra

Apr 29, 2013, 9:27:05 AM

to Google Caja Discuss

Hi David,

We have a mode which takes advantage of ES5 strict mode and of an HTML and CSS sanitizer written in pure JavaScript to provide a lightweight Caja sandbox. It is able to do so without any server-side rewriting, provided it is running on a browser which supports strict mode correctly. While we have not officially documented it, you can explore it using the following example:

<!DOCTYPE html>
<html>
<head>
  <!-- Assumption: the tag that loads caja.js did not survive in this message;
       adjust the path to wherever caja.js is served from. -->
  <script src="caja.js"></script>
</head>
<body>
  <!-- The guest content is rendered inside this element. -->
  <div id="guest"></div>
  <script>
    function onsuccess(state) { console.log('Started in es5Mode = ' + state.es5Mode); }
    function onfail(err) { console.log('failed: ' + err); }

    caja.initialize({
      es5Mode: true,
      debug: true,
      log: function(x) { console.log(x); },
      maxAcceptableSeverity: 'NO_KNOWN_EXPLOIT_SPEC_VIOLATION'
    }, onsuccess, onfail);

    caja.load(
      document.getElementById('guest'),
      caja.policy.net.ALL,
      function(frame) {
        frame.code('http://www.thinkfu.com/trivial.html', 'text/html')
          .api({ alert: caja.tame(caja.markFunction(function(m) { alert(m); })) })
          .run(function() {});
      });
  </script>
</body>
</html>

jas


Kevin Reid

Apr 29, 2013, 10:11:56 AM

to google-ca...@googlegroups.com

On Apr 29, 2013, at 6:19, David Bruant <brua...@gmail.com> wrote:

> So I was wondering whether there is a good balance to be found that makes deployment easier while giving up some security guarantees that are necessary only in the face of intentionally malicious code.
> For instance, what is the best that can be done without the live server-side rewriter? (Rewriting the widget once with a command-line tool and loading the rewritten version would be acceptable, for instance.)

As Jasvir noted, Caja already has an almost pure client-side mode, ES5 mode (almost in that it still uses a server purely as a proxy to permit cross-origin/referrerless loading of resources). Also, there is in fact a command-line tool (bin/cajole) which will work in nearly all of the cases where the server will.

> What I am looking for is a solution that would be easier to deploy, with a smaller runtime footprint, but that would still mostly confine well-behaved standard code, giving it the impression that it lives on its own page while it only sits in a div. And, given the different possible solutions, how hard would it be to develop (I assume by stripping out the parts of Caja that wouldn't be necessary)?

I believe it is nearly trivial to do this. All you need to do is load the code which is responsible for constructing virtual pages:

- domado.js (DOM virtualization wrapper),
- html-emitter.js (constructing HTML from parser events),
- and their dependencies.

The one wrinkle is that you would have to stub out the calls into the Caja runtime intended to make secured objects; again, not that complex.

You still need, for example, the HTML and CSS schema/whitelist, because it contains information about parsing and the semantic categories of language constructs which is important both for interpreting input content and to get correct virtualization (as opposed to having parts of one guest leak into another).

Overall, the result of this would be essentially identical to ES5 mode, except that nothing would be frozen and there would be none of SES's workarounds for browser bugs; both of these would likely be significant performance improvements (the former because JS implementations tend to be optimized for the common case of not-frozen objects, and the latter because reimplementing built-in algorithms tends to be slow).
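
To make the "nothing would be frozen" point concrete, here is a minimal sketch in plain strict-mode JavaScript. It is not Caja or SES code; `tryTamper`, `frozenApi` and `plainApi` are hypothetical names used only for illustration. It shows the kind of guarantee that freezing provides and that an unfrozen variant would give up.

```javascript
// Illustration only: not taken from Caja or SES.
// In strict mode, writing to a property of a frozen object throws a TypeError.
function tryTamper(obj) {
  'use strict';
  try {
    obj.greet = function () { return 'replaced'; };
    return 'modified';
  } catch (e) {
    return e instanceof TypeError ? 'rejected' : 'unexpected error';
  }
}

var frozenApi = Object.freeze({ greet: function () { return 'hello'; } });
var plainApi = { greet: function () { return 'hello'; } };

console.log(tryTamper(frozenApi)); // frozen: the write is refused
console.log(tryTamper(plainApi));  // not frozen: the write succeeds
```

Without freezing, any code holding a reference to a shared object can replace its methods, which is acceptable only if the guest is assumed to be well-behaved.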

There is another possibility for lightening which you did not mention: removing the requirement for 'standard code'. A lot of the heavy weight of the DOM virtualization arises from implementing the exact specified DOM interface behavior, or behavior that code in the wild expects; in this hypothetical case we would instead be free to implement our own subsetted or modified API.

Off the top of my head, these are some possibilities for making things faster, or at least simpler:

- Get rid of NodeList objects which are 'live' (update in response to DOM mutations), which are a pain to implement and require several layers of special gimmicks.

- Remove the possibility of global (to a virtual document) variables; all executed scripts have clean lexical environments except for some explicit sort of export/import. (This would eliminate the need for a client-side (!) rewrite of JS and also simplify environment setup for eval.)

- Remove all of the HTML parser special cases and require the input to be well-formed XML; this would mainly be less code but should remove some indirections.

- Remove the node list puns, where window.frames === window and form elements are node lists of form elements.

- Remove upward navigation, i.e. .parentNode .offsetParent .nextSibling and so on. This would allow more capability-styled access control and remove some of the necessary logic for creating a well-defined document boundary.
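
As a rough illustration of the first item in the list above, the following sketch contrasts a static list with a "live" one. It uses a toy tree of plain objects rather than the DOM, and it is not how Caja implements NodeList; `node`, `collect`, `snapshotByTag` and `liveByTag` are hypothetical helpers.

```javascript
// Illustration only: a toy tree, not the DOM and not Caja's implementation.
function node(tag, children) {
  return { tag: tag, children: children || [] };
}

// Depth-first collection of descendants with the given tag.
function collect(root, tag, out) {
  root.children.forEach(function (child) {
    if (child.tag === tag) { out.push(child); }
    collect(child, tag, out);
  });
  return out;
}

// Static list: computed once, never changes afterwards.
function snapshotByTag(root, tag) {
  return collect(root, tag, []);
}

// "Live" list: every access re-walks the tree, so mutations show up.
function liveByTag(root, tag) {
  return {
    get length() { return collect(root, tag, []).length; },
    item: function (i) { return collect(root, tag, [])[i] || null; }
  };
}

var root = node('div', [node('p'), node('span')]);
var snapshot = snapshotByTag(root, 'p');
var live = liveByTag(root, 'p');

root.children.push(node('p')); // mutate the tree after both lists exist

console.log(snapshot.length); // still 1
console.log(live.length);     // now 2
```

The naive live list above pays for a full traversal on every access; a real implementation needs caching and invalidation, which hints at why live lists are costly to virtualize.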

David Bruant

Apr 29, 2013, 12:16:43 PM

to google-ca...@googlegroups.com, Kevin Reid

On 29/04/2013 16:11, Kevin Reid wrote:

> On Apr 29, 2013, at 6:19, David Bruant <brua...@gmail.com> wrote:
>
>> So I was wondering whether there is a good balance to be found that makes deployment easier while giving up some security guarantees that are necessary only in the face of intentionally malicious code.
>> For instance, what is the best that can be done without the live server-side rewriter? (Rewriting the widget once with a command-line tool and loading the rewritten version would be acceptable, for instance.)
> As Jasvir noted, Caja already has an almost pure client-side mode, ES5 mode (almost in that it still uses a server purely as a proxy to permit cross-origin/referrerless loading of resources).

I don't see how this server is provided in Jasvir's example. Or is it the
cajaServer parameter of caja.initialize?

Interestingly, referrerless loading will be possible without a proxy in the
future [1]. In lots of circumstances, leaking the referrer isn't a huge
deal anyway.
Cross-origin requests are more annoying, but probably workable either
through CORS or by hosting a copy of the widget on the server.

> Also, there is in fact a command-line tool (bin/cajole) which will work in nearly all of the cases where the server will.
>
>> What I am looking for is a solution that would be easier to deploy, with a smaller runtime footprint, but that would still mostly confine well-behaved standard code, giving it the impression that it lives on its own page while it only sits in a div. And, given the different possible solutions, how hard would it be to develop (I assume by stripping out the parts of Caja that wouldn't be necessary)?
> I believe it is nearly trivial to do this. All you need to do is load the code which is responsible for constructing virtual pages:
>
> - domado.js (DOM virtualization wrapper),
> - html-emitter.js (constructing HTML from parser events),
> - and their dependencies.

Aren't they already loaded in caja.js to make Jasvir's code snippet work?
Or maybe you meant the lighter Caja I'm asking for would just be the
parts that you just listed?

Do you have an idea of the order of magnitude of the size of all this
code once minified? 100k? 300k?

> The one wrinkle is that you would have to stub out the calls into the Caja runtime intended to make secured objects; again, not that complex.
>
> You still need, for example, the HTML and CSS schema/whitelist, because it contains information about parsing and the semantic categories of language constructs which is important both for interpreting input content and to get correct virtualization (as opposed to having parts of one guest leak into another).
>
> Overall, the result of this would be essentially identical to ES5 mode, except that nothing would be frozen and there would be none of SES's workarounds for browser bugs; both of these would likely be significant performance improvements (the former because JS implementations tend to be optimized for the common case of not-frozen objects, and the latter because reimplementing built-in algorithms tends to be slow).
>
>
> There is another possibility for lightening which you did not mention: removing the requirement for 'standard code'.

In general, I'd like to try to keep things according to standard, but
what you list below as resulting in a perf boost often turns out to be
either good practice for other reasons or a rarely used feature.

> A lot of the heavy weight of the DOM virtualization arises from implementing the exact specified DOM interface behavior, or behavior that code in the wild expects; in this hypothetical case we would instead be free to implement our own subsetted or modified API.
>
> Off the top of my head, these are some possibilities for making things faster, or at least simpler:
>
> - Get rid of NodeList objects which are 'live' (update in response to DOM mutations), which are a pain to implement and require several layers of special gimmicks.

Very few people know that NodeLists are live. Even fewer use this
property, so that's easy to give up for the vast majority of cases.

> - Remove the possibility of global (to a virtual document) variables; all executed scripts have clean lexical environments except for some explicit sort of export/import. (This would eliminate the need for a client-side (!) rewrite of JS and also simplify environment setup for eval.)

That's also good JavaScript practice in general.
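
The "explicit export/import" idea quoted above can be sketched roughly as follows. This is an assumption about what such a scheme might look like, not Caja's mechanism, and it is not a security boundary: `runGuest` is a hypothetical helper, and code compiled with the `Function` constructor can still read existing globals. What it does show is a script whose only intended inputs are named imports, whose output is its return value, and which cannot create a global by accident.

```javascript
// Illustration only: this is not a security boundary and not Caja's mechanism.
// The guest source is compiled as a strict-mode function body whose only
// intended inputs are the named imports, and whose output is its return value.
function runGuest(source, imports) {
  var names = Object.keys(imports);
  var values = names.map(function (n) { return imports[n]; });
  var body = '"use strict";\n' + source;
  var fn = Function.apply(null, names.concat([body]));
  return fn.apply(undefined, values);
}

var guestExports = runGuest(
  'var total = a + b; return { total: total };',
  { a: 2, b: 3 }
);
console.log(guestExports.total);

// Assigning to an undeclared name (an accidental global) is an error
// in strict mode, so it is reported instead of leaking.
var leaked;
try {
  runGuest('hypotheticalGuestCounter = 1; return {};', {});
  leaked = true;
} catch (e) {
  leaked = !(e && e.name === 'ReferenceError');
}
console.log(leaked);
```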

> - Remove all of the HTML parser special cases and require the input to be well-formed XML; this would mainly be less code but should remove some indirections.

Requiring well-formed XML is not generally considered good practice.
Unquoted attributes and elements that don't need to be closed make
authoring easier, and people use them extensively. However, requiring
well-formed HTML could work.

> - Remove the node list puns, where window.frames === window and form elements are node lists of form elements.

Not used that much in practice anyway.

> - Remove upward navigation, i.e. .parentNode .offsetParent .nextSibling and so on. This would allow more capability-styled access control and remove some of the necessary logic for creating a well-defined document boundary.

Forbidding that would probably break a lot of things, so that's fine to
keep.
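
For reference, the capability-style alternative described in the quoted item could look something like the sketch below. It uses toy nodes rather than the DOM, and `makeNode` and `downwardOnly` are hypothetical names, not part of Domado.

```javascript
// Illustration only: toy nodes, not Caja's Domado wrappers.
function makeNode(tag, children) {
  var n = { tag: tag, parentNode: null, children: children || [] };
  n.children.forEach(function (c) { c.parentNode = n; });
  return n;
}

// Wrap a node so that holders of the wrapper can only walk downward.
function downwardOnly(n) {
  return Object.freeze({
    tag: n.tag,
    childCount: function () { return n.children.length; },
    childAt: function (i) {
      var c = n.children[i];
      return c ? downwardOnly(c) : null;
    }
  });
}

var page = makeNode('body', [makeNode('div', [makeNode('p')])]);
var guestRoot = downwardOnly(page.children[0]); // the guest is handed only the div

console.log(guestRoot.childAt(0).tag);  // the guest can reach the p below it
console.log('parentNode' in guestRoot); // there is no route back up to body
```

With no upward references exposed, the document boundary falls out of which wrapper the guest was given, at the price of breaking code that relies on parentNode and similar properties.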

Thanks Kevin and Jasvir for your insightful responses!

David

[1] http://wiki.whatwg.org/wiki/Meta_referrer
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=704320
[3] https://bugs.webkit.org/show_bug.cgi?id=72674

Kevin Reid

Apr 29, 2013, 1:31:59 PM

to Google Caja Discuss

On Mon, Apr 29, 2013 at 9:16 AM, David Bruant <brua...@gmail.com> wrote:

> On 29/04/2013 16:11, Kevin Reid wrote:
>
>> As Jasvir noted, Caja already has an almost pure client-side mode, ES5 mode (almost in that it still uses a server purely as a proxy to permit cross-origin/referrerless loading of resources).
>
> I don't see how this server is provided in Jasvir's example. Or is it the cajaServer parameter of caja.initialize?

Yes.

>> I believe it is nearly trivial to do this. All you need to do is load the code which is responsible for constructing virtual pages:
>>
>> - domado.js (DOM virtualization wrapper),
>> - html-emitter.js (constructing HTML from parser events),
>> - and their dependencies.
>
> Aren't they already loaded in caja.js to make Jasvir's code snippet work? Or maybe you meant the lighter Caja I'm asking for would just be the parts that you just listed?

Yes, the lighter non-secure system would be just those parts. In fact, when I was originally developing domado.js (as a translation from a pre-ES5 design), I ran it as a standalone JS file until it was at the point where it was capable of interacting with the rest of Caja.

> Do you have an idea of the order of magnitude of the size of all this code once minified? 100k? 300k?

The minified code loaded into the secured frame in ES5 mode is 387kB. A rough cut of the part that would be needed for the above comes to 244kB. But code size is not really the performance bottleneck; execution time is.

>> There is another possibility for lightening which you did not mention: removing the requirement for 'standard code'.
>
> In general, I'd like to try to keep things according to standard, but what you list below as resulting in a perf boost often turns out to be either good practice for other reasons or a rarely used feature.

There's some history here: in the past, Caja tried to be more of its own platform and less of a perfect emulation of the browser, but we have found that that's not what users want. We've become paranoid about deviating from the de facto standards.

>> A lot of the heavy weight of the DOM virtualization arises from implementing the exact specified DOM interface behavior, or behavior that code in the wild expects; in this hypothetical case we would instead be free to implement our own subsetted or modified API.
>>
>> Off the top of my head, these are some possibilities for making things faster, or at least simpler:
>>
>> - Get rid of NodeList objects which are 'live' (update in response to DOM mutations), which are a pain to implement and require several layers of special gimmicks.
>
> Very few people know that NodeLists are live. Even fewer use this property, so that's easy to give up for the vast majority of cases.

Indeed, but in the one case where it isn't, it makes people's code break peculiarly.

>> - Remove the possibility of global (to a virtual document) variables; all executed scripts have clean lexical environments except for some explicit sort of export/import. (This would eliminate the need for a client-side (!) rewrite of JS and also simplify environment setup for eval.)
>
> That's also good JavaScript practice in general.

Perhaps so, but again, we have to deal with what people write, not what people should write.

>> - Remove all of the HTML parser special cases and require the input to be well-formed XML; this would mainly be less code but should remove some indirections.
>
> Requiring well-formed XML is not generally considered good practice. Unquoted attributes and elements that don't need to be closed make authoring easier, and people use them extensively. However, requiring well-formed HTML could work.

I don't mean avoiding erroneous cases: I mean such things as never omitting start or end tags, so that the tree-builder doesn't have to know which ones to insert implicitly. In general, more regular syntax. I recognize this is impractical: I'm just writing a wishlist here.
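
As a rough sketch of what "knowing which tags to insert implicitly" involves, the toy tree-builder below models a single rule from HTML: an open <li> is closed implicitly by the next <li> start tag or by its parent's end tag. It works on pre-tokenized input and is not Caja's parser; `buildTree` and the token shape are hypothetical.

```javascript
// Illustration only: a toy tree-builder over pre-tokenized input, not Caja's parser.
// The one implied-tag rule modelled here: an open <li> is closed implicitly
// by the next <li> start tag or by its parent's end tag.
function buildTree(tokens, allowImplied) {
  var root = { tag: '#root', children: [] };
  var stack = [root];
  function top() { return stack[stack.length - 1]; }

  tokens.forEach(function (t) {
    if (t.type === 'start') {
      if (allowImplied && t.tag === 'li' && top().tag === 'li') {
        stack.pop(); // implied </li>
      }
      var el = { tag: t.tag, children: [] };
      top().children.push(el);
      stack.push(el);
    } else {
      if (allowImplied && top().tag === 'li' && t.tag !== 'li') {
        stack.pop(); // implied </li>
      }
      if (top().tag !== t.tag) {
        throw new Error('unexpected </' + t.tag + '>');
      }
      stack.pop();
    }
  });
  return root;
}

// Tokens for: <ul><li><li></ul>
var sloppy = [
  { type: 'start', tag: 'ul' },
  { type: 'start', tag: 'li' },
  { type: 'start', tag: 'li' },
  { type: 'end', tag: 'ul' }
];

var lenient = buildTree(sloppy, true);
console.log(lenient.children[0].children.length); // two sibling <li> elements
```

Calling `buildTree(sloppy, false)` throws instead, which is the regular-syntax behavior the wishlist item asks for: every tag must be written out, so the builder needs no per-element knowledge.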

>> - Remove the node list puns, where window.frames === window and form elements are node lists of form elements.
>
> Not used that much in practice anyway.

The former we don't actually implement, but the latter is used more often than you might think: I believe we've seen code that expects document.forms[i] to be a form element and code that expects document.forms[i][j] to be an input element. You can't satisfy both of those without the pun.
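
The pun can be illustrated with plain objects standing in for DOM nodes. This is only a sketch of the shape such an object needs, not Domado's implementation; `makeInput`, `makeForm` and `doc` are hypothetical.

```javascript
// Illustration only: plain objects standing in for DOM nodes.
function makeInput(name) {
  return { tagName: 'INPUT', name: name };
}

// A "form" that is both an element (it has a tagName) and an indexable
// list of its own controls (length, [0], [1], ...).
function makeForm(controls) {
  var form = { tagName: 'FORM', length: controls.length };
  controls.forEach(function (control, i) { form[i] = control; });
  return form;
}

var doc = { forms: [makeForm([makeInput('user'), makeInput('pass')])] };

console.log(doc.forms[0].tagName); // code treating forms[i] as an element
console.log(doc.forms[0][1].name); // code treating forms[i] as a list of controls
```

Both access patterns land on the same object, so a virtualized form has to carry element properties and numeric indices at once.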

David Bruant

Apr 30, 2013, 12:59:27 PM

to google-ca...@googlegroups.com, Kevin Reid

Hi Kevin,

Thanks for your answers! More questions below :-)

On 29/04/2013 19:31, Kevin Reid wrote:

> On Mon, Apr 29, 2013 at 9:16 AM, David Bruant <brua...@gmail.com> wrote:
>
>> On 29/04/2013 16:11, Kevin Reid wrote:
>>
>>> As Jasvir noted, Caja already has an almost pure client-side mode, ES5 mode (almost in that it still uses a server purely as a proxy to permit cross-origin/referrerless loading of resources).
>>
>> I don't see how this server is provided in Jasvir's example. Or is it the cajaServer parameter of caja.initialize?
>
> Yes.

Is the server side of Caja documented somewhere? Specifically the endpoints, the parameters, the part that acts as a proxy for cross-origin requests, etc. I haven't found much about that yet.

>>> I believe it is nearly trivial to do this. All you need to do is load the code which is responsible for constructing virtual pages:
>>>
>>> - domado.js (DOM virtualization wrapper),
>>> - html-emitter.js (constructing HTML from parser events),
>>> - and their dependencies.
>>
>> Aren't they already loaded in caja.js to make Jasvir's code snippet work? Or maybe you meant the lighter Caja I'm asking for would just be the parts that you just listed?
>
> Yes, the lighter non-secure system would be just those parts. In fact, when I was originally developing domado.js (as a translation from a pre-ES5 design), I ran it as a standalone JS file until it was at the point where it was capable of interacting with the rest of Caja.

Are domado and html-emitter, and the APIs they expose, documented somewhere (at a higher level than the comments in the code)?

David

Kevin Reid

Apr 30, 2013, 1:15:32 PM

to David Bruant, Google Caja Discuss

On Tue, Apr 30, 2013 at 9:59 AM, David Bruant <brua...@gmail.com> wrote:

> Is the server side of Caja documented somewhere? Specifically the endpoints, the parameters, the part that acts as a proxy for cross-origin requests, etc. I haven't found much about that yet.

I don't know offhand whether there's documentation, but as far as I know we largely treat it as an implementation detail, especially now that we are switching away from using the server whenever possible. The interface for integrators is “it's a Java servlet which you need to provide to the client”.

> Are domado and html-emitter, and the APIs they expose, documented somewhere (at a higher level than the comments in the code)?

No, they are currently considered internal to the implementation, and are frequently revised to support the inter-component communication necessary for new features. If we were to decide to create a security-less Caja (which is not exactly high on our list of priorities) we would of course provide a well-defined API.
