XSS protection for mustache

Mike Samuel

Oct 18, 2011, 4:01:52 PM

to mustache.java

I work on improving tools and languages to make it easier to write
secure and robust programs. Mustache seems to be an elegant,
well-thought-out template system, and I was wondering whether people
would be interested in improving its default XSS protection.

https://github.com/mikesamuel/html-contextual-autoescaper-java
provides a writer-like object that provides two methods:

writeSafe(String)
write(Object o)

so that the sequence of calls

w.writeSafe("<b>");
w.write("I <3 Ponies!");
w.writeSafe("</b>\n<button onclick=foo(");
w.write(ImmutableMap.<String, Object>of("foo", "bar", "\"baz\"", 42));
w.writeSafe(")>");

results in the output

<b>I &lt;3 Ponies!</b>
<button onclick="foo({&#34;foo&#34;:&#34;bar&#34;,&#34;\x22baz\x22&#34;:42})">

The safe parts are treated as literal chunks of HTML/CSS/JS, and the
unsafe parts are escaped to preserve security and least-surprise.
Note that when a map value is interpolated in a JavaScript value
context, it is converted to a JSON-like form.

I would like to try to modify mustache.java so that, when it is used to
generate HTML, writes of static template content are routed to
writeSafe(), and value interpolations are routed to write(Object).

Does this sound like a worthwhile option for mustache users? If so,
how should I go about putting together a patch?

Sam Pullara

Oct 18, 2011, 7:31:05 PM

to mustac...@googlegroups.com

Using the current built-in protection provided by using {{ }} (unsafe content) instead of {{{ }}} (safe content) you get this:

<b>{{message}}</b>
<button onclick="foo({{object}})"></button>

With a backing object of:

new Scope(new Object() {
  String message = "I <3 Ponies!";
  String object = json.toString();
});

Results in:

Is that insufficient for your needs?

Sam

Mike Samuel

Oct 25, 2011, 5:18:09 PM

to mustache.java

On Oct 18, 7:31 pm, Sam Pullara <spull...@gmail.com> wrote:
> Is that insufficient for your needs?
>
> Sam

Thanks for the quick response. No. I'm not trying to use
mustache.java for a project. I'm a security researcher and I'm trying
to make it easier to write applications robust against XSS by tweaking
template languages.

I'd like to try my "tweaks" on Mustache before taking on widely used,
messy languages like JSP, because Mustache has a clean, simple
interface, is used by people who care about performance, and already
does naive auto-escaping.

Perhaps my example was poorly chosen. I set up http://java-html-escaper.appspot.com/
as a testbed for other security researchers to criticize, but if you
take a look it may help you get an idea of what I'm trying to do.
The example input shows a template combined with a simple input. The
input is escaped differently depending on context, so the input
"<Cincinatti>" becomes
&lt;Cincinatti&gt; when used in HTML text content
\x3cCincinatti\x3e when used inside a JavaScript string value
%3cCincinatti%3e when used inside a URL query string
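
To make the three contexts concrete, here is a minimal Java sketch of one escaper per context. The class and method names are hypothetical and are not taken from the autoescaper library or from mustache.java; a real contextual escaper handles many more contexts and edge cases than this.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of mustache.java or the autoescaper. */
final class ContextEscapers {
    private ContextEscapers() {}

    /** HTML text content: replace markup-significant characters with entities. */
    static String escapeHtmlText(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Body of a JavaScript string literal: hex-escape quotes, markup characters and controls. */
    static String escapeJsString(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                out.append("\\\\");
            } else if (c < 0x20 || "<>&\"'".indexOf(c) >= 0) {
                out.append("\\x").append(hex(c >> 4)).append(hex(c & 0xf));
            } else if (c == 0x2028 || c == 0x2029) {
                // Line/paragraph separators can terminate a JS string literal.
                out.append("\\u").append(Integer.toHexString(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** URL query component: percent-encode every UTF-8 byte that is not an unreserved character. */
    static String escapeUrlQuery(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            boolean unreserved = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '-' || v == '_' || v == '.' || v == '~';
            if (unreserved) {
                out.append((char) v);
            } else {
                out.append('%').append(hex(v >> 4)).append(hex(v & 0xf));
            }
        }
        return out.toString();
    }

    /** Lowercase hex digit for a value in 0..15. */
    private static char hex(int nibble) {
        return Character.forDigit(nibble, 16);
    }

    public static void main(String[] args) {
        String input = "<Cincinatti>";
        System.out.println(escapeHtmlText(input));
        System.out.println(escapeJsString(input));
        System.out.println(escapeUrlQuery(input));
    }
}
```

The same input yields three different outputs because each context has its own set of dangerous characters and its own escape syntax.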

The autoescaper also normalizes the HTML structure.
http://java-html-escaper.appspot.com/?src=I+%3C3+%3C%21--+comment+--%3E%3Cspan+style%3Dcolor%3A{{.}}+onclick%3Dalert%28%2F*secret*%2F42%29%3E{{.}}%3C%2Fspan%3E+Ponyz%21%21%21&inp=%22red%22
demonstrates that

I <3 <!-- comment --><span style=color:{{.}} onclick=alert(/*secret*/42)>{{.}}</span> Ponyz!!!

post render yields an HTML string with text normalized, quotes added,
and comments elided:

I &lt;3 <span style="color:red" onclick="alert( 42)">red</span> Ponyz!!!

cheers,
mike

Sam Pullara

Oct 25, 2011, 5:22:40 PM

to mustac...@googlegroups.com

Right now mustache.java is context free and it is expected that the user of the library uses it properly such that it won't introduce XSS problems using the current encoding mechanism. I admit that it is possible to use it incorrectly and get XSS bugs though. It may be possible to add extensions to the mustache language to specify the context and get different encodings but that wouldn't work without developer intervention. How are you proposing to determine things like:

> "<Cincinatti>" becomes
> &lt;Cincinatti&gt; when used in HTML text content
> \x3cCincinatti\x3e when used inside a JavaScript string value
> %3cCincinatti%3e when used inside a URL query string

During compile time? If your library could sit in the mustache builder and determine those cases that would be really cool.

Sam

Mike Samuel

Oct 27, 2011, 12:13:40 PM

to mustache.java

On Oct 25, 5:22 pm, Sam Pullara <spull...@gmail.com> wrote:
> Right now mustache.java is context free and it is expected that the user of the library uses it properly such that it won't introduce XSS problems using the current encoding mechanism. I admit that it is possible to use it incorrectly and get XSS bugs though. It may be possible to add extensions to the mustache language to specify the context and get different encodings but that wouldn't work without developer intervention. How are you proposing to determine things like:
>
> > "<Cincinatti>" becomes
> > &lt;Cincinatti&gt; when used in HTML text content
> > \x3cCincinatti\x3e when used inside a JavaScript string value
> > %3cCincinatti%3e when used inside a URL query string
>
> During compile time? If your library could sit in the mustache builder and determine those cases that would be really cool.

I did a static analysis version for Closure Templates (since writing
efficient parsers in JavaScript is hard) and for Go.

http://code.google.com/closure/templates/docs/security.html

Possibly
http://code.google.com/p/closure-templates/source/browse/trunk/java/src/com/google/template/soy/parsepasses/contextautoesc/InferenceEngine.java
could be adapted to work statically within mustache.java. It does
rely on the ability to statically identify template call endpoints
though.

Sam

Oct 28, 2011, 12:12:34 PM

to mustache.java

I would love to integrate this into the system, at the very least as
an extension module. I can add a plugin API to the parser such that
you will get a callback for each character and then when I need to
encode I could call your system with the currently active encoding
context. Have you made an attempt to integrate it into the parser? You
can just fork the project on Github and I can help you do the work.
Critically for performance reasons, the integration must be done at
template compile time inside the MustacheBuilder's parser.

Though I believe that you can safely use the default encoding, I would
love for the template engine to be resistant even to misuse.

Sam


Mike Samuel

Oct 28, 2011, 2:11:36 PM

to mustache.java

On Oct 28, 12:12 pm, Sam <spull...@gmail.com> wrote:
> I would love to integrate this into the system, at the very least as
> an extension module. I can add a plugin API to the parser such that
> you will get a callback for each character and then when I need to
> encode I could call your system with the currently active encoding
> context. Have you made an attempt to integrate it into the parser? You
> can just fork the project on Github and I can help you do the work.
> Critically for performance reasons, the integration must be done at
> template compile time inside the MustacheBuilder's parser.

If you want a static approach, I would need to be able to get a
callback for each string of static text and a begin/end call for
sections.

For example, given the following template (and ignore whitespace)

Hello
{{#worlds}}
, {{world}}
{{/worlds}}!

I would like to see the following series of callbacks

safeText "Hello"
startSection
safeText ", "
interpolation
endSection
safeText "!"
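
As a rough illustration of that callback sequence, here is a minimal Java sketch. The `TemplateEvents` interface and the `TemplateScanner` class are hypothetical names invented for this example; mustache.java does not expose them, and the toy scanner ignores triple mustaches, partials, comments and delimiter changes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy scanner that reports static text, sections and interpolations as events. */
final class TemplateScanner {
    private TemplateScanner() {}

    /** Walks the template once and fires one callback per static chunk or tag. */
    static void scan(String template, TemplateEvents events) {
        int pos = 0;
        while (pos < template.length()) {
            int open = template.indexOf("{{", pos);
            if (open < 0) {
                events.safeText(template.substring(pos)); // trailing static text
                break;
            }
            if (open > pos) {
                events.safeText(template.substring(pos, open));
            }
            int close = template.indexOf("}}", open + 2);
            if (close < 0) {
                throw new IllegalArgumentException("Unclosed tag at offset " + open);
            }
            String tag = template.substring(open + 2, close).trim();
            if (tag.startsWith("#")) {
                events.startSection(tag.substring(1));
            } else if (tag.startsWith("/")) {
                events.endSection(tag.substring(1));
            } else {
                events.interpolation(tag);
            }
            pos = close + 2;
        }
    }

    /** Records the events for a template as readable strings. */
    static List<String> trace(String template) {
        final List<String> log = new ArrayList<>();
        scan(template, new TemplateEvents() {
            @Override public void safeText(String text) { log.add("safeText \"" + text + "\""); }
            @Override public void startSection(String name) { log.add("startSection " + name); }
            @Override public void interpolation(String name) { log.add("interpolation " + name); }
            @Override public void endSection(String name) { log.add("endSection " + name); }
        });
        return log;
    }

    public static void main(String[] args) {
        // The template from the message above, with whitespace removed.
        for (String event : trace("Hello{{#worlds}}, {{world}}{{/worlds}}!")) {
            System.out.println(event);
        }
    }
}

/** Hypothetical callback interface (an assumption, not an existing mustache.java type). */
interface TemplateEvents {
    void safeText(String text);
    void startSection(String name);
    void interpolation(String name);
    void endSection(String name);
}
```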

When interpolation happens, I need to be able to do one of:

1. Specify a function that transforms the interpolated value to chars
   to output
2. Specify a function that takes an interpolated value and a
   Writer/Appendable to do the writing
3. Specify a function that takes a Writer/Appendable and wraps it so
   that the resulting wrapper is used to write the interpolated value.

Option 2 would be ideal since I get access to the raw value, and it
leaves space for optimizing out buffer copies.
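
A minimal sketch of what option 2 could look like in Java follows. `InterpolationHandler` and `HtmlTextHandler` are hypothetical names, not types from mustache.java; the handler here only knows the HTML text context, whereas a contextual escaper would pick the escaping from the surrounding template.

```java
import java.io.IOException;

/** Demo driver: renders one value through the handler into a StringBuilder. */
final class InterpolationDemo {
    private InterpolationDemo() {}

    static String render(Object value) {
        StringBuilder out = new StringBuilder();
        try {
            new HtmlTextHandler().write(value, out);
        } catch (IOException e) {
            // StringBuilder never throws, but Appendable declares IOException.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("I <3 Ponies!"));
        System.out.println(render(42));
    }
}

/** Hypothetical shape of option 2: the raw value plus the destination. */
interface InterpolationHandler {
    void write(Object value, Appendable out) throws IOException;
}

/** Escapes for HTML text content, writing straight to the destination. */
final class HtmlTextHandler implements InterpolationHandler {
    @Override
    public void write(Object value, Appendable out) throws IOException {
        // Access to the raw value allows type-specific shortcuts:
        // integers contain no markup characters, so no scan is needed.
        if (value instanceof Integer || value instanceof Long) {
            out.append(value.toString());
            return;
        }
        String s = String.valueOf(value);
        // Characters go directly to the Appendable, with no intermediate buffer.
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
    }
}
```

Compared with option 1, nothing forces the handler to build a temporary String, and compared with option 3 it still sees the original object rather than an already stringified form.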

Ideally, when I see safeText, I would be able to substitute safe text.
For example, I would like to be able to normalize

safeText "<a href=http://example.com/"

to

safeText "<a href=\"http://example.com/"

A surprising number of attack vectors are closed when I can quote
unquoted attributes whose values include sections or interpolation
boundaries, and similarly if I can normalize '<' immediately before an
interpolation or section boundary.
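
The href normalization above could be sketched roughly like this in Java. `AttributeQuoter` is a hypothetical name, and the regex is a deliberate simplification: it only inserts the opening quote when a chunk of safe text ends inside an unquoted attribute value, it does not add the matching closing quote, and it does not track whether it is already inside a quoted attribute, which a real context-tracking parser would have to do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of mustache.java or the autoescaper. */
final class AttributeQuoter {
    private AttributeQuoter() {}

    // An open tag, whitespace, an attribute name, '=', then an unquoted
    // (possibly empty) value that runs to the very end of the chunk.
    private static final Pattern OPEN_UNQUOTED_ATTR = Pattern.compile(
            "<[A-Za-z][^<>]*\\s[^\\s\"'<>=/]+=([^\\s\"'<>]*)\\z");

    /** Inserts an opening double quote if the chunk ends in an unquoted attribute value. */
    static String quoteTrailingAttribute(String safeText) {
        Matcher m = OPEN_UNQUOTED_ATTR.matcher(safeText);
        if (!m.find()) {
            return safeText; // not inside an unquoted attribute value
        }
        int valueStart = m.start(1);
        return safeText.substring(0, valueStart) + '"' + safeText.substring(valueStart);
    }

    public static void main(String[] args) {
        System.out.println(quoteTrailingAttribute("<a href=http://example.com/"));
    }
}
```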

> Though I believe that you can safely use the default encoding, I would
> love for the template engine to be resistant even to misuse.

The default encoding is plain text -> HTML text with entities?

I would also ideally get an event when a {{{expr}}} section is seen
that distinguishes it from a normal {{...}} section.

> Sam
