Aura View and Auto-Escaping

51 views
Skip to first unread message

Paul M Jones

unread,
Feb 18, 2012, 3:52:46 PM2/18/12
to aur...@googlegroups.com
Hi everybody,

I read this from Jakob Vrana a few days ago ...

<http://php.vrana.cz/defense-against-xss-in-zend-framework.php>

... and it occurred to me that we could apply and automatic escaping system in Aura View, one that applies escaping when methods and properties are *accessed* (not when they are assigned). With that in mind, I have committed an "escaper" branch to Aura.View:

<https://github.com/auraphp/Aura.View/tree/escaper>

To try it out, go to your Aura.View clone and issue `git checkout escaper`.

With that branch, you don't need to issue $this->escape() on absolutely everything all the time. Instead, any time you access a variable assigned to a template, Aura View will escape it for you before handing it back. This applies to strings, array keys and values, object properties, method returns ... *and* it applies to all values returned by method returns. (It's done by using an escaper object decorator.)

I am absolutely certain there is a performance penalty for this approach; there is at least one additional method call per data access call. I think that having the automatic escaping is a nice tradeoff, given that we don't have to remember to call `$this->escape()` everywhere, *and* the lack of the `$this->escape()` calls should make the templates look cleaner.

The solution seems very thorough, but I want everyone to take a look and try to break it. Let me know what you think.

--

Paul M. Jones
http://paul-m-jones.com/

Hari K T

unread,
Feb 18, 2012, 7:28:48 PM2/18/12
to aur...@googlegroups.com
Hey,

Good work Paul .

I was thinking to notify you about this . 

I will try it out , with xss cheatsheet ;) .

Hari K T
M: +91-9388758821 | W: http://harikt.com/blog

Hari K T

unread,
Feb 18, 2012, 7:39:17 PM2/18/12
to aur...@googlegroups.com
A quick question is why the naming __raw ?

Paul M Jones

unread,
Feb 18, 2012, 7:43:23 PM2/18/12
to aur...@googlegroups.com

On Feb 18, 2012, at 18:39 , Hari K T wrote:

> A quick question is why the naming __raw ?

On the word "raw" itself: it's the difference between "escaped" data and "raw, unfiltered, possibly dangerous" data.

In the escaper object, the double-underscore prefix is to reduce the likelihood of conflicting with an underlying object method.

In the template object, we reuse __raw() as the name so you don't have to remember a different method name.

Does that make sense?

Hari K T

unread,
Feb 18, 2012, 8:36:43 PM2/18/12
to aur...@googlegroups.com
Yes that make sense .

Another question for you .

I have a layout , and in layout.php 

I have <?php echo $this->inner_view; ?></div>

so the corresponding view I set from action is 

$this->view->setInnerView('list');

And list.php contains some html , now it escapes automatically even if I am not having any data set for view , is it a good way ?


Qn2 : 

I am setting data from action as 

$this->view->setInnerData([])

Is there another way to do it ? I am getting error in view as 

Notice: Undefined index: datas in  Path .

In our framework we have a TwoStep view object , so we will not be able to set the data as described in Aura.View . So my concern is people who uses system needs another documentation ? And if so I don't feel its good .

Jakub Vrana

unread,
Mar 7, 2012, 3:35:53 AM3/7/12
to The Aura Project for PHP, Jakub Vrana
Interesting approach. The current implementation suffers both from
double escaping some data and not escaping some other data.

This is double escaped:

$escaper['b'] = 'x';
$escaper['b'] .= ' > ';
$escaper['b'] .= '2';
echo $escaper['b'];

This is not escaped at all:

$escaper->c = 'x';
$escaper->c .= ' > ';
$escaper->c .= '2';
echo $escaper->c;

The reason is that __set() is not defined.

I think that a reasonable solution is to make the object immutable -
e.g. throw an exception in __set() and offsetSet().

Jakub Vrana

Paul M. Jones

unread,
Apr 17, 2012, 5:12:00 PM4/17/12
to aur...@googlegroups.com, Jakub Vrana
Hi all,

Following up on this after, what, six weeks?

I think making it immutable is the right thing, too. Do we want an exception proper, or would E_NOTICE be sufficient?


-- pmj

Hari K T

unread,
Apr 17, 2012, 11:53:10 PM4/17/12
to aur...@googlegroups.com
Thank you Jakub Vrana for taking your timing looking into this. I did see the mail , but I was not sure what to say.

Thank you @Paul for taking time on this. What do you think ? I feel Exception is good or may be notice looking the environment ? ( prod , devl , testing ? )

Paul M. Jones

unread,
Apr 18, 2012, 1:32:22 PM4/18/12
to aur...@googlegroups.com, Jakub Vrana

On Mar 7, 2012, at 2:35 AM, Jakub Vrana wrote:

I've worked through this, and making it immutable makes it difficult/impossible to do things like:

$template->var = 'this & that';

You could still access that functionality through setData() and addData(), but then you have the same problem as described above.

Instead of making it immutable, perhaps the escaper should do two things:

1. Implement __set() as Jakub notes above, but instead of throwing ImmutableException, it works the same as offsetSet(), so that we can add properties to the object and have them escaped properly.

2. The __escape() method should set the $double_encode param on htmlspecialchars() to false. That way we can do operations on the properties (like the concatenation above) and not worry about double-escaping.

Thoughts?


-- pmj

Robert Gonzalez

unread,
Apr 18, 2012, 2:02:43 PM4/18/12
to aur...@googlegroups.com
I know I'm late to the game on this discussion, but when you look at what is being proposed here, does it really make sense to do this given the context of the app? I mean, why should the view automatically escape data? Yes, I believe there is a facet of the view that should offer that functionality, but to make it an automatic process seems... not right. 

Honestly, what is so hard, as a programmer, about saying:
<?php echo $this->escape($this->var) ?> 

as opposed to:
<?php echo $this->var ?>

The programmer is programming the app, right? The programmer is setting the view variables, right? The programmer is (potentially) crafting the templates, right? Why can't the folks that are programming, writing methods, setting variables and manipulating data ALL DAY LONG just add one more call? I'm asking this from the perspective of someone that has set view variables into the view object at one point, then accessed them for another reason at another point prior to rendering (or even after rendering). What I set, I should get as-is unless I specify a manipulation routine, which is easy enough to do with a process very similar to $var = $view->someManipulationMethod($view->var). That method *could* be an escape, could be a mapped method, could be an overloaded method... could be anything. At the same time, I should be able to do something like $var = $view->var and know with certainty that if I set '$<45&*>' into that variable that I am going to get that same thing out.

Or am I just being a confrontational ass on this matter? :)
--

Robert Gonzalez
   

Paul M. Jones

unread,
Apr 18, 2012, 4:00:36 PM4/18/12
to aur...@googlegroups.com

On Apr 18, 2012, at 1:02 PM, Robert Gonzalez wrote:

> I know I'm late to the game on this discussion, but when you look at what is being proposed here, does it really make sense to do this given the context of the app? I mean, why should the view automatically escape data? Yes, I believe there is a facet of the view that should offer that functionality, but to make it an automatic process seems... not right.

This is a fair assessment. Honestly I only recently came to the idea that the data for output should be escaped automatically, after holding exactly your opinion for almost a decade.

(For reference, the codebase for the auto-escaping is at <https://github.com/auraphp/Aura.View/tree/escaper>.)

The short version of my change of heart is this: I had a good friend and skilled developer come to me and say that auditing Solar/Savant/Zend/PHP-based templates was very difficult, because he had to see if escape() was called on each individual output call. He had to check every case where values weren't escaped to see if it was accidental or intentional, over hundreds of template scripts, and track back through the codebase to see what the data was supposed to be. Most of the time the lack of escaping was intentional, but *occasionally* it was accidental, and it was tedious to decipher that to find potential security holes.

With that in mind, I realized that it's a real hassle to do that kind of thing. High-end enterprisey folks really do need to do audits of that sort.

Now, *most* of the time, output should be escaped. I'd say it's at least 80%, and probably closer to 98%. That is, only 2 out of 100 times you send output (or examine a data value) will you need it to be the raw unescaped data (such as user-entered HTML that has been otherwise sanitized). As such, escaping automatically removes the manual tedium for a major part of development work. You still have access to the raw values, but it becomes easy to grep the templates for calls to `__raw`. That makes it the default case that outputs are escaped, and makes it obvious that using unescaped values is intentional. Voila: audits are easier, and you have to *ask* for unescaped values. By default you are more-safe from accidental XSS vulnerabilities.

This is not to say that it's perfect; things like `count()` and `instanceof` are not going to work right in templates, because of the way the automatic escaping works. For things like that you need to ask for the raw original data, not the escaped object, but those things are very few and (I believe) infrequent or easily worked around. The automatic escaping is also guaranteed to be slower to execute than calling $this->escape() on things manually; however, that in turn is slower than calling htmlspecialchars() on everything too, so it's a not a question of absolute speed, but of whether you think the performance tradeoff is worth it.


> Honestly, what is so hard, as a programmer, about saying:
> <?php echo $this->escape($this->var) ?>
>
> as opposed to:
> <?php echo $this->var ?>

You're right; there's nothing hard about it, aside from being a bit tedious (but necessary). But the proper comparison under PHP 5.4 and the automatic escaping system is this:

<?php echo $this->escape($this->var); ?>

vs:

<?= $this->var; ?>

(The echo-tag is enabled in 5.4 along with long-form tags.) That second form looks like a win to me.

Now, we really see a difference when we compare this:

<?php echo $this->html; ?>

vs:

<?= $this->__raw()->html; ?>

Is the first one *intended* to be escaped, or was it accidentally left unescaped? Hard to tell. But in the second one, it's immediately clear what's going on: the developer wants the raw, unsafe, dangerous, be-very-careful-with-this unescaped value.


> Or am I just being a confrontational ass on this matter? :)

Not at all. These are valid and reasonable questions. Like I said, I only came to this approach after years, maybe a decade, of being a "you should manually escape in your PHP templates" guy.

Please let me know if that helps explain my thinking, even if you still don't agree. :-)


-- pmj

Jakub Vrana

unread,
Apr 18, 2012, 4:57:00 PM4/18/12
to Paul M. Jones, aur...@googlegroups.com
Disabling $double_encode is evil. I could want to say:

"You can write & in HTML as &amp;."

Applying htmlspecialchars() with disabled $double_encode produces this:

"You can write &amp; in HTML as &amp;."

Which will render in browser as:

"You can write & in HTML as &."

--
Jakub Vrana


-----Original Message-----
From: Paul M. Jones [mailto:pmj...@paul-m-jones.com]
Sent: Wednesday, April 18, 2012 10:30 AM
To: aur...@googlegroups.com
Cc: Jakub Vrana
Subject: Re: [auraphp] Aura View and Auto-Escaping


On Mar 7, 2012, at 2:35 AM, Jakub Vrana wrote:

I've worked through this, and making it immutable makes it

Paul M. Jones

unread,
Apr 18, 2012, 5:48:53 PM4/18/12
to Jakub Vrana, aur...@googlegroups.com

On Apr 18, 2012, at 3:57 PM, Jakub Vrana wrote:

> Disabling $double_encode is evil. I could want to say:
>
> "You can write & in HTML as &amp;."
>
> Applying htmlspecialchars() with disabled $double_encode produces this:
>
> "You can write &amp; in HTML as &amp;."
>
> Which will render in browser as:
>
> "You can write & in HTML as &."

Good call Jakub.

While I'm concerned about *not* escaping things, I'm less concerned about property manipulations within the template that end up with double-escaping. So, per a modified version of the example you provided earlier, where no escaping is applied at all ...

$this->c = 'x';
$this->c .= ' > ';
$this->c .= '2';
echo $this->c; // "x > 2"

... is clearly one that has to be addressed immediately, at the very least by defining __set(). (The $this represents the idea that this is happening inside a template script.) That's a no-brainer; I've done it locally by making __set() mimic offsetSet().

However, I feel some resistance to preventing double-escaping by making the object immutable, especially after having tried it locally. It's too easy to get around, and if you prevent all the get-arounds, it becomes very difficult to use in practice.

A modified form of the example you gave earlier, where double-escaping occurs ...

$this->b = 'x';
$this->b .= ' > ';
$this->b .= '2';
echo $this->b; // "x &amp;gt; 2"

... seems less troublesome when you realize that it's actually operating as designed. The revised version of that would be:

$this->b = 'x';
$this->__raw()->b .= ' > ';
$this->__raw()->b .= '2';
echo $this->b; // "x &gt; 2"

...or even:

$this->b = 'x';
echo $this->b . $this->escape(' > 2');

I don't know how often other people do these kinds of manipulations within template scripts. For myself, they are quite rare, but I'm only a sample size of one.

It seems like the kind of thing we can address in the documentation quite easily. It's annoying, but it's not a security violation; I think the tradeoffs here are are reasonable.

Other thoughts?


-- pmj

Jakub Vrana

unread,
Apr 18, 2012, 6:48:25 PM4/18/12
to Paul M. Jones, aur...@googlegroups.com
Your class should be able to handle plaintext. So requiring "You can write
&amp; in HTML as &amp;amp;." is obviously wrong because it's not plaintext.
Inputs to your class can come from all kinds of different sources.

--
Jakub Vrana


-----Original Message-----
From: Paul M. Jones [mailto:pmj...@paul-m-jones.com]
Sent: Wednesday, April 18, 2012 2:10 PM
To: Jakub Vrana
Cc: aur...@googlegroups.com
Subject: Re: [auraphp] Aura View and Auto-Escaping

On Apr 18, 2012, at 3:57 PM, Jakub Vrana wrote:

> Disabling $double_encode is evil. I could want to say:
>
> "You can write & in HTML as &amp;."

I had intuited as much, but I am at a loss to come up with actual usage
where it would be unsafe.

Your example shows an unexpected result, but I think it's from writing it
incorrectly in the first place. It seems more-correct to write it as "You
can write & in HTML as &amp;amp;." That would get translated to "You can
write &amp; in HTML as &amp;amp;." via htmlspecialchars, and would render in
the browser as "You can write & in HTML as &amp;."

Are there other examples where it is either unsafe or produces something
dramatically unexpected?


-- pmj

Paul M. Jones

unread,
Apr 18, 2012, 7:02:16 PM4/18/12
to Jakub Vrana, aur...@googlegroups.com

On Apr 18, 2012, at 5:48 PM, Jakub Vrana wrote:

> Your class should be able to handle plaintext. So requiring "You can write
> &amp; in HTML as &amp;amp;." is obviously wrong because it's not plaintext.
> Inputs to your class can come from all kinds of different sources.

I agree; I actually deleted this message and re-sent a different one. ;-)


-- pmj

Hari K T

unread,
Apr 19, 2012, 12:13:49 AM4/19/12
to aur...@googlegroups.com
@Paul ,

I asked you a doubt regarding whether its a good idea to move Escaper as a separate package . So if form component is coming that will help .

But at the time you mentioned Form view will be moved to Framework package .

But do you still think Escaper should be with Aura View or separate package ?

Paul M. Jones

unread,
Apr 19, 2012, 12:30:49 AM4/19/12
to aur...@googlegroups.com

On Apr 18, 2012, at 11:13 PM, Hari K T wrote:

> But do you still think Escaper should be with Aura View or separate package ?

For now I think it's still good to keep it with the View package. I'd prefer to make sure the View package doesn't have a dependency on anything else at this point.


-- pmj

pmjones

unread,
Apr 25, 2012, 7:09:10 PM4/25/12
to aur...@googlegroups.com

Hi all,


I'm at the point where I'd like to merge the `escaper` branch to `master`, thus making auto-escaping the default. There will need to be one additional change in Aura.Framework, where the inner view is set into the outer view via `__raw()` (so that it doesn't get double-escaped) but aside from that it appears to have no ill effects.


Comments? Critique?



-- pmj

jon.e...@gmail.com

unread,
Apr 25, 2012, 7:12:02 PM4/25/12
to aur...@googlegroups.com
Sounds good to me.

Jon

Hari K T

unread,
Apr 25, 2012, 8:14:25 PM4/25/12
to aur...@googlegroups.com
Yes , +1 from my part .

A question / suggestion though :-) .

Sometimes the user may not be able to understand whether its escaped or double escaped . So I feel we should add a wrapper of a <div class="something"> for the contents which is escaped and <div class="something2"> . So there is a different style for these ( red border , blue border , anyway different colors ) .

This makes life easier when we make the flag to debug something like that ? So the user knows which all are escaped and double escaped ?

Hari K T

unread,
Apr 25, 2012, 8:21:38 PM4/25/12
to aur...@googlegroups.com
One more suggestion is , I feel its a good idea to keep the new EscaperFactory, towards the end.

$template = new Template(
    new EscaperFactory,
    new TemplateFinder,
    new HelperLocator
);
something like 

$template = new Template(
    new TemplateFinder,
    new HelperLocator,
  new EscaperFactory
);
Why I felt so is . At any time , if we want to revert back , we can make the new Escaperfactory instance to null or '' in the construct ?

Hari K T

unread,
Apr 26, 2012, 3:23:06 AM4/26/12
to aur...@googlegroups.com
Hi ,

I feel there is another problem .

From the layout we are using

$this->__raw()->inner_view;

This means we are not actually getting the raw data. We are getting the first escaped data.

If we want to get real data from outer view , we should assign the $this->__raw() data as inner data from setOuterView .

In this way we get it escaped properly .
Reply all
Reply to author
Forward
0 new messages