John, maybe we should get together sometime and talk through how to
merge your work with mine. The addition of an Output object has made
many things clearer but I'm sure there are going to be some gotchas in
there. In fact, the more I work inside the core methods of Erector,
the more I feel like rewriting them from scratch...
---
Alex Chaffee - al...@cohuman.com - http://alexch.github.com
Stalk me: http://friendfeed.com/alexch | http://twitter.com/alexch |
http://alexch.tumblr.com
On Tue, Dec 22, 2009 at 1:34 PM, John Firebaugh
<john.fi...@gmail.com> wrote:
> It's not a ruby interpreter warning I'm talking about, it's an
> admonition in the documentation for the method -- warning you, in
> effect, to watch out for the fact that it is not idempotent.
>
> The way idempotency is accomplished is by tagging the result with a
> special String subclass (RawString) and then checking for the presence
> of that tag (albeit a bit indirectly, in order to be compatible with
> rails and other libraries with a similar concept).
>
> It is still quite possible to output 2<4, 2&lt;4 or 2&amp;lt;4. Choose between:
>
> rawtext "2<4" # => "2<4"
> text! "2<4" # => "2<4"
> text "2<4" # => "2&lt;4"
> text "2&lt;4" # => "2&amp;lt;4"
> text CGI.html_escape("2<4") # => "2&amp;lt;4"
>
> Patch:
>
> http://github.com/bigfix/erector/commit/838ed3c38f8959dd24676752c26eb51851739aef
>
>
> On Tue, Dec 22, 2009 at 1:15 PM, Andy Peterson <an...@carbonfive.com> wrote:
>> Since you're asking,
>> I would prefer to work on muting the warning rather than having the h
>> routine not do what it is told.
>>
>> My reasoning is that trying to make an encoding method idempotent doesn't
>> really seem feasible. Without some additional metadata, it doesn't know
>> whether the text it was given was encoded yet... then it starts making
>> heuristic guesses which are harder for programmers to reason about and use.
>>
>> Sometimes I want 2<4, sometimes 2&lt;4, and sometimes even 2&amp;lt;4.
>>
>> Andy
>>
>> On Tue, Dec 22, 2009 at 10:36 AM, John Firebaugh <john.fi...@gmail.com>
>> wrote:
>>>
>>> > It's on the "output" branch but I'd like to get some resolution on the
>>> > Rails issues from the other thread so we can get this stuff onto the
>>> > main branch.
>>>
>>> Working on it.
>>>
>>> Would anyone object to changing Widget#h to be idempotent, i.e.
>>> h(text) returns raw(text.html_escape), and h(h(text)) doesn't
>>> double-escape? It would make integrating with rails output safety
>>> easier, would match the behavior of ERB::Util.h in rails 3.0 and
>>> 2.3+rails_xss plugin, and we could remove this warning:
>>>
>>> # Note that the #text method automatically HTML-escapes
>>> # its parameter, so be careful *not* to do something like
>>> # text(h("2<4"))
>>> # since that will double-escape the less-than sign (you'll get
>>> # "2&amp;lt;4" instead of "2&lt;4").
>>>
>>
>>
>>
>> --
>> Andy Peterson | Carbon Five | 415.546.0500 x17 |
>> mailto:an...@carbonfive.com
>>
>
I'm going to get slightly philosophical but having "h" return a RawString
strikes me as the most obvious thing (subject to the musings below about
what "h" is for, anyway).
I think of what the "text" method does not as escaping but as rendering a
string into html. The string and the rendered html live in different kinds
of objects (strings versus RawStrings in this context; strings versus DOM
nodes or the like in other APIs; strings versus Widget objects in some
erector contexts like examples/join.rb). If all the code were consistently
written in this style, "raw" would never be needed (it is just a way of
talking with code which doesn't operate in this fashion). For the same
reason, I don't have a use case for the "h" method, which made me ponder
deleting "h" rather than changing it the last time this subject came up.
The alternate mindset of taking a string and turning it into an escaped
string causes a world of hurt, in my experience. In my suggested model,
the concept of "re-escaping" becomes nonsensical - quoting is something
which happens to strings, not to rendered html, so there would be no
reason to feed html into something which expects a string.
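
To make the model concrete, here is a minimal sketch of the tagging idea, not Erector's actual code. The RawString class, the h method and the text method below are simplified stand-ins written for this message; in particular the real check is described in this thread as more indirect, for compatibility with rails.

```ruby
require "cgi"

# Hypothetical stand-in for Erector's RawString: a String subclass used
# purely as a tag meaning "this is already rendered html".
class RawString < String
end

# Idempotent escape (assumed behaviour, per the proposal in this thread):
# plain strings are escaped and tagged, tagged strings pass through.
def h(value)
  return value if value.is_a?(RawString)
  RawString.new(CGI.escapeHTML(value.to_s))
end

# Simplified stand-in for Widget#text: render a value into an output buffer.
def text(output, value)
  output << h(value)
end

out = String.new
text(out, "2<4")        # plain string: escaped once
text(out, " ")
text(out, h("2<4"))     # already tagged: not escaped again
text(out, " ")
text(out, h(h("2<4")))  # h is idempotent
puts out                # prints 2&lt;4 2&lt;4 2&lt;4
```

Because the tag travels with the object, there is no guessing about whether a given string was already escaped.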
I believe there are protocols which call for generating "2&amp;lt;3" (some
of the variants of RSS spring to mind), but I regard this as fairly
obscure (and generally frowned on in the XML world, as I understand
things). So sure, there should be some way to do it, but I wouldn't design
too much around this situation (unless it really is the "normal" use case
for "h").
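
For what it's worth, when doubly escaped output really is wanted, it can be produced by an explicit second pass rather than by design in "h". A small sketch using only the standard library's CGI.escapeHTML (the variable names are just for illustration):

```ruby
require "cgi"

once  = CGI.escapeHTML("2<3")   # => "2&lt;3"
twice = CGI.escapeHTML(once)    # => "2&amp;lt;3"

# Undoing one level of escaping recovers the singly escaped form.
puts twice
puts CGI.unescapeHTML(twice) == once   # prints true
```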