Why does webob iterate over environ, calling __html__()?


Chris

Sep 26, 2009, 4:44:58 PM
to pylons-discuss
Hi,
Looking for a little advice on how to solve this.
Here is the problem: I store an object in environ and this object has
a special __getattr__ which always returns an object. However, this
object is *not* callable.

On a redirect, WebOb iterates over environ checking for __html__ via
hasattr (which calls __getattr__). My object in the environ returns
an object for __html__, but again it is not callable, so an Exception
is thrown.
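
A minimal sketch of the failure described above, in Python 3. `Database`, `Collection`, and `naive_escape` are stand-ins written for illustration; they are not PyMongo's or WebOb's real code, only an assumption about the shape of the problem.

```python
class Collection:
    """Stand-in for a collection object (hypothetical, not PyMongo's class)."""

    def __init__(self, name):
        self.name = name


class Database:
    """Stand-in whose __getattr__ returns an object for every attribute name."""

    def __getattr__(self, name):
        # Never raises AttributeError, so every hasattr() probe succeeds.
        return Collection(name)


def naive_escape(value):
    """Hypothetical escaper that trusts any __html__ attribute it finds."""
    if hasattr(value, '__html__'):
        return value.__html__()
    return str(value)


db = Database()
found = hasattr(db, '__html__')  # True, although no __html__ was ever defined

try:
    naive_escape(db)
    failed = False
except TypeError:
    # The "method" is really a Collection instance, which cannot be called.
    failed = True
```

The escaper sees an `__html__` attribute, tries to call it, and fails because the attribute is a collection object rather than a method.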

The object stored in environ is a third-party MongoDB database instance.

So, is this a bug with WebOb, that is, should webob check that
__html__ is callable?
    # in webob html_escape
    html = getattr(s, '__html__', None)
    if html and callable(html):
        return html()

Or should I fix this in the PyMongo Database class? Either way I have
to change or subclass a 3rd party lib.

I'm leaning towards fixing this in webob, because I have other objects
that have special __getattr__ methods. Any advice is appreciated.

Thanks,
Chris

Mike Orr

Sep 27, 2009, 1:44:44 AM
to pylons-...@googlegroups.com

.__html__() is a convention used for smart escaping. String-like
objects should return self to indicate that they're preformatted and
should not be escaped further. Other objects can define .__html__() to
indicate their preferred HTML format. This convention is used by
literal() in webhelpers.html, and by the render functions that ship
with Pylons (render_mako, etc). I didn't know WebOb itself also did
it.

Previous implementations of smart escaping (Quixote, Genshi) required
preformatted objects to be a certain class. This made it impossible
for third-party libraries to mark their objects preformatted, because
they'd have to depend on the package with the special class, which
they'd refuse to do or wouldn't know about. Worse, they would be tied
to one specific template library rather than supporting all of them.
The .__html__ strategy allows third-party packages to define their own
string subclass with an .__html__ method rather than having to depend
on a special class in a foreign package.
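
A small sketch of the convention, in Python 3. The names `Markup` and `smart_escape` are made up for this example and are not the actual webhelpers or Pylons implementation; the point is only that the escaper needs no shared base class, just an `__html__` method.

```python
import html


class Markup(str):
    """Hypothetical preformatted string: __html__ returns self unchanged."""

    def __html__(self):
        return self


def smart_escape(value):
    """Use the object's own HTML if it offers one, otherwise escape its text."""
    to_html = getattr(value, '__html__', None)
    if to_html is not None:
        return to_html()
    return html.escape(str(value))


plain = smart_escape('<b>hi</b>')            # escaped
marked = smart_escape(Markup('<b>hi</b>'))   # passed through as-is
```

Any package can define its own class like `Markup` without importing anything from the template library that later renders it.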

So in this case, is the object yours or somebody else's? If it's
yours, define an .__html__ method that returns unicode. If it's
somebody else's, well, what does it think .__html__ is supposed to be?
Or is it just returning something for all .__getattr__ calls
regardless of value?

callable() has been deprecated by Guido and is not in Python 3. He
says rather than using it, just call the d**n thing and see if you get
an exception or not.
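
For what it's worth, callable() was removed in Python 3.0 but restored in Python 3.2. The "just call it" approach could look roughly like the sketch below (Python 3). `escape_eafp` is a hypothetical helper, not WebOb's code, and catching TypeError this broadly is an assumption made to keep the example short.

```python
import html


def escape_eafp(value):
    """Try __html__() first; fall back to plain escaping if the call fails."""
    to_html = getattr(value, '__html__', None)
    if to_html is not None:
        try:
            return to_html()
        except TypeError:
            # The attribute exists but cannot be called; treat it as absent.
            pass
    return html.escape(str(value))


class Good:
    def __html__(self):
        return '<i>ok</i>'


class Weird:
    __html__ = 42  # present, but not a method

    def __str__(self):
        return '<x>'


good_result = escape_eafp(Good())
weird_result = escape_eafp(Weird())
```

One trade-off: a TypeError raised inside a genuine `__html__` method would be swallowed too, so a real implementation might want to be more selective.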

--
Mike Orr <slugg...@gmail.com>

Chris

Sep 27, 2009, 11:18:11 AM
to pylons-discuss
I see. Thanks for the info about __html__.

> I didn't know WebOb itself also did it.
It only seems to do it on an HTTP redirect. In webob.exc,
_make_body(self, environ, escape) loops over environ, calling
escape on every value. (I'm still not exactly sure why.)

> Or is it just returning something for all .__getattr__ calls
> regardless of value?

A pymongo.database instance always returns a collection object,
regardless of the attribute name. The collection object is not callable
(well, actually it defines __call__, but its implementation throws an
exception immediately on purpose). The pymongo database __getattr__
looks like this:

    # in pymongo database
    def __getattr__(self, name):
        return Collection(self.db, name)


Ok, good to know about the callable deprecation. It looks like I may
just subclass pymongo.database and override its __getattr__ to check
for __html__. That doesn't feel very clean, but it'll work.
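
A sketch of that subclassing workaround, in Python 3. `Database` and `Collection` here are simplified stand-ins for the PyMongo classes (their real constructors take other arguments), and `SafeDatabase` is a hypothetical name.

```python
class Collection:
    """Stand-in: defines __call__, but calling it fails on purpose."""

    def __init__(self, database, name):
        self.database = database
        self.name = name

    def __call__(self, *args, **kwargs):
        raise TypeError("'Collection' object is not callable")


class Database:
    """Stand-in with the catch-all __getattr__ quoted above."""

    def __getattr__(self, name):
        return Collection(self, name)


class SafeDatabase(Database):
    """Subclass that refuses to invent a collection named __html__."""

    def __getattr__(self, name):
        if name == '__html__':
            raise AttributeError(name)
        return Database.__getattr__(self, name)


# Because Collection defines __call__, callable() reports True here,
# so a callable() guard in the escaper would not have caught this case.
looks_callable = callable(Database().__html__)

safe = SafeDatabase()
has_html = hasattr(safe, '__html__')   # False: the probe now fails cleanly
users = safe.users                     # ordinary names still work
```

With the override in place, an escaper that probes for `__html__` falls back to its normal path instead of calling a collection.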



Mike Orr

Sep 27, 2009, 3:03:49 PM
to pylons-...@googlegroups.com
On Sun, Sep 27, 2009 at 8:18 AM, Chris <fractal...@gmail.com> wrote:
>
> I see.  Thanks for the info about __html__.
>
>> I didn't know WebOb itself also did it.
> It only seems to do it on an HTTP Redirection.  In webob.exc,
> _make_body(self, environ, escape) it loops over environ, calling
> escape on any values.  (i'm still not exactly sure why).

If it's displaying the environment in an error message, it has to
escape it to avoid security vulnerabilities. Otherwise a cracker can
force an exception and put malicious Javascript in the query string
(which would be displayed as part of the environment).
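
To illustrate the risk, here is a rough Python 3 sketch of rendering environ values into an HTML page. `render_environ` is a hypothetical helper, not WebOb's `_make_body`; it only shows why each value has to pass through an escaper first.

```python
import html


def render_environ(environ):
    """Return one 'key: value' line per entry, with both parts HTML-escaped."""
    lines = []
    for key in sorted(environ):
        lines.append('%s: %s' % (html.escape(str(key)),
                                 html.escape(str(environ[key]))))
    return '\n'.join(lines)


# An attacker controls the query string, so it may contain markup.
environ = {'QUERY_STRING': 'q=<script>alert(1)</script>'}
page = render_environ(environ)
```

Without the escaping step the script tag would reach the browser verbatim.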

>> Or is it just returning something for all .__getattr__ calls
>> regardless of value?
>
> pymongo.database instance always returns a collection object,
> regardless of the attr name.  The collection object is not callable,
> (well actually it defines __call__, but its implementation throws an
> exception immediately on purpose).  The pymongo database __getattr__
> looks like this
>
> # in pymongo database
> def __getattr__(self, name):
>  return Collection(self.db, name)
>
>
> Ok, good to know about the callable deprecation.  It looks like I may
> just subclass pymongo.database and override its __getattr__ to check
> for __html__.  That doesn't feel very clean, but it'll work.

It sounds like the best solution. Sometimes you have to make kludges
like this when two unrelated libraries make contradictory assumptions.
Fortunately pymongo.database is overridable.

Why does pymongo.database return a useless value for unknown
attributes? Perhaps this is a bug in PyMongo. I'm not sure what
``Collection(self.db, name)`` means, but if a property is not
specifically defined it should raise AttributeError. Otherwise it
will throw off not only WebOb but all analysis/introspection tools.

--
Mike Orr <slugg...@gmail.com>

Chris

Sep 27, 2009, 3:42:29 PM
to pylons-discuss
> If it's displaying the environment in an error message, it has to
> escape it to avoid security vulnerabilities. Otherwise a cracker can
> force an exception and put malicious Javascript in the query string
> (which would be displayed as part of the environment).

I see. I will dig around and try to understand the flow more. I
still have questions, but I'll see if I can figure it out before
asking.


> Why does pymongo.database return a useless value for unknown
> attributes? Perhaps this is a bug in PyMongo. I'm not sure what
> ``Collection(self.db, name)`` means, but if a property is not
> specifically defined it should raise AttributeError. Otherwise it
> will throw off not only WebOb but all analysis/introspection tools.

I think it is like that as a convenience feature. It strikes me as a
little odd as well.
In MongoDB, a collection is somewhat analogous to a RDBMS table. A
table stores records, a mongo collection stores json-ish documents.
For example, you may have a 'users' collection, 'photos' collection
etc. The __getattr__ creates the collection if it did not already
exist, or returns the existing collection.

Example:

    db = Database()
    db.users.save(...)
    db.random_collection.save(...)

    # Equivalent to this form - more explicit
    users = Collection(db, 'users')
    users.save(...)
    random_collection = Collection(db, 'random_collection')
    random_collection.save(...)

Thanks for your help!


Marius Gedminas

Sep 28, 2009, 9:32:40 AM
to pylons-...@googlegroups.com
On Sun, Sep 27, 2009 at 08:18:11AM -0700, Chris wrote:
>
> I see. Thanks for the info about __html__.
>
> > I didn't know WebOb itself also did it.
> It only seems to do it on an HTTP Redirection. In webob.exc,
> _make_body(self, environ, escape) it loops over environ, calling
> escape on any values. (i'm still not exactly sure why).
>
> > Or is it just returning something for all .__getattr__ calls
> > regardless of value?
>
> pymongo.database instance always returns a collection object,
> regardless of the attr name. The collection object is not callable,
> (well actually it defines __call__, but its implementation throws an
> exception immediately on purpose). The pymongo database __getattr__
> looks like this
>
> # in pymongo database
> def __getattr__(self, name):
>     return Collection(self.db, name)

There are all kinds of special names that can trip you up, in various
places. I've found that it's safest to add

    if name.startswith('_'):
        raise AttributeError(name)

to such magic __getattr__ methods.

If you actually use collection names starting with an underscore, you
could make a stricter check and look for __..., or __...__.
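
A sketch of the stricter variant, in Python 3, which rejects only double-underscore "dunder" names. `Database` and `Collection` are simplified stand-ins, not the real PyMongo classes.

```python
class Collection:
    """Stand-in for a collection object (hypothetical)."""

    def __init__(self, database, name):
        self.database = database
        self.name = name


class Database:
    """Stand-in whose magic __getattr__ ignores protocol probes."""

    def __getattr__(self, name):
        # Names like __html__ are lookups by other code, not collection names.
        if name.startswith('__') and name.endswith('__'):
            raise AttributeError(name)
        return Collection(self, name)


db = Database()
probe = getattr(db, '__html__', None)   # None: the probe fails cleanly
private = db._private                   # single-underscore names still work
```

This keeps collections such as `_private` reachable while letting `hasattr` and `getattr` behave normally for special names.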

Marius Gedminas
--
There's a special hell 4 people who replace words with numbers.
-- Ben "Yahtzee" Croshaw
