Pyramid: URLDispatch vs Traversal question


oO

Jan 29, 2011, 9:14:38 PM1/29/11
to pylons-devel
Sorry if this is a newbie question, but I'm finding that I'm a little
bit lost when it comes to the best way to implement an example
application using MongoDB (or any non-ZODB datastore) and traversal.

Conceptually, I like the idea of traversal instead of URL Dispatch, but
I find I'm a little bit lost as to what the best implementation is in
practice, especially when not relying on ZODB to handle the object
relationships.

As part of the application, I'm implementing a blog section, which
uses only 2 basic documents (in MongoDB terms), a "Blog" and a
"BlogEntry". However, from an application perspective, the URLs I need
to deal with are in the form:

<Blog>/archive/{year}/{month}/{day}/<BlogEntry>

for example:

/blog/archive/2011/01/29/my-post-about-pyramid

Of course, I'd eventually want to have my <Blog> resource live
anywhere I want in the application, so using traversal makes a lot of
sense. But I almost feel that the parts between <Blog> and <BlogPost>
in my tree are better expressed as URL Dispatch style patterns than
by creating resources, especially if resolving the resources means a
roundtrip to the database at every single step:

/blog
DB do we have an object named "blog"? yes, it's a BlogResource
/blog/archive
BlogResource objects have an implicit child called "archive"
/blog/archive/2011
DB Does this particular BlogResource "blog" have posts for the year
2011?
/blog/archive/2011/01
DB Does this particular BlogResource "blog" have posts for the month
2011/01?
/blog/archive/2011/01/29
DB Does this particular BlogResource "blog" have posts for the day
2011/01/29?
/blog/archive/2011/01/29/my-post-about-pyramid
DB Does this particular BlogResource "blog" have a
BlogPostResource 2011/01/29/my-post-about-pyramid?

Of course I could also stop anywhere in between, where I would want
to see a list of BlogPosts matching that particular date range. But it
seems crazy that I should have 5 different calls to the DB to resolve
that path, when I would probably want, once I know I have traversed to
a Blog object, to try to resolve the rest of the path with a single
call to the DB.
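
To make the "single call" idea concrete, here is a minimal sketch of what I have in mind, assuming each post document stores a `published` datetime and a `slug` (both field names are assumptions for illustration, as is the function name). It turns whatever date parts the URL supplied into one MongoDB-style filter dict:

```python
from datetime import datetime, timedelta


def archive_filter(year, month=None, day=None, slug=None):
    """Build one MongoDB-style filter for an archive path.

    `published` and `slug` are hypothetical field names.
    """
    if day is not None and month is None:
        raise ValueError("a day needs a month")
    # Start of the requested year, month, or day.
    start = datetime(year, month or 1, day or 1)
    # Exclusive end of the same period.
    if day is not None:
        end = start + timedelta(days=1)
    elif month is None or month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    query = {"published": {"$gte": start, "$lt": end}}
    if slug is not None:
        query["slug"] = slug
    return query
```

The resulting dict could then be handed to a single `find()` call on the posts collection, regardless of how many path segments were given.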

But I would love to not have to worry about the view infrastructure,
because one of the things I liked about Zope is being able to declare
views after the fact, completely separate from the resource objects,
so I would imagine I could have "/blog/json" or "/blog/archive/atom"
or "/blog/archive/2011/01/29/my-post-about-pyramid/edit" all being
valid URLs as well, which should map to explicit views and not just
resources.

QUESTIONS:

- Am I thinking too hard, and should I just use URL Traversal instead?
- Should I just do what ZODB does and implement the full resource
hierarchy explicitly in my DB?
- Is there a way to create a hybrid URLDispatch + Traversal application
where the traversal happens first, and a resource can trigger a route
match on the remaining portion of the URL?


oO




Rob Miller

Jan 29, 2011, 10:45:37 PM1/29/11
to pylons...@googlegroups.com
On 1/29/11 6:14 PM, oO wrote:
> Sorry if this is a newbie question, but I'm finding that I'm a little
> bit lost when it comes to the best way to implement an example
> application using MongoDB (or any non-ZODB datastore) and traversal.
>
> Conceptually, I like the idea of traversal instead of URL Dispatch but
> I find I'm a little bit lost as to what is the best implementation in
> practice, specially when not relying on ZOBD to handle the object
> relationship
>
> As part of the application, I'm implementing a blog section, which
> uses only 2 basic document (in MongoDB terms), a "Blog" and a
> "BlogEntry". however, from an application perspective, the URLs I need
> to deal with are in the form
>
> <Blog>/archive/{year}/{month}/{day}/<BlogEntry>
>
> for example:
>
> /blog/archive/2011/01/29/my-post-about-pyramid
>
> Of course, I'd eventually want to have my<Blog> resource live
> anywhere I want in the application, so using traversal makes a lot of
> sense.

this confuses me a bit. traversal is, in general, a bit _less_ flexible
than URL dispatch as far as having resources live "anywhere" in an
application. it's much easier to map two different URL patterns to the
same view than it is to construct two different traversal paths to the
same resource.

> But I almost feel that the parts between <Blog> and <BlogPost>
> in my tree are better expressed as a URL Dispatch style patterns than
> by creating resources,

i tend to agree.

> specially if resolving the resources means a
> roundtrip to the database at every single step:
>
> /blog
> DB do we have an object named "blog"? yes, it's a BlogResource
> /blog/archive
> BlogResource objects have an implicit child called "archive"
> /blog/archive/2011
> DB Does this particular BlogResource "blog" have posts for the year
> 2011?
> /blog/archive/2011/01
> DB Does this particular BlogResource "blog" have posts for the month
> 2011/01?
> /blog/archive/2011/01/29
> DB Does this particular BlogResource "blog" have posts for the month
> 2011/01/29?
> /blog/archive/2011/01/29/my-post-about-pyramid
> DB Does this particular BlogResource "blog" have a
> BlogPostResource 2011/01/29/my-post-about-pyramid?

yup, this pretty much describes what would be happening.

> Of course I could also stop anywhere in between , where I would want
> to see a list of BlogPosts matching that particular date range. But It
> seems crazy that I should have 5 different calls to the DB to resolve
> that path, when I would probably want, once I know I have traversed to
> a Blog object, to try to resolve the rest of the path with a single
> call to the DB.
>
> but I would love to not have to worry about the view infrastructure,
> because one of the things I liked about Zope is being able to declare
> views after the fact, completely separate from the resource objects,
> so I would imagine I could have "/blog/json" or "/blog/archive/atom"
> or "/blog/archive/2011/01/29/my-post-about-pyramid/edit" all being
> valid URL as well, which should match to explicit views and not just
> resources.

again, i'm a bit confused. zope is all about traversal, so the examples
you give above are quite different from each other. the first would
likely be a "json" view on the "blog" resource, the second an "atom"
view on the "archive" resource, the final would be an "edit" view on the
blog post resource. views are _always_ in the context of a resource
with zope.

> QUESTIONS:
>
> - Am I thinking too hard, and should I just use URL Traversal instead?

i suspect yes.

> - Should I just do what ZODB does and implement the full resource
> hierarchy explicitely in my DB?

you don't have to, but it'd make things conceptually a bit simpler (than
using traversal and not doing so, that is; url dispatch would be simpler
yet). but it does make the content more "placeful", so you'll still
have to jump through some hoops if you want the same blog post to be
reachable from two different urls.

> - Is there a way to create hybrid URLDispatch + Traversal application
> when the traversal happens first, and a resource can trigger a route
> match on the remaining portion of the URL?

yep: http://docs.pylonsproject.org/projects/pyramid/dev/narr/hybrid.html

-r

Wade Leftwich

Jan 30, 2011, 9:04:36 AM1/30/11
to pylons-devel
If you want to stick with Traversal, don't forget that you have
request.subpath available to you. For the path
"/blog/archive/2011/01/29/my-post-about-pyramid", you could have an
Archive context whose default (unnamed) view would be called with a
request that included
request.subpath = ['2011', '01', '29', 'my-post-about-pyramid'].

The view would then call a method on the Archive context using
arguments constructed from the request.subpath. The Archive would in
turn make a single call to Mongo. A path like '/blog/archive/2011/01'
would require some logic in the view, to choose a different Archive
method and a different template to show a listing of posts for
January, or perhaps redirect to /blog/month/2011/01 .

So if you are using Traversal in this way, the logic for parsing the
path lives in the view and/or the context. With URL Dispatch it's in
the configuration.
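
A minimal sketch of that view-side parsing, assuming the view already has the leftover path segments as a list of strings. The function name and the keys of the returned dict are illustrative only and are not part of Pyramid:

```python
def parse_archive_segments(segments):
    """Split leftover path segments into date parts and an optional slug.

    Expects up to three numeric segments (year, month, day), optionally
    followed by one slug. Raises ValueError for anything else.
    """
    segments = list(segments)
    if not 1 <= len(segments) <= 4:
        raise ValueError("expected 1 to 4 segments, got %d" % len(segments))
    parsed = {"year": None, "month": None, "day": None, "slug": None}
    for name, value in zip(("year", "month", "day"), segments[:3]):
        if not value.isdigit():
            raise ValueError("%s must be numeric: %r" % (name, value))
        parsed[name] = int(value)
    if len(segments) == 4:
        parsed["slug"] = segments[3]
    # Four segments mean "show one post"; fewer mean "list posts".
    parsed["kind"] = "post" if len(segments) == 4 else "listing"
    return parsed
```

The view could branch on `kind` to pick the Archive method and the template, which is the "logic in the view" trade-off described above.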

-- Wade

Tres Seaver

Jan 30, 2011, 11:12:58 AM1/30/11
to pylons...@googlegroups.com

On 01/30/2011 09:04 AM, Wade Leftwich wrote:
> If you want to stick with Traversal, don't forget that you have
> request.subpath available to you. For the path "/blog/archive/
> 2011/01/29/my-post-about-pyramid", you could have an Archive context
> whose default (unnamed) view would be called with a request that
> included request.subpath = ['2011', '01', '29', 'my-post-about-
> pyramid'].

Other traversal-based strategies:

- an 'archive' view registered for the blog context (assuming you don't
otherwise need the 'archive' context).

- add non-persistent placeholder objects to your app to take advantage
of traversal's use of '__getitem__'. This solution allows you to write
views for the "transient" context classes, even though they aren't in
your database. E.g., something like:

--------------------------- %< ---------------------
class Blog:
    ...
    def __getitem__(self, key):
        if key == 'archive':
            return Archive(self)
        # Assumes Blog's (elided) base class implements __getitem__.
        return super(Blog, self).__getitem__(key)

class Archive:
    __name__ = 'archive'
    YEARS = range(2010, 2100)
    def __init__(self, blog):
        self.__parent__ = self.blog = blog
    def __getitem__(self, key):
        try:
            year = int(key)
        except (TypeError, ValueError):
            # Traversal treats KeyError as "no such child".
            raise KeyError(key)
        if year not in self.YEARS:
            raise KeyError(key)
        return ArchiveYear(self, year)

class ArchiveYear:
    MONTHS = range(1, 13)  # 1 through 12 inclusive
    def __init__(self, archive, year):
        self.__parent__ = archive
        self.__name__ = str(year)
        self.blog = archive.blog
        self.year = year
    def __getitem__(self, key):
        try:
            month = int(key)
        except (TypeError, ValueError):
            raise KeyError(key)
        # or, return self.blog.entries[key]
        if month not in self.MONTHS:
            raise KeyError(key)
        return ArchiveMonth(self, month)

class ArchiveMonth:
    def __init__(self, year, month):
        self.__parent__ = year
        self.__name__ = str(month)
        self.blog = year.blog
        self.month = month
    def __getitem__(self, key):
        return self.blog.entries[key]

--------------------------- %< ---------------------

> The view would then call a method on the Archive context using
> arguments constructed from the request.subpath. The Archive would in
> turn make a single call to Mongo. A path like '/blog/archive/2011/01'
> would require some logic in the view, to choose a different Archive
> method and a different template to show a listing of posts for
> January, or perhaps redirect to /blog/month/2011/01 .
>
> So if you are using Traversal in this way, the logic for parsing the
> path lives in the view and/or the context. With URL Dispatch it's in
> the configuration.

Agreed. I find the traversal stuff much easier to test on this account.
The fact that I find it easier to think about is likely due to the warp
in my brain from years of heavy Zope usage. ;)


Tres.
--
===================================================================
Tres Seaver +1 540-429-0999 tse...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com

Wade Leftwich

Jan 30, 2011, 11:25:23 AM1/30/11
to pylons-devel
Actually a bit more in keeping with the spirit of Traversal, and again
keeping your calls to Mongo to a minimum, would be something like
this:

/blog/month/2011/01/29/my-post-about-pyramid would do this:

archive = Blog['archive']
year = archive['2011'] = archive.__getitem__('2011') = Year(y=2011)
month = year['01'] = year.__getitem__('01') = Month(y=2011, m=1)
day = month['29'] = month.__getitem__('29') = Day(y=2011, m=1, d=29)
post = day['my-post-about-pyramid']
     = day.__getitem__('my-post-about-pyramid')
     = Post(y=2011, m=1, d=29, slug='my-post-about-pyramid')
... and the default view for Post would be to query Mongo

This would handle shorter paths in a natural way -- e.g.
'/blog/month/2011/01' would map to the default view for
Month(y=2011, m=1), which would call Post.list_posts_for_month() .

Wade Leftwich

Jan 30, 2011, 12:39:14 PM1/30/11
to pylons-devel
> This would handle shorter paths in a natural way -- e.g. '/blog/month/
> 2011/01' would map to the default view for Month(y=2011, m=1), which
> would call Post.list_posts_for_month() .
>

Er, make that "the default view for Month(y=2011, m=1), which would
call Month.list_posts_for_month() which would actually talk to Mongo."

Mike Orr

Jan 30, 2011, 1:21:24 PM1/30/11
to pylons...@googlegroups.com
On Sun, Jan 30, 2011 at 8:25 AM, Wade Leftwich <wlef...@gmail.com> wrote:
> Actually a bit more in keeping with the spirit of Traversal, and again
> keeping your calls to Mongo to a minimum, would be something like
> this:
>
> /blog/month/2011/01/29/my-post-about-pyramid would do this:
>
> archive = Blog['archive']
> year = archive['2011'] = archive.__getitem__('2011') = Year(y=2011)
> month = year['01'] = year.__getitem__('01') = Month(y=2011, m=1)
> day = month['29'] = month.__getitem__('29') = Day(y=2011, m=1, d=29)
> post = day['my-post-about-pyramid']
>      = day.__getitem__('my-post-about-pyramid')
>      = Post(y=2011, m=1, d=29, slug='my-post-about-pyramid')
> ... and the default view for Post would be to query Mongo
>
> This would handle shorter paths in a natural way -- e.g.
> '/blog/month/2011/01' would map to the default view for
> Month(y=2011, m=1), which would call Post.list_posts_for_month() .

Are you traversing each level of the date here, or using a subpath from archive?

It seems like extra overhead to create a temporary Year class and
Month class just to traverse through them. What are you gaining?

I'm thinking of making a CMS site with articles embedding nested
comments. Chris suggested making the article a repoze.folder.Folder
subclass. I said it seems funny to think of an article as a folder. He
said he makes almost everything folders nowadays, as it makes it
easier to both use the object and traverse through it. Could
ArchiveYear and ArchiveMonth be made into Folder subclasses to factor
away some of the code?

--
Mike Orr <slugg...@gmail.com>

Wade Leftwich

Jan 30, 2011, 3:47:05 PM1/30/11
to pylons-devel
On Jan 30, 1:21 pm, Mike Orr <sluggos...@gmail.com> wrote:

> Are you traversing each level of the date here, or using a subpath from archive?

Traversing each level -- the same pattern Tres described this morning
as "transient" context classes.

> It seems like extra overhead to create a temporary Year class and
> Month class just to traverse through them. What are you gaining?

Simplicity.

If you're going to use the subpath, then your view callable has to
know something about the structure of the site. During traversal, the
successive contexts don't have access to the subpath -- the view
callable gets that as part of the request, after traversal is
finished. So each step of the traversal is ignorant about what's
downstream.

When you instantiate a temporary ArchiveYear object, you don't know
whether its __getitem__ method will be called, or its default view, or
a named view. (The default view, not shown in Tres' ArchiveYear, might
go to the datastore for monthly counts of posts, or a list of all the
posts in the year.)

Similarly, the ArchiveMonth object knows its own year and month
values, but not whether it will be called on to deliver an ArchiveDay
or a list of posts for the month.

> Chris suggested making the article a repoze.folder.Folder
> subclass. I said it seems funny to think of an article as a folder. He
> said he makes almost everything folders nowadays, as it makes it
> easier to both use the object and traverse through it. Could
> ArchiveYear and ArchiveMonth be made into Folder subclasses to factor
> away some of the code?

I haven't used repoze.folder yet, but I'm going to check it out,
thanks for the tip. Per Tres' remark this morning, Chris' reliance on
folders may have something to do with prolonged exposure to Zope ...

oO

Jan 30, 2011, 3:56:22 PM1/30/11
to pylons-devel
Thanks for the info. I'm following up on the idea of using
request.subpath, but at what point does this property get created?
This is my understanding of the call stack:

1. the RootFactory is called, with the request object as a parameter
2. the RootFactory returns the root resource
3. Pyramid repeatedly calls the __getitem__ method on each resource
until it runs out of path segments or a resource raises a KeyError
4. ...
5. a view gets called with the request and context

How do I access the request object from within the __getitem__ method
of my resources? If I'm to dynamically return resource objects based
on the subpath, then I need to somehow access the request object at
that point

Currently I'm storing the request object I got in step 1 inside my
RootFactory

root_factory.request = request

and every resource object is constructed with

Resource.__name__
Resource.__parent__
Resource.root_factory = root_factory

Which allows me to access the request object from inside the
__getitem__ method, but I get an attribute error that request.subpath
doesn't exist. Does it only get created later?

I'm guessing that for the archive object to be the valid context and a
view to be called, it should raise a KeyError during traversal, but
if I do this, I just get a 404 error without getting a view called
that would be able to access the request.subpath property.

Sorry if I'm creating more confusion on the subject.

oO

Wade Leftwich

Jan 30, 2011, 4:53:44 PM1/30/11
to pylons-devel
Hello oO,

As you have discovered, the subpath is not easily available during
traversal, when the context objects are being chosen; it's the view
callable that gets the subpath, after traversal is finished.
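
Here is a simplified model of that ordering, as I understand it. It is a sketch for illustration and not Pyramid's actual implementation; the `traverse` function and the `Archive` class are made up for this example:

```python
def traverse(root, segments):
    """Walk segments via __getitem__ and report what is left over.

    Returns (context, view_name, subpath). Simplified model only.
    """
    context = root
    for i, segment in enumerate(segments):
        getitem = getattr(context, "__getitem__", None)
        if getitem is None:
            # Leaf resource: the current segment becomes the view name.
            return context, segment, tuple(segments[i + 1:])
        try:
            context = getitem(segment)
        except KeyError:
            # No such child: same outcome as for a leaf resource.
            return context, segment, tuple(segments[i + 1:])
    # Every segment was consumed: default (unnamed) view, empty subpath.
    return context, "", ()


class Archive(object):
    """Leaf resource with no __getitem__, so traversal stops here."""


root = {"blog": {"archive": Archive()}}
context, view_name, subpath = traverse(root, ["blog", "archive", "2011", "01"])
```

In this model the first segment that cannot be resolved becomes the view name, and only the segments after it form the subpath. That would explain the 404: an Archive that raises KeyError for '2011' ends up looking for a view named '2011', and the subpath is only known once the loop has finished.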

I thought about it some more and, if it was me, I would follow the
"transient context classes" pattern described by Tres in this thread
today. Something like this:


class Root(dict):
    __name__ = None
    __parent__ = None


def as_int(key):
    """Convert a path segment to int; traversal expects KeyError on failure."""
    try:
        return int(key)
    except ValueError:
        raise KeyError(key)


# The ask_mongo_* functions below are placeholders for real datastore calls.

class Archive(object):
    def __init__(self, name=None, parent=None):
        self.__name__ = name
        self.__parent__ = parent

    def __getitem__(self, key):
        yearval = as_int(key)
        return Year(yearval=yearval, name=key, parent=self)

class Year(object):
    def __init__(self, yearval, name=None, parent=None):
        self.yearval = yearval
        self.__name__ = name
        self.__parent__ = parent

    def __getitem__(self, key):
        monthval = as_int(key)
        return Month(monthval=monthval, yearval=self.yearval,
                     name=key, parent=self)

    def listposts(self):
        """Default view might call this"""
        return ask_mongo_for_this_years_links(self.yearval)

class Month(object):
    def __init__(self, monthval, yearval, name=None, parent=None):
        self.monthval = monthval
        self.yearval = yearval
        self.__name__ = name
        self.__parent__ = parent

    def __getitem__(self, key):
        dayval = as_int(key)
        return Day(dayval=dayval, monthval=self.monthval,
                   yearval=self.yearval, name=key, parent=self)

    def listposts(self):
        """Default view might call this"""
        return ask_mongo_for_this_months_links(self.yearval, self.monthval)

class Day(object):
    def __init__(self, dayval, monthval, yearval, name=None,
                 parent=None):
        self.dayval = dayval
        self.monthval = monthval
        self.yearval = yearval
        self.__name__ = name
        self.__parent__ = parent

    def __getitem__(self, key):
        slug = key
        return Post(slug, dayval=self.dayval, monthval=self.monthval,
                    yearval=self.yearval, name=key, parent=self)

    def listposts(self):
        """Default view might call this"""
        return ask_mongo_for_this_days_links(self.yearval, self.monthval,
                                             self.dayval)

class Post(object):
    def __init__(self, slug, dayval, monthval, yearval, name=None,
                 parent=None):
        self.slug = slug
        self.dayval = dayval
        self.monthval = monthval
        self.yearval = yearval
        self.__name__ = name
        self.__parent__ = parent
        self.content = ask_mongo_for_blogpost(self.yearval, self.monthval,
                                              self.dayval, slug)

root = Root()
root['archive'] = Archive(name='archive', parent=root)

###

Now you don't have to be concerned with subpaths at all.

-- Wade



Tres Seaver

Jan 31, 2011, 11:45:12 AM1/31/11
to pylons...@googlegroups.com

On 01/30/2011 01:21 PM, Mike Orr wrote:
> On Sun, Jan 30, 2011 at 8:25 AM, Wade Leftwich wrote:
>> Actually a bit more in keeping with the spirit of Traversal, and again
>> keeping your calls to Mongo to a minimum, would be something like
>> this:
>>
>> /blog/month/2011/01/29/my-post-about-pyramid would do this:
>>
>> archive = Blog['archive']
>> year = archive['2011'] = archive.__getitem__('2011') = Year(y=2011)
>> month = year['01'] = year.__getitem__('01') = Month(y=2011, m=1)
>> day = month['29'] = month.__getitem__('29') = Day(y=2011, m=1, d=29)
>> post = day['my-post-about-pyramid']
>>      = day.__getitem__('my-post-about-pyramid')
>>      = Post(y=2011, m=1, d=29, slug='my-post-about-pyramid')
>> ... and the default view for Post would be to query Mongo
>>
>> This would handle shorter paths in a natural way -- e.g.
>> '/blog/month/2011/01' would map to the default view for
>> Month(y=2011, m=1), which would call Post.list_posts_for_month() .
>
> Are you traversing each level of the date here, or using a subpath from archive?
>
> It seems like extra overhead to create a temporary Year class and
> Month class just to traverse through them. What are you gaining?

One advantage: this strategy allows registering views against those
transient classes, which means that in the "reusable app" scenario the
re-using app can override those views selectively.

Another advantage: the logic is in the code, not in per-deployment
configuration, which makes it unit-testable.
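
As a rough illustration of the testability point, here is a minimal sketch of a unit test for a transient resource. The `ArchiveYear` below is a cut-down stand-in written for this example (it returns a tuple where a real app would return a month resource), not the class from earlier in the thread:

```python
import unittest


class ArchiveYear(object):
    """Stand-in transient resource that validates the month segment."""

    def __init__(self, year):
        self.year = year

    def __getitem__(self, key):
        try:
            month = int(key)
        except ValueError:
            raise KeyError(key)
        if not 1 <= month <= 12:
            raise KeyError(key)
        # A real app would return a month resource here.
        return (self.year, month)


class ArchiveYearTests(unittest.TestCase):
    def test_valid_month(self):
        self.assertEqual(ArchiveYear(2011)["01"], (2011, 1))

    def test_non_numeric_month(self):
        with self.assertRaises(KeyError):
            ArchiveYear(2011)["january"]

    def test_out_of_range_month(self):
        with self.assertRaises(KeyError):
            ArchiveYear(2011)["13"]
```

No request, router, or database is needed; the tests exercise `__getitem__` directly and can be run with `python -m unittest`.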

> I'm thinking of making a CMS site with articles embedding nested
> comments. Chris suggested making the article a repoze.folder.Folder
> subclass. I said it seems funny to think of an artlcle as a folder. He
> said he makes almost everything folders nowadays, as it makes it
> easier to both use the object and traverse through it. Could
> ArchiveYear and ArchiveMonth be made into Folder subclasses to factor
> away some of the code?

That would make sense if you actually want to *store* the entries
inside the archive/year/month tree (a perfectly reasonable strategy).

Making the *entry* a folder would make sense if you wanted to store
other items (images, audio / video clips, tarballs, PDFs) inside it.


Tres.
--
===================================================================
Tres Seaver +1 540-429-0999 tse...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com

oO

Feb 1, 2011, 7:50:13 PM2/1/11
to pylons-devel
Thanks for all the useful replies. I learned a lot in the last few
days. Some things that I ended up implementing:

1. a generic BlogFolder class for all my resources that return a list
of posts. I figured out that the context declaration for a view can be
an inherited class, which is great. All the resources that inherit
from the BlogFolder class now have the same default view.

2. You can raise a NotFound exception inside a resource, which means
that I can do a quick check for valid components in the resource phase
(like a year being 4 digits) but still raise an error when the view
calls my resource to get the posts for a year that doesn't have any.
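
Roughly, the two points combine like this. This is a simplified sketch rather than my actual code: the class names other than BlogFolder, the dict used as a fake datastore, and the local `NotFound` class (a stand-in for the framework's exception) are all assumptions for illustration:

```python
class NotFound(Exception):
    """Stand-in for the framework's not-found exception (assumption)."""


class BlogFolder(object):
    """Base class for every resource that yields a list of posts."""

    def __init__(self, name, parent, posts_by_key):
        self.__name__ = name
        self.__parent__ = parent
        self._posts_by_key = posts_by_key  # fake datastore: key -> list

    def key(self):
        raise NotImplementedError

    def posts(self):
        # Called by the view; this is where the datastore is consulted.
        found = self._posts_by_key.get(self.key(), [])
        if not found:
            raise NotFound(self.key())
        return found


class YearFolder(BlogFolder):
    def key(self):
        return self.__name__


class ArchiveFolder(BlogFolder):
    def key(self):
        return "archive"

    def __getitem__(self, segment):
        # Cheap syntactic check during traversal; no datastore access.
        if not (len(segment) == 4 and segment.isdigit()):
            raise KeyError(segment)
        return YearFolder(segment, self, self._posts_by_key)
```

A view registered against BlogFolder would then apply to both subclasses, and a well-formed but empty year fails only when `posts()` is called.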

BTW I love the separation between the view and the renderer, which
means I can completely separate the data returned by various views
from the template work which makes it pretty for human consumption.
Using the json renderer as a default and the JSONView Firefox plugin
makes for a great dev environment.

oO