PyCon 2006 wrapup

Kevin Dangoor

Mar 3, 2006, 7:52:05 AM
to turbo...@googlegroups.com
Hi, everyone!

PyCon was definitely a good time. I highly recommend it to folks. It
was a great chance to meet many people doing a wide variety of things
with Python. The talks covered a fair number of topics (including a
sneak preview of Python 2.5 for those of us not watching python-dev).

We had a good turnout for the sprint. On Sunday, we did some planning
on what we would do and how we'd proceed. There were about 12 of us!

We were split into 3 teams, working on Docudo, Kid and TurboGears WSGI.

Joel Pearson sent me a Docudo update last night saying that they've
got a spec together and Docudo is now like a "wiki20" plus: stores in
subversion, edits pages using TinyMCE and keeps pages in separate
project version namespaces. For those new to Docudo: the idea is to
create a web-based tool that fixes some of the warts of using a wiki
for software project documentation.

You can join in on Docudo! The mailing list is here:
http://groups.google.com/group/docudo
and the Subversion repository is here:
http://www.turbogears.org/svn/docudo

In addition to Joel, Karl Guertin, Mike Orr, Arthur McLean, Kevin Horn
all participated in Docudo.

Mike Pirnat and David Stanek were working on Kid. They created tests
that time a variety of Kid's operations and squeezed out some
performance gains. Then they embarked on a major restructuring of
Kid's internals with an aim to simplify. Kid jumps through some hoops
to stream documents and can become much simpler if those hoops aren't
there. I left before they did yesterday, so I'm not certain where they
left off.

I was working with Ian Bicking, Gary Godfrey, Matt Good and Bill
Zingler on better WSGI support for a future TurboGears version. This
is in the tgpycon branch that was apparently scaring some folks on
Monday. That branch doesn't pass all of the original tests yet, but it
does pass some new and interesting ones. As of now, you can
instantiate a TurboGears application right in your object tree and
have URL and configuration sanity. You can also attach a TurboGears
app to a URL via a config file, and mix a TurboGears app with other
WSGI apps, possibly written with other frameworks.

Before I left yesterday, Gary Godfrey told me about his next step with
that branch and his change was going to make it even more flexible and
useful by making each controller class a "WSGI app" with the ability
to insert "middleware" at any part of your stack. You'll also have
incredible control over how object traversal works *if you want to*.

This branch has a goal that existing TurboGears apps should continue
working without modifications. We're well on the way with that goal,
but serious testing will be needed. My plan for this branch is that
after 0.9 enters beta, I'll branch 0.9. The tgpycon branch will then
move to the trunk to get more serious attention. That code won't be
out until after TG 1.0, though.

We got some real traction on some "big ticket items", which I'm very
pleased about. These would have been much harder to work out
individually and I think we've got enough started that we'll be able
to keep the work going and get contributions from others now that the
sprint is over.

Feel free to try out the tgpycon branch and talk about it on the list.

Kevin

Michele Cella

Mar 3, 2006, 8:15:13 AM
to TurboGears
Kevin Dangoor wrote:
> Hi, everyone!
>
> PyCon was definitely a good time. I highly recommend it to folks. It
> was a great chance to meet many people doing a wide variety of things
> with Python. The talks covered a fair number of topics (including a
> sneak preview of Python 2.5 for those of us not watching python-dev).
>

Really cool indeed. :-)

> We had a good turnout for the sprint. On Sunday, we did some planning
> on what we would do and how we'd proceed. There were about 12 of us!
>

That's a lot of mind sharing! :D

> We were split into 3 teams, working on Docudo, Kid and TurboGears WSGI.
>
> Joel Pearson sent me a Docudo update last night saying that they've
> got a spec together and Docudo is now like a "wiki20" plus: stores in
> subversion, edits pages using TinyMCE and keeps pages in separate
> project version namespaces. For those new to Docudo: the idea is to
> create a web-based tool that fixes some of the warts of using a wiki
> for software project documentation.
>
> You can join in on Docudo! The mailing list is here:
> http://groups.google.com/group/docudo
> and the Subversion repository is here:
> http://www.turbogears.org/svn/docudo
>
> In addition to Joel, Karl Guertin, Mike Orr, Arthur McLean, Kevin Horn
> all participated in Docudo.
>

Wow, this is definitely a big push forward for Docudo!

> Mike Pirnat and David Stanek were working on Kid. They created tests
> that time a variety of Kid's operations and squeezed out some
> performance gains. Then they embarked on a major restructuring of
> Kid's internals with an aim to simplify. Kid jumps through some hoops
> to stream documents and can become much simpler if those hoops aren't
> there. I left before they did yesterday, so I'm not certain where they
> left off.
>

Great, I've seen a lot of activity on Kid's Trac. I'm sure this will
help fix the base reloading issues, namespace Kid's internals, and
provide helpful error display, along with the performance improvements.

> I was working with Ian Bicking, Gary Godfrey, Matt Good and Bill
> Zingler on better WSGI support for a future TurboGears version. This
> is in the tgpycon branch that was apparently scaring some folks on
> Monday. That branch doesn't pass all of the original tests yet, but it
> does pass some new and interesting ones. As of now, you can
> instantiate a TurboGears application right in your object tree and
> have URL and configuration sanity. You can also attach a TurboGears
> app to a URL via a config file. And mix a TurboGears app with other
> WSGI apps that are possibly written with other frameworks.
>
> Before I left yesterday, Gary Godfrey told me about his next step with
> that branch and his change was going to make it even more flexible and
> useful by making each controller class a "WSGI app" with the ability
> to insert "middleware" at any part of your stack. You'll also have
> incredible control over how object traversal works *if you want to*.
>

OK, I just can't stop saying Cool! :D
These things can really become a killer feature for a web framework
like TG, IMHO.

> This branch has a goal that existing TurboGears apps should continue
> working without modifications. We're well on the way with that goal,
> but serious testing will be needed. My plan for this branch is that
> after 0.9 enters beta, I'll branch 0.9. The tgpycon branch will then
> move to the trunk to get more serious attention. That code won't be
> out until after TG 1.0, though.
>

This sounds like a good idea to me...

> We got some real traction on some "big ticket items", which I'm very
> pleased about. These would have been much harder to work out
> individually and I think we've got enough started that we'll be able
> to keep the work going and get contributions from others now that the
> sprint is over.
>

Definitely!

Thanks for the nice writeup and to all you guys that made this
possible!

Ciao
Michele

Mike Pirnat

Mar 4, 2006, 11:46:46 AM
to turbo...@googlegroups.com
I've posted some of my least awful PyCon photos, including some from
the TG sprint, here:

http://www.flickr.com/photos/mikepirnat/sets/72057594074231320/

--
Mike Pirnat
mpi...@gmail.com
http://www.pirnat.com/

Ben Bangert

Mar 4, 2006, 3:51:46 PM
to TurboGears
Kevin Dangoor wrote:

> Before I left yesterday, Gary Godfrey told me about his next step with
> that branch and his change was going to make it even more flexible and
> useful by making each controller class a "WSGI app" with the ability
> to insert "middleware" at any part of your stack. You'll also have
> incredible control over how object traversal works *if you want to*.

I'd be interested in hearing how this works out. Similar things were
considered for Pylons controllers; however, it seems to greatly
over-complicate things when you make individual controllers true WSGI
apps, versus merely objects that you call with a WSGI interface. The
latter still allows for insertion of some types of "middleware" at any
part of the stack. But since your application itself is typically a
collection of 'controllers', rather than each individual controller
being an app capable of functioning totally independently, it warps a
bunch of terminology in a way that will likely drive many people
insane. :)

Great to hear about the continued drive for deeper WSGI integration. I
should note that while Pylons currently uses Myghty resolving in the
core, it's on the road map to split out the resolver sequence to a more
WSGI-styled approach, which is where RhubarbTart appears to be going.
This means that future TurboGears versions will vary very little from
future Pylons versions. Good times ahead!

Cheers,
Ben

Gary Godfrey

Mar 5, 2006, 11:15:05 AM
to TurboGears
Ben Bangert wrote:
> Kevin Dangoor wrote:
>
> > Before I left yesterday, Gary Godfrey told me about his next step with
> > that branch and his change was going to make it even more flexible and
> > useful by making each controller class a "WSGI app" with the ability
> > to insert "middleware" at any part of your stack. You'll also have
> > incredible control over how object traversal works *if you want to*.
>
> I'd be interested in hearing how this works out. Similar things were
> considered for Pylons controllers, however it seems to greatly
> over-complicate things when you make individual controllers true WSGI
> apps, vs merely objects that you call with a WSGI interface. The latter
> still allows for insertion of some types of "middleware" at any part of
> the stack, though since your application itself is typically a
> collection of 'controllers', rather than each individual controller
> being an App capable of functioning totally independently.... it warps
> a bunch of terminology that will likely drive many people insane. :)

OK - I'll outline some of my current thoughts and let you pound on 'em.
I'll be glad to hear what types of things you ran into with Pylons.

So, the general idea is that everything is wsgi until it can't be. So
a TurboGears controller is wsgi middleware which (by default) takes one
item off the url and uses it to wsgi-call its own methods. It may also
have other uses, like capturing a Form Validation Exception and calling
appropriate error methods. This also implies that the @expose decorator
is really a wsgi adapter.
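
Roughly, something like this (a sketch off the top of my head, not the
actual tgpycon code; wsgi_expose and Root are made-up names):

def wsgi_expose(func):
    # Hypothetical stand-in for @expose acting as a wsgi adapter: wrap a
    # plain method that returns a string so it presents a wsgi interface.
    def adapter(self, environ, start_response):
        body = func(self)
        start_response('200 OK', [('Content-Type', 'text/html'),
                                  ('Content-Length', str(len(body)))])
        return [body]
    adapter.wsgi = True   # mark the callable as safe to dispatch to
    return adapter

class Controller(object):
    def __call__(self, environ, start_response):
        # Shift one segment from PATH_INFO onto SCRIPT_NAME, then dispatch.
        path = environ.get('PATH_INFO', '').lstrip('/')
        segment, _, rest = path.partition('/')
        environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + '/' + segment
        environ['PATH_INFO'] = '/' + rest if rest else ''
        method = getattr(self, segment or 'index', None)
        if method is None or not getattr(method, 'wsgi', False):
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return ['Not Found']
        return method(environ, start_response)   # the method is wsgi-called

class Root(Controller):
    @wsgi_expose
    def index(self):
        return '<h1>Hello from a wsgi-called controller method</h1>'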

The fun part about this is that a (for instance) Authorization check
just needs to be simple wsgi middleware and it works on whole
controllers as well as with individual methods with a single routine.

What also will begin to happen (or at least should happen) is that
we'll start using the environ dict in a more expanded fashion. I'm
thinking of something akin to Zope3's Interfaces, but not so formal.
So, "Identity" middleware will set a "wsgi.Identity" which will contain
UserName, Groups, Roles, Permissions, etc. Then, an Authorization
check is completely separate from the Identity mechanism. I suspect
we'll need something like PEPs just for this (it will have to move far
faster than PEPs do - especially initially). Vision: a TurboGears
application should run under Zope that has a wsgi interface and may
even be able to handle simple permissions.
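
Just to sketch the shape of it (the 'wsgi.Identity' key and the field
names here are a proposal, not anything standard):

class Identity(object):
    def __init__(self, user_name, groups=(), roles=(), permissions=()):
        self.user_name = user_name
        self.groups = set(groups)
        self.roles = set(roles)
        self.permissions = set(permissions)

class IdentityMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # However the identity is actually established (cookie, LDAP, ...),
        # the result lands in environ for anything downstream to read.
        environ['wsgi.Identity'] = Identity('anonymous')
        return self.app(environ, start_response)

An Authorization check downstream then only needs environ, e.g.
'admin' in environ['wsgi.Identity'].roles, without knowing anything
about where the identity came from.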

> Great to hear about the continued drives for deeper WSGI integration. I
> should note that while Pylons currently uses Myghty resolving in the
> core, its on the road-map to split out the resolver sequence to a more
> WSGI styled approach which is where RhubarbTart appears to be going.
> This means that future TurboGears versions will vary very little from
> future Pylons versions. Good times ahead!

RT was doing some half wsgi / half CherryPy things in the URL
traversal. We started going down that path and realized that it would
be pretty easy to just use wsgi everywhere and get the same effect
(actually better effects in several important ways). So RhubarbTart
seems like it's falling out (being replaced with very little code).
Some things that are gumming me up now are:

1) How to communicate upstream? For instance, if I want to logout, I
need to let the Identity middleware know that. Can I set something
magic in environ? Do I have to throw an exception?

2) What if I need to "fork" multiple wsgi requests? This could easily
happen on a "three column" web page where the right hand side contains
the summary and the main area contains the real thing. I suspect it
just means intercepting the start_response() call, but I'm not sure.

3) The wsgi standard says that no inspection of wsgi applications is
allowed. This is unfortunate because we currently rely on things like
an "expose" property on controller methods which are callable. At the
very least, I'd like to start an informal standard that a 'wsgi'
property be on all wsgi callables. That should be enough to prevent
uncallables in Controllers from getting called by Evil URLs.

Cheers,
Gary Godfrey
Austin, TX

Ben Bangert

Mar 5, 2006, 12:56:35 PM
to TurboGears
Gary Godfrey wrote:
> So, the general idea is that everything is wsgi until it can't be. So
> a TurboGears controller is wsgi middleware which (by default) takes one
> item off the url and uses it to wsgi call self methods. It may also
> have other uses, like capturing a Form Validation Exception and calling
> appropriate error methods. This also implies that the @expose decorator
> is really a wsgi adapter.

This definitely works better in an object-path based dispatch. It won't
work for Routes-based dispatch under most conditions, as Routes matches
and acts on the entire URL, not just the front section of it, so it's
not quite clean to pop off the front. If the person is careful,
they can use Routes-based dispatch, as Ian has talked about, by making
a Route specifically for the purpose that has a section at the
beginning and a url_info remainder that goes to PATH_INFO.

> The fun part about this is that a (for instance) Authorization check
> just needs to be simple wsgi middleware and it works on whole
> controllers as well as with individual methods with a single routine.

Though I think it'd be a little tricky to setup Authorization that
worked solely with environ, rather than letting you ask for specific
permissions for controller access, as a library method would. The thing
with middleware is that for it to truly be portable (and useful
middleware), it really should be able to function based solely off the
environ and/or content passing up/down the middleware chain.

There's several things that lose features if moved purely to
middleware, like Auth functions, where you might want to do,
Auth.user_can(permission='edit_people'), even though the environ might
not contain anything having to do with users or permissions. So while
WSGI sounds great, and it is, for a lot of things, it definitely has
its place, and library functions work great for a lot of this as well.

> What also will begin to happen (or at least should happen) is that
> we'll start using the environ dict in a more expanded fashion. I'm
> thinking of something akin to Zope3's Interfaces, but not so formal.
> So, "Identity" middleware will set a "wsgi.Identity" which will contain
> UserName, Groups, Roles, Permissions, etc. Then, an Authorization
> check is completely separate from the Identity mechanism. I suspect
> we'll need something like PEPs just for this (it will have to move far
> faster than PEPs do - especially initially). Vision: a TurboGears
> application should run under Zope that has a wsgi interface and may
> even be able to handle simple permissions.

But why? Why cloud up environ needlessly? A thread-local global with a
library that any app can use up and down the chain accomplishes the
same thing, without clouding up environ.

> 1) How to communicate upstream? For instance, if I want to logout, I
> need to let the Identity middleware know that. Can I set something
> magic in environ? Do I have to throw an exception?

Not an issue with a thread-local global and a handy library package.
Alternatively, the Identity middleware could put a logout function in
environ that you could call, which would trigger the appropriate
cleanup. Notice however, that the Identity middleware would need to
know what session system is being used so that it could clean the
session as needed for logging out. This isn't necessarily difficult, as
with session middleware, it'd be present in the environ somewhere and
Identity middleware would just need to know where it is.

environ can be loaded up with objects and functions. So rather than
sending a message upstream, we can just keep some of the upstream
functions around downstream as needed.
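
For instance (sketch only; the 'identity.logout' key is a made-up name):

class IdentityMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def logout():
            # Clean up whatever this middleware set up for the request,
            # e.g. expire the session cookie or clear server-side state.
            environ.pop('identity.user', None)
        environ['identity.user'] = self.lookup_user(environ)
        environ['identity.logout'] = logout
        return self.app(environ, start_response)

    def lookup_user(self, environ):
        # Placeholder for however the real middleware finds the user.
        return None

Downstream, the app or a controller just calls
environ['identity.logout']() and the middleware does the cleanup.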

> 2) What if I need to "fork" multiple wsgi requests? This could easily
> happen on a "three column" web page where the right hand side contains
> the summary and the main area contains the real thing. I suspect it
> just means intercepting the start_response() call, but I'm not sure.

It's a bit more complicated I think. This is also what Ian Bicking
proposed using HTML overlays for, as it'd make it a bit more feasible
to assemble different sections of a page with different wsgi apps.

> 3) The wsgi standard says that no inspection of wsgi applications is
> allowed. This is unfortunate because we currently rely on things like
> an "expose" property on controller methods which are callable. At the
> very least, I'd like to start an informal standard that a 'wsgi'
> property be on all wsgi callables. That should be enough to prevent
> uncallables in Controllers from getting called by Evil URLs.

Alternatively, you could enforce a policy, i.e. Python's, and declare
that private methods use _ in front. In Pylons, after looking at more
than a few different Controller styles, I went with the callable style
utilized by Aquarium. It's incredibly 'Pythonic', and super-flexible.
That is, the controller is required to be a callable, and is called
with the method name as the first arg and the rest of the method's
normal args as the remainder.

It's sort of like CP's default method, except that since it's called
every time, it provides a very nice spot to put in controller-wide
setup stuff, possibly alter which method is to be called, do
authentication checks for multiple methods in one go, etc.

Consider the current TG solution, where you might have to drop the same
decorator on a dozen different methods requiring the user to be logged
in for all of them. Talk about repeating yourself. If you had a
__call__ style, you could say that being logged in is necessary for all
the methods except 'login', in a mere 2 lines of code by checking the
method name against a list of what requires it.

It also provides a handy point for modifying method calls. Another few
lines of code in a __call__ function, and you could set it up so that
it first checks for method_HTTP_METHOD, then method. That'd make it a
snap to split functions depending on the request method, i.e., it checks
for a method_POST first, then method, if the request method is POST.
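
Something like this, roughly (a sketch of the style, not actual
Pylons/Aquarium code; all the names are illustrative):

class BlogController(object):
    # Everything except these actions requires a logged-in user.
    public_actions = ['login']

    def __call__(self, action, *args, **kwargs):
        if action not in self.public_actions and not self.current_user():
            return self.login()
        # Prefer an HTTP-method-specific handler, e.g. index_POST.
        request_method = kwargs.pop('request_method', 'GET')
        method = getattr(self, '%s_%s' % (action, request_method), None) \
                 or getattr(self, action)
        return method(*args, **kwargs)

    def current_user(self):
        return None          # placeholder for a real identity lookup

    def login(self):
        return 'login page'

    def index(self):
        return 'index page'

    def index_POST(self):
        return 'handle the posted form'

So BlogController()('index', request_method='POST') lands in index_POST,
and anything other than 'login' bounces to the login page until the
user is authenticated.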

Anyways, I obviously rather like this style, especially as it provides
for incredible amounts of flexibility and customizability, all at the
developer's fingertips. It also changes the point of where the callable
is in the model, so that you call the controller, and let it figure out
how to proceed. It'd also solve the problem you mention by letting the
controller ensure an evil URL doesn't call a method it shouldn't.

Finally, if each controller is a wsgi app, does that mean it has to
set up the thread-local globals again for the current application? I
think my main issue with making each controller a wsgi app is that it
really isn't. If you consider a WSGI app to be a stand-alone
application (which I think it should be), then it doesn't make sense.
Controllers depend on other controllers being where they are, because
controllers are all part of a single application. WSGI apps don't care
about anything outside of themselves.

Does each controller have to parse environ, and setup a request object
again? There's a lot of setup stuff a framework does when it starts
handling a WSGI call, is that all going to be pushed into each
controller?

It's for those reasons that using wsgi for controller calls just
doesn't make a lot of sense to me. However, I can see cases where it'd
be very useful for controllers to be called using the wsgi interface,
which looks identical to the controllers being WSGI apps, except that
thread-local globals and such are set up by a primary WSGI app (RT, or
whatever). So it looks like they're WSGI apps, because they're called
with the WSGI interface, but each controller is aware it's in a greater
whole, the application.

Cheers,
Ben

Gary Godfrey

Mar 5, 2006, 7:00:01 PM
to TurboGears
Fun stuff. BTW, if it didn't come across in my last message, I really
got a feeling of "drinking the Kool-Aid" during PyCon. I finally "got"
Paste (and many aspects of TG) in a way that I didn't before. Some of
the zealotry will likely wear off soon :-).

Ben Bangert wrote:
> There's several things that lose features if moved purely to
> middleware, like Auth functions, where you might want to do,
> Auth.user_can(permission='edit_people'), even though the environ might
> not contain anything having to do with users or permissions. So while
> WSGI sounds great, and it is, for a lot of things, it definitely has
> its place, and library functions work great for a lot of this as well.

The main issue that I have with library functions is that it's not
(currently) portable across frameworks. I can't do:

from <current running parent framework> import Identity

So I'm thinking of using environ as a proxy. You can still do calls
within this scope (just for example):

environ['Identity'].user_can(permission='edit_people')

Now, I hope that someday the different frameworks might be able to get
together and have a generic import mechanism, but I think that's a ways
off. Right now, wsgi is the only thing we have that might have a
slight possibility of maybe bringing the frameworks together. So I'm
hanging a little more on it here than I would if it were a purely
technical decision.

> But why? Why cloud up environ needlessly? A thread-local global with a
> library that any app can use up and down the chain accomplishes the
> same thing, without clouding up environ.

I sort of like the idea of environ as the primary thread-safe area to
use. We need to be _very_ careful about namespace and clutter, but the
same is true for any namespace.

> environ can be loaded up with objects and functions. So rather than
> sending a message upstream, we can just keep some of the upstream
> functions around downstream as needed.

Cool - I can see a big thing here is documentation. It's going to be
very easy to get this thing confused! I'm also envisioning that the
docstrings for these routines should have (Graphviz) dot embedded in
them. Now, how to automagically create a pretty graph which
automatically searches the source and looks for callbacks and
exceptions up the wsgi stack...

> > 2) What if I need to "fork" multiple wsgi requests? This could easily
> > happen on a "three column" web page where the right hand side contains
> > the summary and the main area contains the real thing. I suspect it
> > just means intercepting the start_response() call, but I'm not sure.
>
> It's a bit more complicated I think. This is also what Ian Bicking
> proposed using HTML overlays for, as it'd make it a bit more feasible
> to assemble different sections of a page with different wsgi apps.

Cool - I hadn't read that before. It sounds like a great way (from an
Application designer standpoint) to implement this. I suspect it could
be implemented fairly easily using paste.recursive, but it could be
that I just don't understand it well enough yet <g>.

> Alternatively, you could enforce a policy, ie Python's, and declare
> that private methods use _ in front. In Pylons, after looking at more
> than a few different Controller styles, I went with the callable style
> utilized by Aquarium. It's incredibly 'Pythonic', and super-flexible.
> That is, the controller is required to be a callable, and is called
> with the method name as the first arg, the rest of the methods normal
> args as the remainder.

That's sort of what we have with WSGI, isn't it? The only difference
is that the method name and additional parameters are pulled from
environ['PATH_INFO'] rather than passed in parameters. I am thinking
of making a function or two which will make managing this while doing
RESTful stuff a bit easier.
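
Something along these lines, maybe (sketch only; the names are made up):

def shift_path(environ):
    # Pop the next segment off PATH_INFO and append it to SCRIPT_NAME,
    # returning the segment (or None when the path is exhausted).
    path = environ.get('PATH_INFO', '').lstrip('/')
    if not path:
        return None
    segment, _, rest = path.partition('/')
    environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '').rstrip('/') + '/' + segment
    environ['PATH_INFO'] = '/' + rest if rest else ''
    return segment

def remaining_args(environ):
    # Whatever is left on PATH_INFO, treated as positional arguments.
    return [p for p in environ.get('PATH_INFO', '').split('/') if p]

So a RESTful controller could shift_path() once to pick the method, then
hand remaining_args() to it.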

> Consider the current TG solution, where you might have to drop the same
> decorator on a dozen different methods requiring the user to be logged
> in for all of them. Talk about repeating yourself. If you had a
> __call__ style, you could say that being logged in is necessary for all
> the methods except 'login', in a mere 2 lines of code by checking the
> method name against a list of what requires it.

Again, still easy to do wsgi-ish (top of head, no error checking):

def __call__(self, environ, start_response):
    tgEnviron(environ)  # Make sure environment has things in place
    # lowercase path_info has PATH_INFO split on '/' and cleaned up.
    if not environ['path_info'][0] in ['Login', 'Logout']:
        identity.require(environ, identity.in_group('admin'))
    return self.defaultCall(environ, start_response)  # standard Controller __call__

> Finally, if each controller is a wsgi app, does that mean it has to
> setup the thread-local globals again for the current application?

Nope - only the wsgi Server/Gateway side needs to worry about that.
environ is thread safe - if we hang everything there, then we're safe.

> I think my main issue with making each controller a wsgi app, is that it
> really isn't.

You're right - each controller is really WSGI middleware.

> If you consider a WSGI app to be a stand-alone
> application (which I think it should be), then it doesn't make sense.

I'm not sure I entirely agree there - I can see room for WSGI apps that
aren't relevant standalone (a menu system, for instance).

> Controllers depend on other controllers being where they are, because
> controllers are all part of a single application. WSGI apps don't care
> about anything outside of them-self.

I think this is a terminology issue. I'm thinking that an
"Application" is a WSGI middleware stack with WSGI applications hanging
off on the leaves. The architecture of the middleware is part of the
Application.

> Does each controller have to parse environ, and setup a request object
> again? There's a lot of setup stuff a framework does when it starts
> handling a WSGI call, is that all going to be pushed into each
> controller?

Again, I think it's completely valid (and necessary) for Applications
to make assumptions about upstream middleware. I'm assuming that a TG
application which is running under a different Server/Gateway would
have a small piece of middleware which marshals the environ
appropriately to resemble the normal TG Server environment.
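
Pure sketch of what I mean by that marshalling piece (the environ keys
a real TG stack expects are just guesses here):

class TGEnvironAdapter(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Fill in whatever the TG application expects its own server
        # stack to have provided.  These key names are illustrative only.
        environ.setdefault('path_info',
            [p for p in environ.get('PATH_INFO', '').split('/') if p])
        environ.setdefault('identity', None)
        return self.app(environ, start_response)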

Ciao,
Gary

Mike Orr

Mar 5, 2006, 8:13:37 PM
to turbo...@googlegroups.com
On 3/5/06, Ben Bangert <gas...@gmail.com> wrote:
> There's several things that lose features if moved purely to
> middleware, like Auth functions, where you might want to do,
> Auth.user_can(permission='edit_people'), even though the environ might
> not contain anything having to do with users or permissions. So while
> WSGI sounds great, and it is, for a lot of things, it definitely has
> its place, and library functions work great for a lot of this as well.

Authentication in middleware is relatively easy. "If user is not
logged in, divert to login page, then take him back where he was."
The middleware can block access to a page or section, or set "hints"
in the environment (essentially an Auth object of some sort). But
ultimately some kinds of authorization will be tied too closely to the
application logic to be feasible with middleware.

> > What also will begin to happen (or at least should happen) is that
> > we'll start using the environ dict in a more expanded fashion. I'm
> > thinking of something akin to Zope3's Interfaces, but not so formal.
> > So, "Identity" middleware will set a "wsgi.Identity" which will contain
> > UserName, Groups, Roles, Permissions, etc. Then, an Authorization
> > check is completely separate from the Identity mechanism. I suspect
> > we'll need something like PEPs just for this (it will have to move far
> > faster than PEPs do - especially initially). Vision: a TurboGears
> > application should run under Zope that has a wsgi interface and may
> > even be able to handle simple permissions.
>
> But why? Why cloud up environ needlessly? A thread-local global with a
> library that any app can use up and down the chain accomplishes the
> same thing, without clouding up environ.

Mainly to standardize the ways in which frameworks and libraries
interact. A robust generic auth library could replace an auth
middleware, but would we get into an "every library does things
differently" situation? Would different auth implementations have a
compatible API so you can switch between them? Would the auth library
be consistent with your other libraries? Why is it so bad to have the
Auth object in the WSGI environment? It has to be somewhere.

> > 1) How to communicate upstream? For instance, if I want to logout, I
> > need to let the Identity middleware know that. Can I set something
> > magic in environ? Do I have to throw an exception?

> > 3) The wsgi standard says that no inspection of wsgi applications is
> > allowed.

What does that mean? That the application can't set an environ flag
("logout") to signal the middleware? If so, that sounds like a bug in
the spec, which a use case like this would expose. The application
may want to handle the logout page/redirect, or it may want to let the
middleware do it, but in any case the middleware may have to do some
behind-the-scenes cleanup. Having an exception means the exception
handler gets to choose what the logout page looks like, and if the
exception handler comes from still a third package, that becomes
problematic.

> Finally, if each controller is a wsgi app, does that mean it has to
> setup the thread-local globals again for the current application? I
> think my main issue with making each controller a wsgi app, is that it
> really isn't. If you consider a WSGI app to be a stand-alone
> application (which I think it should be), then it doesn't make sense.
> Controllers depend on other controllers being where they are, because
> controllers are all part of a single application. WSGI apps don't care
> about anything outside of them-self.

That is a bit of a tension. Quixote normally considers the publisher
+ root directory as persistent objects that last the lifetime of the
server. But WSGI'ifying it creates separate instances for each
request. It doesn't seem to break things in practice because the
overhead is insignificant, the module globals persist (for shared data
that's not tied to a request or session), and the application has to
be multiprocess-safe anyway, but it does mean you're (mis)using the
framework in a way that wasn't intended.

> Does each controller have to parse environ, and setup a request object
> again? There's a lot of setup stuff a framework does when it starts
> handling a WSGI call, is that all going to be pushed into each
> controller?

In Quixote's WSGIServer (I don't remember the download location
offhand), the WSGI interface creates a Request object and calls the
Publisher's usual entry point, so the controller doesn't have to do
anything unusual. WSGIServer is at the same level as the other
servers (SCGIServer, SimpleServer, etc.)


--
Mike Orr <slugg...@gmail.com>
(m...@oz.net address is semi-reliable)

Ben Bangert

Mar 5, 2006, 8:31:07 PM
to TurboGears
Gary Godfrey wrote:
> Fun stuff. BTW, if it didn't come across in my last message, I really
> got a feeling of "drinking the Kool-Aid" during PyCon. I finally "got"
> Paste (and many aspects of TG) in a way that I didn't before. Some of
> the zealotry will likely wear off soon :-).

Yes, once you "get" it, its amazing where the thoughts start roaming.
:)

> The main issue that I have with library functions is that it's not
> (currently) portable across frameworks. I can't do:
> from <current running parent framework> import Identity

Sure you can. Library functions are absolutely portable if they're well
encapsulated. The Python Email library works great regardless of
framework, the WebHelpers package spits out HTML in templates
regardless of template/framework choice, etc. Libraries can absolutely
be portable across frameworks.... if they're designed not to rely on
framework specific environment variables that is. ;)

> So I'm thinking of using environ as a proxy. You can still do calls
> within this scope (just for example):
> environ['Identity'].user_can(permission='edit_people')

Again, is there some reason that won't work as a function? Heck, it's
just a function you put in the environ, isn't it? That already means
you can use it as a library. Why does sticking the function in a dict
that's passed around make it any more framework-portable than a library?

When I heard about Identity being created, I asked at the time why not
make it as a combination. That is, middleware, and functions you use in
the framework. The middleware part of it is what you initialize before
your webapp starts, it also sets up a thread-safe global for use by
Identity functions throughout your application. This makes it easier to
use identity functions since you don't need to keep passing environ all
over the place in your application. Functions can just use a
thread-safe (request-local) global to pull config/objects out.

> I sort of like the idea that environ as the primary thread-safe area to
> use. We need to be _very_ careful about namespace and clutter, but the
> same is true for any namespace.

Sure, though environ is thread-safe mainly because it's a dict.
Cluttering environ is no better than cluttering the namespace, but
thread-safe module globals are a good way for some things to function.
Consider the Identity middleware: if it sets a thread-safe global
as it's called, before it calls your app, you would be able to do this
anywhere inside your webapp:

from identity import app_identity
print app_identity.user_can(permission='edit_people')

This could work because the middleware, during its call, initializes
the thread-local global for itself. Then any function that uses
Identity could import the config and get to the functions and active
setup that it needs. No cluttering of environ, no cluttering of the
namespace, and it's framework-neutral. Middleware ensures that the
request-local module global is present; library functions can then be
used easily anywhere.
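
Under the hood it isn't much more than this (simplified sketch just to
show the shape; as comes up later in the thread, a plain thread-local
isn't quite enough once apps nest):

import threading

_local = threading.local()

class app_identity(object):
    # module-level proxy that library functions import and call
    @staticmethod
    def user_can(permission):
        identity = getattr(_local, 'identity', None)
        return bool(identity) and permission in identity['permissions']

class IdentityMiddleware(object):
    def __init__(self, app, permissions=()):
        self.app = app
        self.permissions = set(permissions)

    def __call__(self, environ, start_response):
        # set the request-local before calling down into the application...
        _local.identity = {'permissions': self.permissions}
        try:
            return self.app(environ, start_response)
        finally:
            # ...and clear it afterwards
            _local.identity = None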

> Cool - I can see a big thing here is documentation. It's going to be
> very easy to get this thing confused! I'm also envisioning that the
> docstrings for these routines should have dot embedded in them. Now,
> how to automagically create a pretty graph which automatically searches
> the source and looks for callbacks and exceptions up the wsgi stack...

Yup, definitely a big thing. Also, having a very handy way to set up
the middleware stack for a webapp is a good idea. In Pylons, the
middleware stack is set up for you and put in the new project's config
data; it looks like this:
http://pylonshq.com/project/pylonshq/browser/Pylons/trunk/pylons/templates/paster_template/%2Bpackage%2B/config/middleware.py_tmpl

Making TurboGears run with full Paste compatibility will lead to a
config file along these lines somewhere anyway; it'd be great to have
a format that's fairly generic and is even in the same place between
frameworks. Regardless of which pieces of middleware are being loaded,
we can then start standardizing on project layouts when possible.

> That's sort of what we have with WSGI, isn't it? The only difference
> is that the method name and additional parameters are pulled from
> environ['PATH_INFO'] rather than passed in parameters. I am thinking
> of making a function or two which will make managing this while doing
> RESTful stuff a bit easier.

Yes, that does work to an extent. I should note here, URL generation is
absolutely important at this point. Every single URL, whether referring
to a static or dynamic section, needs to use URL generation to ensure
the app works (as it needs to add SCRIPT_NAME when needed).

Since no one is going to want to hold environ and be passing it all
over the place during template rendering, etc., it's useful to provide
module-level, request-local access to it.

> Again, still easy to do wsgi-ish (top of head, no error checking):
>
> def __call__(self, environ, start_response):
>     tgEnviron(environ)  # Make sure environment has things in place
>     # lowercase path_info has PATH_INFO split on '/' and cleaned up.
>     if not environ['path_info'][0] in ['Login', 'Logout']:
>         identity.require(environ, identity.in_group('admin'))
>     return self.defaultCall(environ, start_response)  # standard Controller __call__

There's a lot to be said for nice request objects. While environ is
pretty easy to work with, having a nice request object to deal with is
even better. Other than that, I think your example looks slick. Does
every controller call tgEnviron then?

> Nope - only the wsgi Server/Gateway side needs to worry about that.
> environ is thread safe - if we hang everything there, then we're safe.

Hopefully my spiel about request-local module globals convinced you? :)

> You're right - each controller is really WSGI middleware.

Mmm, it's not really middleware either, as middleware assumes it's
between things; it still seems wrong to me to have middleware that
won't work by itself in the middle of some other stack. Consider the
following example:

You have a Blog app, the blog app has a few dozen controllers, one of
them is admin.comments for administrating the comments section as an
admin. How would you be able to use admin.comments in a Finance webapp
stack?

I really don't see how, because your admin.comments controller is
exceptionally dependent on the fact that it's part of your Blog app and
is surrounded by other controllers you wrote it with. If it was truly
WSGI middleware, then it'd be pluggable into any stack you'd want to
use it in.

This is what I meant by abusing terminology. However, if you say that
your controllers use a WSGI interface when called, then you *could* put
in a fully independent wsgi middleware/app, or one of your app's
controllers, and it'd work the same either way. This is mainly a
terminology issue, as we seem to be in agreement that using the WSGI
interface is a "good idea". :)

> Again, I think it's completely valid (and necessary) for Applications
> to make assumptions about upstream middleware. I'm assuming that a TG
> application which is running under a different Server/Gateway would
> have a small piece of middleware which marshals the environ
> appropriately to resemble the normal TG Server environment.

Yup, makes sense. I think it sounds like a great idea, and is very
pluggable indeed. I'd consider using request-local module globals for
things like Identity and some middleware. Paste does this as well, and
it's a pretty useful thing, so that functions in various locations can
get to request-local data that was set up by middleware.

I think it'd be great to standardize on a few things so that, besides
being able to re-use components of frameworks like middleware, we
can re-use other chunks. To an extent, some of the work done during the
sprint went this direction. Given the WSGI stuff you spoke of, I think
there are a few clear things that would be great to standardize on:

request object - Paste's goes a good distance already, RT's does as
well, CP provides the cpg.request, etc. This could even be a small bit
of middleware that your application puts up front and then the rest of
your app can get to it like how you can get to cpg.request from
anywhere. Also, out of all the frameworks I've used, the request object
seems to differ the least.

middleware config - Having a nice place to set up the middleware stack
for your application is a good idea. I'd propose a config/ directory in
the project, with middleware.py. It's worked out quite well for Pylons,
but I'm open to better ideas as well. I should note that we tried a
single config.py file as well, but setting up routes, middleware
stacks, etc. all in a single file became a real chore.

Hmm, now that I think about it, TG decorators could be quite portable
as well if they could assume there was a standard request object
available, and it was known that the method content was being returned
via the WSGI interface....

Some thoughts anyways. :)

- Ben

Ben Bangert

Mar 5, 2006, 8:41:35 PM
to TurboGears
Mike Orr wrote:
> > But why? Why cloud up environ needlessly? A thread-local global with a
> > library that any app can use up and down the chain accomplishes the
> > same thing, without clouding up environ.

> Mainly to standardize the ways in which frameworks and libraries
> interact. A robust generic auth library could replace an auth
> middleware, but would we get into an "every library does things
> differently" situation? Would different auth implementations have a
> compatible API so you can switch between them? Would the auth library
> be consistent with your other libraries? Why is it so bad to have the
> Auth object in the WSGI enviroment; it has to be somewhere.

Perhaps that was miscommunicated. When I refer to a library, I merely
mean that there's a module of functions that you use within your
webapp. So an Identity framework package would come with a group of
functions and a WSGI middleware app.

1) You stick the middleware in front of the application you want to use
Identity with.
2) You configure the Identity middleware to point to your db, etc, and
any other config it needs.
3) Inside your webapp, you import Identity.functions, or whatever it's
called, and use the functions.

This could work because the Identity middleware sets up request-local
globals for use by functions that want to use data that the Identity
middleware set up when it was called. The Auth object does have to be
somewhere; it's just exceptionally irritating, and a massive 'repeat
yourself', to have to pass it all over the place within your app.

That's why I use request-local module globals so that functions can get
to the Auth object, without forcing the developer to hang onto an
object that's needed all over the place by tons of functions.

Hopefully that clarifies what I meant with 'library' vs purely
middleware. Either way you have the object and functions, and the
setup. The way I'm referring to means you save yourself the effort of
passing the environ all over the place.

Cheers,
Ben

Mike Orr

Mar 5, 2006, 9:45:02 PM
to turbo...@googlegroups.com
On 3/5/06, Ben Bangert <gas...@gmail.com> wrote:
>

A middleware bundled with library components (importable functions)
certainly makes sense, since it gives more flexibility to the
controller. I was not complaining about a WSGI-library hybrid, but
about an idiosyncratic library that avoided WSGI for no good reason.
The 'email' library in another message, presented as superior to a WSGI
alternative, is not a relevant example: (1) the 'email' package doesn't
do anything web-specific, and (2) it doesn't carry state across requests
or sessions. Therefore it has no reason to be middleware. But anything
that's related to identity or sessions or redirects or page wrapping
or cookies etc should probably be WSGI'ified unless there's a good
reason not to. "Useful in only one application or framework" would be
a good reason not to. "Not reinventing the wheel (even if we already
have)" would be a good reason to.

And I can see the point for turbogears.request.wsgi [1] or such so
you don't have to pass the environment around everywhere. The point
of middleware is to be framework-neutral, not to force everything
through the environment.

[1] I'm assuming cherrypy.request will go away someday, since it
notionally ties us to a package we're not 100% happy with, and using a
fake 'cherrypy' module for compatibility screams of kludge.

Ben Bangert

Mar 5, 2006, 10:37:43 PM
to TurboGears
Mike Orr wrote:
> A middleware bundled with library components (importable functions)
> certainly makes sense, since it gives more flexibility to the
> controller. I was not complaining about a WSGI-library hybrid, but of
> an idiosyncratic library that avoided WSGI for no good reason. The
> 'email' library in another message, presented as superior to a WSGI
> alternative, is not an example. (1) the 'email' package doesn't do
> anything web specific, (2) it doesn't carry state across requests or
> sessions. Therefore it has no reason to be middleware. But anything
> that's related to identity or sessions or redirects or page wrapping
> or cookies etc should probably be WSGI'ified unless there's a good
> reason not to. "Useful in only one application or framework" would be
> a good reason not to. "Not reinventing the wheel (even if we already
> have)" would be a good reason to.

Indeed, the email library was a poor choice for an example.

> And I can see the point for turbogears.request.wsgi [1] or such so
> you don't have to pass the environment around everywhere. The point
> of middleware is to be framework-neutral, not to force everything
> through the environment.

Yup, though if TG is increasing Paste compatibility at the same time,
some Paste middleware, perhaps similar to the paste.deploy.config
middleware, could be used to 'register' such thread-locals (mainly
because it can be tricky to clean up at the end of a wsgi app request).

Lots of places for re-use. :)

- Ben

Gary Godfrey

Mar 6, 2006, 12:24:16 AM
to TurboGears
Ben Bangert wrote:
> Again, is there some reason that won't work as a function? Heck, its
> just a function you put in the environ, isn't it? That already means
> you can use it as a library. Why does sticking the function in a dict
> thats passed around make it any more framework-portable than a library?

Maybe I'm not quite understanding something here. Let's suppose I
have a wsgi stack like (apologies if diagram doesn't make it through):

== SERVER ==
= Virtual Host Splitter =

= ldap ID = mysql ID =
= TG Blog = TG Blog =

OK, so we have two different web sites hosted virtually on a single
WSGI Server. The TG blog code needs to call either the LDAP ID
mechanism or the mysql ID mechanism depending on which path was taken.
So, the "import Identity" used inside TG Blog needs to mean different
things depending on which one is called. How would that work?

> Sure, though environ is thread-safe mainly because its a dict.
> Cluttering environ is no better than cluttering the namespace, but
> thread-safe module globals are a good way for some things to function.

See, "thread-safe module globals" always scare me. Every time you have
to look up the current thread and map that to the right thread-safe
value is a potential for screwing things up. That should only be done
rarely, ideally not by everyone who implements a feature.

(getting late - will respond more on the morrow).

Cheers,
Gary

Ben Bangert

Mar 6, 2006, 12:55:24 AM
to TurboGears
Gary Godfrey wrote:
> == SERVER ==
> = Virtual Host Splitter =
>
> = ldap ID = mysql ID =
> = TG Blog = TG Blog =
>
> OK, so we have two different web sites hosted virtually on a single
> WSGI Server. The TG blog code needs to call either the LDAP ID
> mechanism or the mysql ID mechanism depending on which path was taken.
> So, the "import Identity" used inside TG Blog needs to mean different
> things depending which one is called. How would that work?

Pretty easy really. When the ID is called, it initializes the
thread-local. That way, when you do import Identity from TG Blog, it
looks at the thread-local, which was set up by ID. When you set up each
ID, you of course tell it how it's going to do the lookup, which it
saves. Then when it's called, it looks at how it was initialized, loads
it up, sets the thread-local, and proceeds.

class ID(object):
    def __init__(self, app, auth_lookup='file'):
        self.app = app
        self.auth_scheme = auth_lookup

    def __call__(self, environ, start_response):
        # pull our self.auth_scheme, set up the thread-local, etc.
        return self.app(environ, start_response)

Then you wrap it...

myapp = ID(myapp, auth_lookup='ldap')
etc.

So when you wrap the app, you configure its lookup. That way before it
calls your TG Blog, it sets up the same lookup scheme you initialized
it with. This is how most current middleware functions and remembers
its configuration during call.

> See, "thread-safe module globals" always scare me. Every time you have
> to look up the current thread and map that to the right thread-safe
> value is a potential for screwing things up. That should only be done
> rarely, ideally not by everyone who implements a feature.

Oh, it's not just a thread-safe module global... :) Keep in mind that
your app might call a differently configured version of the same app
farther down the chain. That app would then blast away your thread-safe
module global, and when it's done, the app farther up the chain would be
left using data that wasn't its own.

So not only does it need to be thread-safe, it needs to implement a
stack as well, so that you push your object on and pop it off when
you're done. RhubarbTart got such a stacked thread-safe object during
the PyCon sprint. I've talked with Ian about putting an object like
this in Paste, so that when you need it, you can use that one instead
of making your own version (which may or may not work). Also, a small
bit of middleware that ensures the object is pushed and popped helps
out.

The complex issues are taken care of, and using it becomes fairly easy.
Again, Paste already has something like this, so given the code in
RhubarbTart, and already in Paste, it should be fairly trivial to make
a GlobalRegistry style middleware you could use like so:

app = GlobalRegistry(app)
app = ID(app, auth_lookup='ldap')

Then in our ID, which we require the user to run underneath a
GlobalRegistry middleware:
#identity.py
from paste.globalregistry import StackedObject
myid = StackedObject()

class ID(object):
    def __init__(self, app, auth_lookup='file'):
        self.app = app
        self.auth_scheme = auth_lookup

    def __call__(self, environ, start_response):
        # pull our self.auth_scheme, set up the thread-local, etc.
        id = ConfigureAndSetupForIdentity()
        environ['paste.globalregistry'].register(myid, id)
        return self.app(environ, start_response)

So the GlobalRegistry takes care of pushing and popping your
thread-locals, and now you can 'from identity import myid' anywhere in
your webapp, and it has the current id object. You don't have to worry
about the WSGI and thread/request-local nuances, and middleware takes
care of ensuring it's all working right in a totally portable fashion.

I should also mention that putting things in environ doesn't solve this
problem. Just imagine the same scenario, only you're storing your ID
object in environ. If one of your apps down the chain uses ID as well,
it will blow the prior ID object in environ away... so regardless of
whether it's a thread-local or in environ, it needs to be a
stacked-style object.
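
Roughly what I mean by a stacked object (sketch only, not the actual
Paste/RhubarbTart code):

import threading

class StackedObject(object):
    def __init__(self):
        self._local = threading.local()

    def _stack(self):
        if not hasattr(self._local, 'stack'):
            self._local.stack = []
        return self._local.stack

    def push(self, obj):
        self._stack().append(obj)

    def pop(self):
        return self._stack().pop()

    def current(self):
        return self._stack()[-1]    # what downstream functions read

The registry middleware pushes in its __call__ and pops in a finally,
so a nested, differently configured app lower in the chain stacks its
own object on top instead of clobbering yours.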

Hopefully that helps clear this all up some. I'm looking over Paste
code now and hopefully will have some middleware along these lines
working tomorrow so we can play with it. :)

Cheers,
Ben

Mike Orr

Mar 6, 2006, 1:00:44 AM
to turbo...@googlegroups.com
On 3/5/06, Gary Godfrey <ggodfre...@io.com> wrote:
>
> Ben Bangert wrote:
> > Again, is there some reason that won't work as a function? Heck, its
> > just a function you put in the environ, isn't it? That already means
> > you can use it as a library. Why does sticking the function in a dict
> > thats passed around make it any more framework-portable than a library?
>
> Maybe I'm not quite understanding something here. Let's suppose I
> have a wsgi stack like (apologies if diagram doesn't make it through):
>
> == SERVER ==
> = Virtual Host Splitter =
>
> = ldap ID = mysql ID =
> = TG Blog = TG Blog =
>
> OK, so we have two different web sites hosted virtually on a single
> WSGI Server. The TG blog code needs to call either the LDAP ID
> mechanism or the mysql ID mechanism depending on which path was taken.
> So, the "import Identity" used inside TG Blog needs to mean different
> things depending which one is called. How would that work?

If the WSGI manager can activate a different middleware stack
depending on the URL, there would be no interference. I don't
remember if Paste can do this. If
there's one global middleware stack, and if the config file can handle
host-specific configurations the way it does path-specific
configurations now, you'd have two options:

1) Use a robust middleware that can handle multiple backends and
switch per request. Put the connection info in the config file for
each host. The middleware may want to use a connection pool bla bla
bla.

2) Use two stupid middlewares that can be disabled in the
configuration. Activate one for one host, and the other for the other
host. This is similar to Apache's "LoadModule" and "ModAuth Off".
The middleware is running but will pass the request/response unchanged
if disabled.

Of course, if there's no interaction between the sites, and they're on
different IPs or ports, there's no need to multiplex them in one
process at all.


> > Sure, though environ is thread-safe mainly because its a dict.
> > Cluttering environ is no better than cluttering the namespace, but
> > thread-safe module globals are a good way for some things to function.
>
> See, "thread-safe module globals" always scare me. Every time you have
> to look up the current thread and map that to the right thread-safe
> value is a potential for screwing things up. That should only be done
> rarely, ideally not by everyone who implements a feature.

That's already been done. 'cherrypy.thread_data' is a dumping ground
for any per-thread objects you need. 'cherrypy.request' is also local
to the thread, so 'cherrypy.request.wsgi_environ' would be too.

Gary Godfrey

Mar 6, 2006, 3:41:24 PM
to TurboGears
Ben Bangert wrote:
> Pretty easy really. When the ID is called, it initializes the
> thread-local. That way when you do import Identity from TG Blog, it
> looks at the thread-local, which was setup by ID. When you setup each
> ID, you of course tell it how its going to do the lookup, which it
> saves. Then when its called, it looks at how it was initialized, loads
> it up, sets the thread-local, and proceeds.
>

Sorry - I should have included the Authorization part of the diagram
(was assuming it was there). Remember this (slight change)?

def __call__(self, environ, start_response):
    tgEnviron(environ)  # Make sure environment has things in place
    if not environ['path_info'][0] in ['Login', 'Logout']:
        environ['authorization'].check(role='Admin')
    return self.defaultCall(environ, start_response)

It needs to be able to call the *right* version from deep inside the
code. You're right: wrapping is easy for the ID part. I'm making a
distinction between getting the user ID (Identity) and checking the
permissions (Authorization). Authorization can be done deep down in
the code and may require a direct call to the particular Identity
middleware. I guess I see it this way:

Environ version:

ID: set environ['authorization'] to an ID-local class (self, most
likely).
Auth: call environ['authorization'].method to check.

Library version:

Make sure identity.py is installed by any ID-conforming modules.
ID: allocate a new thread-safe location and put self there.
Auth: call an identity.py function which will (given the current
thread ID) look up the ID object and return it. Call the auth method.

Somehow, you're implying that there's an "identity.py" installed in
site-packages that these two different ID packages are sharing (it
could be that LDAP ID and MySQL ID are part of completely different
packages). Who installs it? Do both of them have an "identity.py" in
their install packages? I'm just saying that it's easier to share a
namespace (politically) than it is to share code.

The library version just seems to add a whole lot of complexity. It
adds the requirement that there's a common module shared among
all ID providers (basically you're using the library namespace rather
than the environ namespace), it forces creation of yet another
thread-safe region (which I think takes a lock under the covers), and
it just slows things down. And I'm not sure what it buys you. Either
method can be used stacked or unstacked.

> Oh, it's not just a thread-safe module global.... :) Keep in mind that
> your app might call a differently configured version of the same app
> farther down the chain. That app would then blast away your thread-safe
> module global, and when it's done, the app farther in the chain would be
> left using data that wasn't its own.

You're right that it could be that a lower-level Identity module may
blow away what was above, but that's what you'd want, right? Whoever
is last on the stack has control.

> [ neat GlobalRegistry stuff deleted]

I like the GlobalRegistry idea - I just don't think you can get all the
Frameworks to agree on it. I'm mostly thinking small and subversive,
here :-). BTW, to implement the globalregistry, I'd just use
environ['globalregistry'].

> I should also mention that putting things in environ doesn't solve this
> problem. Just imagine the same scenario, only you're storing your ID
> object in environ. If one of your apps down the chain uses ID as well,
> it will blow the prior ID object in environ away....

The question is: what's wrong with this? That's the behaviour that
you'd expect; the only time you'd want to go back up is with an
exception, and the local variables will be fine to store the ID state
for a redirect. But I can see cases where you'd want a stack. But
that's tangential to the environ/library thing.

> so regardless of
> if its a thread-local or in environ, it needs to be a stacked style
> object.

Just curious, though: when do you ever pop?

Cheers,
Gary

Ben Bangert

unread,
Mar 6, 2006, 4:13:01 PM3/6/06
to TurboGears
Gary Godfrey wrote:
> It needs to be able to call the *right* version from deep inside the
> code. You're right: wrapping is easy for the ID part. I'm making a
> distinction between getting the user ID (Identity) and checking the
> permissions (Authorization). Authorization can be done deep down in
> the code and may require a direct call to the particular Identity
> middleware. I guess I see it this way:
>
> Environ version:
> ID: set environ['authorization'] to ID local class (self most
> likely).
> Auth: call environ['authorization'].method to check.
>
> Library version:
> Make sure identity.py is installed by any ID conforming modules.
> ID: allocate new thread-safe location and put self there.
> Auth: Call identity.py function which will (Given current thread
> ID), lookup ID object and return with it. Call the auth method.

Errr, I'm pretty sure at this point there's a significant amount of
confusion and misconception around this. I'm not talking about a purely
library version; it's a middleware piece with functions you use in your
app. The whole thing is in a single Python package.

> Somehow, you're implying that there's an "identity.py" installed in
> site-packages that these two different ID packages are sharing (it
> could be that LDAP ID and MySQL ID are part of completely different
> packages). Who installs it? Do both of them have a "identity.py" in
> their install packages? I'm just saying that it's easier to share a
> namespace (politically) than it is to share code.

Err, no, there is an Identity site-package installed. There aren't two
different ID packages sharing it; there are two differently configured ID
middleware instances.

> The library version just seems to add a whole lot of complexity. It
> adds the requirement that there's a common module that is shared among
> all ID providers (basically you're using the library namespace rather
> than the environ namespace), it forces creation of yet another
> thread-safe region (which I think does a lock under the covers), and it
> just slows things down. And I'm not sure what it buys you. Either
> method can be used stacked or unstacked.

It buys you an insane amount of convenience and a massive boost in
usability. This is why Pylons does it, it's why CherryPy does it, it's
why the TG WSGI branch does it, etc. Everyone's doing it because it's a
"good idea" in both practice and theory. :)

> I like the GlobalRegistry idea - I just don't think you can get all the
> Frameworks to agree on it. I'm mostly thinking small and subversive,
> here :-). BTW, to implement the globalregistry, I'd just use
> environ['globalregistry'].

That defeats the entire purpose of the global registry...

> > I should also mention that putting things in environ doesn't solve this
> > problem. Just imagine the same scenario, only you're storing your ID
> > object in environ. If one of your apps down the chain uses ID as well,
> > it will blow the prior ID object in environ away....
>
> The question is: what's wrong with this? That's the behaviour that
> you'd expect; the only time you'd want to go back up is with an
> exception, and the local variables will be fine to store the ID state
> for a redirect. But I can see cases where you'd want a stack. But
> that's tangential to the environ/library thing.

You're seeing middleware as something that just handles data on the way
down. Middleware can wrap replies and alter data *on the way back up*.
This is why it's important not to destroy things farther down the chain.
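
A tiny sketch of what "on the way back up" means in practice (nothing
framework-specific here; the middleware rewrites the body and headers
*after* the wrapped app has produced them):

    class FooterMiddleware(object):
        """Runs the wrapped app, then alters the response on the way back up."""

        def __init__(self, app, footer=b'\n<!-- served by FooterMiddleware -->'):
            self.app = app
            self.footer = footer

        def __call__(self, environ, start_response):
            captured = {}

            def capture(status, headers, exc_info=None):
                captured['status'] = status
                captured['headers'] = headers
                return lambda data: None   # sketch: ignore the legacy write() callable

            body = b''.join(self.app(environ, capture)) + self.footer
            headers = [(k, v) for k, v in captured['headers']
                       if k.lower() != 'content-length']
            headers.append(('Content-Length', str(len(body))))
            start_response(captured['status'], headers)
            return [body]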

> Just curious, though: when do you ever pop?

The middleware pops it. PasteDeploy comes with configuration middleware
that does this, if you're curious what an implementation looks like:
http://pythonpaste.org/deploy/paste/deploy/config.py.html?f=131&l=173#131

That ensures that the configuration is popped.
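
Stripped of the Paste-specific details, the pattern looks something like
this (a sketch, not the code behind that link):

    import threading

    _local = threading.local()

    def current_config():
        """What application code imports: whatever the innermost middleware pushed."""
        return _local.stack[-1]

    class ConfigStackMiddleware(object):
        def __init__(self, app, config):
            self.app = app
            self.config = config

        def __call__(self, environ, start_response):
            if not hasattr(_local, 'stack'):
                _local.stack = []
            _local.stack.append(self.config)      # push on the way down
            try:
                return self.app(environ, start_response)
            finally:
                _local.stack.pop()                # pop on the way back up, so an
                                                  # outer instance's config reappears

(The real middleware also has to cope with the response iterable outliving
the call, which the linked code handles; this sketch ignores that.)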

The desire to stick everything into environ is a very common one when
first getting into WSGI. There are significant limitations to that
approach, which is why everyone ends up using thread-locals.
Typical reasons why thread-locals are a good idea:
- You don't need to pass environ all over the place
- No need to keep sticking more and more keys in environ when they're
for use in the current space, not farther down the chain
- Helper functions for templates and controllers will likely need
access to configuration data. Being able to import a module global
that's a request-local means your templates don't need to keep passing
this data in (which saves massive repetition)

Frameworks don't need to agree on the GlobalRegistry; it's middleware,
and anyone can use it or not. The Identity middleware could use it, and no
WSGI-capable framework would need to be any the wiser. This is why
middleware is good stuff: if someone makes a more
robust/reliable/fancier GlobalRegistry middleware, it's pretty trivial
to swap it in/out.

Hopefully this helps clear some things up, otherwise I'd suggest we
have a brainstorm session on IRC or something. :)

- Ben

Ian Bicking

unread,
Mar 7, 2006, 2:10:55 PM3/7/06
to turbo...@googlegroups.com
Mike Orr wrote:
>>OK, so we have two different web sites hosted virtually on a single
>>WSGI Server. The TG blog code needs to call either the LDAP ID
>>mechanism or the mysql ID mechinism depending on which path was taken.
>> So, the "import Identity" used inside TG Blog needs to mean different
>>things depending which one is called. How would that work?
>
>
> If the WSGI manager can activate a different middleware stack
> depending on the URL, there would be no interference. I don't
> remember if Paste can do this. If
> there's one global middleware stack, and if the config file can handle
> host-specific configurations the way it does path-specific
> configurations now, you'd have two options:
>
> 1) Use a robust middleware that can handle multiple backends and
> switch per request. Put the connection info in the config file for
> each host. The middleware may want to use a connection pool bla bla
> bla.

At least in Paste there is no global middleware stack; instead you set
up a fairly specific stack that can potentially be more complex than
just a linear top-to-bottom. Well, it's *usually* more complex.

So...

> 2) Use two stupid middlewares that can be disabled in the
> configuration. Activate one for one host, and the other for the other
> host. This is similar to Apache's "LoadModule" and "ModAuth Off".
> The middleware is running but will pass the request/response unchanged
> if disabled.

... here you could do a couple of different things. "Auth" is a bad term,
because I don't know exactly what you are thinking of, authorization or
authentication. But let's say you want to authorize people using some
middleware egg:turbogears#identity (using Paste entry point
terminology), but in a particular part of your application you want to
additionally authenticate using IP-based authentication -- not as
trustworthy, and not a replacement for a "real" authentication system, but
useful.

So, in this setup we put TG's identity in front of everything, but put
an additional middleware in front of /backend that logs anyone from the
local network in as "local_admin":

[filter:identity]
use = egg:TurboGears#identity
provider = some-info-provider

[composite:main]
use = egg:Paste#urlmap
/ = myapp
/backend = backend-app
filter-with = identity

[app:myapp]
use = egg:MyApp
config values...

[app:backend-app]
use = egg:BackendAdmin
config values...
filter-with = ip-auth

[filter:ip-auth]
use = egg:Paste#grantip
192.168.0.0/24 = local_admin
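
To make the wiring explicit: with a config like that saved somewhere (the
filename below is made up), the whole composite can be loaded and served
with PasteDeploy and Paste's built-in server, along these lines:

    from paste.deploy import loadapp
    from paste import httpserver

    # 'site.ini' is a hypothetical filename holding the sections shown above.
    app = loadapp('config:site.ini', relative_to='.')
    httpserver.serve(app, host='127.0.0.1', port=8080)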

--
Ian Bicking / ia...@colorstudy.com / http://blog.ianbicking.org

Ian Bicking

unread,
Mar 7, 2006, 2:12:58 PM3/7/06
to turbo...@googlegroups.com
Ben Bangert wrote:
>>And I can see the point for turbogears.request.wsgi [1] or such so
>>you don't have to pass the environment around everywhere. The point
>>of middleware is to be framework-neutral, not to force everything
>>through the environment.
>
>
> Yup, though if TG is increasing Paste-compatibility at the same time,
> some Paste middleware perhaps similar to the paste.deploy.config
> middleware could be used to 'register' such thread-locals (mainly
> because it can be tricky to clean-up at the end of a wsgi app request).

I wasn't tracking this discussion as it happened, but threadlocal stuff
came up a lot here. After some discussion with Ben on IRC, I'm thinking
that one thread-local WSGI environment can satisfy nearly every use
case, in a fairly nice way.

So let's say that foo.wsgi_environ is a threadlocal variable that points
to the current WSGI environment for this request. It is set up with
WSGI middleware, and stacked for the case of multiple stacked requests
(like when doing an internal redirect). Basically, we make sure it is
accurate, and if there's no current request it bails out with an
exception. (cherrypy.request actually sticks around after the request,
but rhubarbtart.request doesn't -- RT's CP compatibility copies CP's
behavior, though)

Anyway, given that -- in one, well-known and importable location (and
yes, with all the problems that implies) -- you can build up all sorts
of other objects in a way that feels fairly safe to me.

For instance:

import foo

class Request(object):
    def __init__(self, wsgi_environ=foo.wsgi_environ):
        self.environ = wsgi_environ

    @property
    def params(self):
        if 'myframework.request.params' not in self.environ:
            self.environ['myframework.request.params'] = \
                parse_params(self.environ)
        return self.environ['myframework.request.params']

    # ... and other methods ...

request = Request()


The idea being that *no* state is kept in these objects -- they put all
state into the environment. Each object is a kind of proxy around the
request object. The proxy can be thicker (like in this example), or
pretty thin (e.g., put the actual object in the environment, and then
create an object that proxies all method access).

Though there is one import problem -- where to keep this threadlocal
wsgi_environ -- all the other import problems aren't an issue. If
objects are careful about checking state in the environment, they can
see changes that come about from other frameworks. (In this example I
am *not* being careful, but in paste.request.parse_formvars() I am more
careful)

There's still some open issues in my mind. For example, I think these
kinds of objects should probably have a .copy() method or something that
gets a concrete dictionary instead of the threadlocal, so that you can
get a version of the object that won't disappear under your feet (once
the request has finished). But the basic pattern feels good to me.
Except for the import problem :-/ -- though a clever middleware could
probably set multiple threadlocals to match the environment, with the
only interaction through a key in the environment itself.
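
For instance, building on the Request sketch above, the .copy() idea could
look something like this (again just a sketch, not a real API):

    class DetachableRequest(Request):
        def copy(self):
            # Snapshot the threadlocal environ into a plain dict so the
            # returned object keeps working after the request has finished.
            return DetachableRequest(wsgi_environ=dict(self.environ))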

Mike Orr

unread,
Mar 7, 2006, 10:03:55 PM3/7/06
to turbo...@googlegroups.com
On 3/7/06, Ian Bicking <ia...@colorstudy.com> wrote:
> At least in Paste there is no global middleware stack; instead you set
> up a fairly specific stack that can potentially be more complex than
> just a linear top-to-bottom. Well, it's *usually* more complex.

You can have a non-linear stack? How is that possible?

> > 2) Use two stupid middlewares that can be disabled in the
> > configuration. Activate one for one host, and the other for the other
> > host. This is similar to Apache's "LoadModule" and "ModAuth Off".
> > The middleware is running but will pass the request/response unchanged
> > if disabled.
>
> ... here you could do a couple different things. "Auth" is a bad term,
> because I don't know exactly what you are thinking of, authorization or
> authentication.

Sorry, I was just making an analogy. In Apache you load a module
globally, then enable/disable it for a particular URL subset. The
module I was thinking of is mod_auth, and the directive
"AuthAuthoritative Off" (not "ModAuth Off"). Although
AuthAuthoritative doesn't fully disable it like I was thinking, just
partially. But you get the idea.

Ian Bicking

unread,
Mar 7, 2006, 10:56:14 PM3/7/06
to turbo...@googlegroups.com
Mike Orr wrote:
> On 3/7/06, Ian Bicking <ia...@colorstudy.com> wrote:
>> At least in Paste there is no global middleware stack; instead you set
>> up a fairly specific stack that can potentially be more complex than
>> just a linear top-to-bottom. Well, it's *usually* more complex.
>
> You can have a non-linear stack? How is that possible?

In the example I gave it was something like:

       TG Identity
            |
            v
          urlmap
        |        |
        v        v
       "/"  "/backend/"
        |        |
        v        |
   MyPackage     |
                 |
                 v
              grantip
                 |
                 v
            Backend App

--
Ian Bicking | ia...@colorstudy.com | http://blog.ianbicking.org
