Extending the URL Dispatcher (GSoC 2015 proposal)


Marten Kenbeek

Mar 2, 2015, 11:57:24 AM
to django-d...@googlegroups.com
Hey all,

I'm working on a proposal to extend the URL dispatcher. Here, I'd like to provide a quick overview of the features I propose. 

I'd like to:
- Allow matching based on request attributes such as the subdomain or protocol, and business logic such as the existence of a database object.
- Make middleware configurable for a subset of views. It should be easy to add, reorder or replace middleware at any level in the (currently recursive) matching algorithm. 
- Provide conventions for common patterns, such as an easy-to-configure URL router for all generic Model views. For generic views, this should be a one-liner. For custom views and other non-default options, this should still be relatively easy to configure compared to writing out all patterns. 

In the process, I'd like to formalize some classes used in the dispatcher. Currently, the RegexURLPattern and RegexURLResolver classes provide most of the functionality of the URL dispatcher. By abstracting these classes, and factoring out the loading mechanism and some other internals, I hope to provide an extensible dispatching framework for third-party apps.

The full proposal, which is still incomplete, can be found at https://gist.github.com/knbk/325d415baa92094f1e93 if you want more details. It currently contains a slightly more in-depth discussion of the current dispatcher and the proposed features, and a start on the redesign of the dispatcher.

I'm mostly looking for some feedback on the general direction of these features, though any feedback is welcome. I'm still working on it, so details might change based on new insights.

Thanks,
Marten

Tim Graham

Mar 2, 2015, 12:14:30 PM
to django-d...@googlegroups.com
Hi Marten,

I think it would be helpful to motivate this with some pseudocode of specific use cases you are aiming to solve. Have you looked into whether there are any third-party projects related to your ideas from which you could draw inspiration?

Tim

Marc Tamlyn

Mar 2, 2015, 2:20:20 PM
to django-d...@googlegroups.com
A collection of thoughts:

I think allowing the url dispatcher to inspect the database for the existence of certain objects is potentially somewhat dangerous. However, good support for a view raising a "continue resolving" exception along the lines of https://github.com/jacobian-archive/django-multiurl might be interesting. Related to this, a check for potentially conflicting url mappings could be interesting.

Middleware currently has complex and unintuitive behaviour in the event of exceptions. You talk about the middleware/decorator split, but not how to make either make sense.

Supporting generic sets of views has some logic, although in my experience it is extremely rare that you can sensibly use a generic view with no alterations at all - it almost always needs extra context or some other tweaks. I'm not really convinced that a one liner to get CRUD for a particular model will actually be that useful in the wild - you're likely to end up changing too many things. I don't find the "one line in a urlconf for each view" convention to be particularly problematic, however writing all the regexes is potentially more prone to problems.

If you are intending on introducing alternative URL resolvers, some example ideas would be needed. The lack of a consistent way to reverse a slug for example is a good idea to address, but we need to establish how.

How are you intending to support different resolvers in the same project? It seems to me that it is rather inefficient in large projects to loop through all resolvers for all urls.

Namespacing urls is currently over complex for the 90% use case, and the docs are hard to understand as a result. Alternative designs in this area could be interesting.

Overall there are lots of interesting starts of ideas here, but I feel one or two dead ends. It's a potentially very varied project and the crux of the proposal needs to focus on ensuring that some specific tasks are well designed and achievable, with others being extensions later on.

Marc

--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/34ebf9ed-9961-4b33-9f49-5e6a4f9c6469%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Curtis Maloney

Mar 2, 2015, 8:56:39 PM
to django-d...@googlegroups.com
On 3 March 2015 at 03:57, Marten Kenbeek <marte...@gmail.com> wrote:
> Hey all,
>
> I'm working on a proposal to extend the URL dispatcher. Here, I'd like to provide a quick overview of the features I propose.
>
> I'd like to:
> - Allow matching based on request attributes such as the subdomain or protocol, and business logic such as the existence of a database object.

There was a "continue resolving" sort of exception proposed/implemented that would obviate this, allowing the logic to remain in views [or view decorators]... a much simpler solution, IMHO.
 
> - Make middleware configurable for a subset of views. It should be easy to add, reorder or replace middleware at any level in the (currently recursive) matching algorithm.

This has certainly been on the "wanted" list for many years now, however I expect it would require the middleware re-design that so far has proven too invasive to land.

That said, providing the "new" middleware-as-wrapper interface around url patterns lists could be a good stepping stone to eventually removing the existing middleware API.
 
> - Provide conventions for common patterns, such as an easy-to-configure URL router for all generic Model views. For generic views, this should be a one-liner. For custom views and other non-default options, this should still be relatively easy to configure compared to writing out all patterns.

Are you talking about pre-cooked url patterns for the various CBV?  Or plugin routers for groups of CBV?  I'm certainly in favour of some tool that makes it easier to express "common" regex matches [satisfying the "protect from the tedium" rule of frameworks]
 

> In the process, I'd like to formalize some classes used in the dispatcher. Currently, the RegexURLPattern and RegexURLResolver classes provide most of the functionality of the URL dispatcher. By abstracting these classes, and factoring out the loading mechanism and some other internals, I hope to provide an extensible dispatching framework for third-party apps.

As mentioned elsewhere, I would very much like to see a resolver system based on the "parse" library [essentially, the inverse of str.format - https://pypi.python.org/pypi/parse], and to do so would indeed require some formal analysis / documentation of the existing resolver architecture.
 
--
Curtis

Marten Kenbeek

Mar 3, 2015, 8:37:46 PM
to django-d...@googlegroups.com
First of all, thanks for the feedback. There are some good points here that I hadn't thought about.

Behaviour similar to a `ContinueResolving` exception is one of the things I was aiming at. However, I can't see how to raise this in a view while maintaining backwards compatibility with view middleware. E.g. the CsrfViewMiddleware might have bailed out before the first view was executed, while the second view might be csrf exempt. Executing the view middleware twice might cause problems as well.

As for alternate url resolvers: they'll provide the same functions (`resolve` and `reverse`) that are required for any new-style resolvers, so they can be freely mixed and replaced, for example:

urlpatterns = patterns('',
    SubdomainResolver('accounts.', include('some.url.config.urls'), decorators=[login_required]),
    url(r'^$', 'my.view.function'),
    ModelRouter(r'^mymodel/', model=MyModel, field='some_field',
        views={
            'detail': MyDetailView.as_view(),
            'list':   MyListView.as_view(),
            'edit':   MyUpdateView.as_view(),
            'new':    MyCreateView.as_view(),
            'delete': None,
        }
    ),
)

As for the model router: a real-world example can be as simple as this. You can either specify a field and based on the field, `detail` and `edit` will accept only numbers or strings or whatever, or you can specify your own regex that is repeated for all single-object views. The magic is that the model router will automatically reverse the url based on a model or model instance if that model uses a model router, similar to how `get_absolute_url` provides the same functionality for a single view per model.
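
To make the reversing idea concrete, here is a minimal sketch in plain Python with no Django imports. The `ModelRouter` class shown here, its `reverse` signature and the `Article` stand-in are hypothetical names chosen for illustration; the proposal does not fix this API.

```python
class Article:
    """Stand-in for a Django model; hypothetical, for illustration only."""

    def __init__(self, pk, slug):
        self.pk = pk
        self.slug = slug


class ModelRouter:
    """Hypothetical router that builds URLs from a model instance."""

    def __init__(self, prefix, model, field="pk"):
        self.prefix = prefix.strip("/")
        self.model = model
        self.field = field  # attribute used to identify a single object

    def reverse(self, action, instance=None):
        # Collection-level views need no object identifier.
        if action in ("list", "new"):
            suffix = "" if action == "list" else "new/"
            return "/%s/%s" % (self.prefix, suffix)
        if not isinstance(instance, self.model):
            raise ValueError("expected a %s instance" % self.model.__name__)
        ident = getattr(instance, self.field)
        suffix = "" if action == "detail" else action + "/"
        return "/%s/%s/%s" % (self.prefix, ident, suffix)


article = Article(pk=7, slug="hello-world")
by_pk = ModelRouter("articles", Article, field="pk")
by_slug = ModelRouter("articles", Article, field="slug")

# The calling code is identical; only the router configuration differs.
print(by_pk.reverse("detail", article))
print(by_slug.reverse("edit", article))
```

Switching the identifying field from `pk` to `slug` only touches the router configuration, not the call sites that reverse the URL.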

However, looking at Ruby on Rails and django-rest-framework, routers are tightly coupled with controllers/viewsets, which provide all the views for a specific model. Django's own `ModelAdmin` combines the two, though it is more of a router-and-controller-and-much-more in one. Given that coupling, it might be warranted for a possible viewset to simply implement the `resolve` and `reverse` methods for its contained views.

The `decorators` parameter above also shows what I had in mind in that regard. After looking for concrete examples, I can't seem to think of a proper use case for the same behaviour for middleware, at least not with Django's built-in middleware. For decorators, though, it's a great addition, if there is a proper way to add, reorder or remove inherited decorators (reorder mostly because of csrf_exempt).
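
A rough sketch of what inheriting and removing decorators could look like, assuming decorators are kept as plain lists per include level. The helpers `apply_decorators` and `resolve_decorators` are hypothetical, and `login_required` here is a toy stand-in rather than Django's real decorator.

```python
import functools


def login_required(view):
    """Toy stand-in for a login check; not Django's real decorator."""
    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        if not request.get("user"):
            return "302 redirect to login"
        return view(request, *args, **kwargs)
    return wrapper


def apply_decorators(view, decorators):
    """Wrap `view` so that the first decorator in the list runs outermost."""
    for decorator in reversed(decorators):
        view = decorator(view)
    return view


def resolve_decorators(inherited, add=(), remove=()):
    """Combine decorators inherited from a parent include with local changes."""
    kept = [d for d in inherited if d not in remove]
    return kept + list(add)


def profile(request):
    return "200 profile of %s" % request["user"]


def public_page(request):
    return "200 public"


parent = [login_required]  # decorators declared on the enclosing include
protected = apply_decorators(profile, resolve_decorators(parent))
unprotected = apply_decorators(
    public_page, resolve_decorators(parent, remove=[login_required])
)

print(protected({"user": None}))
print(protected({"user": "marten"}))
print(unprotected({}))
```

Requests are modelled as plain dicts to keep the sketch self-contained.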

That's it for today.

Marten

Marten Kenbeek

Mar 5, 2015, 4:36:47 PM
to django-d...@googlegroups.com
To respond to the other issues raised:

On 2 March 2015 at 17:14, Tim Graham <timog...@gmail.com> wrote:
> I think it would be helpful to motivate this with some pseudocode of specific use cases you are aiming to solve. Have you looked into whether there are any third-party projects related to your ideas from which you could draw inspiration?

There isn't too much out there for Django; the few apps that exist usually work around the url dispatcher, e.g. with additional middleware. I've mostly looked at other frameworks like Flask and Ruby on Rails for features. My previous post has some (pseudo)code that covers most features I'm planning on implementing.

On Monday, March 2, 2015 at 8:20:20 PM UTC+1, Marc Tamlyn wrote:
> A collection of thoughts:
>
> I think allowing the url dispatcher to inspect the database for the existence of certain objects is potentially somewhat dangerous. However, good support for a view raising a "continue resolving" exception along the lines of https://github.com/jacobian-archive/django-multiurl might be interesting.

How so? I guess the potential to do other things than just fetching the object can be dangerous. If overused it can certainly cause performance issues, so matching against an object should generally be the last step in the url resolver. 

I prefer to think of it as a view preprocessor: if the url matches, you convert the string arguments to the values that are actually used, with an option to continue resolving. Maybe the implementation should clearly separate these two steps. A "continue resolving" exception in the view, if carelessly used, opens up the possibility that a view is partly processed and then control is passed to another view. Not to mention the view middleware, which received the first view. It just feels to me like a "continue resolving" exception should be part of the resolving process, not of the view. 
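
A minimal sketch of that two-step idea, assuming a converter may signal "no match" so that resolving continues before any view runs. The `NoMatch` exception, the `PATTERNS` layout and the in-memory `USERS` table are all hypothetical; nothing here is Django API.

```python
import re


class NoMatch(Exception):
    """Raised by a converter to let the resolver try the next pattern."""


USERS = {"marten": {"name": "Marten"}}  # stand-in for a database table


def load_user(username):
    # Convert the captured string into the object the view actually needs.
    try:
        return USERS[username]
    except KeyError:
        raise NoMatch(username)


def user_profile(request, user):
    return "profile: %s" % user["name"]


def flat_page(request, slug):
    return "flat page: %s" % slug


# Each entry: (regex, converters keyed by group name, view)
PATTERNS = [
    (re.compile(r"^(?P<user>[\w-]+)/$"), {"user": load_user}, user_profile),
    (re.compile(r"^(?P<slug>[\w-]+)/$"), {}, flat_page),
]


def resolve(path):
    """Return (view, kwargs) for the first pattern that fully matches."""
    for regex, converters, view in PATTERNS:
        match = regex.match(path)
        if match is None:
            continue
        kwargs = match.groupdict()
        try:
            for name, convert in converters.items():
                kwargs[name] = convert(kwargs[name])
        except NoMatch:
            continue  # resolving continues; no view has been called yet
        return view, kwargs
    raise LookupError(path)


view, kwargs = resolve("marten/")
print(view(None, **kwargs))
view, kwargs = resolve("about/")
print(view(None, **kwargs))
```

Because the fallthrough happens inside `resolve`, a view is only ever called once, with arguments that are already converted.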

I also don't want to separate it into too many steps, that's why I wanted to include this step in the url resolver. However, it might warrant its own step in the whole handling process. 

> Related to this, a check for potentially conflicting url mappings could be interesting.

I could definitely use some regex patterns to check for potentially conflicting regex patterns (maybe I should use jQuery as well) :P Kidding aside, regexes are hard to compare by nature. We could factor out static prefixes, and maybe check for common patterns. It would be easier with alternate matching schemes. E.g., flask uses a simple `<int:id>` to match an integer and capture it in the `id` parameter. Support to check for conflicts would be a lot simpler with those patterns. It would definitely be a best-effort feature. 
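
As a rough illustration of why the simpler syntax helps, the sketch below translates `<type:name>` placeholders into a regex and performs a conservative, segment-by-segment conflict check. The `CONVERTERS` table and the helper names are assumptions made for this example; Flask's actual converters are richer than this.

```python
import re

# Hypothetical converter table; Flask's real converters are richer.
CONVERTERS = {"int": r"[0-9]+", "str": r"[^/]+"}
PLACEHOLDER = re.compile(r"<(?:(?P<type>\w+):)?(?P<name>\w+)>")


def split_pattern(pattern):
    """Split '/a/<int:id>/' into its path segments."""
    return [segment for segment in pattern.strip("/").split("/") if segment]


def segment_regex(segment):
    """Return the converter regex for a placeholder, or None if static."""
    match = PLACEHOLDER.fullmatch(segment)
    if match is None:
        return None
    return CONVERTERS[match.group("type") or "str"]


def to_regex(pattern):
    """Translate a '<type:name>' pattern into an anchored regex string."""
    parts = []
    for segment in split_pattern(pattern):
        match = PLACEHOLDER.fullmatch(segment)
        if match:
            kind = match.group("type") or "str"
            parts.append("(?P<%s>%s)" % (match.group("name"), CONVERTERS[kind]))
        else:
            parts.append(re.escape(segment))
    return "^" + "/".join(parts) + "/$"


def may_conflict(first, second):
    """Best-effort check: could a single path match both patterns?"""
    a, b = split_pattern(first), split_pattern(second)
    if len(a) != len(b):
        return False
    for left, right in zip(a, b):
        left_re, right_re = segment_regex(left), segment_regex(right)
        if left_re is None and right_re is None:
            if left != right:
                return False  # two different static segments never overlap
        elif left_re is None:
            if not re.fullmatch(right_re, left):
                return False
        elif right_re is None:
            if not re.fullmatch(left_re, right):
                return False
        # Two dynamic segments are conservatively assumed to overlap.
    return True


print(to_regex("/articles/<int:id>/"))
print(may_conflict("/articles/<int:id>/", "/articles/new/"))
print(may_conflict("/articles/<slug>/", "/articles/new/"))
```

The check can report false positives (two dynamic segments are always treated as overlapping), which matches the best-effort nature described above.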

> Middleware currently has complex and unintuitive behaviour in the event of exceptions. You talk about the middleware/decorator split, but not how to make either make sense.

I guess I'm not too familiar with real-world usage of middleware. Could you elaborate on this complex and unintuitive behaviour? Initially I simply thought of middleware and decorators as site-wide and per-view wrappers, but further investigation reminded me that there are more distinctions. The current middleware doesn't fully "fit in" with what I had in mind, especially the fact that request middleware is executed before the url is resolved, so this can't even be changed in the resolver. The middleware requires some more thought and discussion. 

> Supporting generic sets of views has some logic, although in my experience it is extremely rare that you can sensibly use a generic view with no alterations at all - it almost always needs extra context or some other tweaks. I'm not really convinced that a one liner to get CRUD for a particular model will actually be that useful in the wild - you're likely to end up changing too many things. I don't find the "one line in a urlconf for each view" convention to be particularly problematic, however writing all the regexes is potentially more prone to problems.

Yeah, the generic views should be more of a fallback. The fact that it is a one-liner for generic views is more of an added nicety; I should've put that differently. As I elaborated in my previous post, the real power is that you can reverse urls based on models or model instances, rather than having to specify the parameters in the exact way the url expects. One huge benefit (can't believe I only thought of this just now) is that, if you decide to change the url structure, e.g. to use the slug instead of the pk, you don't have to run through your entire codebase to change every instance where you reverse that url. That might be an even larger benefit than url-agnostic reversing for third-party apps. No more regex duplication? Not bad either.
 
> If you are intending on introducing alternative URL resolvers, some example ideas would be needed. The lack of a consistent way to reverse a slug for example is a good idea to address, but we need to establish how.

If necessary, pure string-based comparison (e.g. for static prefixes) might give that little performance boost over regular expressions that can make it fast instead of average. Alternative, simpler syntax can also be provided (e.g. based on the `parse` library that Curtis mentioned). I don't really understand what you're getting at with slugs, though. Do you mean a lack of a consistent way to reverse the slugify process, or just reversing slug-based urls in general? 
 
> How are you intending to support different resolvers in the same project? It seems to me that it is rather inefficient in large projects to loop through all resolvers for all urls.

Even now, routing works with a recursive design. Different resolvers will provide the same API to their direct "parent" (namely `resolve` and `reverse`), so they are a drop-in replacement for any other resolver. It would only be a problem if the replacement resolver itself is inefficient.
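
The recursive design can be sketched as follows, showing only the `resolve` half of the API for brevity. The `View` and `Prefix` classes are hypothetical, and `Resolver404` is a local stand-in for Django's exception of the same name; views are represented by plain strings.

```python
import re


class Resolver404(Exception):
    """Local stand-in for Django's exception of the same name."""


class View:
    """Leaf resolver: a regex mapped to a view."""

    def __init__(self, regex, view):
        self.regex = re.compile(regex)
        self.view = view

    def resolve(self, path):
        match = self.regex.match(path)
        if match is None:
            raise Resolver404(path)
        return self.view, match.groupdict()


class Prefix:
    """Branch resolver: strips a static prefix and delegates to children."""

    def __init__(self, prefix, children):
        self.prefix = prefix
        self.children = children

    def resolve(self, path):
        if not path.startswith(self.prefix):
            raise Resolver404(path)  # cheap string check, no regex involved
        rest = path[len(self.prefix):]
        for child in self.children:
            try:
                return child.resolve(rest)
            except Resolver404:
                continue
        raise Resolver404(path)


# Any object with a compatible `resolve` method can sit anywhere in the tree.
root = Prefix("", [
    Prefix("blog/", [View(r"^(?P<year>[0-9]{4})/$", "archive_view")]),
    View(r"^$", "index_view"),
])

print(root.resolve("blog/2015/"))
print(root.resolve(""))
```

A parent only ever talks to its direct children, so a non-matching prefix prunes the whole subtree without visiting the resolvers below it.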
 
> Namespacing urls is currently over complex for the 90% use case, and the docs are hard to understand as a result. Alternative designs in this area could be interesting.

Definitely. One design that springs to mind is to remove the difference between namespaces and names: a namespace is simply a resolver with a name that contains other named resolvers. Instead of an app_name and possibly a namespace in `include`, you would just supply a `name` to the `url` that wraps `include`.
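
One way to picture this, as a sketch only: every level of a colon-separated name is consumed by one named resolver. The `Route` and `Include` classes and the module-level `reverse` helper are hypothetical, and `NoReverseMatch` is a local stand-in for Django's exception of the same name.

```python
class NoReverseMatch(Exception):
    """Local stand-in for Django's exception of the same name."""


class Route:
    """Named leaf that knows how to build its own URL fragment."""

    def __init__(self, template, name):
        self.template = template  # e.g. "{year}/"
        self.name = name

    def reverse(self, names, **kwargs):
        if names != [self.name]:
            raise NoReverseMatch(names)
        return self.template.format(**kwargs)


class Include:
    """Branch resolver; if it has a name, that name acts as the namespace."""

    def __init__(self, prefix, children, name=None):
        self.prefix = prefix
        self.children = children
        self.name = name

    def reverse(self, names, **kwargs):
        if self.name is not None:
            if not names or names[0] != self.name:
                raise NoReverseMatch(names)
            names = names[1:]  # this level consumed one name
        for child in self.children:
            try:
                return self.prefix + child.reverse(names, **kwargs)
            except NoReverseMatch:
                continue
        raise NoReverseMatch(names)


def reverse(root, name, **kwargs):
    """Reverse 'blog:archive' by walking one named resolver per level."""
    return "/" + root.reverse(name.split(":"), **kwargs)


root = Include("", [
    Include("blog/", [Route("{year}/", name="archive")], name="blog"),
    Route("", name="index"),
])

print(reverse(root, "blog:archive", year=2015))
print(reverse(root, "index"))
```

In this sketch an unnamed `Include` is transparent, so only named levels contribute to the colon-separated path.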
 
> Overall there are lots of interesting starts of ideas here, but I feel one or two dead ends. It's a potentially very varied project and the crux of the proposal needs to focus on ensuring that some specific tasks are well designed and achievable, with others being extensions later on.
>
> Marc

I think the initial redesign and the public API should definitely be part of the project. The decorator/middleware features, as well as passing the `request` object would require changes to the API if done later on, so those are good candidates as well imo - though specifics, such as the `SubdomainResolver` in my example above are an easy extension later on. The same goes for model routers and other alternative routers. 

On Tuesday, March 3, 2015 at 2:56:39 AM UTC+1, Curtis Maloney wrote:
> There was a "continue resolving" sort of exception proposed/implemented that would obviate this, allowing the logic to remain in views [or view decorators]... a much simpler solution, IMHO.

I believe I might have mentioned it before somewhere in this post or the last one, but that would mess with view middleware, and I don't  particularly like that resolving and view handling are mixed that way. Could just be me though. Either way, I think we all agree that a "continue resolving" option is a good addition, one way or another. 
 
> This has certainly been on the "wanted" list for many years now, however I expect it would require the middleware re-design that so far has proven too invasive to land.
>
> That said, providing the "new" middleware-as-wrapper interface around url patterns lists could be a good stepping stone to eventually removing the existing middleware API.

I guess I'm lucky that I haven't used middleware that much, except as a fire-and-forget setting. As I mentioned I'm more of a decorator kind of guy. It'd be interesting to hear about all the struggles, and maybe I can help. 
 
> Are you talking about pre-cooked url patterns for the various CBV?  Or plugin routers for groups of CBV?  I'm certainly in favour of some tool that makes it easier to express "common" regex matches [satisfying the "protect from the tedium" rule of frameworks]

What I had in mind would go more towards a router than just some pre-cooked url patterns, including specific handling of resolving and reversing urls based on the model that the router connects to. I don't know yet just how far this should eventually go. If added as a later extension, that would certainly allow for plenty of time to think it through and refine it. I like this feature, though, so I'd definitely want to reach the next feature freeze deadline even if it was added after the initial patch. 

> As mentioned elsewhere, I would very much like to see a resolver system based on the "parse" library [essentially, the inverse of str.format - https://pypi.python.org/pypi/parse], and to do so would indeed require some formal analysis / documentation of the existing resolver architecture.
>
> --
> Curtis

Interesting, definitely worth taking a look. A revamped framework would also allow for an easy extension later on that adds resolvers based on this, in case it isn't included initially. 

I'll be at the Django sprint in Amsterdam this Saturday, maybe some interesting ideas or insights will roll out. I'll be sure to revisit my proposal and this thread afterwards. 

Marten

Tom Christie

Mar 6, 2015, 11:21:01 AM
to django-d...@googlegroups.com
> E.g., flask uses a simple `<int:id>` to match an integer and capture it in the `id` parameter. Support to check for conflicts would be a lot simpler with those patterns. It would definitely be a best-effort feature.

From my point of view, this by itself would make for a really nicely-scoped GSoC project.
Being able to demonstrate an API that allowed the user to switch to a URL resolver that used that simpler style would be a really, really nice feature,
and also feels like it might actually be a manageable amount of work.
This wouldn't *necessarily* need to allow decorator style routing, instead of the current URLConf style, but that might also be a nice addition. Personally though I would consider tackling that as an incremental improvement.

Things I'd be wary of:

* Anything around "continue resolving" exceptions or object inspection during routing, both of which sound like an anti-pattern to me.
* Method based routing. Feel less strongly about this, but still not convinced that it's a good style.
* Generic views / model routes / automatic routing. Too broadly defined, and with no clear best answer.

Anyways, interesting stuff Marten, look forward to hearing more.

  Tom

Marten Kenbeek

Mar 9, 2015, 10:01:02 AM
to django-d...@googlegroups.com
After all the feedback I got here and at the sprint, I think the core of my proposal will be the revamping of the current dispatcher and creating a public API. I'll keep in mind the other features and in general the extendibility of the new dispatcher, and if there is time left I'll start implementing some of them.

One interesting note is how content management systems try to work with the url dispatcher. Most systems simply use a catch-all pattern. This often includes custom machinery to resolve and reverse url patterns for pages, blog posts, and other content types or plugins. Django's url dispatcher is completely static in that it doesn't provide any public API to change the url configuration after it has been loaded. This can be problematic with the dynamic nature of CMSs, hence the custom machinery. Bas (bpeschier) had to take it to a new level by routing certain content entries to a custom view. If you want to avoid a "router" view (which is the url dispatcher's job after all), you'd need to dig into the internals of the url dispatcher to have any kind of dynamic updating of the configuration.

I'd like to keep this dynamic nature in mind when designing the new API, and in time implement a public API for this as well (e.g. a simple `register` and `unregister`). This would avoid the need for either a router view or unique url prefixes for each content type as well. It should certainly allow for granular control, since I believe reloading the complete url dispatcher can take quite some time (I should probably test that).
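
A minimal sketch of such a `register`/`unregister` API, assuming routes are keyed by name and that updates must be safe while requests are being resolved. The `DynamicResolver` class is hypothetical and views are represented by plain strings.

```python
import re
import threading


class DynamicResolver:
    """Hypothetical resolver whose patterns can change at runtime."""

    def __init__(self):
        self._lock = threading.Lock()
        self._routes = {}  # name -> (compiled regex, view)

    def register(self, name, regex, view):
        compiled = re.compile(regex)  # compile before taking the lock
        with self._lock:
            self._routes[name] = (compiled, view)

    def unregister(self, name):
        with self._lock:
            self._routes.pop(name, None)

    def resolve(self, path):
        with self._lock:
            routes = list(self._routes.values())  # snapshot for this lookup
        for regex, view in routes:
            match = regex.match(path)
            if match:
                return view, match.groupdict()
        raise LookupError(path)


pages = DynamicResolver()
pages.register("page-about", r"^about/$", "page_view")
print(pages.resolve("about/"))

# Removing one entry leaves every other registered route untouched.
pages.unregister("page-about")
```

Each change affects a single entry, so nothing else in the configuration has to be reloaded.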

I'm still in doubt about whether I should implement a refactor of url namespaces and middleware. Url namespacing is overly complex and I'm not too sure what the exact goal of the mechanism is. It obviously needs to differentiate multiple instances of the same url configuration, and it is also used to differentiate url configurations as well as to provide a default instance for a url configuration. I'm not too sure what is absolutely needed and what just makes it more complicated than necessary. However, as namespaces are such an integral part of the dispatcher, it is worth looking into and it might be necessary to keep in mind with the new API.

As for middleware, I'm inclined to only support per-include decorators. Users can always use `decorator_from_middleware` to define middleware for a subset of views. While middleware certainly needs a revamp, I'm not too familiar with its current issues, and I feel this is slightly out of this project's scope. 

Marten Kenbeek

Mar 11, 2015, 10:57:18 AM
to django-d...@googlegroups.com
I came across an app named django-url-namespaces[1]. It provides support for declarative style url patterns, very similar to the declarative style of models and forms. I'm not particularly for or against this style or the current style, but I wanted to put this out here for discussion. It can simplify namespaces as well: the app name is nothing more than the class name (if not overridden), and the namespace is nothing more than the attribute name of the including class (again, if not overridden). This project is going towards a more class-based approach anyway, so it might fit the new design.

Any strong feelings or convincing arguments for one or the other? I'm slightly in favour of the declarative class style, mostly because it's in line with many other parts in Django, though there is practically no difference between classes and their instances, unlike forms and models. You might treat the class as a configuration and an instance as a resolver match, though, which would give clear, distinctive semantics to classes and instances. The instance would work both for resolving and reversing, as it knows all about the view, the url and the arguments. 

Anyway, I have an updated proposal available at https://gist.github.com/knbk/cd0d339e1d3fa127cf7a

I've intentionally left out a `ContinueResolving` exception. This is nothing that cannot easily be implemented with a custom resolver. What is harder to implement with a custom resolver is a `StopResolving` (`AbortResolving`? need to work on the name) kind of exception, basically a `Resolver404` that breaks through its current recursion level. I feel it is warranted that this is included in the proposal. Think e.g. of a `SubdomainResolver`. You usually don't want to include all the urls for the main domain into your subdomain, so this allows you to raise `StopResolving` if none of the subdomain urls match. 
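
The difference between the two exceptions can be sketched like this. All classes are hypothetical, `Resolver404` is a local stand-in for Django's exception, and `resolve` takes the host explicitly only to keep the example self-contained.

```python
class Resolver404(Exception):
    """Local stand-in: this resolver did not match, try the next sibling."""


class StopResolving(Exception):
    """Hypothetical: no sibling may match either; give up entirely."""


class Exact:
    """Leaf resolver that matches one literal path."""

    def __init__(self, path, view):
        self.path, self.view = path, view

    def resolve(self, host, path):
        if path != self.path:
            raise Resolver404(path)
        return self.view


class Group:
    """Tries children in order; only Resolver404 moves on to the next one."""

    def __init__(self, children):
        self.children = children

    def resolve(self, host, path):
        for child in self.children:
            try:
                return child.resolve(host, path)
            except Resolver404:
                continue
        raise Resolver404(path)


class SubdomainResolver(Group):
    def __init__(self, subdomain, children):
        super().__init__(children)
        self.subdomain = subdomain

    def resolve(self, host, path):
        if not host.startswith(self.subdomain):
            raise Resolver404(path)  # wrong host: siblings may still match
        try:
            return super().resolve(host, path)
        except Resolver404:
            # Right host but no match: do not fall through to main-domain urls.
            raise StopResolving(path) from None


root = Group([
    SubdomainResolver("accounts.", [Exact("login/", "login_view")]),
    Exact("about/", "about_view"),
])

print(root.resolve("accounts.example.com", "login/"))
print(root.resolve("example.com", "about/"))
```

Resolving `about/` on `accounts.example.com` raises `StopResolving` instead of reaching `about_view`, because `Group` only catches `Resolver404`.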

[1] https://github.com/fish2000/django-url-namespaces

On Monday, March 9, 2015 at 15:01:02 UTC+1, Marten Kenbeek wrote:

Alexandr Shurigin

Mar 11, 2015, 11:02:45 AM
to Marten Kenbeek, django-d...@googlegroups.com
Hi all.



-- 
Alexandr Shurigin

On March 11, 2015 at 16:57:25, Marten Kenbeek (marte...@gmail.com) wrote:


Marten Kenbeek

Mar 26, 2015, 9:46:14 AM
to django-d...@googlegroups.com
I've updated my proposal with a few more changes. Specifically, I've added a small part about reversing namespaced urls without specifying the namespace. In some circumstances this can provide looser coupling between third-party apps and arbitrary, reversible objects. It can also ease the transition from non-namespaced urls to namespaced urls by providing an intermediate step where a url name can match both. This step is absolutely necessary if third-party apps want to switch to namespaced urls but also want to provide backwards compatibility with previous versions.

Additionally, I've included a timeline. Not my strongest point, but I think it's a decent guideline of what I'll be working on during the project. 

The up-to-date proposal can be found at http://www.google-melange.com/gsoc/proposal/public/google/gsoc2015/knbk/5629499534213120. Any additional feedback before (or after) the deadline is greatly appreciated. If something in the proposal is not clear, let me know and I'll try to explain it better in my proposal. 

There are still a few design decisions to make between now and the start of GSoC, but I'll probably create a separate thread for those. 

Thanks,
Marten

On Monday, March 2, 2015 at 17:57:24 UTC+1, Marten Kenbeek wrote: