GSoC 2015: Improved URL Pattern Matching (Draft)


Alexander Patel

Mar 25, 2015, 7:17:31 PM
to django-d...@googlegroups.com
Hello, all,
My name is Alex Patel, and I am an undergraduate at Harvard College in the United States studying mathematics and philosophy. I intend to submit a proposal to work on Django's URL dispatch mechanism for this year's Google Summer of Code, and am posting to solicit any feedback, comments, or concerns before I submit the final proposal on Friday.
I've included the abstract to the draft of my proposal below:
Abstract

Django features a powerful and complete URL dispatching mechanism, defined in django.core.urlresolvers, that uses regular expression pattern matching to map requested URLs to views with speed and reliability. Having not been subject to significant changes since the beginning of the project's development, however, this tool has not been updated to reflect the extensibility, flexibility, and ease of development at the core of Django's philosophy. For example, it is not possible for developers to specify a non-regex-based resolver for a URL configuration, nor is there a standardized way to extend the current resolver to support common patterns or alternative pattern syntaxes that would make pretty URLs easier to write.

The primary objective of this project is to formalize a public-facing API to support the use of alternative URL resolving mechanisms. This will involve substantial refactoring and abstraction of the current URL resolution mechanism, which suffers from a tight coupling of internal components and little documentation for quick extension by developers. Further, it will involve reincorporating Django's regex-based RegexURLResolver as the default resolving mechanism, the performance and backwards compatibility of which must be thoroughly analysed and tested. Finally, the project aims to incentivize developers to create and release alternative URL resolvers compliant with the new API through outreach efforts and a proof-of-concept alternative resolving mechanism.
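
As a rough illustration of the kind of pluggable interface the abstract describes, here is a minimal sketch in plain Python. The names `BaseResolver`, `RegexResolver`, and `ExactResolver` are hypothetical; they are not part of Django, and the draft proposal may define the API quite differently.

```python
import re
from abc import ABC, abstractmethod


class BaseResolver(ABC):
    """Hypothetical interface: map a path to (view, kwargs), or None."""

    @abstractmethod
    def resolve(self, path):
        ...


class RegexResolver(BaseResolver):
    """Regex-based matching, similar in spirit to the current default."""

    def __init__(self, routes):
        # routes: iterable of (regex string, view callable) pairs
        self.routes = [(re.compile(pattern), view) for pattern, view in routes]

    def resolve(self, path):
        for regex, view in self.routes:
            match = regex.search(path)
            if match:
                return view, match.groupdict()
        return None


class ExactResolver(BaseResolver):
    """A non-regex alternative: dictionary lookup on the full path."""

    def __init__(self, routes):
        self.routes = dict(routes)

    def resolve(self, path):
        view = self.routes.get(path)
        return (view, {}) if view is not None else None


def article_detail(request, year):
    return "article from %s" % year


def about(request):
    return "about page"


regex_resolver = RegexResolver([(r"^articles/(?P<year>[0-9]{4})/$", article_detail)])
exact_resolver = ExactResolver([("about/", about)])
```

Both resolvers answer the same `resolve(path)` call, so a caller would not need to know which matching strategy is behind it.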

You can find a link to the full draft as a GitHub Gist here.
The deadline for GSoC is this Friday at 19:00 UTC. Any feedback before then would be greatly appreciated. Thanks a ton!
Best,
Alex

Russell Keith-Magee

Mar 25, 2015, 11:11:02 PM
to Django Developers
Hi Alex,

Thanks for sending in a proposal. The proposal itself is strong - I've got some questions about API design choices, but they're more in the realm of "design details that need to be sorted out", rather than "fundamental flaws/weaknesses in your proposal". Design details notwithstanding, the timeline seems well thought out and achievable; the milestones you've proposed are good targets against which we can evaluate your progress.

The only thing I'd flag as a potential weakness (and it's not a critical flaw) is the item in week 1-2 to formalise the specification. The wheels of Django move slowly - I wouldn't expect any non-trivial design discussion to come to a firm resolution in 2 weeks. This is an area where you'll need to start early, possibly before the formal GSoC period.

It's also an area where having a working example will make the discussion a lot more compelling. In practice, much of the "real" discussion will probably happen near the end of your proposal when you've got something ready to merge, rather than in the abstract design phase.

Broadly, I think the design you've proposed is workable. However, I've got two suggestions - or rather, design considerations that I don't know if you've given any thought to. Both come in the form of end use cases that ideally would be possible as a result of your refactoring.

Firstly, it should be possible to define a URL dispatcher that takes into account the *domain*, as well as the URL. For example, I want my main example.com website to be a landing and sales page, but with login capability; but I want <username>.example.com to be a user-specific site. This is possible to do right now with some ROOT_URLCONF trickery and some little-known Middleware details, but there's no reason it shouldn't be part of the main capability of URLConf - if only because it would be nice for reverse to be able to return the subdomain when necessary.

Secondly, there's potential for the URL dispatch mechanism to replace (or at least supplement) the Middleware system. For example - think of a login_required decorator - at present, that can only be applied on a per-view basis. It would be desirable to be able to apply it on a *collection* of views - so, for example, an entire subtree of a URL scheme is automatically login protected, and only that subtree carries the overhead of the Auth Middleware.
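
To make the "collection of views" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `decorate_all` is a hypothetical helper, the `login_required` below is a stand-in rather than Django's decorator, and the request is modelled as a plain dict.

```python
import functools


def login_required(view):
    """Stand-in for an auth check; not Django's real login_required."""
    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        if not request.get("user"):
            return "redirect to login"
        return view(request, *args, **kwargs)
    return wrapper


def decorate_all(decorator, routes):
    """Return a new route list with every view wrapped by decorator."""
    return [(pattern, decorator(view)) for pattern, view in routes]


def dashboard(request):
    return "dashboard for %s" % request["user"]


def settings_page(request):
    return "settings for %s" % request["user"]


# Only this group of routes pays for the auth check.
account_routes = decorate_all(login_required, [
    ("account/", dashboard),
    ("account/settings/", settings_page),
])
```

Views outside `account_routes` are left untouched, which is the per-subtree behaviour described above, here in its simplest flat form.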

To be clear - I don't think you have to provide working solutions for either of these as part of your project. However, I think any design you propose should make it clear how these use cases would be satisfied. As it stands, it isn't immediately obvious to me how this would happen - in particular, the design choice to pass in the resolver as an argument to patterns() (which is, as Simon has noted, a deprecated method anyway) doesn't make it clear to me how these tasks would be tackled.

Yours,
Russ Magee %-)

Alexander Patel

Mar 26, 2015, 5:37:04 PM
to django-d...@googlegroups.com
Russ -

Thanks a ton for the feedback. I really appreciate you taking the time to look over my proposal.

On Wednesday, March 25, 2015 at 11:11:02 PM UTC-4, Russell Keith-Magee wrote:
Thanks for sending in a proposal. The proposal itself is strong - I've got some questions about API design choices, but they're more in the realm of "design details that need to be sorted out", rather than "fundamental flaws/weaknesses in your proposal". Design details notwithstanding, the timeline seems well thought out and achievable; the milestones you've proposed are good targets against which we can evaluate your progress.

The only thing I'd flag as a potential weakness (and it's not a critical flaw) is the item in week 1-2 to formalise the specification. The wheels of Django move slowly - I wouldn't expect any non-trivial design discussion to come to a firm resolution in 2 weeks. This is an area where you'll need to start early, possibly before the formal GSoC period.

It's also an area where having a working example will make the discussion a lot more compelling. In practice, much of the "real" discussion will probably happen near the end of your proposal when you've got something ready to merge, rather than in the abstract design phase.

Definitely noted. I think that both prepending more time onto the design phase before the formal GSoC period and reworking the timeline to permit more time for feedback are necessary.
 


Broadly, I think the design you've proposed is workable. However, I've got two suggestions - or rather, design considerations that I don't know if you've given any thought to. Both come in the form of end use cases that ideally would be possible as a result of your refactoring.
 

Firstly, it should be possible to define a URL dispatcher that takes into account the *domain*, as well as the URL. For example, I want my main example.com website to be a landing and sales page, but with login capability; but I want <username>.example.com to be a user-specific site. This is possible to do right now with some ROOT_URLCONF trickery and some little-known Middleware details, but there's no reason it shouldn't be part of the main capability of URLConf - if only because it would be nice for reverse to be able to return the subdomain when necessary.

It seems that the most stable way to support this would be to have the handler pass both the path and the subdomain prefix, if not just request.get_full_path() (in some capacity), to the resolver and then modify the signature of url() to accept an optional subdomain prefix pattern to match.
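
A minimal sketch of how matching on both a subdomain prefix and a path might look, in plain Python. The `Route` class, the `subdomain` argument, and the `base_domain` parameter are all hypothetical, and the rule that routes without a subdomain pattern match only the bare domain is an assumption made here, not something settled in the thread.

```python
import re


class Route:
    """Hypothetical route with an optional subdomain pattern."""

    def __init__(self, path_pattern, view, subdomain=None):
        self.path_regex = re.compile(path_pattern)
        self.subdomain_regex = re.compile(subdomain) if subdomain else None
        self.view = view


def resolve(routes, host, path, base_domain="example.com"):
    """Return (view, kwargs) for the first route matching host and path."""
    # Strip the base domain to obtain the subdomain prefix ("" if none).
    if host == base_domain:
        prefix = ""
    elif host.endswith("." + base_domain):
        prefix = host[: -len(base_domain) - 1]
    else:
        return None
    for route in routes:
        kwargs = {}
        if route.subdomain_regex is not None:
            sub_match = route.subdomain_regex.fullmatch(prefix)
            if not sub_match:
                continue
            kwargs.update(sub_match.groupdict())
        elif prefix:
            # Assumption: routes without a subdomain pattern match
            # only the bare domain.
            continue
        path_match = route.path_regex.search(path)
        if path_match:
            kwargs.update(path_match.groupdict())
            return route.view, kwargs
    return None


def landing(request):
    return "landing"


def user_home(request, username):
    return "home of " + username


routes = [
    Route(r"^$", landing),
    Route(r"^$", user_home, subdomain=r"(?P<username>[a-z0-9-]+)"),
]
```

Reversing is not shown; it would need to fill in the subdomain groups as well as the path groups, which is the part Russell notes would be nice to have.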
 

Secondly, there's potential for the URL dispatch mechanism to replace (or at least supplement) the Middleware system. For example - think of a login_required decorator - at present, that can only be applied on a per-view basis. It would be desirable to be able to apply it on a *collection* of views - so, for example, an entire subtree of a URL scheme is automatically login protected, and only that subtree carries the overhead of the Auth Middleware.

To be clear - I don't think you have to provide working solutions for either of these as part of your project. However, I think any design you propose should make it clear how these use cases would be satisfied. As it stands, it isn't immediately obvious to me how this would happen - in particular, the design choice to pass in the resolver as an argument to patterns() (which is, as Simon has noted, a deprecated method anyway) doesn't make it clear to me how these tasks would be tackled.

My justification for modifying patterns() to configure a resolver for a group of URL patterns was that patterns() seemed like the finest level of granularity at which a set of patterns could be grouped. As patterns() is deprecated in 1.8, though, I don't see a reason not to let the developer specify a resolver as an optional attribute of the URLconf itself, and then take advantage of include() and the URL pattern tree to load the resolvers recursively. In general, the right design choice seems to be to harness the existing nested tree structure. Things like middleware groups could then be specified at the URLconf level: a URLconf would define an optional list of middleware for its set of patterns, and the middleware would be applied recursively as the resolver traverses the tree. I'll definitely put more thought into this for the final proposal, though; it is clear that supporting middleware for collections of patterns will take a significant refactoring of the way the handler applies middleware, and I definitely want to spur some discussion about the design for that change, if not hack on it myself.
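
The recursive idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `URLNode` is a hypothetical stand-in for a URLconf or include(), prefixes are matched as plain strings rather than regexes, and middleware is represented by string names.

```python
class URLNode:
    """Hypothetical URLconf node: a path prefix, optional middleware, and
    either a view (leaf) or child nodes (an include()-like subtree)."""

    def __init__(self, prefix, view=None, children=(), middleware=()):
        self.prefix = prefix
        self.view = view
        self.children = list(children)
        self.middleware = list(middleware)


def resolve(node, path, inherited=()):
    """Walk the tree; return (view, middleware collected along the way)."""
    if not path.startswith(node.prefix):
        return None
    rest = path[len(node.prefix):]
    collected = list(inherited) + node.middleware
    if node.view is not None:
        # Leaf nodes must consume the whole remaining path.
        return (node.view, collected) if rest == "" else None
    for child in node.children:
        result = resolve(child, rest, collected)
        if result is not None:
            return result
    return None


def home(request):
    return "home"


def profile(request):
    return "profile"


root = URLNode("", children=[
    URLNode("", view=home),
    # Only the account/ subtree carries the "auth" middleware.
    URLNode("account/", middleware=["auth"], children=[
        URLNode("profile/", view=profile),
    ]),
])
```

The handler would then run only the collected middleware for the matched view, so the cost of "auth" is confined to the subtree that declares it.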

Thanks, again!

Best,

Alex

Aymeric Augustin

Mar 26, 2015, 6:02:05 PM
to django-d...@googlegroups.com
Hi Alexander,

I won’t repeat what Russell said, just add a few things.

A few months ago I had a use case for internationalization that looked pretty simple but turned out to be tricky to implement: https://github.com/oscaro/django-o18n. Resolving is easy; the hard part is reversing. Perhaps you could check that your design allows implementing this pattern without monkey-patching.

The API you’re proposing for patterns cannot work because it’s a “SyntaxError: non-keyword arg after keyword arg”. You’ll have to find something else.
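
The syntax rule in question can be checked without Django. The exact call from the draft is not reproduced in this thread, so the two call shapes below are guesses at its general form; `MyResolver`, `url_a`, and `url_b` are placeholder names that are only parsed, never evaluated.

```python
def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, "<urls>", "exec")
    except SyntaxError:
        return False
    return True


# Positional arguments after a keyword argument: rejected by the parser.
bad_call = "patterns(resolver=MyResolver, url_a, url_b)"

# Keyword argument placed last: parses fine.
good_call = "patterns(url_a, url_b, resolver=MyResolver)"
```

Here `compiles(bad_call)` is false and `compiles(good_call)` is true, so any variant of the API would need to take the resolver as a trailing keyword argument or through some other mechanism.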

Your milestone 3 shouldn’t take more than a few days once milestones 1 and 2 are in. However it’s good to have some padding for unforeseen complications :-)

-- 
Aymeric.

Alexander Patel

Mar 28, 2015, 2:00:51 PM
to django-d...@googlegroups.com
Thanks for the feedback, everyone.

The final proposal can be found here.

Let me know if you have any questions.

Best,

Alex