Ticket #28609 - Making REQUEST_URI available in runserver

Jay Lynch

Sep 28, 2017, 7:09:44 AM
to Django developers (Contributions to Django itself)

Hi all!

Apologies if I miss any conventions here; I've been working with Django for a long time now, but this is my first message here. I'm here to chat about...


(Closed) Ticket #28609 - Request URI

I recently submitted a PR for a change which would make the original / raw URI of a request available to the development server's WSGI processing, as a parameter REQUEST_URI.

This would allow WSGI middleware that needs this information to be used unchanged in both development (with the dev server) and production (with Nginx or similar).

Tim Graham suggested I post here if I disagree with the ticket being closed as wontfix, on the grounds that the feature would need to be discouraged because it enables practices which are "not recommended".

I do disagree, as no explanation or data was offered to show that this is universally bad. Whilst I agree it could be misused by a developer unaware of the details of how it can be used, I don't think that is sufficient justification.


Lack of access to this means that using very basic WSGI middleware with Django requires a hugely disproportionate level of effort. (E.g., some suggestions offered were "write an alternative runserver" or "replace it with Apache", both likely measured in days of work for our team, in order to avoid a trivial code change which took 5 minutes and brings the dev server further in line with other commonly used WSGI servers.)



Reasoning

I would like to enable access to this parameter so that we can continue to use runserver when working locally on the API for https://www.broadsheet.com.au - an Australian city guide and mid-sized publisher.

The specific use case we have for the raw URI is simple: there is a fairly universal form of search URL where spaces are replaced with + rather than percent-encoded and, in order to enable "+required" style binary operators, any literal + characters in the search query are percent-encoded. (Attached is a screenshot of Google, YouTube, Bing, DuckDuckGo and Amazon all using this behaviour.)

So a search for "test search +banana" (typically a query which requests results for "test search" but limits results to pages including the word "banana") is encoded as test+search+%2Bbanana
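
As a rough illustration of the encoding described above, here is a minimal sketch using Python 3's standard `urllib.parse` (Python 3 is an assumption here; on Python 2 the same functions live in `urllib`). It contrasts plain percent-encoding with the plus-for-space form:

```python
from urllib.parse import quote, quote_plus, unquote_plus

query = "test search +banana"

# Plain percent-encoding: spaces become %20, the literal "+" becomes %2B.
percent_encoded = quote(query)

# Search-style encoding: spaces become "+", the literal "+" becomes %2B.
plus_encoded = quote_plus(query)

# Decoding the search-style form recovers the original query.
decoded = unquote_plus(plus_encoded)

print(percent_encoded)  # test%20search%20%2Bbanana
print(plus_encoded)     # test+search+%2Bbanana
print(decoded)          # test search +banana
```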

The benefits are clear: cleaner, human-friendly URLs, use of the same query format in places where spaces are used as delimiters, and maintaining use of + for binary modifiers, all with one syntax.

The WSGI middleware which performs this replacement is trivial, but currently impossible to use with runserver, as runserver does not provide the raw URI.
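
The middleware itself is not shown in this post. Purely as an illustration of how a WSGI middleware might consume the value, the sketch below reads `REQUEST_URI` when the server provides it and otherwise falls back to an approximate reconstruction. The class name `RawURIMiddleware` and the `example.raw_uri` key are hypothetical, not part of Django or of the PR.

```python
from urllib.parse import quote


class RawURIMiddleware:
    """Hypothetical WSGI middleware exposing the raw request URI to the app."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        raw = environ.get("REQUEST_URI")
        if raw is None:
            # Fallback: re-encode the already-decoded PATH_INFO. This is only
            # an approximation, since the original encoding choices
            # (e.g. "+" versus "%2B" in the path) are no longer recoverable.
            path = environ.get("PATH_INFO", "")
            raw = quote(path.encode("iso-8859-1"))
            query = environ.get("QUERY_STRING", "")
            if query:
                raw += "?" + query
        # Hypothetical key under which the wrapped app can find the value.
        environ["example.raw_uri"] = raw
        return self.app(environ, start_response)
```

With `REQUEST_URI` present the value is passed through untouched; without it, the wrapped application only sees the re-encoded approximation.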



Alternatives

It is not possible to implement this as a library or add-on as the dev server does not pass this data downstream. Replacing the development server is a high cost if it is for this reason alone.

Alternative implementations within dev server could include full support for WSGI parameters, which seems excessive, or custom query string processing for this use case, which would be less compatible and may break other expected behaviours.



Potential Downsides

Could be used inappropriately and may need to be cautioned against

I don't see why there is any need for it to be documented, it is providing optional access to an internal, as is the case with WSGI parameters in other web servers. That aside, I disagree that requiring caution invalidates the value in making a tool available to users. 


REQUEST_URI is not actually a standard 

This is true but it has become a de facto standard which is supported by most major web servers and in many cases enabled by default. Enabling it allows consistent WSGI middleware use.


Cluttering the WSGI environment

Given that a very similar mechanism is already used to pass around information like color highlighting status, it seems inconsistent to call this a concern.


The implementation is flawed

I would be very happy to receive feedback here; this is my first time submitting a patch / PR.

 

Code

See https://github.com/django/django/pull/9101. The change involved is minimal; the dev server simply adds the raw URI as a parameter in the WSGI environment so that middleware can use it:


-        return super(WSGIRequestHandler, self).get_environ()

+        environ = super(WSGIRequestHandler, self).get_environ()
+        environ[str("REQUEST_URI")] = self.path
+        return environ
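
For anyone wanting to try the idea without a Django checkout, here is a rough standalone equivalent against the standard library's `wsgiref` handler, which Django's dev server handler subclasses. The names `RawURIRequestHandler` and `demo_app` are hypothetical, and this is a sketch of the same approach rather than the PR's code.

```python
from wsgiref.simple_server import WSGIRequestHandler


class RawURIRequestHandler(WSGIRequestHandler):
    """Hypothetical handler that adds the raw request target to the environ."""

    def get_environ(self):
        environ = super().get_environ()
        # self.path is the request target exactly as received on the
        # request line, before any percent-decoding.
        environ["REQUEST_URI"] = self.path
        return environ


def demo_app(environ, start_response):
    """Tiny WSGI app that echoes the raw URI next to the decoded PATH_INFO."""
    body = "REQUEST_URI={}\nPATH_INFO={}\n".format(
        environ.get("REQUEST_URI"), environ.get("PATH_INFO")
    )
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]
```

Passing `handler_class=RawURIRequestHandler` to `wsgiref.simple_server.make_server` and requesting `/search/test+search+%2Bbanana` should show the raw form in `REQUEST_URI`, while `PATH_INFO` contains the percent-decoded path.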


We have been using a branch with this change applied to 1.11 in both development and production since the PR was submitted, though the related WSGI middleware is not yet in production.



Thanks for your time!


Jay Lynch



Shai Berger

Oct 7, 2017, 7:02:27 AM
to django-d...@googlegroups.com
Hi Jay,

First, I'd like to say that your post was very nicely done. You did not miss
any conventions as far as I could see. I believe the only reason you got no
responses so far is that the issue is relatively fringe, and your message was
relatively long.

For the subject matter, though, it seems like your use case could be easily
served by using the stdlib's urllib.parse.quote_plus (or urllib.quote_plus, if
you're on Python 2) in your WSGI middleware. This should give you the
parameters of interest in the format you're looking for, without changing
Django's code, and while still enjoying the other benefits of URL
normalization. It seems you have considered only alternatives for passing the
original data into your code, and passed over the idea of reconstructing from
available data. Or, if this is what you meant by

> [...] custom query string
> processing for this use case, which would be less compatible and may break
> other expected behaviours.

I think you'll need to give a little more detail for the argument to be
convincing.
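
A minimal sketch of the reconstruction approach suggested above, assuming
the search term arrives as a query parameter (the thread does not say
whether it lives in the path or the query string). The helper name
`reencode_query` is hypothetical.

```python
from urllib.parse import parse_qsl, quote_plus


def reencode_query(query_string):
    """Decode a query string, then re-encode it in plus-for-space form."""
    # parse_qsl turns "+" into a space and "%2B" into a literal "+".
    pairs = parse_qsl(query_string, keep_blank_values=True)
    return "&".join(
        "{}={}".format(quote_plus(key), quote_plus(value))
        for key, value in pairs
    )


print(reencode_query("q=test%20search%20%2Bbanana"))  # q=test+search+%2Bbanana
```

This works for query parameters because `parse_qsl` keeps a decoded space
and a decoded `+` distinct, so `quote_plus` can map them back to `+` and
`%2B` respectively.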

Of your other comments, this caught my attention:

> *Could be used inappropriately and may need to be cautioned against*
>
> I don't see why there is any need for it to be documented, it is providing
> optional access to an internal, as is the case with WSGI parameters in
> other web servers.

If a feature is not documented, it means you should not rely on it.
Undocumented APIs are usually added because they are required for implementing
some documented feature; adding undocumented APIs on their own makes little
sense.

> That aside, I disagree that requiring caution
> invalidates the value in making a tool available to users.

Requiring caution, in and of itself, does not invalidate the value of making a
tool available to users -- but it does detract from it. We then ask ourselves
if we can recommend safer alternatives, and I believe I have.

Hope this helps,
Shai.