Controlling the upload of large files

Lisper

Nov 24, 2011, 13:48:01
to mod...@googlegroups.com
I'm trying to implement a limit on the size of uploaded files which depends on a user's status, i.e. a user who pays gets to upload larger files than one who doesn't.  I want to check the size of the upload BEFORE reading the data because I don't want to read 27 gigabytes only to find that this user is limited to 1MB.  I can't use Apache's LimitRequestBody directive.  The filtering has to be done within the application because it's the only part of the system that knows the user's status.

I found this:

http://mail.python.org/pipermail/web-sig/2008-November/003639.html

which says:

"If you use embedded mode, so long as your WSGI application doesn't
read the input and just returns the error response, the request
content wouldn't be read at all."

So I tried this:

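def application(environ, start_response):
    status = '200 OK'
    headers = [('Content-type', 'text/html'), ('Connection', 'close')]
    # start_response() returns the legacy write() callable
    out = start_response(status, headers)
    # Echo the declared upload size without touching environ['wsgi.input']
    out('--%s--<br>' % environ.get('CONTENT_LENGTH'))
    # Upload form that posts back to this same URL
    return ['''<form method=post enctype="multipart/form-data">
 <input name=f type=file>
 <input type=submit></form>''']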

and ran it using:

 WSGIScriptAlias /wsgitest /path/to/driver.wsgi
 <Location /wsgitest>
 WSGIApplicationGroup %{GLOBAL}
 </Location>

which as far as I can tell should result in running in embedded mode.  This seems like it *should* display the size of the uploaded file without reading the file data.  However, empirically, the data is being read.  If I upload a large file, there's a long delay before getting a response.

So what am I doing wrong?

Thanks, and Happy Thanksgiving!

Jason Garber

Nov 24, 2011, 13:55:43
to mod...@googlegroups.com

If I recall correctly, when I wrote the code to read uploaded files, it read one chunk at a time (e.g. 4096 bytes).  Wouldn't this give you the opportunity to stop at the right point?
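
A minimal sketch of that chunk-at-a-time idea, for illustration only. The names `read_limited` and `UploadTooLarge` are hypothetical and are not taken from AppStruct; the sketch also assumes the server signals end of input with an empty read on `environ['wsgi.input']`.

```python
import io

CHUNK_SIZE = 4096  # chunk size mentioned in the post above


class UploadTooLarge(Exception):
    """Raised once more than the allowed number of bytes has been read."""


def read_limited(stream, limit, chunk_size=CHUNK_SIZE):
    """Read `stream` in chunks and stop as soon as `limit` bytes is exceeded.

    `stream` is any file-like object with read(n), for example
    environ['wsgi.input'] in a WSGI application.
    """
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read: end of input
            break
        total += len(chunk)
        if total > limit:
            # Stop here; the rest of the upload is never read by the app.
            raise UploadTooLarge('read %d bytes, limit is %d' % (total, limit))
        chunks.append(chunk)
    return b''.join(chunks)


# Stand-in for wsgi.input: a 10000-byte upload against a 20000-byte limit.
data = read_limited(io.BytesIO(b'x' * 10000), limit=20000)
```

This only bounds how much the application itself reads. It is also the fallback when no usable CONTENT_LENGTH is available to check up front.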

Take a look at github.com/appcove/AppStruct ... Python/AppStruct/WSGI/Lib.py (or related, maybe werkseug.py) to see the code.

Take care.

--
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To view this discussion on the web visit https://groups.google.com/d/msg/modwsgi/-/rEYG3K-zj74J.
To post to this group, send email to mod...@googlegroups.com.
To unsubscribe from this group, send email to modwsgi+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en.

Lisper

Nov 24, 2011, 15:00:39
to mod...@googlegroups.com
No, the problem is that the entire file is (apparently) read by modwsgi before the application code is run at all.

Jason Garber

Nov 24, 2011, 15:20:44
to mod...@googlegroups.com

I am pretty sure this is not the case...

Looking forward to Graham's comments.

Look at docs on environ['wsgi.input'].

Thanks!

On Nov 24, 2011 3:00 PM, "Lisper" <ron.g...@gmail.com> wrote:
No, the problem is that the entire file is (apparently) read by modwsgi before the application code is run at all.


Graham Dumpleton

Nov 24, 2011, 15:28:32
to mod...@googlegroups.com
Do you mean sent rather than read?

Except for Opera, browsers don't implement 100-continue, so the
browser will always send the huge upload anyway. If the browser said
it is using HTTP/1.1, then Apache has no choice but to read the
entire request content and throw it away if you return a 200
response. This is because there may be another request following the
first over the same connection. So the problem is the browsers that
send the data anyway.

Try your test again, but instead of a 200 response return a
413 Request Entity Too Large error response. When a non-200 response
is returned, Apache will know it is an error and should just send the
response without also trying to consume the request content, because
after an error response browsers aren't supposed to send a subsequent
request over the same connection.
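
A minimal sketch of that suggestion, for illustration only: compare the declared CONTENT_LENGTH against a per-user limit and return 413 without ever touching `wsgi.input`. The `LIMITS` values, the `user_status` helper and the `myapp.user_status` environ key are hypothetical stand-ins for however the application really determines a user's status.

```python
# Hypothetical per-status upload limits in bytes; not values from this thread.
LIMITS = {'free': 1 * 1024 * 1024, 'paid': 100 * 1024 * 1024}


def user_status(environ):
    """Placeholder lookup; assumes some auth layer set this environ key."""
    return environ.get('myapp.user_status', 'free')


def application(environ, start_response):
    limit = LIMITS.get(user_status(environ), LIMITS['free'])
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        length = 0  # malformed header: treat as no declared body

    if length > limit:
        # Error response, returned without reading environ['wsgi.input'].
        body = b'Upload too large for this account.\n'
        start_response('413 Request Entity Too Large', [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(len(body))),
            ('Connection', 'close'),
        ])
        return [body]

    # Within the limit: a real handler would process wsgi.input here.
    body = b'OK\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

The check trusts the client's declared length, so a handler that goes on to read the body would still want to cap how much it actually reads.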

Graham

On 25 November 2011 07:00, Lisper <ron.g...@gmail.com> wrote:
> No, the problem is that the entire file is (apparently) read by modwsgi
> before the application code is run at all.

Graham Dumpleton

Nov 24, 2011, 15:39:08
to mod...@googlegroups.com
BTW, don't expect to see your nice message in the browser even when
using 413. Because browsers don't implement 100-continue as they
should, they usually don't try to deal with a response before sending
all the data. Because Apache will close the input side of the
connection, the browser will fail to send all the content and will
generally throw up a nasty connection error rather than read the
error response and show it.

Graham

Lisper

Nov 24, 2011, 16:26:45
to mod...@googlegroups.com
Wow, that really sucks.  It does indeed appear to be the case that the server is doing the Right Thing and it's the browser that's screwing this up.  Bummer.  I guess it's time to break out the client-side Javascript filefrobber object :-(

Thanks, Graham!
