Limiting file size before upload?


daniel...@gmail.com

Jul 10, 2006, 6:42:12 AM
to Django users
Hi.

Is there a way to limit upload file size *before* the upload is
accepted?

Say I want to give people the option of uploading movie files, but I
want to limit them to 4MB per movie. If I understand the upload
behaviour of Django correctly, the file is first accepted into main
memory (*), after which I have access to metadata.
This would mean that somebody could easily OOM my Apache just by
uploading a huuuuuge file. Also, in a hosted environment, this is
unnecessary traffic which I'd have to pay for.

If there were a way to check the "content-length" header BEFORE
accepting the upload, this problem could be prevented (the socket could
just be closed and a "permission denied" response or some other HTTP
error could be returned).

Is there a way to do this?

Daniel

(*) I'm aware that there are also Django extensions which do this
streaming to disk, instead of to memory; still, I'd have to accept the
sender's bulk data before I could discover that it is way too big.
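
The Content-Length check described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions: it is written as generic WSGI middleware rather than for the mod_python setup discussed in the thread, and the names `limit_content_length`, `reject`, and `middleware` are hypothetical, not part of Django. The header is declared by the client, so it may be missing or wrong, and the front-end web server may already have started receiving the body by the time this code runs.

```python
def limit_content_length(app, max_bytes):
    """Wrap a WSGI app and reject requests whose declared body is too large.

    Hypothetical helper for illustration; not a Django or mod_python API.
    """

    def reject(start_response, status, message):
        # Send a short plain-text error without touching the request body.
        body = message.encode("utf-8")
        start_response(status, [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    def middleware(environ, start_response):
        # CONTENT_LENGTH is whatever the client declared; it can be absent.
        raw = environ.get("CONTENT_LENGTH") or "0"
        try:
            declared = int(raw)
        except ValueError:
            return reject(start_response, "400 Bad Request",
                          "Invalid Content-Length.\n")
        if declared > max_bytes:
            return reject(start_response, "413 Request Entity Too Large",
                          "Upload too large.\n")
        # Declared size is acceptable: hand the request to the real app.
        return app(environ, start_response)

    return middleware
```

Wrapping an application would look like `application = limit_content_length(application, 4 * 1024 * 1024)` for the 4MB limit mentioned above.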

Malcolm Tredinnick

Jul 10, 2006, 6:57:54 AM
to django...@googlegroups.com
On Mon, 2006-07-10 at 10:42 +0000, daniel...@gmail.com wrote:
> Hi.
>
> Is there a way to limit upload file size *before* the upload is
> accepted?
>
> Say I want to give people the option of uploading movie files, but I
> want to limit them to 4MB per movie. If I understand the upload
> behaviour of Django correctly, the file is first accepted into main
> memory (*), after which I have access to metadata.
> This would mean that somebody could easily OOM my Apache just by
> uploading a huuuuuge file. Also, in a hosted environment, this is
> unnecessary traffic which I'd have to pay for.

Without answering your question directly, let me just point out that
even if such a limit existed in Django, it still wouldn't help your
Apache process, since it is already accepting the data.

Fortunately, you *can* control this at the Apache level. Look at the
LimitRequestBody directive.

Regards,
Malcolm
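
For the 4MB limit in the original question, a configuration fragment using LimitRequestBody might look like the sketch below. The `/upload/` path is a made-up example, not something from this thread; the directive takes its limit in bytes and can also be set server-wide or per virtual host.

```apache
# Hypothetical upload URL; adjust to wherever the upload view is mounted.
<Location "/upload/">
    # 4MB = 4 * 1024 * 1024 bytes
    LimitRequestBody 4194304
</Location>
```

Requests with a larger body should then be refused by Apache itself, before the Django handler is involved.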


daniel...@gmail.com

Jul 10, 2006, 7:15:26 AM
to Django users
Malcolm,

Malcolm Tredinnick wrote:
> Without answering your question directly, let me just point out that
> even if such a limit existed in Django, it still wouldn't help your
> Apache process, since it is already accepting the data.

Hummm... That's probably right. I wasn't too sure at which point during
the HTTP request processing mod_python / Django would be called.

> Fortunately, you *can* control this at the Apache level. Look at the
> LimitRequestBody directive.

Thanks! I didn't know that directive and indeed it seems to do exactly
what I want. And the Apache configuration probably *is* the better
place to handle this.

Daniel

Malcolm Tredinnick

Jul 10, 2006, 7:24:40 AM
to django...@googlegroups.com
On Mon, 2006-07-10 at 11:15 +0000, daniel...@gmail.com wrote:
> Malcolm,
>
> Malcolm Tredinnick wrote:
> > Without answering your question directly, let me just point out that
> > even if such a limit existed in Django, it still wouldn't help your
> > Apache process, since it is already accepting the data.
>
> Hummm... That's probably right. I wasn't too sure at which point during
> the HTTP request processing mod_python / Django would be called.

I'm not 100% certain either. However, if the only hope you (Apache) have
to avoid being hit by a truck is that the guy behind you (Django) holds
up a stop sign in time, then I would start looking for a plan B.

Regards,
Malcolm

Ivan Sagalaev

Jul 10, 2006, 7:46:50 AM
to django...@googlegroups.com
Malcolm Tredinnick wrote:
> I'm not 100% certain either. However, if the only hope you (Apache) have
> to avoid being hit by a truck is that the guy behind you (Django) holds
> up a stop sign in time, then I would start looking for a plan B.

AFAIK mod_python's handler is called very early in the process, before
all the data has arrived on the server. It then reads all the request
data from a stream (which I suppose connects to the receiving TCP socket,
maybe through some buffering wrappers). But the user's code is not hit
until Django's mod_python handler has chewed everything into memory or
into a temp file (using appropriate patches). I often think it would be a
good idea to be able to intercept this process with some user-pluggable
code.

The following may be better suited to django-developers... I see this
as two changes:

- a setting for the buffer size when reading from a stream. This may as
well override my STORE_UPLOAD_ON_DISK from
http://code.djangoproject.com/ticket/1484. When the stream is bigger than
the buffer it is stored in a temp file, and kept in memory otherwise

- a new middleware method "process_stream" that users can hook into to do
something. Two obvious use cases are limiting the upload size and
recording upload progress somewhere in the database, where it can be read
live by another Ajax request that updates some shiny progress bar in a
browser.

But I'm a bit reluctant to start implementing such things since my much
simpler patch is not applied for some unknown reason :-)
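
The two proposed changes can be sketched in plain Python as below. This is a rough illustration only, not Django code: no such stream hook exists in Django at the time of this thread, and the names `read_stream`, `limit_upload`, `UploadTooLarge`, and the hook signature are all assumptions made for the example. The standard library's `tempfile.SpooledTemporaryFile` stands in for the "keep in memory until the buffer is exceeded, then move to a temp file" behaviour.

```python
import tempfile


class UploadTooLarge(Exception):
    """Raised by a hook to abort reading the request body."""


def read_stream(stream, hooks=(), buffer_size=64 * 1024, chunk_size=8192):
    """Read a request body in chunks, calling each hook per chunk.

    Data stays in memory up to buffer_size bytes, then spills to a
    temporary file. Hypothetical sketch; not an existing Django API.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=buffer_size)
    received = 0
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            received += len(chunk)
            # Hooks see each chunk and the running total before it is stored.
            for hook in hooks:
                hook(chunk, received)
            spool.write(chunk)
    except Exception:
        # Discard the partial upload if a hook (or the read) fails.
        spool.close()
        raise
    spool.seek(0)
    return spool


def limit_upload(max_bytes):
    """Build a hook that aborts once more than max_bytes has arrived."""
    def hook(chunk, received):
        if received > max_bytes:
            raise UploadTooLarge("received %d bytes, limit is %d"
                                 % (received, max_bytes))
    return hook
```

A progress hook would have the same shape, for example `lambda chunk, received: save_progress(received)`, where `save_progress` is whatever (hypothetical) function records the running total for the progress bar.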
