wsgi and multipart


william opensource4you

Dec 19, 2010, 1:23:37 PM
to fapws
All,

Following the WSGI specification, Fapws puts the posted data into a
file-like object called wsgi.input.
Unfortunately, most of the time we have to parse it to get the data,
the files' content, ...
Do you know if there is a standard approach for this?
Looking at different web frameworks, if I'm not wrong, there is no
common approach (Django, Quixote, ...).

On the other hand, we should keep the memory footprint of Fapws light.
So should we store the data in a temporary file? When should it be
removed: automatically, or at the user's request?
Do you have any idea how other web servers handle this?

So, two different questions, but closely interconnected :-)


Many thanks

W.

Jonas H.

Dec 19, 2010, 3:10:30 PM
to fa...@googlegroups.com
On 12/19/2010 07:23 PM, william opensource4you wrote:
> All,
>
> Following the WSGI specification, Fapws puts the posted data into a
> file-like object called wsgi.input.
> Unfortunately, most of the time we have to parse it to get the data,
> the files' content, ...
> Do you know if there is a standard approach for this?

I don't think there's any module in the Python standard library to do
this. Marcel Hellkamp (the author of the Bottle micro web framework) has
a very well-written multipart parser which you can find here:
https://github.com/defnull/multipart

For parsing POST/GET data I think there's `urlparse.parse_qs[l]` (`urllib.parse.parse_qs[l]` in Python 3).
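
A minimal sketch of reading a urlencoded form from wsgi.input, assuming Python 3 (where the functions live in `urllib.parse`) and a UTF-8 body. The `read_form` helper and the hand-built `environ` are made up for illustration and are not part of Fapws:

```python
import io
from urllib.parse import parse_qs


def read_form(environ):
    """Decode an application/x-www-form-urlencoded body from wsgi.input."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # Read only CONTENT_LENGTH bytes; reading further may block on a socket.
    raw = environ["wsgi.input"].read(length)
    # parse_qs returns a dict mapping each field name to a list of values.
    return parse_qs(raw.decode("utf-8"), keep_blank_values=True)


# Hypothetical environ, built by hand instead of by a real server.
body = b"name=fapws&tag=a&tag=b"
environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_TYPE": "application/x-www-form-urlencoded",
    "CONTENT_LENGTH": str(len(body)),
    "wsgi.input": io.BytesIO(body),
}
fields = read_form(environ)
```

This only covers urlencoded forms; multipart/form-data bodies (file uploads) still need a dedicated parser.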

You could also have a look at small web frameworks (Bottle, ...) and
Werkzeug; both have integrated POST/GET and multipart parsing.

> On the other hand, we should keep the memory footprint of Fapws light.
> So should we store the data in a temporary file? When should it be
> removed: automatically, or at the user's request?
> Do you have any idea how other web servers handle this?

I think the best solution is to keep HTTP bodies in memory unless they
exceed a certain size limit (then I'd put them into a file). You could
have a look at meinheld (https://github.com/mopemope/meinheld) which
uses exactly that approach.
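
In pure Python the same idea can be sketched with `tempfile.SpooledTemporaryFile`, which stays in memory until `max_size` is exceeded and then rolls over to a real temporary file. This is not meinheld's or Fapws's actual code; the `buffer_body` helper, the 64 KiB threshold and the chunk size are assumptions picked for illustration:

```python
import io
import tempfile

MAX_IN_MEMORY = 64 * 1024  # hypothetical threshold, tune to taste
CHUNK_SIZE = 16 * 1024     # hypothetical read size


def buffer_body(source, length, max_in_memory=MAX_IN_MEMORY):
    """Copy `length` bytes from `source` into a spooled temporary file."""
    body = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    remaining = length
    while remaining > 0:
        chunk = source.read(min(CHUNK_SIZE, remaining))
        if not chunk:  # client sent less than it announced
            break
        body.write(chunk)  # spills to disk once max_in_memory is exceeded
        remaining -= len(chunk)
    body.seek(0)  # rewind so the application reads from the start
    return body


small = buffer_body(io.BytesIO(b"x" * 10), 10, max_in_memory=100)
large = buffer_body(io.BytesIO(b"y" * 200), 200, max_in_memory=100)
small_data = small.read()
large_data = large.read()
```

The temporary file, if one was created, is deleted when the object is closed, so the server could close it once the response has been sent.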

Another idea is to use memory mapped files, assuming the operating
system does its memory swapping job well.
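
A rough sketch of the memory-mapped variant, assuming the body length is known up front from CONTENT_LENGTH and is greater than zero. The `map_body` helper is hypothetical:

```python
import io
import mmap
import tempfile


def map_body(source, length):
    """Copy `length` (> 0) bytes from `source` into a file-backed memory map."""
    backing = tempfile.TemporaryFile()
    backing.truncate(length)  # an empty file cannot be mapped, so size it first
    mapped = mmap.mmap(backing.fileno(), length)
    mapped.write(source.read(length))
    mapped.seek(0)  # rewind so the application reads from the start
    return backing, mapped


payload = b"hello mmap"
backing, mapped = map_body(io.BytesIO(payload), len(payload))
data = mapped.read(len(payload))
mapped.close()
backing.close()  # the temporary file is removed when it is closed
```

The mapped object supports read() and seek() like a file, and the operating system decides which pages stay in RAM.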

Please let me know what you decide to do! (Well, actually I'll know
anyway thanks to GitHub feeds :-)

Jonas

william opensource4you

Dec 22, 2010, 5:44:15 PM
to fa...@googlegroups.com
Thanks Jonas.

Finally, after checking what exists, I've implemented my own solution
(check GitHub).

The last step would be to adapt the current flow of Fapws to ensure that
big uploads go directly to disk, keeping the memory footprint as low as
possible. This mainly concerns connection_cb, where we should write to
'wsgi.input' instead of collecting into cli->input_body.
Is there a volunteer to code this? :-)

W.
