On Apr 12, 6:15 pm, Hongli Lai <hon...@phusion.nl> wrote:
> The specification currently does not state whether rack.input has to
> be seekable.
The specification also says that three different APIs must be provided
concurrently: each, read and gets. This is another fundamental
problem. Any middleware which touches the body must also provide these
three APIs to the consumer.
The worst of these is 'gets', because a malicious client may force the
webserver to read the entire input into RAM (e.g. by submitting a 2GB
body which doesn't contain a newline).
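
To illustrate the 'gets' problem, here is a small sketch using StringIO as a stand-in for rack.input (an assumption; a real server would hand you a socket-backed object). The length limit shown is part of Ruby's own IO#gets, not something the Rack specification promises for rack.input.

```ruby
require 'stringio'

# A body with no newline anywhere (small here; imagine 2GB).
body = StringIO.new("x" * 100_000)

# Plain gets has to keep reading until it sees a newline or EOF,
# so the whole body ends up in a single String.
line = body.gets
unbounded_size = line.bytesize

# Ruby's IO#gets (and StringIO#gets) accept a length limit, which
# bounds how much is returned per call. This is Ruby's API only;
# Rack's gets is specified without arguments.
body.rewind
chunk = body.gets(4096)
bounded_size = chunk.bytesize
```

With the unbounded call the consumer has no way to cap memory use; the producer must buffer until it finds a line ending.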
read(n) isn't quite so bad, because the consumer can specify the
maximum amount to read, and the producer doesn't necessarily have to
provide that much for each chunk. However, read(n) and each() are
fundamentally different: read(n) is a "pull" (driven by consumer) and
each() is a "push" (driven by producer). It becomes very hard to write
modules which work both ways.
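
A rough sketch of the difference, with hypothetical names (PushBody, consume_by_pull and consume_by_push are mine, purely for illustration):

```ruby
require 'stringio'

# Pull: the consumer decides when to read and how much to ask for.
def consume_by_pull(input, chunk_size = 4)
  chunks = []
  while (chunk = input.read(chunk_size))
    chunks << chunk
  end
  chunks
end

# Push: the producer picks the chunk boundaries and drives the block.
# PushBody is a hypothetical producer, not part of Rack.
class PushBody
  def initialize(chunks)
    @chunks = chunks
  end

  def each
    @chunks.each { |chunk| yield chunk }
  end
end

def consume_by_push(body)
  chunks = []
  body.each { |chunk| chunks << chunk }
  chunks
end

pulled = consume_by_pull(StringIO.new("hello world"))
pushed = consume_by_push(PushBody.new(["hel", "lo wor", "ld"]))
```

In the pull case the consumer chose the 4-byte boundaries; in the push case it simply received whatever the producer yielded. A module sitting in the middle that has to offer read(n) on top of an each()-only source needs to buffer or suspend the producer somehow, which is where the difficulty comes from.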
Personally I think that Rack should specify exactly one API for
rack.input. My preference is for each(), since this is simple to
implement both for producers and consumers. This is, however, orthogonal
to the rewindability question, since you still need to define whether
the consumer can call 'each' only once or more than once.
I think that it should be defined that 'each' can only be called once,
but we can then provide a middleware module which makes the body
rewindable, and in addition provides the 'read' and 'gets' APIs. This
middleware would keep small bodies in RAM but spool larger ones to
disk.
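
Something along these lines is what I have in mind. This is only a sketch: the class name RewindableInput and the 64KB threshold are made up for illustration, and it assumes the incoming rack.input responds to a one-shot each() yielding String chunks.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical middleware; name and default threshold are illustrative.
class RewindableInput
  def initialize(app, max_in_memory = 64 * 1024)
    @app = app
    @max_in_memory = max_in_memory
  end

  def call(env)
    env['rack.input'] = buffer(env['rack.input'])
    @app.call(env)
  end

  private

  # Drain the one-shot each() body into a StringIO, switching to a
  # Tempfile as soon as the next chunk would exceed the threshold.
  def buffer(input)
    io = StringIO.new(String.new)
    input.each do |chunk|
      if io.is_a?(StringIO) && io.size + chunk.bytesize > @max_in_memory
        file = Tempfile.new('rack-input')
        file.binmode
        file.write(io.string)
        io = file
      end
      io.write(chunk)
    end
    io.rewind
    io
  end
end

# A hypothetical one-shot body, standing in for what a server adapter
# would supply.
class OneShotBody
  def initialize(chunks)
    @chunks = chunks
  end

  def each(&block)
    @chunks.each(&block)
  end
end

app = lambda do |env|
  input = env['rack.input']
  first = input.read
  input.rewind
  [200, {}, [first, input.read]]
end

status, headers, parts =
  RewindableInput.new(app).call('rack.input' => OneShotBody.new(%w[abc def]))
```

Both StringIO and Tempfile respond to read, gets and rewind, so the app downstream gets the richer API without the server adapter having to implement it. The sketch leaves Tempfile cleanup to the garbage collector; a real version would probably want to close and unlink it once the request is finished.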
This approach keeps web server adapters and middleware modules simple
but still provides buffered input to those apps which require it. The
disadvantage is that if the webserver itself already provides body
buffering, we would be ignoring this capability. (Do any of them do
this?)
Regards,
Brian.