proposal for http file uploads

286 views
Skip to first unread message

Andrew Gerrand

unread,
Apr 19, 2011, 11:06:17 PM4/19/11
to golang-dev
This is a proposal to make handling HTTP file uploads a lot easier,
and to make it possible to access the fields of multipart-encoded
forms with the http.Request.FormValue method.

Brad and I had a discussion earlier today and came up with this approach:


First, some lower level enhancements to mime/multipart:

Add a new ReadAll method on multipart.Reader that returns a map of the
field names to *Part.

func (*Reader) ReadAll(max int64) (map[string]*Part, os.Error)

ReadAll reads the entire request body, storing up to max bytes of data
in memory and writing the rest to temporary files (via
ioutil.TempFile).

mulitpart.*Part is an io.ReadCloser, and will need to be extended to
provide data stored either in memory or on disk.


Second, a lightweight wrapper of multipart in http:

func (r *Request) ParseMultipartForm(max int64) os.Error

ParseMultipartForm calls r.MultipartReader().ReadAll(max) and
populates a map in the Request that is read by FormValue and FormFile.
ParseMultipartForm should be analogous to ParseForm. It should be
idempotent.

func (*Request) FormFile(s string) *multipart.Part

FormFile is used to access uploaded files as a *multipart.Part. If
FormFile is called before ParseMultipartForm, it calls
ParseMultipartForm with a default maximum size;
http.DefaultMultipartMemoryBytes (or whatever).


Sample usage in an http handler:

func handle(w http.ResponseWriter, r *http.Request) {
f := r.File("image")
if f == nil {
http.Error(w, "no image provided", 401)
return
}
defer f.Close()
var buf bytes.Buffer
buf.ReadFrom(f)
// do something with buf, etc.
}


Additional thoughts:

multipart.Part could expose a raw []byte for files stored in memory.

The FormValue method could call ParseMultipartForm automatically if
the request's Content-type is multipart. This would make FormValue
"just work" for multipart forms.


Feedback?

Andrew

Russ Cox

unread,
Apr 20, 2011, 5:09:30 AM4/20/11
to Andrew Gerrand, golang-dev
It sounds reasonable to me.

I would rather not have to add a new
ParseMultipartForm but I don't see a way
around it. At least I never have to call it.
(FormValue should take care of it too.)

I am a little surprised that FormFile returns no error.
There are plenty of possible errors, from not having a key
to not having a valid header.

I don't understand why FormFile returns a *multipart.Part.
The Part interface is very much a stream kind of thing:
you have to consume one Part before you can use the
next one. It's confusing to reuse that type in a non-stream
context. I think it makes more sense to define either the
interface you promise or a specific concrete type.
Using the interface means that callers can potentially check
whether the returned file is an *os.File in order to do other
operations, but it means you have to return the header separately.

// A File provides access to the content of a file uploaded as
// part of an HTTP request.
type File interface {
io.Reader
io.ReaderAt
io.Seeker
io.Closer
}

// FormFile returns the header and content of a file uploaded
// as part of the request r. The caller must call f.Close when
// finished with the file.
func (r *Request) FormFile(key string) (hdr Header, f File, err os.Error)

Russ Cox

unread,
Apr 20, 2011, 5:12:08 AM4/20/11
to Andrew Gerrand, golang-dev
An alternative to returning the header, by the way, is to
record the "filename" field as the form value, so that if you
upload c:\my.txt as "file1" then FormValue("file1") == `c:\my.txt`
and FormFile("file1") returns File, os.Error.

I do that in a Go web server of mine and it seems okay.
At the least it gives an answer to the question of what FormValue
should return.

It's only really applicable if filename is the only possible
interesting header value. Maybe it is, maybe it isn't. I don't know.

Russ

Brad Fitzpatrick

unread,
Apr 20, 2011, 12:17:48 PM4/20/11
to Russ Cox, Andrew Gerrand, golang-dev
On Wed, Apr 20, 2011 at 2:09 AM, Russ Cox <r...@google.com> wrote:
It sounds reasonable to me.

I would rather not have to add a new
ParseMultipartForm but I don't see a way
around it.  At least I never have to call it.
(FormValue should take care of it too.)

I am a little surprised that FormFile returns no error.

What's an error besides ENOENT, which we can use nil for?
 
There are plenty of possible errors, from not having a key
to not having a valid header.

Valid Part Header?  What's invalid?
 

I don't understand why FormFile returns a *multipart.Part.
The Part interface is very much a stream kind of thing:
you have to consume one Part before you can use the
next one.  It's confusing to reuse that type in a non-stream
context.  I think it makes more sense to define either the
interface you promise or a specific concrete type.
Using the interface means that callers can potentially check
whether the returned file is an *os.File in order to do other
operations, but it means you have to return the header separately.

Yeah, we were thinking returning a Part (or nil) so we'd have an io.Reader and the headers.

But being able to get to the *os.File would indeed be useful.

 
// A File provides access to the content of a file uploaded as
// part of an HTTP request.
type File interface {
   io.Reader
   io.ReaderAt
   io.Seeker
   io.Closer
}

// FormFile returns the header and content of a file uploaded
// as part of the request r.  The caller must call f.Close when
// finished with the file.
func (r *Request) FormFile(key string) (hdr Header, f File, err os.Error)

Maybe swap File and Header?  I image a fair number of callers would assign Header to _. 

Russ Cox

unread,
Apr 20, 2011, 12:21:27 PM4/20/11
to Brad Fitzpatrick, Andrew Gerrand, golang-dev
>> I am a little surprised that FormFile returns no error.
>
> What's an error besides ENOENT, which we can use nil for?
>
>>
>> There are plenty of possible errors, from not having a key
>> to not having a valid header.
>
> Valid Part Header?  What's invalid?

I don't know. There's certainly "file not uploaded".
Is there other stuff like encodings and such that
could go wrong too, like "client clearly sent me
something but I don't understand it"?

>> // FormFile returns the header and content of a file uploaded
>> // as part of the request r.  The caller must call f.Close when
>> // finished with the file.
>> func (r *Request) FormFile(key string) (hdr Header, f File, err os.Error)
>
> Maybe swap File and Header?  I image a fair number of callers would assign
> Header to _.

Is the header useful? Is the only thing in it the file name?
That could be handled as FormValue if needed (see mail
I sent after the one you replied to). I honestly just don't
know. Is there an RFC?

Russ

Brad Fitzpatrick

unread,
Apr 20, 2011, 12:26:02 PM4/20/11
to Russ Cox, Andrew Gerrand, golang-dev
On Wed, Apr 20, 2011 at 9:21 AM, Russ Cox <r...@google.com> wrote:
>> I am a little surprised that FormFile returns no error.
>
> What's an error besides ENOENT, which we can use nil for?
>
>>
>> There are plenty of possible errors, from not having a key
>> to not having a valid header.
>
> Valid Part Header?  What's invalid?

I don't know.  There's certainly "file not uploaded".
Is there other stuff like encodings and such that
could go wrong too, like "client clearly sent me
something but I don't understand it"?

Fair enough.
 

>> // FormFile returns the header and content of a file uploaded
>> // as part of the request r.  The caller must call f.Close when
>> // finished with the file.
>> func (r *Request) FormFile(key string) (hdr Header, f File, err os.Error)
>
> Maybe swap File and Header?  I image a fair number of callers would assign
> Header to _.

Is the header useful?  Is the only thing in it the file name?

content-type / charset too.
 
That could be handled as FormValue if needed (see mail
I sent after the one you replied to).  I honestly just don't
know.  Is there an RFC?


Andrew Gerrand

unread,
Apr 20, 2011, 9:15:22 PM4/20/11
to Russ Cox, golang-dev
Some good points.

On 20 April 2011 19:09, Russ Cox <r...@google.com> wrote:
> I am a little surprised that FormFile returns no error.
> There are plenty of possible errors, from not having a key
> to not having a valid header.

I'm still not convinced an error is necessary. If there's an error in
the data, make it a Read error when you try to use the File.

> I don't understand why FormFile returns a *multipart.Part.

The plan is still to add the tempfile functionality to the multipart
package, right?

What about adding the type File to multipart, and having
multipart.Reader.ReadAll() return map[string]*File ? multipart.File
can then represent either a memory- or disk-based file, and
effectively represents the random-access version of multipart.Part.

type File struct {
Name string
ContentType string
Header textproto.MIMEHeader
// implementation
}


I think Request.FormValue should return the contents of the mime part
as a string. If it's a file and it'll be big, don't do that. But not
every mime part is a file, and accessing those fields via FormValue is
useful.

Andrew

Russ Cox

unread,
Apr 21, 2011, 7:44:41 AM4/21/11
to Andrew Gerrand, golang-dev
> I'm still not convinced an error is necessary. If there's an error in
> the data, make it a Read error when you try to use the File.

Why doesn't os.Open do that?

> The plan is still to add the tempfile functionality to the multipart
> package, right?

> I think Request.FormValue should return the contents of the mime part


> as a string. If it's a file and it'll be big, don't do that. But not
> every mime part is a file, and accessing those fields via FormValue is
> useful.

Accessing non-file fields via FormValue is certainly the right
thing to do. It is much less clear to me that accessing file
content via FormValue is correct. It will work for tiny files
and then fail for big ones.

Russ

Andrew Gerrand

unread,
Apr 25, 2011, 4:54:21 AM4/25/11
to Russ Cox, Brad Fitzpatrick, golang-dev
How about this:

--

package multipart:

// ReadForm parses an entire multipart message.
// Up to maxMemory total bytes of its file parts will be stored in RAM,
// with the remainder stored on disk in a temporary directory
func (*Reader) ReadForm(maxMemory int64) (Form, os.Error)

type Form struct {
Values map[string]string
Files map[string]*FormFile
}

type FormFile struct {
Header textproto.MIMEHeader
}

func (*FormFile) Open() (File, os.Error)

type File interface {
io.Reader
io.ReaderAt
io.Seeker
io.Closer
}


package http:

func (*Request) ParseMultipartForm(maxMemory int64) os.Error

func (*Request) FormFile(fieldname string) *multipart.FormFile

--

Usage:

func handler(w http.ResponseWriter, r *http.Request) {
ff := r.FormFile("image")
if ff == nil {
error("no image field found")
}
f, err := ff.Open()
if err != nil {
error("some kind of io or parse error", err)
}

// and, if you like...
fd, ok := f.(*os.File)
}


Thoughts?

Russ Cox

unread,
Apr 25, 2011, 7:32:46 AM4/25/11
to Andrew Gerrand, Brad Fitzpatrick, golang-dev
FormFile should return an error. This isn't C.

Overloading the ordinary return value implies that
(a) there is exactly one thing that can go wrong,
and (b) the calling code needs to remember what
that is, so it can issue the correct error to the user.
Returning an error solves both of those. And then
you don't need the separate Open.

Russ

Andrew Gerrand

unread,
Apr 25, 2011, 8:18:42 PM4/25/11
to Russ Cox, Brad Fitzpatrick, golang-dev
On 25 April 2011 21:32, Russ Cox <r...@google.com> wrote:
> FormFile should return an error.  This isn't C.
>
> Overloading the ordinary return value implies that
> (a) there is exactly one thing that can go wrong,

There is, in this case. Like FormValue, getting a nil from FormFile
implies there is no such field.

> and (b) the calling code needs to remember what
> that is, so it can issue the correct error to the user.
> Returning an error solves both of those.  And then
> you don't need the separate Open.

I wrestled with this for a while because there are instances where the
user will want to inspect the headers (and FormFile should also have a
Filename field, extracted from the Content-Disposition header) without
opening the file. If you think it's okay to force the user to Open to
inspect this data, then I'd prefer to make Open a method of Form:

func (Form) Open(fieldname string) (File, os.Error)

Where File is now a struct containing Header, Filename, etc.

Andrew

Russ Cox

unread,
Apr 25, 2011, 11:15:57 PM4/25/11
to Andrew Gerrand, Brad Fitzpatrick, golang-dev
>> Overloading the ordinary return value implies that
>> (a) there is exactly one thing that can go wrong,
>
> There is, in this case. Like FormValue, getting a nil from FormFile
> implies there is no such field.

Even if that were true, returning an error means that
the calling doesn't need to remember what the error is.
It can just print it.

Also, that's not the only error. For example, if the multipart
reader encounters an invalid multipart upload or an unexpected
EOF or something like that, you can return the error if
there's an error return. Otherwise you have to rely on
out-of-band logging that may or may not ever be read.

Why do you object so much to having an error as part
of the return value? Since people have to check whether
the call suceeded either way, it is no more onerous to
write err != nil than to write f == nil, and the former lets
the library return much more detail.

>> and (b) the calling code needs to remember what
>> that is, so it can issue the correct error to the user.
>> Returning an error solves both of those.  And then
>> you don't need the separate Open.
>
> I wrestled with this for a while because there are instances where the
> user will want to inspect the headers (and FormFile should also have a
> Filename field, extracted from the Content-Disposition header) without
> opening the file. If you think it's okay to force the user to Open to
> inspect this data,

Sure. It's super cheap.

> then I'd prefer to make Open a method of Form:
>
> func (Form) Open(fieldname string) (File, os.Error)
>
> Where File is now a struct containing Header, Filename, etc.

That would require calling ParseForm, since when a
request begins executing, r.Form is nil. I would prefer
not to have any methods on Form. Only power users
should have to know about ParseForm.

Russ

Andrew Gerrand

unread,
Apr 25, 2011, 11:47:15 PM4/25/11
to Russ Cox, Brad Fitzpatrick, golang-dev

There's some confusion between the interface as it should appear in
http.Request and the multipart package. That previous example was
about the multipart package. Sorry, I should have been more clear.

I agree with what you've said, and think we're on the same page. Let
me summarize.

Andrew

Reply all
Reply to author
Forward
0 new messages