code review 4528086: http: content-type negotiation and Accept header parsing. (issue4528086)

609 views
Skip to first unread message

wwa...@googlemail.com

unread,
May 20, 2011, 6:50:59 PM5/20/11
to golan...@googlegroups.com, golan...@googlegroups.com, re...@codereview.appspotmail.com
Reviewers: golang-dev_googlegroups.com,

Message:
Hello golan...@googlegroups.com (cc: golan...@googlegroups.com),

I'd like you to review this change to
https://go.googlecode.com/hg/


Description:
http: content-type negotiation and Accept header parsing.

This introduces two functions and one type. The type is Accept
which represents a clause in an Accept header. The functions
are ParseAccept which parses such a header and Negotiate
which, given a header and a list of alternatives content types
returns the most appropriate result.

It is possible that the Negotiate should default by returning
the first of the alternative in the case that the provided
header is empty. This doesn't appear to happen often in
practice and in any case perhaps it is the calling function's
job to make sure that the header passed in is valid.

Please review this at http://codereview.appspot.com/4528086/

Affected files:
M src/pkg/http/Makefile
A src/pkg/http/autoneg.go
A src/pkg/http/autoneg_test.go


brad...@golang.org

unread,
May 20, 2011, 7:38:29 PM5/20/11
to wwa...@googlemail.com, golan...@googlegroups.com, golan...@googlegroups.com, re...@codereview.appspotmail.com
What do you think about:

type Accept struct{}

type AcceptList []Accept

func (al AcceptList) Negotiate(contentType ...string) string {
// ...

}

func (r *Request) Accept() AcceptList {
}

?

http://codereview.appspot.com/4528086/

William Waites

unread,
May 21, 2011, 4:37:52 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, re...@codereview.appspotmail.com
Generally I think this would be a nice way to go about it, however
there are some concerns:

* right now there is an accept_slice, which is like AcceptList. It exists for
using the sort package and I purposely did not want to expose the sort
implementation publicly, not least because it is backwards - sorts from
largest to smallest - which could be confusing if someone tried to use it
and I can't imagine why they would try to use it anyways. So probably
it makes sense to keep the private accept_slice and make AcceptList
a separate type that exists only for the purpose of the Negotiate method.
Doing this rather than just having a function is just a matter of style and
I'm happy to defer to your judgement.

* Maybe it should handle errors in the Accept header more strictly. With an
invalid header right now it will try to do the best it can,
skipping over anything
it doesn't understand how to parse. Returning an error also from Accept
would be a stronger statement about the completeness of the implementation
then the best effort that it makes now.

* Likewise for corner cases in the Negotiate function. What should it do if
the AcceptList is empty? What should it do if the list of alternatives is
empty? On a high level what should happen in the former case is probably
return the first alternative, and the second should result in a 406 Not
Acceptable. Right now it returns an empty string in both cases which doesn't
distinguish. So also returning an error here would probably make sense?

* Hiding the explicit parse function is probably ok. I can't really see why it
would be desirable to want to parse such a header string outside of the
context of an http.Request.

I'll work on rejigging it now...

Cheers,
-w

--
William Waites
Email: wwa...@gmail.com
UK tel: +44 131 516 3563
UK mob: +44 789 798 9965

wwa...@googlemail.com

unread,
May 21, 2011, 5:29:10 AM5/21/11
to golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com

wwa...@googlemail.com

unread,
May 21, 2011, 5:30:46 AM5/21/11
to golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com

dch...@gmail.com

unread,
May 21, 2011, 7:08:57 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com
I'd love to have this type somewhere in http package:

struct HeaderField type {
Value string
Q int
Params map[string]string
}

and ParseHeaderFields that parses and sorts it. It could be reused in
parsing Accept-Encoding, Accept-Charset, and Accept-Language.


http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go
File src/pkg/http/autoneg.go (right):

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode14
src/pkg/http/autoneg.go:14:
No empty line here.

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode25
src/pkg/http/autoneg.go:25: // Structure to represent a clause in an
HTTP Accept Header
. and the end of sentences (everywhere)

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode33
src/pkg/http/autoneg.go:33: type accept_sorter []Accept
acceptSorter

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode64
src/pkg/http/autoneg.go:64: func (al AcceptList) Negotiate(alternatives
...string) (content_type string, err os.Error) {
contentType

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode108
src/pkg/http/autoneg.go:108: media_range := mrp[0]
mediaRange or something shorter.

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode122
src/pkg/http/autoneg.go:122: err = os.ErrorString(errs)
err = os.NewError("autoneg: invalid media range in " + part)

http://codereview.appspot.com/4528086/

William Waites

unread,
May 21, 2011, 8:48:06 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
On 21 May 2011 12:08, <dch...@gmail.com> wrote:
> I'd love to have this type somewhere in http package:
>
> struct HeaderField type {
>        Value string
>        Q int
>        Params map[string]string
> }
>
> and ParseHeaderFields that parses and sorts it. It could be reused in
> parsing Accept-Encoding, Accept-Charset, and Accept-Language.

The thing is, the sorting algorithm is related but not the same. For
Accept we have type/subtype with facilities for wildcards, for Accept-Language
we have lang[-country]. I think but am not sure without checking that
Accept-Encoding and Accept-Charset would be the same, but still
different from the other two. Similarly with the matching algorithm.

If you have time to generalise the implementation in such a way as to
handle all of these cases, please feel free.

Cheers,
-w

dch...@gmail.com

unread,
May 21, 2011, 9:09:38 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com
On 2011/05/21 12:48:08, ww wrote:

> On 21 May 2011 12:08, <mailto:dch...@gmail.com> wrote:
> > I'd love to have this type somewhere in http package:
> >
> > struct HeaderField type {
> > &nbsp; &nbsp; &nbsp; &nbsp;Value string
> > &nbsp; &nbsp; &nbsp; &nbsp;Q int
> > &nbsp; &nbsp; &nbsp; &nbsp;Params map[string]string

> > }
> >
> > and ParseHeaderFields that parses and sorts it. It could be reused
in
> > parsing Accept-Encoding, Accept-Charset, and Accept-Language.

> The thing is, the sorting algorithm is related but not the same. For
> Accept we have type/subtype with facilities for wildcards, for
Accept-Language
> we have lang[-country]. I think but am not sure without checking that
> Accept-Encoding and Accept-Charset would be the same, but still
> different from the other two. Similarly with the matching algorithm.

If sorting doesn't apply, we can parse first into the list of
HeaderField and then apply different sortings for AcceptList,
LanguageList, etc.

http://codereview.appspot.com/4528086/

William Waites

unread,
May 21, 2011, 9:46:22 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
> If sorting doesn't apply, we can parse first into the list of
> HeaderField and then apply different sortings for AcceptList,
> LanguageList, etc.

The reason I didn't do this originally is that it would mean that the
Accept (or HeaderField) contains only partially parsed values - I
thought it cleaner to have the parsing code in the parsing function
and the sorting and matching code in the sorting and matching
functions.

Looking at RFC2616 at the other negotiation headers, it is indeed
the case that Charset and Encoding would be the same. However
none of Charset, Encoding or Language have any concept of extra
parameters.

So the choices seem to be:

* Have three different-but-similar types and three different-but-similar
parsers
* Have one type and move some of the parsing into the sort/match
functions which have to be different anyways
* Replace HeaderField.Value with an interface that can be used for
sorting and a two-pass parser and have a special type for Accept
that perhaps embeds HeaderField and adds a map for params

The types are not complicated (Accept is actually the most
complicated) and neither are the parsers. The redundancy implied by
the first option (more or less where we are now) is not that great and
it keeps the code simple to understand.

The second option would be the straightforward way of implenting what
you suggest, but having parsing in the sorter is ugly as I have mentioned.

The third stands to remove all possible redundancy in the code and might
ultimately be the nicest but the effort/benefit tradeoff for removing such
a small amount of redundancy (and adding structural complexity to an
otherwise straightforward implementation) seems wrong to me, but I
wouldn't object if someone else wanted to do it.

dch...@gmail.com

unread,
May 21, 2011, 10:10:49 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com
What I'm trying to say is that it would be better if the parsing stages
and sorting stages were separate. We have the same parser for 3
different header types.

ParseHeaderFields(headerValue string) []HeaderField

It would return just an unsorted slice of header fields:

ParseHeaderFields("text/html;q=0.7, text/*")

=>

[&HeaderField{Value: "text/html", Q: 0.7, ...}, &HeaderField{Value:
"text/*", Q: 1, ...}]

ParseHeaderFields("en-us;q=0.8, de;q=0.1")

=>

[&HeaderField{Value: "en-us", Q: 0.8, ...}, &HeaderField{Value: "de", Q:
0.1, ...}]

Then []Accept, []Languages, []Charsets thing can sort them as needed and
return their own values.
This way we need to write parser only once, minimizing the effect of the
"Don't parse" rule (http://cr.yp.to/qmail/guarantee.html) :-)

http://codereview.appspot.com/4528086/

William Waites

unread,
May 21, 2011, 10:24:14 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
The don't parse rule is about "converting an unstructured sequence of commands,
in a format usually determined more by psychology than by solid
engineering, into
structured data"

All of these headers are quite well structured and rigorously defined
in the RFC.
Moreover we have to parse them to do the sorting and matching because those
operations depend on the internal structure of things you're suggesting to leave
unparsed.

Right now the parsing stages and sorting stages *are* separate.

dch...@gmail.com

unread,
May 21, 2011, 10:33:44 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com
On 2011/05/21 14:24:16, ww wrote:
> All of these headers are quite well structured and rigorously defined
> in the RFC.

Except, you're receiving an untrusted input.

> Moreover we have to parse them to do the sorting and matching because
those
> operations depend on the internal structure of things you're
suggesting to leave
> unparsed.

You'll get what you need for Accept sorting in HeaderField.Value.

> Right now the parsing stages and sorting stages *are* separate.

But I won't be able to use this parser to implement []Languages, so I'll
have to write another one.

http://codereview.appspot.com/4528086/

William Waites

unread,
May 21, 2011, 10:41:42 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
On 21 May 2011 15:33, <dch...@gmail.com> wrote:
> On 2011/05/21 14:24:16, ww wrote:
>>
>> All of these headers are quite well structured and rigorously defined
>> in the RFC.
>
> Except, you're receiving an untrusted input.

And if it can't be parsed, an error is returned which the application can
handle as it wishes/

> You'll get what you need for Accept sorting in HeaderField.Value.

Which I then have to parse (for Accept and Language) inside the
sorting function... And the matching function.

>> Right now the parsing stages and sorting stages *are* separate.
>
> But I won't be able to use this parser to implement []Languages, so I'll
> have to write another one.

Its a pretty simple parser. The format of the string being parsed is
different. I don't think this is much of a burden.

dch...@gmail.com

unread,
May 21, 2011, 10:53:01 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com
On 2011/05/21 14:41:44, ww wrote:

> On 21 May 2011 15:33, <mailto:dch...@gmail.com> wrote:
> > On 2011/05/21 14:24:16, ww wrote:
> >>
> >> All of these headers are quite well structured and rigorously
defined
> >> in the RFC.
> >
> > Except, you're receiving an untrusted input.

> And if it can't be parsed, an error is returned which the application
can
> handle as it wishes/

Provided that the parser code is correct.

> > You'll get what you need for Accept sorting in HeaderField.Value.

> Which I then have to parse (for Accept and Language) inside the
> sorting function... And the matching function.

You do basically the same, but instead of the long parsing method, you
have to care only about individual values.

parseAccept(fields []HeaderField) []Accept {
...
}

> >> Right now the parsing stages and sorting stages *are* separate.
> >
> > But I won't be able to use this parser to implement []Languages, so
I'll
> > have to write another one.

> Its a pretty simple parser. The format of the string being parsed is
> different. I don't think this is much of a burden.

I agree that it's simple, but is it different?
value;q=...;extension=..., *

http://codereview.appspot.com/4528086/

William Waites

unread,
May 21, 2011, 10:56:50 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
> I agree that it's simple, but is it different?
> value;q=...;extension=..., *

Extension does not exist for Charset, Language, Encoding.

dch...@gmail.com

unread,
May 21, 2011, 11:01:31 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, golan...@googlegroups.com, re...@codereview.appspotmail.com
On 2011/05/21 14:56:51, ww wrote:
> > I agree that it's simple, but is it different?
> > value;q=...;extension=..., *

> Extension does not exist for Charset, Language, Encoding.

And thus, ignored when filling

type Language struct {
Script string
Region string
Q float32
}

from HeaderField

http://codereview.appspot.com/4528086/

wwa...@googlemail.com

unread,
May 21, 2011, 11:14:36 AM5/21/11
to golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, golan...@googlegroups.com, re...@codereview.appspotmail.com

William Waites

unread,
May 21, 2011, 11:16:43 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
On 21 May 2011 16:01, <dch...@gmail.com> wrote:
> On 2011/05/21 14:56:51, ww wrote:
>>
>> > I agree that it's simple, but is it different?
>> > value;q=...;extension=..., *
>
>> Extension does not exist for Charset, Language, Encoding.
>
> And thus, ignored when filling

So you don't want to parse the media type in the generic parser
but you do want to parse the extension parameters which are
illegal in the other headers?

I've just updated with your cosmetic change suggestions.

William Waites

unread,
May 21, 2011, 11:34:50 AM5/21/11
to wwa...@googlemail.com, golan...@googlegroups.com, brad...@golang.org, dch...@gmail.com, re...@codereview.appspotmail.com
Actually, discussing unrelated things in IRC, what would be *really* nice is
for the ebnf package's Grammar to be able to unmarshal things in a similar
way to the json and xml packages. That would be a nice general solution.

Pascal S. de Kloe

unread,
May 23, 2011, 6:47:24 PM5/23/11
to wwa...@googlemail.com, golan...@googlegroups.com, re...@codereview.appspotmail.com
Hi William,

It seems we have been doing the same thing during the weekend by pure chance.

type Variant struct {
// contains filtered or unexported fields
}
The http.Variant is an immutable resource representation option.

func NewVariant(mediaType string, languages ...string) *Variant
Gets a new http.Variant.
The mediaType may contain a charset parameter to specify its
character encoding.
If either the mediaType or languages are malformed NewVariant
returns nil.

func SelectVariant(available []Variant, request *Request, response
ResponseWriter) *Variant
SelectVariant returns the most appropriate Variant or nil if no match was
found.
When nil is returned then no futher action to the ResponseWriter is required.

func (v *Variant) Apply(header Header)
Applies the appropriate content headers for the response message.


The Accept, Accept-Charset and Accept-Language header parsing is done and
tested but the rest is still under development. I can send you a copy if you
want to?

Idealy I'd like to end up with func (mux *ServeMux) HandleVariant(pattern
string, handler Handler, content *Variant). What do you think?

Brad Fitzpatrick

unread,
May 23, 2011, 8:25:15 PM5/23/11
to William Waites, golang-dev, dch...@gmail.com, reply
On Sat, May 21, 2011 at 8:34 AM, William Waites <wwa...@googlemail.com> wrote:
Actually, discussing unrelated things in IRC, what would be *really* nice is
for the ebnf package's Grammar to be able to unmarshal things in a similar
way to the json and xml packages. That would be a nice general solution.

That might be a distraction.

That's an internal implementation detail.  Let's get the types & API right and if we have to copy/paste a bit of code for now, so be it.  If the Grammar package or whatever lets us simplify the implementation in the future, great.

I also wouldn't mind re-using the same struct type ("Accept") for both "Accept" and "Accept-Language", even if one doesn't have optional parameters.  Both are arguably about "Accept".  Or some other name.  Just document it well.

brad...@golang.org

unread,
May 23, 2011, 8:28:43 PM5/23/11
to wwa...@googlemail.com, golan...@googlegroups.com, dch...@gmail.com, pas...@quies.net, golan...@googlegroups.com, re...@codereview.appspotmail.com

http://codereview.appspot.com/4528086/diff/9001/src/pkg/http/autoneg.go#newcode163
src/pkg/http/autoneg.go:163: // such a client would be quite broken...
isn't this a comma-separated field? multiple headers should mean the
same thing according to RFC 2616, no?

That is:

Accept: application/xml,application/xhtml+xml

and:

Accept: application/xml
Accept: application/xhtml+xml

Are the same, aren't they?

http://codereview.appspot.com/4528086/

andrew...@99designs.com

unread,
Aug 21, 2014, 9:35:42 PM8/21/14
to golan...@googlegroups.com, wwa...@googlemail.com, dch...@gmail.com, pas...@quies.net, re...@codereview.appspotmail.com, brad...@golang.org

Hi,

I'm currently looking into parsing Accept-Language headers, and I came across this patch. Is there any likelihood 
 of this patch getting accepted into core, or are there any third-party packages which provide similar functionality?

Cheers,
Andy

Brad Fitzpatrick

unread,
Aug 28, 2014, 11:25:01 AM8/28/14
to andrew...@99designs.com, golang-dev, wwa...@googlemail.com, Dmitry Chestnykh, pas...@quies.net, re...@codereview.appspotmail.com
Until something in the standard library depends on this, it's unlikely we'll add it to the standard library.  It should be made available somewhere where it's go gettable.  It probably is already.

Reply all
Reply to author
Forward
0 new messages