RFC: Removing the default Content-Type from the http package

David Symonds

May 31, 2011, 8:45:44 PM

to golan...@googlegroups.com

This is a proposal to drop the default Content-Type that is set by the
http package when running as a server.

The existing behaviour is that when the http package is writing the
response headers, if no explicit Content-Type has been set,
"text/html; charset=utf-8" is used.
I think that's a mistake, and my arguments are below.

(1) Setting a default Content-Type, while convenient, is not Go-like.
It is backward-looking, not forward-looking.

One of, if not the most important argument put forth that the
Content-Type should be defaulted to this is that older browsers
(particularly Internet Explorer) do a bad job of content-sniffing when
they don't receive a Content-Type.
However, newer browsers tend to be better behaved, *except* when you
give them the wrong Content-Type (see point 2). We're optimising for a
dying breed of clients, and *only* when the programmer doesn't declare
what Content-Type they are generating.
If we care this much that a Content-Type is set, perhaps the http
package should throw errors if content is written without a
Content-Type being set.

(2) This particular default, "text/html; charset=utf-8" is not almost
always the right one.

When we dropped the laddr argument to net.Dial, that was a useful
change because the programmer almost always wants laddr to be "", and
there were other options for the rare instance where the programmer
wanted something else. That's not the case with Content-Type.

It's true that many uses of the http package will be sending UTF-8
encoded HTML back, but it's only a majority case, and probably only a
slim majority at that. Other responses include image/*, text/plain,
application/json, application/octet-stream, and so on. It would be
better for there to be *no* Content-Type sent with those responses
than an *incorrect* Content-Type for many reasons, not least of which
that browsers behave unpredictably when given an incorrect
Content-Type.

A small anecdote: I was a teaching assistant at Google I/O BootCamp
this year, and I came across one attendee who was horribly confused.
Their tiny HTTP handler looked something like this:
    func serve(w http.ResponseWriter, r *http.Request) {
        t := T{"something", 4}
        fmt.Fprintf(w, "{ %s , %d }", t.Name, t.Age)
    }
Their browser (I can't remember; it might have been Firefox) was
throwing up an obscure XML error message trying to parse the response,
and it was because the Content-Type was silently set to "text/html;
charset=utf-8". That's not a good first experience, and it wasn't easy
to explain.
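
For comparison, here is a minimal sketch of the same handler with the media type declared explicitly. The struct T and its fields are taken from the snippet above; the use of encoding/json, the helper name contentTypeOf, and the in-memory recorder are assumptions made for illustration, not what the attendee wrote.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// T mirrors the struct implied by the handler in the anecdote.
type T struct {
	Name string
	Age  int
}

// serve declares the media type before writing the body, so nothing is
// left to a library default or to browser sniffing.
func serve(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	t := T{"something", 4}
	if err := json.NewEncoder(w).Encode(t); err != nil {
		log.Printf("encode: %v", err)
	}
}

// contentTypeOf (hypothetical helper) runs a handler against an in-memory
// recorder and reports the Content-Type it produced.
func contentTypeOf(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest("GET", "/", nil))
	return rec.Result().Header.Get("Content-Type")
}

func main() {
	fmt.Println(contentTypeOf(serve))
}
```

The one extra line is the whole cost of being explicit, and the value is visible right next to the code that produces the body.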

In short, the Web is not the entirety of the Internet, and HTML is not
the only thing sent over HTTP.

(3) Bad programs are still going to get it wrong.

A program that doesn't care (or forgets) to explicitly set a
Content-Type header is not guaranteed to be generating valid
UTF-8-encoded HTML.

(4) We're violating the RFC.

The HTTP RFC specifies that a Content-Type header SHOULD be included,
and that a client MAY guess if the header isn't there. That's the
protocol supported by an increasing majority of browsers; while we're
trying to be clever to work around bad behaviour of older, dying
clients, we're mucking up the behaviour of newer, well-behaved
clients. RFCs aren't the be-all and end-all, but standards are most
useful when things conform to them, and short of compelling reasons
(old IE support not being one of those) we should follow the standard.

(5) It's magical.

I expect a HTTP package to do the right kinds of protocol work and
header formatting for me, and maybe even set things like
Content-Length that it can perfectly deduce. I don't expect a HTTP
package to declare to the world what my Content-Type is, especially
when it is a static default. It's not the way that the vast majority
of other widely used HTTP packages work, and it's surprising.

Dave.

Kyle Lemons

Jun 1, 2011, 11:51:03 AM

to David Symonds, golan...@googlegroups.com

> (1) Setting a default Content-Type, while convenient, is not Go-like.
> It is backward-looking, not forward-looking.
>
> One of, if not the most important argument put forth that the
> Content-Type should be defaulted to this is that older browsers
> (particularly Internet Explorer) do a bad job of content-sniffing when
> they don't receive a Content-Type.
> However, newer browsers tend to be better behaved, *except* when you
> give them the wrong Content-Type (see point 2). We're optimising for a
> dying breed of clients, and *only* when the programmer doesn't declare
> what Content-Type they are generating.
> If we care this much that a Content-Type is set, perhaps the http
> package should throw errors if content is written without a
> Content-Type being set.

I agree that we should remove the content-type default. I really like that the only headers that get sent are the ones I explicitly provide. I have no problem with adding a convenience method to instruct the http package to guess the content type using some magic sauce, though. I also don't think an error should be given for omitting the content-type, but I can understand the argument and wouldn't push back.

> (2) This particular default, "text/html; charset=utf-8" is not almost
> always the right one.
>
> In short, the Web is not the entirety of the Internet, and HTML is not
> the only thing sent over HTTP.

+1. I have been bitten by similar HTML parsing issues when I wasn't actually sending HTML. Easy fix, but if I didn't know what was going on, I would have been stumped.

> (3) Bad programs are still going to get it wrong.
>
> A program that doesn't care (or forgets) to explicitly set a
> Content-Type header is not guaranteed to be generating valid
> UTF-8-encoded HTML.

I've even found that when I'm writing HTML, I usually want it to be interpreted as HTML5, so even then I change the content-type.

> (4) We're violating the RFC.
>
> The HTTP RFC specifies that a Content-Type header SHOULD be included,
> and that a client MAY guess if the header isn't there. That's the
> protocol supported by an increasing majority of browsers; while we're
> trying to be clever to work around bad behaviour of older, dying
> clients, we're mucking up the behaviour of newer, well-behaved
> clients. RFCs aren't the be-all and end-all, but standards are most
> useful when things conform to them, and short of compelling reasons
> (old IE support not being one of those) we should follow the standard.

I may have consistently been in the minority, but I have always left IE (especially IE6) out in the cold when I have to choose between a hack to make it work and ostracizing people who still use that browser. That bias disclosed, of course I support dropping anything that was added strictly for IE compatibility. There may have been other, better reasons that I don't know, though.

> (5) It's magical.

And not the good kind of magic.

--
~Kyle

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
— Brian Kernighan

Russ Cox

Jun 1, 2011, 11:54:01 AM

to Kyle Lemons, David Symonds, golan...@googlegroups.com

Why do you change the content type for html5?
I thought you were supposed to write <!DOCTYPE html> ?

Russ Cox

Jun 1, 2011, 12:01:14 PM

to David Symonds, golan...@googlegroups.com

I wrote the current code, so just to give the rationale...

> (1) Setting a default Content-Type, while convenient, is not Go-like.
> It is backward-looking, not forward-looking.

It is very Go like to make the API as simple as possible.
The most common thing you want to do in a web server
is serve web pages, and there the content type should
be text/html; charset=utf-8, hence the default.

The charset=utf-8 is important, because it is standard in Go.
Most people who write these servers will forget that part.
That's why it's there automatically.

> One of, if not the most important argument put forth that the
> Content-Type should be defaulted to this is that older browsers
> (particularly Internet Explorer) do a bad job of content-sniffing when
> they don't receive a Content-Type.
> However, newer browsers tend to be better behaved, *except* when you
> give them the wrong Content-Type (see point 2). We're optimising for a
> dying breed of clients, and *only* when the programmer doesn't declare
> what Content-Type they are generating.

I don't buy this at all. Not setting Content-Type is equivalent
to using an uninitialized variable: it might happen to work out,
but it's not guaranteed. The safe thing is to initialize the variable
to a defined default, and then you'll get consistent behavior
everywhere.

> (2) This particular default, "text/html; charset=utf-8" is not almost
> always the right one.
>

> It's true that many uses of the http package will be sending UTF-8
> encoded HTML back, but it's only a majority case, and probably only a
> slim majority at that. Other responses include image/*, text/plain,
> application/json, application/octet-stream, and so on. It would be
> better for there to be *no* Content-Type sent with those responses
> than an *incorrect* Content-Type for many reasons, not least of which
> that browsers behave unpredictably when given an incorrect
> Content-Type.

At least they behave the same. Sure there are other possible
content-types. That's why it's not hard-coded. Most handlers
people write send HTML.

> A small anecdote: I was a teaching assistant at Google I/O BootCamp
> this year, and I came across one attendee who was horribly confused.
> Their tiny HTTP handler looked something like this:
>     func serve(w http.ResponseWriter, r *http.Request) {
>         t := T{"something", 4}
>         fmt.Fprintf(w, "{ %s , %d }", t.Name, t.Age)
>     }
> Their browser (I can't remember; it might have been Firefox) was
> throwing up an obscure XML error message trying to parse the response,
> and it was because the Content-Type was silently set to "text/html;
> charset=utf-8". That's not a good first experience, and it wasn't easy
> to explain.

Huh? How is `{ something, 4 }` not valid HTML?

> (3) Bad programs are still going to get it wrong.
>
> A program that doesn't care (or forgets) to explicitly set a
> Content-Type header is not guaranteed to be generating valid
> UTF-8-encoded HTML.

A program that sets it is not guaranteed to do so either.
This is not a valid argument.

> (4) We're violating the RFC.
>
> The HTTP RFC specifies that a Content-Type header SHOULD be included,
> and that a client MAY guess if the header isn't there. That's the
> protocol supported by an increasing majority of browsers; while we're
> trying to be clever to work around bad behaviour of older, dying
> clients, we're mucking up the behaviour of newer, well-behaved
> clients. RFCs aren't the be-all and end-all, but standards are most
> useful when things conform to them, and short of compelling reasons
> (old IE support not being one of those) we should follow the standard.

You have a very different interpretation of the RFC than I do.
My reading of those words is that setting Content-Type is
preferable to not setting it.

> (5) It's magical.
>
> I expect a HTTP package to do the right kinds of protocol work and
> header formatting for me, and maybe even set things like
> Content-Length that it can perfectly deduce. I don't expect a HTTP
> package to declare to the world what my Content-Type is, especially
> when it is a static default. It's not the way that the vast majority
> of other widely used HTTP packages work, and it's surprising.

It's not magical; it's a default setting.

Russ

Kyle Lemons

Jun 1, 2011, 12:16:50 PM

to r...@golang.org, David Symonds, golan...@googlegroups.com

On Wed, Jun 1, 2011 at 8:54 AM, Russ Cox <r...@golang.org> wrote:

> Why do you change the content type for html5?
> I thought you were supposed to write <!DOCTYPE html> ?

Sorry, I meant to say xhtml5.

I use application/xhtml+xml and friends.

David Symonds

Jun 1, 2011, 7:12:16 PM

to rsc, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 2:01 AM, Russ Cox <r...@golang.org> wrote:

> I wrote the current code, so just to give the rationale...
>
>> (1) Setting a default Content-Type, while convenient, is not Go-like.
>> It is backward-looking, not forward-looking.
>
> It is very Go like to make the API as simple as possible.

It's Go-like in its simplicity, but not in its practicality. And it
seems to be tilted towards older, dying browsers, rather than newer,
rising browsers; *that* is what this point was about.

It's Go-like to be explicit about things; an extra line of code isn't
going to kill people, and we demand it in many situations. Providing
this default in this circumstance stands out.

>> One of, if not the most important argument put forth that the
>> Content-Type should be defaulted to this is that older browsers
>> (particularly Internet Explorer) do a bad job of content-sniffing when
>> they don't receive a Content-Type.
>> However, newer browsers tend to be better behaved, *except* when you
>> give them the wrong Content-Type (see point 2). We're optimising for a
>> dying breed of clients, and *only* when the programmer doesn't declare
>> what Content-Type they are generating.
>
> I don't buy this at all. Not setting Content-Type is equivalent
> to using an uninitialized variable: it might happen to work out,
> but it's not guaranteed. The safe thing is to initialize the variable
> to a defined default, and then you'll get consistent behavior
> everywhere.

It's nothing like using an uninitialized variable. It's more like
using a zero value. It's well-defined, and the fact that older, dying
browsers misbehave is orthogonal to that.

You're right that the safe thing to do is to initialise it to the
right thing, but it's the programmer who knows best what that right
thing is, and the right thing is *not* always "text/html;
charset=utf-8".

>> (2) This particular default, "text/html; charset=utf-8" is not almost
>> always the right one.
>>
>> It's true that many uses of the http package will be sending UTF-8
>> encoded HTML back, but it's only a majority case, and probably only a
>> slim majority at that. Other responses include image/*, text/plain,
>> application/json, application/octet-stream, and so on. It would be
>> better for there to be *no* Content-Type sent with those responses
>> than an *incorrect* Content-Type for many reasons, not least of which
>> that browsers behave unpredictably when given an incorrect
>> Content-Type.
>
> At least they behave the same. Sure there are other possible
> content-types. That's why it's not hard-coded. Most handlers
> people write send HTML.

No, they don't behave the same. IE does some sniffing, and will ignore
Content-Type if it looks too incorrect for some classes of MIME types.
Firefox throws weird errors. Chrome usually takes the Content-Type at
face value.

The HTML case is probably a majority, but I'd wager it's more like a
60% majority than a 99% majority. We should make it easy, but I
disagree it should be the default.

>> A small anecdote: I was a teaching assistant at Google I/O BootCamp
>> this year, and I came across one attendee who was horribly confused.
>> Their tiny HTTP handler looked something like this:
>>     func serve(w http.ResponseWriter, r *http.Request) {
>>         t := T{"something", 4}
>>         fmt.Fprintf(w, "{ %s , %d }", t.Name, t.Age)
>>     }
>> Their browser (I can't remember; it might have been Firefox) was
>> throwing up an obscure XML error message trying to parse the response,
>> and it was because the Content-Type was silently set to "text/html;
>> charset=utf-8". That's not a good first experience, and it wasn't easy
>> to explain.
>
> Huh? How is `{ something, 4 }` not valid HTML?

It's not valid HTML. HTML starts with a tag, whether <HTML> or a <!DOCTYPE>.
I suspect the browser was guessing that it might have been JavaScript.

>> (3) Bad programs are still going to get it wrong.
>>
>> A program that doesn't care (or forgets) to explicitly set a
>> Content-Type header is not guaranteed to be generating valid
>> UTF-8-encoded HTML.
>
> A program that sets it is not guaranteed to do so either.
> This is not a valid argument.

A program that sets it is much more likely to get it right, because
the value is visible to the programmer, and they at least had to do
something to put it there. If they get it wrong, and something
misbehaves, they will see the Content-Type value they set, as opposed
to having to memorise the default that the http package applies
silently.

>> (4) We're violating the RFC.
>>
>> The HTTP RFC specifies that a Content-Type header SHOULD be included,
>> and that a client MAY guess if the header isn't there. That's the
>> protocol supported by an increasing majority of browsers; while we're
>> trying to be clever to work around bad behaviour of older, dying
>> clients, we're mucking up the behaviour of newer, well-behaved
>> clients. RFCs aren't the be-all and end-all, but standards are most
>> useful when things conform to them, and short of compelling reasons
>> (old IE support not being one of those) we should follow the standard.
>
> You have a very different interpretation of the RFC than I do.
> My reading of those words is that setting Content-Type is
> preferable to not setting it.

Setting it to a correct value, yes, but for a good portion of the time
"text/html; charset=utf-8" is *not* the correct value.

>> (5) It's magical.
>>
>> I expect a HTTP package to do the right kinds of protocol work and
>> header formatting for me, and maybe even set things like
>> Content-Length that it can perfectly deduce. I don't expect a HTTP
>> package to declare to the world what my Content-Type is, especially
>> when it is a static default. It's not the way that the vast majority
>> of other widely used HTTP packages work, and it's surprising.
>
> It's not magical; it's a default setting.

A default setting that, incidentally, is not documented.
But even if it were, it's still unusual amongst HTTP packages, and
surprising to me and others.

Dave.

David Symonds

Jun 1, 2011, 8:42:24 PM

to golan...@googlegroups.com

Actually, I know that Russ and I could argue about this until the cows
come home, but we can do that in any forum.
The purpose of this email was to solicit the opinions of other folk.
I thank Kyle for chiming in (not just because he agreed with me), and
hope that other people can voice their opinion.

Dave.

Andrew Gerrand

Jun 1, 2011, 8:57:29 PM

to David Symonds, rsc, golan...@googlegroups.com

On 2 June 2011 09:12, David Symonds <dsym...@golang.org> wrote:
> Setting it to a correct value, yes, but for a good portion of the time
> "text/html; charset=utf-8" is *not* the correct value.

So let me get this straight. You state that:
- most of the time the present default is correct,
- sometimes it is not,
- people might not set the content-type,
- therefore we should force them to set it every time.

IMO, your proposed change merely increases the likelihood of people
getting it wrong.

It is reasonable to expect people to set the content-type in the
minority cases, but why should I have to add (literally) hundreds of
lines to my existing web projects? It is boilerplate and that sucks.

> A default setting that, incidentally, is not documented.

It should be documented, then.

+1 to everything Russ said.

Andrew

David Symonds

Jun 1, 2011, 9:15:07 PM

to Andrew Gerrand, rsc, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 10:57 AM, Andrew Gerrand <a...@golang.org> wrote:

> So let me get this straight. You state that:
> - most of the time the present default is correct,

I don't even know that UTF-8-encoded HTML is "most of the time".

As for some concrete data, I went to
http://www.google.com/codesearch?hl=en&lr=&q=lang%3Ago+func.*http.ResponseWriter&sbtn=Search
to get a feel for what types of data people are generating. Here's a summary:
1. Ambiguous. It's a framework.
2. Produces an incorrect HTTP response, because it's writing plain
text, not HTML.
3. Ambiguous, but certainly gets it wrong in error cases because
it's writing plain text, not HTML.
4. Gets it wrong, because it's not producing HTML.
5. Aah, finally some HTML. This benefits from the default.
6. Gets it right in one place, because it explicitly sets a
Content-Type, but gets it wrong in almost every other because it isn't
HTML.
7. Has a helper to explicitly set a Content-Type correctly.

I got tired after those seven, but they seem like they might be
representative. You'll notice that only 1 in 7 benefits from the
default. So I retract my statement that HTML is the majority case,
because it seems not to be.

> IMO, your proposed change merely increases the likelihood of people
> getting it wrong.

At the moment, the average programmer gets it 100% wrong in the
non-HTML case; I think that's a big case, if not a strict majority.

With my proposal, the average programmer will get it 0% wrong if they
don't set a Content-Type, and will usually get it close to right if
they do set it.

Not including the "charset=utf-8" is *not* getting it wrong; it's just
suboptimal. And if a programmer writes code that sets Content-Type to
image/png, but then produces JSON, then there's not much we can do to
stop them.

Dave.

David Symonds

Jun 1, 2011, 9:15:57 PM

to Andrew Gerrand, rsc, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 10:57 AM, Andrew Gerrand <a...@golang.org> wrote:

> It is boilerplate and that sucks.

Oh, and it might feel like boilerplate, but it's not. It's a
legitimate HTTP response to not include Content-Type, so you don't
*have* to include it.

Dave.

Russ Cox

Jun 1, 2011, 10:20:16 PM

to David Symonds, Andrew Gerrand, golan...@googlegroups.com

It's pretty clear that you think not sending a Content-Type is a good thing.
I respectfully disagree.

Russ

Kyle Lemons

Jun 1, 2011, 10:34:00 PM

to Andrew Gerrand, David Symonds, rsc, golan...@googlegroups.com

Perhaps it's a default that we could set on a per-http.Server basis? (on that note, perhaps there could be a default Header in the Server so that any header could be given a per-application default)

This removes the boilerplate from each individual request, but gives individual developers the ability to change the default value for themselves. It also gives developers the ability to swap out the content-type without fishing around and finding all of the Content-Types, for instance if they decide to switch from HTML to XHTML by default. It wouldn't affect all of the handlers that set it explicitly.

The more arguments I read against ditching the default completely, the more I think that, rather than throwing it out, the compromise might be to set the default to something painful or to print a warning when the content-type is unset.

For setting the default to something painful:

I wouldn't recommend doing this without my other suggestion, as it does make things horrible for anyone who knowingly relies on the default when that's the proper MIME type. I'm not sure I would recommend going quite as far as setting it to application/octet-stream (though that would pretty clearly indicate to most web developers what the problem was when they got a download dialog), but I think text/plain might be suitable.

The rationale here is that it should never cause the browser to make an inappropriate decision about the data (it will be displayed as plain text which can never fail to parse or get executed by some processor) and it is recognizable enough that in cases where it is not the desired content type the developer should know what to fix.

For printing an error message:

If, as the assumption appears to go, most output is HTML, then we can safely prepend or append a message about needing to set a content-type. I think this works even better if the default can be set for the server, as the error message can almost say "Insert this line of code where you set up your HTTP handlers" and wouldn't require much to fix.

Russ Cox

Jun 1, 2011, 10:41:52 PM

to Kyle Lemons, Andrew Gerrand, David Symonds, golan...@googlegroups.com

On Wed, Jun 1, 2011 at 22:34, Kyle Lemons <kev...@google.com> wrote:
> Perhaps it's a default that we could set on a per-http.Server basis?

You are adding complexity. Any server can use a handler that does
whatever it likes before passing the buck to other handlers.
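
A minimal sketch of such a wrapping handler follows. The name withDefaultType, the two example handlers, and the chosen default of text/plain are all hypothetical; the point is only that a per-application default needs no new API in the http package.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withDefaultType (hypothetical name) wraps a handler and pre-sets a
// Content-Type. Handlers that call Header().Set themselves override it.
func withDefaultType(contentType string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", contentType)
		next.ServeHTTP(w, r)
	})
}

// plain relies on the wrapper's default.
func plain(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "hello")
}

// asJSON overrides the default explicitly.
func asJSON(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, `{"ok":true}`)
}

// typeVia reports the Content-Type a handler produces behind the wrapper.
func typeVia(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/", nil)
	withDefaultType("text/plain; charset=utf-8", h).ServeHTTP(rec, req)
	return rec.Result().Header.Get("Content-Type")
}

func main() {
	fmt.Println(typeVia(plain))
	fmt.Println(typeVia(asJSON))
}
```

Because the wrapper sets the header before the inner handler runs, an explicit Set in the inner handler simply replaces the default.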

Russ

Russ Cox

Jun 1, 2011, 10:52:06 PM

to David Symonds, Andrew Gerrand, golan...@googlegroups.com

On Wed, Jun 1, 2011 at 22:20, Russ Cox <r...@golang.org> wrote:
> It's pretty clear that you think not sending a Content-Type is a good thing.
> I respectfully disagree.

To put it a different way, can you point to any authority that
says that omitting Content-Type in HTTP responses is the
new Right Thing To Do?

Russ

Nigel Tao

Jun 1, 2011, 10:57:51 PM

to David Symonds, golan...@googlegroups.com

On 2 June 2011 10:42, David Symonds <dsym...@golang.org> wrote:
> The purpose of this email was to solicit the opinions of other folk.

FWIW, when I was reviewing the http/fcgi package, I made (or was going
to make, I forget exactly) a comment that it shouldn't add
Content-Type by default, and was surprised to learn that the http
package does.

I would still prefer a naked HTTP response, but I can see Russ'
rationale for the current behavior, and am willing to let him pick the
bikeshed color.

David Symonds

Jun 1, 2011, 10:59:32 PM

to rsc, Andrew Gerrand, golan...@googlegroups.com

You're not really stating my position quite accurately. Here's a list,
from best to worst, in my opinion:
- setting the Content-Type correctly
- not setting the Content-Type
- setting the Content-Type incorrectly

I'd prefer the second over the third, and the third occurs much more
regularly with the current default.

RFC 2616 section 7.2.1 says
Any HTTP/1.1 message containing an entity-body SHOULD include a
Content-Type header field defining the media type of that body. If
and only if the media type is not given by a Content-Type field, the
recipient MAY attempt to guess the media type via inspection of its
content and/or the name extension(s) of the URI used to identify the
resource. If the media type remains unknown, the recipient SHOULD
treat it as type "application/octet-stream".

I'd take it as implicit in that first sentence that the media type
should be the correct one.
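
To make "guess the media type via inspection of its content" concrete, here is a small sketch using http.DetectContentType from current Go releases. It is one possible sniffing algorithm, not a description of what any particular browser does, and the helper name guess and the sample bodies are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// guess (hypothetical helper) returns the media type that Go's sniffer
// infers from the leading bytes of a body.
func guess(body string) string {
	return http.DetectContentType([]byte(body))
}

func main() {
	samples := []string{
		"<html><body>hi</body></html>", // recognisable HTML
		"{ something , 4 }",            // text with no HTML signature
		"\x89PNG\r\n\x1a\n",            // PNG file signature
	}
	for _, s := range samples {
		fmt.Printf("%q -> %s\n", s, guess(s))
	}
}
```

A recipient that guesses this way would treat the brace-delimited body as plain text rather than HTML, which is the outcome a forced text/html header prevents.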

And as far as RFCs are concerned, SHOULD is not MUST. I think an
incorrect value is worse than no value, and I reckon that at least Jon
Postel would agree.

Dave.

Kyle Lemons

Jun 1, 2011, 11:37:13 PM

to David Symonds, rsc, Andrew Gerrand, golan...@googlegroups.com

Is it possible under the current setup to explicitly NOT send a content-type?

If not, that may be an argument against setting the default.

I also agree that making the default a configurable default is adding complexity, but I don't see why that's an argument against it. (Removing the content-type default would remove complexity from the http package too, but that's not a good reason to do so.) It's one more method on one more object, it doesn't break any existing code, and it removes the appearance that anyone who's writing lots of non-HTML handlers is a second-class user of http (I have one app with almost no HTML handlers because the loading page is static and it loads content via XHR from the app). If expanding the interface is a worry, why not grab that one text string and make it into an exported package variable, so it could be changed on a per-app basis (like flag.Usage)?

David Symonds

Jun 1, 2011, 11:44:02 PM

to Kyle Lemons, rsc, Andrew Gerrand, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 1:37 PM, Kyle Lemons <kev...@google.com> wrote:

> Is it possible under the current setup to explicitly NOT send a
> content-type?

It's currently impossible. That bit us while implementing the
blobstore API for App Engine, incidentally, though that's a bit more
of a niche situation.
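
For readers on current Go releases, which postdate this thread: a handler can now suppress the automatic header by assigning a nil value to the key, as the sketch below shows. The handler names and the typeOf helper are hypothetical, and this says nothing about the 2011 package under discussion.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sniffed sets no Content-Type, so net/http fills one in from the body.
func sniffed(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "opaque bytes")
}

// bare suppresses the automatic header: a nil value for the key tells
// net/http to leave Content-Type out of the response.
func bare(w http.ResponseWriter, r *http.Request) {
	w.Header()["Content-Type"] = nil
	fmt.Fprint(w, "opaque bytes")
}

// typeOf (hypothetical helper) runs a handler against an in-memory
// recorder and returns the Content-Type it ended up with, if any.
func typeOf(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest("GET", "/", nil))
	return rec.Result().Header.Get("Content-Type")
}

func main() {
	fmt.Printf("sniffed: %q\n", typeOf(sniffed))
	fmt.Printf("bare:    %q\n", typeOf(bare))
}
```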

Dave.

Brad Fitzpatrick

Jun 2, 2011, 1:05:58 AM

to David Symonds, golan...@googlegroups.com

Top-replying here for grouping with the thread, but replying to nothing in particular.

I was hoping this would die again but it seems opinions are still sought.

Brain dump:

* RFCs in general, and especially RFC 2616, are often accidentally, optimistically, or delusionally wrong. Reality trumps spec ambiguity.

* Changing this would break a lot of code. What's your proposed migration path? I can't think of a good one.

* I would like it to be possible to send an empty Content-Type, e.g.:

rw.Header().Set("Content-Type", "")

... should be enough. We can distinguish that from an unset key in the Headers map.

* But I want it to be HARD to do that. Because why the hell would you go out of your way to do that? If you're already thinking about the Content-Type by typing those lines, you might as well type application/octet-stream.

* So that leaves people who forget to set it. Now somebody has to sniff. Is that browsers (n outcomes) or Go (1 outcome). If anybody is sniffing, I would prefer it be us.

* In conclusion, I'm quite happy with the current behavior (sans inability to explicitly do weirdo missing Content-Type, which bugs me only on principle). If I had to rank the options, from best to worst:

1) leave things as-is, but permit empty Content-Type if set explicitly

2) leave things as-is

3) default to Go sniffing and setting something sane (html utf8, jpeg, gif, octet stream, etc)

4) not sniffing, but exploding hard if user didn't explicitly set something (and crap, what if they set "text/html" without a charset?!)

5) defaulting to not sending a Content-Type (weirdo behavior, even Apache has sent a default since 199whatever) and letting browsers sniff, leading to multiple possible outcomes (even ignoring security and IE stuff, which doesn't interest me either way)

I could imagine all sorts of new handlers (possibly in DefaultServeMux) to do any of the above magic at various places, but that just introduces complexity & options to appease everybody but doesn't solve the problem. Perhaps the least offensive of those is we could keep things as-is, but introduce minimal sniffing in Go that doesn't change the response Content-Type but *does* log.Printf("http: you might be making a mistake, yo") if they're sending a JPEG or JSON as text/html. Consider that 1.5) in my list above.
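
A rough sketch of what option 1.5 might look like, assuming the http.DetectContentType sniffer found in current Go releases. The helper name suspiciousHTML is hypothetical, and the check is deliberately narrow: it only questions a declared text/html, and it would also flag HTML fragments that do not begin with a tag the sniffer recognises.

```go
package main

import (
	"log"
	"mime"
	"net/http"
)

// suspiciousHTML (hypothetical helper) reports whether a body declared
// as text/html sniffs as something else. Parameters such as charset are
// ignored; any declared type other than text/html is never flagged.
func suspiciousHTML(declared string, body []byte) bool {
	mediaType, _, err := mime.ParseMediaType(declared)
	if err != nil || mediaType != "text/html" {
		return false
	}
	sniffed, _, err := mime.ParseMediaType(http.DetectContentType(body))
	return err == nil && sniffed != "text/html"
}

func main() {
	const declared = "text/html; charset=utf-8"
	bodies := [][]byte{
		[]byte("<!DOCTYPE html><html></html>"),
		[]byte(`{"name":"something","age":4}`),
		[]byte("\xff\xd8\xff\xe0"), // JPEG signature
	}
	for _, b := range bodies {
		if suspiciousHTML(declared, b) {
			// Warn only; the response Content-Type is left unchanged.
			log.Printf("http: you might be making a mistake, yo (body starts %q)", b)
		}
	}
}
```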

David Symonds

Jun 2, 2011, 1:25:04 AM

to Brad Fitzpatrick, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 3:05 PM, Brad Fitzpatrick <brad...@golang.org> wrote:

> * RFCs in general, and especially RFC 2616, are often accidentally,
> optimistically, or delusionally wrong. Reality trumps spec ambiguity.

Reality is that this default is incorrect for what seems to be a
majority of code. The RFC reference is but one of my points.

> * Changing this would break a lot of code. What's your proposed migration
> path? I can't think of a good one.

I can't see how this would break any code. It would, in fact, fix the
HTTP response of what seems to be a majority of code that does not set
an explicit Content-Type.

> * So that leaves people who forget to set it. Now somebody has to sniff.
> Is that browsers (n outcomes) or Go (1 outcome). If anybody is sniffing, I
> would prefer it be us.

Browsers are already sniffing. Why do you necessarily think we'd do a
better job?
Incidentally, I'm not opposed to us adding some sniffing on the Go
side, if that's what people really, truly want. But even then, if the
sniffing isn't confident it should still default to nothing.

Dave.

Brad Fitzpatrick

Jun 2, 2011, 1:35:04 AM

to David Symonds, golan...@googlegroups.com

On Wed, Jun 1, 2011 at 10:25 PM, David Symonds <dsym...@golang.org> wrote:

> On Thu, Jun 2, 2011 at 3:05 PM, Brad Fitzpatrick <brad...@golang.org> wrote:
>
>> * Changing this would break a lot of code. What's your proposed migration
>> path? I can't think of a good one.
>
> I can't see how this would break any code.

Disagree.

> It would, in fact, fix the
> HTTP response of what seems to be a majority of code that does not set
> an explicit Content-Type.

"fix" in the sense of users _generally_ get what they wanted, some/most of the time, and ship it, even though it's incorrect.

Their JSON response today omitting the Content-Type works, as long as the JSON contains no HTML-like strings in the first 1024 kb (which wasn't in their test suite), but then they ship it in prod and sometimes it stops working because now some users with some content are generating JSON that happens to look like HTML in some browsers.

So we gave them confidence to ship their buggy code. Yay us. No, I'd rather we break their Content-Type-less JSON immediately and force them to do the right thing. Not get bitten later.

>> * So that leaves people who forget to set it. Now somebody has to sniff.
>> Is that browsers (n outcomes) or Go (1 outcome). If anybody is sniffing, I
>> would prefer it be us.
>
> Browsers are already sniffing. Why do you necessarily think we'd do a
> better job?

When I write code in Go, I expect it to run the same on Linux, Mac, FreeBSD, Windows, Chrome, Firefox, or MSIE.

Portability means not leaving API behavior up to the environment, be that system calls or content-type sniffing.

David Symonds

Jun 2, 2011, 1:40:55 AM6/2/11

to Brad Fitzpatrick, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 3:35 PM, Brad Fitzpatrick <brad...@golang.org> wrote:

> So we gave them confidence to ship their buggy code. Yay us. No, I'd
> rather we break their Content-Type-less JSON immediately and force them to
> do the right thing. Not get bitten later.

So if a Content-Type isn't set, we should log loudly or even panic.
Setting Content-Type to "text/html; charset=utf-8" when the content is
JSON is going to behave differently on different browsers, so we'd
have to hope that they're testing every response their (who?) server
makes on every browser.

>> > * So that leaves people who forget to set it. Now somebody has to
>> > sniff.
>> > Is that browsers (n outcomes) or Go (1 outcome). If anybody is
>> > sniffing, I
>> > would prefer it be us.
>>
>> Browsers are already sniffing. Why do you necessarily think we'd do a
>> better job?
>
> When I write code in Go, I expect it to run the same on Linux, Mac, FreeBSD,
> Windows, Chrome, Firefox, or MSIE.
> Portability means not leaving API behavior up to the environment, be that
> system calls or content-type sniffing.

Then you shouldn't want this default Content-Type. It results in HTTP
responses that do *not* behave the same across browsers.

Dave.

Brad Fitzpatrick

Jun 2, 2011, 1:52:15 AM6/2/11

to David Symonds, golan...@googlegroups.com

On Wed, Jun 1, 2011 at 10:40 PM, David Symonds <dsym...@golang.org> wrote:

> On Thu, Jun 2, 2011 at 3:35 PM, Brad Fitzpatrick <brad...@golang.org> wrote:
>
>> So we gave them confidence to ship their buggy code. Yay us. No, I'd
>> rather we break their Content-Type-less JSON immediately and force them to
>> do the right thing. Not get bitten later.
>
> So if a Content-Type isn't set, we should log loudly or even panic.
> Setting Content-Type to "text/html; charset=utf-8" when the content is
> JSON is going to behave differently on different browsers, so we'd
> have to hope that they're testing every response their (who?) server
> makes on every browser.

Give me examples.

You said earlier you didn't care about MSIE, but if you do, let's just set "X-Content-Type-Options: nosniff" on all our responses:

http://blogs.msdn.com/b/ie/archive/2008/09/02/ie8-security-part-vi-beta-2-update.aspx

> >> > * So that leaves people who forget to set it. Now somebody has to
> >> > sniff.
> >> > Is that browsers (n outcomes) or Go (1 outcome). If anybody is
> >> > sniffing, I
> >> > would prefer it be us.
> >>
> >> Browsers are already sniffing. Why do you necessarily think we'd do a
> >> better job?
> >
> > When I write code in Go, I expect it to run the same on Linux, Mac, FreeBSD,
> > Windows, Chrome, Firefox, or MSIE.
> > Portability means not leaving API behavior up to the environment, be that
> > system calls or content-type sniffing.
>
> Then you shouldn't want this default Content-Type. It results in HTTP
> responses that do *not* behave the same across browsers.

I remain unconvinced. I'm off to sleep now, but surprise me with a standalone Go server in the morning that demonstrates the problem in various browsers.

Mikio Hara

Jun 2, 2011, 4:19:07 AM6/2/11

to David Symonds, Brad Fitzpatrick, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 2:25 PM, David Symonds <dsym...@golang.org> wrote:
> On Thu, Jun 2, 2011 at 3:05 PM, Brad Fitzpatrick <brad...@golang.org> wrote:
>
>> * RFCs in general, and especially RFC 2616, are often accidentally,
>> optimistically, or delusionally wrong. Reality trumps spec ambiguity.
>
> Reality is that this default is incorrect for what seems to be a
> majority of code. The RFC reference is but one of my points.

Thank you for sharing with us.

But I'm still trying to figure out what the *real* problem is that
you are trying to fix.

> (1) Setting a default Content-Type, while convenient, is not Go-like.
> It is backward-looking, not forward-looking.

Package http API design issue?
(if so I have no preference)

> (2) This particular default, "text/html; charset=utf-8" is not almost
> always the right one.

Interoperability issue?
(seems like not)

> (3) Bad programs are still going to get it wrong.

Language dissemination issue?
(if so I have no preference)

> (4) We're violating the RFC.

I think it doesn't matter unless the http package breaks communication
between customers of package http.

> (5) It's magical.

[...]

I think words like "right/wrong" or "correct/incorrect" are very subjective;
more practical terms would help me understand your issue.

-- Mikio

David Symonds

Jun 2, 2011, 8:57:14 AM6/2/11

to Brad Fitzpatrick, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 3:52 PM, Brad Fitzpatrick <brad...@golang.org> wrote:

> You said earlier you didn't care about MSIE, but if you do, let's just set
> "X-Content-Type-Options: nosniff" on all our responses:
> http://blogs.msdn.com/b/ie/archive/2008/09/02/ie8-security-part-vi-beta-2-update.aspx

That doesn't help IE6 or IE7. And if we're getting it wrong so much of
the time then we need to rely upon the browser to get the sniffing
right.

>> >> > * So that leaves people who forget to set it. Now somebody has to
>> >> > sniff.
>> >> > Is that browsers (n outcomes) or Go (1 outcome). If anybody is
>> >> > sniffing, I
>> >> > would prefer it be us.
>> >>
>> >> Browsers are already sniffing. Why do you necessarily think we'd do a
>> >> better job?
>> >
>> > When I write code in Go, I expect it to run the same on Linux, Mac,
>> > FreeBSD,
>> > Windows, Chrome, Firefox, or MSIE.
>> > Portability means not leaving API behavior up to the environment, be
>> > that
>> > system calls or content-type sniffing.
>>
>> Then you shouldn't want this default Content-Type. It results in HTTP
>> responses that do *not* behave the same across browsers.
>
> I remain unconvinced. I'm off to sleep now, but surprise me with a
> standalone Go server in the morning that demonstrates the problem in various
> browsers.

A quick Google search found this:
https://developer.mozilla.org/en/Properly_Configuring_Server_MIME_Types
Browsers based on Gecko 2 will stop accepting different-origin CSS
files with the wrong MIME type.

You say that you expect your Go code to run the same in lots of
environments, and that you don't want to leave API behaviour up to the
environment. When you write something that speaks a protocol, then,
it's your job to follow that protocol. And, to a first degree, your
program is *not* speaking browser, it's speaking HTTP. The Go http
package is currently lying about the Content-Type in many situations,
and violating that protocol.

We've all had to deal with systems that don't follow the rules and are
broken in some regard. Why on earth would we want Go to be that kind
of system?

Furthermore, by propagating this notion that "text/html;
charset=utf-8" is a sensible default when we don't even try to see
whether it's HTML we're producing, we're telling HTTP clients: "Hey,
don't trust our Content-Type, especially if it says text/html. You
better sniff the content and take a stab in the dark."

To turn it around: what's the benefit of having this default?
- it saves one line of (strictly speaking, optional) code, that,
without it, can sometimes confuse IE6.

That's it? Seriously?

Dave.

Russ Cox

Jun 2, 2011, 9:15:11 AM6/2/11

to David Symonds, Brad Fitzpatrick, golan...@googlegroups.com

> Furthermore, by propagating this notion that "text/html;
> charset=utf-8" is a sensible default when we don't even try to see
> whether it's HTML we're producing, we're telling HTTP clients: "Hey,
> don't trust our Content-Type, especially if it says text/html. You
> better sniff the content and take a stab in the dark."

No, we are telling clients "believe what I say" so that they all
behave the same instead of some guessing right and some
guessing wrong.

Also, there are two guesses involved here: text/html and
charset=utf-8. While it might be easy (but not always)
to tell whether something is HTML, it is often very difficult
in predominantly ASCII pages to tell UTF-8 from other encodings.
I care much more about getting the charset tag out than
I do about the text/html part. That's one line I don't have
to look up every time I want to remember how to spell it
(which I did for years before writing this package).

Russ

Jim Whitehead

Jun 2, 2011, 10:34:30 AM6/2/11

to golang-dev

I heartily agree with the proposal to remove the (current) default
Content-Type from the http package. I don't think the package gains
anything from it being included. A handler that is properly written is
going to include the proper header itself, and we should encourage
them to be written well instead of relying on a magic default that is
provided by the server.

Just my £0.02.

- Jim

David Symonds

Jun 2, 2011, 8:26:53 PM6/2/11

to rsc, Brad Fitzpatrick, golan...@googlegroups.com

On Thu, Jun 2, 2011 at 11:15 PM, Russ Cox <r...@golang.org> wrote:

>> Furthermore, by propagating this notion that "text/html;
>> charset=utf-8" is a sensible default when we don't even try to see
>> whether it's HTML we're producing, we're telling HTTP clients: "Hey,
>> don't trust our Content-Type, especially if it says text/html. You
>> better sniff the content and take a stab in the dark."
>
> No, we are telling clients "believe what I say" so that they all
> behave the same instead of some guessing right and some
> guessing wrong.

If I use a server, and it's telling me Content-Type=text/html for
things that are definitely not HTML, then I stop believing what the
server is saying. That's what's going on here.
I repeat: HTML is not the 99% case; it's probably not even a majority.

> Also, there are two guesses involved here: text/html and
> charset=utf-8. While it might be easy (but not always)
> to tell whether something is HTML, it is often very difficult
> in predominantly ASCII pages to tell UTF-8 from other encodings.
> I care much more about getting the charset tag out than
> I do about the text/html part. That's one line I don't have
> to look up every time I want to remember how to spell it
> (which I did for years before writing this package).

I can get behind having a charset default. Something like this would
be fine with me:
    ct := r.Header().Get("Content-Type")
    if strings.HasPrefix(ct, "text/") && strings.Index(ct, ";") == -1 {
        ct += "; charset=utf-8"
    }
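
As a self-contained illustration of the same rule, the fragment above can be wrapped in a function and run against a few values. The function name withDefaultCharset is hypothetical; the logic is unchanged, apart from using strings.Contains in place of the Index comparison.

```go
package main

import (
	"fmt"
	"strings"
)

// withDefaultCharset appends a UTF-8 charset to text/* media types that
// carry no parameters, and leaves every other value untouched.
func withDefaultCharset(ct string) string {
	if strings.HasPrefix(ct, "text/") && !strings.Contains(ct, ";") {
		return ct + "; charset=utf-8"
	}
	return ct
}

func main() {
	for _, ct := range []string{"text/plain", "text/html; charset=iso-8859-1", "image/png", ""} {
		fmt.Printf("%q -> %q\n", ct, withDefaultCharset(ct))
	}
}
```

Note that an unset Content-Type stays unset under this rule, which is the behaviour the proposal asks for.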

Dave.

Russ Cox

Jun 2, 2011, 9:54:39 PM6/2/11

to David Symonds, Brad Fitzpatrick, golan...@googlegroups.com

> I repeat: HTML is not the 99% case; it's probably not even a majority.

People who serve non-HTML from a web server expect that
they have to set Content-Type. And they do.

Russ

David Symonds

Jun 2, 2011, 9:56:47 PM6/2/11

to rsc, Brad Fitzpatrick, golan...@googlegroups.com

They don't. See my straw poll from further up thread. We're even
getting it wrong in the standard library.

Dave.

Russ Cox

Jun 2, 2011, 10:02:39 PM6/2/11

to David Symonds, Brad Fitzpatrick, golan...@googlegroups.com

>> People who serve non-HTML from a web server expect that
>> they have to set Content-Type. And they do.
>
> They don't.

I think this is the root of the disagreement.
It is mind-boggling to me that people expect they can just
spit out any content at all and let the browsers figure it out.

Again, can you point at any reference that says this is
the new Right Way To Do It?

Russ

David Symonds

Jun 2, 2011, 10:55:07 PM6/2/11

to rsc, Brad Fitzpatrick, golan...@googlegroups.com

It's the old Right Way To Do It, per RFC 2616
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.2.1).

"Any HTTP/1.1 message containing an entity-body SHOULD include a
Content-Type header field defining the media type of that body. If and
only if the media type is not given by a Content-Type field, the
recipient MAY attempt to guess the media type via inspection of its
content and/or the name extension(s) of the URI used to identify the
resource. If the media type remains unknown, the recipient SHOULD
treat it as type "application/octet-stream"."

We may be interpreting that differently, so let me give my interpretation.

The first sentence says: Content-Type should be set to what the media
type is. (not: Content-Type should be set at all costs, even if
incorrect)
The second sentence says: The browser may sniff the contents if the
Content-Type header is missing. (that implies that the Content-Type
header is optional)
The third sentence says: application/octet-stream is the true default type.
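
Read this way, the recipient's side of those three sentences could be sketched as below. This is only an illustration of the interpretation above, not browser behaviour. It assumes a current Go toolchain and uses http.DetectContentType as a stand-in for "inspection of its content"; the function name effectiveMediaType and the maySniff flag are hypothetical, and guessing from the URI extension is left out.

```go
package main

import (
	"fmt"
	"net/http"
)

// effectiveMediaType follows the three sentences in order: a declared type
// is taken as given; otherwise the recipient may sniff the content; failing
// that, the type is application/octet-stream.
func effectiveMediaType(declared string, body []byte, maySniff bool) string {
	if declared != "" {
		return declared
	}
	if maySniff {
		// DetectContentType itself falls back to application/octet-stream
		// when it recognises nothing.
		return http.DetectContentType(body)
	}
	return "application/octet-stream"
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(effectiveMediaType("application/json", png, true))
	fmt.Println(effectiveMediaType("", png, true))
	fmt.Println(effectiveMediaType("", png, false))
}
```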

In reality, a majority of browsers implement sniffing. IE implements
more aggressive sniffing than what the standard permits.

I think it would be reasonable for us to log something angrily if an
HTTP response is written and a Content-Type header was not explicitly
set. I think that would have a good corrective reaction for Go
servers.

Dave.

Kyle Lemons

Jun 3, 2011, 11:50:15 AM6/3/11

to David Symonds, rsc, Brad Fitzpatrick, golan...@googlegroups.com

> The third sentence says: application/octet-stream is the true default type.

+1. I actually make it the default in the current project I'm working on, because I can't let it send text/html by default and sending nothing isn't possible.

> I think it would be reasonable for us to log something angrily if an
> HTTP response is written and a Content-Type header was not explicitly
> set. I think that would have a good corrective reaction for Go
> servers.

I'm not sure if I would prefer the PHP-style insert-"<b>Warning</b>: No content-type set; using text/html" approach or logging a message to the console, but one of these may be the right thing to do.

Russ Cox

Jun 3, 2011, 11:57:10 AM6/3/11

to golan...@googlegroups.com

I think it's time to walk away from this bike shed.

Setting charset on text is something I considered and may
even have done originally but that seemed much more magical
than having a default. It also doesn't handle the case where
the handler doesn't set text/html. I have seen enough UTF-8
mangled as Latin-1 on the web that I think this is important
enough to make sure it happens without any effort.

We all know what the default is.
We all know how to override the default.
Let's move on.

Russ

Martin Capitanio

Jun 3, 2011, 4:38:16 PM6/3/11

to golan...@googlegroups.com, r...@golang.org

+ 21

I agree here with Russ. text/html and utf-8 are the best defaults,
especially for young people learning Go. Generating HTML5 documents
from templates is probably the first sane thing you can do. (I wouldn't
even scare them with the mess that, in the bad old days, lay beyond
ASCII code 127.)

Martin
