Impossible to override Content-Type header?

Christian

Dec 13, 2012, 4:28:38 PM
to golan...@googlegroups.com
I have a func that should set the Content-Type header if it does not already exist. It seems like the Header implementation does this already, but I wonder why and cannot see it in the source code.

I wonder why the following code doesn't override an existing Content-Type header:

func (writer DiscardBodyResponseWriter) Write(data []byte) (int, error) {
    contentType := http.DetectContentType(data)
    writer.Header().Set("Content-Type", contentType)
    return 0, nil
}


I guessed that I needed some check if the header exists:

contentType := writer.Header().Get("Content-Type")
if contentType == "" {
    writer.Header().Set("Content-Type", http.DetectContentType(data))
}


But it works without this check (in my unit tests). Why?

And is it correct that ResponseWriter.Write([]byte) returns the length of written bytes? API doc is a bit sparse.

Christian

David Symonds

Dec 13, 2012, 4:46:14 PM
to Christian, golan...@googlegroups.com
net/http already sets the Content-Type if one doesn't exist (search
for DetectContentType in net/http), so I'm not sure what you're trying
to achieve.

The implementation will call WriteHeader(200) for you if Write is
called without a preceding WriteHeader call, which will write the
headers, and thus it is too late to change headers in Write.
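The effect described above can be sketched with net/http/httptest. This is only an illustration, not code from the thread: the handler name lateHeader and the helper observedContentType are made up, and the ResponseRecorder is assumed to mirror the real server in this respect (it snapshots the headers on the first Write).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// lateHeader (hypothetical) writes the body first and only then tries to
// set Content-Type.
func lateHeader(w http.ResponseWriter, r *http.Request) {
	// The first Write triggers an implicit WriteHeader(200); since no
	// Content-Type is set yet, one is sniffed from the data.
	w.Write([]byte("<html><body>demo</body></html>"))
	// Too late: the headers have already been committed.
	w.Header().Set("Content-Type", "application/json")
}

// observedContentType runs h against a recorder and returns the
// Content-Type a client would have received.
func observedContentType(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest("GET", "/", nil))
	return rec.Result().Header.Get("Content-Type")
}

func main() {
	fmt.Println(observedContentType(lateHeader))
}
```

The client sees the sniffed HTML type, not the JSON type that was set after the Write.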

Kevin Gillette

Dec 13, 2012, 5:23:39 PM
to golan...@googlegroups.com
http.ResponseWriter is an io.Writer. Unless documented otherwise, you may assume it has one of the behaviors specified in the documentation for io.Writer. Most Readers and Writers will have sparse documentation on the behavior of Read and Write (because they are compliant implementations).

Christian

Dec 14, 2012, 1:47:04 AM
to golan...@googlegroups.com, Christian
I want to set Content-Type (and Content-Length) for HEAD request. That is not done correctly in the standard library!

The only thing I find confusing is that the Set(key, value string) doesn't override the Content-Type header if it already exists. The API documentation says exactly this.

Christian

David Symonds

Dec 14, 2012, 1:51:17 AM
to Christian, golan...@googlegroups.com
On Fri, Dec 14, 2012 at 5:47 PM, Christian <chri...@helmbold.de> wrote:

> I want to set Content-Type (and Content-Length) for HEAD request. That is
> not done correctly in the standard library!

Well, it can't sniff a non-existent body.

> The only thing I find confusing is that the Set(key, value string) doesn't
> override the Content-Type header if it already exists. The API documentation
> says exactly this.

It definitely should override it. Perhaps you've found a bug, or
perhaps you are doing something wrong. Can you post some code to
demonstrate what you're seeing?

minux

Dec 14, 2012, 1:51:35 AM
to Christian, golan...@googlegroups.com
On Fri, Dec 14, 2012 at 2:47 PM, Christian <chri...@helmbold.de> wrote:

> I want to set Content-Type (and Content-Length) for HEAD request. That is not done correctly in the standard library!

Could you please elaborate on this?
I think it's the HTTP handlers' job to handle HEAD requests (i.e. they should set Content-Length/Content-Type directly,
and shouldn't write to the ResponseWriter).

> The only thing I find confusing is that the Set(key, value string) doesn't override the Content-Type header if it already exists. The API documentation says exactly this.

Set will override an existing entry.

I think the problem is that you can't modify the header once it's already been sent out.
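A quick way to check the claim about Set is to use http.Header on its own, without any server. The helper names below are made up for this sketch; Set and Add are the real methods.

```go
package main

import (
	"fmt"
	"net/http"
)

// setTwice sets Content-Type twice and returns the values stored afterwards.
func setTwice() []string {
	h := http.Header{}
	h.Set("Content-Type", "text/plain")
	h.Set("Content-Type", "application/json") // replaces the first value
	return h["Content-Type"]
}

// addTwice uses Add, which appends a value instead of replacing.
func addTwice() []string {
	h := http.Header{}
	h.Add("Accept", "text/plain")
	h.Add("Accept", "application/json")
	return h["Accept"]
}

func main() {
	fmt.Println(setTwice()) // one value: the second one
	fmt.Println(addTwice()) // two values
}
```

So if a later Set seems to have no effect on the response, the likely cause is that the headers were already sent, not the behavior of Set itself.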

Christian

Dec 14, 2012, 2:07:32 AM
to golan...@googlegroups.com, Christian


On Friday, December 14, 2012 07:51:35 UTC+1, minux wrote:

> On Fri, Dec 14, 2012 at 2:47 PM, Christian <chri...@helmbold.de> wrote:
>> I want to set Content-Type (and Content-Length) for HEAD request. That is not done correctly in the standard library!
> please elaborate on this?

The mistake is in this check:

if w.header.Get("Content-Type") == "" && w.req.Method != "HEAD" {
  w.needSniff = true
}

This indicates that the content type should not be detected on HEAD requests. If I think about it, this may be correct in general, because a handler would typically not send a body as an answer to a HEAD request. In my case I generate a HEAD handler automatically if there is a GET handler.

> i think it's the http handlers' job to handle HEAD requests (i.e they should set Content-Length/Content-Type directly,
> and shouldn't write to the ResponseWriter).
>
>> The only thing I find confusing is that the Set(key, value string) doesn't override the Content-Type header if it already exists. The API documentation says exactly this.
> Set will override an existing entry.
>
> I think the problem is that you can't modify the header once it's already been sent out.

You're right. Generating a HEAD handler is not as easy as I thought.

Kevin Gillette

Dec 14, 2012, 2:11:53 AM
to Christian, golan...@googlegroups.com
HEAD requests (or responses) have the same rules as GET requests, except that the response may not contain a body. GET requests may not send a body, and therefore neither can HEAD requests. Since HEAD requests must not involve a body in any sense, then sending a body is invalid (and anything that accepts a body in a HEAD request is invalid). What you list as a mistake is not a mistake... there is no body to sniff.



minux

Dec 14, 2012, 2:12:26 AM
to Christian, golan...@googlegroups.com
On Fri, Dec 14, 2012 at 3:07 PM, Christian <chri...@helmbold.de> wrote:
> On Friday, December 14, 2012 07:51:35 UTC+1, minux wrote:
>
>> On Fri, Dec 14, 2012 at 2:47 PM, Christian <chri...@helmbold.de> wrote:
>>> I want to set Content-Type (and Content-Length) for HEAD request. That is not done correctly in the standard library!
>> please elaborate on this?
>
> The mistake is in this check:
>
> if w.header.Get("Content-Type") == "" && w.req.Method != "HEAD" {
>   w.needSniff = true
> }
>
> This indicates that the content type should not be detected on HEAD requests. If I think about it, this may be correct in general, because a handler would typically not send a body as an answer to a HEAD request. In my case I generate a HEAD handler automatically if there is a GET handler.

I think that for HEAD requests there should be no content available to sniff, and it's the handler's
job to set Content-Type. net/http could save and discard the body generated by the handler
for HEAD requests, but that doesn't feel right for Go for performance reasons, because it's much easier
to simply not generate the body for HEAD requests.

I know some Go programs ignore HEAD requests (try sending a HEAD request to a godoc -http running
locally and you will see). Perhaps we can do something about this, but I'm not sure.

Christian

Dec 14, 2012, 2:18:55 AM
to golan...@googlegroups.com, Christian
I checked my code again and came to the conclusion that the Content-Type header is set before http.ResponseWriter.WriteHeader(status int) is called.

I wrap an http.ResponseWriter like this:

type DiscardBodyResponseWriter struct {
    http.ResponseWriter

}

func (writer DiscardBodyResponseWriter) Write(data []byte) (int, error) {
    contentType := http.DetectContentType(data)
    writer.Header().Set("Content-Type", contentType)
    return 0, nil
}

Then I create a HEAD handler in which the DiscardBodyResponseWriter is passed to the GET handler:

    if GETHandler, ok := resource.handlers["GET"]; ok {
        resource.handlers["HEAD"] = func(writer http.ResponseWriter, request *Request) {
            GETHandler(DiscardBodyResponseWriter{writer}, request)
        }
    }

My test GET handler looks like this:

func get(writer http.ResponseWriter, request *Request) {
    writer.Write([]byte("demo")) // Content-Type is set here (before WriteHeader(int) is called)
    writer.Header().Set("Content-Type", defaultContentType)
    writer.WriteHeader(http.StatusOK)
}
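For comparison, here is one way such a discarding wrapper could be made self-consistent. It is only a sketch under assumptions, not the poster's library: the names headWriter, headFromGet, demoGet and runHead are hypothetical, and plain net/http handler signatures are used instead of the custom Request type above. The wrapper reports len(data) from Write, sniffs only when no Content-Type is set, and holds back WriteHeader until the GET handler has returned so that Content-Length can still be added.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// headWriter (hypothetical) swallows the body but remembers what the GET
// handler would have sent.
type headWriter struct {
	http.ResponseWriter
	status int // status requested by the handler, sent after it returns
	n      int // number of body bytes the handler produced
}

// WriteHeader only records the status so that headers stay editable.
func (w *headWriter) WriteHeader(status int) { w.status = status }

// Write sniffs the first chunk if no Content-Type is set, counts the bytes,
// and reports them as written to satisfy the io.Writer contract.
func (w *headWriter) Write(data []byte) (int, error) {
	if w.n == 0 && w.Header().Get("Content-Type") == "" {
		w.Header().Set("Content-Type", http.DetectContentType(data))
	}
	w.n += len(data)
	return len(data), nil
}

// headFromGet derives a HEAD handler from a GET handler.
func headFromGet(get http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		hw := &headWriter{ResponseWriter: w, status: http.StatusOK}
		get(hw, r)
		w.Header().Set("Content-Length", strconv.Itoa(hw.n))
		w.WriteHeader(hw.status)
	}
}

// demoGet is a stand-in GET handler.
func demoGet(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("demo"))
}

// runHead sends a HEAD request to h and returns what a client would see.
func runHead(h http.HandlerFunc) (ctype, clen string, bodyLen int) {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest("HEAD", "/", nil))
	res := rec.Result()
	return res.Header.Get("Content-Type"), res.Header.Get("Content-Length"), rec.Body.Len()
}

func main() {
	fmt.Println(runHead(headFromGet(demoGet)))
}
```

Two limitations of this sketch: the GET handler still generates the whole body, and a Content-Type set by the handler after its first Write would still take effect here, unlike in a real GET response.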

Christian

Dec 14, 2012, 2:22:54 AM
to golan...@googlegroups.com, Christian
This is correct. What I want to achieve is something like what Java Servlets do: handle HEAD if there is a GET handler but no explicit HEAD handler. It is a special case and not a mistake in the standard library.

David Symonds

Dec 14, 2012, 2:23:29 AM
to Christian, golan...@googlegroups.com
On Fri, Dec 14, 2012 at 6:18 PM, Christian <chri...@helmbold.de> wrote:

> My test GET handler looks like this:
>
> func get(writer http.ResponseWriter, request *Request) {
> writer.Write([]byte("demo")) // Content-Type is set here (before
> WriteHeader(int) is called
> writer.Header().Set("Content-Type", defaultContentType)
> writer.WriteHeader(http.StatusOK)
> }

This doesn't make sense. Once you start writing a body you cannot
change the header, which must be written to the network before the
body.

Patrick Mylund Nielsen

Dec 14, 2012, 2:29:43 AM
to Kevin Gillette, Christian, golang-nuts
>> I want to set Content-Type (and Content-Length) for HEAD request. That is not done correctly in the standard library!

> HEAD requests (or responses) have the same rules as GET requests, except that the response may not contain a body. GET requests may not send a body, and therefore neither can HEAD requests. Since HEAD requests must not involve a body in any sense, then sending a body is invalid (and anything that accepts a body in a HEAD request is invalid). What you list as a mistake is not a mistake... there is no body to sniff.

Yes and no. A HEAD response SHOULD (in RFC terms) contain the same headers as the equivalent GET response would, meaning the Content-Type should be the same (not, e.g., text/plain), or at least not be misleading. We decided to omit it, I think mainly because writing to the ResponseWriter is an error, and so sniffing the body isn't very practical. Users can set the Content-Type header if they need to.





Christian

Dec 14, 2012, 2:30:10 AM
to golan...@googlegroups.com, Christian

On Friday, December 14, 2012 08:23:29 UTC+1, David Symonds wrote:

> This doesn't make sense. Once you start writing a body you cannot
> change the header, which must be written to the network before the
> body.

That is correct, but the body is written with the Write(..) method, and my overriding implementation of this method sets the header before the body is written (in fact, the body is never written at all). The headers could only be sent early if I called Write on the wrapped ResponseWriter, but I do not.

Patrick Mylund Nielsen

Dec 14, 2012, 2:34:16 AM
to Kevin Gillette, Christian, golang-nuts
Forgot to add that it IS an accepted deficiency: http://code.google.com/p/go/issues/detail?id=2886

But yeah, it requires a fairly big change for a relatively small gain.

Kevin Gillette

Dec 14, 2012, 9:38:47 AM
to golan...@googlegroups.com, Kevin Gillette
I think an automatic stdlib approach will get abused: some will reuse their GET handling exactly and write 100 MB of data to a discarding writer wrapper, wasting a lot of CPU cycles without realizing that this is a bad thing; or the writer will error out after it has successfully sniffed, and a naive app will respond to that incorrectly (marking in a database that the response failed, as if it were a GET). When the app programmer knows what they're doing (as Christian did) because they had to write the wrapper themselves, the likelihood of abuse is very low. Also, Go makes it easy to fulfill interfaces, and net/http is designed to be easily overridden in many places, so custom overrides should not be a major burden.

At any rate, if you can write:

if r.Method == "HEAD" {
  w.Write(datachunks[0]) // assuming HEAD sniffs
} else if r.Method == "GET" {
  for i := 0; i < len(datachunks); i++ {
    // ...
  }
}

Then you can afford to do (as David hinted):

if r.Method == "HEAD" {
  w.Header().Set("Content-Type", http.DetectContentType(datachunks[0]))
} else if r.Method == "GET" {
  // ...
}

Except use of DetectContentType forces you to consider that HEAD should behave differently than GET on the backend (or makes you decide to use a Writer wrapper), and DetectContentType works now.
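Spelled out as a complete handler, the second variant might look like the sketch below. The chunks variable stands in for datachunks and holds made-up sample data, and the handler name serve and the helper do are hypothetical. The headers are computed the same way for both methods, and only the body is skipped for HEAD.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// chunks stands in for the thread's datachunks (hypothetical sample data).
var chunks = [][]byte{
	[]byte("<html><body>"),
	[]byte("demo"),
	[]byte("</body></html>"),
}

// serve sets the headers from the data and skips the body for HEAD.
func serve(w http.ResponseWriter, r *http.Request) {
	total := 0
	for _, c := range chunks {
		total += len(c)
	}
	w.Header().Set("Content-Type", http.DetectContentType(chunks[0]))
	w.Header().Set("Content-Length", strconv.Itoa(total))
	if r.Method == "HEAD" {
		return // headers only, nothing is written to the body
	}
	for _, c := range chunks {
		w.Write(c)
	}
}

// do sends a request with the given method and returns what a client sees.
func do(method string) (ctype, clen string, bodyLen int) {
	rec := httptest.NewRecorder()
	serve(rec, httptest.NewRequest(method, "/", nil))
	res := rec.Result()
	return res.Header.Get("Content-Type"), res.Header.Get("Content-Length"), rec.Body.Len()
}

func main() {
	fmt.Println(do("HEAD"))
	fmt.Println(do("GET"))
}
```

Both responses carry identical headers; only the GET response has a body.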

Christian

Dec 15, 2012, 7:15:14 AM
to golan...@googlegroups.com, Kevin Gillette
After all these explanations and some subtleties, I think that it was a bad idea to create automatic HEAD handling by discarding the body of a GET response. If someone wants to support GET he should go the extra mile and do it properly.

Patrick Mylund Nielsen

Dec 15, 2012, 7:24:46 AM
to Christian, golang-nuts, Kevin Gillette
Many services, e.g. link services like URL shorteners or social networks (to validate a link) assume you support HEAD. Any net/http handler supports it seamlessly. Very few people actually need to create a custom HEAD handler compared to how many things would break if it were explicit.

In almost all cases, this is a sane default. It's the same for FormValue: it doesn't matter where the form value came from, query parameters or request body; you can just use it. In some cases it creates some ambiguity, but you can work around it, e.g. by using if req.Method == "HEAD" { ... }. (In the latter case, e.g. if you want to be absolutely sure you are reading a form value from the request body, we created a separate req.PostForm and req.PostFormValue().)
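The difference between FormValue and PostFormValue can be seen with a request that carries one parameter in the query string and another in the body. The parameter names q and b and the helper newFormRequest are made up for this sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newFormRequest builds a POST with one query parameter (q) and one
// URL-encoded body field (b).
func newFormRequest() *http.Request {
	r := httptest.NewRequest("POST", "/?q=fromquery", strings.NewReader("b=frombody"))
	r.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return r
}

func main() {
	r := newFormRequest()
	// FormValue looks at both the query string and the body.
	fmt.Printf("FormValue:     q=%q b=%q\n", r.FormValue("q"), r.FormValue("b"))
	// PostFormValue only looks at the body, so q comes back empty.
	fmt.Printf("PostFormValue: q=%q b=%q\n", r.PostFormValue("q"), r.PostFormValue("b"))
}
```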


On Sat, Dec 15, 2012 at 1:15 PM, Christian <chri...@helmbold.de> wrote:
> After all these explanations and some subtle things I think, that it was a bad idea to create autmatic HEAD handling by discarding the body of a GET response. If someone wants to support GET he should go the extra mile an do it properly.


Kevin Gillette

Dec 15, 2012, 7:40:45 AM
to golan...@googlegroups.com, Christian, Kevin Gillette
On Saturday, December 15, 2012 5:24:46 AM UTC-7, Patrick Mylund Nielsen wrote:
> Many services, e.g. link services like URL shorteners or social networks (to validate a link) assume you support HEAD. Any net/http handler supports it seamlessly. Very few people actually need to create a custom HEAD handler compared to how many things would break if it were explicit.

Traditionally, I think setting any headers that allow a GET resource to be cacheable will cause just about any advanced HTTP client (especially browsers) to send a HEAD request the next time if the cached copy could possibly be fresh, though conditional GET requests (does net/http transparently handle those somehow?) are much more common now. HEAD is also the standard way of asking about a resource without actually investing the resources to fetch it, so it certainly is a "primary" verb.

> On Sat, Dec 15, 2012 at 1:15 PM, Christian <chri...@helmbold.de> wrote:
>> After all these explanations and some subtle things I think, that it was a bad idea to create autmatic HEAD handling by discarding the body of a GET response. If someone wants to support GET he should go the extra mile an do it properly.

Do you mean go the extra mile and support HEAD? It'd be a pretty odd service that supports GET but not HEAD (the meta is the resource?)

Christian

Dec 16, 2012, 1:40:15 AM
to golan...@googlegroups.com, Christian, Kevin Gillette


On Saturday, December 15, 2012 13:40:45 UTC+1, Kevin Gillette wrote:

> Do you mean go the extra mile and support HEAD? It'd be a pretty odd service to support GET but not head (the meta is the resource?)

Yes, this is what I meant. An argument for automatic HEAD handler generation is programmer convenience, but in terms of performance it is just bad. The question is how likely it is that the lib user forgets to provide HEAD together with GET.

Kevin Gillette

Dec 16, 2012, 2:41:41 AM
to golan...@googlegroups.com, Christian, Kevin Gillette
Yeah, that's a natural side effect of the language being imperative/procedural. Declarative-style web servers have been done, especially in python, where you have methods called 'do_GET' and 'do_HEAD' as part of a handler class. The downside of those is that you have to go out of your way to do any common request handling (for example, using the same code to handle the common bits between a HEAD and GET request to the same resource).

http.ServeContent should be suitable for common handling of GET and HEAD (though this is an opt-in mechanism, not an opt-out, as would be involved with a discarding writer); ServeContent is smart enough to handle range requests and a great deal of the "trickier" http stuff that usually goes unimplemented in apps. If you have a dynamic data resource (such as providing a number of line-separated random numbers based on query parameters), the trick is to not necessarily "push" that data to the writer, but to (also) provide it as an io.ReadSeeker where possible, in which case you can let ServeContent do all the protocol work for you.  If it's a HEAD request, ServeContent will only seek to the end (to find the Content-Length), try to let the client use a cached copy if fresh, and sniff the first 1024 bytes if the Content-Type header isn't already set. An io.ReadSeeker that generates its content on the fly could easily special-case end-seeks for many kinds of data (and where it's not possible to get an exact estimate, it's occasionally data-safe to overestimate and add trailing whitespace or null padding to the end). The 1024-byte seek can also be avoided by pre-setting Content-Type, in which case HEAD should be virtually free even though it's being transparently handled.

The real benefit is that naively written programs that offload onto ServeContent aren't all that naive after all. Unlike with a discarding writer, the app code won't be able to blindly write 100 MB of data (and ignore EOF errors) for a HEAD request. Instead, ServeContent competently controls how much app/resource-specific data flows, and will currently, at worst, read at most 1 KB for a HEAD request.
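A minimal sketch of that approach, assuming the content is available in memory so that bytes.Reader can serve as the io.ReadSeeker. The page variable, the handler name serve and the helper do are made up; http.ServeContent is the real function. The same handler answers both GET and HEAD.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// page is hypothetical sample content.
var page = []byte("<html><body>demo</body></html>")

// serve hands the content to ServeContent as an io.ReadSeeker.
func serve(w http.ResponseWriter, r *http.Request) {
	// Empty name: there is no file extension to guess from, so the type is
	// sniffed from the content. Zero modtime: no Last-Modified header.
	http.ServeContent(w, r, "", time.Time{}, bytes.NewReader(page))
}

// do sends a request with the given method and returns what a client sees.
func do(method string) (ctype, clen string, bodyLen int) {
	rec := httptest.NewRecorder()
	serve(rec, httptest.NewRequest(method, "/", nil))
	res := rec.Result()
	return res.Header.Get("Content-Type"), res.Header.Get("Content-Length"), rec.Body.Len()
}

func main() {
	fmt.Println(do("HEAD"))
	fmt.Println(do("GET"))
}
```

The HEAD response carries the same Content-Type and Content-Length as the GET response, but no body is copied.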

Patrick Mylund Nielsen

Dec 16, 2012, 3:39:03 AM
to Kevin Gillette, golang-nuts, Christian
Note that most of these languages/frameworks use some kind of inheritance, such that do_HEAD by default simply does do_GET but avoids sending a body, and can be overridden by the programmer. That is exactly what Go provides. If you are worried about HEAD performance, you write your own HEAD handler. It is much less important, generally, than actually supporting HEAD.



minux

Dec 16, 2012, 10:06:54 AM
to Kevin Gillette, golan...@googlegroups.com, Christian
On Sun, Dec 16, 2012 at 3:41 PM, Kevin Gillette <extempor...@gmail.com> wrote:
Yeah, that's a natural side effect of the language being imperative/procedural. Declarative-style web servers have been done, especially in python, where you have methods called 'do_GET' and 'do_HEAD' as part of a handler class. The downside of those is that you have to go out of your way to do any common request handling (for example, using the same code to handle the common bits between a HEAD and GET request to the same resource).

http.ServeContent should be suitable for common handling of GET and HEAD (though this is an opt-in mechanism, not an opt-out, as would be involved with a discarding writer); ServeContent is smart enough to handle range requests and a great deal of the "trickier" http stuff that usually goes unimplemented in apps. If you have a dynamic data resource (such as providing a number of line-separated random numbers based on query parameters), the trick is to not necessarily "push" that data to the writer, but to (also) provide it as an io.ReadSeeker where possible, in which case you can let ServeContent do all the protocol work for you.  If it's a HEAD request, ServeContent will only seek to the end (to find the Content-Length), try to let the client use a cached copy if fresh, and sniff the first 1024 bytes if the Content-Type header isn't already set. An io.ReadSeeker that generates its content on the fly could easily special-
Nitpick: it should be 512, as documented here:

Kevin Gillette

Dec 16, 2012, 12:03:17 PM
to golan...@googlegroups.com, Kevin Gillette, Christian
I was describing ServeContent's current behavior, not necessarily what it 'should' do. net/http/fs.go, lines 141-142 in revision 15137:a70be086fe02 (tip as of right now) show a 1024-byte array being filled for sniffing purposes, even if only the first 512 bytes get used. If '1024' were replaced by the constant sniffLen, then there should be no wastage.

Kevin Gillette

Dec 16, 2012, 12:09:02 PM
to golan...@googlegroups.com, Kevin Gillette, Christian
On Sunday, December 16, 2012 1:39:03 AM UTC-7, Patrick Mylund Nielsen wrote:
Note that most of these languages/frameworks use some kind of inheritance, such that do_HEAD by default simply does do_GET but avoids sending a body, and can be overridden by the programmer. That is exactly what Go provides. If you are worried about HEAD performance, you write your own HEAD handler. It is much less important, generally, than actually supporting HEAD.

My main point was that being lazy (or pragmatic) by not distinguishing between HEAD and GET costs almost nothing in either CPU or memory efficiency if you can provide an io.ReadSeeker and pass it to ServeContent, compared to explicitly handling HEAD by doing your own 512-byte sniff, passing it to DetectContentType, then setting the Content-Type header and possibly doing your own Content-Length determination (all of which is what ServeContent does anyway).