[PSR-7] about the Host

Evert Pot

Feb 26, 2015, 5:38:22 PM
to php...@googlegroups.com
Hi guys,

Michael Dowling raised an issue about the Host header yesterday, and we
decided that this should be addressed in PSR-7.

His main point was, if the user changes the URI with withUri, does this
affect the result of:

$request->getHeader('Host')

and if so, how?

Just like the "request target" the Host header and the result of
->getUri()->getHost() are two fundamentally different things.

The former describes:

1. Part of the URI we are retrieving
2. The contents of the Host header

While the latter describes the actual host we are connecting to in order
to request the resource.

It gets even a bit more complicated, because the request target may also
have a host, which gives us three distinct places to put a host.

We've found that the request object in Go actually handles this in a
different way.

https://golang.org/src/net/http/request.go

In Go, their request object has an equivalent 'req.Host'. They actually
provide separate accessors to everything in the "Request Target".

Their req.Host therefore refers both to request targets in the 'absolute
form', as well as the contents of the "Host:" header. Only one of these
should ever appear in HTTP after all.

(Note that Go also has a separate req.URL.Host, just like our
getUri()->getHost().)

I actually think that the Go approach models the HTTP request even more
accurately than we do, but taking their approach would mean a larger
departure from the existing API... which is perhaps a bad idea at this
stage in the game.

So, lacking a larger change, the workaround that we both felt was 'good
enough' was to treat the Host header identically to the request target.

So that means that the Host header should be special-cased, specifically:

* If we don't have a Host header and getHeader('Host') is called, it
will return the value of getUri()->getHost()
* If the user explicitly sets the 'Host' header, that will always take
precedence.
* Other methods that return headers also need to behave this way for
the Host header.
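
For illustration, the three rules above could be sketched roughly as follows. This is only a sketch under assumptions: SketchUri and SketchRequest are hypothetical stand-ins rather than the PSR-7 interfaces, header values are assumed to be returned as arrays of strings, and an empty URI host is assumed to mean "no host".

```php
<?php
// Hypothetical stand-ins for illustration only; these are not the PSR-7 interfaces.
final class SketchUri
{
    private $host;

    public function __construct(string $host)
    {
        $this->host = $host;
    }

    public function getHost(): string
    {
        return $this->host;
    }
}

final class SketchRequest
{
    private $uri;
    private $headers = []; // explicitly set headers, keyed by lowercased name

    public function __construct(SketchUri $uri)
    {
        $this->uri = $uri;
    }

    public function withUri(SketchUri $uri): self
    {
        $new = clone $this;
        $new->uri = $uri;
        return $new;
    }

    public function withHeader(string $name, string $value): self
    {
        $new = clone $this;
        $new->headers[strtolower($name)] = [$value];
        return $new;
    }

    // Rules 1 and 2: an explicit Host header wins, otherwise fall back to the URI host.
    public function getHeader(string $name): array
    {
        $key = strtolower($name);
        if (isset($this->headers[$key])) {
            return $this->headers[$key];
        }
        if ($key === 'host' && $this->uri->getHost() !== '') {
            return [$this->uri->getHost()];
        }
        return [];
    }

    // Rule 3: methods returning all headers apply the same fallback.
    public function getHeaders(): array
    {
        $headers = $this->headers;
        if (!isset($headers['host']) && $this->uri->getHost() !== '') {
            $headers = ['host' => [$this->uri->getHost()]] + $headers;
        }
        return $headers;
    }
}

$request = new SketchRequest(new SketchUri('example.com'));
$moved = $request->withUri(new SketchUri('example.org'));
$pinned = $moved->withHeader('Host', 'override.test')
    ->withUri(new SketchUri('other.test'));
```

In this sketch $moved reports example.org as its Host because nothing was set explicitly, while $pinned keeps override.test even after a later withUri() call.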

This addresses the two main use-cases:

1. Most users will just want to call withUri() and you don't want to ask
them to explicitly also have to set the Host header every time.
2. Some users may care about the distinction between the Host header
and the URI host.

It also makes the behavior identical to that of getRequestTarget(), which
can be overridden but normally uses the information from getUri().

But there is one thing that this solution does not address. A user may
want to make an HTTP/1.0 request without a Host header at all. I think
this is an edge-case that we don't have to care about, but should be
documented.


# Is this a radical change?

I don't think it is. This proposal keeps the interface the same and
just changes some docblocks. It makes explicit something that would
otherwise likely have been implementation-specific, causing potential
interoperability problems.

Cheers,
Evert

Matthew Weier O'Phinney

Mar 1, 2015, 4:18:09 PM
to php...@googlegroups.com
On Thu, Feb 26, 2015 at 4:38 PM, Evert Pot <ever...@gmail.com> wrote:
> Michael Dowling raised an issue about the Host header yesterday, and we
> decided that this should be addressed in PSR-7.
>
> His main point was, if the user changes the URI with withUri, does this
> affect the result of:
>
> $request->getHeader('Host')
>
> and if so, how?
>
> Just like the "request target" the Host header and the result of
> ->getUri()->getHost() are two fundamentally different things.
>
> The former describes:
>
> 1. Part of the URI we are retrieving
> 2. The contents of the Host header
>
> While the latter describes the actual host we are connecting to in order
> to request the resource.
>
> It gets even a bit more complicated, because the request target may also
> have a host, which gives us three distinct places to put a host.

Much as I love the HTTP specification for its simplicity, this is yet
another one of those areas where it introduces some really, really
awful ambiguity. :)

> We've found that the request object in Go actually handles this in a
> different way.
>
> https://golang.org/src/net/http/request.go
>
> In Go, their request object has an equivalent 'req.Host'. They actually
> provide separate accessors to everything in the "Request Target".
>
> Their req.Host therefore refers both to request targets in the 'absolute
> form', as well as the contents of the "Host:" header. Only one of these
> should ever appear in HTTP after all.

SHOULD, but the spec allows for it. :)

> (Note that Go also has a separate req.URL.Host, just like our
> getUri()->getHost().)
>
> I actually think that the Go approach models the HTTP request even more
> accurately than we do, but taking their approach would mean a larger
> departure from the existing API... which is perhaps a bad idea at this
> stage in the game.
>
> So, lacking a larger change, the workaround that we both felt was 'good
> enough' was to treat the Host header identically to the request target.
>
> So that means that the Host header should be special-cased, specifically:
>
> * If we don't have a Host header and getHeader('Host') is called, it
> will return the value of getUri()->getHost()
> * If the user explicitly sets the 'Host' header, that will always take
> precedence.

I like this approach; it's consistent with how the RequestTarget is
modeled in the interfaces (which you also note separately below).

> * Other methods that return headers also need to behave this way for
> the Host header.

I'm not sure what you mean here...

> This addresses the two main use-cases:
>
> 1. Most users will just want to call withUri() and you don't want to ask
> them to explicitly also have to set the Host header every time.
> 2. Some users may care about the distinction between the Host header
> and the URI host.
>
> It also makes the behavior identical to that of getRequestTarget(), which
> can be overridden but normally uses the information from getUri().
>
> But there is one thing that this solution does not address. A user may
> want to make an HTTP/1.0 request without a Host header at all. I think
> this is an edge-case that we don't have to care about, but should be
> documented.

I think that could be an implementation detail of an HTTP client, TBH.
Also, since HTTP/1.0 DOES allow sending the Host header, I think it's
definitely a very, very slim edge case, and it would have to be
something the consumer of the client opts into specifically (and the
client would essentially need to check to see if the URI host and the
Host header match before deciding to omit it).
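
As a rough sketch of that opt-in check, a client might decide whether to drop the Host header along these lines. The function name, its parameters, and the case-insensitive comparison are all assumptions of this sketch, not part of PSR-7 or of any particular client.

```php
<?php
// Hypothetical client-side helper; name and parameters are assumptions of this sketch.
function mayOmitHostHeader(
    string $protocolVersion,
    string $uriHost,
    ?string $hostHeader,
    bool $optIn
): bool {
    // Only HTTP/1.0 requests may go without a Host header, and only on explicit request.
    if (!$optIn || $protocolVersion !== '1.0') {
        return false;
    }
    // Omit only when no Host was set explicitly, or when it merely mirrors the URI host.
    return $hostHeader === null || strcasecmp($hostHeader, $uriHost) === 0;
}
```

A Host header that differs from the URI host is treated here as deliberate, so it is never dropped.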

> # Is this a radical change?
>
> I don't think it is. This proposal keeps the interface the same and
> just changes some docblocks. It makes explicit something that would
> otherwise likely have been implementation-specific, causing potential
> interoperability problems.

The language here would be SHOULD, as in optional but recommended
behavior. For instance, a server-side specific implementation of PSR-7
might not need to implement this behavior.

This is the PR I've created according to your post:

- https://github.com/php-fig/fig-standards/pull/446

and the related commit for http-message:

- https://github.com/php-fig/http-message/pull/26

As you'll see, the change posed a bit of a problem: where to document
the behavior. Since headers are defined in MessageInterface, but the
Host header and changes suggested are specific to the
RequestInterface, the only viable approach was to have
RequestInterface override the getHeader() method. Literally, the only
change was overriding the docblock to add the verbiage and some @see
annotations.
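
The docblock-only override described above might look roughly like this. The interface names are hypothetical and heavily trimmed, and the docblock wording is a paraphrase for illustration; the actual text is in the pull requests linked above.

```php
<?php
// Hypothetical, trimmed-down interfaces; the real ones declare many more methods.
interface SketchMessageInterface
{
    /**
     * Retrieves a header by the given case-insensitive name.
     *
     * @param string $name
     */
    public function getHeader($name);
}

interface SketchRequestInterface extends SketchMessageInterface
{
    /**
     * Same signature as the parent; only the documentation changes.
     *
     * If no Host header was set explicitly, implementations SHOULD return
     * the host of the composed URI when asked for "Host".
     *
     * @see SketchMessageInterface::getHeader()
     * @param string $name
     */
    public function getHeader($name);
}
```

PHP accepts the redeclaration because the signature is identical, so implementing classes are unaffected.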

Let me know if this makes sense, and/or if you have any suggested
changes. This should be merged before we enter voting phase, but I
don't think it warrants a new or extended Review phase.

--
Matthew Weier O'Phinney
mweiero...@gmail.com
https://mwop.net/

Michael Dowling

Mar 1, 2015, 10:11:30 PM
to php...@googlegroups.com
If this ends up being a "should", then implementations would basically still need to check for both a host header and a URI host when trying to determine a host.

I would also expect a call to getHeaders to include a Host header from the URI. But the more I think about it, the more confusing and hard to document it becomes. I wonder if this should just be left out.

Thanks,
Michael

Larry Garfield

Mar 2, 2015, 2:00:08 PM
to php...@googlegroups.com
Out of curiosity... what would a Go-like solution look like? (I've not
tried Go's HTTP support yet; it's on my todo list.) I know we're close
to completion, but if we can fix a problem "right" rather than adjust it
to be "OK at this point", I'd rather go with right. This is a spec that
we want to last. :-)

--Larry Garfield

Evert Pot

Mar 2, 2015, 2:11:31 PM
to php...@googlegroups.com

>> Their req.Host therefore refers both to request targets in the 'absolute
>> form', as well as the contents of the "Host:" header. Only one of these
>> should ever appear in HTTP after all.
>
> SHOULD, but the spec allows for it. :)

Actually, it's different from what I thought. RFC 7230, section 5.4 says:

A client MUST send a Host header field in all HTTP/1.1 request
messages. If the target URI includes an authority component, then a
client MUST send a field-value for Host that is identical to that
authority component, excluding any userinfo subcomponent and its "@"
delimiter (Section 2.7.1). If the authority component is missing or
undefined for the target URI, then a client MUST send a Host header
field with an empty field-value.

Since the Host field-value is critical information for handling a
request, a user agent SHOULD generate Host as the first header field
following the request-line.

A client MUST send a Host header field in an HTTP/1.1 request even if
the request-target is in the absolute-form, since this allows the
Host information to be forwarded through ancient HTTP/1.0 proxies
that might not have implemented Host.

When a proxy receives a request with an absolute-form of
request-target, the proxy MUST ignore the received Host header field
(if any) and instead replace it with the host information of the
request-target. A proxy that forwards such a request MUST generate a
new Host field-value based on the received request-target rather than
forward the received Host field-value.

Pretty specific!

So while they _can_ differ, this would only happen in case of broken
implementations and the spec accounts for resolving this issue.

The 'host' in the request-target and the host header therefore refer to
the same thing.
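
To make the client-side rule concrete, here is a small sketch that derives the Host field-value from a target URI: the authority without its userinfo, or an empty value when there is no authority. hostFieldValue() is a hypothetical helper, and relying on PHP's parse_url() for the splitting is an assumption of this sketch.

```php
<?php
// Hypothetical helper: derives the Host field-value from a target URI.
function hostFieldValue(string $targetUri): string
{
    $parts = parse_url($targetUri);
    // No authority component (or an unparseable URI): empty field-value.
    if ($parts === false || !isset($parts['host'])) {
        return '';
    }
    // Userinfo ("user:pass@") is reported separately by parse_url() and is dropped here.
    $host = $parts['host'];
    if (isset($parts['port'])) {
        $host .= ':' . $parts['port'];
    }
    return $host;
}
```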

>
>> * Other methods that return headers also need to behave this way for
>> the Host header.
>
> I'm not sure what you mean here...

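I was talking about getHeaderLines() and getHeaders().

> The language here would be SHOULD, as in optional but recommended
> behavior. For instance, a server-side specific implementation of PSR-7
> might not need to implement this behavior.

I disagree. The server case is even more obvious, because there's almost
always a Host header, and therefore the 'explicitly overridden' case
always kicks in.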

Evert

Evert Pot

Mar 2, 2015, 2:20:20 PM
to php...@googlegroups.com
> Out of curiosity... what would a Go-like solution look like? (I've not
> tried Go's HTTP support yet; it's on my todo list.) I know we're close
> to completion, but if we can fix a problem "right" rather than adjust it
> to be "OK at this point", I'd rather go with right. This is a spec that
> we want to last. :-)

I'm having a little bit of trouble reading the exact implementation, and
I've been going back and forth a bit on whether or not they 'did it
right'.

The only real issues I currently see are:

1. We have 3 places to specify the host, while the
abstract data model has 2.
2. We could have further split up some of the data in the request into
their own properties, whereas we pack them into strings with specific
formats (again the request target).

We're only partially fixing the data-model issue now, because it's still
possible to:

1. Have an overridden request target in absolute form.
2. Override the host as well

This is a bit odd, because a properly functioning HTTP actor should
ignore the host header if the request target is in absolute form.
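
A sketch of that precedence, from the receiving side: when the request target is in absolute form its authority wins, and the Host header is only consulted otherwise. effectiveHost() is a hypothetical helper, and detecting absolute form via parse_url() returning both a scheme and a host is a simplifying assumption.

```php
<?php
// Hypothetical helper: picks the host a recipient should act on.
function effectiveHost(string $requestTarget, ?string $hostHeader): string
{
    $parts = parse_url($requestTarget);
    // Simplifying assumption: scheme plus host means the target is in absolute form.
    $isAbsoluteForm = is_array($parts) && isset($parts['scheme'], $parts['host']);
    if ($isAbsoluteForm) {
        // The Host header is ignored; the request target's authority is used instead.
        $host = $parts['host'];
        if (isset($parts['port'])) {
            $host .= ':' . $parts['port'];
        }
        return $host;
    }
    // Origin form ("/path") and similar: fall back to the Host header, if any.
    return $hostHeader ?? '';
}
```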

Regardless, I think we're pretty damn close, and if people are fine
leaving abominations such as the 'attributes' feature in there, I
don't really see any motivation to attempt to put lipstick on this pig.

Evert

Matthew Weier O'Phinney

Mar 3, 2015, 5:04:05 PM
to php...@googlegroups.com
Your "abomination" is a feature to many. Such is the nature of
consensus building. :)

I do need a clarification, however. Are you suggesting:

- the RFC is fine as-is with regards to how the Host is handled?
- or that we should merge the pull request I've opened?

I just need to know how to proceed so that we can get things finalized
before putting the PSR into voting phase.

Evert Pot

Mar 3, 2015, 5:10:41 PM
to php...@googlegroups.com
It's a poorly thought out feature, and you know it :)

>
> I do need a clarification, however. Are you suggesting:
>
> - the RFC is fine as-is with regards to how the Host is handled?
> - or that we should merge the pull request I've opened?
>
> I just need to know how to proceed so that we can get things finalized
> before putting the PSR into voting phase.
>

I think the pull request should be slightly extended to also return a
Host header from getHeaders() and getHeaderLines(), and it should not be
made optional.

Evert

Matthew Weier O'Phinney

Mar 3, 2015, 5:24:29 PM
to php...@googlegroups.com
Okay, I've made those changes; can you review, please?