HTTP/1.0 support (or "Need we really output absolute IRIs?")

Geoffrey Sneddon

Sep 4, 2008, 6:30:41 PM
to habar...@googlegroups.com
As I'm sure many of you have seen, getting errors such as "Notice:
Undefined index: HTTP_HOST in system/classes/site.php line 137" is far
from uncommon. The basic reason behind this is HTTP/1.0 has no host
header, and thus HTTP_HOST has nowhere to be initialized from (the
server has no idea what domain was used for the request unless only
one domain is on the IP). This is <http://trac.habariproject.org/habari/ticket/631>.

Is there any reason we actually output absolute IRIs? Can we not get
by with just outputting relative IRIs, even if it is absolute path IRIs?


--
Geoffrey Sneddon
<http://gsnedders.com/>

Owen Winkler

Sep 4, 2008, 6:34:32 PM
to habar...@googlegroups.com
Geoffrey Sneddon wrote:
> As I'm sure many of you have seen, getting errors such as "Notice:
> Undefined index: HTTP_HOST in system/classes/site.php line 137" is far
> from uncommon.

...

> Is there any reason we actually output absolute IRIs? Can we not get
> by with just outputting relative IRIs, even if it is absolute path IRIs?

Can you explain what these thoughts have to do with each other?

Owen

Geoffrey Sneddon

Sep 4, 2008, 7:27:32 PM
to habar...@googlegroups.com

We use HTTP_HOST to get the host for the authority of the absolute
IRI. What else can we use for the host if we do not have it? All I
can think that we would always have is the server IP, which may not be
ideal if it is an internal IP. The only place we must use absolute
URIs is the location header in HTTP for redirects (everything supports
relative URIs for it, though). If we don't bother to use absolute IRIs
(or authority relative IRIs) elsewhere we never hit this issue apart
from in that case.

Owen Winkler

Sep 4, 2008, 8:04:46 PM
to habar...@googlegroups.com
Geoffrey Sneddon wrote:
>
> We use HTTP_HOST to get the host for the authority of the absolute
> IRI. What else can we use for the host if we do not have it? All I
> can think that we would always have is the server IP, which may not be
> ideal if it is an internal IP. The only place we must use absolute
> URIs is the location header in HTTP for redirects (everything supports
> relative URIs for it, though). If we don't bother to use absolute IRIs
> (or authority relative IRIs) elsewhere we never hit this issue apart
> from in that case.
>

Outputting URLs is not the only place where HTTP_HOST is used. It is
used in the Site class to find the correct config, plugin, and theme
data for a site. There may be other places it is used.

I think there isn't much to discuss on this topic. Was there an implied
question as to why we use absolute URLs? If so, no, there's no reason.
But as I said above, we do use the HTTP_HOST for at least one other thing.

Owen

Michael C. Harris

Sep 4, 2008, 8:11:47 PM
to habar...@googlegroups.com

Isn't the issue that it's not always available, because HTTP 1.0
clients don't send the Host header?

--
Michael C. Harris, School of CS&IT, RMIT University
http://twofishcreative.com/michael/blog
IRC: michaeltwofish #habari

Owen Winkler

Sep 4, 2008, 8:16:20 PM
to habar...@googlegroups.com

Right, which is to say that the statement "If we don't bother to use
absolute IRIs elsewhere we never hit this issue apart from in that
case." isn't entirely true. If the HTTP_HOST is missing, at least the
one other thing I mentioned will fail.

I'm sure there is some reason why we should support HTTP 1.0 without
host headers, but I can't bring myself to think on it.

Owen

Arthus Erea

Sep 4, 2008, 10:37:44 PM
to habar...@googlegroups.com
I think Owen clearly gives an example of why we absolutely need to
know the URL of the requested resource.

Multisite requires it–that's a pretty darn good reason.

Has anyone _besides_ Geoffrey actually run into this issue?

Owen Winkler

Sep 4, 2008, 10:57:24 PM
to habar...@googlegroups.com
Arthus Erea wrote:
> I think Owen clearly gives an example of why we absolutely need to
> know the URL of the requested resource.
>
> Multisite requires it–that's a pretty darn good reason.
>

If you make a request with HTTP 1.0, it can only possibly support one
site, since there is no http_host in the request header.

But sites that support multisite would need to be able to parse that
data. Is it unreasonable to require that a web server needs to support
HTTP 1.1 to use Habari?

Owen

Michael C. Harris

Sep 5, 2008, 1:28:43 AM
to habar...@googlegroups.com
On Thu, Sep 04, 2008 at 10:57:24PM -0400, Owen Winkler wrote:
>
> Arthus Erea wrote:
> > I think Owen clearly gives an example of why we absolutely need to
> > know the URL of the requested resource.
> >
> > Multisite requires it–that's a pretty darn good reason.
>
> If you make a request with HTTP 1.0, it can only possibly support one
> site, since there is no http_host in the request header.

So, the real issue is how we should deal with the case where the
request is made from an HTTP 1.0 client and therefore we _don't_ have
a Host header. How common is that? Do most proxies now support HTTP
1.1? Geoffrey?

> But sites that support multisite would need to be able to parse that
> data. Is it unreasonable to require that a web server needs to support
> HTTP 1.1 to use Habari?

I think it's perfectly reasonable to only support HTTP 1.1 enabled
servers.

Matthias Bauer

Sep 5, 2008, 5:36:24 AM
to habar...@googlegroups.com
Michael C. Harris wrote:

>> But sites that support multisite would need to be able to parse that
>> data. Is it unreasonable to require that a web server needs to support
>> HTTP 1.1 to use Habari?
>
> I think it's perfectly reasonable to only support HTTP 1.1 enabled
> servers.

This isn't about servers, this is about clients. If the request is made
via HTTP 1.0, that's what we get (and that is also what the server must
respond in).

-Matt

Michael C. Harris

Sep 5, 2008, 6:41:21 AM
to habar...@googlegroups.com

Which is what the other half of the message you snipped was about ;)

Owen Winkler

Sep 5, 2008, 9:06:40 AM
to habar...@googlegroups.com

Well, ok, but as I understand it...

If an HTTP 1.0 client makes a request to a server that has Habari
installed on a shared IP (like most sites are configured these days),
then there's no chance of it routing the request to the correct virtual
host.

Using an HTTP 1.0 client makes it impossible to access sites that are
served from (as an example) Apache VirtualHosts, since the host header
is used to resolve the documentroot. Isn't that right? Or did I
completely misunderstand something?

From a server perspective, I think our focus would be better applied by
assuming that the site is served from a virtual host, since that will be
the configuration of the majority of our users.

For sites that are meant to be accessed via HTTP 1.0, where there is a
single site behind a single IP address, is it possible to drop something
like this into the config?:

$_SERVER['HTTP_HOST'] = 'example.com';

Seems like a simple fix for those edge cases, if it works.
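
A slightly more defensive form of that line, as a sketch only: it sets the
value only when the request did not supply one, so HTTP 1.1 requests keep
their real Host header. 'example.com' is a placeholder, and this assumes the
snippet runs in the config before anything else reads HTTP_HOST.

```php
<?php
// Sketch: 'example.com' is a placeholder for the site's real host name.
// empty() is safe on a missing array key, so this raises no notice.
if (empty($_SERVER['HTTP_HOST'])) {
    // Request arrived without a Host header (e.g. an HTTP 1.0 client):
    // fall back to a fixed name for this single-site install.
    $_SERVER['HTTP_HOST'] = 'example.com';
}
```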

Owen

Chris Meller

Sep 5, 2008, 3:09:08 PM
to habar...@googlegroups.com
That would seem to work to me, in the exceedingly rare case that it's not available.

And for the record, absolute URLs are the way to go. Relativity sucks... 

Geoffrey Sneddon

Sep 5, 2008, 8:23:44 PM
to habar...@googlegroups.com

On 5 Sep 2008, at 06:28, Michael C. Harris wrote:

> On Thu, Sep 04, 2008 at 10:57:24PM -0400, Owen Winkler wrote:
>>
>> Arthus Erea wrote:
>>> I think Owen clearly gives an example of why we absolutely need to
>>> know the URL of the requested resource.
>>>
>>> Multisite requires it–that's a pretty darn good reason.
>>
>> If you make a request with HTTP 1.0, it can only possibly support one
>> site, since there is no http_host in the request header.
>
> So, the real issue is how we should deal with the case where the
> request is made from an HTTP 1.0 client and therefore we _don't_ have
> a Host header. How common is that? Do most proxies now support HTTP
> 1.1? Geoffrey?

I don't really know, but I know well enough to say that 3% of my log
messages are due to it. We should _not_ break just because we have a
request from an HTTP/1.0 client. We _should_ do our best to support
them. As nice as controlling every UA that visits your site would be,
it is entirely unrealistic.
