Border case: Basic authentication with a comma in the auth realm fails

(Harald Jörg)

Jan 12, 2018, 7:15:02 PM
to lib...@perl.org
Hello libwww,

I've encountered a situation where browsers "just work" but
LWP::UserAgent fails.

This is a border case which apparently went unnoticed for decades, and
for my own problem I've found a ridiculously easy workaround. So I'm
not sure whether it is a good idea or worth the effort to change
things. I volunteer to create a test case but am hesitating with a
fix because this might break things in the real world.


SETUP:

* A web server with basic authentication for an auth realm (chosen
somewhat, but not completely, arbitrarily):

$realm = "data, protected";

* A LWP::UserAgent which gets passed the correct credentials with:

$ua->credentials($netloc,$realm,$uid,$password);


SYMPTOM:

No matter what, you'll get a 401 status code for every request.

The web server sends a WWW-Authenticate header:

WWW-Authenticate: Basic realm="data, protected"

On the client side, it turns out that the requests don't contain the
corresponding Authorization header.


ROOT CAUSE:

When processing the WWW-Authenticate header, LWP::UserAgent
unconditionally changes all commas in this header's value to
semicolons:

$challenge =~ tr/,/;/; # "," is used to separate auth-params!!

But this modifies the realm if there's a comma in it. Therefore,
there's no longer a match when the credentials are looked up later
in the process, so no credentials are sent with the request.
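
To make the mismatch concrete, here is a minimal sketch. The %credentials
hash is only a stand-in for wherever LWP::UserAgent keeps its credentials
(an assumption for illustration, not its actual internals), and the realm
extraction is simplified; the tr/// line is the one quoted above.

```perl
use strict;
use warnings;

# Stand-in for the credentials store, keyed by realm (illustrative only).
my %credentials = ( 'data, protected' => [ 'user', 'secret' ] );

# The header value as sent by the server.
my $challenge = 'Basic realm="data, protected"';

# The unconditional replacement: it also hits the comma inside the quotes.
(my $munged = $challenge) =~ tr/,/;/;

# Simplified realm extraction, good enough for this demonstration.
my ($realm) = $munged =~ /realm="([^"]*)"/;

print "realm used for the lookup: $realm\n";
print exists $credentials{$realm}
    ? "credentials found\n"
    : "no credentials found\n";
```

The lookup key becomes 'data; protected', which does not match the realm
the credentials were registered under.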

The code change happened almost exactly 20 years ago, in commit
c4cefa219297e42ce73a10b6ad1fe4d9a19a9373 (1997-12-01). Since then,
RFC 2068 (dated Jan 1997) was obsoleted by RFC 2617 (Jun 1999),
which in turn has been obsoleted by RFC 7235 (Jun 2014). But does
that say that it is reasonably safe to rely on HTTP::Headers::Util to
split words as intended?

Is it safe to do that replacement only _after_ the string
"auth-params" or are there other values where that replacement is
required?


WORKAROUNDS:

* Either get rid of the comma in the web server's realm definition.
Can be tricky if it isn't _your_ server.

* Or do the same unconditional modification to the realm before
passing it to the LWP::UserAgent's credentials method. (BTW:
Fixing it in LWP::UserAgent's get_basic_credentials method is
possible but ugly, and not fully sufficient because you are
advised to override this method when subclassing).

* Or, if you're actually using a derived class like WWW::Mechanize
and talk to just one application, use the credentials method of
this class which allows you to omit $netloc and $realm.
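
A minimal sketch of the second workaround: apply the same replacement to
the realm before handing it to credentials(). The helper name
munged_realm is hypothetical, and the variables in the usage comment are
the ones from the SETUP section.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the realm the way LWP::UserAgent will see it
# after its own tr/,/;/ on the WWW-Authenticate header.
sub munged_realm {
    my ($realm) = @_;
    (my $copy = $realm) =~ tr/,/;/;   # work on a copy, leave the caller's value alone
    return $copy;
}

# Intended usage (requires LWP::UserAgent, not loaded in this sketch):
#   $ua->credentials($netloc, munged_realm($realm), $uid, $password);

print munged_realm('data, protected'), "\n";
```

This only stays correct as long as LWP::UserAgent keeps doing the
replacement; if that is ever fixed, the workaround has to go as well.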

--
Cheers,
haj

Olaf Alders

Jan 22, 2018, 9:45:02 PM
to Harald Jörg, lib...@perl.org
Hi Harald,


> On Jan 12, 2018, at 6:54 PM, Harald Jörg <Harald...@arcor.de> wrote:
>
> Hello libwww,
>
> I've encountered a situation where browsers "just work" but
> LWP::UserAgent fails.
>
> This is a border case which apparently went unnoticed for decades, and
> for my own problem I've found a ridiculously easy workaround. So I'm
> not sure whether it is a good idea or worth the effort to change
> things. I volunteer to create a test case but am hesitating with a
> fix because this might break things in the real world.


Thanks very much for this detailed explanation of what you've been seeing. I don't really know this part of the code well enough to be able to comment on this right now, but there was a recent pull request which deals with authentication. Does https://github.com/libwww-perl/libwww-perl/pull/255/files fix anything for you?

If it does or it doesn't, it might be worth commenting on the existing pull request.

Best,

Olaf

(Harald Jörg)

Jan 23, 2018, 11:30:02 AM
to Olaf Alders, lib...@perl.org
Hello Olaf,

you write:

>> On Jan 12, 2018, at 6:54 PM, Harald Jörg <Harald...@arcor.de> wrote:
>>
>> Hello libwww,
>>
>> I've encountered a situation where browsers "just work" but
>> LWP::UserAgent fails.
>> [...]
>
> Thanks very much for this detailed explanation of what you've been
> seeing. I don't really know this part of the code well enough to be
> able to comment on this right now, but there was a recent pull request
> which deals with authentication. Does
> https://github.com/libwww-perl/libwww-perl/pull/255/files fix anything
> for you?
>
> If it does or it doesn't, it might be worth commenting on the existing
> pull request.

Thanks for the pointer. Unfortunately that pull request tries to fix
another issue which isn't closely related to my own. The pull request
does, however, introduce yet another of these unconditional translations
of commas to semicolons, which is somewhat foolhardy but doesn't do
extra damage.

I think I can prepare a fix to make authentication RFC compliant, but
since I haven't worked in the guts of LWP for 10+ years this would
also be foolhardy :)

Some more details on the handling of auth headers:

If I have a header like this:

WWW-Authenticate: Basic realm="Hello, world"

...then LWP::UA converts this value to 'Basic realm="Hello; world"'.
This can't be right. Quoted strings should be retained as they are.

The conversion is done with the intent to fit the specs of
HTTP::Headers::Util::split_header_words, which works quite fine for
headers which aren't WWW-Authenticate. But WWW-Authenticate is
"special", to say it politely. The example in
https://tools.ietf.org/html/rfc7235#section-4.1 reads:

WWW-Authenticate: Newauth realm="apps", type=1,
title="Login to \"apps\"", Basic realm="simple"

So, the comma is not only used to separate auth-params within one
authentication scheme, it also separates two different authentication
schemes. The RFC says, encouragingly,

User agents are advised to take special care in parsing the field
value, as it might contain more than one challenge, and each
challenge can contain a comma-separated list of authentication
parameters. Furthermore, the header field itself can occur multiple
times.

Today, LWP::UA wouldn't be able to process the RFC example correctly.
The params of the header are parsed into a hash, so that the second
realm clobbers the first. With the pull request it would be able to
process the following equivalent headers quite fine:

WWW-Authenticate: Newauth realm="apps", type=1, title="Login to \"apps\""
WWW-Authenticate: Basic realm="simple"

The options are: Either we take special care in parsing the field value,
or we just live with the fact that a comma in the realm might cause
issues, like we did in the last 20 years. Ignorance is bliss :)
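
For illustration, "taking special care" could look roughly like the
following sketch. It is not LWP code and not a proposed patch: the
function name parse_challenges and the returned data structure are made
up, the token pattern is laxer than the RFC grammar, and the token68
form of credentials is not handled. It only shows that commas inside
quoted strings can be left alone while commas between params and between
challenges are still honored.

```perl
use strict;
use warnings;

# Hypothetical helper: split a WWW-Authenticate value into challenges,
# each a hash with a scheme and its auth-params.
sub parse_challenges {
    my ($value) = @_;
    my @challenges;
    my $token = qr/[^\s,="\\]+/;    # simplified; the RFC token set is stricter

    pos($value) = 0;
    while (pos($value) < length $value) {
        if ($value =~ /\G[\s,]+/gc) {
            next;                   # whitespace and list separators
        }
        elsif ($value =~ /\G($token)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|($token))/gc) {
            # name=value: an auth-param of the most recent scheme
            my ($name, $quoted, $plain) = ($1, $2, $3);
            last unless @challenges;            # param without a scheme: give up
            my $val = defined $quoted ? $quoted : $plain;
            $val =~ s/\\(.)/$1/g if defined $quoted;   # undo backslash escapes
            $challenges[-1]{params}{ lc $name } = $val;
        }
        elsif ($value =~ /\G($token)/gc) {
            # a bare token starts a new challenge
            push @challenges, { scheme => $1, params => {} };
        }
        else {
            last;                   # anything else: stop parsing
        }
    }
    return @challenges;
}

for my $c (parse_challenges('Basic realm="data, protected"')) {
    print "$c->{scheme}: realm=$c->{params}{realm}\n";
}
```

Because each challenge gets its own params hash, a second realm in the
same header value no longer clobbers the first.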
--
Cheers,
haj