
URI::Escape


Jeff Boes

Feb 23, 2001, 9:59:36 AM
Can anyone explain this behavior to me?

-----start of code
use URI::Escape;

print $URI::Escape::VERSION, q! '!, uri_escape('&'), "'\n";
-----end of code

-----start of results
3.13 '&'
-----end of results

The documentation for uri_escape() implies that '&' is one of the characters to
be converted.

--
~~~~~~~~~~~~~~~~|It is by caffeine alone I set my mind in motion,
Jeffery Boes |It is by the beans of Java that thoughts acquire speed,
jb...@qtm.net |The hands acquire shaking, the shaking becomes a warning,
UIN 3394914 |It is by caffeine alone I set my mind in motion.

John Joseph Trammell

Feb 23, 2001, 10:29:20 AM
On Fri, 23 Feb 2001 09:59:36 -0500, Jeff Boes wrote:
> The documentation for uri_escape() implies that '&' is one of
> the characters to be converted.

I disagree -- ampersand is a reserved character in RFC 2396; I
can understand why you'd want it escaped though. If you *must*
escape it, you can:

#!/usr/bin/perl -w
use strict;
use URI::Escape;

print qq[version => $URI::Escape::VERSION\n];
printf qq["%s"\n], uri_escape("&");
printf qq["%s"\n], uri_escape("&", "\0-\377");
__END__
version => 3.13
"&"
"%26"
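
Note that the range "\0-\377" marks every byte as unsafe, so ordinary
letters and digits get percent-encoded as well. If only the ampersand
needs escaping, a narrower set can be passed as the second argument. A
minimal sketch; the sample string and the variable name are made up for
illustration:

```perl
use strict;
use warnings;
use URI::Escape;

# The second argument lists the characters to escape. Only '&' is
# listed here, so every other character passes through unchanged.
my $escaped = uri_escape('a&b=c', '&');
print "$escaped\n";   # a%26b=c
```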

Ilmari Karonen

Feb 23, 2001, 10:59:44 AM
In article <4euc9t4iso5m7ihf9...@4ax.com>, Jeff Boes wrote:
>Can anyone explain this behavior to me?
[snip]

>The documentation for uri_escape() implies that '&' is one of the characters to
>be converted.

First of all, the default list of allowed characters in URI::Escape is
good for absolutely nothing:

* It can't be used to encode URI fragments before joining them, as
specified in RFC 2396, since it includes the reserved characters
[;/?:@&=+$,] that are used as delimiters in the next step.

* It can't be used to "fix" complete URIs containing unescaped spaces
or other unsafe characters -- a practice that the RFC frowns upon
anyway -- since it doesn't contain '%', leading to double escapes
if that is attempted.
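
The double-escape problem in the second point can be seen with a string
that already contains an escape sequence. This is only a sketch; the
sample string and variable names are invented for illustration:

```perl
use strict;
use warnings;
use URI::Escape;

# '%' is not in the default allowed set, so an existing escape
# sequence gets escaped a second time.
my $once  = 'a%20b';
my $twice = uri_escape($once);
print "$twice\n";                  # a%2520b

# One round of unescaping only gets back to the first layer.
my $back = uri_unescape($twice);
print "$back\n";                   # a%20b, not "a b"
```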

No, I've no idea why it's like that. Anyway, what you want is:

my $uri_fragment = uri_escape($string, "^-_.!~*'()A-Za-z0-9");
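
As a rough sketch of how that call fits the "escape the pieces, then
join them" approach from RFC 2396: each name and value is escaped on its
own, and only afterwards are the '=' and '&' delimiters added. The
parameter names and values here are hypothetical.

```perl
use strict;
use warnings;
use URI::Escape;

# Allow only the RFC 2396 unreserved characters; escape all others.
my $allowed = "^-_.!~*'()A-Za-z0-9";

# Hypothetical form data, with characters that would clash with
# the query delimiters if left alone.
my %params = (name => 'AT&T Inc.', path => 'a/b');

# Escape each piece first, then join with the delimiters.
my $query = join '&', map {
    uri_escape($_, $allowed) . '=' . uri_escape($params{$_}, $allowed)
} sort keys %params;

print "$query\n";   # name=AT%26T%20Inc.&path=a%2Fb
```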

--
Ilmari Karonen - http://www.sci.fi/~iltzu/
"That's probably because in Europe, the road signs really are signs and
not short stories. In the States they babble 'right lane must exit left'
or something." -- Jukka Aho in the monastery

Please ignore Godzilla and its pseudonyms - do not feed the troll.

jb...@qtm.net

Feb 23, 2001, 3:16:18 PM
In article <slrn99cu9r....@bayazid.hypersloth.net>, John Joseph
Trammell <tram...@bayazid.hypersloth.net> writes:
>On Fri, 23 Feb 2001 09:59:36 -0500, Jeff Boes wrote:
>> The documentation for uri_escape() implies that '&' is one of
>> the characters to be converted.
>
>I disagree -- ampersand is a reserved character in RFC 2396

Ah. A very careful re-reading of the documentation shows my error. The
default escaped character set DOES NOT include '&'. Hardly intuitive, since
the module supposedly escapes "unsafe" characters, and I would think '&' was
pretty darned unsafe.
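
To illustrate why an unescaped '&' causes trouble inside a query value,
here is a small sketch. The value and the query layout are made up, and
the plain split stands in for whatever parser reads the query on the
receiving end.

```perl
use strict;
use warnings;
use URI::Escape;

my $value = 'AT&T';

# Left alone, the '&' inside the value looks like a pair separator,
# so the query falls apart into three pieces instead of two.
my @broken = split /&/, "name=$value&page=1";
print scalar(@broken), "\n";   # 3

# With '&' passed explicitly as the unsafe set, the value survives.
my $safe   = uri_escape($value, '&');
my @intact = split /&/, "name=$safe&page=1";
print scalar(@intact), "\n";   # 2
print uri_unescape((split /=/, $intact[0])[1]), "\n";   # AT&T
```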


