Recently I fetched a page from a server that identifies itself as:
Server: Apache/1.3.33 (Unix) mod_fastcgi/2.4.2 PHP/4.3.10
mod_ssl/2.8.22 OpenSSL/0.9.7d
X-Powered-By: PHP/4.3.10
Within the HTML response I found numerous lines that had only
3 characters each, always 0..9 and a..f, i.e. hexadecimal digits.
These were always inside of links.
Can someone tell me what these little buggers are all about?
Thanks.
Encoded characters. They might be trying to hide them from prying eyes, or
they could be characters that need encoding to be part of the URI (due to
the rules of what characters are allowed in them).
--
If you insist on e-mailing me, use the reply-to address (it's real but
temporary). But please reply to the group, like you're supposed to.
This message was sent without a virus, please delete some files yourself.
> > Can someone tell me what these little buggers are all about?
>
> Encoded characters. They might be trying to hide them from prying eyes, or
> they could be characters that need encoding to be part of the URI (due to
> the rules of what characters are allowed in them).
But I've found that the URL works when I remove these
'characters'. Incidentally, 3 hex digits is the wrong
number for either 1 or 2 encoded characters. Also note that in one case
I found these 3 characters within the http:// expression itself.
Clearly these characters are supposed to be ignored.
But when did this become a standard that browsers adhere to?
It seems to be an attempt at preventing spidering.
It looks like you've seen "chunked transfer coding";
see RFC 2616, section 3.6.1. Those hex numbers are chunk sizes,
not part of the page itself.
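Roughly, each piece of the body is preceded by a line holding its length
in hex, and an HTTP/1.1 client removes those lines while reassembling the
body. A minimal sketch in Python of that reassembly, for illustration only:
the function name decode_chunked and the sample bytes are made up here, and
trailer headers after the last chunk are simply ignored.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Reassemble a chunked HTTP/1.1 body (illustrative sketch only)."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a line giving its size in hexadecimal.
        eol = raw.index(b"\r\n", pos)
        size_field = raw[pos:eol].split(b";", 1)[0]  # drop any chunk extension
        size = int(size_field.strip().decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break  # last chunk; trailers (if any) are ignored in this sketch
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(body)


# Hypothetical wire data: the chunk boundary falls inside "http://".
raw = b'b\r\n<a href="ht\r\n18\r\ntp://example.com/">x</a>\r\n0\r\n\r\n'
decoded = decode_chunked(raw)
```

Chunks of 256 to 4095 bytes have a three-digit hex size, and since a chunk
boundary can fall anywhere in the body, a size line can land in the middle
of a link.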
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
Thanks!