In article <52f4187...@audiomisc.co.uk>,
Jim Lesurf <no...@audiomisc.co.uk> wrote:
>AIUI spaces are nominally 'legal' in web addresses [...]
Spaces are not legal in URIs, which are the things that a web browser
sends to a web server. They have to be replaced with an escape
sequence, %20. However, what you type into the address field of a web
browser is not necessarily a URI; the browser will normally do any
necessary escaping to turn it into one. There is (as far as I know)
no standard specifying exactly what a web browser must handle, though
the W3C Note http://www.w3.org/TR/leiri describes an escaping
procedure used by some other standards.
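
To illustrate the kind of escaping involved, here is a minimal sketch
in Python using the standard library's urllib.parse.quote. The
function name and the example path are made up for illustration, and
this is not a claim about what any particular browser actually does.

```python
from urllib.parse import quote

def escape_typed_address(path: str) -> str:
    """Percent-encode a typed path so it is legal in a URI.

    Hypothetical helper: real browsers apply their own, more involved rules.
    """
    # quote() leaves unreserved characters and "/" alone (safe="/" is the
    # default) and replaces everything else, including spaces, with %XX.
    return quote(path)

print(escape_typed_address("/my files/report 2014.html"))
# prints /my%20files/report%202014.html
```
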
>simply because the
>conventions for web addresses are an extension of the *nix ones for
>filenames. In *nix any web address is just the location for something on a
>device "http", or whatever. In *nix everything is a file.
Certainly the original definition of URLs was strongly based on unix
filenames.
>However, nice as that is in theory, the problem in practice is, of course,
>that many non-*nix systems don't support this. And even in *nix systems,
>having spaces in names can be a PITA because it can confuse commands
>unless you add delimiters to parse the command. Hence the convention of
>either using '%' followed by a number to 'escape' the space, or simply
>using an underscore instead, (or hardspaces, which can also cause
>confusion).
The escaping convention is part of the URI specification. %20 is
converted to a space character by the web server, and is gone by the
time the URI is mapped to a file name.
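
A rough sketch of that decoding step, in Python with the standard
urllib.parse module. The function name and the example.org address
are invented for illustration; a real server does a good deal more
(checking for "..", mapping to a document root, and so on).

```python
from urllib.parse import unquote, urlsplit

def request_path_to_filename(uri: str) -> str:
    """Return the decoded path component of a URI (hypothetical helper)."""
    # Take only the path part, dropping scheme, host, query and fragment.
    path = urlsplit(uri).path
    # unquote() turns each %XX escape back into the character it stands
    # for, so %20 becomes an ordinary space before any file lookup.
    return unquote(path)

print(request_path_to_filename("http://example.org/my%20files/a%20b.html"))
# prints /my files/a b.html
```
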
I'm not sure what you mean when you say many non-unix systems don't
support spaces in file names. The other OS widely used for web
serving is Windows, and it certainly does. No doubt there have been
many systems that didn't, but they don't really affect the usability
of spaces in web addresses.
[I have referred to URIs. The more commonly used term URL technically
refers to a subset of URIs. There are also IRIs, Internationalized
Resource Identifiers, which allow most non-delimiter Unicode
characters to appear directly rather than being escaped.]
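
As a simplified sketch of how an IRI path relates to a URI path: each
non-ASCII character is encoded as UTF-8 and every resulting byte is
written as a %XX escape. The Python below uses urllib.parse.quote;
the function name and example path are made up, and it deals only
with the path, not with host names or the other parts of an IRI.

```python
from urllib.parse import quote

def iri_path_to_uri_path(iri_path: str) -> str:
    """Escape the characters of an IRI path that a URI does not allow.

    Hypothetical helper covering the path component only.
    """
    # quote() encodes the string as UTF-8 by default and writes each
    # disallowed byte as %XX. "/" and "%" are listed as safe so that path
    # separators and escapes already present are left untouched.
    return quote(iri_path, safe="/%")

print(iri_path_to_uri_path("/café/naïve file.html"))
# prints /caf%C3%A9/na%C3%AFve%20file.html
```
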
-- Richard