[Will you *please* stop this amok-crossposting? Usenet is _not_ your
personal private support forum/playground. If you must crosspost, then
crosspost to the *correct* newsgroup (see charters and taglines), and
*set Followup-To*. In particular, Apache is _not_ a UNIX-*only* Web server
(RTFM).
X-Post & F’up2 <news:comp.infosystems.www.authoring.misc>]
Ivan Shmakov wrote in <news:comp.infosystems.www.authoring.html>:
[Fixed quotes; see <http://www.netmeister.org/news/learn2quote.html>]
> Hendrik Maryns […] writes:
>> The strange redirects are due to some experimenting with .htaccess,
>> I’ll have to fix that, disabled it for now.
>
> (I’ve suspected something along these lines.)
Me too.
>> Ivan Shmakov also noted that I claim html4 compliance but should move
>> to html5 if I want to use “unencoded” UTF-8 in ‘href’.
That was and is “not even wrong”. Sorry to break this to you, but you have
been listening to a *wannabe*.
<https://unicode.org/faq/>
<https://www.w3.org/TR/html/links.html#element-attrdef-a-href>
>> Clicking the button in the footer seems to indeed validate, so I wonder
>> what the exact problem is. I vaguely remember that in the past I
>> decided not to move to html5, but forgot for what reason. Maybe I will
>> for this reason.
>
> Frankly, I’m unsure if HTML4 allowed whitespace in href
It does not, and that is not hard to find out either. Just RTFSpec:
<http://www.w3.org/TR/1999/REC-html401-19991224/struct/links.html#adef-href>
> (and I’m pretty sure it didn’t allow UTF-8;
Percent-encoded characters according to RFC 3986 & children: no problem.
Unescaped non-ASCII characters: *big* problem.
> hence I suspect that failing to catch that may be due to a bug in the
> validator),
Sure, blame the Validator for your incompetence. What else is new? :->
> but at least the validator at [1] correctly reports space characters as
> (HTML5?) errors:
It is more likely that an HTML5-supporting validator will catch this error
because HTML5 is not based on a DTD that can be checked against. This
encourages validator developers to check more carefully against the
Specification *prose*.
It certainly is so in the case of the *W3C* Validator. Why are
you not using *it* instead (<https://validator.w3.org/>)? It has been
supporting HTML5 for several years now (although as an implicit switch to
the HTML5 validator – the “Nu Html Checker” at
<https://validator.w3.org/nu/> – when the HTML5 doctype is recognized or
selected).
> 3. Error: Bad value Antropozofi/Valentin Wember – Waar gaan we
> eigenlijk heen%3F.pdf for attribute href on element a: Illegal
> character in path segment: space is not allowed.
Correct. Neither are unescaped non-ASCII characters. Supportive UA
behavior to the contrary is *implementation-dependent*.
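To illustrate what a conforming value for that “href” could look like, here
is a minimal Python sketch. The helper name encode_href_path is made up for
this example, and it assumes that any “%” already in the path starts a valid
escape (like the “%3F” above) and must not be encoded a second time:

```python
from urllib.parse import quote


def encode_href_path(path: str) -> str:
    """Percent-encode a relative URI path for use in an href attribute.

    "/" is kept as the path segment separator.  "%" is kept as well, on the
    assumption that it only occurs in escapes that are already valid.
    Everything else outside the unreserved set is UTF-8 encoded and escaped.
    """
    return quote(path, safe="/%", encoding="utf-8")


raw = "Antropozofi/Valentin Wember – Waar gaan we eigenlijk heen%3F.pdf"
print(encode_href_path(raw))
# Antropozofi/Valentin%20Wember%20%E2%80%93%20Waar%20gaan%20we%20eigenlijk%20heen%3F.pdf
```

If a file name can contain a literal “%”, drop it from “safe” and encode
each path segment before any escapes are added.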
> [2] http://httpd.apache.org/docs/2.4/mod/core.html#errordocument
>
> However, the problem is not in the “document,” but rather in the
> Content-Type: header, which is:
>
> Content-Type: text/html; charset=iso-8859-1
>
> At the same time, Apache includes the (supposed) filename in the
> response “as is”: in UTF-8.
>
> Curiously, adding ‘AddDefaultCharset utf-8’ [3] to my .htaccess
> didn’t seem to have any effect on the 404 response header,
[3] says
| AllowOverride: FileInfo
On the other hand, if the error message files are UTF-8 encoded – and
| $ file -i /usr/share/apache2/error/HTTP_NOT_FOUND.html.var
| /usr/share/apache2/error/HTTP_NOT_FOUND.html.var: text/html; charset=utf-8
|
| $ dpkg -S /usr/share/apache2/error/HTTP_NOT_FOUND.html.var
| apache2-data: /usr/share/apache2/error/HTTP_NOT_FOUND.html.var
|
| $ dpkg -l apache2-data | awk '/^.i/ {print $3}'
| 2.4.23-4
suggests just that –, and “AddDefaultCharset” is stupidly set to “On” (the
previous default) or “iso-8859-1”, and it *works* with the OP, then it would
be no surprise that the error messages are garbled.
> so I’m interested in how it can be fixed, too.
  AddDefaultCharset off
or (with Apache 2.4.x+)
  # AddDefaultCharset on
(disabling it, therefore falling back to the default, which should be “off”)
in the httpd.conf/apache2.conf. LART that stuck-in-the-1980s server admin
if necessary. (Unicode 1.0.0 was published in 1991.)
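To see what such a header field actually declares, one can parse its value.
A minimal Python sketch follows; the helper name declared_charset is made up
for this example, and the header values are typed in by hand, not taken from
a live server:

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the parameter lowercased,
    # or None when there is no charset parameter at all.
    return msg.get_content_charset()


# The header field value quoted above:
print(declared_charset("text/html; charset=iso-8859-1"))  # iso-8859-1
# What one would expect with "AddDefaultCharset off":
print(declared_charset("text/html"))  # None
```

In the second case the server declares nothing in the header field, so it no
longer contradicts the actual encoding of the error document.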
> Cross-posting to
> news:comp.infosystems.www.servers.unix, as the question is
> specific to server software, not HTML.)
> [3] http://httpd.apache.org/docs/2.4/mod/core.html#adddefaultcharset
As you can read there, “AddDefaultCharset” != “off” is a *deprecated*
approach:
,-<http://httpd.apache.org/docs/2.4/mod/core.html.en#adddefaultcharset>
|
| […]
| AddDefaultCharset should only be used when all of the text resources to
| which it applies are known to be in that character encoding and it is too
| inconvenient to label their charset individually. One such example is to
| add the charset parameter to resources containing generated content, such
| as legacy CGI scripts, that might be vulnerable to cross-site scripting
| attacks due to user-provided data being included in the output. Note,
| however, that a better solution is to just fix (or delete) those scripts,
| since setting a default charset does not protect users that have enabled
| the "auto-detect character encoding" feature on their browser.
It has been deprecated for more than 10 years:
<https://bz.apache.org/bugzilla/show_bug.cgi?id=23421>
Fun fact: Before the Apache default was changed in 2004 CE, the problem with
this default was *obvious* in the Bugzilla interface (but IIRC using a
different URI then) because the reporter of this bug (Martin Dürst) has a
name that contains a non-ASCII character which Bugzilla properly served
UTF-8-encoded, but Apache’s header field default caused HTML UAs to
interpret it as ISO-8859-1 regardless of the correct Content-Type “meta”
element (IIRC); so his name was displayed as “Martin DÃ¼rst” there for quite
some time.
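The effect is easy to reproduce. A minimal Python sketch, assuming only what
is described above (UTF-8 bytes on the wire, decoded as ISO-8859-1 by the
UA); the variable names are made up for this example:

```python
name = "Martin Dürst"

# What the server sends: the name encoded as UTF-8
# ("ü" becomes the two bytes 0xC3 0xBC).
wire_bytes = name.encode("utf-8")

# What a UA shows when the header field says ISO-8859-1:
# every byte is taken as one character.
garbled = wire_bytes.decode("iso-8859-1")
print(garbled)  # Martin DÃ¼rst

# No data is lost; reversing the wrong decoding restores the name.
assert garbled.encode("iso-8859-1").decode("utf-8") == name
```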
> (Reading [4] wasn’t enlightening so far, either.
>
> [4] http://httpd.apache.org/docs/2.4/mod/mod_mime.html
This module has nothing to do with the problem.
PointedEars
--
Sometimes, what you learn is wrong. If those wrong ideas are close to the
root of the knowledge tree you build on a particular subject, pruning the
bad branches can sometimes cause the whole tree to collapse.
-- Mike Duffy in cljs, <news:Xns9FB6521286...@94.75.214.39>