xhtml teething troubles

Ted

May 30, 2006, 4:04:41 PM

This page
http://homepage.ntlworld.com/r.a.mccartney/test/utf-8_test_file_hacked_for_ie_with_local_dtd.xml
doesn't work properly in Firefox or IE6. The faults are different. In
Firefox the TestText entity is not recognised. In IE6, the <br /> tag
doesn't cause a line break. Can anyone tell me what I'm doing wrong?

Jack

May 30, 2006, 4:31:56 PM

As you say, the DTD is local; I can't inspect it. You may or may not
have defined those entities. Consider defining them in an inline DTD
subset; you can't count on the browser fetching your DTD.
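
A minimal sketch of the inline-subset suggestion, using Python's standard xml.etree.ElementTree as a stand-in for a non-validating processor (it is not what any browser runs). The root element name and the replacement text are assumptions for illustration; only the entity name TestText comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity is declared in the internal subset,
# so the processor does not have to fetch anything to resolve it.
DOC_WITH_INTERNAL_SUBSET = """<!DOCTYPE root [
  <!ENTITY TestText "This is some test text">
]>
<root><p>&TestText;</p></root>"""


def paragraph_text(document):
    """Parse the document and return the text of its first <p> element."""
    root = ET.fromstring(document)
    return root.find("p").text


print(paragraph_text(DOC_WITH_INTERNAL_SUBSET))
```

Because the declaration travels inside the document itself, the reference resolves without any network access.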

--
Jack.

Chaddy2222

May 31, 2006, 2:38:16 AM

For a start, XHTML support in IE6 is not very good. You should also
validate your XHTML; that should go a long way towards getting the browsers
to display your code correctly.
Also, use a proper XHTML Strict doctype, as that will be better
recognized by browsers.
--
Regards Chad. http://freewebdesign.cjb.cc

VK

May 31, 2006, 2:48:36 AM

Ted wrote:
> http://homepage.ntlworld.com/r.a.mccartney/test/utf-8_test_file_hacked_for_ie_with_local_dtd.xml
> doesn't work properly in Firefox or IE6. The faults are different. In
> Firefox the TestText entity is not recognised.

Right - because Firefox currently is not able to fetch external DTD's
of any kind. See <https://bugzilla.mozilla.org/show_bug.cgi?id=35984>
and outsprings. A very nasty bug forcing to declare all extra entities
in internal DTD like
<!DOCTYPE template [
<!ENTITY nbsp "&#160;">
... etc
]>

Note: there is no "local DTD" as a term. There is external DTD (your
case) and internal DTD (my sample).

> In IE6, the <br /> tag
> doesn't cause a line break.

Because an XML document in default XML namespace has no special
treatment for <br /> - it's just a well-formed element w/o closing tag.
It may mean something important to an HTML parser (like "make line break
here"), but it means nothing to XML.
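
One way to see this, sketched with Python's xml.etree.ElementTree rather than any browser's parser: an XML processor reports each element as a namespace plus a local name, and nothing more. Both sample documents are hypothetical. Browsers give br its line-break behaviour only when the element sits in the XHTML namespace; whether the OP's file declares that namespace cannot be checked from here.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def qualified_names(document):
    """Return the namespace-qualified tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(document).iter()]


# No namespace declared: "br" is just an element that happens to be called br.
plain = qualified_names("<page><br /></page>")

# XHTML namespace declared on the root: br is now the XHTML br element.
xhtml = qualified_names(
    '<html xmlns="%s"><body><br /></body></html>' % XHTML_NS
)

print(plain)
print(xhtml)
```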

Use your template in XSL template and link it to XML data file so the
resulting page would be HTML. Briefly: stop /hacking/ things and start
/using/ them ;-)

<comp.text.xml> is another good source of help on the matter.

VK

May 31, 2006, 5:08:40 AM

VK wrote:
> Use your template in XSL template and link it to XML data file so the
> resulting page would be HTML. Briefly: stop /hacking/ things and start
> /using/ them ;-)

Also it is not clear why did you call the post " xhtml teething
troubles" as the linked document has no relation neither to XHTML nor
to HTML. It's a well-formed (accounting extra entities in DTD) XML
document served as XML document.

IE handles it absolutely correctly; FF cannot retrieve the DTD because of
the bug I mentioned, so it breaks its "well-formedness".
In both cases it is really out of the scope of (X)HTML authoring,
<comp.text.xml> is more relevant.

Andy Dingley <dingbat@codesmiths.com>

May 31, 2006, 6:10:08 AM

Ted wrote:
> http://homepage.ntlworld.com/r.a.mccartney/test/utf-8_test_file_hacked_for_ie_with_local_dtd.xml
> doesn't work properly in Firefox or IE6.

Pragmatically, I think you're stuffed. It appears to be implemented
correctly as a piece of XML work, but this just isn't how the web
works, barely how XML is used (in practice) and is _certainly_ not a
technique that's usable on the web for the foreseeable future.

There's a raft of valid techniques out there that just aren't commonly
used in practice, so support for them varies from poor to none.

Michael Winter

May 31, 2006, 6:11:14 AM

On 31/05/2006 07:48, VK wrote:

> [...] Firefox currently is not able to fetch external DTD's
> of any kind.

That statement, as-is, is entirely false. Firefox can process external
subsets, but only does so in certain circumstances and they do not
include DTDs found on the Web.

> [...] A very nasty bug

How many times do I have to repeat this: it is /not/ a bug! It may not
be desirable, but it's entirely correct behaviour. As I've also stated
before, Firefox is not the only browser featuring an XML processor that
doesn't process external subsets.

> forcing to declare all extra entities in internal DTD [...]

Or don't use entities; encode the document using UTF-8, for example, and
enter the characters directly.
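
A short sketch of that alternative, again with Python's xml.etree.ElementTree as a generic XML processor. The sample content is hypothetical: an accented letter and a non-breaking space stand in for whatever &eacute; and &nbsp; would have produced.

```python
import xml.etree.ElementTree as ET

# The escapes below put the literal characters U+00E9 and U+00A0 into the
# document text; no entity references and no DTD are involved.
SOURCE = '<?xml version="1.0" encoding="UTF-8"?>\n<p>caf\u00e9\u00a0bar</p>'


def parse_utf8(document):
    """Encode the document as UTF-8 bytes, parse it, return the root's text."""
    return ET.fromstring(document.encode("utf-8")).text


print(repr(parse_utf8(SOURCE)))
```

The characters survive the round trip unchanged, so the question of which entities a processor knows about never arises.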

[snip]

> Note: there is no "local DTD" as a term.

The OP didn't use it as such. From his perspective the DTD is local as
it's on the same server. Of course, to the rest of us, it's just as remote.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

VK

May 31, 2006, 7:00:28 AM

Michael Winter wrote:
> > [...] Firefox currently is not able to fetch external DTD's
> > of any kind.
>
> That statement, as-is, is entirely false. Firefox can process external
> subsets, but only does so in certain circumstances and they do not
> include DTDs found on the Web.

Yeh, right... Firefox can process external subsets but it cannot
process external subsets. :-)

> > [...] A very nasty bug
>
> How many times do I have to repeat this: it is /not/ a bug! I

You may repeat it as many times as you want: it doesn't change the
production mechanics.
<http://www.w3.org/TR/REC-xml/>
<q>The productions later in this specification for individual
nonterminals (elementdecl, AttlistDecl, and so on) describe the
declarations after all the parameter entities have been included.</q>

Plain and simple: either you include all declared entities and start
production, or you don't bother with the production at all. I already
explained the real issue with this bug: the impossibility to divide
standard-wise between bogus DTD in (X)HTML and real DTD

Eric B. Bednarz

May 31, 2006, 7:01:55 AM

"VK" <school...@yahoo.com> writes:

> Note: there is no "local DTD" as a term. There is external DTD (your
> case) and internal DTD (my sample).

There ain't no 'external DTD' and 'internal DTD' as terms as well,
except perhaps in your parallel universe. There is an external and an
internal subset, and there is a DTD, which is the sum of both.

--
||| hexadecimal EBB
o-o decimal 3771
--oOo--( )--oOo-- octal 7273
205 goodbye binary 111010111011

VK

May 31, 2006, 7:48:01 AM

Eric B. Bednarz wrote:
> "VK" <school...@yahoo.com> writes:
>
> > Note: there is no "local DTD" as a term. There is external DTD (your
> > case) and internal DTD (my sample).
>
> There ain't no 'external DTD' and 'internal DTD' as terms as well,
> except perhaps in your parallel universe. There is an external and an
> internal subset, and there is a DTD, which is the sum of both.

I'm afraid that you are the one who's speaking from a parallel universe
;-) There are "external DTD subsets" and "internal DTD subsets"
commonly referred to as "external DTD" and "internal DTD" ("commonly"
means by authoring software producers and in manuals, not by VK).
Before inventing more stuff, check the Web.

Michael Winter

May 31, 2006, 8:22:32 AM

On 31/05/2006 12:00, VK wrote:

> Michael Winter wrote:

[snip]

>> Firefox can process external subsets, but only does so in certain
>> circumstances and they do not include DTDs found on the Web.
>
> Yeh, right... Firefox can process external subsets but it cannot
> process external subsets. :-)

I don't expect you to have taken the time to understand what was
written, but it would have been nice if you had.

[snip]

>> How many times do I have to repeat this: it is /not/ a bug!
>

> You may repeat it as many times as you want:

Yes, I know. You're a stubborn fool that refuses to learn anything from
anyone.

As I've stated before, I don't reply to your posts any more to attempt
to further your knowledge. It's wasted effort. However, I have no
intention of letting you mislead others who ask this group for help or
informed opinion.

> it doesn't change the production mechanics.

None of the grammar productions, nor the prose that accompanies them,
have any relevance here. Section 5.1 Validating and Non-Validating
Processors[1] states exactly what's expected from conforming processors.
Specifically,

Non-validating processors are REQUIRED to check only the
document entity, including the entire internal DTD subset,
for well-formedness.

noting that Firefox (and others) use non-validating processors. Furthermore,

While they are not required to check the document for validity,
they are REQUIRED to process all the declarations they read in
the internal DTD subset and in any parameter entity that they
read ...

Note the conspicuous absence of any reference to an external entity.
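
The same behaviour can be reproduced outside a browser. A sketch using Python's xml.etree.ElementTree, which is also non-validating and does not load the external subset; the DTD URL and the document contents are hypothetical, and the URL is never requested.

```python
import xml.etree.ElementTree as ET

# The external subset is referenced but never read by this processor.
NO_REFERENCE = """<!DOCTYPE root SYSTEM "http://example.invalid/entities.dtd">
<root>plain text</root>"""

# Same DOCTYPE, but the body uses an entity that only the external
# subset could have declared.
EXTERNAL_ONLY = """<!DOCTYPE root SYSTEM "http://example.invalid/entities.dtd">
<root>&TestText;</root>"""


def try_parse(document):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return ET.fromstring(document).text, None
    except ET.ParseError as error:
        return None, str(error)


print(try_parse(NO_REFERENCE))
print(try_parse(EXTERNAL_ONLY))
```

The first document parses even though the DTD does not exist; the second fails on the entity reference. How a given processor treats the unresolved reference (error or silent skip) varies, so the failure shown here is specific to ElementTree.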

> <q>The productions later in this specification for individual
> nonterminals (elementdecl, AttlistDecl, and so on) describe the
> declarations after all the parameter entities have been included.</q>
>
> Plain and simple: either you include all declared entities and start
> production, or you don't bother with the production at all.

It would seem that you don't know the difference between an entity
reference and a parameter entity reference. The latter has nothing to do
with the problem described by the OP.

[snipped gibberish]

Mike

[1] 5.1 Validating and Non-Validating Processors, XML 1.0
Specification. <http://www.w3.org/TR/REC-xml/#proc-types>

VK

May 31, 2006, 10:01:42 AM

Michael Winter wrote:
> However, I have no
> intention of letting you mislead others who ask this group for help or
> informed opinion.

That is exactly my intention too, unfortunately.

All differences between external DTD subsets and internal DTD subsets
are well spelled at <http://www.w3.org/TR/REC-xml/>, section 2.8

Section 2.9 also defines that in the OP case standalone in prolog
presumed "no" and /cannot/ be "yes": <q>entities (other than amp, lt,
gt, apos, quot), if references to those entities appear in the
document</q>

The idea that internal DTD is /a must to process/ while external DTD is
something /optional to process/ is plain stupid. It took a long
brainwashing by one organisation to start to believe it.

A standard-compliant XML processor is not a capricious lady as one is
trying to present: "if I'm moody, I'll retrieve external subsets before
parsing; otherwise I'll ignore them and just break parsing on the first
non-declared entity".

The real reason of external DTD complications for someone is because of
bogus DTD's for (X)HTML. It is like a primitive tribe would find a TV
set and use it as a local god for years - with dancing around and
putting flowers on it. What a hard discovery it would be for them that it
is actually something practically useful, and that the actual purpose of this
thing is not to stand in the middle of the village covered with
flowers.

Ted

May 31, 2006, 3:50:32 PM

"VK" <school...@yahoo.com> wrote in message
news:1149066520....@f6g2000cwb.googlegroups.com...

>
> Also it is not clear why did you call the post " xhtml teething
> troubles" as the linked document has no relation neither to XHTML nor
> to HTML. It's a well-formed (accounting extra entities in DTD) XML
> document served as XML document.
>

Sorry if you think I'm posting to the wrong group. I did a Google search to
find newsgroups which dealt with xhtml, and this seemed to be the best
group. xhtml is supposed to be served as xml

> In both cases it is really out of the scope of (X)HTML authoring,
> <comp.text.xml> is more relevant.

Thanks for the tip.

VK

May 31, 2006, 4:29:31 PM

Ted wrote:
> did a Google search to find newsgroups which dealt with xhtml

And you came to the right one. The problem is that your code is not
XHTML and has nothing to do with it - though I believe you that you
fairly thought otherwise.

> xhtml is supposed to be served as xml

Noop. Never. XHTML is supposed to be served as application/xhtml+xml
XML/XSL is supposed to be served as text/xml.

XHTML code must be /well-formed in accordance with XML rules/, but it
is /not/ served as XML.

Ted

May 31, 2006, 10:14:35 PM

"VK" <school...@yahoo.com> wrote in message

news:1149107371.1...@c74g2000cwc.googlegroups.com...

>
> Ted wrote:
> > xhtml is supposed to be served as xml
>
> Noop. Never. XHTML is supposed to be served as application/xhtml+xml
> XML/XSL is supposed to be served as text/xml.
>
> XHTML code must be /well-formed in accordance with XML rules/, but it
> is /not/ served as XML.
>

According to : http://www.w3.org/MarkUp/2004/xhtml-faq#texthtml
"XHTML is an XML format; this means that strictly speaking it should be sent
with an XML-related media type (application/xhtml+xml, application/xml, or
text/xml). "

As long as I stick with the free webspace provided by NTL, any file with the
extension .xml will be served as application/xml
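
How the media type decides which parser sees the page can be modelled roughly as follows. This is a simplified sketch, not browser code: the function name and the return labels are invented for illustration, and real browsers apply further rules that are ignored here.

```python
# Media types the W3C FAQ quoted above lists as XML-related.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parser_for(content_type):
    """Return which kind of parser a browser would pick for a Content-Type value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MEDIA_TYPES:
        return "xml"
    if media_type == "text/html":
        return "html"
    return "unknown"


print(parser_for("application/xml"))
print(parser_for("text/html; charset=iso-8859-1"))
```

Under this model a .xml file served as application/xml goes to the XML parser, where well-formedness errors are fatal and undeclared entities matter.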

Ted

May 31, 2006, 10:18:23 PM

<din...@codesmiths.com> wrote in message
news:1149070208.2...@j55g2000cwa.googlegroups.com...

Great! Can anyone give me some links to web pages which teach the techniques
which are supported?

VK

Jun 1, 2006, 4:55:02 AM

Ted wrote:
> According to : http://www.w3.org/MarkUp/2004/xhtml-faq#texthtml
> "XHTML is an XML format; this means that strictly speaking it should be sent
> with an XML-related media type (application/xhtml+xml, application/xml, or
> text/xml). "

Does it really say that?! A second... I'll be damned, indeed! Sometimes I
feel like calling the marines to smash W3C down. Unfortunately the absence
of W3C would be even more harmful than its presence, so it needs to be
tolerated :-( :-)

No, text/xml has no relation to XHTML. An XML parser has no idea about
the additional layout/styling rules for (X)HTML elements. You already
discovered this with the <br> sample. Each Content-Type is for a particular
kind of content; this is how the Web works.

There is a lot of scum connected with XHTML, so before proceeding you
may want to check that your hosting provider supports XHTML as such.
For this you may use this primitive probe: save it as, say, probe.html
and upload it.

Open it: it shows up fine because it is served as Content-Type
text/html, so browser uses HTML parser on it, and as HTML it is a valid
markup: you are allowed to skip closing tag for list elements.

Now let's try to find out if the server admin has an extension associated
with Content-Type application/xhtml+xml (XHTML). In the US the most
common extensions for this are .xhtml and .xht.
So try to change the extension first to .xhtml then to .xht, upload and
view in Firefox. The sign that the page is served and parsed as XHTML
will be the error message "closing tag missing". From this point on you
have a real XHTML page you can play with. It also means that
unfortunately this page is not available to Internet Explorer users
anymore (IE doesn't support XHTML and will not do so in the near
future).
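
What the probe relies on can be sketched in Python, assuming a hypothetical body that contains a list with unclosed li elements (this is not the original markup). The standard html.parser module stands in for a lenient HTML tokenizer and xml.etree.ElementTree for a strict XML parser; neither is a browser engine.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical probe body: the <li> elements are deliberately left unclosed.
PROBE_BODY = "<ul><li>first item<li>second item</ul>"


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def is_well_formed_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


collector = TagCollector()
collector.feed(PROBE_BODY)
print(collector.start_tags)            # tags seen by the HTML tokenizer
print(is_well_formed_xml(PROBE_BODY))  # verdict of the XML parser
```

The HTML side accepts the list as written, while the XML side rejects it, which is the difference the extension change is meant to expose.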

// probe page

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1" />
<title>XHTML Probe</title>
</head>

Andy Dingley <dingbat@codesmiths.com>

Jun 1, 2006, 5:30:52 AM

Ted wrote:

> According to : http://www.w3.org/MarkUp/2004/xhtml-faq#texthtml
> "XHTML is an XML format; this means that strictly speaking it should be sent
> with an XML-related media type (application/xhtml+xml, application/xml, or
> text/xml). "

Be aware that VK is a clueless idiot (recent threads passim), although
some believe instead that he's part troll (just not as funny as Bill
Bailey). Typically about 2/3rd of what he posts is wrong.

This ng is probably your best place to discuss techniques for pushing
XHTML. The local attitude is against it, but that's from a position of
some serious knowledge on the matter. Do some archive searching -
you'll find useful advice on many topics and soon recognise who the
usefully knowledgeable posters are (Jukka, Alan and Henri spring to
mind for this topic).

VK

Jun 1, 2006, 8:10:24 AM

VK wrote:
> <meta http-equiv="Content-Type"
> content="text/html; charset=iso-8859-1" />

Oops... sorry. I used W3C Amaya for quick XHTML page generation and it
inserts this nonsense automatically - it has to be removed (though
harmless, as the server-side generated Content-Type has higher priority over
http-equiv).

Michael Winter

Jun 1, 2006, 8:10:57 AM

On 31/05/2006 15:01, VK wrote:

> Michael Winter wrote:
>
>> However, I have no intention of letting you mislead others who ask
>> this group for help or informed opinion.
>
> That is exactly my intention too, unfortunately.

You don't qualify as informed, particularly as you've already admitted
to not understanding the XML Specification properly.

At the same time I miss a lot of professional knowledge in XML
and XSLT, definitely weak in reading W3C docs and my English
may fell down - especially after midnight.
-- 1148055460....@j73g2000cwa.googlegroups.com,
comp.infosystems.www.authoring.html

[snip]

> Section 2.9 also defines that in the OP case standalone in prolog

> presumed "no" and /cannot/ be "yes": [...]

The standalone document declaration is irrelevant here.

[snip]

> The idea that internal DTD is /a must to process/ while external DTD

> is something /optional to process/ is plain stupid. [...]

Your opinion on the matter is irrelevant. The required behaviour is what
it is. Validating processors must read and process the complete DTD, and
any external parsed entities. Non-validating processors are only
required to process the internal subset (if present) up to the first
unread parameter entity (except when processing a standalone document).

> A standard-compliant XML processor is not a capricious lady as one is

> trying to present: [...]

I've done no such thing. The behaviour of the XML processor used by
Firefox is well-defined in this regard.

> The real reason of external DTD complications for someone is because
> of bogus DTD's for (X)HTML.

And what bogus DTDs would these be? The official ones, knowing you.

[snipped drivel]

VK

Jun 1, 2006, 9:26:16 AM

Michael Winter wrote:
<OT>

> On 31/05/2006 15:01, VK wrote:
> At the same time I miss a lot of professional knowledge in XML
> and XSLT, definitely weak in reading W3C docs and my English
> may fell down - especially after midnight.

Full ACK. Any relevance to this particular thread? Up to you anyway.
You can even automate the process and use this quote as signature in
your posts.
</OT>

> The standalone document declaration is irrelevant here.

Either you did not read 2.9 or it escaped you completely. You may want
to try again.

> Your opinion on the matter is irrelevant. The required behaviour is what
> it is. Validating processors must read and process the complete DTD, and
> any external parsed entities. Non-validating processors are only
> required to process the internal subset (if present) up to the first
> unread parameter entity (except when processing a standalone document).

b.s. - and a plain one. Nothing more to say. Read more authoring
manuals (and fewer W3C revelations - though even there you have to
apply yourself to read out such nonsense).

Full disclosure: I am aware of two camps around of XHTML: the camp
"XHTML in My Heart" and the camp "XHTML in My Content-Type". I have no
aim nor desire to fight with the first one. If one thinks that the
document is whatever you think it is, it's fine. Private fantasies are
not my preoccupation. But OP wanted to /study XHTML behavior in
different situations/. And one cannot study the /actual behavior/ of an
/imaginary thing/. A Shaolin monk on the third level of Enlightenment may
possibly do it, but not a regular person. So before studying how an XHTML
document handles this or that situation, one first needs to get an XHTML
document (application/xhtml+xml) itself for studies. Otherwise it will
be experimentation either with HTML (if text/html) or with XML
(text/xml) - in both cases irrelevant to XHTML specifics.

Michael Winter

Jun 1, 2006, 12:16:48 PM

On 01/06/2006 14:26, VK wrote:

> Michael Winter wrote:

[VK's admission to lacking XML knowledge]

> Full ACK. Any relevance to this particular thread?

This thread is a discussion of an application of XML: XHTML. If you
don't understand XML properly, the OP should be aware of your propensity
for error.

[snip]

>> The standalone document declaration is irrelevant here.
>
> Either you did not read 2.9 or it escaped you completely. You may
> want to try again.

That's quite funny, considering that it was I that explained to you what
the standalone document declaration meant.

I wrote at the time:

The standalone document declaration doesn't instruct a
validating processor to 'do' anything. It is a requirement of
validating processors themselves to process the DTD and any
referenced external entities.

The standalone document declaration does have an impact on
well-formedness (see Entity Declared in section 4.1 [p.33]),
and on non-validating processor when reading parameter
entities.
-- oYIag.70876$wl.3...@text.news.blueyonder.co.uk

As far as the specification reference is concerned, I meant the
following paragraph in particular:

Note that non-validating processors are not obligated to to
read and process entity declarations occurring in parameter
entities or in the external subset; for such documents, the
rule that an entity must be declared is a well-formedness
constraint only if standalone='yes'.
-- 4.1 Character and Entity References, XML 1.0, 3rd ed.

As the document under discussion is not a standalone document and we are
dealing with non-validating processors, it is not a well-formedness
error to encounter a reference to an undeclared entity (I've explained
all of this before). That Firefox and Opera do report errors is
complicated in part[1] by the presence of a public identifier; this
identifier will match entries in their internal catalogues, resulting in
the definition of entities common to XHTML (&aacute;, and the like). The
external subset itself is ignored.

<supposition>The public identifier, and the subsequent use of internal
definitions, produces behaviour somewhat like the processing of an
internal subset. As the TestText entity isn't defined in that internal
DTD, the later reference is considered an error.</>

If validation against the XHTML Basic DTD is desired, then perhaps the
internal subset should define the TestText entity:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd" [
<!ENTITY TestText "This is some test text">
]>

>> Your opinion on the matter is irrelevant. The required behaviour is
>> what it is. Validating processors must read and process the
>> complete DTD, and any external parsed entities. Non-validating
>> processors are only required to process the internal subset (if
>> present) up to the first unread parameter entity (except when
>> processing a standalone document).
>
> b.s. - and a plain one.

If that's your opinion, then why don't you suggest to the W3C that the
next major version of XML makes processing external markup declarations
a requirement? If they say that doing so would be a bad idea, ask them
why they think so. Certainly stop bitching about it here. It won't
change anything, I don't care what you think about the W3C, and I doubt
anyone else does, either.

[snip]

Mike

[1] ...complicated in part

Firefox features a bug that causes a well-formedness error to
be raised for non-standalone documents containing undeclared
entities. Opera acts correctly in this instance by bypassing
the reference. Remove the public identifier and compare the
behaviour of both browsers.

VK

Jun 1, 2006, 1:03:12 PM

Michael Winter wrote:
<snip>

Leaving out the "facts" in your post: just look at yourself. Over the
course of this branch you are fighting for the holy right of an XML processor
not to process external DTD subsets. And you are fighting for it as if it were
a vital part of XML mechanics. You don't want XML to have the freedom to
neglect linked XSL templates; you don't want it to be able to skip
loading stylesheets. Everything else - but for the right not to load
external DTD's you are ready to read all available specs upside down
and diagonally.

This alone is the best proof of the real reason of the "DTD issue" I
spelled several times already.

And sorry, I'm opting out of the theoretical discussion "XML processor
has rights to disregard selectively some linked resources such as...".
That's too crazy to follow.

Paragraphs 2.8 and 2.9 to start with, the rest of the Web to continue -
whoever wants to continue by himself.