DTD in browsers

VK

May 3, 2006, 4:57:11 AM

Randy Webb wrote:
> VK said the following on 5/2/2006 9:48 AM:
> > If you mean "trying to render it" then FF behavior is the same as for
> > all other UA's willing to be in use (and not W3C demos). If document is
> > served as text/html, FF will render it somehow anyhow.
>
> So you are saying it totally disregards the DTD and any hints from the
> server how to handle the document?

Except for the server-reported Content-Type (text/plain, text/html,
text/xml, application/xhtml+xml etc.).
The DTD string itself is irrelevant (and this string by itself is not a
"hint from the server" but a "hint from the document").

> > Obviously it doesn't connect every time to w3.org to get a DTD, it uses
> > a build one.
>
> So you are saying, again, that DTD's are irrelevant?

From the document parsing point of view: yes, absolutely irrelevant.
They have some theoretical importance for document indexing and
searching. Most importantly, a DTD allows (so far) switching IE into the
W3C box model (unless it is the short HTML Transitional one). Without
that, their usage would be limited to ciwas and ciwah exclusively.

> > That would be another aspect of your question: what DTD/
> > tag database is build in into FF? Only one so far: XHTML 1.0 The only
> > namespace for HTML Firefox knows about is
> > xmlns:html="http://www.w3.org/1999/xhtml"
>
> If that is true, then Firefox is not even close to Standards Compliant.

It is true, but Firefox *is* Standards Compliant - as much as it's
humanly possible without rendering a UA useless and by keeping it
attractive for potential users.

> > But what decision will it make based on this table - it depends
> > completely on the Content-Type. Say absolutely the same content with
> > Content-Type text/html will go through or get adjusted, but with
> > application/xhtml+xml will lead to a parsing error.
>
> Odd behavior if you tell it text/html with a 4.01 DTD

WWW doesn't go by extensions or formal document signs, never did and
never will. The only important part is Content-Type. It defines
everything.
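
To make the Content-Type claim concrete, here is a minimal sketch of
choosing a parser from the header value alone. The function name
parserForContentType and the returned labels are illustrative
assumptions, not any browser's API, and real browsers apply further
rules on top of this (doctype sniffing, as later replies point out).

```javascript
// Hypothetical helper: picks a parser family from a Content-Type header value.
// It deliberately ignores the doctype, which is the behaviour claimed above.
function parserForContentType(headerValue) {
  // Drop parameters such as "; charset=iso-8859-1" and normalise case.
  const mediaType = headerValue.split(";")[0].trim().toLowerCase();

  switch (mediaType) {
    case "text/html":
      return "html"; // forgiving tag-soup parser
    case "application/xhtml+xml":
    case "text/xml":
    case "application/xml":
      return "xml"; // strict parser: a well-formedness error stops parsing
    case "text/plain":
      return "text"; // displayed as plain text
    default:
      return "unknown";
  }
}

console.log(parserForContentType("text/html; charset=iso-8859-1")); // html
console.log(parserForContentType("application/xhtml+xml"));         // xml
```

The same markup therefore takes a different path depending only on the
header, which is why identical content can render under text/html and
fail under application/xhtml+xml.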

> > And if anyone curious: the build in DTD of IE6 is
> > <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> This is the
> > only one it's aware of and the only one it uses. Respectively the only
> > type of documents existing in IE is <!DOCTYPE HTML PUBLIC "-//W3C//DTD
> > HTML 4.01 Transitional//EN">
>
> Now that I don't believe.

As you wish. But whether you believe it or not doesn't change anything
in this matter. The only "major" change expected in IE7 is the <abbr>
element being added as a separate entity (for now it is treated as a
synonym of <acronym>). Of course IE knows a bunch of other proprietary
tags. It has tables for behaviors (<public>, <component>, <attach>
etc.), tables for VML (<v:group>, <v:line>, <v:oval> etc.) and so on.
But as for *those* DTDs, the ones from W3C, the above-mentioned DTD is
the only one.

> > By providing other DTD's one can switch IE in "CSS1Compat" mode, but
> > it's just a formal reaction on "Unknown DTD" programmed into the
> > browser, DTD itself never changes.
>
> Can you prove that?

Oh com'on! Again: "prove me that the sky is blue" ? :-)

<!DOCTYPE FOOBAR "Micro$oft must die!">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
</head>
<body onload="alert(document.compatMode)">
</body>
</html>
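
The onload handler in the page above alerts the raw value of
document.compatMode. A minimal sketch of turning that value into a mode
name follows; describeCompatMode is a hypothetical helper, and the only
two values it knows are the ones discussed in this thread.

```javascript
// Hypothetical helper: translates a document.compatMode value into a mode name.
function describeCompatMode(compatMode) {
  if (compatMode === "CSS1Compat") return "standards";
  if (compatMode === "BackCompat") return "quirks";
  return "unknown";
}

// In a browser: alert(describeCompatMode(document.compatMode));
// Outside a browser the value can be passed in directly:
console.log(describeCompatMode("CSS1Compat")); // standards
console.log(describeCompatMode("BackCompat")); // quirks
```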

> > Say you can put IE into CSS1Compat mode by placing instead:
> > <!DOCTYPE FOOBAR "Micro$oft must die!">
>
> Does the fun never end?

See above

> document.doctype gives some neat info in Firefox though.

document.doctype is just a convenient way to access the provided DTD
string, which is hardly accessible otherwise (because it's formally
located outside of any document block, even outside of
documentElement). In IE document.doctype==null for all HTML documents,
so as not to upset DTD users too much, I guess.
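
A minimal sketch of reading document.doctype defensively, given that the
post above reports it as null in IE. The helper name doctypeInfo and the
mock objects are assumptions for illustration; name, publicId and
systemId are the properties of the DOM DocumentType interface.

```javascript
// Hypothetical helper: summarises document.doctype, tolerating its absence.
function doctypeInfo(doc) {
  const dt = doc.doctype;
  if (dt === null || dt === undefined) return null; // no doctype node exposed
  return { name: dt.name, publicId: dt.publicId, systemId: dt.systemId };
}

// Mock documents standing in for a browser's document object.
const withDoctype = {
  doctype: {
    name: "html",
    publicId: "-//W3C//DTD HTML 4.01//EN",
    systemId: "http://www.w3.org/TR/html4/strict.dtd"
  }
};
const withoutDoctype = { doctype: null };

console.log(doctypeInfo(withDoctype).publicId); // -//W3C//DTD HTML 4.01//EN
console.log(doctypeInfo(withoutDoctype));       // null
```

In a browser the call would be doctypeInfo(document).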

Eric B. Bednarz

May 3, 2006, 6:14:10 AM

"VK" <school...@yahoo.com> writes:

> Randy Webb wrote:

>> So you are saying, again, that DTD's are irrelevant?
>
> From the document parsing point of view: yes, absolutely irrelevant.

The document type declaration subset is absolutely relevant if the
document instance is not fully- or amply-tagged and absolutely
irrelevant for rendering.

For HTML as of UAs in the wild it's the other way around of course; but
as usual, it's not really clear what you actually mean, which seems to
be your general discussion strategy.

> They have some theoretical importance for documents' indexing and
> searching.

¿Que?

> Most importantly DTD allows - so far - to switch IE into W3C
> box model (unless short HTML Transitional). Without the latter their
> usage would be limited by ciwas and ciwah exclusively.

Try to get a clue; the target audience scope of 'DTD users' is not a
particular news group but simply those people who care to employ
software that can process the declaration subset in the *authoring*
process, where it belongs.

>>> And if anyone curious: the build in DTD of IE6 is
>>> <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd>

1) Clarify what you mean by that, if anything
2) Please finish step 1 1st
3) Provide evidence

> <!DOCTYPE FOOBAR "Micro$oft must die!">
> <html>
> <head>
> <title>Untitled Document</title>
> <meta http-equiv="Content-Type"
> content="text/html; charset=iso-8859-1">
> </head>
> <body onload="alert(document.compatMode)">

The point being? And as the declaration above is invalid in any
scenario you could boil it down to the string which is relevant for M$IE
here, namely '<!DOCTYPE', e.g.

<!DOCTYPE<html><body onload="alert(document.compatMode)">

--
||| hexadecimal EBB
o-o decimal 3771
--oOo--( )--oOo-- octal 7273
205 goodbye binary 111010111011

Eric B. Bednarz

May 3, 2006, 6:22:40 AM

I wrote:

> "VK" <school...@yahoo.com> writes:

>> From the document parsing point of view: yes, absolutely irrelevant.

<ins/From the parsing point of view, /

> The document type declaration subset is absolutely relevant if the
> document instance is not fully- or amply-tagged and absolutely
> irrelevant for rendering.
>
> For HTML as of UAs in the wild it's the other way around of course;

[... etc]

VK

May 3, 2006, 6:23:39 AM

The thread is moved to ciwah as OT to clj.
One may look at
<http://groups.google.com/group/comp.infosystems.www.authoring.html/browse_frm/thread/4ac44109aac7fa53/9849c2f0f8ec9a28>

VK

May 3, 2006, 7:04:03 AM

> "VK" <school...@yahoo.com> writes:
> >> So you are saying, again, that DTD's are irrelevant?
> > From the document parsing point of view: yes, absolutely irrelevant.

Eric B. Bednarz wrote:
> The document type declaration subset is absolutely relevant if the
> document instance is not fully- or amply-tagged and absolutely
> irrelevant for rendering.
>
> For HTML as of UAs in the wild it's the other way around of course; but
> as usual, it's not really clear what you actually mean, which seems to
> be your general discussion strategy.

? You just repeated my statement with one word changed ("rendering"
instead of "parsing") but you say that my statement is not clear. I
don't get it.

> > They have some theoretical importance for documents' indexing and
> > searching.
>
> Que?

I don't know of any practical applications of that. But theoretically
it is possible to search for only HTML 4.01 documents, or only
pre-4.01 HTML documents, or only XHTML documents, if one decides to
build such a search engine.

> > Most importantly DTD allows - so far - to switch IE into W3C
> > box model (unless short HTML Transitional). Without the latter their
> > usage would be limited by ciwas and ciwah exclusively.
>
> Try to get a clue; the target audience scope of 'DTD users' is not a
> particular news group but simply those people who care to employ
> software that can process the declaration subset in the *authoring*
> process, where it belongs.

I don't really care what the target audience may use a DTD for:
authoring, future searching, personal preference or anything else.
Nor am I calling for DTD declarations to be dropped: as long as they
don't do any harm, please use them as suggested. And currently they are
as harmless as they are useless (for Content-Type text/html, and apart
from the IE box model switch trick).

The original discussion arose from a *practical* aspect. Anyone is
entitled to carefully choose and insert some DTD. But as a developer
she has to be aware that it doesn't change anything in the document
loaded into the UA, as long as it's served with the same Content-Type.
Whether it's "HTML Transitional", "HTML Strict", "XHTML Strict" or
"FOOBAR VK NIGHTLY", served with the text/html Content-Type it produces
the same document tree and the same graphics context.

And again, I'm not calling for turning the DTD stuff into a joke and
using said <!DOCTYPE FOOBAR VK NIGHTLY> in your documents. Please use
only a DTD relevant to the markup used. One just needs to know whether
and how the chosen actions affect real-life behavior.

> >>> And if anyone curious: the build in DTD of IE6 is
> >>> <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd>
>
> 1) Clarify what you mean by that, if anything

That IE has only one built-in DTD, which it uses for all Content-Type:
text/html documents:
<http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd>
I don't know how to be clearer than that.

> 2) Please finish step 1 1st

I guess I did

> 3) Provide evidence

As soon as you prove to me (with your own samples) that the default IE
box model differs from the W3C one. If we are going through the boring
process of proving the obvious, then the suffering has to be at least
mutual ;-)

> > <!DOCTYPE FOOBAR "Micro$oft must die!">
> > <html>
> > <head>
> > <title>Untitled Document</title>
> > <meta http-equiv="Content-Type"
> > content="text/html; charset=iso-8859-1">
> > </head>
> > <body onload="alert(document.compatMode)">
>
> The point being?

Add HTML content of any complexity to <body>. Open the page in IE.
Change the bogus DTD to any "official" one (HTML or XHTML). Open in
IE.
Look for differences.
Save all open pages with File > Save As > Complete page.
Look for the DTD in the saved pages.

Henri Sivonen

May 3, 2006, 4:24:27 PM

In article <1146646631....@j73g2000cwa.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> Randy Webb wrote:

> > So you are saying it totally disregards the DTD and any hints from the
> > server how to handle the document?
>
> Except server reported Content-Type (text/plain, text/html, text/xml,
> application/xhtml+xml etc.)
> DTD string itself is irrelevant (and this string by itself is not a
> "hint from the server" but a "hint from the document").

The DTD is irrelevant (it is not fetched). However, for text/html, the
doctype is relevant:
http://hsivonen.iki.fi/doctype/

> They have some theoretical importance for documents' indexing and
> searching.

No, they do not.

> Most importantly DTD allows - so far - to switch IE into W3C
> box model (unless short HTML Transitional).

And Firefox. And Opera. And Safari.

> WWW doesn't go by extensions or formal document signs, never did and
> never will. The only important part is Content-Type. It defines
> everything.

Except, of course, when it doesn't.

> > > And if anyone curious: the build in DTD of IE6 is
> > > <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> This is the
> > > only one it's aware of and the only one it uses.

IE does not have built-in DTDs at all. The parsing is not DTD-based.

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/

VK

May 3, 2006, 5:06:01 PM

Henri Sivonen wrote:
> The DTD is irrelevant (it is not fetched). However, for text/html, the
> doctype is relevant:
> http://hsivonen.iki.fi/doctype/

Not for IE6... or "not exactly", to word it better. This browser has
four options for two states (BackCompat and CSS1Compat):

Option 1: No DTD at all
compatMode -> BackCompat == IE box model

Option 2: Short Transitional (no URI)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
compatMode -> BackCompat == IE box model

Option 3: Full Transitional (with URI)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html401/loose.dtd">
compatMode -> CSS1Compat == W3C box model

Option 4: Any text within <!.. > brackets as the first line in the
document except Option 2
compatMode -> CSS1Compat == W3C box model

In this aspect say both
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> and

<!DOCTYPE FOOBAR "Micro$oft must die!">

are going by the Option 4

On more than one <> pair before <html> tag IE treats everything as
trash and disregards until it hits a pair starting with <!DOCTYPE...
This way in XHTML agglomerate like
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
it sees only the last <> pair and goes by Option 4.
Actually the above agglomerate makes me wonder myself because it's a
wrong syntax for XML documents (if it pretends to be such).

> > They have some theoretical importance for documents' indexing and
> > searching.
>
> No, they do not.

*theoretically* ;-)

> > Most importantly DTD allows - so far - to switch IE into W3C
> > box model (unless short HTML Transitional).
>
> And Firefox. And Opera. And Safari.

No, because it's impossible. Firefox and the others do not have an IE
box model that one could switch on or off. They have only one box
model, regardless of the DTD.

> > WWW doesn't go by extensions or formal document signs, never did and
> > never will. The only important part is Content-Type. It defines
> > everything.
>
> Except, of course, when it doesn't.

As if?

> > > > And if anyone curious: the build in DTD of IE6 is
> > > > <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> This is the
> > > > only one it's aware of and the only one it uses.
>
> IE does not have built-in DTDs at all. The parsing is not DTD-based.

Of course it does: otherwise how would it decide which tag to render
and how, which attributes to use for rendering and which to disregard?
It does have the above-mentioned DTD, though of course not in the text
format as posted at the URL; it's binary-coded in its parser.

VK

May 3, 2006, 5:07:54 PM

Damn, somebody messed up followups... must be me... sorry

Richard Cornford

May 3, 2006, 5:27:34 PM

VK wrote:
> Randy Webb wrote:
>> VK said the following on 5/2/2006 9:48 AM:

<snip>

>>> And if anyone curious: the build in DTD of IE6 is
>>> http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd
>>> This is the only one it's aware of and the only one it uses.
>>> Respectively the only type of documents existing in IE is
>>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<snip>

>>> By providing other DTD's one can switch IE in "CSS1Compat" mode,
>>> but it's just a formal reaction on "Unknown DTD" programmed into
>>> the browser, DTD itself never changes.
>>
>> Can you prove that?
>
> Oh com'on! Again: "prove me that the sky is blue" ? :-)

Experience has told us that your perception of blue looks far too
magenta for anything you say to be beyond question. Indeed so magenta at
times that it makes more sense to assume that everything you say is
nonsense.

In this case you are asserting that:-

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

- is the only formulation of <!DOCTYPE ... > that IE is 'aware' of, and
that _all_ others are "Unknown DTD" and that the use of an "Unknown DTD"
will result in IE going into standards mode (as manifest in the JScript
expression - document.compatMode - returning the string "CSS1Compat").

> <!DOCTYPE FOOBAR "Micro$oft must die!">
> <html>
> <head>
> <title>Untitled Document</title>
> <meta http-equiv="Content-Type"
> content="text/html; charset=iso-8859-1">
> </head>
> <body onload="alert(document.compatMode)">
> </body>
> </html>

So if I substitute:-

<!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>

In the above IE will consider this an "Unknown DTD" and the javascript
will alert "CSS1Compat"? But it doesn't, it alerts "BackCompat",
indicating that it went into quirks mode. And it does the same with:-

<!DOCTYPE HTML 4.99>
<!DOCTYPE HTML 40>
<!DOCTYPE HTML 200000000 ANY OLD RUBBISH>
<!DOCTYPE html 3210987654>
- and:-
<!DOCTYPE ANY OLD RUBBISH HTML 200000000>

- along with literally millions of other permutations.
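
Checking many doctype variants by hand is tedious, so the test pages can
be generated. The following is a minimal sketch, assuming a hypothetical
helper buildTestPage that wraps each candidate declaration in the same
test page used earlier in the thread; it only builds the markup and makes
no claim about what any browser reports for it.

```javascript
// Hypothetical helper: builds a test page that alerts document.compatMode
// for the given doctype declaration.
function buildTestPage(doctype) {
  return [
    doctype,
    "<html>",
    "<head>",
    "<title>compatMode test</title>",
    "</head>",
    '<body onload="alert(document.compatMode)">',
    "</body>",
    "</html>"
  ].join("\n");
}

// Declarations taken from the post above.
const candidates = [
  '<!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>',
  "<!DOCTYPE HTML 4.99>",
  "<!DOCTYPE HTML 40>",
  "<!DOCTYPE HTML 200000000 ANY OLD RUBBISH>",
  "<!DOCTYPE html 3210987654>",
  "<!DOCTYPE ANY OLD RUBBISH HTML 200000000>"
];

// One page per declaration; save each to a file and open it in the browser.
const pages = candidates.map(buildTestPage);
console.log(pages[1]);
```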

This, of course, demonstrates that what you have been whittering on
about is utter nonsense, again. You would benefit considerably by
understanding that making things up off the top of your head and then
asserting that they are as true as "the sky is blue" is not the route
to understanding, and certainly will not convince anyone to take you
seriously.

Richard.

Henri Sivonen

May 3, 2006, 5:38:32 PM

In article <1146690361.9...@g10g2000cwb.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> Henri Sivonen wrote:
> > The DTD is irrelevant (it is not fetched). However, for text/html, the
> > doctype is relevant:
> > http://hsivonen.iki.fi/doctype/
>
> Not for IE6... "not exactly" for a better wording. This browser has
> four options for two states (backCompat and CSS1Compat):

Let me guess. You did not read the document referenced above.

> Option 4: Any text within <!.. > brackets as the first line in the
> document except Option 2
> compatMode -> CSS1Compat == W3C box model

Like <!DOCTYPE HTML PUBLIC "ISO/IEC 15445:1999//DTD HTML//EN"> perhaps?
;-)

> This way in XHTML agglomerate like
> <?xml version="1.0" encoding="iso-8859-1"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> it sees only the last <> pair and goes by Option 4.

Are you sure?

> > > Most importantly DTD allows - so far - to switch IE into W3C
> > > box model (unless short HTML Transitional).
> >
> > And Firefox. And Opera. And Safari.
>
> No, because it's impossible. Firefox and others do not have IE box
> model one could switch on or off. They have only one box model -
> irrelevant of DTD.

Actually, earlier versions of Opera did have the IE box model in the
quirks mode. My point was that even though they don't have the exact IE
box model quirks, they do doctype sniffing nonetheless.

> > > WWW doesn't go by extensions or formal document signs, never did and
> > > never will. The only important part is Content-Type. It defines
> > > everything.
> >
> > Except, of course, when it doesn't.
>
> As if?

http://ln.hixie.ch/?start=1144794177&count=1

> > > > > And if anyone curious: the build in DTD of IE6 is
> > > > > <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> This is
> > > > > the
> > > > > only one it's aware of and the only one it uses.
> >
> > IE does not have built-in DTDs at all. The parsing is not DTD-based.
>
> Of course it does: otherwise how it would decide that tag to render and
> how, what attributes to use for rendering and what to disregard?

From hand-crafted C++ code and from CSS.

VK

May 3, 2006, 5:49:51 PM

Richard Cornford wrote:
> In this case you are asserting that:-
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>
> - is the only formulation of <!DOCTYPE ... > that IE is 'aware' of

right

> and
> that _all_ others are "Unknown DTD" and that the use of an "Unknown DTD"
> will result in IE going into standards mode (as manifest in the JScript
> expression - document.compatMode - returning the string "CSS1Compat").

I'm using the document.compatMode value only for a quick demo. From a
practical point of view it is irrelevant what the document.compatMode
property is set to. What *is* relevant is whether IE is in the IE box
model or the W3C box model. So a real "manifestation" would be, say, a
div with width:100% and margin/padding set, placed inside another
element, and the resulting rendering change. But for a quick demo the
document.compatMode value does the trick.

> So if I substitute:-
>
> <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>
>
> In the above IE will consider this an "Unknown DTD"

No. It will consider it as a non-rendering trash before the opening
<html> tag.
You are missing the difference between "Unrecognized DTD declaration"
like

<!DOCTYPE FOOBAR "Micro$oft must die!">

and "Not a DTD declaration at all" like

<!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>

The proper (at least from the IE's point of view) DTD declaration
syntax is described here:
<http://msdn.microsoft.com/workshop/author/dhtml/reference/objects/doctype.asp>

If a string is not matching the declared pattern, it's being simply
ignored.

Change <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3> back to
<!DOCTYPE FOOBAR "Micro$oft must die!"> - and the parser will hit a
match right away.

Richard Cornford

May 3, 2006, 6:49:28 PM

VK wrote:
> Richard Cornford wrote:
>> In this case you are asserting that:-
>>
>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>>
>> - is the only formulation of <!DOCTYPE ... > that IE is 'aware' of
>
> right
>
>> and
>> that _all_ others are "Unknown DTD" and that the use of an
>> "Unknown DTD" will result in IE going into standards mode
>> (as manifest in the JScript expression - document.compatMode
>> - returning the string "CSS1Compat").
>
> I'm using document.compatMode value only for a quick demo. From a
> practical point of view it is irrelevant what document.compatMode
> property is set to. What *is* relevant if IE in IE Box Model or W3C
> Box Model. So a real "manifestation" would be say a div with
> width:100% with margin/padding set inside another element and the
> rendering change. But for a quick demo document.compatMode value
> does the trick.

What are you whittering about now? The - document.compatMode - property
exposed to scripts is there to state the mode the browser is operating
in. For IE, if one value then one box model, always, and if the other
value then the other box model.
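
The two box models differ in what the CSS width property measures. In
the W3C model, width is the content box, and padding and borders are
added outside it; in the IE (quirks) model, width already includes
padding and borders. A minimal sketch of the arithmetic, with a
hypothetical function name and parameters, all values in pixels, margins
left out, and equal padding and border on both sides assumed:

```javascript
// Hypothetical helper: content width and total rendered width of a box
// for a given CSS width under each box model.
function boxWidths(cssWidth, padding, border, model) {
  const extras = 2 * (padding + border); // left + right padding and border
  if (model === "w3c") {
    // width sets the content box; padding and border are added on top
    return { content: cssWidth, total: cssWidth + extras };
  }
  if (model === "ie") {
    // width sets the border box; the content area shrinks instead
    return { content: cssWidth - extras, total: cssWidth };
  }
  throw new Error("unknown box model: " + model);
}

console.log(boxWidths(100, 10, 5, "w3c")); // { content: 100, total: 130 }
console.log(boxWidths(100, 10, 5, "ie"));  // { content: 70, total: 100 }
```

The 30-pixel difference in this example is the kind of rendering change
a layout test would show between the two modes.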

>> So if I substitute:-
>>
>> <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>
>>
>> In the above IE will consider this an "Unknown DTD"
>
> No. It will consider it as a non-rendering trash before the opening
> <html> tag.
> You are missing the difference between "Unrecognized DTD declaration"
> like
> <!DOCTYPE FOOBAR "Micro$oft must die!">
> and "Not a DTD declaration at all" like
> <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>

ROTFLOL

<!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>

- Quirks mode, and:-

<!DOCTYPE FOOBAR "Micro$oft must die!" HTML 5>

- Standards mode. If IE considers either as "Not a DTD declaration at
all" because of its format, it must consider both of them not to be
DTDs and treat them the same. It doesn't treat them the same.

<snip>

> Change <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3> back to
> <!DOCTYPE FOOBAR "Micro$oft must die!"> - and the parser will hit a
> match right away.

And change it to:-

<!DOCTYPE FOOBAR "Micro$oft must die! HTML 30">

- and we are back to quirks mode. There is no point in your winding
around trying to justify your nonsense. You made it up off the top of
your head so the odds are that it does not describe reality.

Richard.

Eric B. Bednarz

May 3, 2006, 9:52:32 PM

"VK" <school...@yahoo.com> writes:

> You are missing

*Somebody* is missing *something* for sure here.

> the difference between "Unrecognized DTD declaration"
> like
> <!DOCTYPE FOOBAR "Micro$oft must die!">

I wouldn't know what an "(Un)recognized DTD declaration" is, really. A
document type declaration is bound to expect either one of the keywords
SYSTEM or PUBLIC or an internal subset after the root element
specification.

> and "Not a DTD declaration at all" like
> <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>

If you have an SGML parser that doesn't die on '"' in either case,
please publish the SGML declaration you used to accomplish that.
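
The syntactic point can be illustrated with a rough check of what follows
the root element name in a declaration. This is a minimal sketch under
stated assumptions: classifyDoctype is a hypothetical helper, it uses a
regular expression rather than an SGML or XML parser, it only matches the
upper-case keywords, and it does not validate the identifiers themselves.

```javascript
// Hypothetical helper: reports which construct follows the root element
// name in a doctype declaration.
function classifyDoctype(declaration) {
  const match = /^<!DOCTYPE\s+([^\s>\[]+)\s*([\s\S]*)>$/.exec(declaration.trim());
  if (match === null) return "not a doctype declaration";

  const rest = match[2].trim(); // everything between the name and ">"
  if (rest === "") return "name only";
  if (/^PUBLIC(\s|$)/.test(rest)) return "PUBLIC";
  if (/^SYSTEM(\s|$)/.test(rest)) return "SYSTEM";
  if (rest.startsWith("[")) return "internal subset";
  return "unexpected token after name";
}

console.log(classifyDoctype('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'));
// PUBLIC
console.log(classifyDoctype('<!DOCTYPE FOOBAR "Micro$oft must die!">'));
// unexpected token after name
```

Both FOOBAR variants from the thread fall into the last category, since a
quoted string directly follows the name in each.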

Toby Inkster

May 4, 2006, 3:30:29 AM

VK wrote:

> It is true, but Firefox *is* Standards Compliant - as much as it's
> humanly possible without rendering a UA useless and by keeping it
> attractive for potential users.

Nonsense. Firefox doesn't support, for example, 'font-size-adjust'[1] from
the CSS 2 spec, but doing so wouldn't make it less attractive to potential
users.

And there are plenty[2] of other bug-fixes and improvements to standards
compliance that could be implemented without making it less attractive to
users.

____
1. http://www.w3.org/TR/REC-CSS2/fonts.html#propdef-font-size-adjust
2. https://bugzilla.mozilla.org/show_bug.cgi?id=238072
https://bugzilla.mozilla.org/show_bug.cgi?id=325680
https://bugzilla.mozilla.org/show_bug.cgi?id=318518
https://bugzilla.mozilla.org/show_bug.cgi?id=178258
https://bugzilla.mozilla.org/show_bug.cgi?id=312880
https://bugzilla.mozilla.org/show_bug.cgi?id=311942
https://bugzilla.mozilla.org/show_bug.cgi?id=311623
etc

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

ASM

May 4, 2006, 5:30:28 AM

Toby Inkster wrote:

> Firefox doesn't support, for example, 'font-size-adjust'[1] from
> the CSS 2 spec,

> ____
> 1. http://www.w3.org/TR/REC-CSS2/fonts.html#propdef-font-size-adjust

anyway ...
none of my browsers (Safari 1.3, Opera 9.00, IE 5.2, Fx 1.5.0.3)
supports this property :-(

> And there are plenty[2] of other bug-fixes and improvements to standards
> compliance that could be implemented without making it less attractive to
> users.
>

> 2. https://bugzilla.mozilla.org/show_bug.cgi?id=238072

My English is too poor to understand anything in these reports.
Where are the examples of the problems encountered?

> etc

--
Stephane Moriaux et son [moins] vieux Mac

VK

May 5, 2006, 3:11:29 AM

Toby Inkster wrote:
> Nonsense. Firefox doesn't support, for example, 'font-size-adjust'[1] from
> the CSS 2 spec, but doing so wouldn't make it less attractive to potential
> users.
>
> And there are plenty[2] of other bug-fixes and improvements to standards
> compliance that could be implemented without making it less attractive to
> users.

Unlike CSS1, CSS2.1 is just a working draft: "Publication as a Working
Draft does not imply endorsement by the W3C Membership. This is a draft
document and may be updated, replaced or obsoleted by other documents
at any time. It is inappropriate to cite this document as other than
work in progress."

Besides some more or less stable parts, the CSS 2.1 draft is also used
as a dumpster for some of the W3C members' nightly thoughts and
revelations :-)

No "real" UA producer could just grab such a document *in whole* and
rewrite the entire engine around it.

The most promising features are first adopted as -moz extensions, and
if proven to be usable and useful they are eventually added to the main
set (like -moz-opacity > opacity).

At the same time the CSS 2.1 working draft contains a lot of nonsense
which will never make it into real life (and should really be removed
right now so as not to confuse developers' minds).

Say, the :before and :after pseudo-elements are an application of XBL
(Mozilla) / Viewlink (Microsoft), but taken out of space, context and
sense. The implications of autogenerated anonymous content (DOM tree,
ids' visibility scope etc.) are a big separate issue, carefully treated
in both mentioned technologies. But if one has no clue about the
subject, then of course it's as simple as adding two new
pseudo-elements to the specs.

P.S.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>Demo</title>
<style type="text/css">
#p01 {
-moz-binding: url(beforeafter.xml#default);
}
</style>
</head>
<body>

<p id="p01">Default content</p>

</body>
</html>

// beforeafter.xml

<?xml version="1.0"?>
<bindings xmlns="http://www.mozilla.org/xbl"
xmlns:html="http://www.w3.org/1999/xhtml">
<binding id="default">
<content>
<html:p>Content before</html:p>
<children/>
<html:p>Content after</html:p>
</content>
</binding>
</bindings>

Steve Pugh

May 5, 2006, 3:44:26 AM

"VK" <school...@yahoo.com> wrote:

>Toby Inkster wrote:
>> Nonsense. Firefox doesn't support, for example, 'font-size-adjust'[1] from
>> the CSS 2 spec, but doing so wouldn't make it less attractive to potential
>> users.
>>
>> And there are plenty[2] of other bug-fixes and improvements to standards
>> compliance that could be implemented without making it less attractive to
>> users.
>
>Unlike CSS1, CSS2.1 is just a working draft: "Publication as a Working
>Draft does not imply endorsement by the W3C Membership. This is a draft
>document and may be updated, replaced or obsoleted by other documents
>at any time. It is inappropriate to cite this document as other than
>work in progress."

That's true as far as it goes, but CSS 2.1 is actually at the
Candidate Recommendation stage (a stage that didn't exist when CSS 1
was drafted) and regardless of its official status it's the closest
thing we have to a standard for CSS today.

>Besides some more or less stable parts, CSS 2.1 draft is also used as a
>dumpster for some of W3C' members nightly thoughts and revelations :-)

I think you may be getting confused with CSS 3.

>No one "real" UA producer could just grab such document *in whole* and
>rewrite the entire engine under it.

Well no, they have to be bugwards compatible with all the junk code
that's already out there as well. But all the major browser developers
are aiming to complete their support for CSS 2.1.

>The most promising features are being first taken as -moz extensions,
>and if proven to be usable and useful then eventually added to the main
>set (like -moz-opacity > opacity).

opacity is in CSS 3, not CSS 2.1

>At the same time CSS 2.1 working draft contans a lot of nonsense which
>will never make into real life (and should be really removed right now
>so to not confuse developers' minds).

Care to give some more examples?

CSS 2.1 removed stuff from CSS 2 that wasn't supported by browsers.
CSS 2.1 only added a few things, mostly stuff that had already started
to be supported by browsers.

CSS 2.1 items that might be dropped due to poor browser support make a
rather short list. See 'Features at Risk' on the home page of the CSS
2.1 draft. So it looks like all the rest of the nonsense has already
been implemented by developers.

Again I think you're confusing CSS 2.1 with CSS 3.

>Say :before and :after pseudo-elements is an application of XBL
>(Mozilla) / Viewlink (Microsoft) but taken out of space, context and
>sense. The implications of autogenerated anonymous content (DOM tree,
>id's visibility scope etc.) is a big separate issue carefully treated
>in both mentioned technologies. But if one has no clue about the
>subject, then of course it's as simple as to add two new
>presudo-elements into specs.

If you look you'll see that :before and :after were already in CSS 2,
which became a recommendation in 1998, so I'm not sure that your
argument holds up on historical grounds let alone technical grounds.

Steve
--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st...@pugh.net> <http://steve.pugh.net/>

VK

May 5, 2006, 4:05:09 AM

Steve Pugh wrote:
> CSS 2.1 is actually at the
> Candidate Recommendation stage

In what organization? Not in W3C at least where the latest publication
(April 2006) <http://www.w3.org/TR/CSS21/> is still in Working Draft
status.

> >Besides some more or less stable parts, CSS 2.1 draft is also used as a
> >dumpster for some of W3C' members nightly thoughts and revelations :-)
>
> I think you may be getting confused with CSS 3.

No I'm talking about CSS 2.1.
CSS 3 is a separate and even more difficult issue.

> >At the same time CSS 2.1 working draft contans a lot of nonsense which
> >will never make into real life (and should be really removed right now
> >so to not confuse developers' minds).
>
> Care to give some more examples?

I gave an example on the next line (:before / :after pseudo-elements
<http://www.w3.org/TR/CSS21/selector.html#before-and-after> )

You mean like a full revision of CSS 2.1 made by VK?
I doubt it would be of a wide public interest. :-)

Steve Pugh

May 5, 2006, 4:57:27 AM

VK wrote:

> Steve Pugh wrote:
>>
> > >At the same time CSS 2.1 working draft contans a lot of nonsense
> > >which will never make into real life (and should be really removed right
> > >now so to not confuse developers' minds).
> >
> > Care to give some more examples?
>
> I gave an example on the next line (:before / :after pseudo-elements
> <http://www.w3.org/TR/CSS21/selector.html#before-and-after> )

That's why I said _more_ examples.

Anything else from CSS 2.1 that you think is "nonsense which will never
make into real life (and should be really removed right now so to not
confuse developers' minds)"?

Unlike :before/:after which isn't nonsense, is already in real life and
which doesn't really seem to be confusing any developers as far as I
can see.

Steve

VK

May 5, 2006, 5:19:51 AM

Richard Cornford wrote:
> <snip>
> > Change <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3> back to
> > <!DOCTYPE FOOBAR "Micro$oft must die!"> - and the parser will hit a
> > match right away.
>
> And change it to:-
>
> <!DOCTYPE FOOBAR "Micro$oft must die! HTML 30">
>
> - and we are back to quirks mode.

and change it to <!DOCTYPE FOOBAR "Micro$oft must die! XHTML 30"> and
we are back to CSS1Compat mode.

and now remove "DOCTYPE" word: <!FOOBAR "Micro$oft must die! XHTML 30">
and we are back to BackCompat mode.

Someone (not me) just refuses to read the relevant producer's
documentation and prefers to make up her own picture. And the needed
reading is really not so big in volume, just here:
<http://msdn.microsoft.com/workshop/author/dhtml/reference/objects/doctype.asp>
and here:
<http://msdn.microsoft.com/library/en-us/dnie60/html/cssenhancements.asp>

But I can make the challenge even less challenging :-) by giving you a
plain words summary:

1) IE treats any string starting with <!DOCTYPE before the <html> tag as
a DTD
2) If it meets such a string, it looks for the substring
HTML<space><number>
3) If found, it chooses the mode by the table listed at
<http://msdn.microsoft.com/workshop/author/dhtml/reference/objects/doctype.asp>
(Posting tables in a plain-text message is a pain, so this part of the
work you have to do yourself by visiting the linked page.)

By that table, these strings will lead to BackCompat (quirks) mode:
<!DOCTYPE FOOBAR HTML>
<!DOCTYPE FOOBAR HTML 2.0>
<!DOCTYPE FOOBAR HTML 2>
<!DOCTYPE FOOBAR HTML 2006>
<!DOCTYPE FOOBAR HTML 3.0>
<!DOCTYPE FOOBAR HTML 4>
and these will lead to CSS1Compat (so-called "standards") mode:
<!DOCTYPE FOOBAR XHTML>
<!DOCTYPE FOOBAR HTML 5>
<!DOCTYPE FOOBAR HTML 666>
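
The examples above can be reproduced with a few lines of JavaScript. This is only a sketch of the rule as this post words it, not IE's actual algorithm, which nobody here can confirm without Microsoft's source. It assumes that only the first digit after "HTML " matters, because that is the simplest reading under which "HTML 2006" ends up in BackCompat while "HTML 5" ends up in CSS1Compat. The function name modeByPostedRule is made up for the illustration, and strings with no "HTML" in them are left undecided because the list above does not cover them.

```javascript
// Sketch of the rule as summarized in this post (NOT IE's real code).
// Assumption: only the first digit after "HTML " is looked at.
function modeByPostedRule(doctype) {
  // Step 1: the string must start with <!DOCTYPE to be considered at all.
  if (!/^<!DOCTYPE\b/.test(doctype)) {
    return 'BackCompat';
  }
  // "XHTML" anywhere in the string wins, per the examples.
  if (/XHTML/.test(doctype)) {
    return 'CSS1Compat';
  }
  // Step 2: look for HTML, optionally followed by <space><digit>.
  const match = /HTML(?: (\d))?/.exec(doctype);
  if (!match) {
    return null; // not covered by the listed examples
  }
  if (match[1] === undefined) {
    return 'BackCompat'; // bare "HTML" with no number
  }
  // Step 3 (assumed shortcut for the MSDN table): leading digit 5 or more.
  return Number(match[1]) >= 5 ? 'CSS1Compat' : 'BackCompat';
}
```

Calling modeByPostedRule('<!DOCTYPE FOOBAR HTML 2006>') follows the BackCompat branch, and modeByPostedRule('<!DOCTYPE FOOBAR HTML 666>') the CSS1Compat one. Real W3C doctypes are outside what this sketch tries to model.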

Here is a test page to check that IE indeed follows the MSDN specs. As a
reminder of what this mode business is really about, I also added a
width:100% element inside another element. So besides the mode string
change one can also see the layout change.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<title>$Template$</title>

<style type="text/css">
</style>
<script type="text/javascript">
function init() {
window.alert(document.compatMode);
}

window.onload = init;
</script>
</head>

<body>
<p><span style="width:auto">Sample length 40em</span></p>
<p><span style="width:100%">Sample length 40em</span></p>
</body>
</html>
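
For readers who prefer a label to the raw string, the alert in the test page could be routed through a small helper. This is a minimal sketch: document.compatMode and its two values "CSS1Compat" and "BackCompat" are real, while the function name describeCompatMode and the wording of the labels are made up for the illustration.

```javascript
// Translate a document.compatMode value into the names used in this thread.
// The helper name and the label texts are illustrative only.
function describeCompatMode(mode) {
  if (mode === 'CSS1Compat') {
    return 'CSS1Compat: standards mode';
  }
  if (mode === 'BackCompat') {
    return 'BackCompat: quirks mode';
  }
  // Very old browsers do not expose compatMode at all.
  return 'unknown: compatMode not reported';
}

// In the test page one would write:
//   window.alert(describeCompatMode(document.compatMode));
```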

VK

May 5, 2006, 5:58:42 AM

Steve Pugh wrote:
> > I gave an example on the next line (:before / :after pseudo-elements
> > <http://www.w3.org/TR/CSS21/selector.html#before-and-after> )
>
> That's why I said _more_ examples.

Theoretical criticism of W3C working drafts is really of no interest to
me. I will call sh** sh** when it starts to smell, not when it's
just written somewhere.

> Unlike :before/:after which isn't nonsense

a complete nonsense - and highly amateurish one, disregarding all
anonymous content implications. Next thing would be to add <circle> tag
to XHTML. It is OK that SVG exists, but for the convenience of an
occasional usage - why not? :-)

> is already in real life

You mean the partial :before pseudo-element support in Firefox and
Opera? It's very sad, I don't know how it got through, especially
in Gecko with its XBL support. Just look at the poor DOM Inspector, and
compare with the proper way in my sample.

> and which doesn't really seem to be confusing any developers as far as I
> can see.

Because no one is using it, except a very bad developer on a Gecko-only
intranet.

VK

May 5, 2006, 11:02:40 AM

Henri Sivonen wrote:
> "VK" <school...@yahoo.com> wrote:
>
> > Henri Sivonen wrote:
> > > The DTD is irrelevant (it is not fetched). However, for text/html, the
> > > doctype is relevant:
> > > http://hsivonen.iki.fi/doctype/
> >
> > Not for IE6... "not exactly" for a better wording. This browser has
> > four options for two states (backCompat and CSS1Compat):
>
> Let me guess. You did not read the document referenced above.

Now I did. It is a nice table there, but AFAICT you built it on the
same wrong assumption about how DTD sniffing really works, at least in
IE. Without that it is just a record of test results, not an
explanation. For the real explanation and relevant links, see my post in
this thread. Besides the results listed in your table, it also explains
why, say,
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 222//EN">
leaves IE in BackCompat mode while
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 666//EN">
switches it into CSS1Compat mode.

Your table also says nothing about the "Unrecognized DTD" situation,
that is, about a perfectly valid DTD that is not on the W3C lists.
It is often forgotten, but creating DTD files is not an exclusive right
of the W3C. Everyone is welcome (and it's widely done in XML) to create
their own DTDs for a particular set of documents.

> > Option 4: Any text within <!.. > brackets as the first line in the
> > document except Option 2
> > compatMode -> CSS1Compat == W3C box model
>
> Like <!DOCTYPE HTML PUBLIC "ISO/IEC 15445:1999//DTD HTML//EN"> perhaps?
> ;-)

:-)
No, this gives BackCompat. If you have the "HTML" string in the DTD, it
has to be followed by <space><number 5 or greater> in order to trigger
CSS1Compat mode, see my post.

> > This way in XHTML agglomerate like
> > <?xml version="1.0" encoding="iso-8859-1"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > it sees only the last <> pair and goes by Option 4.
>
> Are you sure?

Here you got me: the XHTML declaration soup indeed knocks IE out and it
simply ignores everything until the <html> tag (so staying in BackCompat
mode). So the parsing rule is narrower than I thought: there must be only
one string before <html> and it must start with "<!DOCTYPE" for the
parser to pick up on it.

The practical conclusion would be never to use "fully qualified" XHTML
declarations (with <?xml version="1.0" encoding="iso-8859-1"?>) in
documents served as text/html. Besides all other drawbacks, it forces IE
to stay in quirks mode.
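
A quick way to check a text/html source for this pitfall is to test whether anything comes before the doctype. The following is a minimal sketch: the function name doctypeComesFirst is hypothetical, leading whitespace is assumed to be harmless, and it only looks at what precedes the doctype, not at what the doctype says.

```javascript
// Returns true when nothing but whitespace precedes <!DOCTYPE in the source.
// An XML declaration (or anything else) in front of it makes this false,
// which is the situation described above for IE 6.
function doctypeComesFirst(source) {
  return /^\s*<!DOCTYPE\b/i.test(source);
}

const withProlog =
  '<?xml version="1.0" encoding="iso-8859-1"?>\n' +
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n' +
  '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html></html>';

const withoutProlog =
  '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">\n<html></html>';
```

Here doctypeComesFirst(withProlog) is false and doctypeComesFirst(withoutProlog) is true.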

The rest goes by my explanations though. The only text (besides
!DOCTYPE) the parser is interested in is "HTML" or "XHTML" sequences.
Say to have CSS1Compat mode it is enough to place <!DOCTYPE XHTML>

> > > > Most importantly DTD allows - so far - to switch IE into W3C
> > > > box model (unless short HTML Transitional).
> > >
> > > And Firefox. And Opera. And Safari.
> >
> > No, because it's impossible. Firefox and others do not have IE box
> > model one could switch on or off. They have only one box model -
> > irrelevant of DTD.
>
> Actually, earlier versions of Opera did have the IE box model in the
> quirks mode. My point was that even though they don't have the exact IE
> box model quirks, they do doctype sniffing nonetheless.

Point taken.

> > > > WWW doesn't go by extensions or formal document signs, never did and
> > > > never will. The only important part is Content-Type. It defines
> > > > everything.
> > >
> > > Except, of course, when it doesn't.
> >
> > As if?
>
> http://ln.hixie.ch/?start=1144794177&count=1

Uhm... It seems like a message from another planet to me :-) At least
I have seen nothing similar within 100 miles of me in the last 7 years.
But Switzerland is known for many specifics, so maybe it's true for that
area. But I would guess that the author simply misinterpreted the
rumors about hacker attacks using Content-Type tricks. Originally some
.src indeed presumed only one content type, and that was used to serve
malicious content into them. But that was in the 4th era. Later there
was another trick: serving a proper Content-Type followed by wrong,
malicious content (that is more recent; the last case was for images
on Windows XP below SP2). For the latter hack browser producers indeed
had to add some binary content checks to see whether the content
corresponds to the Content-Type. But it has nothing to do with
"Content-Type is useless and disregarded". It is just a small part of
the everlasting battle with hackers.

> > > > > > And if anyone curious: the build in DTD of IE6 is
> > > > > > <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> This is
> > > > > > the
> > > > > > only one it's aware of and the only one it uses.
> > >
> > > IE does not have built-in DTDs at all. The parsing is not DTD-based.
> >
> > Of course it does: otherwise how it would decide that tag to render and
> > how, what attributes to use for rendering and what to disregard?
>
> From hand-crafted C++ code and from CSS.

And C++ code is made based on... right,
<http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> :-)

Henri Sivonen

May 5, 2006, 11:43:06 AM

In article <1146841360....@e56g2000cwe.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> Henri Sivonen wrote:
> > "VK" <school...@yahoo.com> wrote:
> >
> > > Henri Sivonen wrote:
> > > > The DTD is irrelevant (it is not fetched). However, for text/html, the
> > > > doctype is relevant:
> > > > http://hsivonen.iki.fi/doctype/
> > >
> > > Not for IE6... "not exactly" for a better wording. This browser has
> > > four options for two states (backCompat and CSS1Compat):
> >
> > Let me guess. You did not read the document referenced above.
>
> Now I did. It is a nice table there, but AFAICT you build it with the
> same wrong assumption of how really DTD sniffing works, at least in IE.

What's my wrong assumption?

If you believe that the quoted "The DTD is irrelevant (it is not
fetched). However, for text/html, the doctype is relevant" is my wrong
assumption, I assure you that the assumption is right as far as
text/html in IE goes and you need to be more precise about the words
doctype and DTD.

> Without it it will be just a record of test results, not an
> explanation.

Explaining the exact inner workings is not the goal of that document.
The goal of the document is to
a) explain why weird stuff happens (to people who don't
realize doctype sniffing is taking place)
b) motivate people to use the Standards mode or at least
the Almost Standards mode.
I believe I give enough facts to meet these goals. I am deliberately
withholding details that I believe would either obscure the main point
or would make people feel too confident about deliberately using the
Quirks mode or deliberately using a doctype other than the two I've been
recommending for almost six years now.

(I will recommend the HTML5 doctype, when the spec stabilizes.)

> The real explanation and relevant links see in my post in
> this thread.

FWIW, I have read the relevant code from the Gecko and WebKit codebases.
(Have you?) I don't have access to the code of Opera, Mac IE 5 or
Windows IE 6, so I could only speculate based on black box testing.
While it would be interesting for the curious, I fail to see why J.
Random Web author would need to see that speculation.

> Your table also doesn't tell anything about "Unrecognized DTD"
> situation, thus about even perfectly valid DTD but not from W3C lists.

I have left it out deliberately in order to discourage people from using
them.

> It is often forgotten but DTD files creation is not an exclusive right
> of W3C. Everyone is welcome (and it's widely used in XML) to create own
> DTD's for a particular set of documents.

Homegrown DTDs for XML are legitimate for XML (but still arguably a bad
idea on the Web). It is not so clear whether homegrown DTDs are
appropriate for text/html.

> > http://ln.hixie.ch/?start=1144794177&count=1

> But I would guess that the author simply mis-interpreted the
> rumors about hackers attacks using Content-Type tricks.

He wasn't going by rumors. He has actually worked for Netscape and Opera
and also followed the bug database of Safari.

VK

May 5, 2006, 12:45:20 PM

Henri Sivonen wrote:
> What's my wrong assumption?

That the browser indeed treats the quoted part of the DTD as a unit (a
la opaque strings in namespace declarations). Sorry if I'm wrong and it
was not your assumption.

> FWIW, I have read the relevant code from the Gecko and WebKit codebases.
> (Have you?) I don't have access to the code of Opera, Mac IE 5 or
> Windows IE 6, so I could only speculate based on black box testing.
> While it would be interesting for the curious, I fail to see why J.
> Random Web author would need to see that speculation.

I see... Ignorance is the bless ;-)

> > Your table also doesn't tell anything about "Unrecognized DTD"
> > situation, thus about even perfectly valid DTD but not from W3C lists.
>
> I have left it out deliberately in order to discourage people from using
> them.

I see... Ignorance is the bless ;-)

> > It is often forgotten but DTD files creation is not an exclusive right
> > of W3C. Everyone is welcome (and it's widely used in XML) to create own
> > DTD's for a particular set of documents.
>
> Homegrown DTDs for XML are legitimate for XML (but still arguably a bad
> idea on the Web). It is not so clear whether homegrown DTDs are
> appropriate for text/html.

Proprietary DTD's are fully OK for XML, thus for XML+XSL transformers.
It can be a transformer producing the resulting document of type
text/foobar and respective DTD defining element <foobar> with allowed
attributes CDATA foo and logical bar. That is pretty close to how
Windows Vista file management will work - thus some part of Web
resources. But you may expect at the very least another year of
peaceful life :-)

> > > http://ln.hixie.ch/?start=1144794177&count=1
>
> > But I would guess that the author simply mis-interpreted the
> > rumors about hackers attacks using Content-Type tricks.
>
> He wasn't going by rumors. He has actually worked for Netscape and Opera
> and also followed the bug database of Safari.

Then his statement gets really strange - especially when anyone can
prove it wrong. The very same intentionally broken XHTML document (no
closing tag in list elements) demonstrates completely different behavior
if served with Content-Type text/html:
<http://www.nskom.com/external/tmp/html/xhtml.html>
and with Content-Type application/xhtml+xml
<http://www.nskom.com/external/tmp/xhtml/xhtml.html>
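
A test setup of this kind can be approximated with a few lines of Node.js. This is a minimal sketch, not the configuration of the server above: the path prefixes /html/ and /xhtml/, the sample markup and the names contentTypeFor and handler are all assumptions made for the illustration. The point is only that the body is byte-for-byte the same and the Content-Type header alone differs.

```javascript
// The same deliberately broken markup (unclosed <li>) is sent for both URLs.
const BROKEN_XHTML = [
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"',
  '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
  '<html xmlns="http://www.w3.org/1999/xhtml">',
  '<head><title>Test</title></head>',
  '<body><ul><li>one<li>two</ul></body>',
  '</html>'
].join('\n');

// Hypothetical rule: anything under /xhtml/ is labelled as XHTML,
// everything else as plain HTML.
function contentTypeFor(path) {
  return path.indexOf('/xhtml/') === 0
    ? 'application/xhtml+xml'
    : 'text/html';
}

// Request handler: identical body, different Content-Type.
function handler(req, res) {
  res.writeHead(200, { 'Content-Type': contentTypeFor(req.url) });
  res.end(BROKEN_XHTML);
}

// To try it locally:
//   require('http').createServer(handler).listen(8080);
```

A browser with an XML parser would be expected to report a well-formedness error for the /xhtml/ URL and to render the /html/ one as tag soup.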

I'm very far from thinking that the author is not current on the subject.
Then it may be explained by his bias. Maybe it's all about your
discussions here about XHTML served with the text/html Content-Type. In
such a case the mentioned article would be an attempt to "bring peace
to starving souls" :-) Like, if Content-Type is meaningless and
disregarded anyway, then it's not important what Content-Type to use.
The next logical conclusion out of it would be to try to serve
documents without a Content-Type at all - let the UA bother with it by
formal signs. Has anyone tried it already?

Michael Winter

May 5, 2006, 2:14:00 PM

On 05/05/2006 17:45, VK wrote:

> Henri Sivonen wrote:
>
>> What's my wrong assumption?
>

> That browser indeed treats quoted part of DTD as a unit [...]

You still have your terms confused. The DTD is the document type
definition; the syntax, if you will. The 'doctype', or document type
declaration, is the <!DOCTYPE ...> notation near the start of a document.

[snip]

>>>> http://ln.hixie.ch/?start=1144794177&count=1

That entry makes my skin crawl. I can understand Ian's reasoning, but
still... *ick*

>>> But I would guess that the author simply mis-interpreted the
>>> rumors about hackers attacks using Content-Type tricks.
>>
>> He wasn't going by rumors. He has actually worked for Netscape and Opera
>> and also followed the bug database of Safari.
>
> Then his statement gets really strange - especially when anyone can
> prove it wrong.

This'll be interesting...

> The very same intentionally broken XHTML document (no
> closing tag in list elements) demostrates completely different behavior
> if served with Content-Type text/html:
> <http://www.nskom.com/external/tmp/html/xhtml.html>
> and with Content-Type application/xhtml+xml
> <http://www.nskom.com/external/tmp/xhtml/xhtml.html>

And in what context are you making that statement?

Does Firefox exhibit very different behaviour? Sure, but Ian stated that
browsers "*largely* ignore the HTTP Content-Type header" (emphasis mine).

Does IE exhibit very different behaviour? No. It ignores the
Content-Type HTTP header and parses the document as tag-soup HTML (as
indicated by the meta element, even though it should have been superseded).

[snip]

Mike

c.l.javascript has been removed from follow-ups.

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

VK

May 5, 2006, 2:47:08 PM

Michael Winter wrote:
> On 05/05/2006 17:45, VK wrote:
>
> > Henri Sivonen wrote:
> >
> >> What's my wrong assumption?
> >
> > That browser indeed treats quoted part of DTD as a unit [...]
>
> You still have your terms confused. The DTD is the document type
> definition; the syntax, if you will. The 'doctype', or document type
> declaration, is the <!DOCTYPE ...> notation near the start of a document.

<http://www.w3.org/TR/html401/struct/global.html#h-7.2>
"The document type declaration names the document type definition (DTD)
in use for the document".
A perfect sample of W3C "catch it if you can" language :-), but overall
I would say that a string like <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML
4.01//EN"> is a document type declaration containing a document type
definition in it. Luckily both produce the same acronym, DTD :-)

> > The very same intentionally broken XHTML document (no
> > closing tag in list elements) demostrates completely different behavior
> > if served with Content-Type text/html:
> > <http://www.nskom.com/external/tmp/html/xhtml.html>
> > and with Content-Type application/xhtml+xml
> > <http://www.nskom.com/external/tmp/xhtml/xhtml.html>
>
> And in what context are you making that statement?
>
> Does Firefox exhibit very different behaviour? Sure, but Ian stated that
> browsers "*largely* ignore the HTTP Content-Type header" (emphasis mine).
>
> Does IE exhibit very different behaviour? No. It ignores the
> Content-Type HTTP header and parses the document as tag-soup HTML (as
> indicated by the meta element, even though it should have been superseded).

You may want to check it again. In older IEs it prompts for Save As.
Under IE 6 with the security patch installed it pops up a download
status bar (it may be difficult to notice on a small file like that, but
it blinks for a second - watch the screen). So the file is downloaded
into a secure store first, examined for content and viewed (if allowed)
from that secure store.
Look at your address bar: it is not
<http://www.nskom.com/external/tmp/xhtml/xhtml.html> anymore, it has
been changed to something like <C:\WINDOWS\Temporary Internet
Files\Content\I5OBI1Q1\xhtml[1].html>

Tell me that it is a regular browsing experience just like with
text/html ;-)

Michael Winter

May 5, 2006, 3:42:57 PM

On 05/05/2006 19:47, VK wrote:

> Michael Winter wrote:

[snip]

>> You still have your terms confused. The DTD is the document type
>> definition; the syntax, if you will. The 'doctype', or document
>> type declaration, is the <!DOCTYPE ...> notation near the start of
>> a document.

[snip]

> A perfect sample of W3C "catch it if you can" language :-) [...]

I won't pretend I know the history of SGML (so I'm prepared for
corrections from those who do know), but it was first published as an
ISO standard in 1986, which is eight years before the W3C was founded. I
assume that SGML has always possessed these terms, and so the W3C is not
responsible for the nomenclature; that falls to Charles Goldfarb, et al.

> but overall I would say that a string like <!DOCTYPE HTML PUBLIC
> "-//W3C//DTD HTML 4.01//EN"> is a document type declaration
> containing document type definition in it.

But, of course, it doesn't. It's a document type declaration containing
a public identifier for HTML 4.01 Strict, and omitting the optional
system identifier.

[snip]

>>> if served with Content-Type text/html:
>>> <http://www.nskom.com/external/tmp/html/xhtml.html>
>>> and with Content-Type application/xhtml+xml
>>> <http://www.nskom.com/external/tmp/xhtml/xhtml.html>

[snip]

>> Does IE exhibit very different behaviour? No.

[snip]

> You may want to check it again.

Not really.

> [...] Under IE 6 with secury patch installed it pops up download
> status bar

No. In /your/ IE 6 it does that. Here it doesn't.

[snip]

> Look at you address bar: it is not

> <http://www.nskom.com/external/tmp/xhtml/xhtml.html> anymore, [...]

Yes, it is.

[snip]

Mike

VK

May 5, 2006, 4:02:10 PM

Michael Winter wrote:
> >> Does IE exhibit very different behaviour? No.
>
> [snip]
>
> > You may want to check it again.
>
> Not really.
>
> > [...] Under IE 6 with secury patch installed it pops up download
> > status bar
>
> No. In /your/ IE 6 is does that. Here it doesn't.
>
> [snip]
>
> > Look at you address bar: it is not
> > <http://www.nskom.com/external/tmp/xhtml/xhtml.html> anymore, [...]
>
> Yes, it is.

<http://www.nskom.com/external/tmp/xhtml/xhtml.html> (served as
application/xhtml+xml)

Windows XP SP1
IE 6.0.2800.1106.xpsp1
Patches: Q324929; Q328970; Q828750

"Save As" dialog box pops up with security warning

Windows 98 SE
IE 6.0.2800.1106IC
Patches: Q905915; Q837009; Q833989; Q891781; q313829

Loads the file into a temporary store and displays it from there as
described earlier. As a result, any relative links get broken.

Also asked a friend of mine (Windows XP, but full patch set is not
available). "Save As" dialog box.

Sorry, but something is wrong with *your* installation. Possibly you
have not installed security patches for IE for too long.

Toby Inkster

May 5, 2006, 9:53:46 PM

VK wrote:
> Toby Inkster wrote:
>
>> Nonsense. Firefox doesn't support, for example, 'font-size-adjust'[1] from
>> the CSS 2 spec, but doing so wouldn't make it less attractive to potential
>> users.
>

> Unlike CSS1, CSS2.1 is just a working draft

I didn't say anything about CSS 2.1 -- I was talking about CSS 2.

> At the same time CSS 2.1 working draft contans a lot of nonsense which
> will never make into real life (and should be really removed right now
> so to not confuse developers' minds).

Although I don't think there are any browsers that support the CSS 2.1
draft in its entirety, there is no part of CSS 2.1 that is not supported
in at least one browser.

Henri Sivonen

May 6, 2006, 4:19:19 AM

In article <1146847520....@g10g2000cwb.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> Henri Sivonen wrote:
> > What's my wrong assumption?
>
> That browser indeed treats quoted part of DTD as a unit (a la opaque
> strings in namespace declarations). Sorry if I'm wrong and it was not
> your assumption.

Gecko and WebKit indeed extract the public id as a string, fold it to
lower case and match the resulting string as an opaque string against a
list of known lowercased quirky public ids and almost standards mode
public ids. Like I said, I have not seen the source of IE, so I'm
refraining from claiming to know how exactly it does what it does.
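
The approach described for Gecko and WebKit can be sketched in JavaScript roughly as follows. This is an illustration only, not code from either engine: the function names are made up, the two lists contain just a couple of sample entries rather than the browsers' real lists, and treating a doctype without a public id as standards mode is an assumption of the sketch.

```javascript
// Sample entries only (already lowercased); the real lists are much longer.
const QUIRKY_PUBLIC_IDS = [
  '-//w3c//dtd html 3.2 final//en',
  '-//ietf//dtd html 2.0//en'
];
const ALMOST_STANDARDS_PUBLIC_IDS = [
  '-//w3c//dtd xhtml 1.0 transitional//en',
  '-//w3c//dtd xhtml 1.0 frameset//en'
];

// Pull the quoted public id out of a doctype declaration, or return null.
function extractPublicId(doctype) {
  const m = /^<!DOCTYPE\s+\S+\s+PUBLIC\s+(?:"([^"]*)"|'([^']*)')/i.exec(doctype);
  if (!m) {
    return null;
  }
  return m[1] !== undefined ? m[1] : m[2];
}

// Fold the public id to lower case and compare it as an opaque string.
function sniffMode(doctype) {
  if (!doctype) {
    return 'quirks'; // no doctype at all
  }
  const publicId = extractPublicId(doctype);
  if (publicId === null) {
    return 'standards'; // assumption: doctype present but no public id
  }
  const folded = publicId.toLowerCase();
  if (QUIRKY_PUBLIC_IDS.indexOf(folded) !== -1) {
    return 'quirks';
  }
  if (ALMOST_STANDARDS_PUBLIC_IDS.indexOf(folded) !== -1) {
    return 'almost standards';
  }
  return 'standards';
}
```

Because the comparison is on the whole folded string, a public id that merely resembles a known one (say, with a different version number) falls through to the default instead of being parsed for substrings.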

> I see... Ignorance is the bless ;-)

ITYM bliss. ;-)

> > Homegrown DTDs for XML are legitimate for XML (but still arguably a bad
> > idea on the Web). It is not so clear whether homegrown DTDs are
> > appropriate for text/html.
>
> Proprietary DTD's are fully OK for XML, thus for XML+XSL transformers.

DTDs on the Web are a bad idea, because processing them is optional and
DTDs cause infoset augmentation, so the infoset reported to the
application may be different depending on whether the DTD was processed
or not.

> That is pretty close to how Windows Vista file management will work

Eh?

> > > > http://ln.hixie.ch/?start=1144794177&count=1
> >
> > > But I would guess that the author simply mis-interpreted the
> > > rumors about hackers attacks using Content-Type tricks.
> >
> > He wasn't going by rumors. He has actually worked for Netscape and Opera
> > and also followed the bug database of Safari.
>
> Then his statement gets really strange - especially when anyone can
> prove it wrong.

He said "largely ignore". Your example is one of the cases not covered
by "largely".

Henri Sivonen

May 6, 2006, 4:27:07 AM

In article <1146854828.6...@y43g2000cwc.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> I would say that a string like <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML
> 4.01//EN"> is a document type declaration

OK.

> containing document type definition in it.

Referencing it, rather.

See Goldfarb's annotation to clause 11.1 of ISO 8879.

> Lucky both produces the same acronym DTD :-)

But DTD only stands for document type *definition*.

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/

Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

Michael Winter

May 6, 2006, 5:02:35 AM

On 05/05/2006 21:02, VK wrote:

> Michael Winter wrote:

[snip]

[VK:]

>>> Look at you address bar: it is not
>>> <http://www.nskom.com/external/tmp/xhtml/xhtml.html> anymore, [...]
>>
>> Yes, it is.

[snip]

> Sorry, but something is wrong with *your* installation.

Highly unlikely.

> Possibly you did not install security patches for IE for too long.

For the record, I have IE 6 with Service Pack 2 installed, as well as
all critical updates (and most of the optional ones, too).

Version: 6.0.2900.2180.xpsp_sp2_gdr.050301-1519

VK

May 6, 2006, 8:10:27 AM

Michael Winter wrote:
> On 05/05/2006 21:02, VK wrote:
> >>> Look at you address bar: it is not
> >>> <http://www.nskom.com/external/tmp/xhtml/xhtml.html> anymore, [...]
> >>
> >> Yes, it is.

> For the record, I have IE 6 with Service Pack 2 installed, as well as
> all critical updates (and most of the optional ones, too).
>
> Version: 6.0.2900.2180.xpsp_sp2_gdr.050301-1519

OK... as a summary... :-)

I checked the situation on different computers, and IE does *not*
recognize "application/xhtml+xml" (which is not a secret), though with
all updates installed it allows viewing pages served with such a
Content-Type in the roundabout way described earlier, instead of
simply prompting you with Save As.

On your machine(s) you have application/xhtml+xml added as an extra
content type for the ".html" extension through Folder Options > File
Types. Either you did it long ago yourself, or it was added by some
XML/XHTML related software installation. Simply remove this association
temporarily to observe the default behavior.

As "application/xhtml+xml" is not one of the MIME types for .html in any
default Windows/IE installation, you cannot count on it on the WWW. So
the application/xhtml+xml Content-Type is not currently usable except on
an intranet, as it renders your content unavailable to the majority of
visitors.

The XML prolog in an XHTML document indeed forces IE to stay in quirks
(BackCompat) mode. This problem is acknowledged by the IE team and will
be fixed in IE7, see
<http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx>
The XHTML document will still be parsed as HTML Transitional tag soup
with extra trash inside tags, but at least it will be possible to
trigger the W3C box model for it.

From the same blog as an epilog:
<q>I've also been reading comments for some time in the IEBlog asking
for support for the "application/xml+xhtml" MIME type in IE. I
should say that IE7 will not add support for this MIME type - we
will, of course, continue to read XHTML when served as "text/html",
presuming it follows the HTML compatibility recommendations.</q>

P.S. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> is the only
doctype where you can match content, DTD and content type together. Use
this - or use XML+XSL

David Håsäther

May 6, 2006, 8:13:00 AM

Henri Sivonen <hsiv...@iki.fi> wrote:

> In article
> <1146854828.6...@y43g2000cwc.googlegroups.com>,
> "VK" <school...@yahoo.com> wrote:
>
>> I would say that a string like <!DOCTYPE HTML PUBLIC "-//W3C//DTD

>> HTML 4.01//EN"> is a document type declaration containing document

>> type definition in it.
>
> Referencing it, rather.

Yes, it's referencing it, but I wouldn't call the wording "the
document type declaration contains the document type definition"
wrong. In fact, I even remember Arjun Ray saying just that :-)

>> Lucky both produces the same acronym DTD :-)
>
> But DTD only stands for document type *definition*.

Confusingly though (and you probably know this Henri), the public
text class "DTD" as in "-//W3C//DTD HTML 4.01//EN" stands for
document type declaration *subset*.

--
David Håsäther

Richard Cornford

May 6, 2006, 8:48:31 AM

VK wrote:
> Richard Cornford wrote:

>> VK wrote:
>> <snip>
>>> Change <!DOCTYPE FOOBAR "Micro$oft must die!" HTML 3>
>>> back to <!DOCTYPE FOOBAR "Micro$oft must die!"> - and
>>> the parser will hit a match right away.
>>
>> And change it to:-
>>
>> <!DOCTYPE FOOBAR "Micro$oft must die! HTML 30">
>>
>> - and we are back to quirks mode.
>
> and change it to <!DOCTYPE FOOBAR "Micro$oft must die! XHTML 30">
> and we are back to CSS1Compat mode.
>
> and now remove "DOCTYPE" word: <!FOOBAR "Micro$oft must die!
> XHTML 30"> and we are back to BackCompat mode.

Yes. When you attempted to dismiss formulations of DOCTYPE that were
"Unknown DTD" but still resulted in quirks mode because they were "Not a
DTD declaration at all" due to their format you were demonstrably wrong.
And similarly a DOCTYPE that you consider did not qualify as "Not a DTD
declaration at all" (had what you consider an acceptable format) but was
still an "Unknown DTD" could also result in IE operating in quirks mode.

They prove that your original assertion, the one that you suggested was
so self-evidently true that being asked to prove it was akin to being
asked to "prove me that the sky is blue", is in fact utterly false. That
it was, as expected, another fiction that you made up off the top of
your head.

> Someone (not me) just refuses to read the relevant
> producer's documentation and prefers to make up her
> own picture.

You are the individual here making false claims about IE's behaviour and
assessing that they are as true as "the sky is blue". I couldn't care
less about the exact algorithm IE uses. Any interest in that detail, and
time expended trying to deduce it from the behaviour, is utterly wasted.
It doesn't matter because:-

1. No other browser is likely to use precisely the same algorithm
to make similar determinations.
2. There is nothing that can be done with the information that
cannot be done without it (that is, an author may take total
control of which of the two modes apply to a particular HTML
document with no more information than that the DOCTYPEs
proposed in the HTML specifications will result in standards
mode and that no DOCTYPE at all will result in quirks mode. No
more than that can be achieved by knowing the precise location
and shape of the demarcation that IE draws between the two).

> And the needed reading is ...

... not needed. The falsity of your "the sky is blue" assertion has
already been demonstrated.

> But I can make the challenge even less challenging :-)
> by giving you a plain words summary:
>
> 1) IE treats any string starting with <!DOCTYPE before
> <html> tag as a DTD

In the parsing of HTML documents what would qualify as "any string
starting .. "? Or even as a "string"?

> 2) If it meets such string then it looks for substring
> HTML<space><number>

<snip>

So you are describing behaviour that is utterly different from your
original "the sky is blue" assertion? That was a fiction, and its being
questioned was completely reasonable.

This (where are we now, the third or fourth?) formulation is as likely to be
guess-work as any of the preceding ones (even if you could express it in
suitable terminology). Without seeing Microsoft's source code the
reality of the precise behaviour would be difficult to deduce, and
particularly by someone with as little talent for analysis and logic as
you demonstrate. But above all, the pursuit of that information is an
irrelevance; a waste of time and effort that, even if successful, could
not result in anything of any greater use than knowing how to control
the outcome, as we already do.

Richard.

Henri Sivonen

unread,

May 6, 2006, 9:04:42 AM5/6/06

to

In article <Xns97BB909D72D...@195.67.237.51>,
"David Håsäther" <hasa...@msn.com> wrote:

> Henri Sivonen <hsiv...@iki.fi> wrote:
>
> > In article
> > <1146854828.6...@y43g2000cwc.googlegroups.com>,
> > "VK" <school...@yahoo.com> wrote:
> >
> >> I would say that a string like <!DOCTYPE HTML PUBLIC "-//W3C//DTD
> >> HTML 4.01//EN"> is a document type declaration containing document
> >> type definition in it.
> >
> > Referencing it, rather.
>
> Yes, it's referencing it, but I wouldn't call the wording "the
> document type declaration contains the document type definition"
> wrong. In fact, I even remember Arjun Ray saying just that :-)

Well, yes, as an abstract concept, the document type declaration
"incorporates" the document type definition, but the *string* above does
not really literally contain the document type definition but its public
id.
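
To make the distinction concrete, here is a minimal Python sketch (not from the thread) that reads a document type declaration and reports what the string actually carries: a name, a public identifier and a system identifier, but no declarations. The sample document and the helper name doctype_identifiers are assumptions for illustration; the XHTML 1.0 Strict identifiers are used because XML syntax, unlike the HTML 4.01 example above, requires a system identifier after PUBLIC.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml"/>'
)


def doctype_identifiers(markup):
    """Return (name, public id, system id) from the document type declaration."""
    doctype = minidom.parseString(markup).doctype
    return doctype.name, doctype.publicId, doctype.systemId


if __name__ == "__main__":
    # Only identifiers come back; no element or entity declarations are read.
    print(doctype_identifiers(SAMPLE))
```

Python's minidom does not retrieve the referenced file here, so the identifiers are all it ever sees of the DTD.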

> >> Lucky both produces the same acronym DTD :-)
> >
> > But DTD only stands for document type *definition*.
>
> Confusingly though (and you probably know this Henri), the public
> text class "DTD" as in "-//W3C//DTD HTML 4.01//EN" stands for
> document type declaration *subset*.

Isn't SGML terminology cool? :-)

To quote the Not-FAQ by Joe English:
'(SGML has a tradition of using the longest possible phrases to describe
the most frequently talked-about concepts; see also
"declared-content-or-content-model".)'

Of course, none of this matters on the real Web, because on the real
Web, HTML being an application of SGML is just fiction.

Michael Winter

unread,

May 6, 2006, 9:47:02 AM5/6/06

to

On 06/05/2006 13:10, VK wrote:

[snip]

> I checked the situation on different computers, and IE does *not*
> recognize "application/xhtml+xml" [...]

No-one has said it does.

> On your machine(s) you have application/xhtml+xml added as extra
> content type for ".html" extension [...]

No, I don't.

[snipped everything else predicated on that assumption]

IE is sniffing the extension (not that there is such a thing on the
Web). Notice that although

<http://www.nskom.com/external/tmp/xhtml/xhtml.html>

is served with the application/xhtml+xml Content-Type header, it uses a
.html 'extension'. It is this 'extension' that causes IE to process the
document as tag soup[1]. If it was changed to .xml, IE would display the
document in its XML tree view. If it was changed to .xhtml, IE would
display the download dialogue box.
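
The mismatch described above can be sketched in a few lines of Python. This is only an illustration of "the header says one thing, the URL suffix suggests another"; the table EXTENSION_TYPES and the function declared_vs_extension are hypothetical, and nothing here reproduces IE's actual (closed-source) decision logic.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical suffix table for illustration only; it is not IE's real lookup.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xml": "text/xml",
    ".xhtml": "application/xhtml+xml",
}


def declared_vs_extension(url, content_type_header):
    """Compare the served media type with what the URL's 'extension' suggests.

    Returns (declared type, type suggested by the suffix, whether they conflict).
    """
    declared = content_type_header.split(";", 1)[0].strip().lower()
    suffix = posixpath.splitext(urlparse(url).path)[1].lower()
    suggested = EXTENSION_TYPES.get(suffix)
    conflict = suggested is not None and suggested != declared
    return declared, suggested, conflict


if __name__ == "__main__":
    print(declared_vs_extension(
        "http://www.nskom.com/external/tmp/xhtml/xhtml.html",
        "application/xhtml+xml",
    ))
```

A user agent that trusts the suffix over the header would take the second value; one that follows the header would take the first.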

> XML Prolog in XHTML document indeed forces IE to stay in Quirk
> (BackCompat) mode. This problem acknowledged by IE team and will be
> fixed in IE7 [...]

Interesting, but not that useful. There will still be /plenty/ of
pre-IE7 users left. Would I be correct in assuming that IE7 will still
be available only to users running XP and later?

[snip]

Mike

[1] I previously suggested that it was the meta element, but
that's not the case.

VK

unread,

May 6, 2006, 11:08:48 AM5/6/06

to

Michael Winter wrote:
> > On your machine(s) you have application/xhtml+xml added as extra
> > content type for ".html" extension [...]
>
> No, I don't.
>
> [snipped everything else predicated on that assumption]
>
> IE is sniffing the extension (not that there is such a thing on the
> Web). Notice that although
>
> <http://www.nskom.com/external/tmp/xhtml/xhtml.html>
>
> is served with the application/xhtml+xml Content-Type header, it uses a
> .html 'extension'. It is this 'extension' that causes IE to process the
> document as tag soup[1]. If it was changed to .xml, IE would display the
> document in its XML tree view. If it was changed to .xhtml, IE would
> display the download dialogue box.

I have by now 4 Windows XP machines popping up the Save As dialog on this
very page. And one Windows 98 SE machine which downloads the page first
to the local store and shows it from there. You have one machine that
simply shows the page as regular HTML. Of course the numbers can
change and your behavior can be the most common by percentage. But it's
definitely not something to claim as a "normal expected behavior" just
because it happens on one machine. Maybe we should call for volunteers
to visit this page on IE and tell us what happens for them.

> > XML Prolog in XHTML document indeed forces IE to stay in Quirk
> > (BackCompat) mode. This problem acknowledged by IE team and will be
> > fixed in IE7 [...]
>
> Interesting, but not that useful. There will still be /plenty/ of
> pre-IE7 users left. Would I be correct in assuming that IE7 will still
> be available only to users running XP and later?

Yup. IE7 is a Windows Vista pusher, screw on XP :-) Also this is the
last month of any official support for Windows 98/ 98 SE and Windows
2000 (ends up in June).

What I like in Microsoft - if show is over, it's over :-)

Michael Winter

unread,

May 6, 2006, 12:34:08 PM5/6/06

to

On 06/05/2006 16:08, VK wrote:

> Michael Winter wrote:

[snip]

>> IE is sniffing the extension (not that there is such a thing on the
>> Web). Notice that although
>>
>> <http://www.nskom.com/external/tmp/xhtml/xhtml.html>
>>
>> is served with the application/xhtml+xml Content-Type header, it
>> uses a .html 'extension'. It is this 'extension' that causes IE to
>> process the document as tag soup.

[snip]

> Maybe we should call for volunteers to visit this page on IE and tell
> us what happens for them.

Nothing is stopping readers from posting their own comments. However,
what purpose would that serve? IE's content handling mechanism is
fundamentally broken, and an author would be foolish to serve XHTML to
it. Why should anyone care what actually happens or why, if it's stupid
to do it in the first place?

Ian Hickson's blog entry made the observation that user agents will
ignore the Content-Type header in some circumstances. I have
demonstrated that he is correct - not that such a demonstration was
warranted - without fiddling with settings in order to support that
statement.

[snip]

> What I like in Microsoft - if show is over, it's over :-)

Just because Microsoft chooses to withdraw support for a particular
product doesn't mean its users suddenly upgrade. The legacy of
Microsoft's broken software will be with us all for a long time.

Mike

Thomas 'PointedEars' Lahn

unread,

May 13, 2006, 7:12:54 PM5/13/06

to

Henri Sivonen wrote:

> In article <1146646631....@j73g2000cwa.googlegroups.com>,
> "VK" <school...@yahoo.com> wrote:
>> Randy Webb wrote:
>> > So you are saying it totally disregards the DTD and any hints from the
>> > server how to handle the document?
>>
>> Except server reported Content-Type (text/plain, text/html, text/xml,
>> application/xhtml+xml etc.)
>> DTD string itself is irrelevant (and this string by itself is not a
>> "hint from the server" but a "hint from the document").
>
> The DTD is irrelevant

Not at all.

> (it is not fetched).

It is not fetched by tagsoup parsers in many known Web browsers because it
is built-in there. It is definitely fetched by XML parsers in known Web
browsers.

>> > > And if anyone curious: the build in DTD of IE6 is
>> > > <http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd> This is
>> > > the only one it's aware of and the only one it uses.
>
> IE does not have built-in DTDs at all. The parsing is not DTD-based.

I would like to see proof of this. However, since we are talking about
closed source, I don't think there will be any. Still, even IE's parser
MUST implement a set parsing rules, as they are described by a DTD.

F'up2 ciwam

PointedEars
--
There are two possibilities: Either we are alone in the
universe or we are not. Both are equally terrifying.
-- Arthur C. Clarke

Eric B. Bednarz

unread,

May 13, 2006, 7:24:36 PM5/13/06

to

Thomas 'PointedEars' Lahn <Point...@web.de> writes:

> Henri Sivonen wrote:

>> The DTD is irrelevant
>
> Not at all.

Well, that's settled then.
(Henri, back to square one, please)

> [...] Still, even IE's parser
> MUST implement a set parsing rules,

I suppose I do agree, but someone who does speak English might care to
complete this sentence.

> as they are described by a DTD.

*That* would be a *real* challenge.

> F'up2 ciwam

Didn't work.

--
||| hexadecimal EBB
o-o decimal 3771
--oOo--( )--oOo-- octal 7273
205 goodbye binary 111010111011

Thomas 'PointedEars' Lahn

unread,

May 14, 2006, 6:12:12 AM5/14/06

to

Eric B. Bednarz wrote:

> Thomas 'PointedEars' Lahn <Point...@web.de> writes:
>> Henri Sivonen wrote:
>> [...] Still, even IE's parser MUST implement a set parsing rules,
>
> I suppose I do agree, but someone who does speak English might care to
> complete this sentence.

a set _of_ parsing rules

>> as they are described by a DTD.
>
> *That* would be a *real* challenge.

Because?

>> F'up2 ciwam
>
> Didn't work.

_ciwah_ was meant of course.

PointedEars

Thomas 'PointedEars' Lahn

unread,

May 14, 2006, 6:59:15 AM5/14/06

to

Steve Pugh wrote:

> "VK" <school...@yahoo.com> wrote:
>> Unlike CSS1, CSS2.1 is just a working draft: "Publication as a Working
>> Draft does not imply endorsement by the W3C Membership. This is a draft
>> document and may be updated, replaced or obsoleted by other documents
>> at any time. It is inappropriate to cite this document as other than
>> work in progress."
>
> That's true as far as it goes, but CSS 2.1 is actually at the
> Candidate Recommendation stage (a stage that didn't exist when CSS 1
> was drafted)

Not anymore. It was a CR until June 2005; it is a WD again (second
revision already).

> and regardless of its official status it's the closest thing we have to
> a standard for CSS today.

Yes and no. CSS 2.1, being a WD, still MUST NOT be used as reference
material (that applied even to the previous CR[1]!); however, the CSS2
errata also say that CSS 2.1 should be considered the CSS2 errata where
CSS2 and CSS 2.1 differ.[2]

>>Besides some more or less stable parts, CSS 2.1 draft is also used as a
>>dumpster for some of W3C' members nightly thoughts and revelations :-)
>
> I think you may be getting confused with CSS 3.

Since parts of CSS3 (like `opacity', as you mentioned) are already
implemented in some UAs (with Gecko-based UAs being the foremost ones
in that regard), probably not. VK is just hallucinating again about
the "evil W3C" of his fantasy world.

PointedEars
___________
[1] <URL:http://www.w3.org/TR/2004/CR-CSS21-20040225/>
[2] <URL:http://www.w3.org/Style/css2-updates/REC-CSS2-19980512-errata.html>

VK

unread,

May 14, 2006, 7:52:37 AM5/14/06

to

VK wrote:
> Besides some more or less stable parts, CSS 2.1 draft is also used as a
> dumpster for some of W3C' members nightly thoughts and revelations :-)

Steve Pugh wrote:
> _more_ examples.

VK wrote:
> A theoretical criticism of W3C working drafts is out of my interest

But occasionally I've found another one: see "Problems loading fonts"
thread in this group.
W3C is again like in a time warp - as if the reasons of .pfr and .eot
formats never existed and it's still 1995 outside.

Steve Pugh

unread,

May 14, 2006, 9:34:57 AM5/14/06

to

"VK" <school...@yahoo.com> wrote:
>VK wrote:
>> Besides some more or less stable parts, CSS 2.1 draft is also used as a
>> dumpster for some of W3C' members nightly thoughts and revelations :-)
>
>Steve Pugh wrote:
>> _more_ examples.
>
>VK wrote:
>> A theoretical criticism of W3C working drafts is out of my interest
>
>But occasionally I've found another one: see "Problems loading fonts"
>thread in this group.

Seen it.

>W3C is again like in a time warp - as if the reasons of .pfr and .eot
>formats never existed and it's still 1995 outside.

Not sure what you're trying to say.

CSS2 contained @font-face as a generic mechanism for embedding fonts.
It listed "An initial list of format strings defined by this
specification and representing formats likely to be used by
implementations on various platforms" which included both .pfr and
.eot. So whatever the reasons for those formats they were taken into
consideration by the W3C.

Now, CSS 2.1, which was actually what we were discussing here, dropped
@font-face because of lack of support amongst browsers.
The merits of that decision can be debated, but am I to assume
that when you spoke of "dumpster for some of W3C' members nightly
thoughts and revelations" you were talking about the things that were
taken out of 2.1 as well as the things that were put in?

Steve
--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st...@pugh.net> <http://steve.pugh.net/>

Henri Sivonen

unread,

May 14, 2006, 4:32:08 AM5/14/06

to

In article <1722908.x...@PointedEars.de>,
Thomas 'PointedEars' Lahn <Point...@web.de> wrote:

> Henri Sivonen wrote:

> > The DTD is irrelevant
>
> Not at all.
>
> > (it is not fetched).
>
> It is not fetched by tagsoup parsers in many known Web browsers because it
> is built-in there. It is definitely fetched by XML parsers in known Web
> browsers.

Do Firefox, Opera and Safari count as known Web browsers?

http://hsivonen.iki.fi/test/entitytest.xml

Toby Inkster

unread,

May 15, 2006, 3:21:35 AM5/15/06

to

Henri Sivonen wrote:

> Thomas 'PointedEars' Lahn wrote:
>
>> It is not fetched by tagsoup parsers in many known Web browsers because it
>> is built-in there. It is definitely fetched by XML parsers in known Web
>> browsers.
>
> Do Firefox, Opera and Safari count as known Web browsers?
> http://hsivonen.iki.fi/test/entitytest.xml

He didn't say "all known browsers", just "known browsers" which implies at
least two known browsers.

http://www.google.com/search?q=HyBrick

That's the only one I know of though.

Eric B. Bednarz

unread,

May 16, 2006, 5:33:06 PM5/16/06

to

Thomas 'PointedEars' Lahn <Point...@web.de> writes:

> Eric B. Bednarz wrote:
>
>> Thomas 'PointedEars' Lahn <Point...@web.de> writes:

>>> [...] Still, even IE's parser MUST implement a set parsing rules,
>>
>> I suppose I do agree, but someone who does speak English might care to
>> complete this sentence.
>
> a set _of_ parsing rules

Ah; I (re)consider it a nice touch that cynicism is lost on you :)

>>> as they are described by a DTD.
>>
>> *That* would be a *real* challenge.
>
> Because?

I wouldn't know how tag salad mangling could possibly be described by a
DTD, or whatever practical outcome would indicate that anything like
that was even silently pre-attached before assumed error handling
(especially in the case of text/html and specifically Internet
Exploder).

But beyond M$ bashing, Geckos evidently treat IDs *syntactically* wrong
in HTML (i.e. case sensitive, as the prose erroneously requires), which
simply could not happen with an SGML parser pre-attached. But we've
already been there, if memory serves.

Let alone *any* markup minimization features that cannot be handled (at
least, Opera stands out and manages to ignore internal subsets since
version 8, I think; quite something).

What would be the purpose of looking for a description of parsing rules
that cannot be handled by the executing application in the first place?

VK

unread,

May 17, 2006, 8:17:33 AM5/17/06

to

Eric B. Bednarz wrote:
> I wouldn't know how tag salad mangling could possibly be described by a
> DTD, or whatever practical outcome would indicate that anything like
> that was even silently pre-attached before assumed error handling
> (especially in the case of text/html and specifically Internet
> Exploder).

I had/have/will have nasty argues with Thomas, but his original
statement that "DTD for XML are always fetched" is totally correct. The
problem is that unlike W3C DTD's these are /real/ DTD's - and they are
in the same relations with the discussed woodoo before <html> tag as
say chemistry and alchemy.

In the test case (I posted it for another style-related question in
ciwas, but it goes as a quick sample) the xsl transformer has doctype
declaration:

<http://www.geocities.com/schools_ring/tmp/demo01/index.xml>

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE template [
<!ENTITY nbsp " "> <!ENTITY copy "©">
<!ENTITY lq "«"> <!ENTITY rq "»">
]>

This is pretty standard for xsl templates (because say non-breaking
space in template would lead to a parsing error). Obviously if you have
to include the same doctype in a number of templates, you want to make
it as a separate file:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE RootElement SYSTEM "common.dtd">

where common.dtd file is:
<!ENTITY nbsp " ">
<!ENTITY copy "©">
<!ENTITY lq "«">
<!ENTITY rq "»">

1) standalone="no" flag in prolog instruct the parser that before
validating it has to retrieve additional definitions from external DTD

2) RootElement is the root element in your XML file; in my sample it's
<repository> so it says <!DOCTYPE repository SYSTEM "common.dtd">

3) SYSTEM modifier indicates that this DTD is made for internal use by
transformers and not for external references (that would be PUBLIC
then).
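
For comparison, here is a small Python sketch (not from the thread) of how one widely used non-validating parser treats the two variants. It assumes Python's expat-based xml.etree.ElementTree, which is not a browser and not an XSLT processor; the constant names and the helper parse_text are hypothetical, the entity value is written as a numeric character reference, and no file named common.dtd exists or is read.

```python
import xml.etree.ElementTree as ET

# Entity declared in the internal subset: the parser can resolve it itself.
INTERNAL = (
    '<!DOCTYPE repository [<!ENTITY copy "&#169;">]>'
    '<repository>&copy; 2006</repository>'
)

# Entity expected from an external subset that this parser never retrieves.
EXTERNAL = (
    '<?xml version="1.0" standalone="no"?>'
    '<!DOCTYPE repository SYSTEM "common.dtd">'
    '<repository>&copy; 2006</repository>'
)


def parse_text(markup):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(repr(parse_text(INTERNAL)))  # entity expanded from the internal subset
    print(repr(parse_text(EXTERNAL)))  # rejected: &copy; is never defined here
```

Under these assumptions the standalone="no" declaration does not make the parser fetch anything; whether the external subset is read is a property of the processor.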

Is not it cool to watch the "woodoo" gets back its original sense and
functionality? ;-)

It also shows clearly the whole "woodooism" of W3C's DTD's because in
their "suggested use" they do something really strange: they have XML
prolog (taking XHTML) but with standalone="yes" (default value)
immediately followed by instruction to load an external DTD. It doesn't
have any relation with proper XML not talking about HTML. It may be a
proper XHTML though... a third path or something...

Overall W3C DTD by their current functionality have nothing to do with
the original non-degraded DTD's. W3C DTD's are pretty close to opaque
strings in namespace declarations. Say when I write
<bindings xmlns="http://www.mozilla.org/xbl">
I don't care if there is something useful at
<http://www.mozilla.org/xbl> (actually there is nothing). It's just an
opaque string to match against used namespace <bindings>. This is the
way for UA to ensure that the used namespace is /this/ namespace it's
aware of, not some other with the same element names.
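
The "opaque string" behaviour of namespace names is easy to observe. The following Python sketch (an illustration, assuming the standard xml.etree.ElementTree module; the helper name qualified_root_tag is hypothetical) parses an element in the XBL namespace without any network access and shows that the namespace name is simply carried along as part of the element's qualified name.

```python
import xml.etree.ElementTree as ET


def qualified_root_tag(markup):
    """Return the root tag in ElementTree's '{namespace}localname' notation."""
    return ET.fromstring(markup).tag


if __name__ == "__main__":
    # Nothing is fetched from www.mozilla.org; the URI is compared as a string.
    print(qualified_root_tag('<bindings xmlns="http://www.mozilla.org/xbl"/>'))
    print(qualified_root_tag('<bindings xmlns="urn:example:other"/>'))
```

Two elements with the same local name but different namespace strings come out as different tags, which is all the matching a consumer needs.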

And the last but not least:
Currently Firefox cannot load external DTD's at all. This is a nasty
bug, but to fix it properly they have to solve somehow the problem with
the bogus DTD from W3C. Other XSLT-standard compliant browsers
including IE do not have this problem. See more at:

"External entities are not included in XML document"
<https://bugzilla.mozilla.org/show_bug.cgi?id=69799>

Eric B. Bednarz

unread,

May 17, 2006, 7:37:58 AM5/17/06

to

"VK" <school...@yahoo.com> writes:

> Eric B. Bednarz wrote:

> I had/have/will have nasty argues with Thomas, but his original
> statement that "DTD for XML are always fetched" is totally correct.

(Misquoting him will keep the ball rolling, I am sure.)

The *above* statement is totally incorrect; if you disagree just quote
the relevant part of the XML spec.

But then I don't have any idea what kind of point you are aiming at; you
were babbling about IE 'using' a particular default DTD in a text/html
context.

> Is not it cool to watch the "woodoo" gets back its original sense and
> functionality? ;-)

It was cool that you cared to explain all this amazing rocket science
stuff to me, thank you so much. As to "woodoo" (John Woodoo, I
presume), document type declarations work for me in their original sense
every other day when writing HTML.

> standalone="yes" (default value)

Once you located the spec to back up your first statement above, please
make a note about the section that defines this default value as well.

VK

unread,

May 17, 2006, 12:52:49 PM5/17/06

to

Eric B. Bednarz wrote:
> But then I don't have any idea what kind of point you are aiming at; you
> were babbling about IE 'using' a particular default DTD in a text/html
> context.

I was not babbling as you say: I was explaining to you the basic things you
should know

> It was cool that you cared to explain all this amazing rocket science
> stuff to me, thank you so much.

You are welcome.

> As to "woodoo" (John Woodoo, I
> presume), document type declarations work for me in their original sense
> every other day when writing HTML.

(woodoo = voodoo)

"in their original sense" I presume then as opaque strings identifying
this or that (X)HTML environment? Sorry for you if it /is/ the original
sense of DOCTYPE and linked DTD's.

> > standalone="yes" (default value)
>
> Once you located the spec to back up your first statement above, please
> make a note about the section that defines this default value as well.

You have to learn XML (real XML, not pseudo-XML XHTML crap) and XSLT
(the latter is not obligatory but would be very nice). I linked in my
previous post a sample and the Mozilla bug at bugzilla.mozilla.org
which contains a lot of useful links and references. You may start with
the latter for the basics of the prolog syntax.

Overall I suggest to read the relevant manuals and make a couple of
your own simple pages - a lot of things are getting much clearer on
practice.

Michael Winter

unread,

May 17, 2006, 1:21:24 PM5/17/06

to

On 17/05/2006 13:17, VK wrote:

[snip]

> I had/have/will have nasty argues with Thomas, but his original
> statement that "DTD for XML are always fetched" is totally correct.

Thomas did not write that, and a good thing, too: as a blanket statement,
it is totally false.

[snip]

> 1) standalone="no" flag in prolog instruct the parser that before
> validating it has to retrieve additional definitions from external DTD

The standalone document declaration doesn't instruct a validating
processor to 'do' anything. It is a requirement of validating processors
themselves to process the DTD and any referenced external entities.

The standalone document declaration does have an impact on
well-formedness (see Entity Declared in section 4.1 [p.33]), and on
non-validating processors when reading parameter entities.

[snip]

> [...] standalone="yes" (default value) [...]

How on Earth you managed to get that idea is beyond me.

[snip]

> And the last but not least:
> Currently Firefox cannot load external DTD's at all.

It can. It just chooses not to for the most part. Opera doesn't process
external entities, either.

> This is a nasty bug,

It's not a bug at all; on the Web, neither Firefox nor Opera implement
validating XML processors.

> but to fix it properly they have to solve somehow the problem with
> the bogus DTD from W3C.

Hopefully you've resolved that misconception.

[snip]

VK

unread,

May 17, 2006, 2:24:55 PM5/17/06

to

Michael Winter wrote:
> > 1) standalone="no" flag in prolog instruct the parser that before
> > validating it has to retrieve additional definitions from external DTD
>
> The standalone document declaration doesn't instruct a validating
> processor to 'do' anything. It is a requirement of validating processors
> themselves to process the DTD and any referenced external entities.
>
> The standalone document declaration does have an impact on
> well-formedness (see Entity Declared in section 4.1 [p.33]), and on
> non-validating processor when reading parameter entities.
>
> [snip]
>
> > [...] standalone="yes" (default value) [...]
>
> How on Earth you managed to get that idea is beyond me.
>
> [snip]

With all my deep respect I can only repeat the advice given to the
previous opponent. Besides a very informative discussion around the
mentioned bug at bugzilla, you also may read
<http://www.w3.org/TR/REC-xml/#sec-rmd> Actually this and additional
sections are mentioned in the bug thread, but you may want to start
right from W3C.

Michael Winter

unread,

May 17, 2006, 3:06:22 PM5/17/06

to

On 17/05/2006 19:24, VK wrote:

> Michael Winter wrote:

[VK:]

>>> [...] standalone="yes" (default value) [...]
>>
>> How on Earth you managed to get that idea is beyond me.
>
> With all my deep respect I can only repeat the advice given to the
> previous opponent.

If there are no external markup declarations, the standalone
document declaration has no meaning. If there are external
markup declarations but there is no standalone document
declaration, the value "no" is assumed.
-- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)
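
A parser can show that an omitted standalone clause is a distinct case rather than an implied "yes". The sketch below is an illustration in Python using the standard xml.parsers.expat module, whose XmlDeclHandler callback receives 1 for "yes", 0 for "no" and -1 when the clause is absent; the function name declared_standalone and the sample documents are assumptions.

```python
from xml.parsers import expat


def declared_standalone(markup):
    """Report the standalone clause of the XML declaration as expat sees it."""
    seen = {}

    def on_xml_decl(version, encoding, standalone):
        # pyexpat passes 1 for "yes", 0 for "no", -1 when the clause is absent.
        seen["standalone"] = {1: "yes", 0: "no", -1: "omitted"}[standalone]

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(markup, True)
    return seen.get("standalone", "no XML declaration")


if __name__ == "__main__":
    print(declared_standalone('<?xml version="1.0"?><r/>'))
    print(declared_standalone('<?xml version="1.0" standalone="yes"?><r/>'))
    print(declared_standalone('<?xml version="1.0" standalone="no"?><r/>'))
```

What a processor then assumes for the omitted case is governed by the rule quoted above, not by the parser's callback value.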

Eric B. Bednarz

unread,

May 17, 2006, 5:32:02 PM5/17/06

to

"VK" <school...@yahoo.com> writes:

> Eric B. Bednarz wrote:

>> [...] document type declarations work for me in their original sense
>> every other day when writing HTML.

> "in their original sense" I presume then as opaque strings identifying
> this or that (X)HTML environment?

No, I mean that Emacs knows where my catalog and nsgmls are, and that my
catalog knows where the DTDs are.

> Sorry for you if it /is/ the original
> sense of DOCTYPE and linked DTD's.

Oh. Now that's what I call a tough break.

> You have to learn XML

Due to this encouragement I'll try some day.

(Emacs also knows where my RELAX NG schemas are; but if I have any
questions about DTDs, you'll be the first one I ask for advice. :)

VK

unread,

May 18, 2006, 4:10:51 AM5/18/06

to

Michael Winter wrote:
> If there are no external markup declarations, the standalone
> document declaration has no meaning. If there are external
> markup declarations but there is no standalone document
> declaration, the value "no" is assumed.
> -- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)

Bingo! ;-)

Applying the quoted rule to say:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Andy Dingley <dingbat@codesmiths.com>

unread,

May 18, 2006, 5:27:28 AM5/18/06

to

VK wrote:

> I had/have/will have nasty argues with Thomas, but his original
> statement that "DTD for XML are always fetched" is totally correct.

No, that statement is entirely incorrect (and Thomas is too smart to
have said that anyway).

XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.
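
As a quick illustration of that point, this Python sketch (assuming the standard xml.dom.minidom module; the sample document and the helper name element_names are hypothetical) builds a DOM from a document that has no document type declaration at all.

```python
from xml.dom import minidom


def element_names(markup):
    """Parse markup into a DOM and return (doctype, element names in document order)."""
    doc = minidom.parseString(markup)
    names = [node.tagName for node in doc.getElementsByTagName("*")]
    return doc.doctype, names


if __name__ == "__main__":
    # No DOCTYPE, no schema: the tree structure comes from the tags alone.
    print(element_names('<order id="7"><item>tea</item><item>milk</item></order>'))
```

The doctype slot of the resulting document is simply empty, and the element tree is complete without it.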

As an emergent result, it's possible to do useful work with XML without
ever even producing a DTD or Schema, and this is generally the way that
commercial XML work is done. If you ever do fetch the DTD the most
common result is to discover that it's actually some years out of date
and is no longer valid against the structure of the live documents.

VK

unread,

May 18, 2006, 6:18:54 AM5/18/06

to

VK wrote:
> > I had/have/will have nasty argues with Thomas, but his original
> > statement that "DTD for XML are always fetched" is totally correct.

Andy Dingley wrote:
> No, that statement is entirely incorrect (and Thomas is too smart to
> have said that anyway).

<http://groups.google.com/group/comp.infosystems.www.authoring.html/tree/browse_frm/thread/4ac44109aac7fa53/2d44e7f84a7a0065?rnum=21&hl=en&_done=%2Fgroup%2Fcomp.infosystems.www.authoring.html%2Fbrowse_frm%2Fthread%2F4ac44109aac7fa53%2F0a4906e0027d1058%3Fhl%3Den%26#doc_6f68f10a6c9518a9>

Thomas 'PointedEars' Lahn wrote:
> It is not fetched by tagsoup parsers in many known Web browsers because it
> is built-in there. It is definitely fetched by XML parsers in known Web
> browsers.

Which is (once again) totally correct in application to /DTD/ in /XML/
(not pseudo-DTD in pseudo-XML aka XHTML)
I just presume that Thomas was not aware of the current bug in Gecko
preventing it to fetch external DTD. But this bug (though should be
mentioned) doesn't render the statement as it is wrong.

> XML apps _may_ of course fetch DTDs, but in the absence of any real
> statistical evidence I'd guess that their actual practice is that far
> fewer XML apps fetch DTDs than similar SGML apps do. The reason is
> simple - XML never needs a DTD to parse the document into the DOM. This
> is the absolutely fundamental design difference between XML and SGML.

Yet again the lack of practical experience is demonstrated in this
statement. Without a DTD with at least the most necessary extra
entities you simply not capable to parse HTML template. The only
entities any XML parser is aware of are amp quot apos lt gt. Anything
atop has to be added to internal or external DTD declaration to let the
document to validate right. And - because of the mentioned Mozilla bug
- so far the only option is to use internal DTD. Take any real life XML
(or XML feed) to find out that DOCTYPE section can take many lines
right after prolog.

> As an emergent result, it's possible to do useful work with XML without
> ever even producing a DTD or Schema

Only in a narrow set of situations not connected with the Web, see
above.

> and this is generally the way that
> commercial XML work is done.

Technically impossible and factually wrong. Please follow my advice and
study some real life commercial XML/XSL solution or an XML news feed.

> If you ever do fetch the DTD the most
> common result is to discover that it's actually some years out of date
> and is no longer valid against the structure of the live documents.

Here you're coming back to the W3C's bogus DTD used as opaque
strings. This is the hard choice to make for you (and for W3C). Either we
agree that there is only one DOCTYPE and only one DTD mechanics equal
to any document where used; or we agree that there is that DOCTYPE
(HTML/XHTML) and this DOCTYPE (XML) with completely different rules and
functionality.

Andy Dingley <dingbat@codesmiths.com>

unread,

May 18, 2006, 8:31:31 AM5/18/06

to

> VK wrote:
> > > I had/have/will have nasty argues with Thomas, but his original
> > > statement that "DTD for XML are always fetched" is totally correct.
>
> Andy Dingley wrote:
> > No, that statement is entirely incorrect (and Thomas is too smart to
> > have said that anyway).
> <http://groups.google.com/group/comp.infosystems.www.authoring.html/tree/browse_frm/thread/4ac44109aac7fa53/2d44e7f84a7a0065?rnum=21&hl=en&_done=%2Fgroup%2Fcomp.infosystems.www.authoring.html%2Fbrowse_frm%2Fthread%2F4ac44109aac7fa53%2F0a4906e0027d1058%3Fhl%3Den%26#doc_6f68f10a6c9518a9>

And Thomas' exact quote is "It is definitely fetched by XML parsers in
known _WEB_ browsers."
(my emphasis)

Now I have no idea if this is accurate - I haven't tested XML web
browsers.

However it is _not_ the same as saying that the fetch happens for "all
XML parsers". Now from my own direct knowledge I know that much
certainly isn't true. For one thing it can't be true because many of
the world's live XML apps don't even _have_ a DTD (or an accurate DTD)
to fetch.

> Which is (once again) totally correct in application to /DTD/ in /XML/
> (not pseudo-DTD in pseudo-XML aka XHTML)

XHTML is not pseudo-XML. XHTML under Appendix C is pseudo-SGML, and
not XML at all. XHTML (as XML) is (or should be) perfectly compliant
XML.

Of course a "workable" browser needs to make best sense of any rubbish
it's given, but that's a separate problem (and they're hardly likely to
resolve it by fetching DTDs)

> > XML apps _may_ of course fetch DTDs, but in the absence of any real
> > statistical evidence I'd guess that their actual practice is that far
> > fewer XML apps fetch DTDs than similar SGML apps do. The reason is
> > simple - XML never needs a DTD to parse the document into the DOM. This
> > is the absolutely fundamental design difference between XML and SGML.
>
> Yet again the lack of practical experience is demonstrated in this
> statement.

I've been delivering commercial XML apps since back in the last
century. I have far more XML experience than you, and I'm no doubt
twice your age and have twice the experience in software engineering
too. I'll listen to "lack of experience" claims from Jukka, Alan or
Nick, but not many others in this ng.

> Without a DTD with at least the most necessary extra
> entities you simply not capable to parse HTML template.

HTML ? or XHTML ? And what's a "template" ? If you're going to nit
pick, then you need to be precise.

> The only entities any XML parser is aware of are amp quot apos lt gt.

Agreed.

However why is this a problem "for parsing into the XML DOM" ? If an
XML parser meets an unrecognised entity, then it's entitled to choke on
it. For that reason entity references (other than those in the TR) are
not commonly used in XML apps, as they are in widespread use in the
SGML world.

As a specific instance, look at the number of RSS feeds around with
HTML (non-XML) entity references occurring in them, and the errors that
causes. If you want to build a "workable" RSS feed parser then you need
to cope with this, because feeds just aren't reliably XML valid if they
encounter an &eacute;
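To make the entity point concrete, here is a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree (expat-based, non-validating) as a stand-in for "an XML parser"; the helper name try_parse and the sample titles are made up for illustration.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_text):
    """Return the root element's text, or a short error string on failure."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return f"ParseError: {err}"


# The five predefined entities (amp, lt, gt, apos, quot) need no declaration.
predefined = try_parse("<title>Fish &amp; chips &lt;today&gt;</title>")

# An HTML entity such as &eacute; is undeclared in plain XML, so the parser
# is entitled to reject the whole item.
undefined = try_parse("<title>Caf&eacute;</title>")

# A numeric character reference never depends on any declaration.
numeric = try_parse("<title>Caf&#233;</title>")

print(predefined)
print(undefined)
print(numeric)
```

A feed producer that sticks to numeric character references (or plain UTF-8) sidesteps the problem; a "workable" feed consumer has to decide what to do when the second case turns up anyway.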

> > As an emergent result, it's possible to do useful work with XML without
> > ever even producing a DTD or Schema
>
> Only in a narrow set of situations not connected with the Web, see
> above.

Hardly "narrow". In fact it's pretty much all the XML in the world
(the web not yet being a widespread XML medium)

> > and this is generally the way that
> > commercial XML work is done.
>
> Technically impossible and factually wrong.

Why? I build XML apps all day - I _very_ rarely see a DTD. If I bother
building something at all, it's far more likely to be XML Schema
anyway. Admittedly I don't use entities.

> Please follow my advice and
> study some real-life commercial XML/XSL solution or an XML news feed.

So where is the DTD for an RSS 2.0 news feed ?! (or most other RSS
versions)

> > If you ever do fetch the DTD the most
> > common result is to discover that it's actually some years out of date
> > and is no longer valid against the structure of the live documents.
>
> Here you're coming back to the W3C's bogus DTD used as opaque
> strings.

No, I'm talking about widespread non-web XML practice. DTD's just don't
get written. Almost no commercial XML developers even understand their
syntax!

> This is the hard choice to make to you (and to W3C). Either we
> agree that there is only one DOCTYPE and only one DTD mechanics equal
> to any document where used; or we agree that there is that DOCTYPE
> (HTML/XHTML) and this DOCTYPE (XML) with completely different rules and
> functionality.

I'd agree with this statement in the context of web browsing. DTDs are
a design and documentation mechanism, no more (in practice, for the
web). HTML's parsing and usage depends on some internal structure
representation within the browser (I can't say any more detail than
this) and there's no reason why that needs to be a DTD, rather than
explicit code. Doctype identifiers on "the web" are thus merely
treated as opaque strings, not URLs to a DTD that needs to be retrieved
(of course any browser may choose to, but most are unlikely to).

Michael Winter

May 18, 2006, 9:29:46 AM

On 18/05/2006 09:10, VK wrote:

> Michael Winter wrote:
>
>> If there are no external markup declarations, the standalone
>> document declaration has no meaning. If there are external
>> markup declarations but there is no standalone document
>> declaration, the value "no" is assumed.
>> -- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)
>
> Bingo! ;-)

Please don't tell me that you still think you were right. One doesn't
need to be an XML expert to realise that the phrase 'the value "no" is
assumed' means quite the opposite of what you wrote.

If that is, somehow, an admission that you were wrong, you might want to
make it a bit more explicit in future.

VK

May 18, 2006, 9:40:30 AM

Andy Dingley <din...@codesmiths.com> wrote:
> And Thomas' exact quote is "It is definitely fetched by XML parsers in
> known _WEB_ browsers."
> (my emphasis)

Thomas reads way too many W3C specs and he got infected by W3C's style
where a crystal clear looking affirmative statement contains one or two
words that allow it to be interpreted in N different ways :-)

I guess it will be useful for him to see the actual negative effect of
such writing. Let's imagine for a second that the Thomas' post is one
of W3C's paragraphs and we have to retrieve the "original intended
meaning" out of it.

First of all let's bring the quote to the source state:
<q>It is definitely fetched by XML parsers in known Web browsers.</q>
As you see there is no emphasis of any kind on the word "Web". By
reading the sentence with normal intonation we see nothing but regular
term "Web browser" written by rules of English grammar thus with the
word "Web" capitalized. Yet this sentence leaves a hole - a very narrow
one though - you just managed to squeeze in. By adding emphasis: "in
known W`eb browsers" one can pretend that there are some Web browsers
and non-Web browsers. That's a nice try (reinforced by the immediately
invented "XML web browsers") but unfortunately it makes no sense in
English. There are only browsers / Web browsers and nothing more. Yet
different Web browsers are capable of rendering different sets of
electronic documents. How could Thomas have avoided this branch of
discussion (and saved me from typing all these words)? By simply saying
it in the technically proper way: "It is definitely fetched by known Web
browsers if served as XML document".
(I wish some of the W3C writers would find this thread :-)

> Now I have no idea if this is accurate - I haven't tested XML web
> browsers.

Naturally you didn't as there are not such. There are Web browsers with
XML parsers; see the rest a bit above.

> However it is _not_ the same as saying that the fetch happens for "all
> XML parsers". Now from my own direct knowledge I know that much
> certainly isn't true. For one thing it can't be true because many of
> the world's live XML apps don't even _have_ a DTD (or an accurate DTD)
> to fetch.

1) If a XML document served as XML document and 2) it contains external
DTD declaration and 3) prolog has flag standalone="no" then any
standard-compliant XML parser is /obligated/ to retrieve all entities
from the linked DTD before starting the validation. All
standard-compliant browsers indeed do this except Firefox due to the
mentioned bug to be fixed. The relevant part of XML specs was also
linked and quoted several times here, so if it contradicts your
expectations, you may argue with W3C, not with me.

> > Which is (once again) totally correct in application to /DTD/ in /XML/
> > (not pseudo-DTD in pseudo-XML aka XHTML)
>
> XHTML is not pseudo-XML. XHTML under Appendix C is pseudo-SGML, and
> not XML at all. XHTML (as XML) is (or should be) perfectly compliant
> XML.

This is why I guess they start with `XML' prolog? No, of course not,
it's just added for fun so XML parsers would have some job to do.

> > > XML apps _may_ of course fetch DTDs, but in the absence of any real
> > > statistical evidence I'd guess that their actual practice is that far
> > > fewer XML apps fetch DTDs than similar SGML apps do. The reason is
> > > simple - XML never needs a DTD to parse the document into the DOM. This
> > > is the absolutely fundamental design difference between XML and SGML.
> >
> > Yet again the lack of practical experience is demonstrated in this
> > statement.
>
> I've been delivering commercial XML apps since back in the last
> century. I have far more XML experience than you, and I'm no doubt
> twice your age and have twice the experience in software engineering
> too. I'll listen to "lack of experience" claims from Jukka, Alan or
> Nick, but not many others in this ng.

OK - taking "experience" back, rephrase it to "XHTML-free thinking
experience". While dealing with all these XML-looking imitations it is
easy to forget that there is somewhere real XML with its own real
rules. You may notice that I got enough either of "no experience /
foggy mumbling" comments. People A, B and C get you to the point, and D
gets all your frustration. Happens in the life, happens in the Usenet
:-)

> Doctype identifiers on "the web" are thus merely
> treated as opaque strings, not URLs to a DTD that needs to be retrieved

And: XHTML documents are not XML document and do not follow
XML-conforming DOCTYPE/DTD rules.

This statement can be used as the conclusive for the discussion unless
someone has more comments.

VK

May 18, 2006, 9:52:53 AM

Michael Winter wrote:
> >> If there are no external markup declarations, the standalone
> >> document declaration has no meaning. If there are external
> >> markup declarations but there is no standalone document
> >> declaration, the value "no" is assumed.
> >> -- 2.9 Standalone Document Declaration, XML 1.0 (2nd Ed.)
> >
> > Bingo! ;-)
>
> Please don't tell me that you still think you were right.

Yes I do - and I'm actually surprised that you don't. I read this quote
in your previous post as an admission of your mistake. If we still
tango, then read the relevant discussion Masayasu Ishikawa - Heikki
Toivonen at <https://bugzilla.mozilla.org/show_bug.cgi?id=69799>

Andy Dingley <dingbat@codesmiths.com>

May 18, 2006, 11:24:58 AM

VK wrote:

> I guess it will be useful for him to see the actual negative effect of
> such writing. Let's imagine for a second that the Thomas' post is one
> of W3C's paragraphs and we have to retrieve the "original intended
> meaning" out of it.

What deconstructivist twaddle is this?

You're in a hole, stop digging.

It might be true that some XML gadgets somewhere sometimes retrieve a
DTD, but it is not a requirement on XML parsing in total. Now stop
saying it is, stop saying you never said that it is, and stop saying
that other people agreed with you.

...or else get yourself a black turtleneck, shave your head and start
posting about transformative hermeneutics instead.

> 1) If a XML document served as XML document and 2) it contains external
> DTD declaration and 3) prolog has flag standalone="no" then any
> standard-compliant XML parser is /obligated/ to retrieve all entities
> from the linked DTD before starting the validation.

This is reasonably correct, however condition 2) is not usually
encountered in XML documents, especially the more trivial ones, and
_unlike_ SGML, XML still has parsing behaviour permitted (nay,
encouraged) in this case.

XML is parseable without a DTD, unless one is required.
SGML requires a DTD and isn't parseable without one.
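A minimal sketch of the first half of that claim, assuming Python's standard-library xml.dom.minidom as the DOM builder. The document, its element names and the helper quantities are hypothetical; the point is only that no DOCTYPE, DTD or schema exists anywhere and the parse into a DOM still succeeds.

```python
from xml.dom import minidom

# Hypothetical DTD-less document: no DOCTYPE, no schema, made-up vocabulary.
ORDER = (
    "<order id='42'>"
    "<item qty='2'>widget</item>"
    "<item qty='1'>gadget</item>"
    "</order>"
)


def quantities(xml_text):
    """Parse into a DOM and map each item's text to its integer quantity."""
    dom = minidom.parseString(xml_text)
    return {
        item.firstChild.data: int(item.getAttribute("qty"))
        for item in dom.getElementsByTagName("item")
    }


dom = minidom.parseString(ORDER)
print(dom.doctype)               # no document type node was ever declared
print(dom.documentElement.tagName)
print(quantities(ORDER))
```

What the DTD-less parse cannot give you is attribute defaults, ID typing or extra entities; those are exactly the cases where a DTD "is required".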

> > > Which is (once again) totally correct in application to /DTD/ in /XML/
> > > (not pseudo-DTD in pseudo-XML aka XHTML)

What is "pseudo-XML" about XHTML ?

Bizarre behaviour about the processing of XHTML in some web browsers is
not a fault of XHTML, it's a characteristic of the browsers. (nor is
Appendix C pseudo-XML, because it's clearly presenting its XML-like
self as pseudo-SGML instead)

Incidentally, your logical inferencing is wrong too. If we take real
pseudo XML like M$oft's CDF or ASX files, then these are pseudo-XML, but
that's certainly not to say that all pseudo-XML is implied to be one of
these particular formats. You appear to be arguing over points of logic
that your mind, or at least your prose, just isn't adequate for.

> > XHTML is not pseudo-XML. XHTML under Appendix C is pseudo-SGML, and
> > not XML at all. XHTML (as XML) is (or should be) perfectly compliant
> > XML.
>
> This is why I guess they start with `XML' prolog?

XHTML starts with an XML prolog (if it does), because it's claiming to
be XML, and well-formed valid XML at that. Appendix C XHTML doesn't
have an XML prolog.

> OK - taking "experience" back, rephrase it to "XHTML-free thinking
> experience".

OK then, I've been arguing free-thinking XHTML in this very newsgroup
since some time in the last century. Not always correctly or wisely, I
grant you, but it was early days and I was young and foolish.

> While dealing with all these XML-looking imitations it is
> easy to forget that there is somewhere real XML with its own real
> rules.

All XML has the same rules - that's the point (and its single huge
benefit over SGML). There aren't exceptions for either the web, or for
your crazy imaginings. This stuff may not always be simple, but it is
(for once) written down and fairly clearly readable.

> > Doctype identifiers on "the web" are thus merely
> > treated as opaque strings, not URLs to a DTD that needs to be retrieved
>
> And: XHTML documents are not XML document and do not follow
> XML-conforming DOCTYPE/DTD rules.

In what way do XHTML (not Appendix C) documents on the web _not_
conform to the rules?

NB - Documents - not _processing_. Processing is an artefact of the
processors and I'm sure some of them have very weird and unconformant
behaviours -- I know, I've written RSS parsers that have enormous
inferences built-in to try to work around badly formed XML.

> This statement can be used as the conclusive for the discussion unless
> someone has more comments.

If you want to have the last word, try not to leave it as "Wibble".

Michael Winter

May 18, 2006, 12:01:32 PM

On 18/05/2006 14:52, VK wrote:

> Michael Winter wrote:

[snip]

>> Please don't tell me that you still think you were right.
>
> Yes I do [...]

Why does that not surprise me. *sigh*

> I read this quote in your previous post as an admission of your
> mistake.

I quoted from the specification because either you haven't read it, or
you don't understand it.

You stated in <1147868253....@g10g2000cwb.googlegroups.com>
that, in the context of XHTML, the standalone document declaration has a
default value of "yes". However, section 2.9 specifies that if there are
external markup declarations (for example, an external subset), but the
standalone declaration is absent, then the default value is "no".
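The default rule can be sketched in a few lines. This assumes Python's standard-library xml.parsers.expat, whose XmlDeclHandler reports the standalone clause as 1 ("yes"), 0 ("no") or -1 (omitted). The function name effective_standalone and the sample documents are made up for illustration, and the sketch simplifies: it treats an external subset named in the DOCTYPE as the only kind of external markup declaration.

```python
from xml.parsers import expat


def effective_standalone(xml_bytes):
    """Return 'yes', 'no' or 'n/a' following the section 2.9 default rule.

    Simplification (an assumption of this sketch): only an external subset
    named in the DOCTYPE counts as an external markup declaration; external
    parameter entities declared in an internal subset are not detected.
    """
    state = {"declared": -1, "external_subset": False}

    def on_xml_decl(version, encoding, standalone):
        # expat passes 1 for "yes", 0 for "no", -1 if the clause is omitted
        state["declared"] = standalone

    def on_doctype(name, system_id, public_id, has_internal_subset):
        state["external_subset"] = system_id is not None

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)

    if state["declared"] == 1:
        return "yes"
    if state["declared"] == 0:
        return "no"
    # Clause omitted: "no" is assumed only if external declarations exist.
    return "no" if state["external_subset"] else "n/a"


XHTML = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    b' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title>'
    b"</head><body><p>x</p></body></html>"
)
PLAIN = b'<?xml version="1.0"?><note>hi</note>'
EXPLICIT = b'<?xml version="1.0" standalone="yes"?><note>hi</note>'

print(effective_standalone(XHTML))
print(effective_standalone(PLAIN))
print(effective_standalone(EXPLICIT))
```

The XHTML prolog under discussion lands in the first branch of the last line of the function: an external subset is named, the clause is absent, so "no" is assumed.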

> If we still tango, then read the relevant discussion Masayasu
> Ishikawa - Heikki Toivonen at
> <https://bugzilla.mozilla.org/show_bug.cgi?id=69799>

Their discussion is irrelevant to what you wrote.

Masayasu Ishikawa points out that the non-validating XML processor in
Mozilla reports undefined entities to be well-formedness errors in
non-standalone documents. However, this behaviour is incorrect:
non-validating XML processors should only treat undefined entities as
well-formedness errors in standalone documents.

When he writes, 'even if external DTD subset is present'[1], he is
referring to the fact that in such circumstances, the default for a
standalone document declaration is "no". A fact that you seem to be
unable to comprehend.

Mike

[1] Masayasu Ishikawa, comment #9
<https://bugzilla.mozilla.org/show_bug.cgi?id=69799#c9>

VK

May 18, 2006, 12:26:19 PM

Michael Winter wrote:
> On 18/05/2006 14:52, VK wrote:
>
> > Michael Winter wrote:
>
> [snip]
>
> >> Please don't tell me that you still think you were right.
> >
> > Yes I do [...]
>
> Why does that not surprise me. *sigh*
>
> > I read this quote in your previous post as an admission of your
> > mistake.
>
> I quoted from the specification because either you haven't read it, or
> you don't understand it.

I considered this quote as the statement "I read the relevant part in
full, here is the proof" :-)

Why did you stop reading in the middle then?

<http://www.w3.org/TR/REC-xml/#sec-rmd>
... three lines below of what you already read:

Validity constraint: Standalone Document Declaration

The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of:

* attributes with default values, if elements to which these
attributes apply appear in the document without specifications of
values for these attributes, or
* entities (other than amp, lt, gt, apos, quot), if references to
those entities appear in the document, or
* attributes with tokenized types, where the attribute appears in
the document with a value such that normalization will produce a
different value from that which would be produced in the absence of the
declaration, or
* element types with element content, if white space occurs
directly within any instance of those types.

VK

May 18, 2006, 1:12:06 PM

VK wrote:
> Why did you stop reading in the middle then?
>
> <http://www.w3.org/TR/REC-xml/#sec-rmd>
> ... three lines below of what you already read:
>
>
> Validity constraint: Standalone Document Declaration
>
> The standalone document declaration MUST have the value "no" if any
> external markup declarations contain declarations of:
>
> * attributes with default values, if elements to which these
> attributes apply appear in the document without specifications of
> values for these attributes, or
> * entities (other than amp, lt, gt, apos, quot), if references to
> those entities appear in the document, or
> * attributes with tokenized types, where the attribute appears in
> the document with a value such that normalization will produce a
> different value from that which would be produced in the absence of the
> declaration, or
> * element types with element content, if white space occurs
> directly within any instance of those types.

Yet after deep thinking I admit that it seems not humanly possible to
get a definitive result out of:

"If there are external markup declarations but there is no standalone
document declaration, the value "no" is assumed."

and

"The standalone document declaration MUST have the value "no" if any
external markup declarations contain declarations of: <snip>".

These sentences are collocated in two consecutive paragraphs, but only
a real W3C language specialist can tell what they mean. Does it mean
that for the situations spelled out in the MUST section one has to
explicitly set standalone="no" ? Or does it mean that for the
situations spelled in the MUST section the value "no" must be assumed
and in other situations it /may/ be assumed? God damn, I've never seen
so low clarity in so clear looking text. I guess it is needed to ask at
<comp.text.xml>, maybe they already managed to decrypt this fragment.
Practically for me it was always obvious to set standalone="no" if DTD
is used, without hope on some default values - but of course it doesn't
prove anything.

VK

May 18, 2006, 2:17:25 PM

Andy Dingley wrote:
> You're in a hole, stop digging.

and at the beginning he wrote:
<q>XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and SGML.</q>

and VK continuously stated that:

1) If a DTD is provided for XML document, it is /obligated/ to fetch it
before proceed with validation. If for some reason parser doesn't want
to retrieve the DTD, it is not allowed to validate/unvalidate the
document (thus raise parsing errors).

2) DTD are widely used in XML/XSL templates to add extra entities which
otherwise would raise parsing errors (this is not the only use of DTD
but the most common one).

3) A document not following the rule 1) is not XML-conformant.
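Point 2) can be sketched as follows, assuming Python's standard-library xml.etree.ElementTree. That parser does not retrieve external DTDs, so the sketch declares the extra entity in an internal DTD subset instead of a linked file; the template text and the names TEMPLATE_BODY, INTERNAL_SUBSET and render are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical template fragment that uses an HTML-style entity.
TEMPLATE_BODY = "<p>caf&eacute; &amp; bar</p>"

# Internal DTD subset declaring the one extra entity the template needs.
INTERNAL_SUBSET = '<!DOCTYPE p [ <!ENTITY eacute "&#233;"> ]>'


def render(xml_text):
    """Return the parsed text of the root element, or the parse error."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return f"ParseError: {err}"


without_dtd = render(TEMPLATE_BODY)                  # entity is undefined
with_dtd = render(INTERNAL_SUBSET + TEMPLATE_BODY)   # entity is declared

print(without_dtd)
print(with_dtd)
```

Whether the declaration lives in an internal subset or in a fetched external DTD, the parser can only expand the entity once it has actually read the declaration.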

You're in a hole, stop digging - especially after you've made such
great step forward by admitting that DTD in HTML/XHTML are formal
opaque strings and their usage there is not /totally/ the same as
DOCTYPE/DTD specs say.

Com'on! Come out! I gocha! :-)

Jack

May 18, 2006, 2:36:30 PM

VK wrote:
>
> 1) If a DTD is provided for XML document, it is /obligated/ to fetch
> it before proceed with validation. If for some reason parser doesn't
> want to retrieve the DTD, it is not allowed to validate/unvalidate
> the document (thus raise parsing errors).
>

VK also wrote:
>
> 3) A document not following the rule 1) is not XML-conformant.

Your rule 1) doesn't refer to a document; the first "it" in the first
sentence implicitly refers to a parser. That is, your rule is saying
that a parser must fetch the DTD for a document if it wishes to validate
that document. You can't use rule 1) to conclude that any given document
is or isn't XML-conformant.

To put it more simply: you're talking rubbish again. And your prose is
significantly less clear than certain W3C recommendations.

--
Jack.

VK

May 18, 2006, 2:49:06 PM

Jack wrote:
> VK wrote:
> >
> > 1) If a DTD is provided for XML document, it is /obligated/ to fetch
> > it before proceed with validation. If for some reason parser doesn't
> > want to retrieve the DTD, it is not allowed to validate/unvalidate
> > the document (thus raise parsing errors).
> >
> VK also wrote:
> >
> > 3) A document not following the rule 1) is not XML-conformant.
>
> Your rule 1) doesn't refer to a document; the first "it" in the first
> sentence implicitly refers to a parser.

"it" meaning is pretty clear from the context (also how in the world a
/document/ could fetch a DTD - even if I wrote explicitly like that,
that would be an obvious typo - yet I did not).

> That is, your rule is saying
> that a parser must fetch the DTD for a document if it wishes to validate
> that document. You can't use rule 1) to conclude that any given document
> is or isn't XML-conformant.

Here indeed a bit of W3C style - it must be contagious :-)

"A document using DOCTYPE and DTD but not expecting from them to be
treated by the rule 1) is not XML-conformant"

> To put it more simply: you're talking rubbish again. And your prose is
> significantly less clear than certain W3C recommendations.

no direct comments - see above.

VK

May 19, 2006, 2:42:43 AM

VK wrote:
> Yet after deep thinking I admit that it seems not humanly possible to
> get a definitive result out of:
>
> "If there are external markup declarations but there is no standalone
> document declaration, the value "no" is assumed."
> and
> "The standalone document declaration MUST have the value "no" if any
> external markup declarations contain declarations of: <snip>".
>
> These sentences are collocated in two consecutive paragraphs, but only
> a real W3C language specialist can tell what they mean. Does it mean
> that for the situations spelled out in the MUST section one has to
> explicitly set standalone="no" ? Or does it mean that for the
> situations spelled in the MUST section the value "no" must be assumed
> and in other situations it /may/ be assumed? God damn, I've never seen
> so low clarity in so clear looking text. I guess it is needed to ask at
> <comp.text.xml>, maybe they already managed to decrypt this fragment.
> Practically for me it was always obvious to set standalone="no" if DTD
> is used, without hope on some default values - but of course it doesn't
> prove anything.

<comp.text.xml> comment:
<q>Since it's a constraint attached to the production for standalone
declarations, I think you should take it as "if there is a standalone
declaration, it must have the value "no" if ...". You're certainly not
required to have one.</q>

That's a creative reading of the damaged text fragment (never crossed
my mind) and seems the only one having sense.

This way my original statement that it is not a valid XML syntax per se
(without explicit standalone="no"):

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

- this statement is not correct. I was wrong and you were right.

Toby Inkster

May 19, 2006, 2:32:09 AM

VK wrote:

> 1) If a DTD is provided for XML document, it is /obligated/ to fetch it
> before proceed with validation.

Correct, but misleadingly phrased.

If a DTD is provided for an XML document, the user agent is *not* obliged
to fetch it *unless* it wants to validate the document. (And many user
agents have no interest in validation -- only well-formedness.)
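A minimal sketch of such a user agent, assuming Python's standard-library xml.etree.ElementTree, which checks well-formedness only and does not retrieve external DTDs. The document and its system identifier are hypothetical; the .invalid host is deliberately unresolvable, so any attempt to fetch the DTD could not succeed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document whose DTD lives at an address that cannot resolve.
DOC = (
    '<!DOCTYPE note SYSTEM "http://dtd.example.invalid/note.dtd">\n'
    "<note><to>ciwah</to><body>well-formed is enough</body></note>"
)

# Same document with one end tag removed: no longer well-formed.
BROKEN = DOC.replace("</body>", "")


def is_well_formed(xml_text):
    """True if a non-validating parse succeeds, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(DOC))      # DTD named but never fetched
print(is_well_formed(BROKEN))   # rejected on well-formedness alone
```

Nothing here says the document is valid against note.dtd; the parser never looked, and was never asked to.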

Andy Dingley <dingbat@codesmiths.com>

May 19, 2006, 5:03:52 AM

VK wrote:
> and VK continuously stated that:

You're talking about yourself in the third person?

We have clear k00k-sign !

(*plonk* - you're just not worth it)

VK

May 19, 2006, 5:40:59 AM

Andy Dingley wrote a while ago:

<q>XML apps _may_ of course fetch DTDs, but in the absence of any real
statistical evidence I'd guess that their actual practice is that far
fewer XML apps fetch DTDs than similar SGML apps do. The reason is
simple - XML never needs a DTD to parse the document into the DOM. This
is the absolutely fundamental design difference between XML and
SGML.</q>

No one statement in this quote appeared to be correct. The only correct
statement you forcely did later was that DTD in HTML/XHTML are opaque
strings and they have different functionality than described in XML
specs. As you started to switch onto side topics about my style and
semantics in my posts, I presume that was the maximum compromise you
are ready to go for. I don't dare to force you any further.

Michael Winter

May 19, 2006, 7:50:00 AM

On 18/05/2006 17:26, VK wrote:

> Michael Winter wrote:

[snip]

>> I quoted from the specification because either you haven't read it,
>> or you don't understand it.
>
> I considered this quote as the statement "I read the relevant part in
> full, here is the proof" :-)
>
> Why did you stop reading in the middle then?

I didn't, but I quoted only what was relevant.

[snip]

> Validity constraint: Standalone Document Declaration
>
> The standalone document declaration MUST have the value "no" if any
> external markup declarations contain declarations of:

[snip]

And what bearing do you think that has on your assertion that the
default value is "yes"?

Mike

VK

May 19, 2006, 8:53:12 AM

Michael Winter wrote:
> And what bearing do you think that has on your assertion that the
> default value is "yes"?

No, it was my mistake I admitted in the previous post. Should I do it
again? /I was wrong/

In the case like:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

in a XML-conformant document standalone value in prolog is assumed "no"
and /cannot/ be set to "yes" (by "Validity constraint" section at
<http://www.w3.org/TR/REC-xml/#sec-rmd>)

Michael Winter

May 19, 2006, 11:47:09 AM

On 19/05/2006 07:42, VK wrote:

[Regarding validity constraint, section 2.9, XML 1.0]

> <comp.text.xml> comment:
> <q>Since it's a constraint attached to the production for standalone
> declarations, I think you should take it as "if there is a standalone
> declaration, it must have the value "no" if ...". You're certainly not
> required to have one.</q>
>
> That's a creative reading of the damaged text fragment [...]

That would be a reasonable reading of English.

First, section 2.9 starts by defining what constitutes an 'external
markup declaration'. It then follows with a description of what a
standalone document declaration represents, its relationship with the
previous definition, and the default value of the declaration. Finally,
it sets out when the value must be 'no' for a document to be valid.

[snip]

> I was wrong and you were right.

Thank you for acknowledging that. However, I think I'm past the point of
caring (actually, I think I reached that point a while ago).

I'm fed up of banging my head against the wall, trying to make you see
sense. Don't be surprised if I ignore any discussions you intend to
start with me, even if I reply with corrections to your posts. They will
be for the benefit (or protection) of other readers, not you. You just
aren't worth the effort or the frustration any more.

A cop-out? Perhaps, but it's better than the alternative.

Mike

VK

May 19, 2006, 12:17:40 PM

Michael Winter wrote:
> I'm fed up of banging my head against the wall, trying to make you see
> sense. Don't be surprised if I ignore any discussions you intend to
> start with me, even if I reply with corrections to your posts. They will
> be for the benefit (or protection) of other readers, not you. You just
> aren't worth the effort or the frustration any more.
>
> A cop-out? Perhaps, but it's better than the alternative.

I'm sorry if I caused such frustration - to "show up" was neither my
initial nor my secondary purpose. I just wanted to show that
DOCTYPE/DTD in HTML/XHTML documents are formal opaque strings and they
have very far relation to DOCTYPE/DTD used in XML-conformant documents
and described in W3C specs. As simple as that - yet seemed to be held
as a non-disclosure taboo. I really wonder what the response would be to
the recent
<http://groups.google.com/group/comp.infosystems.www.authoring.html/browse_frm/thread/06e7a7fe8050da5b/a62f3c8bd54ede47?hl=en#a62f3c8bd54ede47>
without this thread a bit below it.

At the same time I miss a lot of professional knowledge in XML and
XSLT, am definitely weak in reading W3C docs, and my English may fall down
- especially after midnight.

cop-out

Henri Sivonen

May 20, 2006, 5:30:53 AM

In article <1147868253....@g10g2000cwb.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> Currently Firefox cannot load external DTD's at all.

That's not true.

Firefox can load external DTDs if the system ID has the chrome URI
scheme. Loading external DTDs has, by design, been prevented for other
URI schemes.

> This is a nasty
> bug, but to fix it properly they have to solve somehow the problem with
> the bogus DTD from W3C.

It is a feature. The XML spec, by design, allows not processing the
external DTD subset. The rationale for the spec feature was browsers. It
would be simply stupid for browsers to suffer the troubles of external
entities when the spec gives a way out.

Mozilla does do a dirty trick in this area, which causes error messages
to cite the wrong reason in a specific case. However, when an error is
displayed, it is legitimate per spec (even if the reason stated is not
legitimate).

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html