Re: Problem applying HTML 4.01 DOM in scripting

David Dorward

Mar 18, 2006, 2:58:17 PM

VK wrote:

> It doesn't look right though, because say <br /> leaves Validator
> happy, but <br !> leads to "character "!" not allowed in attribute
> specification list".

<br !>
< - Start of tag
br - This is a br element
! - not allowed
> - end of tag

<br />
< - Start of tag
br - this is a br element
/ - end of tag
> - a greater than character (as text to appear after the line break)
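
The two readings above can be sketched in code. This is a toy illustration only, not a real SGML parser: the function name `read_start_tag` is made up for the example, attributes are deliberately not handled, and no DTD is consulted. It assumes SHORTTAG is enabled, so that `/` closes the tag just as `>` does.

```python
import re

# Simplified element-name pattern (an assumption, not the full SGML rule).
NAME = re.compile(r"[A-Za-z][A-Za-z0-9.-]*")


def read_start_tag(markup):
    """Read one start tag; return (name, closing_delimiter, remaining_text)."""
    if not markup.startswith("<"):
        raise ValueError("expected '<'")
    match = NAME.match(markup, 1)
    if match is None:
        raise ValueError("expected an element name")
    pos = match.end()
    # Skip whitespace between the name and whatever follows it.
    while pos < len(markup) and markup[pos].isspace():
        pos += 1
    if pos == len(markup):
        raise ValueError("unterminated tag")
    ch = markup[pos]
    if ch in ">/":
        # Either delimiter ends the tag; the rest is ordinary content.
        return match.group(), ch, markup[pos + 1:]
    if ch.isalpha():
        raise NotImplementedError("attributes are not handled in this sketch")
    raise ValueError(
        f"character {ch!r} not allowed in attribute specification list"
    )
```

Under these assumptions, `read_start_tag("<br />")` returns `("br", "/", ">")`, i.e. the `>` is left over as text, while `read_start_tag("<br !>")` raises an error at the `!`.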

> So it is an intentional patch, and Validator was/is/will (at least
> partially) a brainwashing tool for pushing desired behavior.

No.

> We may continue this in ciwah if you want to (I don't).

Fine. Follow-ups set.

--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
Home is where the ~/.bashrc is

Alan J. Flavell

Mar 18, 2006, 4:06:57 PM

On Sat, 18 Mar 2006, David Dorward wrote:

> VK wrote:
>
> > It doesn't look right though, because say <br /> leaves Validator
> > happy, but <br !> leads to "character "!" not allowed in attribute
> > specification list".

[...]


> > We may continue this in ciwah if you want to (I don't).
>
> Fine. Follow-ups set.

Which prompted me to review the earlier thread on c.l.j

That VK looks pretty impermeable to learning anything about SGML,
hmmm?

VK

Mar 19, 2006, 11:51:44 AM

Alan J. Flavell wrote:
> That VK looks pretty impermeable to learning anything about SGML,
> hmmm?

Not really, I'm just from another Church :-D

My credo is that as of the year 2006 HTML is a completely separate unit
with its own history, traditions and standards. Therefore it has to be
explained by itself and from itself. Yes, historically one well-known
man took the SGML papers and did a quick extract of whatever he thought
the minimum necessary for e-doc exchange between the CERN groups
involved. But that was all in the pre-pre-historic period.

This way arguments like "this exists in SGML so it has always existed
(but was not revealed up to now) in HTML" are totally alien to me (and,
I dare to presume, to many other people). To me it's like a person who
looks up a word definition in a reconstructed Proto-Indo-European
dictionary instead of the recent Random House Webster's :-) Or like
someone telling me that COMMENT should be a legal keyword to denote
comments in my C++ program, because it is defined in ALGOL :-)

I have to admit that I'm not a great specs reader (losing my patience
and mind too quickly :-) But neither in
<http://www.w3.org/MarkUp/html-spec/html-spec_3.html#SEC3.2.2>
nor in
<http://www.w3.org/MarkUp/html-spec/html-spec_9.html#SEC9.5>
did I find any mention (even "for future use") of a slash in a
start-tag.

So for me <br /> goes by the same _HTML_ rule as <br foobar>:
unrecognized attribute to be ignored. If it has some special useful
meaning in _another_ markup system then so much the better, but that is
irrelevant to the HTML DTDs. It can be added to HTML syntax as a new
markup sign, though. The meaning and application of this _new_ markup
sign has to be clearly described in the _HTML_ specs (not "see what it
means in SGML, XML, ... XYZ").
The rules for this _new_ markup sign have to be added to all relevant
HTML DTDs. Only then (and not a second earlier) may Validator validate
an HTML 4.0x page with end slashes.

I guess it's an evangelism question though as I could conclude from the
group archives. In this case I've chosen my side then :-)

jens.br...@gmail.com

Mar 19, 2006, 5:27:01 PM

VK wrote:
> My credo is that as of the year 2006 HTML is completely separate unit
> with its own history, traditions and standards. Therefore it has to be
> explained by itself and from itself.
> [...]

> This way arguments like "this exists in SGML so it ever existed (but
> was not revealed up to now) in HTML" are totally alien to me
>
> I have to admit that I'm not a great specs reader (loosing my patience
> and mind too quickly :-) But neither in
> <http://www.w3.org/MarkUp/html-spec/html-spec_3.html#SEC3.2.2>
> nor in
> <http://www.w3.org/MarkUp/html-spec/html-spec_9.html#SEC9.5>
> I didn't found any mention (even "for future use") of slash in
> start-tag.

I am sorry to contradict but actually, the SGML Declaration for HTML
does mention exactly this feature:

| FEATURES
| MINIMIZE
| DATATAG NO
| OMITTAG YES
| RANK NO
| SHORTTAG YES

SHORTTAG YES does mean that shorthand markup is allowed in HTML.

Now what is this shorthand markup?
Unfortunately you did not cite the current HTML specification, which is
HTML 4.01, but an outdated one, HTML 2.
Have a look at the current specification and you will find at

http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.7

the description of SHORTTAGS, namely the NET tag <ELEM/.../
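
To make the shorthand concrete, here is a rough sketch of what a NET-style element stands for in fully written-out markup. It is a simplification under stated assumptions: the helper name `expand_net` is invented for this example, and the regular expression covers only a bare element name with flat content (no attributes, no nesting, no elements declared EMPTY). A conforming SGML parser does considerably more.

```python
import re

# <name/content/  ->  <name>content</name>
# Optional whitespace is allowed before the first slash; the content may not
# itself contain "/" or "<" in this simplified form.
NET = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\s*/([^/<]*)/")


def expand_net(markup):
    """Rewrite simple NET shorthand into explicit start and end tags."""
    return NET.sub(r"<\1>\2</\1>", markup)
```

For example, `expand_net("<em/important/")` gives `<em>important</em>`, while markup that already uses explicit tags, such as `<p>plain</p>`, passes through unchanged.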

> The rules for this _new_ markup sign have to be added to all relevant
> HTML DTD's. Only then (and not a second earlier) Validator may validate
> HTML 4.0x page with end slashes.
>
> I guess it's an evangelism question though as I could conclude from the
> group archives. In this case I've chosen my side then :-)

You may now consider changing sides again.

Cheers,

jens

David Dorward

Mar 19, 2006, 6:47:13 PM

jens.br...@gmail.com wrote:

> I am sorry to contradict but actually, the SGML Declaration for HTML
> does mention exactly this feature:
>
> | FEATURES
> | MINIMIZE
> | DATATAG NO
> | OMITTAG YES
> | RANK NO
> | SHORTTAG YES

I looked for this in an effort to quote it, but couldn't find it. Where is
it? (If it's in the strict.dtd file then a line number would be helpful as I
must be going blind!)

Alan J. Flavell

Mar 19, 2006, 7:16:00 PM

On Sun, 19 Mar 2006, David Dorward wrote:

> jens.br...@gmail.com wrote:
>
> > I am sorry to contradict but actually, the SGML Declaration for HTML
> > does mention exactly this feature:
> >
> > | FEATURES
> > | MINIMIZE
> > | DATATAG NO
> > | OMITTAG YES
> > | RANK NO
> > | SHORTTAG YES
>
> I looked for this in an effort to quote it, but couldn't find it. Where is
> it?

In the SGML Declaration, just as the hon. Usenaut said:
http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html

> (If its in the strict.dtd file then a line number would be helpful

Don't confuse the SGML Declaration with a DTD !

bon soir

Alan J. Flavell

Mar 19, 2006, 8:11:23 PM

On Sun, 19 Mar 2006, VK wrote:

> My credo is that as of the year 2006 HTML is completely separate unit
> with its own history, traditions and standards.

You don't accept the W3C definition of HTML?

(for all its faults and inconsistencies, it *does* seem to be fairly
widely accepted).

> This way arguments like "this exists in SGML so it ever existed (but
> was not revealed up to now) in HTML"

Its existence has been known throughout, since RFC1866 (HTML/2.0).

> are totally alien to me

I'd prefer to go along with the W3C definition of HTML4, as far as it
doesn't contradict itself.

Consider http://www.w3.org/TR/html401/conform.html#h-4.1 ,
first item.

Then...

Consider http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.3 and
the subsequent sections, particularly B.3.7

Based on which, it should be clear that there are some SGML features
which are technically valid in HTML because of SGML, but are poorly
supported in HTML software, and thus the W3C advises against using
them.

You may not *wish* to know what they are; but it would be wise to know
what they are, otherwise it's hard to know how to avoid inadvertently
using them, n'est-ce pas?

Henri Sivonen

Mar 20, 2006, 3:51:42 AM

In article <1142787104.1...@v46g2000cwv.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> My credo is that as of the year 2006 HTML is completely separate unit
> with its own history, traditions and standards.

> So for me <br /> goes by the same _HTML_ rule as <br foobar>:


> unrecognized attribute to be ignored.

FWIW, in the current draft of Web Apps 1.0 aka. HTML5, which does not
pretend to be an application of SGML, the slash is not the same as an
unrecognized attribute.

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/
Mozilla Web Author FAQ: http://mozilla.org/docs/web-developer/faq.html

VK

Mar 20, 2006, 5:11:56 AM

Henri Sivonen wrote:
> In article <1142787104.1...@v46g2000cwv.googlegroups.com>,
> "VK" <school...@yahoo.com> wrote:
>
> > My credo is that as of the year 2006 HTML is completely separate unit
> > with its own history, traditions and standards.
>
> > So for me <br /> goes by the same _HTML_ rule as <br foobar>:
> > unrecognized attribute to be ignored.
>
> FWIW, in the current draft of Web Apps 1.0 aka. HTML5, which does not
> pretend to be an application of SGML, the slash is not the same as an
> unrecognized attribute.

It is great to know that W3C finally realizes the need of the HTML DTD
update :-)

But this thread is about HTML 4.01 (<em>Please note the total absence
of "X" letter in front<em>). AFAIK the relevant HTML definitions are
summarized in three DTD files: loose.dtd, frameset.dtd, strict.dtd and
referred to by the codenames Transitional, Frameset, Strict. AFAIK the
validation process is built on what is laid out in these DTDs, and how.
Technical notes, internal e-mail correspondence and other side
information located on w3.org are not counted. If they are counted in
some parts in some circumstances, this has to be spelled out either in
the DTDs or at least in the validation results.

In my previous post I've shown that the slash syntax was mentioned
neither in HTML Pre-drafts (2.0) nor in HTML 3.0. You've found, though,
a mention of it in HTML 4.0 related papers. That brings up the
evangelism answers again: "it was always there through SGML - it just
was not revealed" or "it is completely new stuff added to the docs". My
answer is the latter.

I've found more or less recognisable mention of it only here:

<blockquote
cite="http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.7">
Some SGML SHORTTAG constructs save typing but add no expressive
capability to the SGML application. Although these constructs
technically introduce no ambiguity, they reduce the robustness of
documents, especially when the language is enhanced to include new
elements. Thus, while SHORTTAG constructs of SGML related to attributes
are widely used and implemented, those related to elements are not.
Documents that use them are conforming SGML documents, but are unlikely
to work with many existing HTML tools.

The SHORTTAG constructs in question are the following:
* NET tags:
<name/.../

* closed Start Tag:
<name1<name2>

* Empty Start Tag:
<>

* Empty End Tag:
</>
</blockquote>

As I see it, I'm still missing the point of how this block of text could
be applied to change validator behavior for _HTML_ pages. Also please
note that, unlike what some people are trying to imply, say <br /> is
_not_ a NET tag. It has a completely different syntax, though a rather
close intended meaning.

The key feature here is the _space_, which is treated as a token
delimiter within an HTML tag declaration. So the sequence in question is
not "slash" but "space-slash-gt". In this form it is a recognizable HTML
tag, just like say <br foobar> (but not say <brfoobar>). Unrecognized
attributes are ignored and do not lead to an error in an HTML parser. In
this form we can say that <br /> (with space) was fully supported since
at least NCSA Mosaic, just never implemented by HTML parsers. With such
a creative approach nothing stops us from declaring that say <p !>
(with space) implies the required closing tag and was always supported
but not yet implemented. Or that say <hr !M$> was always meant to be
"the content below is not allowed to display for IE users" :-) One just
needs to start this way. :-)

P.S. For a non-specs reader I may seem to be too preoccupied with
fine-tuned syntax matters. But my aim is different: I want W3C to stop
crying over the lost XHTML dreams, take the HTML DTDs out of the freezer
and start fighting on the field where they can win (at least something).

Henri Sivonen

Mar 20, 2006, 6:21:40 AM

In article <1142849516.2...@i40g2000cwc.googlegroups.com>,
"VK" <school...@yahoo.com> wrote:

> Henri Sivonen wrote:
> > FWIW, in the current draft of Web Apps 1.0 aka. HTML5, which does not
> > pretend to be an application of SGML, the slash is not the same as an
> > unrecognized attribute.
>
> It is great to know that W3C finally realizes the need of the HTML DTD
> update :-)

W3C doesn't. HTML5 is not a W3C spec.
http://whatwg.org/specs/web-apps/current-work/

> The key feature here is the _space_

As far as real-world parsing goes, the space is not a key feature here.

> Unrecognized attributes
> are being ignored

Actually, they are not ignored in the parser. They end up in the DOM.
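
This is easy to observe with any tag-soup style parser. The sketch below uses Python's standard-library `html.parser` as a stand-in; it is not a browser, and browsers differ in detail, so treat it only as an illustration of the general point. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record every start tag together with the attributes the parser reports."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def start_tags(markup):
    """Return a list of (tag, attrs) pairs for the start tags in markup."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen
```

With this parser, `start_tags("<br foobar>")` reports `foobar` as an attribute with no value rather than dropping it, whereas for `<br />` the slash is treated as part of the tag syntax and is not reported as an attribute at all.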

> I want W3C stop srying of the lost XHTML dreams, take HTML DTD's
> out of the freezer

I don't want any DTDs to be taken out of the freezer. I'd rather bury
DTDs (and use RELAX NG).

Jukka K. Korpela

Mar 20, 2006, 6:29:01 AM

"VK" <school...@yahoo.com> wrote:

>> FWIW, in the current draft of Web Apps 1.0 aka. HTML5, which does not
>> pretend to be an application of SGML, the slash is not the same as an
>> unrecognized attribute.
>
> It is great to know that W3C finally realizes the need of the HTML DTD
> update :-)

So you don't know what "HTML5" is, do you?

> But this thread is about HTML 4.01 (<em>Please note the total absence
> of "X" letter in front<em>).

We note the lack of end tags for your em elements in your pseudo-markup. I
guess that's tantamount to SHOUTING IN PLAIN TEXT.

> AFAIK the relevant HTML definitions are
> summarized in three DTD files:

No, the DTDs are just DTDs. Of course, you need to understand what a DTD is
in order to see what this means.

> AFAIK the
> validation process is build on what and how is layed out in these DTD.

No, validation is based on processing DTDs, not on any particular DTDs
(though the W3C validator has been tuned in this respect).

> In my previous post I've shown that slash syntax was not mention in
> HTML Pre-drafts (2.0) nor in HTML 3.0.

What are you babbling about? HTML 2.0 was no pre-draft; it's actually the
only HTML specification that has been published as an RFC (i.e., the way
Internet protocols should be published, in many people's opinion). HTML 3.0
on the other hand never existed, except as a fragmentary sketch.

HTML 2.0 makes a normative reference to SGML, and it surely does not need to
list down every bit of SGML that applies to HTML 2.0.

> As I see I'm still missing the point of how this block of text could be
> applied to change validator behavior for _HTML_ pages.

Has someone said it could be so applied?

> Also please note
> that unlike some people are trying to imply, say <br /> is _not_ NET
> Tag. It has completely different syntax though rather close intended
> meaning.

Your statements are so vague that they are useless and pointless.
<br /> changes meaning when moving from HTML to XHTML, so it creates
nothing but confusion to discuss it without paying attention to the
difference.

> Unrecognized attributes
> are being ignored and do not lead to an error in HTML parser.

Here, too, you just confuse yourself and anyone else who does not understand
the issue. Browsers have been instructed to ignore unrecognized attributes,
but this never meant that documents with such attributes were correct.
Error processing recommendations do not define the language.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

VK

Mar 20, 2006, 7:58:27 AM

Jukka K. Korpela wrote:
> So you don't know what "HTML5" is, do you?

I do perfectly, and I think it is great for W3C that during the
in-between times a group of devoted people continued to work on the
future of HTML rather than x-cross it.
Now that the XHTML fantasies are mainly over, the W3C Web Application
Group doesn't have to start everything from scratch. Whoever is involved
in <http://www.w3.org/2006/appformats/> can benefit from some of the work
done at <http://whatwg.org/specs/web-apps/current-work/>.

Coming back to the milestone blog finishing the "fight for XHTML"
period: <http://blogs.msdn.com/ie/archive/2005/07/29/445242.aspx>

Let us skip some sarcasm hidden between the lines: "No, we do not
accept a whole new markup standard, but we still want to be standard:
look ma, we added standard abbr tag support ;-)".

Overall it does define the field for maneuver, because Microsoft
(Micro$oft if one likes) is under the highest pressure of accessibility
demands, in both technical and semantic aspects.

The only rule (as I can read out) is: no reloads and revolutions of any
kind! Unlike Mozilla Foundation, Microsoft has multi-million pre-paid
support lines, amortization etc. So they cannot go so easily on "let's
implement it and fix the bugs on the go, for specs changes we'll make a
patch".

In this respect RELAX NG <http://www.relaxng.org/> IMHO seems to be
another extreme, but now from the developers' side: "The fewer rules
the better". It is always like that: the seller wants the best price,
the buyer wants it for close to nothing; developers want minimum markup
hassles, semantic gurus want the whole content to be leveled markup
with the content buried inside; a traditionalist doesn't want to change
a dot, a progress fighter is ready to drop all traditions and start from
scratch. This is why two sides are needed to fight for a compromise
(and a really good compromise has to leave both sides equally
unsatisfied ;-)


P.S.
Do I agree with definition of HTML as an "application of SGML" as
stated at <http://www.w3.org/MarkUp/html-spec/html-spec_1.html#SEC1> ?
Of course I do!
...in the same sense as I totally agree with the definition of
English as a "member of the Germanic branch of the Indo-European family
of languages". But I would be highly surprised if, on the basis of the
latter correct fact, someone required everyday use (starting say
Jan 01, 2007) of three number forms: singular, dual and plural, and
four noun declension forms: separate ones for males, females, children
and animals, things :-) They may be present in the proto-language, but
they are of no interest to contemporary speakers.


P.S.S.


> Here, too, you just confuse yourself and anyone else who does not understand
> the issue. Browsers have been instructed to ignore unrecognized attributes,
> but this never meant that documents with such attributes were correct.

Right on point, with only one adjustment to make: the fact that browsers
have been instructed to ignore unrecognized attributes was used by a
group of people to present the <br /> syntax as an SGML feature always
supported in HTML. In reality "space-slash-gt" goes by absolutely the
same rules as "space-foobar-gt" or "space-foobar-space-gt". Basic
HTML knowledge suffices to understand that IMHO.

Nick Kew

Mar 20, 2006, 9:18:22 AM

jens.br...@gmail.com wrote:

> SHORTTAG YES does mean that shorthand markup is allowed in HTML.

Some of us consider that a bug in the spec.
It certainly causes serious divergence from anything
with _any_ mainstream browser support.

See http://valet.webthing.com/page/parsemode.html

--
Nick Kew

Jukka K. Korpela

Mar 20, 2006, 11:07:37 AM

Nick Kew <ni...@asgard.webthing.com> wrote:

> jens.br...@gmail.com wrote:
>
>> SHORTTAG YES does mean that shorthand markup is allowed in HTML.
>
> Some of us consider that a bug in the spec.

For some interesting value of "bug". Do you call anything you don't like
a bug?

> It certainly causes serious divergence from anything
> with _any_ mainstream browser support.

I don't think you have got things upside down; you know how they are but you
intentionally misrepresent them. Browser vendors decided to deviate from the
specifications (never implement them as defined), so _browsers_ diverge from
the specs. Retrofitting the specs to the dark reality has been going on for
quite some while, but let's face it as what it is.

Alan J. Flavell

Mar 20, 2006, 11:41:58 AM

On Mon, 20 Mar 2006, Nick Kew wrote:

> jens.br...@gmail.com wrote:
>
> > SHORTTAG YES does mean that shorthand markup is allowed in HTML.
>
> Some of us consider that a bug in the spec.

Which is why AIUI they later decoupled the various features of
SHORTTAG from each other, so that this feature could be switched off
without having to lose the other aspects of SHORTTAG YES. I don't
need to tell *you* that I'm talking about the "Web SGML Adaptations" =
Annex K.

> It certainly causes serious divergence from anything
> with _any_ mainstream browser support.

You have the boot on the other foot. Mainstream browsers exhibit
serious divergence from support for published specifications, in this
regard.

At the time that HTML/2.0 was defined, it simply was not possible to
make this choice (i.e. the HTML spec was trying to rule out something
which the SGML spec didn't allow to be prohibited, since SHORTTAG YES
had been selected in the SGML Declaration for HTML, with *all* of its
implications).

Thanks to Annex K, as I understand it, it *would* now be possible to
rule out this feature, in an amended SGML Declaration for HTML. But
as the W3C had lost interest in HTML, and gone overboard for XML-based
specifications, I guess it's no use looking to them for a revision of
HTML.

Nor is HTML5 aiming to be it (it deliberately faces away from SGML).

David Dorward

Mar 20, 2006, 3:30:02 PM

Alan J. Flavell wrote:

> In the SGML Declaration, just as the hon. Usenaut said:
> http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html

Ahhh.

An HTML document will include a DTD with a public identifier and the URL to
the DTD. Given such a document, how does a validator find the SGML
declaration?

Henri Sivonen

Mar 20, 2006, 5:34:47 PM

In article <Xns978CB84C29A5...@193.229.4.246>,

"Jukka K. Korpela" <jkor...@cs.tut.fi> wrote:

> Browser vendors decided to deviate from the
> specifications (never implement them as defined), so _browsers_ diverge from
> the specs.

But there were browsers before HTML 2.0, right? And at the time HTML was
inspired by SGML--not officially an application of SGML. Doesn't that
mean the SGMLness was retrofitting in the HTML 2.0 timeframe?

Alan J. Flavell

Mar 20, 2006, 6:02:19 PM

On Tue, 21 Mar 2006, Henri Sivonen wrote:

> "Jukka K. Korpela" <jkor...@cs.tut.fi> wrote:
>
> > Browser vendors decided to deviate from the specifications (never
> > implement them as defined), so _browsers_ diverge from the specs.
>
> But there were browsers before HTML 2.0, right? And at the time HTML
> was inspired by SGML--not officially an application of SGML. Doesn't
> that mean the SGMLness was retrofitting in the HTML 2.0 timeframe?

Sort-of: HTML "Classic" was certainly adjusted in the general
direction of SGML in the lead-up to establishing the official
specification. I recall the <p> "tag" being converted from its
original end-of-paragraph marker (analogous to <br>) into a proper
container (and some of us wondering why <br> had been left the way
that it was...).

However, none of those original browsers are really recognisable
today, so I think one is still entitled to ask why a current browser
would want to implement some obsolete, not-fully-specified, HTML
"Classic", in preference to what all the official specifications are
currently saying (viz. that HTML is an application of SGML).

By the way, I learned (since earlier today) that, despite assertions
that HTML5 has turned away from SGML, there does exist a draft "SGML
Declaration for HTML5". Someone is sure to be along shortly to fill
in some history about this, but meantime, here's where I found it:

http://syntax.whatwg.org/sgml/html5core+wf2/pre1/declaration

, where it can be seen (as I was wibbling about before) that the
previous broad-brush "SHORTTAG YES" has been broken down into a whole
list of separate items:

    SHORTTAG
      STARTTAG
        EMPTY    NO  -- outlaws "<>" --
        UNCLOSED NO  -- outlaws "<foo" --
        NETENABL NO  -- outlaws "<p/text<em/more text/ nested/" --
      ENDTAG
        EMPTY    NO  -- outlaws "</>" --
        UNCLOSED NO  -- outlaws "</foo" --
      ATTRIB
        DEFAULT  YES -- allows defaulted attributes --
        OMITNAME YES -- allows "<gi attr>" --
        VALUE    YES -- allows unquoted attrs; "<gi att=val>" --

This would then (inter alia) finally outlaw the NET construct, whose
existence VK is still, it seems, so volubly in denial about. VK might
ponder why it was necessary to outlaw it, if it doesn't exist. Ho
hum.

Henri Sivonen

Mar 21, 2006, 2:37:56 AM

In article <Pine.LNX.4.62.06...@ppepc62.ph.gla.ac.uk>,

"Alan J. Flavell" <fla...@physics.gla.ac.uk> wrote:

> However, none of those original browsers are really recognisable
> today, so I think one is still entitled to ask why a current browser
> would want to implement some obsolete, not-fully-specified, HTML
> "Classic", in preference to what all the official specifications are
> currently saying (viz. that HTML is an application of SGML).

Legacy documents and continuity.

> By the way, I learned (since earlier today) that, despite assertions
> that HTML5 has turned away from SGML, there does exist a draft "SGML
> Declaration for HTML5". Someone will sure to be along shortly to fill
> in some history about this, but meantime, here's where I found it:
>
> http://syntax.whatwg.org/sgml/html5core+wf2/pre1/dec

syntax.whatwg.org has no normative standing whatsoever. Only the spec
itself is normative, and it clearly says HTML5 is a standalone language
and not an application of SGML. The spec uses English--not any schema
language. However, WHAT WG is willing to host non-normative schemas
written by others at syntax.whatwg.org.

AFAIK, the RELAX NG schema is the only one that is actually being worked
on. The sgml directory dates from July 2004.

VK

Mar 21, 2006, 4:59:50 AM

Alan J. Flavell wrote:
> This would then (inter alia) finally outlaw the NET construct, whose
> existence VK is still, it seem, so volubly in denial about. VK might
> ponder why it was necessary to outlaw it, if it doesn't exist. Ho
> hum.

You might ponder why would be necessary to prove a presence of
something just to outlaw it right away :-)

More IMHighlyHO'ed clarifications:

1) The HTML language is a whole separate system, historically (but only
historically) appertaining to the SGML markup system. How separate and
important this system is can be observed from the fact that some of its
aspects are subject to _legislative regulation_ in many major
industrial countries. Can you name another markup system or
programming language leading to debates in Congress?
"Senator N's proposal #ABC123 to require the opening bracket to be
placed on the same line as the function name was sustained 98:2. The
current practice of placing it below the function name is ruled
illegal and is to be prosecuted".
Take some HTML tags and attributes instead of "bracket" and the crazy
article above becomes real.
So no, it is not "just another rather sloppy SGML application to be
phased out by something more correct". It is not that anymore, and has
not been for a very long time. The lack of this understanding is one of
the major reasons for W3C's second-biggest defeat in its history.

2) HTML consists of tags denoted by less-than and greater-than signs
and the content. It does not have any other units like
less-than...slash...content. It never had them and it never will.
Some other languages including the proto-language (SGML) may have a lot
of funny signs but they have no relation to HTML. It is possible to
officially declare them as non-existent in HTML, but it is really
pointless. It is like officially prohibiting walking down the street
upside down on your hands after 8pm. It will look like a commonly
accepted law then (because nobody does it anyway), but it will look
utterly funny.

3) The <br /> (with space) construct is not an SGML NET tag and it has
nothing in common with it syntactically, except maybe the intended
meaning. It is an "unrecognized attribute /" which it is possible to
fill with some sense, though - like any other proprietary attribute in
HTML. Read the NET tag syntax once again and now try to use <br/> (w/o
space). Any more questions?

4) Historically <p> and <br> were both single tags: "paragraph break"
(line space) and "line break" (no line space). But from the earliest
time I can remember everyone considered that a mistake made by the Man
in a rush, so <p>...</p> was always used instead. At the same time no
one ever considered the single <br> to be a mistake, and all attempts
to imply some kind of <line>...</line> were silently ignored by the
community. Why is that? Because it reflects the semantics readers are
conditioned to. We have learned to see a paragraph as a separate
semantic unit. At the same time a line of text is not such a unit.
Moreover
<p><line>First line</line><line>Second line</line></p> is completely
misleading because there is absolutely no correlation between the line
markup and what we perceive on the screen. <br> is simply a "forced
line break" as opposed to the automatic line breaks made for us by the
engine depending on the screen size. It has its application and, most
importantly, it is very convenient - a single <br> in some
circumstances replaces lines and lines of CSS - with all the
cross-browser discrepancies and overall CSS dependence involved.
It is a language feature of HTML, period. Parent - parents but child -
children; <p> - </p> but <br>. Just relax and move on ;-)

Lachlan Hunt

Mar 21, 2006, 5:08:20 AM

David Dorward wrote:
> Alan J. Flavell wrote:
>> In the SGML Declaration, just as the hon. Usenaut said:
>> http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html
>
> Ahhh.
>
> An HTML document will include a DTD with a public identifier and the URL to
> the DTD. Given such a document, how does a validator find the SGML
> declaration?

From the catalogue file - the same file that identifies all the public
identifiers. If you take a look at the W3C validator's catalogue,
you'll see this line near the top:

SGMLDECL sgml.dcl

http://dev.w3.org/cvsweb/~checkout~/validator/htdocs/sgml-lib/sgml.soc?rev=1.18&content-type=text/plain

All the DTDs, SGML declaration, etc. are located here.
http://dev.w3.org/cvsweb/validator/htdocs/sgml-lib/
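
As a rough illustration of the lookup, the sketch below reads a small catalogue and pulls out the SGML declaration and the public-identifier mappings. Everything apart from the `SGMLDECL sgml.dcl` line is an assumption made for the example: the two `PUBLIC` entries and the function name `parse_catalog` are illustrative, and real catalogue files also contain comments, other entry types and multi-line entries that this does not handle.

```python
import shlex

# Illustrative catalogue text; only the SGMLDECL line is taken from the
# validator's catalogue as quoted above, the PUBLIC entries are examples.
SAMPLE_CATALOG = '''
SGMLDECL sgml.dcl
PUBLIC "-//W3C//DTD HTML 4.01//EN" strict.dtd
PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" loose.dtd
'''


def parse_catalog(text):
    """Return (sgml_declaration_file, {public_identifier: dtd_file})."""
    sgmldecl = None
    public = {}
    for line in text.splitlines():
        fields = shlex.split(line)  # honours the quoted public identifiers
        if not fields:
            continue
        keyword = fields[0].upper()
        if keyword == "SGMLDECL" and len(fields) >= 2:
            sgmldecl = fields[1]
        elif keyword == "PUBLIC" and len(fields) >= 3:
            public[fields[1]] = fields[2]
    return sgmldecl, public
```

Calling `parse_catalog(SAMPLE_CATALOG)` yields `sgml.dcl` as the declaration, alongside a mapping from each public identifier to its DTD file. The point is that the declaration comes from the catalogue, not from the document or its DTD.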

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Harlan Messinger

Mar 21, 2006, 10:12:20 AM

VK wrote:

> This way arguments like "this exists in SGML so it ever existed (but
> was not revealed up to now) in HTML" are totally alien to me (and I
> dare to presume to many other people).

I'd swear that what you just said is that when a misconception of yours
comes to light, it's the world's obligation to move swiftly to conform
to it rather than yours to correct it.

[snip]


>
> So for me <br /> goes by the same _HTML_ rule as <br foobar>:
> unrecognized attribute to be ignored.

"So for me"--good grief. How solipsistic can you be?

David Dorward

Mar 21, 2006, 1:45:36 PM

Lachlan Hunt wrote:

>> An HTML document will include a DTD with a public identifier and the URL
>> to the DTD. Given such a document, how does a validator find the SGML
>> declaration?

> From the catalogue file - the same file that identifies all the public
> identifiers. If you take a look at the W3C validator's catalogue,
> you'll see this line near the top:

So it isn't possible to know which SGML features are turned on or off given
an arbitrary SGML document? Only documents with known public identifiers?

Alan J. Flavell

Mar 21, 2006, 3:30:24 PM

On Tue, 21 Mar 2006, Henri Sivonen wrote:

> "Alan J. Flavell" <fla...@physics.gla.ac.uk> wrote:
>
> > However, none of those original browsers are really recognisable
> > today, so I think one is still entitled to ask why a current browser
> > would want to implement some obsolete, not-fully-specified, HTML
> > "Classic", in preference to what all the official specifications are
> > currently saying (viz. that HTML is an application of SGML).
>
> Legacy documents and continuity.

Serves me right for not saying -

I think one is still entitled to ask why a current browser would want
to implement some obsolete, not-fully-specified, HTML "Classic", *to
the exclusion of* what all the official specifications are currently
saying (viz. that HTML is an application of SGML).

[...]

> syntax.whatwg.org has no normative standing whatsoever.

thanks

> Only the spec itself is normative,

..and then only in the context of the "whatwg", at least for now?...

cheers

Henri Sivonen

unread,
Mar 21, 2006, 4:13:28 PM3/21/06
to
In article <Pine.LNX.4.62.06...@ppepc62.ph.gla.ac.uk>,

"Alan J. Flavell" <fla...@physics.gla.ac.uk> wrote:

> On Tue, 21 Mar 2006, Henri Sivonen wrote:
>
> > "Alan J. Flavell" <fla...@physics.gla.ac.uk> wrote:
> >
> > > However, none of those original browsers are really recognisable
> > > today, so I think one is still entitled to ask why a current browser
> > > would want to implement some obsolete, not-fully-specified, HTML
> > > "Classic", in preference to what all the official specifications are
> > > currently saying (viz. that HTML is an application of SGML).
> >
> > Legacy documents and continuity.
>
> Serves me right for not saying -
>
> I think one is still entitled to ask why a current browser would want
> to implement some obsolete, not-fully-specified, HTML "Classic", *to
> the exclusion of* what all the official specifications are currently
> saying (viz. that HTML is an application of SGML).

HTML "Classic" is required for legacy content. OTOH, implementing HTML
as an application of SGML is not required for existing Web content and
doesn't provide any killer features to justify the effort. It would only
provide more syntactic sugar.

> > Only the spec itself is normative,
>
> ..and then only in the context of the "whatwg", at least for now?...

Right.

Alan J. Flavell

unread,
Mar 21, 2006, 4:40:55 PM3/21/06
to
On Tue, 21 Mar 2006, Henri Sivonen wrote:

> HTML "Classic" is required for legacy content. OTOH, implementing
> HTML as an application of SGML is not required for existing Web
> content and doesn't provide any killer features to justify the
> effort. It would only provide more syntactic sugar.

I note that at least one of the SGML features of emacs-w3 (YKWIM) had
to be deliberately broken in order to fall into line with the dreaded
"Appendix C". This, at least, was one product that was punished for
taking the W3C claim seriously!

Then there was Softquad Panorama...

regards

Michael Winter

unread,
Mar 22, 2006, 6:41:35 AM3/22/06
to
On 21/03/2006 09:59, VK wrote:

[snip]

> You might ponder why would be necessary to prove a presence of
> something just to outlaw it right away :-)

But that - proving the existence of something just prior to forbidding
it - never happened.

> More IMHighlyHO'ed clarifications:

[snipped something that was far from clear]

> 2) HTML consists of tags denoted by lesser-than and greater-than signs
> and the content. It does not have any other units like
> lesser-than...slash...content. It never had them and it never will.

Yes, it does. It has been mentioned before: you are suffering from denial.

> Some other languages including the proto-language (SGML) may have a lot
> of funny signs but they have no relation to HTML.

You're apparently missing the small fact that HTML is defined in terms
of (or, as an application of) SGML. Even historical versions of HTML
from as far back as 1992 do so.

You may not like the idea of HTML as an application of SGML. It may not
even be practical now, but that's not really the point. HTML does have
these features. That browsers don't implement all of them simply means
that HTML documents should not use them, or anything that looks like them.

[snip]

> 3) <br /> (with space) construct is not SGML NET tag

Quite right. It is a NET-enabling start-tag, followed by a greater-than
character. The br element has an EMPTY content model and an end-tag is
forbidden: the slash closes the element, and the following character is
content of the parent element and a sibling of the br element.

A NET, or null end-tag, is a forward slash (/).

> and it has nothing in common with it syntactically

Assuming you meant NET-enabling, then it has everything in common
syntactically. You still have problems with that space?

> except maybe the intended meaning.

?

> It is "unrecognized attribute /" [...]

Nonsense. A forward slash doesn't match any part of the attribute
specification list production, so it cannot be an attribute.

Even if it didn't match the start-tag production, then it would be a
syntax error, not an unrecognised attribute.

[snip]

> Read the NET tag syntax once again and now try to use <br/> (w/o
> space).

I'm not sure what that statement is trying to prove, but does it mean
that you have read the SGML grammar? In light of your trouble accepting
spaces in NET-enabling start-tags, perhaps you'd like another try?

[18] net-enabling start-tag (7.4.1.3, 316:11) =
( stago ("<"),
generic identifier specification [29],
attribute specification list [31],
*s [5],
net ("/") )

Notice '*s' between the attribute list and NET?

[5] s (6.2.1, 297:23) =
( SPACE
| RE
| RS
| SEPCHAR )

Yes, that's white space, and it's optional.
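
The two productions above can be turned into a small matcher. This is only a rough sketch in Python, not a conforming SGML parser: the attribute specification list is cut down to name="value" pairs, and the names S, NAME, ATTR and split_net_start_tag are made up for illustration.

```python
import re

# Simplified stand-ins for the productions quoted above (assumptions, not
# the full ISO 8879 grammar).
S = r"[ \t\r\n]"                       # [5] s: SPACE | RE | RS | SEPCHAR
NAME = r"[A-Za-z][A-Za-z0-9.-]*"       # generic identifier (simplified)
ATTR = rf'{S}+{NAME}{S}*={S}*"[^"]*"'  # one name="value" attribute (simplified)

# [18] net-enabling start-tag: stago, GI, attribute list, s*, net
NET_ENABLING_START_TAG = re.compile(rf"<({NAME})((?:{ATTR})*){S}*/")


def split_net_start_tag(markup):
    """Return (element name, text after the tag), or None if no match."""
    m = NET_ENABLING_START_TAG.match(markup)
    if m is None:
        return None
    return m.group(1), markup[m.end():]


for sample in ("<br/>", "<br />", "<br !>", "<br>"):
    print(repr(sample), "->", split_net_start_tag(sample))
```

With or without the space, the match stops at the slash, so the ">" in "<br />" is left over as ordinary character data. "<br !>" does not match at all, because "!" fits neither the attribute list nor the NET.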

> Any more questions?

You seem a little overconfident.

> 4) Historically <p> and <br> both were single tags: [...]

What's that got to do with anything? They aren't now, and haven't been
for over a decade.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.

VK

unread,
Mar 22, 2006, 7:15:57 AM3/22/06
to
> > 2) HTML consists of tags denoted by lesser-than and greater-than signs
> > and the content. It does not have any other units like
> > lesser-than...slash...content. It never had them and it never will.
>
> Yes, it does. It has been mentioned before: you are suffering from denial.
>
> > Some other languages including the proto-language (SGML) may have a lot
> > of funny signs but they have no relation to HTML.
>
> You're apparently missing the small fact that HTML is defined in terms
> of (or, as an application of) SGML. Even historical versions of HTML
> from as far back as 1992 do so.

I already answered this argument in another post. Historically English,
German and (on deeper level) even Persian appertain to the same
proto-language. In XXI century this fact is known and interesting to
involved language history specialists.

> You may not like the idea of HTML as an application of SGML.

Why would I not like it? And would I like it or not it would stay an
application of SGML as English would stay a member of Indo-European
family. It has absolutely no relevance to the present and the future
of the language (in both cases), but the historical fact remains.

I think you are a bit mixed up, as well as some W3C members.
HTML can have any amount of the most strange lexemes inside the tag. Any
browser is instructed to take whatever it understands and do whatever
it is instructed to do with it. It is also instructed to skip on
whatever it doesn't understand. Do not hang up, do not show error page
- simply skip on it. This is a core feature of HTML allowing to make all
kind of extensions and custom attributes. Any markup system or a
browser trying to act anyhow differently has no future in the Web and
it never will. I guess this (probably upsetting) fact is starting to
penetrate into even the most stubborn W3C minds.

This little piece of markup will be interpreted properly by any normal
browser and it will produce three line breaks:
<p>Line
<br />
break
<br / >
break
<br !@#$%>
break</p>

Any browser which will not produce three line breaks is extremely badly
broken or badly written.

Both "/" and "!@#$%" are treated as unrecognized lexemes and simply
skipped.

Anyone can define "/" or "!@#$%" as some new tag attributes affecting
parsing.

Anyone can ask from browser producers to endorse this new attribute
(but cannot force them of course).

There is no need though to write some papers to "allow such tag
syntax". It is already inside HTML: "/", "!@#$%", "^&*()" and ad
infinitum.
One can still issue a paper saying that <br /> is a valid HTML syntax.
But it is the same as issuing a paper saying that "all people are
entitled to breathe".

VK

unread,
Mar 22, 2006, 7:36:15 AM3/22/06
to

VK wrote:
> Anyone can define "/" or "!@#$%" as some new tag attributes affecting
> on parsing.
>
> Anyone can ask from browser producers to endorse this new attribute
> (but cannot force them of course).

And of course if that anyone decides to go for it, she also needs to
specify all usage rules. Say:

<br />
<br / >
<br / clear="all">

are going to have the same effect or only space-slash-gt must be
counted?

David Dorward

unread,
Mar 22, 2006, 1:51:19 PM3/22/06
to
VK wrote:

> > You're apparently missing the small fact that HTML is defined in terms
> > of (or, as an application of) SGML. Even historical versions of HTML
> > from as far back as 1992 do so.
>
> I already answered this argument in another post. Historically English,
> German and (on deeper level) even Persian appertain to the same
> proto-language. In XXI century this fact is known and interesting to
> involved language history specialists.

You seem to be missing the point.

First: just because something has been historically defined as being
something, doesn't prevent it CURRENTLY being defined the same way too!

Second: English/German/Persian may have a common ancestor, but HTML is an
application of SGML, not a descendent of it!

> it is instructed to do with it. It is also instructed to skip on
> whatever it doesn't understand.

> Both "/" and "!@#$%" are treated as unrecognized lexemes and simply
> skipped.

http://www.w3.org/TR/html4/conform.html#h-4.1

This specification does not define how conforming user agents handle
general error conditions, including how user agents behave when they
encounter elements, attributes, attribute values, or entities not
specified in this document.

The specification does give some advice on how user agents may handle
errors:

http://www.w3.org/TR/html4/appendix/notes.html#notes-invalid-docs

However, that doesn't cover what happens with most of your examples as "/"
doesn't conform to the rules for attributes so can't be an "unrecognised
attribute" (as it can't be an attribute at all).

Besides, the spec defines <br /> as meaning the same as <br>&gt;. If a
browser doesn't understand that "/" then it doesn't implement HTML fully.
If a browser does implement HTML fully, then it should render a line break
followed by a greater than sign. In either case, the HTML means the same.
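
For contrast, here is roughly what a tag-soup parser does with the same markup. The sketch uses Python's standard html.parser module, which does not implement SHORTTAG NET; it stands in for the browser-style handling only as an assumption, and the EventLog class and parse_events helper are names invented for this example.

```python
from html.parser import HTMLParser


class EventLog(HTMLParser):
    """Record every parser callback so the resulting structure is visible."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style tags such as <br />.
        self.events.append(("startend", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_events(markup):
    parser = EventLog()
    parser.feed(markup)
    parser.close()
    return parser.events


for event in parse_events("<p>Line<br />break</p>"):
    print(event)
```

The br tag is reported through the tag callbacks and the only text events are "Line" and "break"; no ">" ever turns up as character data. That is the behaviour most people expect, and it is not the SGML reading described above.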

> One can still issue a paper saying that <br /> is a valid HTML syntax.

It is ... it just doesn't mean what most people think it means (and what
XHTML 1.0 Appendix C implies it means).

VK

unread,
Mar 22, 2006, 3:49:50 PM3/22/06
to

David Dorward wrote:
> First: just because something has been historically defined as being
> something, doesn't prevent it CURRENTLY being defined the same way too!
>
> Second: English/German/Persian may have a common ancestor, but HTML is an
> application of SGML, not a descendent of it!

Back to the round one :-)
This is why evangelism questions are called so: they have no commonly
accepted solution in a discussion.

The most importantly though that W3C cannot dictate anything by using
SGML rules as implication of something. They will be kicked from door
to door and just being funny to everyone: and the fun is the first
image killer.

HTML is an active SGML application or just SGML descendant - whatever.
The most important how to produce something Needed, Logical,
Convenient, Having hope to eventually go anywhere outside of W3C
cabinets. SGML is really not a big helper here IMHO.

> The specification does give some advice on how user agents may handle
> errors:
>
> http://www.w3.org/TR/html4/appendix/notes.html#notes-invalid-docs
>
> However, that doesn't cover what happens with most of your examples as "/"
> doesn't conform to the rules for attributes so can't be an "unrecognised
> attribute" (as it can't be an attribute at all).
>
> Besides, the spec defines <br /> as meaning the same as <br>&gt;

In what language? Not HTML the world knows about. A narrow group of
"carefully selected people" in W3C and outside may know that and enjoy
the feeling of exclusivity of their knowledge :-). But the practical
outcome of it? Overall this slash-problem is really totally pointless:
if browser knows that tag, then it also knows is it a single tag or a
paired one, so why would it need an extra reminder? And if the tag is
unknown, it will be ignored, paired or not.

David Dorward

unread,
Mar 22, 2006, 4:57:39 PM3/22/06
to
VK wrote:
> This is why evangelism questions are called so: they have no commonly
> accepted solution in a discussion.

I don't see anyone other than you disagreeing with the HTML specification.

> The most importantly though that W3C cannot dictate anything by using
> SGML rules as implication of something. They will be kicked from door
> to door and just being funny to everyone: and the fun is the first
> image killer.

What? Is that even English?

>> Besides, the spec defines <br /> as meaning the same as <br>&gt;
>
> In what language? Not HTML the world knows about.

So you are arguing that HTML is not what the HTML specification says, but
what a group of user agents that attempt to implement that spec do?

> A narrow group of "carefully selected people" in W3C and outside may know
> that and enjoy the feeling of exclusivity of their knowledge :-).

Given that quite a few people shout it from the rooftops whenever the
misconception comes up, that suggestion is utter rubbish.

> But the practical outcome of it?

That most user agents get <br /> wrong. That a few get it right. And that it
is best avoided since it provides no benefits and is harmful to users of
those few user agents that implement that bit of HTML correctly.

> Overall this slash-problem is really totally pointless:
> if browser knows that tag, then it also knows is it a single tag or a
> paired one, so why would it need an extra reminder?

What extra reminder? As several people mentioned before, the tag ends on
the "/". The ">" is a character that is supposed to be displayed, not
duplicate the purpose of the "/".

VK

unread,
Mar 27, 2006, 2:49:21 AM3/27/06
to

David Dorward wrote:
> What extra reminder? As several people mentioned before, the tag ends on
> the "/". The ">" is a character that is supposed to be displayed, not
> duplicate the purpose of the "/".

When you are losing the rest of your mind by trying to understand W3C
specs, go to Amaya :-) Often W3C members just cannot express their
thoughts in writing - but at least they can implement a sample to look
at.

Amaya is the only browser I'm aware of implementing the NET tag. I have
a sad news for you though: NET syntax has nothing to do with XHTML
"single tag". It also doesn't suppose to work in the way you think.
*Nothing* is supposed to be displayed after the slash. The exact
meaning of forward slash is: "ignore everything until another tag". By
its function it is very close to single line comment in programming
languages.

I guess it finally closes the question about forward slash in XHTML (in
XHTML it is an all new sign with an all new meaning) as well as about
the future of forward slash in HTML (it doesn't have any future).

// Test file:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"
"http://www.w3.org/TR/html401/strict.dtd">
<html>
<head>
<title>Test</title>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
</head>

<body>
<p>Paragraph 1 <br / test</p>
<p>Paragraph 2 <br / test</p>
</body>
</html>

// DOM Structure in Amaya (screenshot):
<http://www.geocities.com/schools_ring/archives/NET_Tag.gif>

A copy of Amaya for independent tests can be obtained at:
<http://www.w3.org/Amaya/User/BinDist.html>

Lachlan Hunt

unread,
Mar 27, 2006, 6:36:26 AM3/27/06
to
VK wrote:
> David Dorward wrote:
>>> What extra reminder? As several people mentioned before, the tag ends on
>>> the "/". The ">" is a character that is supposed to be displayed, not
>>> duplicate the purpose of the "/".
>
> It also doesn't suppose to work in the way you think.
> *Nothing* is supposed to be displayed after the slash.

What specification or, as is more likely, what *implementation* are you
basing that statement on?

The fact is, regardless of actual implementations, in HTML (according to
SGML rules): <br> and <br/ are exactly equivalent. From that, we can see
that <br>>, <br>&gt; and <br/> (with or without any space before the
slash) are also equivalent. However, the way modern browsers implement
<br/> is to ignore the slash completely and carry on as if it weren't there.

For non-empty elements, the following are all equivalent, although
browsers only implement 1 and 4:
1. <p>foo</p>
2. <p/foo/
3. <p>foo</>
4. <p>foo
5. <p/foo
6. <>foo

(with the exception of white space differences at the end of the
last 3; and for 6. only when preceded by another unclosed <p> element)
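
As a rough illustration of those equivalences, the following Python sketch rewrites NET forms into the ordinary long form. It is a toy under stated assumptions: no attributes, no nested NET-enabled elements, no mixing of a NET-enabling start-tag with an explicit end-tag, and EMPTY_ELEMENTS is only a hand-picked subset of the HTML 4.01 empty elements. The names expand_net, NET_START and EMPTY_ELEMENTS are invented for the example.

```python
import re

# Hand-picked subset of the elements HTML 4.01 declares EMPTY (assumption).
EMPTY_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}

# "<" + element name + optional white space + "/" (attributes not supported).
NET_START = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\s*/")


def expand_net(markup):
    """Rewrite NET-enabling start-tags and null end-tags in long form."""
    out = []
    pos = 0
    while True:
        m = NET_START.search(markup, pos)
        if m is None:
            out.append(markup[pos:])
            return "".join(out)
        name = m.group(1)
        out.append(markup[pos:m.start()])
        out.append(f"<{name}>")
        pos = m.end()
        if name.lower() in EMPTY_ELEMENTS:
            continue  # EMPTY element: the slash has already closed it
        end = markup.find("/", pos)
        if end == -1:
            continue  # no null end-tag: left open, as in form 5 above
        out.append(markup[pos:end])
        out.append(f"</{name}>")
        pos = end + 1


for sample in ("<br/>", "<p/foo/", "<p/foo", "<p>Paragraph 1 <br / test</p>"):
    print(sample, "->", expand_net(sample))
```

Under these rules "<br/>" comes out as "<br>>", "<p/foo/" as "<p>foo</p>", and the line from the Amaya test file as "<p>Paragraph 1 <br> test</p>", with " test" kept as text rather than ignored.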

> The exact meaning of forward slash is: "ignore everything until another tag".
> By its function it is very close to single line comment in programming
> languages.

Again, where on earth did you hear (or, more likely, make up) that nonsense?

VK

unread,
Mar 27, 2006, 6:55:18 AM3/27/06
to

Lachlan Hunt wrote:
> What specification or, as is more likely, what *implementation* are you
> basing that statement on?

I gave you an HTML sample, DOM structure as it is in Amaya (supporting
NET tag), as well as download link to Amaya browser on W3C site to
repeat the test independently.

If you have any troubles to access the text, please let me know so I
would repost it.

If you think that W3C's own implementation of forward slash is not
correct, you may inform W3C about it and explain to them what does it
really mean and how does it really suppose to work.

Until it's confirmed as an Amaya implementation bug (and so far it is
not), my explanation remains totally correct while yours remains...
well, simply your personal opinion contradicting to W3C's one.

Lachlan Hunt

unread,
Mar 27, 2006, 8:55:31 AM3/27/06
to
VK wrote:
> Lachlan Hunt wrote:
>> What specification or, as is more likely, what *implementation* are you
>> basing that statement on?
>
> I gave you an HTML sample, DOM structure as it is in Amaya (supporting
> NET tag),

All you supplied was evidence that Amaya does not implement SHORTTAG NET
because its result is clearly not correct; yet you still tried to use it
as proof that you know what you're talking about.

> Until it's confirmed as an Amaya implementation bug (and so far it is
> not), my explanation remains totally correct while yours remains...
> well, simply your personal opinion contradicting to W3C's one.

OK, let's take a look at the markup validator's implementation, which is
built on top of one of the best SGML parsers available. See the source
code and the parse tree at the end of the result page:

http://validator.w3.org/check?uri=data%3Atext%2Fhtml%3Bcharset%3Dutf-8%2C%253C%21DOCTYPE%2520p%2520PUBLIC%2520%2522-%252F%252FW3C%252F%252FDTD%2520HTML%25204.01%252F%252FEN%2522%253E%250D%250A%253Cp%252Ftest%253Cbr%252Ftest&charset=%28detect+automatically%29&doctype=Inline&ss=1&sp=1&verbose=1

If you still don't believe what you see, how about you read up on it a
bit more.
http://www.is-thought.co.uk/book/sgml-9.htm#NET

If that's still not good enough proof, then I guess there's little hope
for you, but you could get yourself a copy of Goldfarb's SGML handbook
(your local library may have a copy) and find out right from the source.

VK

unread,
Mar 27, 2006, 9:26:56 AM3/27/06
to

Lachlan Hunt wrote:
> All you supplied was evidence that Amaya does not implement SHORTTAG NET
> because its result is clearly not correct; yet you still tried to use it
> as proof that you know what you're talking about.

Don't be funny now, really... Amaya is not just a browser: it is a
"mission statement" browser and editor made by W3C. Whatever W3C thinks
it must be - it is in Amaya, even if it doesn't exist anywhere else on
the Web (this is why this browser is hardly usable / non-usable for
conventional browsing:- but it is a great tool to "translate" some dark
W3C texts). Evidently it is always under strict W3C surveillance.
At
<http://dev.w3.org/cvsweb/~checkout~/Amaya/doc/Amaya_open_bugs.html?content-type=text/html&only_with_tag=HEAD>

you may find all open bugs in Amaya. btw among the recent ones you
can see one filed by Berners-Lee (this January).

NET tag behavior is not among them, so it is considered right. So you
have a choice: either trash all books you mentioned and read W3C specs
all over again trying to find your misreading point. Or you can drop
reading W3C specs and by using the mentioned books form your own
organization of "True W3C specs readers". Yours to choose ;-)

VK

unread,
Mar 27, 2006, 10:05:01 AM3/27/06
to

VK wrote:
> The exact
> meaning of forward slash is: "ignore everything until another tag". By
> its function it is very close to single line comment in programming
> languages.

Which makes me wonder if NET sign in SGML and // for single line
comment in C-branch of languages are coming from the same proto-source.
That would make perfect sense: C-makers took a known sign with a close
function and just doubled it to not collide with the division sign.

I may look at old docs at my spare time one day out of curiosity (and
yes, I know about the cat ;-)

VK

unread,
Mar 27, 2006, 11:10:36 AM3/27/06
to
Sorry for making "addon" posts, but I did not initially get the
reference to <http://validator.w3.org/>
Of course it validates the sample code I posted because from the W3C's
point of view this nonsense is a valid HTML - and Amaya shows what does
it suppose to mean and how to be handled.

I guess "nonsense" is not completely correct though as it has well
defined sense and behavior in some tri-w'ed minds. It is a nonsense to
try to push it on the browser market (into practical use) - this is
what I meant.

David Dorward

unread,
Mar 27, 2006, 1:23:48 PM3/27/06
to
VK wrote:

> Don't be funny now, really... Amaya is not just a browser: it is a
> "mission statement" browser and editor made by W3C.

According to the front page of the Amaya website it is a "showcase". It
doesn't say that it is either bug free or a normative guide to the
specifications it implements.

David Dorward

unread,
Mar 27, 2006, 1:27:46 PM3/27/06
to
VK wrote:

> Of course it validates the sample code I posted because from the W3C's
> point of view this nonsense is a valid HTML

Yes, it is valid.

> - and Amaya shows what does it suppose to mean and how to be handled.

No, it doesn't.

Did you, as Lachlan recommended, pay special attention to the parse tree
view the validator generated?

> I guess "nonsense" is not completely correct though as it has well
> defined sense and behavior in some tri-w'ed minds. It is a nonsense to
> try to push it on the browser market (into practical use) - this is
> what I meant.

I don't remember anybody suggesting using SHORTTAG NET on the WWW, or trying
to persuade browsers to implement it correctly.

What has been stated is that XHTML 1.0 that follows all the suggestions of
Appendix C is liable to cause problems in browsers that (a) support SHORTTAG
NET correctly and (b) try to parse the XHTML as HTML (and so it is
generally a better idea to transform such markup to HTML 4.01 before
serving it to the client).

Lachlan Hunt

unread,
Mar 28, 2006, 1:37:17 AM3/28/06
to
VK wrote:
> VK wrote:
>> The exact
>> meaning of forward slash is: "ignore everything until another tag". By
>> its function it is very close to single line comment in programming
>> languages.
>
> Which makes me wonder if NET sign in SGML and // for single line
> comment in C-branch of languages are coming from the same proto-source.
> That would make perfect sense: C-makers took a known sign with a close
> function and just doubled it to not collide with the division sign.

Hmm. Interesting idea. Following the current trend of ignoring the
facts that a) '/' is not, in any way, a comment marker in SGML or any
implementation of HTML, and b) The C programming language predates SGML
by around 2 decades; it's entirely plausible that the makers of C were
inspired by a copy of Goldfarb's SGML Handbook that fell through a time
warp.

> I may look at old docs at my spare time one day out of curiosity (and
> yes, I know about the cat ;-)

You do that - wasting time is always fun :-).

VK

unread,
Mar 28, 2006, 3:41:36 AM3/28/06
to

Lachlan Hunt wrote:
> VK wrote:
> > VK wrote:
> >> The exact
> >> meaning of forward slash is: "ignore everything until another tag". By
> >> its function it is very close to single line comment in programming
> >> languages.
> >
> > Which makes me wonder if NET sign in SGML and // for single line
> > comment in C-branch of languages are coming from the same proto-source.
> > That would make perfect sense: C-makers took a known sign with a close
> > function and just doubled it to not collide with the division sign.
>
> Hmm. Interesting idea. Following the current trend of ignoring the
> facts that a) '/' is not, in any way, a comment marker in SGML or any
> implementation of HTML

/ - "ignore everything until the next token or until null-tag" (SGML)

// - "ignore everything until the end of line" (C)

You don't see any correlation?

> b) The C programming language predates SGML
> by around 2 decades

You may want to check your calendar. GML (pre-SGML) was developed by
Goldfarb and Co back to 60's. Ritchie developed C back to 70's. I
presume the famous "C Language" book was published in 1974 (?)

Not to say I care about this useless academical crap (SHORTTAG, NET,
null-tag etc.) They have no outcome for the reality and never will. But
semantical drift and inheritance (in human languages though) is my
professional hobby.
