XML and lisp

Jacek Generowicz

Aug 23, 2001, 5:34:00 AM

I've been doing my best to ignore XML thus far, but repeatedly
encountering comparisons of XML to lisp has piqued my interest. I am
wondering whether I can advance my understanding of lisp by learning
about its relation to XML.

Can you recommend any books or URLs which could help me to learn
about XML, with the aim of being able to discuss intelligently the
relative merits of the two?

Jacek

Marco Antoniotti

Aug 23, 2001, 9:42:48 AM

Jacek Generowicz <j...@ecs.soton.ac.uk> writes:

> I've been doing my best to ignore XML thus far, but repeatedly
> encountering comparisons of XML to lisp has piqued my interest. I am
> wondering whether I can advance my understanding of lisp by learning
> about its relation to XML.

Roughly said (and with my agent provocateur hat on :) ) XML is a
re-invention of the wheel. The wheel is Lisp.

Cheers

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.

Ian Wild

Aug 23, 2001, 10:14:54 AM

Marco Antoniotti wrote:
>
> Jacek Generowicz <j...@ecs.soton.ac.uk> writes:
>
> > I've been doing my best to ignore XML thus far, but repeatedly
> > encountering comparisons of XML to lisp has piqued my interest. I am
> > wondering whether I can advance my understanding of lisp by learning
> > about its relation to XML.
>
> Roughly said (and with my agent provocateur hat on :) ) XML is a
> re-invention of the wheel. The wheel is Lisp.

Similarities:

-o- both are regarded by some as the best thing since sliced bread

-o- both go in heavily for balanced delimiters

-o- both are regarded as overly-bracketful by many people

Differences:

-o- One is a text markup language with little or no semantics

-o- One is a programming language with little or no syntax

Kaz Kylheku

Aug 23, 2001, 11:10:43 AM

In article <g07kvvj...@scumbag.ecs.soton.ac.uk>, Jacek Generowicz wrote:
>I've been doing my best to ignore XML thus far, but repeatedly
>encountering comparisons of XML to lisp has piqued my interest.

Comparisons between XML and Lisp as a whole are meaningless.
Only comparisons between XML and Lisp as a data representation are
meaningful. XML is not a programming language, it is merely a syntax
for data representation which squanders bandwidth, memory and processing
time. Lisp as a data representation is more frugal. It's close to being
as compact as you can make a notation for structured data while remaining
in readable plain text.

Tim Bradshaw

Aug 23, 2001, 11:17:57 AM

Ian Wild <i...@cfmu.eurocontrol.int> writes:
> Differences:
>
> -o- One is a text markup language with little or no semantics
>
> -o- One is a programming language with little or no syntax

((:reply :title "Lisp is not just a programming language")
(:body
(:p "It is also a text-markup language,
and many other things, as you can see here"
"For instance with a suitable (small) macro, this is quite legal
Lisp syntax, which is compiled to *ML. I have written significantly-sized
documents in this notation."))
(:signature "--tim"))

Bob Bane

Aug 23, 2001, 12:11:24 PM

I've been using this as a .signature line on Slashdot for a while:

To a Lisp hacker, XML is S-expressions in drag.

--
Remove obvious stuff to e-mail me.
Bob Bane

Kaz Kylheku

Aug 23, 2001, 12:57:55 PM

In article <3B852B2C...@removeme.gst.com>, Bob Bane wrote:
>I've been using this as a .signature line on Slashdot for a while:
>
> To a Lisp hacker, XML is S-expressions in drag.

To every other hacker, XML is just a drag, period.

Marco Antoniotti

Aug 23, 2001, 2:30:29 PM

Ian Wild <i...@cfmu.eurocontrol.int> writes:

> Marco Antoniotti wrote:
> >
> > Jacek Generowicz <j...@ecs.soton.ac.uk> writes:
> >
> > > I've been doing my best to ignore XML thus far, but repeatedly
> > > encountering comparisons of XML to lisp has piqued my interest. I am
> > > wondering whether I can advance my understanding of lisp by learning
> > > about its relation to XML.
> >
> > Roughly said (and with my agent provocateur hat on :) ) XML is a
> > re-invention of the wheel. The wheel is Lisp.
>
> Differences:
>
> -o- One is a text markup language with little or no semantics
>
> -o- One is a programming language with little or no syntax

I like this one! :)

Graham Ward

Aug 23, 2001, 6:06:01 PM

Jacek Generowicz <j...@ecs.soton.ac.uk> writes:

http://www-formal.stanford.edu/jmc/cbcl.html

g

> Jacek

Erik Naggum

Aug 24, 2001, 3:21:01 AM

* Tim Bradshaw <t...@tfeb.org>

> ((:reply :title "Lisp is not just a programming language")
> (:body
> (:p "It is also a text-markup language,
> and many other things, as you can see here"
> "For instance with a suitable (small) macro, this is quite legal
> Lisp syntax, which is compiled to *ML. I have written significantly-sized
> documents in this notation."))
> (:signature "--tim"))

As long as we think aloud in alternative syntaxes, I actually prefer to
break the _incredibly_ stupid syntactic-only separation of elements and
attribute values. SGML and its descendants have made a crucial mistake:
For every level of container (there are about 7 of them), there is a new
syntax for _two_ properties of the container: (1) the contents is wrapped
in one syntax, but (2) the "writing on the box" is in quite another.
This means that information and meta-information are massively different
concepts, and this artificial separation runs through the whole SGML
design. Each level offers a new way to write the two differently. This
is what makes it so goddamn hard to reason about SGML documents and to do
reasonably intelligent transformations on them without working your butt
off specifying all sorts of irrelevant stuff that does _nothing_ but get
in your way.

I have come to _loathe_ the half-assed hybrid that some XML-in-Lisp tools
use and produce, because it makes XML just as evil in Lisp as it was in
XML to begin with, and we have gained absolutely nothing in either power
of processing or in abstraction, which is so very un-Lisp-like.

<foo bar="zot">quux</foo>

should be read as

(foo (bar "zot") "quux")

and most definitely _NOT_ as ((:foo :bar "zot") "quux"), which turns this
fairly reasonable structure into a morass of complexity worse than it was
to begin with. And it does _NOT_ help to represent empty elements only
with a keyword. Using three different levels of nesting to represent a
single concept is Just Plain Wrong. Also, using keywords is not a good
idea because there needs to be a lot of related information associated
with elements and attributes, in different contexts, not to mention all
the things they do with their funny "namespaces" these days.

Whether something is an attribute or element is _completely_ arbitrary.
It is based on some arbitrary choices in the design process that reveal
absolutely no inherent qualities. For purely pragmatic reasons, SGML
folks will use attributes for some things and elements for others because
their tools can deal with some things in attributes and some things in
elements. The faulty idea that attributes say something "about" the
element and sub-elements somehow constitute their contents is the same
premature structuring that premature optimization of code suffers from.
The whole language is incredibly misdesigned in making that distinction.

As for writing SGML/XML/HTML/whatever, I have a simple way to get rid of
the annoying verbosity of these stupid languages while _retaining_ that
mistake between attribute values and elements, because it is quite hard
to make simple regular expression-based conversions retain enough data
about an element to decide what should be attribute and element. An
element has the form <name [attributes] | [contents]>. Attributes have
the form <name | value>. Internal whitespace is only for readability.

XML                          Enamel (NML)            CL
<foo/>                       <foo>                   (foo)
<foo bar="zot"/>             <foo <bar|zot>>         (foo (bar "zot"))
<foo>zot</foo>               <foo|zot>               (foo "zot")
<foo bar="zot">quux</foo>    <foo <bar|zot> |quux>   (foo (bar "zot") "quux")
<foo>Hey, &quux;!</foo>      <foo|Hey, [quux]!>      (foo "Hey, " quux "!")
<foo>AT&T you will</foo>     <foo|AT&T you will>     (foo "AT&T you will")
<foo><bar>zot</bar></foo>    <foo|<bar|zot>>         (foo (bar "zot"))

So I have almost none of the annoying and arbitrary quote/escape mania in
attribute values or contents alike, either. Entities I write as [name],
and they end up in the Lisp version as symbols if not the character they
represent purely for syntactic reasons. Writing "code" in this language
is actually amazingly painless compared to the produced noise. Besides,
with a few simple modify-syntax-entry calls in Emacs, I get < and > to
match and blink and I can move up and down the structure very easily.

For processing this stuff in Common Lisp, it is _sometimes_ neat to
convert the single | attribute/content marker into the zero-length
symbol, ||, so pathological cases like

<foo bar="zot"><bar>zot</bar></foo>

which could have been written like this to show how arbitrary the
syntactic distinction in SGML/XML is

<foo <bar|zot> | <bar|zot>>

come out as

(foo (bar "zot") || (bar "zot"))

The really interesting thing is that writing in Enamel and producing XML
is so easy that a simple Perl or Lisp function that takes an Enamel
string as argument and produces XML is quite simple and straight-
forward. This makes for some interesting-looking "scripting" that blows
the mind of the miserable little wrecks that think they have to type the
endtag, the quotes and all the other user-inimical features of SGML/XML.
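
A minimal sketch of such a converter, in Python rather than Perl or Lisp.
The grammar is an assumption inferred from the table above: attributes come
before the single |, contents after it, and [name] is an entity reference.
The name enamel_to_xml and its helpers are hypothetical, and unlike the
table this sketch escapes a bare & as &amp; so the output is well-formed XML.

```python
from xml.sax.saxutils import escape, quoteattr

NAME_EXTRA = "-_.:"  # assumed set of non-alphanumeric name characters


def _skip_ws(s, i):
    """Skip whitespace, which Enamel treats as readability-only between parts."""
    while i < len(s) and s[i].isspace():
        i += 1
    return i


def _name(s, i):
    """Read an element or attribute name starting at s[i]."""
    j = i
    while j < len(s) and (s[j].isalnum() or s[j] in NAME_EXTRA):
        j += 1
    if j == i:
        raise ValueError(f"expected a name at offset {i}")
    return s[i:j], j


def _element(s, i):
    """Parse one <name attrs | contents> at s[i]; return (xml, next_index)."""
    if s[i] != "<":
        raise ValueError(f"expected '<' at offset {i}")
    name, i = _name(s, i + 1)
    i = _skip_ws(s, i)
    attrs = ""
    while s[i] == "<":  # attributes: <name|value>, before the element's own |
        key, i = _name(s, i + 1)
        i = _skip_ws(s, i)
        if s[i] != "|":
            raise ValueError(f"expected '|' at offset {i}")
        end = s.index(">", i)
        attrs += f" {key}={quoteattr(s[i + 1:end])}"
        i = _skip_ws(s, end + 1)
    if s[i] == ">":  # no | marker: empty element
        return f"<{name}{attrs}/>", i + 1
    if s[i] != "|":
        raise ValueError(f"expected '|' or '>' at offset {i}")
    i += 1
    parts = []
    while s[i] != ">":
        if s[i] == "<":  # nested element
            child, i = _element(s, i)
            parts.append(child)
        elif s[i] == "[":  # entity reference written as [name]
            end = s.index("]", i)
            parts.append(f"&{s[i + 1:end]};")
            i = end + 1
        else:  # plain text up to the next special character
            j = i
            while j < len(s) and s[j] not in "<>[":
                j += 1
            parts.append(escape(s[i:j]))
            i = j
    return f"<{name}{attrs}>{''.join(parts)}</{name}>", i + 1


def enamel_to_xml(src):
    """Convert a single Enamel element to XML text."""
    try:
        xml, _ = _element(src.strip(), 0)
    except IndexError:
        raise ValueError("unterminated Enamel element") from None
    return xml


print(enamel_to_xml("<foo <bar|zot> |quux>"))
```

Whitespace inside the contents is kept as text here; only whitespace
between the name, the attributes, and the | marker is skipped.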

In my personal view, Lisp "markup" has the disadvantage of needing lots
of quotes, while Enamel has the strong advantage that in <xxx|yyy>, xxx
is always symbolic and yyy is always a string of characters subject to
interpretation by whatever the symbolic part instructs in context.

Since the key feature of markup languages is the separation of text from
markup, the simple idea in Enamel should carry enough force to make this
a fully realizable goal without making an artificial syntactic separation
between information and meta-information at any level. If the syntax is
good enough for the information, it should be good enough for the meta-
information, and I think Enamel is. Fortunately, I do not have to create
a whole new international following and engage in godawful politics to
use a better syntax for XML and the like, since XML and the like are only
used as interchange syntaxes these days. Nobody in their right mind
actually writes anything by hand in such stupid languages that require so
much attention to incredibly insignificant details and incomprehensibly
irrelevant redundancy, anyway, do they? :)

Finally, note that in Enamel, a complete element is enclosed in <...> and
that means it can be subject to a nice little Common Lisp reader macro,
and it can be taught to recognize other stuff, as well, such as the neat
concept of interpolating expression values where {expression} occurs.

Still at "internal use" stage, I plan to publish some stuff about Enamel
not too far into the future.

///
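
The reading argued for in the post above, where attributes and sub-elements
both become ordinary children, can be sketched as follows. This uses
Python's standard xml.etree.ElementTree; the names to_uniform and read_xml
are hypothetical, nested Python lists stand in for Lisp lists, and entity
references are not handled.

```python
import xml.etree.ElementTree as ET


def to_uniform(elem):
    """Turn an element into [name, child, ...] with attributes as children."""
    node = [elem.tag]
    # Attributes become two-item children, exactly like a sub-element
    # whose only content is a string.
    for name, value in elem.attrib.items():
        node.append([name, value])
    if elem.text:
        node.append(elem.text)
    for child in elem:
        node.append(to_uniform(child))
        if child.tail:  # text that follows the child inside this element
            node.append(child.tail)
    return node


def read_xml(text):
    """Parse XML text and return the uniform nested-list form."""
    return to_uniform(ET.fromstring(text))


print(read_xml('<foo bar="zot">quux</foo>'))
```

Both <foo bar="zot"/> and <foo><bar>zot</bar></foo> come out as the same
list, matching the identical CL column entries in the table.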

Tim Bradshaw

Aug 24, 2001, 6:29:25 AM

Erik Naggum <er...@naggum.net> writes:

> <foo bar="zot">quux</foo>
>
> should be read as
>
> (foo (bar "zot") "quux")
>
> and most definitely _NOT_ as ((:foo :bar "zot") "quux"), which turns this
> fairly reasonable structure into a morass of complexity worse than it was
> to begin with. And it does _NOT_ help to represent empty elements only
> with a keyword. Using three different levels of nesting to represent a
> single concept is Just Plain Wrong. Also, using keywords is not a good
> idea because there needs to be a lot of related information associated
> with elements and attributes, in different contexts, not to mention all
> the things they do with their funny "namespaces" these days.
>

I don't think I disagree with any of this - my lhtml hack was never
meant as more than that - it originated out of dissatisfaction with
the WITH-x syntax that CL-HTTP uses which is really painful to type,
and you also need to define millions of macros, and it's only meant to
be better than that. I consciously ignored the whole namespace stuff,
because I was really only interested in spitting out something a
browser could render efficiently, and embedding it in lisp programs in
such a way that I can skip easily between lhtml and lisp (the macro
just checks if the car is a keyword basically...). So really I just
want to say that I'm not proposing the syntax I gave as anything other
than a quick hack.

I'm curious about your syntax though: If I want to go from Lisp to
something (rather than from something to Lisp), it seems that the
syntax you give is ambiguous because of this (I cut the lines that
don't seem relevant).

> XML                          Enamel (NML)        CL
> <foo bar="zot"/>             <foo <bar|zot>>     (foo (bar "zot"))
> <foo><bar>zot</bar></foo>    <foo|<bar|zot>>     (foo (bar "zot"))

What I considered for my own hack was to avoid the whole ((...) ...)
thing by always requiring an attribute list, which could be nil, so
these would come out as

(foo (bar "zot")) and (foo () (bar () "zot")) respectively. But for
most cases that was more typing than I liked (since I was typing in
Lisp not a better syntax).

>
> In my personal view, Lisp "markup" has the disadvantage of needing lots
> of quotes, while Enamel has the strong advantage that in <xxx|yyy>, xxx
> is always symbolic and yyy is always a string of characters subject to
> interpretation by whatever the symbolic part instructs in context.
>

Yes, this is a really good point, the quoting gets tedious.

--tim

Kent M Pitman

Aug 24, 2001, 11:19:56 AM

Erik Naggum <er...@naggum.net> writes:

>
> * Tim Bradshaw <t...@tfeb.org>
> > ((:reply :title "Lisp is not just a programming language")
> > (:body
> > (:p "It is also a text-markup language,
> > and many other things, as you can see here"
> > "For instance with a suitable (small) macro, this is quite legal
> > Lisp syntax, which is compiled to *ML. I have written significantly-sized
> > documents in this notation."))
> > (:signature "--tim"))
>
> As long as we think aloud in alternative syntaxes, I actually prefer to
> break the _incredibly_ stupid syntactic-only separation of elements and
> attribute values. SGML and its descendants have made a crucial mistake:
> For every level of container (there are about 7 of them), there is a new
> syntax for _two_ properties of the container: (1) the contents is wrapped
> in one syntax, but (2) the "writing on the box" is in quite another.

Certainly what you say is undeniably true in terms of practice, and I'd even
give you that the notational distinction is not worth the mechanism, but
is there somewhere that the language actually forces this "role" relationship?

I wrote a package in Java at a prior employer which automatically generated
XML representations for classes as elements based on Java metadata, and the
tack I took was not that the XML attributes contain meta-data and the contents
data but rather that the XML attributes contain atomic data and the
contents contain compound data, since this is IN FACT what the real distinction
is. Some type Foo with

int x;
Date y;
ElementList z;

might produce

In effect, what I got out of this was a description that allowed two
syntaxes: an easy syntax for easy things, and a hard syntax for hard
things. Of course, there are all kinds of problems even then because of
subclass relationships (the analog of the problem of *print-readably*
and strings of base-char vs char, or the loss of fill-pointer, etc. in
printing a string. Fixing these gets very verbose very quickly.
So I'm not defending the notation in that regard.)
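
The atomic-versus-compound rule described above can be sketched roughly as
follows. This is not the Java package mentioned in the post; it is a Python
illustration in which a dict stands in for an object's fields. The names
to_xml and ATOMIC, the choice to treat a date as an atomic string, and the
sample values are all assumptions.

```python
from xml.sax.saxutils import quoteattr

# Assumed notion of "atomic": values that fit in a single attribute string.
ATOMIC = (str, int, float)


def to_xml(tag, fields):
    """Write atomic fields as attributes and compound fields as sub-elements."""
    attrs = ""
    children = ""
    for key, value in fields.items():
        if isinstance(value, ATOMIC):
            attrs += f" {key}={quoteattr(str(value))}"
        elif isinstance(value, dict):
            children += to_xml(key, value)  # compound data nests as content
        else:
            raise TypeError(f"unsupported field type for {key!r}")
    if children:
        return f"<{tag}{attrs}>{children}</{tag}>"
    return f"<{tag}{attrs}/>"


# Hypothetical instance of the Foo type with fields x, y, and z.
print(to_xml("Foo", {"x": 3, "y": "2001-08-24", "z": {"length": 2}}))
```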

But what I'm really wondering is whether SGML has some "intended use" spec
that tells you that you have to put meta-info in the "car" of the "form",
and info in the "cdr". I thought the use of these containers was
semantics-free.

> I have come to _loathe_ the half-assed hybrid that some XML-in-Lisp tools
> use and produce, because it makes XML just as evil in Lisp as it was in
> XML to begin with, and we have gained absolutely nothing in either power
> of processing or in abstraction, which is so very un-Lisp-like.
>
> <foo bar="zot">quux</foo>
>
> should be read as
>
> (foo (bar "zot") "quux")
>

Maybe. Macsyma used a similar notation for years (though without the restriction
on container-ness). I don't think the answer is to change to do the rewrite
you suggest. I don't understand why it's not natural to add the
following as legal syntaxes:

<foo bar=<zot/>>

or

<foo bar=<string>zot</string>>quux</foo>
This would keep people from feeling the attribute list was a shorthand
area and would also allow the storing of complex meta-data. Right now,
the fact that a use of <...> in the attribute thing seems a terrible waste.
The only rationales I can figure for this were either the desire to
periodically beat someone on the back of the hand for syntax errors
by having a regular application of over-applied syntax or else some sort of
efficiency bum to make the acquisition of strings in the attribute list
uselessly faster. Do you know what the reason was that recursive structures
were not allowed in this position in XML?

Or perhaps it was the fact that the "real world" substitutes for "parsed
structure" things like that weird assembly code like notation which looks
like

(A
AHREF=foo.html
-Text
)A

Perhaps someone was just being uncreative about how a compound-structure
could be offered as an attribute.

> As for writing SGML/XML/HTML/whatever, I have a simple way to get rid of
> the annoying verbosity of these stupid languages while _retaining_ that
> mistake between attribute values and elements, because it is quite hard
> to make simple regular expression-based conversions retain enough data
> about an element to decide what should be attribute and element. An
> element has the form <name [attributes] | [contents]>. Attributes have
> the form <name | value>. Internal whitespace is only for readability.
>
> XML                          Enamel (NML)            CL
> <foo/>                       <foo>                   (foo)
> <foo bar="zot"/>             <foo <bar|zot>>         (foo (bar "zot"))
> <foo>zot</foo>               <foo|zot>               (foo "zot")
> <foo bar="zot">quux</foo>    <foo <bar|zot> |quux>   (foo (bar "zot") "quux")
> <foo>Hey, &quux;!</foo>      <foo|Hey, [quux]!>      (foo "Hey, " quux "!")
> <foo>AT&T you will</foo>     <foo|AT&T you will>     (foo "AT&T you will")
> <foo><bar>zot</bar></foo>    <foo|<bar|zot>>         (foo (bar "zot"))
>

> In my personal view, Lisp "markup" has the disadvantage of needing lots
> of quotes, while Enamel has the strong advantage that in <xxx|yyy>, xxx
> is always symbolic and yyy is always a string of characters subject to
> interpretation by whatever the symbolic part instructs in context.

I'd like to see a side-by-side elaboration of this problem to better
understand it.

> Still at "internal use" stage, I plan to publish some stuff about Enamel
> not too far into the future.

Good. I'd hate for it to be "lost" as merely a post here, though I think
it's fun that you felt comfortable in sharing your thoughts.

Erik Naggum

Aug 24, 2001, 11:17:11 AM

* Tim Bradshaw <t...@tfeb.org>

> I'm curious about your syntax though: If I want to go from Lisp to
> something (rather than from something to Lisp), it seems that the syntax
> you give is ambiguous because of this (I cut the lines that don't seem
> relevant).
>
> > XML                          Enamel (NML)        CL
> > <foo bar="zot"/>             <foo <bar|zot>>     (foo (bar "zot"))
> > <foo><bar>zot</bar></foo>    <foo|<bar|zot>>     (foo (bar "zot"))

The key to this is the relationship between foo and bar. Whether bar is
an attribute or a sub-element of foo is irrelevant to processing them,
but when you need to turn this back into SGML/XML/Enamel, you need to
know which it is. This is why I said:

As for writing SGML/XML/HTML/whatever, I have a simple way to get rid
of the annoying verbosity of these stupid languages while _retaining_
that mistake between attribute values and elements, because it is quite
hard to make simple regular expression-based conversions retain enough
data about an element to decide what should be attribute and element.

... implying that I would normally have such information and use it when
generating attribute/value or sub-element/contents.

///
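
One way to read this reply: the uniform structure is enough for processing,
and a separate description of the document type says which children are
written back as attributes. A minimal sketch, assuming a hypothetical set
of (element, child) pairs standing in for that description; write_xml and
attribute_names are illustrative names, and nested Python lists stand in
for Lisp lists.

```python
from xml.sax.saxutils import escape, quoteattr


def write_xml(node, attribute_names):
    """Serialize [name, child, ...] using attribute_names to pick the syntax."""
    name, *children = node
    attrs = ""
    body = ""
    for child in children:
        if isinstance(child, str):
            body += escape(child)
        elif (name, child[0]) in attribute_names:
            # Assumed: a child declared as an attribute holds one string value.
            attrs += f" {child[0]}={quoteattr(child[1])}"
        else:
            body += write_xml(child, attribute_names)
    if body:
        return f"<{name}{attrs}>{body}</{name}>"
    return f"<{name}{attrs}/>"


tree = ["foo", ["bar", "zot"]]
print(write_xml(tree, {("foo", "bar")}))  # bar declared as an attribute of foo
print(write_xml(tree, set()))             # nothing declared: bar is a sub-element
```

The same list serializes either way; only the external declaration differs.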

Barry Fishman

Aug 24, 2001, 11:46:40 AM

Tim Bradshaw <t...@tfeb.org> writes:

> Erik Naggum <er...@naggum.net> writes:
>
>> XML                          Enamel (NML)        CL
>> <foo bar="zot"/>             <foo <bar|zot>>     (foo (bar "zot"))
>> <foo><bar>zot</bar></foo>    <foo|<bar|zot>>     (foo (bar "zot"))
>
> What I considered for my own hack was to avoid the whole ((...) ...)
> thing by always requiring an attribute list, which could be nil, so
> these would come out as
>
> (foo (bar "zot")) and (foo () (bar () "zot")) respectively. But for
> most cases that was more typing than I liked (since I was typing in
> Lisp not a better syntax).

Wouldn't attribute lists need to have a more `let'-like syntax (and behavior)?

(foo ((bar "zot")) "text")

or for some HTML:

(font ((size 10) (color :yellow)) "text")

Which is just a lisp program, after applying even my minimal skills
with macros. The hard part is not overwhelming the text.

(fragment ((layout :html-like) (feeling :pompous)) "
SGML and TeX, being just markup, did their best to preserve the bulk
of text without any transformation. Their goal is to take a normal
text document and " (tquote "mark it up") " for computer interpretation.
SGML markers are ugly but they weren't intended to dominate the file.
" (p) "
As people's interest has moved from SGML to XML, they now talk more of
" (italic "structured data") ", and although this is a somewhat subtle
change of mindset, it makes the markup the dominant part of the file.
Unfortunately, once people start down a course of action they rarely
stop to consider if the original design guidelines and intent may have
been lost.
" (p) "
Just by " (emph "standardizing") " a straightforward mapping of XML
into and back from lisp, the ugliness and verbosity of XML would be
less of an issue. You could use the syntax you liked. I suspect when
the enthusiasm for XML has died down a bit, the benefits of a
standardized lisp notation could become better recognized.
" (p) "
Without such standards, of course, forget it.
")

You do need to step around the native lisp functions like quote.

Barry Fishman

Erik Naggum

Aug 24, 2001, 1:05:27 PM

* Barry Fishman <barry_...@acm.org>

> Wouldn't attribute lists need to have a more `let'-like syntax (and behavior)?

No. Please forget the attributes. There _are_ no attributes. Whether
something is an attribute or not is completely arbitrary and irrelevant.
Your access to that information is _not_ dependent on its representation
in SGML/XML. Treat everything as a subordinate element. This is the key
idea to gaining power of abstraction over the XML data. Holding on to
the mythical distinction between attribute and sub-element is the key
idea to losing any and all power of abstraction.

> Just by " (emph "standardizing") " a straightforward mapping of XML into
> and back from lisp, the ugliness and verbosity of XML would be less of an
> issue. You could use the syntax you liked. I suspect when the
> enthusiasm for XML has died down a bit, the benefits of a standardized
> lisp notation could become better recognized.

Please understand that that is what I was trying to do. The only way to
deal with the mistake that they made in syntactically separating
attributes from contents is to undo that mistake. Any and all catering
to it is only making it worse.

> You do need to step around the native lisp functions like quote.

Huh?

///

Erik Naggum

Aug 24, 2001, 4:03:19 PM

* Kent M Pitman <pit...@world.std.com>

> Certainly what you say is undeniably true in terms of practice, and I'd even
> give you that the notational distinction is not worth the mechanism, but
> is there somewhere that the language actually forces this "role" relationship?

No, there is nothing that requires there to be element attributes as a
distinct concept from element contents. There are, however, a number of
practical things that follow from making that arbitrary distinction which
can look like rationales, but if you ask yourself "why can it not be a
subelement", there are no real answers, only appeals to the idea that
there somehow _has_ to be a distinction. It took me years to figure out
that the whole attribute idea is completely vacuous, and I worked with
the creator of SGML himself for several years on several SGML-related
standards and projects. I started writing "A conceptual introduction to
SGML" back in 1994, but as I had pained my way through five chapters, I
had to realize that it was all wrong. There was a basic design mistake
in the whole language framework. That mistake is, simply put: "what
is good enough for the users of the language is not good enough for its
creators". Each and every level of "containership" in SGML has its own
syntax, optimized for the task. Each and every level has a different
syntax for "the writing on the box" as opposed to "the contents of the
box". This follows from a very simple, yet amazingly elusive principle
in its design: Meta-data is conceptually incompatible with data. This is
in fact wrong. Meta-data is only data viewed from a different angle, and
vice versa. SGML forces you to remain loyal to your chosen angle of view.

> I wrote a package in Java at a prior employer which automatically
> generated XML representations for classes as elements based on Java
> metadata, and the tack I took was not that the XML attributes contain
> meta-data and the contents data but rather that the XML attributes
> contain atomic data and the contents contain compound data, since this is
> IN FACT what the real distinction is.

The key to understanding this is that there is no _one_ real distinction.
There are in fact any number of "real distinctions". You just found one
way to wrap your world in the attribute/contents dichotomy because it was
there. What would you do if it was not? What would you do if you had
only sub-elements? Would you have _invented_ attributes? I do not think
anyone would have, because using sub-elements exacts no higher cost than
using attributes.

> In effect, what I got out of this was a description that allowed two
> syntaxes: an easy syntax for easy things, and a hard syntax for hard
> things.

I propose an easier syntax for the harder things and a slightly harder
syntax for the easier things so they do not impose any easy-vs-hard
misconceptions on the user and designer. By making both things cost the
same, the decision to use an attribute or a sub-element becomes a very
different choice.

> But what I'm really wondering is whether SGML has some "intended use"
> spec that tells you that you have to put meta-info in the "car" of the
> "form", and info in the "cdr". I thought the use of these containers was
> semantics-free.

The intended use has less to do with it than the notion that you can
define what is meta-information and what is information at the time you
want to decide whether something goes in an attribute or a sub-element.
My argument is that this is impossible. Whether it is meta-information
or information is a reflection of the actual use, not the intended use.

However, given that the mechanism was created, and I will argue that it
was not so much created as it was never thought possible to be any other
way, it was used to define several language properties. "Now that we
have this, would it not also be nice to have that." This means that
several of the attribute types grew very far apart from the contents of
sub-elements and you sort of "had" to use them as attributes, but only
sort of, because the application can and does define the semantics of
everything, and if you want ID and IDREF, you can make the same choice as
you would in Common Lisp to use symbols or a hash table of strings.

> > I have come to _loathe_ the half-assed hybrid that some XML-in-Lisp tools
> > use and produce, because it makes XML just as evil in Lisp as it was in
> > XML to begin with, and we have gained absolutely nothing in either power
> > of processing or in abstraction, which is so very un-Lisp-like.
> >
> > <foo bar="zot">quux</foo>
> >
> > should be read as
> >
> > (foo (bar "zot") "quux")
> >
>
> Maybe. Macsyma used a similar notation for years (though without the restriction
> on container-ness). I don't think the answer is to change to do the rewrite
> you suggest.

I cannot follow you here. I am not suggesting a rewrite. I suggest that
there is _no_ distinction between attribute and sub-element contents.
What I am trying to communicate is so emphatically _NOT_ syntax that we
will have a severe communications problem if this is not understood. The
syntax has a function, and I am challenging the _function_ of the syntax
that is believed by many people to support a concept I _also_ challenge.
What do you gain from the attribute-vs-contents dichotomy? Why do you
need it? What does it do for you? What would you have done if it were
not there? What choices and design decisions went into attributes that
would go into contents if you did not have attributes?

> I don't understand why it's not natural to add the
> following as legal syntaxes:
>
> <foo bar=<zot/>>
>
> or
>
> <foo bar=<string>zot</string>>quux</foo>

Imagine that all attributes are in fact sub-elements, and this problem
just goes away. Please, discard the concept of attributes. They no
longer exist. What used to be called "attributes" are only sub-elements
with special treatment and a whole bunch of arbitrary restrictions, one
of which is lack of internal structure (except insofar as defined by the
NOTATION attribute of attributes in SGML).

> This would keep people from feeling the attribute list was a shorthand
> area and would also allow the storing of complex meta-data.

But that is not my goal. My goal is to get rid of the idea that there is
a distinction that can be made once and for all, and prematurely at that,
that some information is meta-data and some information is data. The
core philosophical mistake in SGML is that you can specify these things
before you know them. SGML is great for after-the-fact description of
structures you already know how to deal with perfectly. It absolutely
sucks for structures that are in any way yet to be defined. This is
_because_ it is impossible to define what is considered meta-information
and what is considered information before you actually have a full-blown
software application that is hard to change your mind about. SGML was
supposedly designed to free data from the vagaries of software, but when
it adopted the attribute-content dichotomy, it dove right into dependency
on the software design process instead of the information design process.

> Do you know what the reason was that recursive structures were not
> allowed in this position in XML?

Yes, as a matter of fact, I do. Recursive structures are in fact allowed
in attribute values, provided that your application processes them and not
the SGML/XML parser. Back in the SGML days, the NOTATION attribute of
both elements and attribute values was designed as an "escape" to the
application to let some other syntax processor deal with the string of
characters. (Please understand that everything SGML/XML is a string of
characters. There are no _values_. Imposing valuedom on strings is the
kind of semantics that SGML/XML specifically does _not_ support.)

> Or perhaps it was the fact that the "real world" substitutes for "parsed
> structure" things like that weird assembly code like notation which looks
> like
>
> (A
> AHREF=foo.html
> -Text
> )A
>
> Perhaps someone was just being uncreative about how a compound-structure
> could be offered as an attribute.

No, they never actually thought of it that way. You have to understand
and appreciate that the design process for SGML was such that some people
had a very clear picture of the meta-information-vs-information dichotomy
and that it never occurred to anyone that meta-information had exactly
the same properties as information.

Whoever first decided to define HTML in such a way that unknown elements
should be displayed suffered from exactly the same problem. As a sorry
consequence, we have elements that have to contain _comments_ that are
the real contents because that somebody did not foresee the need to have
meta-information in contents. I argue that this is a result of "getting"
the invalid meta-information/information dichotomy. If that person had
not been bitten by the false idea that meta-information is fundamentally
different from information, he would have realized that there would be a
need to use element contents for meta-information, as well.

> Good. I'd hate for it to be "lost" as merely a post here, though I think
> it's fun that you felt comfortable in sharing your thoughts.

Well, it took ten years of discomfort with the "attribute" concept before
I went back to examine the genesis of the various forms of attributes and
persisted in asking the question "could it not have been done with
sub-elements", and finally found that the reason it could not was that
somebody did not _want_ it to be done with sub-elements, and that the
root cause of this was a fundamental misunderstanding of the relationship
between information and meta-information. Just like Plato and Aristotle
agreed that ideas and concepts were somehow "inherent" in the things we
saw and not a property of the person who observed and organized them in
his own mind, SGML embodies the false premise that structuring has some
inherent qualities and processing that structure should reflect its
inherent qualities. The result is that the processing defines the
structure. If there is a mismatch between the two, the result is a very
painful and elaborate processing, and it can be solved very simply by
removing the attribute/sub-contents dichotomy, because once we do that,
we return to first principles and can move forward with the same
knowledge and experience that created the attributes, but now we can do
it with sub-elements, instead, and I can promise you that once you start
off on that road, the least of your worries will be recursive structure
in attribute values.

///

Ray Blaak

Aug 24, 2001, 5:54:28 PM8/24/01

to

Erik Naggum <er...@naggum.net> writes:
> The key to understanding this is that there is no _one_ real distinction.
> There are in fact any number of "real distinctions". You just found one
> way to wrap your world in the attribute/contents dichotomy because it was
> there.

I fully agree with Erik here, and have myself implemented S-expression file
formats that in fact collapsed attributes to be just child elements in the
same way.

The only useful information was the name of some data, and the assumed or
explicit type of the data value. It made no difference in terms of processing
if the data was logically an "attribute" or an "element" -- the problem of
extraction is exactly the same in both cases.

That there is a difference with XML is only due to its artificial distinction
of attributes vs elements.

The only useful distinction I have found for attributes vs elements was
aesthetic: how did the element look to a human reader (i.e. me) of the XML
file? Whether I would choose a simple vs compound approach depended solely on
my mental picture of the data in question, e.g.

<foo name="joe" size="big"/>

vs

<foo>
  <name>joe</name>
  <size>big</size>
</foo>

In an S-expression format however, it doesn't matter, and the aesthetic
distinction is only:

(foo (name joe) (size big))

vs

(foo
  (name joe)
  (size big))
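
A minimal sketch of why the extraction problem is the same in both cases, assuming an element is held as a list of the form (name child...) and each named datum as (key value). The function name CHILD-VALUES is hypothetical and not taken from any code mentioned in this thread.

```lisp
;; Hypothetical helper: collect the value of every child named KEY.
;; Assumes ELEMENT looks like (name (key value) ... "text" ...).
(defun child-values (key element)
  (loop for child in (rest element)
        when (and (consp child) (eq (first child) key))
          collect (second child)))

;; The call is the same whether the XML source spelled the datum
;; as an attribute or as a sub-element.
(child-values 'name '(foo (name joe) (size big)))   ; => (JOE)
(child-values 'size '(foo (name joe) (size big)))   ; => (BIG)
```

Nothing in the accessor needs to know which XML syntax the datum originally came from.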

> Recursive structures are in fact allowed in attribute values, provided
> that your application processes them and not the SGML/XML parser.

<rant>

As a separate topic, I just *hate* when people encode complicated data into
attributes, forcing applications to solve yet another parsing problem. The whole
point of something like XML is to have a standard encoding structure. The
"parsing" problem is supposed to be solved once, and only the semantic
interpretation should remain.

E.g.

<this_bad choice="a|b|c"/>

<this_better>
  <choice>
    <alternative>a</alternative>
    <alternative>b</alternative>
    <alternative>c</alternative>
  </choice>
</this_better>

</rant>
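
To make the cost concrete, here is a rough sketch in Common Lisp, assuming the attribute value arrives as the string "a|b|c" and the structured form arrives as a nested list. SPLIT-ON-BAR is a hypothetical helper that the application has to write itself in the first case; the second case needs nothing beyond ordinary list access.

```lisp
;; Hypothetical splitter the application must supply for the
;; packed "a|b|c" encoding.
(defun split-on-bar (string)
  (loop with start = 0
        for pos = (position #\| string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(split-on-bar "a|b|c")   ; => ("a" "b" "c")

;; With the structured form the standard parser has already done the
;; work, so plain list operations are enough.
(mapcar #'second
        (rest '(choice (alternative "a")
                       (alternative "b")
                       (alternative "c"))))   ; => ("a" "b" "c")
```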

Of course, S-expressions are far preferable. The only "cool" things
about XML that I like are the ability to specify character encodings (ASCII vs
Unicode, etc.) and the schema namespaces business, such that one can mix
"tags" from semantically different spaces.

Mind you, the details of Schemas are overly complicated and gross to use, but
they are better than DTDs which should just die die die. DTDs are a lesson in
why a separate language should *not* be invented.

--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
bl...@infomatch.com The Rhythm has my soul.

Barry Fishman

Aug 24, 2001, 11:31:26 PM8/24/01

to

Erik Naggum <er...@naggum.net> writes:
> No. Please forget the attributes. There _are_ no attributes. Whether
> something is an attribute or not is completely arbitrary and irrelevant.
> Your access to that information is _not_ dependent on its representation
> in SGML/XML. Treat everything as a subordinate element. This is the key
> idea to gaining power of abstraction over the XML data. Holding on to
> the mythical distinction between attribute and sub-element is the key
> idea to losing any and all power of abstraction.

I looked again, and your incantations did not work. Attributes still
seem to be in the language. I agree that when XML is used as a data
definition they are "completely arbitrary" and make a syntactic
separation which is destructive. I, personally, just avoid using them
when I have control of the XML I use to define data. But I can't
ignore them or re-format them, when I need to generate XML which
someone or some standard defined to use them. That battle belongs in
the XML standards committees, and I am afraid it's a bit late to change
their minds.

If I just treat attributes as subordinate elements, I lose the ability
to simply translate from lisp into XML. In other news articles
you seem to suggest that you use information outside the lisp
representation to make that determination. This means that my tools
would require a priori knowledge, which I feel a simple lisp->XML
(non-interpreting) translator should not need. I don't think
lisp->XML translators should have constraints that XML parsers don't
have.

In code which interprets the lispified XML, I know what the grammar is,
so can't I (at that time) bury any abstraction issues in the access
methods? I admit I don't fully understand the abstraction benefits
with which you are concerned. I've been overwhelmed in tracking
all the XML languages which are being defined. I was hoping that
being able to map them into lisp syntax would help avoid being buried
in XML's confusing syntax. When looking at them in a lisp syntax,
things can become clearer (and seem less innovative).

I don't agree that the distinction between attributes and entities is
always arbitrary. SGML does stand for Simple Graphical *Markup*
Language, and in a markup language, I think it is important to
distinguish the text of a document from its markup. Multiple
translators may be used, and they should not need to be kept up to
date on what attributes are used in the other translators. In an
expression like:

<header1><italic>Wow</italic>, this is difficult.</header1>

or as lisp (which I think is more readable):
(header1 (italic "Wow") " this is difficult")

it isn't clear whether "Wow" is text or the value of an attribute
unless you have prior knowledge of whether `italic` is an attribute in
the context of a header1 directive. So here the distinction is
simple, clear, and useful. (I am not commenting on the syntax.)

This is still important for things like xhtml -- and probably docbook,
whose standard I have not yet assimilated.

In my previous message I suggested that:
<header1 italic="Wow"> this is difficult</header1>

become:
(header1 ((italic "Wow")) " this is difficult")

With minimal (but I admit real) damage to the syntax.

Barry Fishman

Erik Naggum

Aug 25, 2001, 8:08:48 AM8/25/01

to

* Barry Fishman <barry_...@acm.org>

> I looked again, and your incantations did not work. Attributes still
> seem to be in the language.

Sigh.

> I agree that when XML is used as a data definition they are "completely
> arbitrary" and make a syntactic separation which is destructive. I,
> personally, just avoid using them when I have control of the XML I use to
> define data. But I can't ignore them or re-format them, when I need to
> generate XML which someone or some standard defined to use them. That
> battle belongs in the XML standards committees, and I am afraid it's a bit
> late to change their minds.

How you work with XML is not defined by those standards bodies. What
your _internal_ representation of XML looks like is not defined by those
standards bodies. One of the fundamental properties of Lisp is that we
have a very nice and well-defined mapping between external and internal
representation for most of our object types. There is no well-defined
mapping between XML syntax and internal representation. Lots of ways are
equally valid. Insisting on only some of them is counter-productive.

> If I just treat attributes as subordinate elements, I lose the ability to
> simply translate from lisp into XML.

You have made up your mind about this, so I shall not try to convince you
of the errors of your ways. People who are dead set on their ways should
be left alone, mostly because they get cranky when faced with alternatives.

> In other news articles you seem to suggest that you use information
> outside the lisp representation to make that determination.

No, you do not understand, and that is because you do not even try.

> This means that my tools would require a priori knowledge, which I feel a
> simple lisp->XML (non-interpreting) translator should not need.

I see that you have to be very hard and fast on how you represent your
information. This is your choice. I wish you would recognize it as a
choice, and not try to impose a very specific view on the reality that is
far more flexible and adaptable than you have shown to believe it to be.

> I don't think lisp->XML translators should have constraints that XML
> parsers don't have.

Well, that is another choice you have made. Other people, other choices.

> In code which interprets the lispified XML, I know what the grammar is,
> so can't I (at that time) bury any abstraction issues in the access
> methods?

What does it matter to your access whether something is an attribute or a
sub-element? Why do you need to retain the distinction internally?

> I admit I don't fully understand the abstraction benefits with which you
> are concerned.

I appreciate that you state this, because you certainly have not.

> I've been overwhelmed in tracking all the XML languages which are being
> defined.

Yes, overwhelmed by bad design, most people's brains shut down and they
refuse to deal with a massive simplification because it threatens to be
as painful as dealing with the complexity they have barely survived.

> I was hoping that being able to map them into lisp syntax would help
> avoid being buried in XML's confusing syntax.

That is my idea. I am sorry for you that you have to define away the
solution to your problem by insisting on a trivial one-to-one mapping of
conceptual elements that effectively blocks your own conceptualization.

> When looking at them in a lisp syntax, things can become clearer (and seem
> less innovative).

How very true.

> I don't agree that the distinction between attributes and entities is
> always arbitrary.

Attributes and entities are very different concepts, and the distinction
between them is of fundamental importance. I fail to see how you think I
have made any claims about their relationship, however. I am talking
about _elements_.

> SGML does stand for Simple Graphical *Markup* Language,

It stands for Standard Generalized Markup Language, actually. The key
to understanding the name is that "generalized markup" is something more
than mere markup. SGML has aspirations beyond simply marking up text.

> and in a markup language, I think it is important to distinguish the text
> of a document from its markup.

I think I already said that.

> Multiple translators may be used, and they should not need to be kept up
> to date on what attributes are used in the other translators.

Your value judgments are your choice. I happen to disagree with them.
If you try to deny me this, please realize that I do not care at all.

> In an expression like:
>
> <header1><italic>Wow</italic>, this is difficult.</header1>
>
> or as lisp (which I think is more readable):
> (header1 (italic "Wow") " this is difficult")
>
> it isn't clear whether "Wow" is text or the value of an attribute
> unless you have prior knowledge of whether `italic` is an attribute in
> the context of a header1 directive.

Well, first off: You _have_ that prior knowledge. Your application will
actually need to know what to do with it whether it is an attribute or a
sub-element. If your application does not know what to do with it, I
fail to see how whether it is an attribute or an element can matter to
you. If you _do_ know what to do with it, how does it matter to you
whether it came from an attribute value or a sub-element?

> So here the distinction is simple, clear, and useful.

It is arbitrary.

> This is still important for things like xhtml -- and probably docbook,
> whose standard I have not yet assimilated.

No, it is fundamentally unimportant. Please try to accept this premise
for the sake of discussion, and see if something you believe falls out
and shows itself to you as more important than your simple protestations.

> In my previous message I suggested that:
> <header1 italic="Wow"> this is difficult</header1>
>
> become:
> (header1 ((italic "Wow")) " this is difficult")
>
> With minimal (but I admit real) damage to the syntax.

Keeping the distinction between attributes and content is keeping you
from realizing how simply and efficiently you can deal with XML data.
But that is your choice. I fully expect that loads of people who have
fused their brains shut and have fully "integrated" the false dichotomy
of attributes and contents will never be able to unfuse it and open up to
a very simple realization that it has absolutely no bearing on anything
_other_ than the specific syntax in SGML/XML whether something is an
attribute or an element.

Those who grasp the concepts involved, will see that attributes are just
another form of contents. Those who do not grasp the concepts involved,
will think that attributes are different from contents because they have
been given syntactically different expression. But it is always the
syntax that follows the function. Someone believed that meta-information
should be fundamentally different from information. Someone believed
that the contents of elements should be text that wound up in the final
document on the printed page and the values of attributes should not, but
should only influence the processing of the information. This worked
only as long as SGML was used as a markup language for documents and had
no aspirations towards being an abstract structuring syntax. When it
came to use it as a more abstract syntax, there _is_ no inherent quality
that determines whether some value ends up displayed or not. That has to
be supplied by the software that processes the information, which is
precisely prior knowledge of the structure and its meaning.

///

Barry Fishman

Aug 25, 2001, 4:00:59 PM8/25/01

to

Erik Naggum <er...@naggum.net> writes:
> * Barry Fishman <barry_...@acm.org>

>> If I just treat attributes as subordinate elements, I lose the ability to
>> simply translate from lisp into XML.
>
> You have made up your mind about this, so I shall not try to convince you
> of the errors of your ways. People who are dead set on their ways should
> be left alone, mostly because they get cranky when faced with
> alternatives.

Crankiness is just a part of facing new ways of looking at things. It's
only a terminal disease in the very young. At my age, old ways of
thinking still do not give way without an internal fight. I have a
great many years of Java/C/C++/Perl to overcome. I do comprehend that
writing the equivalent code in lisp is pointless, although easy.

I wouldn't be taking the time to learn and work in lisp, if I didn't
recognize that it could significantly improve the ways I analyze
and solve problems. This was made obvious by looking at (what I
presume is) good lisp code.

I will follow your suggestion and remove the entity/attribute
distinctions in my lisp code. I am then left with a strong desire to
keep the names of XML attribute names in a list, and use that in a
generic XML output translator.

I suspect this is still avoiding the issues you have raised. Instead
I will start by writing specific code for each case and see if a less
"C" like way of sharing code becomes evident.

I am open to any suggestions, although I can not guarantee I will
immediately grasp their rationale. (I think I am past my cranky
stage. I am never cranky when I get to write code.)

>> SGML does stand for Simple Graphical *Markup* Language,
>
> It stands for Standard Generalized Markup Language, actually.

Yes, Yes, Yes. I was focused on the _markup_ part, but there really is
no excuse when the answer is just an `info psgml' away.

Barry Fishman
--
I am used to working from the general to the specific. Problems seem
to have the same design patterns in C/C++/Java and probably XML,
although the implementations may use slightly different language
features. However, this does not seem to follow with lisp. A new set
of approaches take center stage, and I do not yet have the judgement
to understand their implications. They just dangle before me benefits
which aren't present otherwise. They also seem to carry the seed of
complexities which could bury the project as a whole. These disasters
of course are present in other languages, but there I can trust my
traditional ways of avoiding them. The answer, as my music teacher
would say, is practice, practice, practice.

Boris Schaefer

Aug 26, 2001, 5:45:29 AM8/26/01

to

Erik Naggum <er...@naggum.net> writes:

| * Barry Fishman <barry_...@acm.org>
|

| > In an expression like:
| >
| > <header1><italic>Wow</italic>, this is difficult.</header1>
| >
| > or as lisp (which I think is more readable):
| > (header1 (italic "Wow") " this is difficult")
| >
| > it isn't clear whether "Wow" is text or the value of an attribute
| > unless you have prior knowledge of whether `italic` is an attribute
| > in the context of a header1 directive.
|
| Well, first off: You _have_ that prior knowledge. Your
| application will actually need to know what to do with it whether
| it is an attribute or a sub-element. If your application does not
| know what to do with it, I fail to see how whether it is an
| attribute or an element can matter to you. If you _do_ know what
| to do with it, how does it matter to you whether it came from an
| attribute value or a sub-element?

Well, I agree that in most cases you will know whether something was
an attribute or contents, when you're processing it, but what about:

<foo bar="1"><bar>2</bar></foo>

If I understand you correctly (and I'm not exactly sure about that),
you would represent this in Lisp as:

(foo (bar 1) (bar 2))

I don't see how you can distinguish attributes and contents in this
case, and how you can translate this back into the same XML. Probably
I'm missing something.

Boris

--
bo...@uncommon-sense.net - <http://www.uncommon-sense.net/>

If you want to put yourself on the map, publish your own map.

Michael Livshin

Aug 26, 2001, 4:21:08 AM8/26/01

to

Boris Schaefer <bo...@uncommon-sense.net> writes:

> Well, I agree that in most cases you will know whether something was
> an attribute or contents, when you're processing it, but what about:
>
> <foo bar="1"><bar>2</bar></foo>
>
> If I understand you correctly (and I'm not exactly sure about that),
> you would represent this in Lisp as:
>
> (foo (bar 1) (bar 2))
>
> I don't see how you can distinguish attributes and contents in this
> case, and how you can translate this back into the same XML. Probably
> I'm missing something.

assuming I understood Erik's point, you can surely distinguish
attributes and contents when translating back to XML. that is, your
Lisp->XML translator should know that `(bar 1)' under `foo' is
supposed to be an attribute.

having this knowledge _only_ in the output translator frees you from
distinguishing attributes and contents in your program.

this does mean that the value of foo's `bar' attribute (er, content)
should magically always be atomic, because if it's not you won't be
able to output valid XML. but note that if your program can deal with
non-atomic values of `bar', then you probably have chosen wrong XML
format in the first place...
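
One way this could look in Common Lisp, as a sketch under stated assumptions rather than anyone's actual translator: the table *ATTRIBUTE-NAMES* and the function EMIT-XML are hypothetical names, elements are lists of the form (name child...), attribute values are atomic, and no escaping of XML special characters is attempted.

```lisp
;; Hypothetical table: for each element name, the child names that the
;; target XML format expects to be written as attributes.
(defparameter *attribute-names*
  '((foo . (bar))))

(defun emit-xml (node)
  "Return NODE, an atom or a list (name child...), as an XML string."
  (if (atom node)
      (princ-to-string node)
      (let ((name (string-downcase (first node)))
            (pending (cdr (assoc (first node) *attribute-names*)))
            (attributes '())
            (contents '()))
        ;; The first child named like a declared attribute becomes that
        ;; attribute; every other child is ordinary content.
        (dolist (child (rest node))
          (cond ((and (consp child) (member (first child) pending))
                 (setf pending (remove (first child) pending))
                 (push child attributes))
                (t (push child contents))))
        (format nil "<~A~:{ ~(~A~)=\"~A\"~}>~{~A~}</~A>"
                name
                (reverse attributes)
                (mapcar #'emit-xml (reverse contents))
                name))))

(emit-xml '(foo (bar 1) (bar 2)))
;; => "<foo bar=\"1\"><bar>2</bar></foo>"
```

Only the output side consults the table; code that reads (foo (bar 1) (bar 2)) never has to ask which child was an attribute.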

--
What is this talk of 'release'? Klingons do not make software
'releases.' Our software 'escapes' leaving a bloody trail of designers
and Quality Assurance people in its wake.
-- Klingon Programmer

Erik Naggum

Aug 26, 2001, 5:30:31 AM8/26/01

to

* Boris Schaefer <bo...@uncommon-sense.net>

> Well, I agree that in most cases you will know whether something was
> an attribute or contents, when you're processing it, but what about:
>
> <foo bar="1"><bar>2</bar></foo>
>
> If I understand you correctly (and I'm not exactly sure about that),
> you would represent this in Lisp as:
>
> (foo (bar 1) (bar 2))
>
> I don't see how you can distinguish attributes and contents in this case,
> and how you can translate this back into the same XML. Probably I'm
> missing something.

Yes, you are definitely missing the constraints of SGML in real life.
There are some problems that are not worth solving because they never
come up even if they superficially could appear to come up if you do not
pay attention. This is such a problem. You have failed to consider the
ramifications of the solutions and pose a problem that simply would not
exist if you did. This taxes my patience, which is already legendary in its
general absence.

However, I apparently need to insist that you understand that in SGML and
XML alike, you do in fact know what attributes an element has. It cannot
possibly be ambiguous. If you decide to name a sub-element the same as
an attribute, however massively stupid that is even with SGML/XML as it
is, you _still_ know that you have an attribute with that name. That
there is a sub-element with that name, as well, is coincidental to the
representation. There simply is no way you can _not_ know that, unless
you go out of your way to destroy the information that SGML provides you
with. If you destroy the information that is available to you, you will
not get me to do stupid human tricks answering your resulting questions.

I truly wonder what is so hard to understand about this. We Lisp people
are quite used to association lists, right? Keyword-value pairs do not
need to be in property lists to be understandable by Lisp people, do
they? To my mind, whether you store something in a property list or an
association list is arbitrary. However, in the reactions that I have
seen to obliterating the false dichotomy between attributes and contents
in SGML, there somehow seems to be a _fundamental_ difference between
property lists and association lists. I completely fail to understand
how that can be.

The whole deal is so simple I do not even know how to explain it so
people get it if they do not get it immediately. It is somewhat like
seeing someone struggle with fractions. They either get it or they do
not, and although I have managed to make many a struggling child get the
idea, I have _no_ idea what precisely caused them to grasp it. It just
happened, and they laughed in relief. This attribute/container thing is
equally intuitively evident.

Case in point: An element has a fixed number of attributes. That is
reflected in a fixed length of the association list that makes up the
attributes. Attributes are not repeatable and not omissible, so if there
are n attributes in the attribute list for an element, there will be n
conses with attributes in the cdr of the element representation. There
are no two ways about this. It is completely and irrevocably unambiguous.

By exploiting the rich information we have about the elements and their
makeup in SGML, we can reason about things with much simpler means than
by adhering strictly to the particular representational issues in SGML.
If it matters to you that some values are attributes, you ask for the
attribute information. If it does not matter to you, you can be relieved
of the distinction. If you want to transform attribute to contents or
vice versa, modify the information about the element, not the element; if
and when you print it out, the modifications will manifest themselves in
new SGML/XML syntax, but nothing happened to your internal representation.

///

Tim Bradshaw

Aug 26, 2001, 4:39:39 AM8/26/01

to

* Erik Naggum wrote:

> The key to this is the relationship between foo and bar. Whether bar is
> an attribute or a sub-element of foo is irrelevant to processing them,

Yes, that's a good point. Whenever I've tried to design DTDs I've
always ended up having no attributes but doing everything as
subelements, and it's interesting that another very rich & successful
markup language - CL - does everything as `subelements' except when
people like me try and make it mirror HTML.

> ... implying that I would normally have such information and use it when
> generating attribute/value or sub-element/contents.

Yes, that's the crucial information I don't have in my application.

--tim

Kent M Pitman

Aug 26, 2001, 8:50:58 AM8/26/01

to

Erik Naggum <er...@naggum.net> writes:

> I truly wonder what is so hard to understand about this.

I think in situations like this the answer is that you need to stay concrete.
One often can't say specifically why one finds something difficult or hard,
but one can generate a test case that is at their fringe. It was asked
whether

<foo bar="1"><bar>2</bar></foo>

would be represented as

(foo (bar 1) (bar 2))

and you've sort of hinted yes. You've made allusions to alists as a way
of understanding this, but as a sense of intuition, of course, that doesn't
help a Lisp programmer a lot since plainly an alist is about the leftmost
of each named thing, and people are uneasy about accessing the next-leftmost
element behind it--that usually violates some sense of a-list/stack discipline.

You haven't offered an operator whose goal is to be like destructuring-bind
and so to get around this, so the burden seems, to those looking on, to be
on the programmer to pick apart this structure manually and the set of tools
seems light. That's probably only an artifact of not seeing your tools,
rather than anyone's belief that you have no such tools.

Likewise you haven't shown any syntax which is, by loose analogy, the
equivalent of Lisp's arglist strangeness for keywords where you map a keyword
to a differently-named variable by doing
(lambda (&key ((:foo fu) 3)) fu)
It is by having an abstraction like this that you can assure the person
that the caller's name for things will not confuse the callee. I toyed with
coming up with an analogously absurd example for Lisp and the following
was my best go of it. If it seems unhelpful, just ignore it. But the point
is just to show that you can manage
(let ((weird 'ee) (apartheid 'ii) (pie 'ii) (pier 'ee))
  (labels ((fn1 (&key ((:ei e)) ((:ie i))) (list 'ee e 'ii i))
           (fn2 (&key ((:ie e)) ((:ei i))) (list 'ee e 'ii i))
           (sort-by-sound (&rest keys &key (first-vowel-wins-p t)
                           &allow-other-keys)
             (apply (if first-vowel-wins-p #'fn1 #'fn2)
                    :allow-other-keys t
                    keys)))
    (list (sort-by-sound :first-vowel-wins-p t :ei 'weird :ie 'pie)
          (sort-by-sound :first-vowel-wins-p nil :ei 'apartheid :ie 'pier))))
((EE WEIRD II PIE)
 (EE PIER II APARTHEID))

That is, somehow you'd expect the external representation (:ie vs :ei) to
have a fixed effect on what two functions that each have the same body
might return, but the arglist mappings (the "magic" in your example, of a
different kind than the "magic" here of &key, but still magic in a way)
manage to sort things out. It isn't their behavior but the cross-bar you
plug between them that is doing something cool, and people don't see what
that cool thing is, probably only for lack of specificity rather than
disbelief that what you say might be true. Just as my example above is ho-hum
to a Lisp programmer, not mysterious, once they understand how keyargs work.

I think it would help if you posted the NML which helps you manipulate
these, and perhaps a small code fragment that showed an end-to-end use
of constructing an expression in Lisp and having it appear in the XML with
this notation Boris suggests, and the reverse. Then people would be
talking concrete still.

Rob Warnock

Aug 26, 2001, 10:05:43 AM8/26/01

to

Erik Naggum <er...@naggum.net> wrote:
+---------------

| * Boris Schaefer <bo...@uncommon-sense.net>
| > Well, I agree that in most cases you will know whether something was
| > an attribute or contents, when you're processing it, but what about:
| > <foo bar="1"><bar>2</bar></foo>

...

| > (foo (bar 1) (bar 2))
| > I don't see how you can distinguish attributes and contents in this case,

| > and how you can translate this back into the same XML. ...
...

| Case in point: An element has a fixed number of attributes. That is
| reflected in a fixed length of the association list that makes up the
| attributes. Attributes are not repeatable and not omissible, so if there
| are n attributes in the attribute list for an element, there will be n
| conses with attributes in the cdr of the element representation. There
| are no two ways about this. It is completely and irrevocably unambiguous.

+---------------

While not repeatable, attributes *are* omissible if the DTD for those
attributes contains either default values or the "#IMPLIED" status keyword,
are they not? So if the DTD said:

<!ELEMENT foo (#PCDATA | bar)*>
<!ATTLIST foo bar NUMBER #IMPLIED>

that is, the "foo" element has an optional "bar" attribute *and* also
allows an arbitrary number of "bar" sub-elements, then (foo (bar 1) (bar 2))
*would* be ambiguous.

I see two obvious ways to preserve the simplicity you seek:

1. Do what CL does for declarations, that is, reserve a symbol to
tag lists of attributes (like "declare" does), which are optional,
but if present may only appear before all non-attribute subforms:

(foo (attr (bar 1)) (bar 2))

2. Force attribute names and element names into different packages, e.g.:

(foo (attr:bar 1) (bar 2))

or if the current package is never the keyword package, simply:

(foo (:bar 1) (bar 2))
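
A rough sketch of how option 1 could be consumed in Common Lisp, assuming the reserved symbol is ATTR and that the attribute list, when present, is always the first child. SPLIT-ELEMENT is a hypothetical name used only for illustration.

```lisp
;; Hypothetical reader for option 1: an optional leading (attr ...) form
;; holds the attributes; everything after it is content.
(defun split-element (element)
  "Return the attribute alist and the content list of ELEMENT as two values."
  (let ((children (rest element)))
    (if (and (consp (first children))
             (eq (first (first children)) 'attr))
        (values (rest (first children)) (rest children))
        (values '() children))))

(split-element '(foo (attr (bar 1)) (bar 2)))
;; => ((BAR 1)), ((BAR 2))

(split-element '(foo (bar 2)))
;; => NIL, ((BAR 2))
```

With the marker in place, an omitted #IMPLIED attribute simply yields an empty alist, so the ambiguous case above no longer arises.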

-Rob

p.s. The article "Element/Attribute Distinction Considered Harmful"
<URL:http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Aug-1999/0375.html>
from the XML-DEV list discusses precisely the same issue, starting with:

After writing the usual 'when to use elements and when to use
attributes' bit for a new book and then spending some time
close up with the XLink specs, I'm really starting to wonder
if we haven't painted ourselves into a corner by treating leaf
elements and attributes differently.

Unfortunately, no significant followups seem to have been posted!!

-----
Rob Warnock, 30-3-510 <rp...@sgi.com>
SGI Network Engineering <http://reality.sgi.com/rpw3/>
1600 Amphitheatre Pkwy. Phone: 650-933-1673
Mountain View, CA 94043 PP-ASEL-IA

Kent M Pitman

Aug 26, 2001, 10:25:48 AM

rp...@rigden.engr.sgi.com (Rob Warnock) writes:

> 2. Force attribute names and element names into different packages, e.g.:
>
> (foo (attr:bar 1) (bar 2))
>
> or if the current package is never the keyword package, simply:
>
> (foo (:bar 1) (bar 2))

Don't forget XML has a package namespace of its own. You'd need nested
namespaces to pull this off, no?

Thomas F. Burdick

Aug 26, 2001, 3:45:40 PM

Kent M Pitman <pit...@world.std.com> writes:

> I think it would help if you posted the NML which helps you manipulate
> these, and perhaps a small code fragment that showed an end-to-end use
> of constructing an expression in Lisp and having it appear in the XML with
> this notation Boris suggests, and the reverse. Then people would be
> talking concrete still.

I'd like to echo this sentiment. I'm intrigued, but Dog knows
intriguing things can turn out to be pretty awful in practice, or
divine, or anywhere in between, but it takes actual experience to tell
the difference most of the time.

Erik Naggum

Aug 27, 2001, 1:14:29 AM

* Kent M Pitman <pit...@world.std.com>

> You've made allusions to alists as a way of understanding this, but as a
> sense of intuition, of course, that doesn't help a Lisp programmer a lot
> since plainly an alist is about the leftmost of each named thing, and
> people are uneasy about accessing the next-leftmost element behind
> it--that usually violates some sense of a-list/stack discipline.

Well, this is why association lists work as a metaphor -- attributes in
SGML/XML cannot be repeated. If there are more keys in the remainder of
the contents, they are not attributes.

> You haven't offered an operator whose goal is to be like
> destructuring-bind and so to get around this, so the burden seems, to
> those looking on, to be on the programmer to pick apart this structure
> manually and the set of tools seems light. That's probably only an
> artifact of not seeing your tools, rather than anyone's belief that you
> have no such tools.

Which tools are available for the contents? Why are they _not_ usable
directly for the attributes? I fail to grasp what you want to _do_ with
the attributes that you cannot do with them if they are sub-elements.

You imply that people are unable to deal with sub-elements and need
special tools to deal with attributes. This _must_ be wrong.

> I think it would help if you posted the NML which helps you manipulate
> these, and perhaps a small code fragment that showed an end-to-end use of
> constructing an expression in Lisp and having it appear in the XML with
> this notation Boris suggests, and the reverse. Then people would be
> talking concrete still.

I assume that people who voice their concerns in this discussion know
SGML. I have no inclination to write tutorials for people who do not.
It is a waste of my time, and I know that I will hate it. I have about
500 pages of a book entitled "A Conceptual Introduction to SGML" that I
swear to whichever deity is on duty today will _never_ be published,
because the design flaws of SGML are so pervasive that the only thing I
want to do with them is get rid of them. Accept the fact that I deal
with a history of personal pain in this regard. I invested 6 years of my
life on SGML and related standards, and the more I worked with it, the
more I found that SGML actively destroyed any hope of achieving what it
had set out to do, because it is introducing several poisons into the
conceptual processes of structuring information. Taking a look at what
people do with SGML and XML today has not shown _one_ case of anyone
waking up and smelling the coffee, and it has been _burning_ in the
coffee machine for a decade.

This is my view: You were told that you needed attributes in addition to
sub-element contents. Why did you ever _agree_ to that? The onus of
proof is normally on he who asserts the positive, and I challenge you to
explain to me why you _need_ attributes rather than accepting any
challenge to explain why you do _not_ need them when what I say is that
_you_ already know perfectly well how to deal with sub-elements. If you
have worked with SGML at all, you _know_ that people screw up attributes
and sub-elements, and you _have_ had to deal with one that should have
been the other in your processing. It is _impossible_ to get them
"right" because the notion that there is a "right" solution depends on
information that is not available at the time the distinction is made.

Over the years, I have thought of _many_ different ways to deal with the
colossal braindamage that is attributes in SGML. One might think of them
as (keyword) arguments to functions, but which other information should
influence a "function" that deals with an element? Well, first and
foremost, its _parentage_. That means that I have already had to get rid
of the notion that <foo bar="x" zot="y"> is "really" a function call like
(foo :bar "x" :zot "y"). It has to know _so_ much more to do _anything_
right that it is completely useless to cast one's thinking in such terms.

SGML must be _questioned_, not accepted as gospel or natural science
reporting on some findings. Somebody made a decision to add attributes,
and I know for a fact that that was back in the days of typesetting and
document production when the idea was that you should be able to "remove"
the "tags" and end up with the readable text of the document as it would
be printed. That was the _real_ rationale for attributes. I happen to
think that was a brilliant idea at the time -- competing markup languages
have a serious problem in using notations that destroy the ability to
figure out easily what is intended for humans and what is intended for the
machine. (In particular, TeX is a monster.) I tended towards explaining
to people that they should not let stuff that should not be displayed be
in sub-elements. What a crock of shit that advice is! As soon as GML
became more general than producing print documents, for which it was well
suited and still is, the attribute concept had become a mill-stone around
its neck and it dragged it down fast. It was _wrong_ to keep attributes
around when their rationale had been completely eradicated from its set
of operating conditions. It made everything incredibly complex. I was
one of very few people on this planet to really _study_ the standard, and
my brain works in such a way that I still _know_ with immediate certainty
whether something is or is not supported by the standard language and how
to express it. (It works the exact same way with Common Lisp, Ada (1983,
unfortunately :), C (1991), and any number of things I have really sat
down to study and understand, and it is so efficient that I even get an
emotional response to violations before I see the logic of them.) I love
the way my brain works, but it also has serious drawbacks: Overriding and
updating old information is something I have to work really hard at. The
end result of the way I think and the way the standard is defined is that
I immediately saw these massively complex ways to do things that "nobody"
understood. Take HyTime and what it calls "architectural forms" -- I
vividly remember a long walk around a quiet Tallahassee one summer night
with the creator of this concept, when I questioned some of the designs
and how it would be implemented, and he was quiet for the longest time
before he said that I was probably the first person to have understood
what he was _really_ trying to accomplish. That would have been _such_ a
great thing if it had been, say, rocket science, but it was not. It was
a man-made complexity so great that it had required _months_ of brain-
wracking to really get my intuition working. That was the first time I
had really serious doubts about the wisdom of SGML's structuring process,
because the massive complexity of it all is _completely_ pointless and a
result of spreading the semantics so thin that you had to keep mental
track of an enormous number of relationships to end up with an idea of
what something should do or mean. It does not have to be that way. It
was _profoundly_ disappointing to discover that at the end of this long
process of grasping something that looked intellectually challenging lay
only a complexity that resulted from _rejecting_ simplicity of design at
a few crucial points. Hell, it still took me years to figure out what
alternatives they _should_ have picked up, and by then it was too late.

Now you are probably thinking "how F hard can it be?" and looking half
condescending on a retarded monkey who cannot figure out the purposes of
the mathematical relationships in calculus. But it is the same problem
we find in C++. The question to be asked of massive complexity like that
is not "what wonderful things did you find out that made this necessary",
but "whatever did you _miss_ that made this so horribly complex"? You
can sometimes see people who are really, really dumb go about some simple
tasks in a way that tells you that they have arrived at their ways of
performing it through an incredibly painful process that they are loath
to reopen or examine at all no matter how hard it is to get it right for
them. Some people will construct ways of performing their job so that
they utilize all available brainpower, simply because that is indeed a
very satisfying feeling. However, when it comes to grasping someone
else's _wrong_ ideas, there is no upper bound on complexity. Some people
have the most bizarrely convoluted thinking processes and they completely
fail to monitor their thinking so they traipse off into oblivion and may
or may not come back, but if they do, it is with these spectacularly
irrational ideas that they _love_ before they discard them. This is the
kind of complexity that befell the SGML community. That I could figure
this mess out and think about it and have something dramatic to say about
it to the creators, frankly scares me.

In any case, I think the core problem is that a request for a rationale
for _removing_ a complexifying misfeature is completely bogus. We should
not look at what we wound up with, we should look at _how_ we wound up
where we are. I have explained how attributes got invented in the first
place and it _was_ a good idea at the time. However, as soon as elements
got more abstract and elements could contain _no_ information that would
wind up on the printed page, but instead other elements that would, and
those "abstract" elements would influence the way their sub-elements'
contents would wind up on the printed page, it should have been clear
that the attribute concept should be scheduled for extinction because
some of its roles had now been moved into a different realm where _all_
of its roles could be moved without sacrificing anything.

The core idea that went horribly wrong with SGML _because_ of the very
sad lack of re-examination of the rationale for attributes is almost so
fundamental that removing it will tear down everything that SGML has
built with it. This is likely why people resist thinking about it,
because it was so painful to learn SGML, it is better to keep out any
risk of having to re-experience that pain from another angle. I shall
probably have to repeat this core idea forever because so few people
really grasp it: SGML claims that some things you want to say about
something are meta-information and some things are "normal" information.
Like the historical baggage from the characters in the file that wound up
in print and the characters that vanished in processing, SGML's view on
meta-information and information is that they are inherently different
and thus not only distinguishable, but in need of being kept apart, so
much so that there are two wildly different languages to describe them.
This core mistake leads to an inability to move between views of your own
information and conceptualization of its structure, and that is just the
way to kill your information.

As a result of this dichotomy, SGML imposes an incredibly hard structure
on the information. If the information wants to break out of it, the
whole structure breaks. (XML is really _nothing_ better, but has all the
appeal of tooth decay the way it touts its caries as "extensibility".)
There are so many rules in the SGML standard that effectively prohibit a
rational way to "flex" its design that people do not refrain from it
because they do not consider it useful to be able to, but because any
change to a document type definition is associated with an unknowable
increase in complexity of processing, especially in the area of bringing
legacy documents in line with the change. The extreme _brittleness_ of
the SGML structure is a direct result of the core mistake to strike a
dichotomy between meta-information and information, because in real life,
the two are in fact _exactly_ the same thing, it is just a matter of who
looks at it for which purpose. If you do not believe that, it is because
you still think that there _has_ to be a difference. Of course there is,
but it is not _inherent_ or _intrinsic_ to the information, it is highly
pragmatically determined which is which at any given time.

Structuring information is one of the _easiest_ tasks we humans do. All
the time, we add meta-information to information and we do not even mark
it up as we go. Human languages are chock full of meta-information: "I
did not know darkness could be so illuminating", he said, expectingly.
We _have_ no desire to mark meta-information as such and directly because
it is part and parcel of how we interpret what other people tell us. If
I say "yesterday" today, I probably mean "2001-08-26", so I could write
<date <formal 2001-08-26> yesterday>, but I could also talk about the
past in some general term like <date <formal past> yesterday>, and so on
and so forth. What we really grasp about the information we receive _is_
invariably meta-information. The problem is then entirely artificial,
since we do this almost automatically. What we really need are means to
make the meta-information explicit. I used to believe that this would be
a good idea, but until we find ways to "intuit" meta-information from a
human context, I believe it is a waste of effort and it could well be
counterproductive. What we need is a very limited and very practical
approach to obtain a minimal level of meta-information. The more we
specify the more we exclude, because as soon as we aim for a certain
"depth" of representation, the alternative representations at the same
level grow exponentially in number.

///

Erik Naggum

Aug 27, 2001, 1:23:23 AM

* rp...@rigden.engr.sgi.com (Rob Warnock)

> While not repeatable, attributes *are* omissible if the DTD for those
> attributes contains either default values or the "#IMPLIED" status keyword,
> are they not?

That depends on whether you represent the parsed or pre-parsed structure.
In a Common Lisp setting, we are dealing with parsed structure. If the
attribute value is "implied" in the source, it still needs to be there in
the parsed structure.

> So if the DTD said:
>
> <!ELEMENT foo (bar | #PCDATA)*>
> <!ATTLIST foo bar NUMBER #IMPLIED>
>
> that is, the "foo" element has an optional "bar" attribute *and* also
> allows an arbitrary number of "bar" sub-elements, then (foo (bar 1) (bar
> 2)) *would* be ambiguous.

If you choose to represent a pre-parsed SGML instance in Common Lisp, I
would argue strongly against that before I would even attempt to answer
anything else.

I _really_ mean it when I say that the attribute list has a fixed length.

I also indicated that for pragmatic reasons, I sometimes use a marker to
separate the attributes from the contents in the cdr of the element, such
as when the task at hand would be wastefully slow if I were to deal with
a fully parsed structure. Dirty hacks should be within reach because the
world is sometimes not clean. I am probably not going to get used to the
habit of some people who see a problem in one part of a proposal and
ignore the fact that there is a solution in another part of the same
proposal (like the next paragraph), and I am certainly not patient enough
with all the rampant idiocy in the SGML/XML world to explain this over
and over, but please go back and read the whole message. If you find a
need to use a marker in _some_ cases, I have in fact covered it. In the
fully parsed, fully general case, that need does _not_ arise, because the
attribute list is a fixed set of "slots" in the structure. This should
have no bearing on how to process them, however, but of course it matters
to and from SGML/XML representation.
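
A minimal sketch of the marker variant, assuming the marker is the keyword `:content` and that it is always present (the post does not say which marker is used; both the keyword and the function name `split-element` are invented here):

```lisp
;; Hypothetical: split (name attr... :content child...) at the marker.
;; Assumes MARKER occurs in the element; if it did not, everything after
;; the name would be returned as attributes and the content would be NIL.

(defun split-element (element &key (marker :content))
  "Return two values: the forms before MARKER and the forms after it."
  (let* ((body (rest element))
         (tail (member marker body)))
    ;; LDIFF copies the part of BODY that precedes TAIL.
    (values (ldiff body tail) (rest tail))))

(split-element '(foo (bar 1) :content (bar 2) (bar 17)))
;; => ((BAR 1)), ((BAR 2) (BAR 17))
```

In the fully parsed case no marker is needed, since the number of attribute slots is known from the element type.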

///

Rob Warnock

Aug 27, 2001, 4:17:41 AM

Kent M Pitman <pit...@world.std.com> wrote:
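+---------------
| Don't forget XML has a package namespace of its own. You'd need nested
| namespaces to pull this off, no?
+---------------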

Oh, heavens! I certainly wasn't trying to open *that* can of worms again!
But yes, you're right, of course, if one were to try to use Lisp namespaces
directly for XML names. But...

I think Erik's parallel response gets it absolutely correct [which I
missed on first reading of his earlier article -- oops!], namely, once
parsed (and defaulted, if necessary) all the stuff about what's an
"attribute" and what's not should be a property of the Lisp representation
of the element [CLOS class, whatever], and not necessarily encoded
in any way in the Lisp data structure per se.

Likewise, I suspect the right answer for dealing with XML namespaces
will turn out to be to have the Lisp representation of each element
worry about that, and use directly-corresponding names for XML elements
and Lisp symbols only to the extent that it's convenient, and *NOT*
attempt to force any rigid or automatic 1-to-1 correspondence.

I was intending to use Lisp packages only to encode the one bit of
"attribute/non-attribute", not encode XML namespace, but Erik rightly
showed that approach was still trapped in the SGML/XML worldview. Hence,
I retract the suggestion (except in the case that the Lisp representation
of a particular element *chooses* to use that distinction, purely for
its own convenience).

-Rob

Rob Warnock

Aug 27, 2001, 6:39:11 AM

Erik Naggum <er...@naggum.net> wrote:
+---------------

| rp...@rigden.engr.sgi.com (Rob Warnock)
| > While not repeatable, attributes *are* omissible if the DTD for those
| > attributes contains either default values or the "#IMPLIED" status keyword,
| > are they not?
|
| That depends on whether you represent the parsed or pre-parsed structure.
| In a Common Lisp setting, we are dealing with parsed structure. If the
| attribute value is "implied" in the source, it still needs to be there
| in the parsed structure.

+---------------

*Doh!* I think I finally get what you were trying to say, thanks!

+---------------

| If you choose to represent a pre-parsed SGML instance in Common Lisp...
+---------------

Or a half-parsed (i.e., half-assed)? ;-}

+---------------

| I would argue strongly against that before I would even attempt to
| answer anything else.
|
| I _really_ mean it when I say that the attribute list has a fixed length.

+---------------

Got it. Now let's see if I can explain it to others who may not have:

My understanding of what Erik is suggesting [very strongly!] is that one
should *NOT* try to invent any kind of direct "Lispified" or S-expr
restatement of XML/HTML/SGML *syntax* per se, but instead to *parse*
the XML document and choose convenient (potentially element-specific)
CL representations for the parsed elements. This parsing process will
involve filling in default values for omitted attributes, including those
whose default is "#IMPLIED". Once you have done this parsing, there is
nothing "optional" at all about any of the attributes -- you now have
*all* of their values. [Whether you choose to explicitly store defaulted
ones or not is a separate decision -- in any event you know their values.]

Now, having parsed the element and filled in the defaults, how you
choose to represent it in CL data is pretty much up to you. One way
might be as an instance of a CLOS class, with the attributes as slots
[plus a slot for the sub-elements, if it's not an empty element]. This
would allow you to use a generic function (print-element elem style)
that specialized on both the element type and the desired output style
to output completely different texts from the same parsed document.
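
One way this could look, as a sketch only: the class layout, the reader names, the use of EQL-specialized keywords for the style, and the choice to represent character data as plain strings are all assumptions made for illustration.

```lisp
;; Hypothetical CLOS representation of a parsed FOO element.

(defclass element ()
  ((children :initarg :content :initform '() :reader element-children)))

(defclass foo (element)
  ;; The BAR attribute is a slot; 0 stands in for the implied default.
  ((bar :initarg :bar :initform 0 :reader foo-bar)))

(defgeneric print-element (elem style)
  (:documentation "Write ELEM to *STANDARD-OUTPUT* in the given STYLE."))

(defmethod print-element ((elem foo) (style (eql :xml)))
  (format t "<foo bar=\"~A\">" (foo-bar elem))
  (dolist (child (element-children elem))
    (print-element child style))
  (format t "</foo>"))

(defmethod print-element ((elem foo) (style (eql :sexp)))
  (format t "(foo (bar ~A)" (foo-bar elem))
  (dolist (child (element-children elem))
    (write-char #\Space)
    (print-element child style))
  (format t ")"))

;; Character data: raw in XML style, as a readable string in sexp style.
(defmethod print-element ((elem string) (style (eql :xml)))
  (write-string elem))

(defmethod print-element ((elem string) (style (eql :sexp)))
  (prin1 elem))

(let ((doc (make-instance 'foo :bar 1 :content (list "hello"))))
  (print-element doc :xml)    ; prints <foo bar="1">hello</foo>
  (terpri)
  (print-element doc :sexp))  ; prints (foo (bar 1) "hello")
```

The same parsed object yields both texts; nothing about the attribute/content split leaks into the callers.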

Another way is a simple list of the element name[*] followed by the
values of the attributes (with or without attendant "keywords" to
make them readable to humans debugging the program) followed by the
rest of the contained elements (if any). Without any attribute markers
at all, this might have a form similar in appearance (only!) to a
function call with positional parameters, that is:

<foo bar="1"><bar>2</bar></foo>

after parsing might internally be represented as:

(foo 1 (bar 2))

Or if you choose to add some element-like structure to the attributes,
you can do that, too. [You might choose to do that if (*ugh!* *shudder!*)
some attributes contain further internal structure, and you'd like to
represent the *parsed* version of that structure in a pleasing way.]
That gets us to:

(foo (bar 1) (bar 2))

But again, since all of the application routines that have to deal with
a "foo" element *know* that "foo" has a "bar" attribute, all of the code
[that cares about attributes] knows that the CADADR is the attribute value
and the CDDR is the content.
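
Spelled out as a small sketch (the accessor names `foo-bar-attribute` and `foo-content` are invented here; only CADADR and CDDR come from the text above):

```lisp
;; Hypothetical named accessors for the element-like form
;;   (foo (bar <attribute-value>) <content>...)
;; They rely on FOO having exactly one attribute, BAR.

(defun foo-bar-attribute (element)
  "Value of FOO's BAR attribute: (car (cdr (car (cdr element))))."
  (cadadr element))

(defun foo-content (element)
  "Everything after the single attribute form."
  (cddr element))

(foo-bar-attribute '(foo (bar 1) (bar 2)))  ; => 1
(foo-content       '(foo (bar 1) (bar 2)))  ; => ((BAR 2))
```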

Now suppose that the application-implied value for the attribute "bar"
is zero, and we are given this to parse:

<foo><bar>2</bar><bar>17</bar></foo>

What I (finally) heard Erik say is that the only reasonable internal
representation for that (depending on whether you chose the "positional"
or "element-like" representation for foo's attributes) would be one of
these forms:

(foo 0 (bar 2) (bar 17))
or:
(foo (bar 0) (bar 2) (bar 17))

That is, the structure of the CL representation *must* be invariant
w.r.t. inclusion or omission of attributes in the source text. So in
the second form, the CADADR is still the attribute value and the CDDR
is still the content, even though the attribute was omitted in the
source text.
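
A sketch of the defaulting step that would give this invariance, assuming the parser hands over the attributes actually written in the source as a list of (name value) pairs; `normalize-element` and `*foo-attribute-defaults*` are hypothetical names, not part of any existing parser:

```lisp
;; Hypothetical: FOO declares one attribute, BAR, whose
;; application-implied value is 0.
(defparameter *foo-attribute-defaults* '((bar 0)))

(defun normalize-element (name source-attributes content defaults)
  "Build (NAME (ATTR VALUE)... CONTENT...) with every declared attribute
present, in declaration order, whether or not the source supplied it."
  (append (list name)
          (mapcar (lambda (default)
                    ;; Prefer the value written in the source, else the default.
                    (or (assoc (first default) source-attributes)
                        default))
                  defaults)
          content))

;; Attribute omitted in the source:
(normalize-element 'foo '() '((bar 2) (bar 17)) *foo-attribute-defaults*)
;; => (FOO (BAR 0) (BAR 2) (BAR 17))

;; Attribute given in the source:
(normalize-element 'foo '((bar 1)) '((bar 2)) *foo-attribute-defaults*)
;; => (FOO (BAR 1) (BAR 2))
```

Either way the attribute sits in the same position, so code downstream never has to ask whether it was written out.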

+---------------

| I also indicated that for pragmatic reasons, I sometimes use a marker to
| separate the attributes from the contents in the cdr of the element, such
| as when the task at hand would be wastefully slow if I were to deal with
| a fully parsed structure. Dirty hacks should be within reach because the
| world is sometimes not clean.

+---------------

I now understand & agree.

+---------------

| I am probably not going to get used to the habit of some people who
| see a problem in one part of a proposal and ignore the fact that there
| is a solution in another part of the same proposal (like the next
| paragraph), and I am certainly not patient enough with all the rampant
| idiocy in the SGML/XML world to explain this over and over, but please
| go back and read the whole message.

+---------------

I did, and that's when the light finally dawned, but I have to say
that until one *does* finally understand it's not at all obvious.
No, I don't know how you could have said it any more clearly. I can
only say (from personal experience now!) that if one *ever* falls into
the trap of trying to "Lispify" the *syntax* of XML instead of representing
the *parsed* structure, it can be very hard to let go of that fixation.

Hmmm... Perhaps it's some sort of "figure/ground" thing, as in that
classic picture <URL:http://www.lcsc.edu/ss150/u5s1p6.htm> used in
gestalt psychology. If you see the young woman first, it's sometimes
hard to then see the old hag (or vice versa). And one's history or
prejudices may strongly affect which one you see first, e.g., young
men tend to see the young woman first.

[Of course, once you've seen *both*, then it's much, much easier
to flip your perception back and forth at will between them.]

-Rob

[*] That is, as I mentioned in my parallel reply to Kent, a CL symbol
chosen to *represent* the XML element name, not necessarily or even
desirably any automatic conversion of the XML element name to a CL
symbol.

-----
Rob Warnock, 30-3-510 <rp...@sgi.com>
SGI Network Engineering <http://reality.sgi.com/rpw3/>
1600 Amphitheatre Pkwy. Phone: 650-933-1673
Mountain View, CA 94043 PP-ASEL-IA

[Note: aaan...@sgi.com and zedw...@sgi.com aren't for humans ]

Boris Schaefer

Aug 27, 2001, 10:16:54 PM

Erik Naggum <er...@naggum.net> writes:

| That depends on whether you represent the parsed or pre-parsed
| structure. In a Common Lisp setting, we are dealing with parsed
| structure. If the attribute value is "implied" in the source, it
| still needs to be there in the parsed structure.

Aahh, I think this clears things up for me. I think I understand now.
Thanks.

| I am probably not going to get used to the habit of some people who
| see a problem in one part of a proposal and ignore the fact that
| there is a solution in another part of the same proposal (like the
| next paragraph), and I am certainly not patient enough with all the
| rampant idiocy in the SGML/XML world to explain this over and over,
| but please go back and read the whole message.

I did. I actually read that part about the marker before already,
somehow it just didn't enter my brain. I also really didn't realize
that the attribute list _really_ is fixed length after parsing.
Thanks for stressing your patience and explaining it again.

Boris

--
bo...@uncommon-sense.net - <http://www.uncommon-sense.net/>

Facts, apart from their relationships, are like labels on empty bottles.
-- Sven Italla