The horror that is XML

Tim Bradshaw

unread,

Mar 4, 2002, 9:55:02 PM3/4/02

to

I have a system which currently reads an sexp based config file
syntax, for which I need to provide (and in fact have provided) an
alternative XML-based syntax for political reasons.

I'm wondering if anyone else has been through this and has run into
the same problems I have and maybe can offer any solutions. To
describe them I need to describe the current syntax slightly.

The config files are read and validated in a `safe' (as safe as I can
make it easily, which may not be really safe) reader environment.
After reading, they are validated by checking everything read is of a
good type (using an occurs check in case of circularity) a `good type'
means pretty much non-dotted lists, strings, reals, keywords and a
defined list of other symbols. At top-level a file consists of conses
whose cars are one of these good symbols.

Before anything else happens some metasyntax is expanded which allows
file inclusion, and conditionalisation. This results in an `effective
file' which may actually be the contents of several files. The
metasyntax is just things like ($include ...) or ($platform-case ...).

Finally, the resulting forms are passed to a handler function (this is
a function passed to the config file reader) which gets to dispatch
on the car of the forms, and do whatever it likes.

A top-level form is declared valid by declaring that its car is a `good
symbol' (via a macro) and usually by defining a handler for it. In
some cases the system wants all forms to be handled, but in many cases
all it cares is that the form is `good' (it must be good for the first
stage not to reject it) - this depends entirely on the handler.

The end result of this is that a module of the system can very easily
declare a new config-file form to be valid and establish a handler for
it, thus enabling it to get configured correctly at boot time or
whenever else config files are read. The overall system does not have
to care about anything other than making sure the files are read.

(On top of this there's a reasonably trivial hook mechanism which can
let modules run code before or after a config file is read or at other
useful points, so they can, for instance, check that the configuration
they needed actually happened.)

So I have to make something like this work with XML, and I have to do
it without doubling the size of either my brain or the system - as far
as I can see if I was to even read most of the vast encrustation of
specifications that have accumulated around XML I'd need to do the
former, both to make space for them and to invent a time machine so I
can do it in time. If I was to actually use code implementing these
specs then I'd definitely do the latter.

So what I'm doing instead is using the expat bindings done by Sunil
Mishra & Kaelin Colclasure (thanks), writing a tiny tree-builder based
on that, and then putting together a sort of medium-level syntax based
on XML.

Because I'm using expat I don't need to care about DTDs, just about
well-formedness. But it would be kind of nice (the client thinks) to
have DTDs, because it would be additional documentation.

But this seems to be really hard. Firstly, because of the metasyntax,
the grammar is kind of not like anything I can easily describe (as a
non-expert DTD writer). For instance almost any config file can have
metasyntax almost everywhere in it. I could give up and have XML
syntax which looks like:

or something, and write a DTD for that but this is obviously horrible.

Secondly, my system has modules. These modules want to be able to
declare handlers of their own. One day *other people* might write
these modules. It looks to me like any little module which currently,
say, declares some syntax like:

(load-patches
file ...)

now has to involve me in changing the DTD to allow (say)

<load-patches><file>...</file>...</load-patches>

This looks doomed.

When I skim the XML specs (doing more than this would require far
longer than I have: and they've also now fallen through my good strong
19th century floor and killed several innocent bystanders in the
floors below before finally coming to rest, smoking, embedded in the
bedrock a few hundred yards under my flat) it looks like there is
stuff do to with namespaces which looks like it might do what I want -
it looks like I can essentially have multiple concurrent DTDs and
declare which one is valid for a chunk by using namespaces. Then each
module could declare its own little namespace. This is kind of
complicated.

Or I could just give up and not care about DTDs: the system doesn't
actually care, so why should I? But then, is there any sense in which
XML is more than an incredibly complex and somehow less functional
version of sexprs? Surely it can't be this bad?

So really, I guess what I'm asking is: am I missing something really
obvious here, or is it all really just a very hard and over-complex
solution to a problem I've already solved?

--tim

Erik Naggum

unread,

Mar 5, 2002, 11:20:55 AM3/5/02

to

* Tim Bradshaw

| So really, I guess what I'm asking is: am I missing something really
| obvious here, or is it all really just a very hard and over-complex
| solution to a problem I've already solved?

XML, being the single suckiest syntactic invention in the history of
mankind, offers you several layers at which you can do exactly the same
thing very differently, in fact so differently that it takes effort to
see that they are even related.

<foo type="bar">zot</foo> actually defines three different views on the
same thing: Whather what you are really after is foo, bar, or zot,
depends on your application. XML is only a overly complex and otherwise
meaningless exercise in syntactic noise around the message you want to
send. Its notion of "structure" must be regarded as the same kind of
useless baggage that come with language that have been designed by people
who have completely failed to understand what syntax is all about. It is
therefore a mistake to try to shoe-horn things into the "structure" that
XML allows you to define.

In the abaove example, foo can be the application-level element, or it
can be the syntax-level element and bar the application-level element.
It is important to realize that SGML and XML offer a means to control
only the generic identifier (foo) and their nesting, but that it is often
important to use another attribute for the application. This was part of
the reason for #FIXED in the attribute default specification and the
purpose of omitting attributes from the actual tags. In my view, this is
probably the only actually useful role that attributes can play, but
there are other, much more elegant, ways to accomplish the same goal, but
not within the SGML framework. Now, whether you use one of the parts of
the markup, or use the contents of an element for your application is
another design choice. The markup may only be useful for validation
purposes, anyway.

Let me illustrate:

The XML now contains all the syntax information of the "host" language.
Many people think this is the _only_ granularity at which XML should be
used, and they try to enforce as much structure as possible, which
generally produces completely useless results and so brittle "documents"
that they break as soon as anyone gets any idea at all for improvement.

The XML now contains only a "surface level" syntax and the meaning of the
form elements is determined by the application, which discards or ignores
the "form" element completely and looks only at the attributes. This way
of doing things allows for some interesting extensibility that XML cannot
do on its own, and for which XML was designed because people used SGML
wrong, as in the first example.

<form>if<form>...condition...</form>
<form>...then...</form>
<form>...else...</form>
</form>

The XML is now only a suger-coating of syntax and the meaning of the
entire construct is determined by the contents of the form elements,
which are completely irrelevant after they have been parsed into a tree
structure, which is very close to what we do with the parentheses in
Common Lisp.

I hope this can resolve some of the problems of being forced to use XML,
but in all likelihood, lots of people will object to anything but the
finest granularity, even though it renders their use of XML so complex
that their applications will generally fail to be useful at all. Such is
the curse of a literally meaningless syntactic contraption whose
verbosity is so enormous that people are loath to use simple solutions.

My preferred syntax these days is one where I use angle brackets instead
of parentheses and let the symbol in the first position determines the
parsing rules for the rest of that "form". It could be mistaken for XML
if you are completely clueless, but then again, if you had any clue, you
would not be using XML.

///
--
In a fight against something, the fight has value, victory has none.
In a fight for something, the fight is a loss, victory merely relief.

Christopher Browne

unread,

Mar 5, 2002, 11:59:01 AM3/5/02

to

In the last exciting episode, Erik Naggum <er...@naggum.net> wrote::

> * Tim Bradshaw
> | So really, I guess what I'm asking is: am I missing something really
> | obvious here, or is it all really just a very hard and over-complex
> | solution to a problem I've already solved?

> XML, being the single suckiest syntactic invention in the history
> of mankind, offers you several layers at which you can do exactly
> the same thing very differently, in fact so differently that it
> takes effort to see that they are even related.

Wouldn't the embedding of quasi-XML-like functionality into HTML be
considered to suck even worse?
--
(reverse (concatenate 'string "gro.mca@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/finances.html
Giving up on assembly language was the apple in our Garden of Eden:
Languages whose use squanders machine cycles are sinful. The LISP
machine now permits LISP programmers to abandon bra and fig-leaf.
-- Epigrams in Programming, ACM SIGPLAN Sept. 1982

Kent M Pitman

unread,

Mar 5, 2002, 12:38:13 PM3/5/02

to

Erik Naggum <er...@naggum.net> writes:

> * Tim Bradshaw
> | So really, I guess what I'm asking is: am I missing something really
> | obvious here, or is it all really just a very hard and over-complex
> | solution to a problem I've already solved?
>
> XML, being the single suckiest syntactic invention in the history of
> mankind, offers you several layers at which you can do exactly the same
> thing very differently, in fact so differently that it takes effort to
> see that they are even related.

I don't think there's anything wrong with XML that a surgeon's knife,
removing 80% (or more) of the standard's text, wouldn't fix.

IMO, what makes XML bad is not how little it does but how much it
pretends to fix from what came before, yet without changing anything.
If it had either attempted less or been willing to make actual changes,
it might be respected more.

XML's lifeboat-like attempt to rescue all of SGML's functionality from
drowning, yet without applying "lifeboat ethics" and tossing deadweight
overboard (i.e., abandoning compatibility), seems to be the problem.

To quote Dr. Amar Bose (of Bose corporation fame): Better implies different.

Ray Blaak

unread,

Mar 5, 2002, 12:42:14 PM3/5/02

to

Tim Bradshaw <t...@cley.com> writes:
> Or I could just give up and not care about DTDs: the system doesn't
> actually care, so why should I?

Give up an don't care about DTDs. Your posting gives a clearer explanation
about your format than any DTD would. DTDs that are for humans to read have to
be understandable, and if the DTD will be torturous than there is no point.

DTDs other official purpose is for separate validation, a dubious idea in my
opinion. The application that finally processes an XML file will need to
validate it on its own anyway, so what is the point of validation in advance?

> But then, is there any sense in which XML is more than an incredibly complex
> and somehow less functional version of sexprs? Surely it can't be this bad?

It's really that bad. XML does have the nice notion of support for various
character encodings. There are tricks with namespaces you can do that seem
more powerful, but on the whole things are confusing and error prone as holy
hell.

> So really, I guess what I'm asking is: am I missing something really
> obvious here, or is it all really just a very hard and over-complex
> solution to a problem I've already solved?

You are not missing anything.

--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
bl...@telus.net The Rhythm has my soul.

Bob Bane

unread,

Mar 5, 2002, 12:50:32 PM3/5/02

to

Erik Naggum wrote:
>
> XML, being the single suckiest syntactic invention in the history of
> mankind, offers you several layers at which you can do exactly the same
> thing very differently, in fact so differently that it takes effort to
> see that they are even related.
>

Believe it or not, there are things in actual operational use that
syntactically suck worse than XML. Check out:

http://pds.jpl.nasa.gov/stdref/chap12.htm

which describes Object Definition Language (ODL), developed by NASA/JPL
in the early 90's to hold metadata for space data sets (primarily
planetary probe data).

XML is what you get when you assign the nested property list problem to
people who only know SGML. ODL is apparently what you get when you
assign the same problem to people who only know FORTRAN.

Lisp:
(foo (bar "baz"))
XML:
<foo> <bar>baz</bar> </foo>
or maybe:
<foo bar="baz"/>

ODL:
OBJECT = FOO
BAR = "baz"
END_OBJECT = FOO

ODL is the official standard metadata representation for data from the
Earth Science Data and Information System, NASA's next generation
observe-the-whole-earth data gathering project. I am currently working
on a task to take ODL from this system and display it intelligibly. The
current solution (chosen before I got here) is to take the ODL, convert
it to XML, then bounce the XML off an XSLT stylesheet to generate
HTML/Javascript.

So remember as you slog through yet another brain-damaged XML
application - it could be worse.

--
Bob Bane
ba...@removeme.gst.com

Erik Naggum

unread,

Mar 5, 2002, 1:25:39 PM3/5/02

to

* Christopher Browne <cbbr...@acm.org>

| Wouldn't the embedding of quasi-XML-like functionality into HTML be
| considered to suck even worse?

As I have become fond of saying lately, there is insufficient granularity
at that end of the scale to determine which is worse.

Erik Naggum

unread,

Mar 5, 2002, 1:51:30 PM3/5/02

to

* Ray Blaak <bl...@telus.net>

| DTDs other official purpose is for separate validation, a dubious idea in
| my opinion. The application that finally processes an XML file will need
| to validate it on its own anyway, so what is the point of validation in
| advance?

Remember when C was so young and machines so small that the compiler
could not be expected to do everything and we all studiously ran "lint"
on our programs? It was a fascinating time, I can tell you.

Thaddeus L Olczyk

unread,

Mar 5, 2002, 3:49:27 PM3/5/02

to

On Tue, 05 Mar 2002 16:20:55 GMT, Erik Naggum <er...@naggum.net> wrote:

> XML, being the single suckiest syntactic invention in the history of
> mankind,

APL.

David Golden

unread,

Mar 5, 2002, 3:50:27 PM3/5/02

to

Tim Bradshaw wrote:

> I have a system which currently reads an sexp based config file
> syntax, for which I need to provide (and in fact have provided) an
> alternative XML-based syntax for political reasons.
>

> But then, is there any sense in which

> XML is more than an incredibly complex and somehow less functional
> version of sexprs? Surely it can't be this bad?

XML is an incredibly complex and somehow less functional
vertsion of sexprs. It is that bad.

XML thoroughly sucks, but if you have to deal with it, there is an
excellent Scheme library for dealing with it, and a defined mapping of the
XML "infoset" to scheme, in the form of SXML. It'll go XML to sexprs and
vice versa.

I know it's not common lisp, but, in theory, it could be ported with
relatively little effort, and it should be food for thought.

See http://ssax.sourceforge.net/

About the only vaguely interesting features of XML to me are probably
certain aspects of XML-Schema (the replacement for DTDs), and
perhaps certain aspects of the extended hyperlinking (xlink/xpointer)

I've occasionally pondered the similarities of XML-Schema to syntax-rules
in Scheme, giving some sort of
datatyping-of-tree-structures-based-on-their-structure, or some similarly
wooly concept - i.e. checking whether a given sexpr
would match a given complicated macro definition is vaguely akin to
validating an XML document against an XML schema.

--

Don't eat yellow snow.

Eduardo Muñoz

unread,

Mar 5, 2002, 3:58:50 PM3/5/02

to

Erik Naggum <er...@naggum.net> writes:

> Remember when C was so young and machines so small that the compiler
> could not be expected to do everything and we all studiously ran "lint"
> on our programs?

Probably I wasn't born yet, so what is "lint"?

> It was a fascinating time, I can tell you.

I'm sure. I love when KMP (or someone else) talks
about anciente (for me :) software or hardware
(PDP's, VAX, TOPS, Lisp Machines, ITS and the
like).

--

Eduardo Muñoz

Dr. Edmund Weitz

unread,

Mar 5, 2002, 4:00:08 PM3/5/02

to

David Golden <qnivq....@bprnaserr.arg> writes:

> XML thoroughly sucks, but if you have to deal with it, there is an
> excellent Scheme library for dealing with it, and a defined mapping
> of the XML "infoset" to scheme, in the form of SXML. It'll go XML
> to sexprs and vice versa.
>
> I know it's not common lisp, but, in theory, it could be ported with
> relatively little effort, and it should be food for thought.

If it's just about getting the job done maybe this will help:

<http://www.ccs.neu.edu/home/dorai/scm2cl/scm2cl.html>

Edi.

--

Dr. Edmund Weitz
Hamburg
Germany

The Common Lisp Cookbook
<http://cl-cookbook.sourceforge.net/>

Christopher C. Stacy

unread,

Mar 5, 2002, 4:30:58 PM3/5/02

to

>>>>> On Tue, 05 Mar 2002 20:49:27 GMT, Thaddeus L Olczyk ("Thaddeus") writes:

Thaddeus> On Tue, 05 Mar 2002 16:20:55 GMT, Erik Naggum <er...@naggum.net> wrote:
>> XML, being the single suckiest syntactic invention in the history of
>> mankind,

Thaddeus> APL.

APL syntax is simpler than that of Lisp.
Do you program in APL?

Marco Antoniotti

unread,

Mar 5, 2002, 4:34:32 PM3/5/02

to

I beg to differ. APL *is* weird, but it's syntax is amazingly
simple and regular. It is the net effect that is unreadable. This
net effect is due to the special glyphs required and to the fact that
operators have different "semantics" if monadic or dyadic.

Cheers

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.

David Golden

unread,

Mar 5, 2002, 4:42:32 PM3/5/02

to

Thaddeus L Olczyk wrote:

Strange that you'd say that. Most people I know who like
Lisp also like APL and Forth (if they know about them in the first place).

Both Forth and APL have simple, elegant, syntax. Kinda like...
oh... Lisp...

Note that I'm not talking about asciified APL abominations, which are a
royal pain in the backside to read... APL is unusual in that if
you DON'T use single-symbol identifiers for things, it gets less readable.

Also, APL programs can look as indecipherable as idiomatic
Perl - if you don't know the language. However, like Perl,
if you take a little time to learn the language, it all makes much
more sense (O.K. a little more sense...)

Marco Antoniotti

unread,

Mar 5, 2002, 4:43:55 PM3/5/02

to

David Golden <qnivq....@bprnaserr.arg> writes:

> Tim Bradshaw wrote:
>
> > I have a system which currently reads an sexp based config file
> > syntax, for which I need to provide (and in fact have provided) an
> > alternative XML-based syntax for political reasons.
> >
>
> > But then, is there any sense in which
> > XML is more than an incredibly complex and somehow less functional
> > version of sexprs? Surely it can't be this bad?
>
>
> XML is an incredibly complex and somehow less functional
> vertsion of sexprs. It is that bad.
>
> XML thoroughly sucks, but if you have to deal with it, there is an
> excellent Scheme library for dealing with it, and a defined mapping of the
> XML "infoset" to scheme, in the form of SXML. It'll go XML to sexprs and
> vice versa.
>
> I know it's not common lisp, but, in theory, it could be ported with
> relatively little effort, and it should be food for thought.
>
> See http://ssax.sourceforge.net/
>

Of course, people who do not know Common Lisp are bound to mess things
up.

How do you justify something written as

(*TOP*
(urn:loc.gov:books:book
(urn:loc.gov:books:title "Cheaper by the Dozen")
(urn:ISBN:0-395-36341-6:number "1568491379")
(urn:loc.gov:books:notes
(urn:w3-org-ns:HTML:p "This is a "
(urn:w3-org-ns:HTML:i "funny") " book!"))))
?

Erik Naggum

unread,

Mar 5, 2002, 4:49:18 PM3/5/02

to

* "Eduardo Muñoz"

| Probably I wasn't born yet, so what is "lint"?

No big loss. "lint" was a program that would compare actual calls and
definitions of pre-ANSI C functions because the languge lacked support
for prototypes, so header files was not enough to ensure consistency and
coherence between separately compiled files, probably not even within the
same file, if I recall correctly -- my 7th edition Unix documentation is
in natural cold storage somewhere on the loft, and it is too goddamn cold
tonight. "lint" also ensured that some of the more obvious problems in C
were detected prior to compilation. It was effectively distributing the
complexity of compilation among several programs because the compiler was
unable to remember anything between each file it had compiled. ANSI C
does not prescribe anything useful to be stored after compiling a file,
either, so manual header file management is still necessary, even though
this is probably the singularly most unnecessary thing programmers do in
today's world of programming. "lint" lingers on.

Kenny Tilton

unread,

Mar 5, 2002, 4:59:57 PM3/5/02

to

Marco Antoniotti wrote:
> operators have different "semantics" if monadic or dyadic.

ah yes, I saw that in the K language, I was wondering what possessed
them, thx for clearing that up. :)

--

kenny tilton
clinisys, inc
---------------------------------------------------------------
"Be the ball...be the ball...you're not being the ball, Danny."
- Ty, Caddy Shack

Tim Bradshaw

unread,

Mar 5, 2002, 5:08:02 PM3/5/02

to

* David Golden wrote:

> XML is an incredibly complex and somehow less functional
> vertsion of sexprs. It is that bad.

Thanks for this and the other followups. I now feel kind of better
about the whole thing.

The really disturbing thing is that huge investments in `web services'
are being predicated on using XML, something which (a) is crap and (b)
is so complicated that almost no-one will be able to use it correctly
(`CORBA was too complicated and hard to use? hey, have XML, it's *even
more* complicated and hard to use, it's bound to solve all your
problems!'). Papers like The Economist are busy writing
plausible-sounding articles about how all this stuff might be the next
big thing.

--tim

Christopher Browne

unread,

Mar 5, 2002, 5:27:13 PM3/5/02

to

> APL.

What's wrong with the syntax of APL?

If there's anything simpler and more regular than Lisp, it's APL.

It's fair to say that a lot of APL code depends on the "abuse" of
quasi-perverse interpretations of matrix operations, but that's not
syntax, that's "odd math."
--
(reverse (concatenate 'string "ac.notelrac.teneerf@" "454aa"))
http://www3.sympatico.ca/cbbrowne/linuxxian.html
Oh, no. Not again.
-- a bowl of petunias

Christopher Browne

unread,

Mar 5, 2002, 5:42:45 PM3/5/02

to

In an attempt to throw the authorities off his trail, David Golden <qnivq....@bprnaserr.arg> transmitted:

> XML thoroughly sucks, but if you have to deal with it, there is an
> excellent Scheme library for dealing with it, and a defined mapping
> of the XML "infoset" to scheme, in the form of SXML. It'll go XML
> to sexprs and vice versa.

> I know it's not common lisp, but, in theory, it could be ported with
> relatively little effort, and it should be food for thought.

When I have need to do so, I use Pierre Mai's C interface to expat.
It uses the expat XML parser, and generates sexp output that can be
read in using READ.

(defun xml-reader (filename)
(let ((xml-stream (common-lisp-user::run-shell-command
(concatenate 'string *xml-parser* " <" filename)
:output :stream)))
(prog1
(read xml-stream)
(close xml-stream))))

It would arguably be nicer to have something paralleling SAX which
would generate closures and permit lazy evaluation. But I haven't
found cases yet where the "brute force" of XML-READER was
unsatisfactory to me.

Note that this has the HIGHLY attractive feature of keeping all
management of "ugliness" in a library (/usr/lib/libexpat.so.1) that is
_widely_ used (including by such notables as Apache, Perl, Python, and
PHP) so that it is likely to be kept _quite_ stable.

I'd argue that expat significantly beats doing some automagical
conversion of Scheme code into CL...
--
(concatenate 'string "aa454" "@freenet.carleton.ca")
http://www3.sympatico.ca/cbbrowne/xml.html
Black holes are where God divided by zero.

Kenny Tilton

unread,

Mar 5, 2002, 5:52:51 PM3/5/02

to

Tim Bradshaw wrote:
> Papers like The Economist are busy writing
> plausible-sounding articles about how all this stuff might be the next
> big thing.

I haven't seen what the Economist has to say, but XML /will/ be the next
big thing if it works out as a lingua franca for data exchange. Not
saying XML does not suck from the syntax standpoint, just that syntax
can be fixed or (more likely) hidden.

Marco Antoniotti

unread,

Mar 5, 2002, 5:57:26 PM3/5/02

to

Kenny Tilton <kti...@nyc.rr.com> writes:

> Marco Antoniotti wrote:
> > operators have different "semantics" if monadic or dyadic.
>
> ah yes, I saw that in the K language, I was wondering what possessed
> them, thx for clearing that up. :)

Yep. Turns out that K is a language that heavily borrows from APL.

Christopher Browne

unread,

Mar 5, 2002, 6:47:30 PM3/5/02

to

The world rejoiced as Erik Naggum <er...@naggum.net> wrote:
> * "Eduardo Muñoz"
> | Probably I wasn't born yet, so what is "lint"?

> No big loss. "lint" was a program that would compare actual calls
> and definitions of pre-ANSI C functions because the languge lacked
> support for prototypes, so header files was not enough to ensure
> consistency and coherence between separately compiled files,
> probably not even within the same file, if I recall correctly --
> my 7th edition Unix documentation is in natural cold storage
> somewhere on the loft, and it is too goddamn cold tonight. "lint"
> also ensured that some of the more obvious problems in C were
> detected prior to compilation. It was effectively distributing
> the complexity of compilation among several programs because the
> compiler was unable to remember anything between each file it had
> compiled. ANSI C does not prescribe anything useful to be stored
> after compiling a file, either, so manual header file management
> is still necessary, even though this is probably the singularly
> most unnecessary thing programmers do in today's world of
> programming. "lint" lingers on.

There are new variations on lint, notably "LCLint" which has become
"Splint" which stands for "Secure Programming Lint." It does quite a
bit more than lint used to do.

Chances are that you'd be better off redeploying the code in OCAML
where type signatures would catch a whole lot more mistakes...
--
(concatenate 'string "cbbrowne" "@acm.org")
http://www.ntlug.org/~cbbrowne/lisp.html
``What this means is that when people say, "The X11 folks should have
done this, done that, or included this or that", they really should be
saying "Hey, the X11 people were smart enough to allow me to add this,
that and the other myself."'' -- David B. Lewis <d...@motifzone.com>

Christopher Browne

unread,

Mar 5, 2002, 7:02:35 PM3/5/02

to

The thing is, you don't actually _write_ any XML unless you're the guy
writing the library/module/package that _implements_ XML-RPC/SOAP.

Here's a bit of Python that provides the "toy" of allowing you to
submit simple arithmetic calculations to a SOAP server. (Of course,
that's a preposterously silly thing to do, but it's easy to
understand!)

def add(a, b):
return a + b

def add_array (e) :
total = 0
for el in e:
total = total + el
return total

A bit of Perl that calls that might be thus:
$a = 100;
$b = 15.5;
$c = $soap->add($a, $b)->result;
print $soap->add($a, $b), "\n";

@d = [1, 2, 3, 4, 7];
print $soap->add_array(@d), "\n";

I've omitted some bits of "client/server setup," but there's no
visible XML in any of that.

The problems with SOAP have to do with it being inefficient almost
beyond the wildest dreams of 3Com, Cisco, and Intel (the main
beneficiaries of the inefficiency in this case).

It should be unusual to need to look at the XML. Pretend it's like
CORBA's IIOP, which you generally don't look too closely at.

The place where you _DO_ look at or write some XML is with the "WSDL"
service description scheme, which is more or less similar to CORBA
IDL.

But I'd think CLOS/MOP would provide some absolutely _WONDERFUL_
opportunities there; it ought to be possible to write some CL that
would generate WSDL given references to classes and methods...

--
(concatenate 'string "aa454" "@freenet.carleton.ca")

http://www.ntlug.org/~cbbrowne/finances.html
I have this nagging fear that everyone is out to make me paranoid.

Erik Naggum

unread,

Mar 5, 2002, 7:37:01 PM3/5/02

to

* Kenny Tilton

| I haven't seen what the Economist has to say, but XML /will/ be the next
| big thing if it works out as a lingua franca for data exchange. Not
| saying XML does not suck from the syntax standpoint, just that syntax
| can be fixed or (more likely) hidden.

XML would not be so bad as it is if it were possible to pin down how to
represent it in the memory of a computer. At this time, the most common
suggestion is _vastly_ worse than anything a Common Lisp programmer would
whip up in a drunken stupor. DOM, SAX, XSLT, whatever the hell these
moreons are re-inventing, XML _could_ have been a pretty simple and
straight-forward syntax for a pretty simple and straight-forward external
representation of in-memory objects. This is manifestly _not_ the case,
since so much irrelevant crap has to be carried around in order to output
the same XML you read in.

There are certain mistakes people who have been exposed to Common Lisp
are not likely to make when it comes to mapping internal and external
representations of various object types. Every single one of those
mistakes has been made by the XML crowd, which is not very surprising,
considering the intense disdain for computer science that underlies the
SGML community -- they wanted to save their documents from the vagaries
of application programmers! Instead, they went into exactly the same
trap as every retarded application programmer has fallen into with their
eyes closed. And of _course_ Microsoft thinks it is so great -- XML
embodies the same kinds of mistakes that they are known for in their
proprietary unreadable "document" formats. All in all, a tragedy that
could have been avoided if they had only listened to people who knew how
computer scientists had been thinking about the same problem before them
-- but they would never listen, SGML was a political creation from before
it was born, and nobody should tell them how to do their stuff after it
had been standardized, lest it be deemed to have errors and mistakes...
Instead, we get anti-computer anti-scientists meddling with stuff they
have no hope of ever getting right, and certainly not be able to fix.

XML will go down with Microsoft, whose Steve Ballmer has now threatened
to withdraw Windows XP from the market and not do any more "innovation"
because of the demands made by the government lawsuits! Next, we will
see organized crime barons around the world threaten to stop trafficking
drugs if the police do not stop harassing them. That would certainly
stop the world economy! Steve Ballmer has once again demonstrated why
the evil that is Microsoft must be stopped _before_ it acquires enough
power to actually hurt anyone by making such threats.

Christopher C. Stacy

unread,

Mar 6, 2002, 12:00:35 AM3/6/02

to

>>>>> On 05 Mar 2002 16:34:32 -0500, Marco Antoniotti ("Marco") writes:

Marco> olc...@interaccess.com (Thaddeus L Olczyk) writes:

>> On Tue, 05 Mar 2002 16:20:55 GMT, Erik Naggum <er...@naggum.net> wrote:
>>
>> > XML, being the single suckiest syntactic invention in the history of
>> > mankind,
>> APL.

Marco> I beg to differ. APL *is* weird, but it's syntax is amazingly
Marco> simple and regular. It is the net effect that is unreadable. This
Marco> net effect is due to the special glyphs required and to the fact that
Marco> operators have different "semantics" if monadic or dyadic.

You're right about the regularity, but I don't think that the glyphs
make it _less_ readable - it could be argued that they make it _more_
readable. Of course, if you don't know APL, it will be unreadable,
just like if you don't know Lisp it will be unreadable (in any
very meaningful way).

One can write impenetrable programs in any language, but with dense
languages like APL you pack more trouble on a single line.
Professional APL programmers strive to make their code readable.

Kenny Tilton

unread,

Mar 6, 2002, 12:11:31 AM3/6/02

to

Here is my question to Ye Who Know XML: can you say what you want to say
in XML? I gather one huge objection is that there are N ways to say it.
As long as (plusp N), we're OK. No one other than a compiler author
should be thinking about RISC code, so all we need is.... XCL! Or XMCL:
better syntax compiled into some (doesn't matter what) legal XML.

Since Erik and Tim have bitched and moaned the most about XML, I think
they have to do this latest CL contrib. I mean, what self-respecting
c.l.l contributor cannot point to a pro bono CL contrib?

:)

The Xtroardinary thing about XML is that the world has pretty much
agreed we should all Xchange data in some (doesn't matter what)
universal /teXt/ format. That's the baby, syntaX is the bath water.

Jeff Greif

unread,

Mar 6, 2002, 12:20:23 AM3/6/02

to

Yes, there are things wrong with XML; however, if you are forced to use
it, one of its good uses is for configuration files. When used for such
an application, validating with a DTD (or XML Schema or some other
structure constraint language) helps, at least if people ever modify the
configuration files by hand, and if you use off-the-shelf XML software
to provide some sort of parsed and validated representation for your
application. Why?

The main reason is that you get several layers of error checking and
reporting free:
-- well-formedness checking
-- primitive validity checking of the structure and attribute values
against the DTD or other constraint language

I've found that this greatly reduces the defensive parts of the
application (the possible erroneous conditions your code has to handle).
It wouldn't always be this way, but often when there is an error in the
configuration file, the application is not going any farther until it's
fixed, and the error reporting from the off-the-shelf software is just
good enough to enable the user to fix the file and retry.

Another thing you get free is default values for attributes. This
reduces the amount of editing by users (and may make filling out the
config file more palatable), and still allows the application defaults
to be changed without changing running code. For most users of an app,
typing in XML is a nuisance without editor support, particularly, for
more naive users, that provided by an editor that knows the structure
desired from the DTD and basically only lets the user construct
something both well-formed and valid. If your users are guaranteed to
use one of these, you could skip the runtime validation against the DTD
and let the editor handle it. But it probably can't be guaranteed that
your users will use a structured editor.

Of course, you don't need any of this if your application flawlessly
generates the configuration file from some UI you present to the user.

Finally, this mode of operation allows a certain amount of version skew
between configuration file and your application. The parsing tools are
completely generic and deliver all the standard info from the XML file,
validated or not as you choose. What parts of it your application
chooses to grab is a separate decision, as is what it decides to do with
that information.

For your particular application, you could simply translate the XML file
to your sexpr format as early in the process as possible, leaving all
the validation stuff as it was before XML got into the picture. You
could do this before the inclusion processing was done (presumably the
included files would be XML also, and would have to be translated also).
You could probably do this simple translation with SAX event handlers
atop expat (or any other suitable parser). I'm pretty sure the Xerces
parser that comes with Apache can both validate according to a DTD or
XML Schema and also deliver SAX events. However, you'd have to have
separate DTDs for the outer file and the conditional inclusions.
Alternatively, you could use some kind of XML inclusion processor (I'm
not up on what's available) and validate after the entire structure was
assembled. Given how much you already have invested in handling the
sexpr-based format, this is probably the wrong choice.

I don't think it's doomed or hopeless, and shouldn't cause the size of
the system to double.

Jeff

Christopher Browne

unread,

Mar 6, 2002, 12:26:54 AM3/6/02

to

There is a counterargument to this...

APL programmers often strive HARD to avoid having explicit loops in
their code. (Not quite like Graham's "LOOP Considered Harmful;" more
like "Diamond Considered Harmful"...)

The legitimate reason for this is that if you keep the APL environment
humming on big matrix operations, it takes advantage of all the
Vectorized Power of APL.

Stopping for a while to interpret a loop is a substantial shift of
gears.

This has the result that code too often goes to near-pathological
extremes to do Matrix Stuff that replaces loops. The result of that
is that some "inscrutability" is introduced.
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/apl.html
God is dead. -Nietszche
Nietszche is dead. -God

Raymond Wiker

unread,

Mar 6, 2002, 2:56:20 AM3/6/02

to

"Jeff Greif" <jgr...@spam-me-not.alumni.princeton.edu> writes:

> Yes, there are things wrong with XML; however, if you are forced to use
> it, one of its good uses is for configuration files. When used for such
> an application, validating with a DTD (or XML Schema or some other
> structure constraint language) helps, at least if people ever modify the
> configuration files by hand, and if you use off-the-shelf XML software
> to provide some sort of parsed and validated representation for your
> application. Why?
>
> The main reason is that you get several layers of error checking and
> reporting free:
> -- well-formedness checking
> -- primitive validity checking of the structure and attribute values
> against the DTD or other constraint language

I *really* disagree with this. Editing XML files is a royal
pain, and the only way to get rid of this pain is if you don't
actually see the XML. The only way not to see the XML is if the editor
hides the XML, which means that you have to have some smarts in the
editor. The XML format may (or may not) make the editor easier to
write, but you still have to augment the checking that the XML
machinery gives you.

--
Raymond Wiker Mail: Raymon...@fast.no
Senior Software Engineer Web: http://www.fast.no/
Fast Search & Transfer ASA Phone: +47 23 01 11 60
P.O. Box 1677 Vika Fax: +47 35 54 87 99
NO-0120 Oslo, NORWAY Mob: +47 48 01 11 60

Try FAST Search: http://alltheweb.com/

Espen Vestre

unread,

Mar 6, 2002, 3:45:13 AM3/6/02

to

Erik Naggum <er...@naggum.net> writes:

> eyes closed. And of _course_ Microsoft thinks it is so great -- XML
> embodies the same kinds of mistakes that they are known for in their
> proprietary unreadable "document" formats.

The only thing I like about XML is that the fact that XML versions of
Word documents so brilliantly exposes how incredibly broken the software
producing it is.

Sigh. Even the NeXt people at Apple has started moving their property
list file format from a reasonable curly-bracket style to XML:

[macduck:/] root# cat /System/Library/StartupItems/SystemTuning/Resources/no.lproj/Localizable.strings
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist SYSTEM "file://localhost/System/Library/DTDs/PropertyList.dtd">
<plist version="0.9">
<dict>
<key>Tuning system</key>
<string>Stiller inn system</string>
</dict>
</plist>

--
(espen)

Tim Bradshaw

unread,

Mar 6, 2002, 7:24:03 AM3/6/02

to

* Kenny Tilton wrote:
> Since Erik and Tim have bitched and moaned the most about XML, I think
> they have to do this latest CL contrib. I mean, what self-respecting
> c.l.l contributor cannot point to a pro bono CL contrib?

Well, if I had *time* I have this thing called DTML which is kind of
pointy-bracket-compliant lisp:

It can emit XML or various other formats, and it has hacky but
functional tree-rewriting macros (very like the CL macro system) and
so on. We use it a lot, but only seriously for text, as I'd much
rather use sexps for data stuff (see second example above). I have to
write a manual and clean it up and make it work in more lisps, but one
day. Unfortunately not being an academic or independently wealthy
it's seriously non-trivial to find time, but one day.

(It was based on an idea Erik mentioned on cll, I suspect he may have
a better version of the same idea.)

--tim

Tim Bradshaw

unread,

Mar 6, 2002, 7:38:55 AM3/6/02

to

* Jeff Greif wrote:
> The main reason is that you get several layers of error checking and
> reporting free:
> -- well-formedness checking
> -- primitive validity checking of the structure and attribute values
> against the DTD or other constraint language

You seem to have not understood my article. *If* my application was a
great static thing where I could sit down once and for all and write
some DTD, then I could get DTD validation. But I *can not have* a
single DTD (at a useful level), since someone can load modules at
runtime which allow new syntax and declare new things as valid. Think
Lisp.

Even if I could have a DTD the metasyntax means that the DTD needs to
be incredibly lax because I have to allow metasyntax everywhere.

So what I end up with is well-formedness and some idiot trivia like
default values for attributes (which are basically useless because you
can't put structured data there).

Well, I had that already. Now, after 1500 lines of extra code
(fortunately I didn't write all of it) and expat I have a kind of
worse version of the same thing. The previous config file reader
(including all the validation, and preprocessing) was 500 lines.

--tim

Julian Stecklina

unread,

Mar 6, 2002, 8:51:23 AM3/6/02

to

"Eduardo Muñoz" <e...@jet.es> writes:

[...]

> > It was a fascinating time, I can tell you.
>
> I'm sure. I love when KMP (or someone else) talks
> about anciente (for me :) software or hardware
> (PDP's, VAX, TOPS, Lisp Machines, ITS and the
> like).

There is a german movie called "23" that deals with some "hackers" in
the mid 80ies. It was amazing when they bought a PDP as large as a
washing machine. :)

--
Meine Hompage: http://julian.re6.de

Um meinen oeffentlichen Schluessel zu erhalten:
To get my public key:
http://math-www.uni-paderborn.de/pgp/

Espen Vestre

unread,

Mar 6, 2002, 9:25:59 AM3/6/02

to

Julian Stecklina <der_j...@web.de> writes:

> There is a german movie called "23" that deals with some "hackers" in
> the mid 80ies. It was amazing when they bought a PDP as large as a
> washing machine. :)

Gee, I forgot that there were PDPs that small. I remember the DEC-10
system I used to use, which, including all its extra equipment - was a
large room full of refrigerator-looking cabinets and some of those
lovely small top-loading washing machines (which really were disk
cabinets with - what a novelty! - removable hard drives!).
--
(espen)

Tim Bradshaw

unread,

Mar 6, 2002, 10:22:15 AM3/6/02

to

* Kenny Tilton wrote:

> I haven't seen what the Economist has to say, but XML /will/ be the next
> big thing if it works out as a lingua franca for data exchange. Not
> saying XML does not suck from the syntax standpoint, just that syntax
> can be fixed or (more likely) hidden.

Sure, it will be the next big thing in the same sense that the web was
the last big thing: a lot of money will get spent on it, and there
will be a feeding frenzy and a few people will get rich, and then it
will all fall apart when people realise that it isn't actually
transforming the economy. Or perhaps people will still remember the
various web frenzies and it actually won't be the next big thing at
all.

What it won't do, I think, is *technically* transform things or make
life better. Here's a theory as to why:

A lot of commercial computing is all about reduction of friction in
various real-world processes. It's not a useful product in itself,
but it might make various other things less expensive to do. So the
whole e-commerce hype was predicated on the fact that e-commerce could
reduce the cost of transactions and give better access to information
thus enabling the market to work more efficiently.

This is a nice idea, and it ought to work. It's a bit of a let-down
for the web spin doctors that that's all it comes down to, but
actually frictional costs are often very high - a disturbingly large
amount of your phone bill goes to frictional costs of creating bills
for instance, so potentially it's a big win.

But this idea is only a win if the friction added by the computing
solution is lower than that it takes away. In particular these
systems should work. Currently a lot of the problem is that we just
can't write software that works reliably generally (some people can,
but you can't hire people and expect them to produce software that
works).

One cause of this is complexity. If you have a system that is complex
to use and understand, then most people will not use it correctly or
understand it. This will mean that software which uses it is
unreliable, or that it is very expensive to write and maintain.
Software complexity is friction.

CL people should be familiar with this - CL was a pretty complex
language for the 80s and this has historically meant that people find
it hard to use correctly and good CL programmers are expensive
(because they need to have read an internalised ~1000 pages of
specification).

Complexity - and its associated friction - is often necessary. I
think CL is an example: I don't think the language is much more
complex than it needs to be to do what it does. This is just an
opinion of course, others may differ. But in any case systems that do
complicated things need to be complicated.

But complexity that is *not* needed is pure friction, and is just a
cost.

I think that computing systems are becoming much more complex than
they need to be and thus much more frictional. In particular I think
this is true of XML in spades, and it looks like other people agree
with me. I'm not completely clear why this is happening, but my
hypothesis is that complexity is a kind of disease of people's minds.
The reason for this, I think is that people can only think about a
finite amount of stuff at once. If they get lured into a complicated
system, then they tend to have to spend all their time and energy
coping with the complexity, and they completely lose track of what the
system is actually for. So problems tend to get solved by adding more
complexity, because they can't step back and see the actual problem
any more. So once systems become sufficiently complex, people get lost
inside them and become unable to do anything but add yet more
complexity to the system. Occasionally people get so stressed by the
complexity that they revolt against it, and create systems whose sole
aim is to be *simple* - I think scheme is an example of such. These
people haven't escaped from the complexity disease: they are still
obsessed with complexity and have lost sight of the problem.

I think that this situation is fairly desperate. As more people
become lost in complexity and thus stop doing anything useful but
simply create more complexity, the complexity of systems increases
enormously. People who have managed to not fall into the trap but
still have some overview of the problems they are *actually* trying to
solve rather than the problems created by complexity, now have a
problem. In order to interact with these hugely overcomplex systems,
they need to understand how they work, so they have to devote
increasing resources to understanding the complexity until eventually
it overwhelms them and they too become trapped and stop doing anything
useful. This looks pretty toxic: complexity is a virus which is going
to get us all in the end unless we can find a way of simply not
interacting with the systems which contain the virus.

However it's not quite as bad as it looks because there are external
factors: these systems are meant to be used for people's financial
benefit. Overcomplex systems are more expensive to deploy and
maintain and less reliable than merely sufficiently complex systems.
So they make less money for people. This will put the brakes on
complexity: eventually you won't be able to get funding any more to
produce yet another 1000+-page spec for something to `patch' some
deficiency in XML but which actually simply makes it worse. It's not
clear to me whether the system will then equilibruate, in the way
that, say, Word probably has, at a level where it is merely expensive
but not crippling, or whether there will be a real backlash as people
decide they'd like to spend money on something other than yet more
software.

--tim

PS: you can see the kind of effect that complexity has on people as
they get lost in it in some of the followups to my original article.
I basically said that XML was an overcomplex nightmare, and at least
one response suggested that I could fix this by learning and using
some yet other encrustations on top of XML which would let me generate
s-expressions, at the cost of a few more thousand pages of
documentation to understand. But if you have an XML tokenizer, it's
basically *trivial* to generate s-expressions from XML.

Kent M Pitman

unread,

Mar 6, 2002, 10:55:48 AM3/6/02

to

Julian Stecklina <der_j...@web.de> writes:

> "Eduardo Munoz" <e...@jet.es> writes:
>
> [...]
>
> > > It was a fascinating time, I can tell you.
> >
> > I'm sure. I love when KMP (or someone else) talks
> > about anciente (for me :) software or hardware
> > (PDP's, VAX, TOPS, Lisp Machines, ITS and the
> > like).
>
>
> There is a german movie called "23" that deals with some "hackers" in
> the mid 80ies. It was amazing when they bought a PDP as large as a
> washing machine. :)

Ha. Sometimes as SMALL as that. Mostly that's the size (and even
look) of a disk drive of the era. Except for the KS-10, which is a
latter day model that was very small, I think PDP10's, fully
configured with memory, etc. took up a LOT more space than that.
Certainly the ones we had at MIT did. I've seen some small
laundramats that weren't as big as the PDP10's we had...

Kenny Tilton

unread,

Mar 6, 2002, 11:30:34 AM3/6/02

to

Tim Bradshaw wrote:
> But [XML] is only a win if the friction added by the computing

> solution is lower than that it takes away.

Agreed. From the beginning I have concentrated on making Cells friendly
as well as powerful, having seen many an innovation which seemed great
to its developers not catch on with a wider audience because it was such
a pain to use.

But that was my point about hiding the complexity of XML. I have a
compiler that let's me program RISC from "C" and WYSIWYG layout tools
that write HTML for me. I want the same from XCML. Or more likely a
product such as XIS from Excelon, a native XML database sitting atop
their C++ ODB.

The business world spends a lot on programs that do nothing but convert
the output of one program into a format comprehensible to a second
program, even where the both programs were developed by the same
company. These conversion programs are a pain to write where the two
programs see the world differently, and they require steady maintenance
to keep up with changes to either the input or output format.

XML seems to me like it can minimize the sensitivity to file formats,
and since reaping this benefit requires folks to sit down and agree on
domain specific data structures, folks may get drawn into making their
apps see the world more uniformly. Call it self-fulfilling hype: the
world responds to the hype of a data lingua franca by taking steps they
could have taken twenty years ago, and voila the hyped product gets the
credit (and deserves it?)

Kenny Tilton

unread,

Mar 6, 2002, 11:36:58 AM3/6/02

to

Tim Bradshaw wrote:
>
> * Kenny Tilton wrote:
> > Since Erik and Tim have bitched and moaned the most about XML, I think
> > they have to do this latest CL contrib. I mean, what self-respecting
> > c.l.l contributor cannot point to a pro bono CL contrib?
>
> Well, if I had *time* I have this thing called DTML which is kind of
> pointy-bracket-compliant lisp:

I wonder if that is what we need for CL-PDF. I want to look at marrying
the Cells project with CL-PDF, but unless I do a WYSIWYG document editor
(should I?) I will need a markup language parser.

> Unfortunately not being an academic or independently wealthy
> it's seriously non-trivial to find time

true, true... if I were not so keen on Cells I certainly would have
trouble finding the time.

Jeff Greif

unread,

Mar 6, 2002, 11:39:55 AM3/6/02

to

Perhaps my response was not clear. I suggested that you provide a DTD
for each includable section, or module, and that the include directives,
or conditional include directives be a part of the XML. You parse the
XML files recursively from includer to included in each case using the
DTD of the file you're parsing (which knows nothing about what is in the
included files), and convert to sexpr form. Your application logic
evaluates the conditions, etc., and then decides whether to carry out
the specified inclusion, which in turn will be validated against its own
DTD and then converted to sexprs for deeper inclusions if any.

If these superficial DTDs that only validate the outer structure (down
to the includes) of each file aren't helpful, leave them out, if your
client will let you. If the client requires them, they shouldn't be all
that difficult to produce. They should only reflect the inclusion
syntax, not its semantics. An inclusion should be a leaf in a DTD.

Jeff

"Tim Bradshaw" <t...@cley.com> wrote in message
news:ey36649...@cley.com...

Paolo Amoroso

unread,

Mar 6, 2002, 11:41:37 AM3/6/02

to

On 05 Mar 2002 21:58:50 +0100, "Eduardo Muñoz" <e...@jet.es> wrote:

> I'm sure. I love when KMP (or someone else) talks
> about anciente (for me :) software or hardware
> (PDP's, VAX, TOPS, Lisp Machines, ITS and the
> like).

You can get emulators for some of them (PDPs, VAX and more):

SIMH - Computer History Simulation Project
http://simh.trailing-edge.com

Erik and Kent recently commented that knowledge of early file systems is
useful for understanding Common Lisp pathnames. Those emulators may be an
occasion to learn more about those file systems. By the way, a Lisp
implementation for a PDP (PDP-6?) is also available at that site.

Paolo
--
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://www.paoloamoroso.it/ency/README
[http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/]

Frederic Brunel

unread,

Mar 6, 2002, 12:27:22 PM3/6/02

to

Christopher Browne <cbbr...@acm.org> writes:

> When I have need to do so, I use Pierre Mai's C interface to expat.
> It uses the expat XML parser, and generates sexp output that can be
> read in using READ.
>
> (defun xml-reader (filename)
> (let ((xml-stream (common-lisp-user::run-shell-command
> (concatenate 'string *xml-parser* " <" filename)
> :output :stream)))
> (prog1
> (read xml-stream)
> (close xml-stream))))
>

> Note that this has the HIGHLY attractive feature of keeping all
> management of "ugliness" in a library (/usr/lib/libexpat.so.1) that is
> _widely_ used (including by such notables as Apache, Perl, Python, and
> PHP) so that it is likely to be kept _quite_ stable.

I think it's an acceptable solutions for most systems and it could be
usefull for me. Where could I find this piece of code before I written
my own? :)

--
Frederic Brunel
Software Engineer
In-Fusio, The Mobile Fun Connection

Ray Blaak

unread,

Mar 6, 2002, 12:26:15 PM3/6/02

to

"Jeff Greif" <jgr...@spam-me-not.alumni.princeton.edu> writes:
> [DTD/Schema validation is good because]

> The main reason is that you get several layers of error checking and
> reporting free:
> -- well-formedness checking
> -- primitive validity checking of the structure and attribute values
> against the DTD or other constraint language
>
> I've found that this greatly reduces the defensive parts of the
> application (the possible erroneous conditions your code has to handle).

It doesn't. Since the application cannot assume that the input has already
been validated (there is nothing stopping a user from giving the application
complete garbage, after all), it needs to check anyway. The alternative is
uncontrolled crashes.

> It wouldn't always be this way, but often when there is an error in the
> configuration file, the application is not going any farther until it's
> fixed, and the error reporting from the off-the-shelf software is just
> good enough to enable the user to fix the file and retry.

The application can use the same off-the-shelf software to report errors as
well. Alternatively, the application, when looking for required elements, or
finding misformed elements, and easily report the offending locations using
the services provided by standard XML tools.

> Another thing you get free is default values for attributes.

These default values can also be assumed directly by the application, given
the same benefits to the user.

> [...] and still allows the application defaults to be changed without
> changing running code.

This is one advantage of a DTD. However, if this is what one is after, then
default values can avoid being hardcoded in the usual way by being read in
from a vastly simpler configuration file (which can also be in XML, by the way,
but that is not the point).

> Finally, this mode of operation allows a certain amount of version skew
> between configuration file and your application. The parsing tools are
> completely generic and deliver all the standard info from the XML file,
> validated or not as you choose. What parts of it your application
> chooses to grab is a separate decision, as is what it decides to do with
> that information.

This can be done anyway. Application processing needs to be fairly tolerant so
that future or obsolete file versions can be accommodated. E.g. unexpected
elements can be ignored (perhaps with warnings) and only missing required
elements or malformed required elements are reported as errors.

Note that every grammar rule in a DTD or Schema implies some corresponding
code in the application to process it semantically. That is, the application
necessarily has the knowledge of the grammar hard coded within it. I prefer to
avoid the maintenance problem of keeping the grammar and the application in
synch.

The point is that what really matters is not the results of a prevalidation
against a DTD/Schema. In the end what matters is how the application processes
the file. The prevalidation can give you reasonable confidence, sure, but the
final validity of a file is not known until given to the application.

It's like dealing with fortune tellers: it's nice to know the future, but one
can't actually be sure of what will happen until the future actually becomes
now.

Similary, its nice to have some measure of confidence about an XML file, but
it will be given to the application anyway, so why not just skip a step?

> I'm pretty sure the Xerces parser that comes with Apache can both validate
> according to a DTD or XML Schema and also deliver SAX events.

Or you can simply ignore DTD/Schema validation, and your parsing is that much
faster.

If one does not worry about DTDs and Schemas there is a vast simplification in
usage (the client does not have to prevalidate, application parsing is
simpler), and complexity (no DTD/Schema needs to be developed and maintained),
with no real disadvantages (the same crucial error checking will be done
anyway).

So, why bother? It is just artificial work.

--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
bl...@telus.net The Rhythm has my soul.

Tim Bradshaw

unread,

Mar 6, 2002, 12:14:14 PM3/6/02

to

* Jeff Greif wrote:
> Perhaps my response was not clear. I suggested that you provide a DTD
> for each includable section, or module, and that the include directives,
> or conditional include directives be a part of the XML. You parse the
> XML files recursively from includer to included in each case using the
> DTD of the file you're parsing (which knows nothing about what is in the
> included files), and convert to sexpr form. Your application logic
> evaluates the conditions, etc., and then decides whether to carry out
> the specified inclusion, which in turn will be validated against its own
> DTD and then converted to sexprs for deeper inclusions if any.

I suppose I could do this. It would mean that I'd have many times
more config files than I have currently because modules can no longer
declare syntax valid for a single file. And any module writer would
have the burden of writing a DTD as well, but I suppose that's
inevitable.

--tim

Jeff Greif

unread,

Mar 6, 2002, 1:05:43 PM3/6/02

to

"Ray Blaak" <bl...@telus.net> wrote in message
news:usn7dl...@telus.net...

>
> "Jeff Greif" <jgr...@spam-me-not.alumni.princeton.edu> writes:
> > [DTD/Schema validation is good because]
> > The main reason is that you get several layers of error checking and
> > reporting free:
> > -- well-formedness checking
> > -- primitive validity checking of the structure and attribute
values
> > against the DTD or other constraint language
> >
> > I've found that this greatly reduces the defensive parts of the
> > application (the possible erroneous conditions your code has to
handle).
>
> It doesn't. Since the application cannot assume that the input has
already
> been validated (there is nothing stopping a user from giving the
application
> complete garbage, after all), it needs to check anyway. The
alternative is
> uncontrolled crashes.

It sure does in my experience. I don't have to worry about the user
failing to provide the closing tag for an element, or leaving out
element B that should follow or be inside element A, or some required
attribute. All that is handled by the validation of the content model.
My application code can then deal with validation logic that's either
too painful or impossible to express in the constraint language of the
DTD or other content model (such as constraints depending on the values
of two or more attributes). There's a certain minimum level of
validation which I find greatly simplifies the code I have to write --
many fewer branches and tests. It seems easier to put the constraints
declaratively in the DTD than to think of all the ways the user could
violate them and defend against it using naive code. If I already had a
constraint language and validation package written in my application,
then it would be unnecessary.

I think from your response that I must have been unclear. I expect that
the application invokes the off-the-shelf XML tools to process the input
and only sees the DOM or SAX events or whatever produced by those tools.
The user can only hand you garbage that fits the content model, or it
will be rejected before your code has to do anything with it.

>
> > It wouldn't always be this way, but often when there is an error in
the
> > configuration file, the application is not going any farther until
it's
> > fixed, and the error reporting from the off-the-shelf software is
just
> > good enough to enable the user to fix the file and retry.
>
> The application can use the same off-the-shelf software to report
errors as
> well. Alternatively, the application, when looking for required
elements, or
> finding misformed elements, and easily report the offending locations
using
> the services provided by standard XML tools.

Again, I must have been unclear -- using the error-reporting of the
off-the-shelf stuff, under control of the application is exactly what I
meant.

I disagree slightly. "Every grammar rule *that this version of the
application cares about* implies some corresponding code in the
application ..." Also, you can frequently, but certainly not always,
leave out some of the semantic processing, such as type checks of some
values, when the content model validation has already dealt with it.

>
> The point is that what really matters is not the results of a
prevalidation
> against a DTD/Schema. In the end what matters is how the application
processes
> the file. The prevalidation can give you reasonable confidence, sure,
but the
> final validity of a file is not known until given to the application.

Exactly right. My applications determined final validity, but in my
usage, the prevalidation saved me a lot of work.

> Or you can simply ignore DTD/Schema validation, and your parsing is
that much
> faster.

The one-time parsing of a config file at application startup is seldom a
performance issue, but if it were, you might skip use of a content model
and hard code all the validation logic in the application. This, for
me, would be a considerable reduction in flexibility.

Jeff

Christopher Browne

unread,

Mar 6, 2002, 1:17:12 PM3/6/02

to

In the last exciting episode, Frederic Brunel <frederi...@in-fusio.com> wrote::

I've emailed it out a couple of times.

I'll suggest that people head to Google Groups, and search for:
"pierre mai xml trivial expat"

I had to make some minor change to it to get it to compile; don't be
alarmed if it proves necessary to change a variable name or something
of the sort.

One suggestion I'll throw out would be for Pierre to stick it on a web
site somewhere, and maybe put in some sort of licensing statement to
remove fear from any quaking hearts. (I rather like the "If it
breaks, you get to keep both pieces; this is not the GPL" license.)

A little blurb at the front that at least says "I wrote this; nobody
else should claim they were the author" would be a good thing...
--
(concatenate 'string "cbbrowne" "@ntlug.org")
http://www3.sympatico.ca/cbbrowne/languages.html
The difference between a child and a hacker is the amount he flames about
his toys.
-- Ed Schwalenberg

Erik Naggum

unread,

Mar 6, 2002, 3:17:03 PM3/6/02

to

* Kent M Pitman <pit...@world.std.com>

| Ha. Sometimes as SMALL as that. Mostly that's the size (and even look)
| of a disk drive of the era. Except for the KS-10, which is a latter day
| model that was very small, I think PDP10's, fully configured with memory,
| etc. took up a LOT more space than that. Certainly the ones we had at
| MIT did. I've seen some small laundramats that weren't as big as the
| PDP10's we had...

The room at the University of Oslo that once housed their DECSYTEM 1099
SMP (I believe that was the designation) with almost a gigabyte of disk
space (I think it actually had 800M at its peak) and 2 meagwords of
memory (one moby = 128KW per cabinet, as I recall), now serves as the
computer terminal room for more than 100 students. I think they and
their workstations produce about 1/10th the heat the PDP-10 did.

That PDP-10 with its dual KL-10 SMP was the first real computer I used.
It might well be the last -- I still wait for something that would give
me the same satisfaction as a user and programmer. Buying the biggest
goddamn cabinet I could find for my now old dual 600MHz Pentium III with
its 512M RAM, 128G of disk space and dual power supplies did not quite
cut it, but at least people who visit me refuse to believe it is a "PC".
It even has a lot of front-panel blinkenlights, and if I listen in on the
100MHz band with my portable radio, I can hear the bus, Almost there...

Christopher C. Stacy

unread,

Mar 6, 2002, 3:28:38 PM3/6/02

to

>>>>> On Wed, 06 Mar 2002 00:26:54 -0500, Christopher Browne ("cbrowne") writes:

>>>Chris Stacy (cs) wrote:
cs>> One can write impenetrable programs in any language, but with dense
cs>> languages like APL you pack more trouble on a single line.
cs>> Professional APL programmers strive to make their code readable.

cbrowne> There is a counterargument to this...
cbrowne> Stopping for a while to interpret a loop is a substantial shift
cbrowne> of gears. This has the result that code too often goes to
cbrowne> near-pathological extremes to do Matrix Stuff that replaces loops.
cbrowne> The result of that is that some "inscrutability" is introduced.

As I said, one can write pathological code in any language.
Matrix operations are the basis of expression in APL, and the algorithms
and way of thinking about the problems are not the same ones that would
occur in languages. APL programmers learn to recognize the idioms.

My experience was at Scientific Timesharing Corporation, which was one
of the three vendors of APL (Sharpe and IBM being the others) in they
heyday of APL in the 1970s. The programmers at STSC were very concerned
with highly readable code -- it was all the code we were all going to
have to read all the time! The education department also strived to
teach customers good style in their code. Regular publications also
concerned themselves in this area (along with performance, tricks,
and all the other same areas you'd find in any language.)

My experience with APL does not extend beyond that, so maybe there
are a lot of bad APL programmers running around. There are certainly
a lot of bad programmers in every other language, so...

Erik Naggum

unread,

Mar 6, 2002, 3:43:55 PM3/6/02

to

* Tim Bradshaw <t...@cley.com>

| This looks pretty toxic: complexity is a virus which is going to get us
| all in the end unless we can find a way of simply not interacting with
| the systems which contain the virus.

This (and the preceding discussion) so succinctly sum up why I quit
working with SGML and refuse to work with Microsoft's evil cruft. When
it dawned me that after a person had figured out the point behind SGML,
it would be more expensive to use SGML than any other tool, I could no
longer write the book I was working on about SGML and had to get out.

Erann Gat

unread,

Mar 6, 2002, 3:20:37 PM3/6/02

to

In article <ey33czd...@cley.com>, Tim Bradshaw <t...@cley.com> wrote:

> I think that this situation is fairly desperate. As more people
> become lost in complexity and thus stop doing anything useful but
> simply create more complexity, the complexity of systems increases
> enormously. People who have managed to not fall into the trap but
> still have some overview of the problems they are *actually* trying to
> solve rather than the problems created by complexity, now have a
> problem. In order to interact with these hugely overcomplex systems,
> they need to understand how they work, so they have to devote
> increasing resources to understanding the complexity until eventually
> it overwhelms them and they too become trapped and stop doing anything
> useful. This looks pretty toxic: complexity is a virus which is going
> to get us all in the end unless we can find a way of simply not
> interacting with the systems which contain the virus.

I think you're absolutely right about this.

> However it's not quite as bad as it looks because there are external
> factors: these systems are meant to be used for people's financial
> benefit. Overcomplex systems are more expensive to deploy and
> maintain and less reliable than merely sufficiently complex systems.
> So they make less money for people. This will put the brakes on
> complexity: eventually you won't be able to get funding any more to
> produce yet another 1000+-page spec for something to `patch' some
> deficiency in XML but which actually simply makes it worse. It's not
> clear to me whether the system will then equilibruate, in the way
> that, say, Word probably has, at a level where it is merely expensive
> but not crippling, or whether there will be a real backlash as people
> decide they'd like to spend money on something other than yet more
> software.

About this I'm not so sure you're right (though I desperately hope you
are). I have puzzled over this for a lot of years. It's been clear to me
in my personal experience that Lisp (and in general keeping things simple)
is an enormous productivity win. Nonetheless, neither Lisp nor simplicity
have won in the market. Why? I don't know the answer, but here are a few
hypotheses:

First, keeping things simple is not easy, and making things simple once
they are complicated is even harder. It's not clear that simplifying is
less work than just dealing with the complexity, especially if your
strategy for dealing with complexity is to hire someone to deal with it
for you.

Second, what matters is not actual productivity, but the *perception* of
productivity. Economies are confidence games. Dollars have value only
because enough people believe that they have value. (Perhaps a better
example is Enron stock. Until recently Enron stock had actual value
because enough people believed it had value. Enough people lost that
belief, and the stock lost its value. Art is another example.) Software
works the same way. Enough people start to believe something (like Linux
or XML) has value, and it does. Enough people believe that something else
(like BeOS) doesn't have value, and it doesn't.

(There are of course some things that have *inherent* value that is not
dependent on people's mindsets, like food and clean water. But even there
psychological aspects come into play, like when someone in the US pays $5
for a bottle of Evian.)

E.

Lieven Marchand

unread,

Mar 6, 2002, 12:27:17 PM3/6/02

to

Kent M Pitman <pit...@world.std.com> writes:

> Julian Stecklina <der_j...@web.de> writes:
> > There is a german movie called "23" that deals with some "hackers" in
> > the mid 80ies. It was amazing when they bought a PDP as large as a
> > washing machine. :)
>
> Ha. Sometimes as SMALL as that. Mostly that's the size (and even
> look) of a disk drive of the era. Except for the KS-10, which is a
> latter day model that was very small, I think PDP10's, fully
> configured with memory, etc. took up a LOT more space than that.
> Certainly the ones we had at MIT did. I've seen some small
> laundramats that weren't as big as the PDP10's we had...

Toys. IBM dinosaurs of these days were really BIG. Start with a
transfo to convert European 240V/60Hz to American 120V/50Hz bigger
than a washing machine, then add the water cooling system that
occupied about 6 square meters and you have already a lot of space
without any computing. The actual mainframe was only 2x4 m. Off
course, disk packs a la 3380 and tape drives were also stand alone and
big.

One of Comparex's ads in these days was to paint their clone, which
was a lot smaller for equivalent power, bright red and challenge
people to find it in the computer room amidst its IBM counterpart.

--
Lieven Marchand <m...@wyrd.be>
She says, "Honey, you're a Bastard of great proportion."
He says, "Darling, I plead guilty to that sin."
Cowboy Junkies -- A few simple words

MSCHAEF.COM

unread,

Mar 6, 2002, 5:09:39 PM3/6/02

to

In article <gat-060302...@eglaptop.jpl.nasa.gov>,

Erann Gat <g...@jpl.nasa.gov> wrote:
>In article <ey33czd...@cley.com>, Tim Bradshaw <t...@cley.com> wrote:
>
>> I think that this situation is fairly desperate. As more people
>> become lost in complexity and thus stop doing anything useful but
>> simply create more complexity, the complexity of systems increases
>> enormously. People who have managed to not fall into the trap but
>> still have some overview of the problems they are *actually* trying to
>> solve rather than the problems created by complexity, now have a
>> problem. In order to interact with these hugely overcomplex systems,
>> they need to understand how they work, so they have to devote
>> increasing resources to understanding the complexity until eventually
>> it overwhelms them and they too become trapped and stop doing anything
>> useful. This looks pretty toxic: complexity is a virus which is going
>> to get us all in the end unless we can find a way of simply not
>> interacting with the systems which contain the virus.
>
>I think you're absolutely right about this.

I don't know if this is common knowledge, but assuming it's not, there's
an online archive of Edsger Dijkstra's EWD* notes at this URL

http://www.cs.utexas.edu/users/EWD/

quoting from 1304:
(http://www.cs.utexas.edu/users/EWD/ewd13xx/EWD1304.PDF)

>I would therefore like to posit that computing's central challenge,
>viz. "How not to make a mess of it". has not been met. On the contrary,
>most of our systems are much more complicated than can be considered
>healthy, and are too messy and chaotic to be used in confidence.

Reading back, he's been talking about those kinds of issues for years.
I don't know whether or not to take solace in the fact that computing
has survived as long with this kind of excess complexity, or to sob that
the problem doesn't seem close to being addressed.

-Mike

--
http://www.mschaef.com

Eduardo Muñoz

unread,

Mar 6, 2002, 6:08:25 PM3/6/02

to

Paolo Amoroso <amo...@mclink.it> writes:

> SIMH - Computer History Simulation Project
> http://simh.trailing-edge.com

Thanks for the pointer :)
It doesn't build at the first shot, but I will
keep trying.

--

Eduardo Muñoz

Eric Moss

unread,

Mar 7, 2002, 2:17:19 AM3/7/02

to

Erann Gat wrote:

> Nonetheless, neither Lisp nor simplicity
> have won in the market. Why? I don't know the answer, but here are a few
> hypotheses:

Well, here's one of my biases for you (given that I agree with your
points (both of you):

Some people think that a collection of factoids equals intelligence, or
at least job security. By knowing a tremendous number of things that
others can't keep in their heads, the factoid expert becomes
"valuable". It's easy for them to snow others who spend their time
struggling with the blizzard of "knowledge". There is a macho aspect to
this, which I'm guessing you have all seen. It's also the dominant
theme in the American public educational system, which typically rewards
superficial understanding of many topics rather than deep understanding
of basic principles. That carries through, and so even when the
univeristy demands otherwise, the early conditioning often wins out.

I have other theories, but maybe it's as simple as Steve Jobs once said:
"You know, a lot of people simply have no taste." ;)

Eric

--
"The obvious mathematical breakthrough would be development of an easy
way to factor large prime numbers."

Bill Gates, The Road Ahead, Viking Penguin (1995), page 265

Espen Vestre

unread,

Mar 7, 2002, 3:37:55 AM3/7/02

to

Tim Bradshaw <t...@cley.com> writes:

> I suppose I could do this. It would mean that I'd have many times
> more config files than I have currently because modules can no longer
> declare syntax valid for a single file. And any module writer would
> have the burden of writing a DTD as well, but I suppose that's
> inevitable.

This calls for a lot of new positions, for DTD writers, config-file
configuration managers and so on. And of course project leaders! For
the configuration management tool purchase project, for instance! Lots
of specialists that will help reduce the power of those nasty hackers
that think they can do everything on their own. Maybe this even calls
for a department split - an excellent opportunity for the department
manager to make a career jump!

No wonder these things are so popular within big organizations :-)
--
(espen)

Espen Vestre

unread,

Mar 7, 2002, 4:00:24 AM3/7/02

to

Kenny Tilton <kti...@nyc.rr.com> writes:

> XML seems to me like it can minimize the sensitivity to file formats,

I thought this was true until I looked at the xml produced by MS Word
for the first time.

--
(espen)

Tim Bradshaw

unread,

Mar 7, 2002, 8:49:55 AM3/7/02

to

* Erann Gat wrote:
> About this I'm not so sure you're right (though I desperately hope you
> are). I have puzzled over this for a lot of years. It's been clear to me
> in my personal experience that Lisp (and in general keeping things simple)
> is an enormous productivity win. Nonetheless, neither Lisp nor simplicity
> have won in the market. Why? I don't know the answer, but here are a few
> hypotheses:

[cut reasonable theories].

Well one thing about Lisp, or Common Lisp, is I think the complexity
thing. When it was created, Common Lisp had a fairly large
specification. As I said I think that CL's specification is not over
large for what it does, but it's still big compared to 1st ed K&R C or
something. This means that you have to have read and understood a lot
of stuff to write good CL, which means that either you're expensive to
hire or you write bad CL. I've looked at a fair amount of CL code and
a lot of it is quite simply atrocious, obviously written by people who
really didn't understand CL. Things are made worse for CL because of
Lisp's background in AI, so there's lots of CL code written by AI
academics who suffer from permanent fear of not being `proper
scientists' and have resulting huge logs on their shoulders about
things like programming, which they regard as a craft skill (which it
is), and thus fundamentally lower-class and non-academic skill which
is beneath them as upper-class academics. So they deliberately never
learn to program well but churn out terrible code (we once pointed out
to one such person that his code was not commercialisable, in the most
tactful way possible. He's never spoken to us again).

So CL has historically been too complicated to succeed, among other
things.

Meantime, the `simple systems', which can be understood by people not
willing or able to read large documents are solving the same problems
by much larger and more complex programs than would be required in
CL. These programs are, of course, much harder to understand than the
effort required to read the CL spec and the shorter CL program, but
that's OK, because the effort is not required up-front, so it's still
cheap to hire people, and also because these huge complex systems have
no specifications anyway, so there's nothing to be frightened of.

And finally something nasty happens: some of these overcomplex-but
informal morasses get specifications, and then the whole process goes
exponential: people get lost in these specs and start writing more and
in a period of a year or so XML changes from a few hundred pages of
spec to 20,000 pages with a thousand new pages being written every
three months.

CL again loses out here because it can't keep up with the specs: it's
impossible to write code fast enough to keep up unless you're a huge
corporation. So CL can't be `compatible' and this is seen as a huge
defect.

However, I think that as all these overcomplex systems collapse under
the weight of paper, merely sufficiently complex systems, like CL,
might stand quite a good chance of doing rather well.

--tim

Jeff Greif

unread,

Mar 7, 2002, 9:48:21 AM3/7/02

to

I must be missing something. The intent of the approach I described is
precisely that of having the module writer prepare a DTD which describes
the structure of the configuration tree pertinent to that module only,
with some leaf nodes indicating where other content is to be or might be
included. This DTD would be no more complicated than any other
description of the module configuration content structure than would
have to be written in any case to tell someone how to use it, and
isomorphic with whatever structure description you used for the
s-expression version of the same configuration tree (down to the
metasyntax leaves). If you have a machine-readable description of the
module configuration structure, it could easily be transformed
mechanically into a DTD. After the XML parser output was converted to
sexprs, your existing application code would determine how to handle the
inclusions, just as it did before you had to mess with this XML stuff.

Validation of an included XML configuration file for a module would use
only the DTD for that module. XML Validation of an including file would
use a DTD that treated your metasyntax as leaf elements.

The idea is to minimize the extra work the XML requirements are
imposing, while meeting the client documentation requirements, and
avoiding the introduction of bugs owing to the XML requirement by
converting the syntax to the already working one as early as possible
and in the most mechanical fashion possible. At the file level, the XML
tree, if valid according to its DTD, must be isomorphic to a
corresponding sepxr tree.

Jeff

"Tim Bradshaw" <t...@cley.com> wrote in message

news:ey3lmd5...@cley.com...

Kenny Tilton

unread,

Mar 7, 2002, 10:21:46 AM3/7/02

to

heh-heh. Is that a fair test of XML?

Anyway, what I meant was that XML gets away from flat files in which the
mapping of data to fields requires hardcoded record definitions
specifying starting position, field type and length.

Tim Bradshaw

unread,

Mar 7, 2002, 10:25:40 AM3/7/02

to

* Jeff Greif wrote:
> This DTD would be no more complicated than any other
> description of the module configuration content structure than would
> have to be written in any case to tell someone how to use it, and
> isomorphic with whatever structure description you used for the
> s-expression version of the same configuration tree (down to the
> metasyntax leaves). If you have a machine-readable description of the
> module configuration structure, it could easily be transformed
> mechanically into a DTD.

Up to small issues like halting problems, since my machine readable
description is a Lisp program, and the inability of DTDs to actually
describe the legal syntax (unless I'm mistaken small matters like
ensuring that something is a number are too hard for DTDs).

I'm not really interested in this any more, so I think we should stop.
It's apparent to me that there is really so little common ground
between XML apologists and me that useful communication is essentially
impossible, unfortunately.

--tim

Espen Vestre

unread,

Mar 7, 2002, 10:34:58 AM3/7/02

to

Kenny Tilton <kti...@nyc.rr.com> writes:

> heh-heh. Is that a fair test of XML?

;-)

> Anyway, what I meant was that XML gets away from flat files in which the
> mapping of data to fields requires hardcoded record definitions
> specifying starting position, field type and length.

Sure. Unfortunately XML-freaks want to xmlize every file format, for
instance log file formats which are compact, reasonable and
greppable. The result? Your storage array vendor can buy himself a
golden bathtub and your sysadmins quit when you show them the
XML-parsing tool they now need to use instead of good old grep...
--
(espen)

Erik Naggum

unread,

Mar 7, 2002, 10:44:44 AM3/7/02

to

* "Jeff Greif" <jgr...@spam-me-not.alumni.princeton.edu>

| The intent of the approach I described is precisely that of having the
| module writer prepare a DTD which describes the structure of the
| configuration tree pertinent to that module only, with some leaf nodes
| indicating where other content is to be or might be included.

The approach you seem to want here, is an ex post facto DTD. In my view,
that is a very sensible approach. They allow you to validate whatever
you actually have. Constructing such a DTD is unfortunately not all that
trivial, but it is quite doable. It is a pity that people do not use
such a tool to construct DTDs that correspond to test documents instead
of trying to do it by hand.

That would make it possible to ensure that you "freeze" a configuration
file format at a particular point without actual human intervention of
the obnoxious silliness of DTD "design", but still allow your users to
waste their time "validating" their configuration files and editing them
with tools that have zero knowledge of their contents, only their syntax.

Please remember that SGML and XML rely on an _application_ in order to
imbue their syntactic verbosity with semantics of any kind. You have to
use a large number of notations and notation validators to even approach
correctness of a configuration file in a meaningful way.

Structure is _nothing_ if it is all you got. Skeletons _spook_ people if
thwy try to walk around on their own. I really wonder why XML does not.

Tim Bradshaw

unread,

Mar 7, 2002, 10:48:03 AM3/7/02

to

* Kenny Tilton wrote:
>>
>> I thought this was true until I looked at the xml produced by MS Word
>> for the first time.

> heh-heh. Is that a fair test of XML?

Yes, because 98% of the XML you have to deal with will be produced by
MS applications, and will have subtle incompatibilities and
obscurities both intentionally added for the purpose of reducing its
openness and accidentally added because only 3 people in the world
understand enough of the 11,000 layered standards on top of XML to
create legal documents.

Actually, I'd always assumed that MS Word's XML would look something
like:

<ms-word-document version='omitted in the interests of obscurity'
...>
<byte-stream>

<byte hex-value='00' decimal-value='0' binary-value='00000000'
octal-value='0'>
<hex>00</hex>
<decimal>0</decimal>
<binary>00000000</binary>
<octal>0</octal>
</byte>

...

</byte-stream>
</ms-word-document>

(except it will be MSWordDocumnt and BytStrM and so on).

That's a blowup factor of ~250 which is not bad. Saves having to
provide floppy support for Windows anyway.

--tim

Kenny Tilton

unread,

Mar 7, 2002, 12:30:14 PM3/7/02

to

Tim Bradshaw wrote:
>
> * Kenny Tilton wrote:
> >>
> >> I thought this was true until I looked at the xml produced by MS Word
> >> for the first time.
>
> > heh-heh. Is that a fair test of XML?
>
> Yes, because 98% of the XML you have to deal with will be produced by
> MS applications,

So it is XML's fault that Microsoft can screw it up? Is there anything
they cannot screw up? You are blaming the tool for what a lousy
practitioner does with it. Sounds like what happens to Lisp.

Dude, you are passing up a lovely opportunity to slam MS.

This reminds me of Graham's beef with OO, viz, that you can do
spaghetti-OO with it. I know what he means, my class hierarchies
sometimes migrate from coherent to pasta as they evolve. I don't blame
OO, I refactor (a phrase I know you are keen on).

Tim Bradshaw

unread,

Mar 7, 2002, 1:23:48 PM3/7/02

to

* Kenny Tilton wrote:

> So it is XML's fault that Microsoft can screw it up? Is there anything
> they cannot screw up? You are blaming the tool for what a lousy
> practitioner does with it. Sounds like what happens to Lisp.

> Dude, you are passing up a lovely opportunity to slam MS.

Good point.

--tim

Julian Stecklina

unread,

Mar 7, 2002, 1:24:01 PM3/7/02

to

Tim Bradshaw <t...@cley.com> writes:

[...]

> <ms-word-document version='omitted in the interests of obscurity'
> ...>
> <byte-stream>
> 
> <byte hex-value='00' decimal-value='0' binary-value='00000000'
> octal-value='0'>
> <hex>00</hex>
> <decimal>0</decimal>
> <binary>00000000</binary>
> <octal>0</octal>
> </byte>
> 
> ...
> 
> </byte-stream>
> </ms-word-document>
>
>
> (except it will be MSWordDocumnt and BytStrM and so on).
>
> That's a blowup factor of ~250 which is not bad. Saves having to
> provide floppy support for Windows anyway.

The XML data looks very compressible to me. Looks like MS thinks: "Let
others fix it" *g*

Regards,
Julian
--
Meine Hompage: http://julian.re6.de

Um meinen oeffentlichen Schluessel zu erhalten:
To get my public key:
http://math-www.uni-paderborn.de/pgp/

Joe Marshall

unread,

Mar 7, 2002, 5:03:10 PM3/7/02

to

"Tim Bradshaw" <t...@cley.com> wrote in message

news:ey3wuwo...@cley.com...

> I'm not really interested in this any more, so I think we should stop.
> It's apparent to me that there is really so little common ground
> between XML apologists and me that useful communication is essentially
> impossible, unfortunately.

Perhaps XML aficionados should frame their arguments in properly validated
XML.

Erik Naggum

unread,

Mar 7, 2002, 6:19:20 PM3/7/02

to

* Tim Bradshaw

| That's a blowup factor of ~250 which is not bad. Saves having to
| provide floppy support for Windows anyway.

Heh. One thing that continues to amaze me with those *ML people is that
both they and their software have this unnerving fixation on whitespace
and sprinkle it liberally to sort of indent their HTML code, as if anyone
could possibly have cared. Some web pages I have seen have more than 30%
whitespace that is discarded as "markup" by whatever passes for a parser.

Erik Naggum

unread,

Mar 7, 2002, 7:27:13 PM3/7/02

to

* Kenny Tilton

| So it is XML's fault that Microsoft can screw it up?

Microsoft is the major force pushing XML. So, yes.

| Is there anything they cannot screw up? You are blaming the tool for what
| a lousy practitioner does with it. Sounds like what happens to Lisp.

No, the lousy practitioners are not the people pushing Lisp the hardest.

| Dude, you are passing up a lovely opportunity to slam MS.

I think I can do both XML and MS at the same time. :)

| This reminds me of Graham's beef with OO, viz, that you can do
| spaghetti-OO with it. I know what he means, my class hierarchies
| sometimes migrate from coherent to pasta as they evolve. I don't blame
| OO, I refactor (a phrase I know you are keen on).

I know a behavioral psychologist. He does safety instructions for ships
and oil rigs for a living. He argues that if you give instructions that
people do not follow, the instructions are wrong. There is probably some
sense to this. In some particular domains it is probably even true.

Erik Naggum

unread,

Mar 7, 2002, 7:34:26 PM3/7/02

to

* Julian Stecklina <der_j...@web.de>

| The XML data looks very compressible to me.

Precisely, and the nutjobs who clamor about XML at the input in order to
get their fancy XSLT processing or whatever fail to understand that you
can get their DOM nonsense out of just about anything. It is really,
really annoying that these people have to see <foo bar="zot">quux</foo>
in the actual input source instead of just <quux> or whatever. SGML even
had some rudimentary support for this kind of "implicit" parsing, called
"data tags".

People with Common Lisp experience understand that you can parse just
about anything into any sort of internal memory representation you want,
and use a _different_ mapping from internal to external representation
than from external to internal if you need or want to. People without
such experience seem to believe that there is somehow only _one_ of those
representations that really matter, while the other matters not. That
XML has continued to make this mistake, despite strong arguments against
this view over many years from several people, who even influenced the
design of XML, is one of those tragedies that history books are for. I
am just sorry I live in a time and work with stuff that they will write
about, eventually. But at least I jumped ship before XML was born, so it
was not my fault. I have deniability, at the very least. Whew!

Daniel Pittman

unread,

Mar 7, 2002, 7:02:30 PM3/7/02

to

On 07 Mar 2002, Tim Bradshaw wrote:
> * Kenny Tilton wrote:
>>>
>>> I thought this was true until I looked at the xml produced by MS
>>> Word for the first time.
>
>> heh-heh. Is that a fair test of XML?
>
> Yes, because 98% of the XML you have to deal with will be produced by
> MS applications

[...]

> Actually, I'd always assumed that MS Word's XML would look something
> like:

[...]

> That's a blowup factor of ~250 which is not bad. Saves having to
> provide floppy support for Windows anyway.

Oh, well may you laugh. I once read an XML document[1] intended for use
in a shared project aimed at designing an interstellar starship.

This document touted the value of XML as a format because it made the
data simple, clear and accessible without reference to the creating
application.

The document included, among it's tags, the <bit></bit> tag, encoding a
single bit of information (as 1 or 0, as far as I could tell). A
sequence of these <bit> tags seemed to be the content of the outer leaf
of the tree.

Given there tended to be eight or ten of these, I calculate a factor of
88 times (!!) expansion for a document that featured no formatting
whitespace. Since it was nicely indented to the nesting the bit tag in
the file, the actual expansion was around 248 times.

That was the point at which I ceased to consider the project as having
any chance of success.[2]

Daniel

Footnotes:
[1] Well, a specification in English for one and part of some sample
documents.

[2] Not that I thought the odds of a self-declared "Open Source"[3]
project achieving the target they set were great to start with.

[3] This seemed to be mostly an attempt to capture mind-share because
the term was popular...

--
Romance is rape embellished with meaningful looks.
-- Andrea Dworkin, _Philadelphia Inquirer_, May 21, 1995

Patrick

unread,

Mar 7, 2002, 8:42:49 PM3/7/02

to

> In article <ey33czd...@cley.com>, Tim Bradshaw <t...@cley.com> wrote:
>
> > I think that this situation is fairly desperate. As more people
> > become lost in complexity and thus stop doing anything useful but
> > simply create more complexity, the complexity of systems increases
> > enormously. People who have managed to not fall into the trap but
> > still have some overview of the problems they are *actually* trying to
> > solve rather than the problems created by complexity, now have a
> > problem.

I used to hope that if the computer science community ever decides to pause,
take stock, and converge on a core set of principles with a view to creating
a lingua franca designed to survive centuries rather than decades, it will
be a form of Lisp. Nothing else is nearly as adaptable. Nothing lends itself
better to intelligent tools. The interchangeability of data and code makes
it ideal for depicting, storing, intelligently manipulating and transmitting
any kind of artifact. And the core simplicity of the syntax and semantics
would allow us to use the machines for genuinely interesting purposes,
instead of wasting effort on compexities that are not technical but
historical and commercial accidents.

IMO, there are good technical reasons for disliking Java, XML, and .NET. But
the likelihood that they will delay a real solution beyond our
three-score-and-ten is a better reason to actively hate them.

> > In order to interact with these hugely overcomplex systems,
> > they need to understand how they work, so they have to devote
> > increasing resources to understanding the complexity until eventually
> > it overwhelms them and they too become trapped and stop doing anything
> > useful. This looks pretty toxic: complexity is a virus which is going
> > to get us all in the end unless we can find a way of simply not
> > interacting with the systems which contain the virus.

Aye, there's the rub. Can't bring yourself to deal with it. Can't afford to
ignore it.

Actually, if you *can* afford to not interact with these toxic systems, then
why not follow your own path and let the rest of the world go to buggery?
Quality of life is well worth the money you won't make.

james anderson

unread,

Mar 8, 2002, 5:05:26 AM3/8/02

to

Erik Naggum wrote:
>
> ...

>
> People with Common Lisp experience understand that you can parse just
> about anything into any sort of internal memory representation you want,

... including xml.

> and use a _different_ mapping from internal to external representation
> than from external to internal if you need or want to. People without
> such experience seem to believe that there is somehow only _one_ of those
> representations that really matter, while the other matters not.

on the other hand, lisp programmers share with their counterparts the
trait that, how did that go, they "know the value for everything, but
the cost of nothing."

decoding configuration decriptions which are encoded as xml is not the
problem. the "problem" would be to determine how these expressions fit
into a process outside of the application-as-implemented-in-lisp and
what needs to be encoded in them. it is hard to tell from the
description whether the "political" reasons did, in fact, have technical ramifications.

...

Fernando Rodríguez

unread,

Mar 8, 2002, 5:15:05 AM3/8/02

to

On Thu, 07 Mar 2002 22:03:10 GMT, "Joe Marshall" <prunes...@attbi.com>
wrote:

Don't laugh, but the manager of a project I worked on 2 years ago, asked us to
write our comments in xml. We were even encouraged to use xml for the
handwritten notes taken during meetings and code-review sessions. (sic)

The obvious result was that everybody stopped commenting code and many of us
resigned to preserve our mental health.

PS This is not a joke. :-(

----
Fernando Rodríguez
frr at wanadoo dot es
-------

Frederic Brunel

unread,

Mar 8, 2002, 8:43:59 AM3/8/02

to

> I'll suggest that people head to Google Groups, and search for:
> "pierre mai xml trivial expat"

Thanx, I've got the code to work but I've face a strange
problem... Which Common Lisp implementation did you use?

I have modified Pierre's code to get a file as input but when I run it
in CMUCL with (ext:run-program ""), it gets freezed and I'm unable to
get a stream from it whereas the program runs perfectly from bash (and
pipes)! :(

--
Frederic Brunel
Software Engineer
In-Fusio, The Mobile Fun Connection

Nils Goesche

unread,

Mar 8, 2002, 9:28:25 AM3/8/02

to

In article <32245364...@naggum.net>, Erik Naggum wrote:
> That XML has continued to make this mistake, despite strong
> arguments against this view over many years from several people,
> who even influenced the design of XML, is one of those tragedies
> that history books are for. I am just sorry I live in a time
> and work with stuff that they will write about, eventually.
> But at least I jumped ship before XML was born, so it was not
> my fault. I have deniability, at the very least. Whew!

Germans used to call this ``Die Gnade der spaeten Geburt'',
something like ``The grace of late birth'' or some such...

Regards,
--
Nils Goesche
"The sooner all the animals are dead, the sooner we'll find
their money." -- Ed Bluestone
PGP key ID 0x42B32FC9

Erik Naggum

unread,

Mar 8, 2002, 9:36:15 AM3/8/02

to

* james anderson <james.a...@setf.de>

| on the other hand, lisp programmers share with their counterparts the
| trait that, how did that go, they "know the value for everything, but
| the cost of nothing."

You had a point?

Craig Brozefsky

unread,

Mar 8, 2002, 11:02:40 AM3/8/02

to

james anderson <james.a...@setf.de> writes:

> Erik Naggum wrote:
> >
> > ...
> >
> > People with Common Lisp experience understand that you can parse just
> > about anything into any sort of internal memory representation you want,
>
> ... including xml.
>
> > and use a _different_ mapping from internal to external representation
> > than from external to internal if you need or want to. People without
> > such experience seem to believe that there is somehow only _one_ of those
> > representations that really matter, while the other matters not.
>
> on the other hand, lisp programmers share with their counterparts the
> trait that, how did that go, they "know the value for everything, but
> the cost of nothing."

That's a troll around these parts, and I think there is a two beer
minimum before everyone has sufficient levity to appreciate the
comment as it was intended.

I had two beers late last night, and had just one cup of coffee this
morning, so I'm qualified to reply.

When you get down to it, the tools for evaluating the cost of
operations in a CL implementation are more than adequate for
determining the real cost of an operation. That said, not only do I
as a CL hacker understand the VALUES (remember this is CL so we can
have more than one) of everything, but I can determine the cost of an
operation or set of operations or a design decision quicker than C or
Java hackers. Not only can I try out multiple approaches faster, but
my profiling, call recording, memeory monitoring, code stepping and
call statistics tools are boffo!

> decoding configuration decriptions which are encoded as xml is not
> the problem. the "problem" would be to determine how these
> expressions fit into a process outside of the
> application-as-implemented-in-lisp and what needs to be encoded in
> them. it is hard to tell from the description whether the
> "political" reasons did, in fact, have technical ramifications.

Yah, that accursed lack of a brick wall properly seperating data from
code is such a burden for us CL haxors. That is not a cost, that is a
value, and just because other people don't have the vaue that you do
does not turn it into a cost. It's really all those XML/Java/C++
headz that don't understand the cost of their languages failures and
the cost of the band-aids they try to patch things up with.

XML is a great example, lame ass programmerdweebz getting weak in the
knees at the thought of writing a parser see XML as their savior.
Then once they get that done they graduate to the net circle of hell,
err I mean level of competence, and suddenly since everyone has a
parser that understands their sin tax they figure that everything
shoudl be encoded in that and it would like make it all inter-operable
and be a great big synchronicity win!

So, for the VALUE of not writing a parser they incur billions of
dollars worth of marketing, re-engineering, training, support, and
mental health maintenance COSTS.

Ok, I lied, Ihad one cup of coffee last night and two beers this
morning. Or was it three? Damn, time to check the empy CAR and CDR
slots in my SIXPACK. Look out world, I'm a drunk industrial CL
programmer here to improve your productivity apps!

--
Craig Brozefsky <cr...@red-bean.com>
http://www.red-bean.com/~craig
Ask me about Common Lisp Enterprise Eggplants at Red Bean!

Steve Long

unread,

Mar 8, 2002, 3:28:27 AM3/8/02

to

Another memorable quote.

> Structure is _nothing_ if it is all you got. Skeletons _spook_ people if

> they try to walk around on their own. I really wonder why XML does not.

Rahul Jain

unread,

Mar 8, 2002, 5:54:18 PM3/8/02

to

james anderson <james.a...@setf.de> writes:

> on the other hand, lisp programmers share with their counterparts the
> trait that, how did that go, they "know the value for everything, but
> the cost of nothing."

In the modern lisp community, I often see evidence to the
opposite. Macro- and sometimes micro-efficiency are very common topics
of discussion.

--
-> -/- - Rahul Jain - -\- <-
-> -\- http://linux.rice.edu/~rahul -=- mailto:rj...@techie.com -/- <-
-> -/- "I never could get the hang of Thursdays." - HHGTTG by DNA -\- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
Version 11.423.999.221020101.23.50110101.042
(c)1996-2002, All rights reserved. Disclaimer available upon request.

Kenny Tilton

unread,

Mar 9, 2002, 12:50:57 AM3/9/02

to

Is there anyone here who thinks they cannot /generate/ legal XML? (cake)

Is there anyone here with a requirement to /read/ legal but otherwise
unspecified XML? (ouch)

Is anyone aware of a standardized domain-specific XML representation
that is unreadable by an app to which the domain-specific XML
representation is known? (not)

These are /my/ sticking points in this discussion. I do not think I
could read arbitrary XML, nor do I think arbitrary XML is meant to be
readable. That's why folks in different domains are standardizing.
(hello?)

--

kenny tilton
clinisys, inc
---------------------------------------------------------------

"Let's pretend...we're real people."
- Ty, Caddy Shack

Daniel Barlow

unread,

Mar 8, 2002, 6:48:23 PM3/8/02

to

Craig Brozefsky <cr...@red-bean.com> writes:

> Then once they get that done they graduate to the net circle of hell,
> err I mean level of competence, and suddenly since everyone has a
> parser that understands their sin tax they figure that everything
> shoudl be encoded in that and it would like make it all inter-operable
> and be a great big synchronicity win!

You forgot to say "leverage"

-dan, going forward

--

http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources

Erik Naggum

unread,

Mar 9, 2002, 11:38:15 AM3/9/02

to

* Kenny Tilton <kti...@nyc.rr.com>

| Is there anyone here who thinks they cannot /generate/ legal XML? (cake)
|
| Is there anyone here with a requirement to /read/ legal but otherwise
| unspecified XML? (ouch)
|
| Is anyone aware of a standardized domain-specific XML representation
| that is unreadable by an app to which the domain-specific XML
| representation is known? (not)
|
| These are /my/ sticking points in this discussion. I do not think I
| could read arbitrary XML, nor do I think arbitrary XML is meant to be
| readable. That's why folks in different domains are standardizing.
| (hello?)

I have some problem getting at your rhetorical style, but it seems that
you are not arguing, only positing and stipulating.

There are _many_ people who cannot generate valid (legal is something
else :) XML if their life depends on it. The problem is the moronic DTD
and/or schema cruft. Getting both of these right is a nightmare that the
people who could have been trained for it do not want to do and those who
do it are not trained to handle that kind of programmer-level design.

XML _is_ readable without having a DTD and having a clue what the data
"means" -- that was the primary design goal. The problem with SGML was
that you would not know whether some element had CDATA or RCDATA declared
contents, meaning that only an end-tag would be recognized (and it better
be for the only open element, in the spirit of redundant redundancy), not
a start-tag for some other element, which would throw a naive parser off,
as well as a lot of other minor stateful parsing problems.

The parsability of arbitrary XML is such an obvious design goal of XML
that I really wonder how you arrived at your last paragraph. It is
completely bogus as a rationale for standardizing elements and their
structural relationship, too. What people are standardizing in XML is
the same they standardize with EDIFACT or systematized nomenclature for
medicine (SNOMED) or even such "simple" things as the call letters of
ships, airplanes, airports, radio and TV stations, etc. People tend to
want to standardize sematnics, and leave syntax to randomness. XML is
standardized randomness and might therefore be chosen as the preferred
random way of expressing someone's semantic constructs.

David Boles

unread,

Mar 9, 2002, 11:45:26 AM3/9/02

to

Tim Bradshaw wrote:

> I have a system which currently reads an sexp based config
> file syntax, for which I need to provide (and in fact have
> provided) an alternative XML-based syntax for political
> reasons.
>
> ...

I must be nuts, but there's a significant point that hasn't
been brought out. I've been using XML since the days of a
draft specification and am not blind to it's warts. Most
(all?) of the criticisms heaped upon it by others here are
quite correct.

The thing that has been lacking in the discussion is a
proper placement of the technology among alternative
techniques or approaches. For many of those who are using
XML, the only alternative they have is OLE structured
storage. That there are actually alternatives that they
just don't know about because they are ignorant is beside
the point. If they don't know about them or understand
them, these things aren't alternatives.

An XML DTD, as ugly as it is, or a schema is a miracle
of clarity and consistency in a world in which document
formats change when some VB programmer makes changes to
"code", saves the "code", undoes the changes, and saves
again.

You can argue that people shouldn't be using tools that
are that lame, and that people who can't write a proper
parser shouldn't be programming, but as right as you may
be they won't stop.

As bad as it is, XML allows many projects to succeed that
would otherwise fail in a dustcloud of partially broken
or imcompatible parsers, ever-shifting formats resulting
in mis-understood hacks at the wrong layer, etc. It might
have the additional benefit that for a few souls, it is
the beginning of a more serious understanding of data
representation.

Sorry for the intrusion,

- db

Pierre R. Mai

unread,

Mar 9, 2002, 3:58:06 PM3/9/02

to

Frederic Brunel <frederi...@in-fusio.com> writes:

> > I'll suggest that people head to Google Groups, and search for:
> > "pierre mai xml trivial expat"
>
> Thanx, I've got the code to work but I've face a strange
> problem... Which Common Lisp implementation did you use?
>
> I have modified Pierre's code to get a file as input but when I run it
> in CMUCL with (ext:run-program ""), it gets freezed and I'm unable to
> get a stream from it whereas the program runs perfectly from bash (and
> pipes)! :(

Personally, I wouldn't modify the C code to take a file name, but
rather use the ability of either your favourite shell or CMU CL's
run-program to redirect standard input from a file, which is more
flexible.

In CMU CL, something like this should work:

(defun parse-xml-from-file (file)
(let* ((process (ext:run-program *xml-expat-parser-path* nil
:input file :output :stream :wait nil))
(output (ext:process-output process)))
(unwind-protect
(read output)
(ext:process-wait process)
(ext:process-close process)
(unless (zerop (ext:process-exit-code process))
(error "Error parsing XML file ~A." file)))))

If you modified elements.c to take a filename argument, you'd do

(let* ((process (ext:run-program *xml-expat-parser-path* (list file)
:input nil :output :stream :wait nil))

instead.

Regs, Pierre.

--
Pierre R. Mai <pm...@acm.org> http://www.pmsf.de/pmai/
The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents. -- Nathaniel Borenstein

Pierre R. Mai

unread,

Mar 9, 2002, 4:07:46 PM3/9/02

to

Christopher Browne <cbbr...@acm.org> writes:

> One suggestion I'll throw out would be for Pierre to stick it on a web
> site somewhere, and maybe put in some sort of licensing statement to
> remove fear from any quaking hearts. (I rather like the "If it
> breaks, you get to keep both pieces; this is not the GPL" license.)
>
> A little blurb at the front that at least says "I wrote this; nobody
> else should claim they were the author" would be a good thing...

As stated in the original posting of the code, I've explicitly placed
it into the public domain, so anyone is free to do anything they like
with it (including breaking it and recycling the bits)... I'll see
about putting it up somewhere...

Kenny Tilton

unread,

Mar 9, 2002, 6:20:11 PM3/9/02

to

Erik Naggum wrote:
> The parsability of arbitrary XML is such an obvious design goal of XML...

Well, you posted early on something I recall as three different ways to
say (or interpret?) the same thing. I am not an XMLer, but I had read
enough to come to the same conclusion. So how could anyone parse that
blind? Or did you mean the DTD would sort out the alterniative meanings,
at which point the wackiness of DTDs can kill you?

i think I am lucky, we just have to generate XML and provide XSL to
convert our own XML to other formats, not read arbitrary XML (for now).
But I still hark back to memories of code I have seen which was a pure
horror, written in a language which could have been used to write
elegant code. Actually all languages fit that description...including
XML.

Are y'all lookin for a language in which one cannot write bad code?

kenny tilton
clinisys, inc
---------------------------------------------------------------

Erik Naggum

unread,

Mar 9, 2002, 7:22:31 PM3/9/02

to

* Erik Naggum

> The parsability of arbitrary XML is such an obvious design goal of XML...

* Kenny Tilton

| Well, you posted early on something I recall as three different ways to
| say (or interpret?) the same thing. I am not an XMLer, but I had read
| enough to come to the same conclusion. So how could anyone parse that
| blind? Or did you mean the DTD would sort out the alterniative meanings,
| at which point the wackiness of DTDs can kill you?

I am still not sure what you are referring to, but the main difference
between SGML and XML is precisely that you do not need the DTD to parse
an XML "document".

| Are y'all lookin for a language in which one cannot write bad code?

No, just a language that has static style checking and does not stop at
proven correctness, but requires proven good taste. This would take care
of a lot more real-life problems than, e.g., static type checking.

Really, i just want every single professional programmer to be competent.
(The difference between a hobbyist and a professional programmer should
have been accountability. The difference today is whether he gets paid.)

Christian Lynbech

unread,

Mar 10, 2002, 5:52:49 AM3/10/02

to

>>>>> "Tim" == Tim Bradshaw <t...@cley.com> writes:

Tim> However, I think that as all these overcomplex systems collapse under
Tim> the weight of paper, merely sufficiently complex systems, like CL,
Tim> might stand quite a good chance of doing rather well.

Lets not forget that Lisp has so far survived for more than 4 decades,
allthough having been announced dead and nearly extinct numerous
times.

I am personally of the firm belief (probably to be taking in the
religious sense) that eventually Lisp programmers will be dancing on
the grave of Java. It won't happen anytime soon, but I am prepared to
wait :-)

------------------------+-----------------------------------------------------
Christian Lynbech |
------------------------+-----------------------------------------------------
Hit the philistines three times over the head with the Elisp reference manual.
- pet...@hal.com (Michael A. Petonic)

Frederic Brunel

unread,

Mar 11, 2002, 8:09:35 AM3/11/02

to

"Pierre R. Mai" <pm...@acm.org> writes:

> Personally, I wouldn't modify the C code to take a file name, but
> rather use the ability of either your favourite shell or CMU CL's
> run-program to redirect standard input from a file, which is more
> flexible.
>
> In CMU CL, something like this should work:
>
> (defun parse-xml-from-file (file)
> (let* ((process (ext:run-program *xml-expat-parser-path* nil
> :input file :output :stream :wait nil))
> (output (ext:process-output process)))
> (unwind-protect
> (read output)
> (ext:process-wait process)
> (ext:process-close process)
> (unless (zerop (ext:process-exit-code process))
> (error "Error parsing XML file ~A." file)))))
>
> If you modified elements.c to take a filename argument, you'd do
>
> (let* ((process (ext:run-program *xml-expat-parser-path* (list file)
> :input nil :output :stream :wait nil))

Thanx Pierre, I didn't get how to use :input, your example is
great. Maybe my mistake is to have ommitted the (:wait nil), but I
wonders why it has worked with the `ls' command, anyway I'll go using
the program without making any changes!

Ian Wild

unread,

Mar 11, 2002, 10:58:14 AM3/11/02

to

Kenny Tilton wrote:
>
> Erik Naggum wrote:
> > The parsability of arbitrary XML is such an obvious design goal of XML...
>
> Well, you posted early on something I recall as three different ways to
> say (or interpret?) the same thing. I am not an XMLer, but I had read
> enough to come to the same conclusion. So how could anyone parse that
> blind? Or did you mean the DTD would sort out the alterniative meanings,

parse => syntax
meaning => semantics

different kettles of fish entirely.

Even without a DTD you can see that "<m><g>Hello</g></m>"
is "Hello" inside a "g" inside an "m". That's parsing
and it's (relatively) easy.

With a DTD you can answer questions like "is a 'g' allowed
inside an 'm', and if so, can we have text inside the 'g'?"
That's still parsing, but not so easy.

You still need an application before you can discover
that "Hello" isn't a valid reading for the gas meter in
my house's meter-set.

Olivier Lefevre

unread,

Mar 11, 2002, 3:59:10 PM3/11/02

to

> Marco Antoniotti wrote:
>> operators have different "semantics" if monadic or dyadic.
>
> ah yes, I saw that in the K language, I was wondering what possessed
> them, thx for clearing that up. :)

No, they don't. What happens is that, due to the limitations of the
ASCII character set, some glyphs are overloaded to represent more than
one operator.

-- O.L.

Sverker Wiberg

unread,

Mar 13, 2002, 11:19:55 AM3/13/02

to

Erik Naggum wrote:

[...]

> <form>if<form>...condition...</form>
> <form>...then...</form>
> <form>...else...</form>
> </form>
>
> The XML is now only a suger-coating of syntax and the meaning of the
> entire construct is determined by the contents of the form elements,
> which are completely irrelevant after they have been parsed into a tree
> structure, which is very close to what we do with the parentheses in
> Common Lisp.

So you might as well say

<form>
(if (...condition...)
(...then...)
(...else...))
</form>

which would help cut down on the sugar intake.

/Sverker Wiber

Erik Naggum

unread,

Mar 13, 2002, 11:41:06 AM3/13/02

to

* Sverker Wiberg <sver...@erix.ericsson.se>

:D Good one! Yes, it is certainly possible to defer the whole problem
to what SGML and XML call "notations", but the intention with notations
was actually that they should have parsers that return structured objects.

I doubt that anyone uses notations this way with the current crop of
parsers, however. If someone knows, I would be happy to hear otherwise.