
The horror that is XML


Tim Bradshaw

Mar 4, 2002, 9:55:02 PM

I have a system which currently reads an sexp based config file
syntax, for which I need to provide (and in fact have provided) an
alternative XML-based syntax for political reasons.

I'm wondering if anyone else has been through this and has run into
the same problems I have and maybe can offer any solutions. To
describe them I need to describe the current syntax slightly.

The config files are read and validated in a `safe' (as safe as I can
make it easily, which may not be really safe) reader environment.
After reading, they are validated by checking that everything read is
of a good type (using an occurs check in case of circularity); a `good
type' means pretty much non-dotted lists, strings, reals, keywords and
a defined list of other symbols. At top-level a file consists of
conses whose cars are one of these good symbols.

Before anything else happens some metasyntax is expanded which allows
file inclusion, and conditionalisation. This results in an `effective
file' which may actually be the contents of several files. The
metasyntax is just things like ($include ...) or ($platform-case ...).

Finally, the resulting forms are passed to a handler function (this is
a function passed to the config file reader) which gets to dispatch
on the car of the forms, and do whatever it likes.

A top-level form is declared valid by declaring that its car is a `good
symbol' (via a macro) and usually by defining a handler for it. In
some cases the system wants all forms to be handled, but in many cases
all it cares about is that the form is `good' (it must be good for the
first stage not to reject it) - this depends entirely on the handler.

The end result of this is that a module of the system can very easily
declare a new config-file form to be valid and establish a handler for
it, thus enabling it to get configured correctly at boot time or
whenever else config files are read. The overall system does not have
to care about anything other than making sure the files are read.

(On top of this there's a reasonably trivial hook mechanism which can
let modules run code before or after a config file is read or at other
useful points, so they can, for instance, check that the configuration
they needed actually happened.)
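The handler scheme described above is essentially table-driven dispatch on the car of each form. A minimal sketch in Python (all names here are hypothetical, invented for illustration; this is not the actual system):

```python
HANDLERS = {}  # maps a "good symbol" (the car of a form) to its handler

def defconfig(name, handler):
    """A module declares a top-level form valid by installing a handler."""
    HANDLERS[name] = handler

def process(forms):
    # Reject any form whose car is not a declared good symbol,
    # then let the handler do whatever it likes with the rest.
    results = []
    for form in forms:
        car, args = form[0], form[1:]
        if car not in HANDLERS:
            raise ValueError("bad top-level form: %r" % (car,))
        results.append(HANDLERS[car](*args))
    return results

# A module declares its own form, as in the (load-patches file ...) example:
defconfig("load-patches", lambda *files: list(files))
```

The point of the design survives the translation: the overall system only knows the table, and each module registers its own entries.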

So I have to make something like this work with XML, and I have to do
it without doubling the size of either my brain or the system - as far
as I can see if I was to even read most of the vast encrustation of
specifications that have accumulated around XML I'd need to do the
former, both to make space for them and to invent a time machine so I
can do it in time. If I was to actually use code implementing these
specs then I'd definitely do the latter.

So what I'm doing instead is using the expat bindings done by Sunil
Mishra & Kaelin Colclasure (thanks), writing a tiny tree-builder based
on that, and then putting together a sort of medium-level syntax based
on XML.
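A tree-builder over expat's event callbacks really can be tiny. A Python sketch using the stdlib expat binding (not the Lisp bindings mentioned above), representing each element as a list [tag, attrs, child...]:

```python
import xml.parsers.expat

def parse_tree(xml_text):
    # Build a nested-list tree from expat's start/end/character events.
    root = []
    stack = [root]

    def start(name, attrs):
        node = [name, attrs]
        stack[-1].append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(data):
        if data.strip():          # ignore whitespace-only text nodes
            stack[-1].append(data)

    p = xml.parsers.expat.ParserCreate()
    p.StartElementHandler = start
    p.EndElementHandler = end
    p.CharacterDataHandler = chars
    p.Parse(xml_text, True)
    return root[0]
```

Well-formedness checking comes for free: expat raises an error on malformed input, which is all this approach needs without DTDs.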

Because I'm using expat I don't need to care about DTDs, just about
well-formedness. But it would be kind of nice (the client thinks) to
have DTDs, because it would be additional documentation.

But this seems to be really hard. Firstly, because of the metasyntax,
the grammar is kind of not like anything I can easily describe (as a
non-expert DTD writer). For instance almost any config file can have
metasyntax almost everywhere in it. I could give up and have XML
syntax which looks like:

<cons><car><string>...</string></car>...</cons>

or something, and write a DTD for that but this is obviously horrible.

Secondly, my system has modules. These modules want to be able to
declare handlers of their own. One day *other people* might write
these modules. It looks to me like any little module which currently,
say, declares some syntax like:

(load-patches
  file ...)

now has to involve me in changing the DTD to allow (say)

<load-patches><file>...</file>...</load-patches>

This looks doomed.

When I skim the XML specs (doing more than this would require far
longer than I have: and they've also now fallen through my good strong
19th century floor and killed several innocent bystanders in the
floors below before finally coming to rest, smoking, embedded in the
bedrock a few hundred yards under my flat) it looks like there is
stuff to do with namespaces which looks like it might do what I want -
it looks like I can essentially have multiple concurrent DTDs and
declare which one is valid for a chunk by using namespaces. Then each
module could declare its own little namespace. This is kind of
complicated.

Or I could just give up and not care about DTDs: the system doesn't
actually care, so why should I? But then, is there any sense in which
XML is more than an incredibly complex and somehow less functional
version of sexprs? Surely it can't be this bad?

So really, I guess what I'm asking is: am I missing something really
obvious here, or is it all really just a very hard and over-complex
solution to a problem I've already solved?

--tim


Erik Naggum

Mar 5, 2002, 11:20:55 AM

* Tim Bradshaw

| So really, I guess what I'm asking is: am I missing something really
| obvious here, or is it all really just a very hard and over-complex
| solution to a problem I've already solved?

XML, being the single suckiest syntactic invention in the history of
mankind, offers you several layers at which you can do exactly the same
thing very differently, in fact so differently that it takes effort to
see that they are even related.

<foo type="bar">zot</foo> actually defines three different views on the
same thing: whether what you are really after is foo, bar, or zot
depends on your application. XML is only an overly complex and
otherwise meaningless exercise in syntactic noise around the message
you want to send. Its notion of "structure" must be regarded as the
same kind of useless baggage that comes with languages that have been
designed by people who have completely failed to understand what syntax
is all about. It is therefore a mistake to try to shoe-horn things
into the "structure" that XML allows you to define.

In the above example, foo can be the application-level element, or it
can be the syntax-level element and bar the application-level element.
It is important to realize that SGML and XML offer a means to control
only the generic identifier (foo) and its nesting, but that it is often
important to use another attribute for the application. This was part
of the reason for #FIXED in the attribute default specification and the
purpose of omitting attributes from the actual tags. In my view, this
is probably the only actually useful role that attributes can play,
though there are other, much more elegant ways to accomplish the same
goal outside the SGML framework. Now, whether you use one of the parts
of the markup, or use the contents of an element for your application,
is another design choice. The markup may only be useful for validation
purposes, anyway.

Let me illustrate:

<if><condition>...</condition>
    <then>...</then>
    <else>...</else>
</if>

The XML now contains all the syntax information of the "host" language.
Many people think this is the _only_ granularity at which XML should be
used, and they try to enforce as much structure as possible, which
generally produces completely useless results and so brittle "documents"
that they break as soon as anyone gets any idea at all for improvement.

<form operator="if"><form operator=... role="condition">...</form>
    <form operator=... role="then">...</form>
    <form operator=... role="else">...</form>
</form>

The XML now contains only a "surface level" syntax and the meaning of the
form elements is determined by the application, which discards or ignores
the "form" element completely and looks only at the attributes. This way
of doing things allows for some interesting extensibility that XML cannot
do on its own, and for which XML was designed because people used SGML
wrong, as in the first example.

<form>if<form>...condition...</form>
    <form>...then...</form>
    <form>...else...</form>
</form>

The XML is now only a sugar-coating of syntax and the meaning of the
entire construct is determined by the contents of the form elements,
which are completely irrelevant after they have been parsed into a tree
structure, which is very close to what we do with the parentheses in
Common Lisp.
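That last encoding really is parentheses spelled differently; a short Python sketch (using ElementTree purely for illustration, nothing from this thread) recovers the nested-list structure:

```python
import xml.etree.ElementTree as ET

def form_to_sexp(elem):
    # <form>if<form>...</form>...</form>  ->  ['if', [...], ...]
    items = []
    if elem.text and elem.text.strip():
        items.append(elem.text.strip())
    for child in elem:
        items.append(form_to_sexp(child))
        if child.tail and child.tail.strip():
            items.append(child.tail.strip())
    return items

tree = ET.fromstring(
    "<form>if<form>condition</form>"
    "<form>then</form><form>else</form></form>")
```

After this step the <form> tags are gone, and the application sees only the nesting, just as it would with a Lisp reader.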

I hope this can resolve some of the problems of being forced to use XML,
but in all likelihood, lots of people will object to anything but the
finest granularity, even though it renders their use of XML so complex
that their applications will generally fail to be useful at all. Such is
the curse of a literally meaningless syntactic contraption whose
verbosity is so enormous that people are loath to use simple solutions.

My preferred syntax these days is one where I use angle brackets
instead of parentheses and let the symbol in the first position
determine the parsing rules for the rest of that "form". It could be
mistaken for XML
if you are completely clueless, but then again, if you had any clue, you
would not be using XML.

///
--
In a fight against something, the fight has value, victory has none.
In a fight for something, the fight is a loss, victory merely relief.

Christopher Browne

Mar 5, 2002, 11:59:01 AM

In the last exciting episode, Erik Naggum <er...@naggum.net> wrote:

> * Tim Bradshaw
> | So really, I guess what I'm asking is: am I missing something really
> | obvious here, or is it all really just a very hard and over-complex
> | solution to a problem I've already solved?

> XML, being the single suckiest syntactic invention in the history
> of mankind, offers you several layers at which you can do exactly
> the same thing very differently, in fact so differently that it
> takes effort to see that they are even related.

Wouldn't the embedding of quasi-XML-like functionality into HTML be
considered to suck even worse?
--
(reverse (concatenate 'string "gro.mca@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/finances.html
Giving up on assembly language was the apple in our Garden of Eden:
Languages whose use squanders machine cycles are sinful. The LISP
machine now permits LISP programmers to abandon bra and fig-leaf.
-- Epigrams in Programming, ACM SIGPLAN Sept. 1982

Kent M Pitman

Mar 5, 2002, 12:38:13 PM

Erik Naggum <er...@naggum.net> writes:

> * Tim Bradshaw
> | So really, I guess what I'm asking is: am I missing something really
> | obvious here, or is it all really just a very hard and over-complex
> | solution to a problem I've already solved?
>
> XML, being the single suckiest syntactic invention in the history of
> mankind, offers you several layers at which you can do exactly the same
> thing very differently, in fact so differently that it takes effort to
> see that they are even related.

I don't think there's anything wrong with XML that a surgeon's knife,
removing 80% (or more) of the standard's text, wouldn't fix.

IMO, what makes XML bad is not how little it does but how much it
pretends to fix from what came before, yet without changing anything.
If it had either attempted less or been willing to make actual changes,
it might be respected more.

XML's lifeboat-like attempt to rescue all of SGML's functionality from
drowning, yet without applying "lifeboat ethics" and tossing deadweight
overboard (i.e., abandoning compatibility), seems to be the problem.

To quote Dr. Amar Bose (of Bose corporation fame): Better implies different.

Ray Blaak

Mar 5, 2002, 12:42:14 PM

Tim Bradshaw <t...@cley.com> writes:
> Or I could just give up and not care about DTDs: the system doesn't
> actually care, so why should I?

Give up and don't care about DTDs. Your posting gives a clearer
explanation of your format than any DTD would. DTDs that are for
humans to read have to be understandable, and if the DTD would be
torturous then there is no point.

DTDs' other official purpose is separate validation, a dubious idea in
my opinion. The application that finally processes an XML file will
need to validate it on its own anyway, so what is the point of
validation in advance?

> But then, is there any sense in which XML is more than an incredibly complex
> and somehow less functional version of sexprs? Surely it can't be this bad?

It's really that bad. XML does have the nice notion of support for various
character encodings. There are tricks with namespaces you can do that seem
more powerful, but on the whole things are confusing and error prone as holy
hell.

> So really, I guess what I'm asking is: am I missing something really
> obvious here, or is it all really just a very hard and over-complex
> solution to a problem I've already solved?

You are not missing anything.

--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
bl...@telus.net The Rhythm has my soul.

Bob Bane

Mar 5, 2002, 12:50:32 PM

Erik Naggum wrote:
>
> XML, being the single suckiest syntactic invention in the history of
> mankind, offers you several layers at which you can do exactly the same
> thing very differently, in fact so differently that it takes effort to
> see that they are even related.
>
Believe it or not, there are things in actual operational use that
syntactically suck worse than XML. Check out:

http://pds.jpl.nasa.gov/stdref/chap12.htm

which describes Object Definition Language (ODL), developed by NASA/JPL
in the early 90's to hold metadata for space data sets (primarily
planetary probe data).

XML is what you get when you assign the nested property list problem to
people who only know SGML. ODL is apparently what you get when you
assign the same problem to people who only know FORTRAN.

Lisp:
(foo (bar "baz"))
XML:
<foo> <bar>baz</bar> </foo>
or maybe:
<foo bar="baz"/>

ODL:
OBJECT = FOO
  BAR = "baz"
END_OBJECT = FOO
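For scale, the OBJECT/END_OBJECT nesting shown above can be read with a few lines. This Python sketch handles only the toy subset in this post, not real PDS ODL, which has far more syntax:

```python
def parse_odl(text):
    # Read OBJECT/END_OBJECT nesting into (name, children) tuples;
    # plain KEY = "value" lines become (key, value) pairs.
    stack = [("TOP", [])]
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        if key == "OBJECT":
            node = (value, [])
            stack[-1][1].append(node)
            stack.append(node)
        elif key == "END_OBJECT":
            stack.pop()
        else:
            stack[-1][1].append((key, value))
    return stack[0][1]
```

Which rather supports the point: the underlying problem in all three notations is the same nested property list.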

ODL is the official standard metadata representation for data from the
Earth Science Data and Information System, NASA's next generation
observe-the-whole-earth data gathering project. I am currently working
on a task to take ODL from this system and display it intelligibly. The
current solution (chosen before I got here) is to take the ODL, convert
it to XML, then bounce the XML off an XSLT stylesheet to generate
HTML/Javascript.

So remember as you slog through yet another brain-damaged XML
application - it could be worse.

--
Bob Bane
ba...@removeme.gst.com

Erik Naggum

Mar 5, 2002, 1:25:39 PM

* Christopher Browne <cbbr...@acm.org>

| Wouldn't the embedding of quasi-XML-like functionality into HTML be
| considered to suck even worse?

As I have become fond of saying lately, there is insufficient granularity
at that end of the scale to determine which is worse.

Erik Naggum

Mar 5, 2002, 1:51:30 PM

* Ray Blaak <bl...@telus.net>

| DTDs other official purpose is for separate validation, a dubious idea in
| my opinion. The application that finally processes an XML file will need
| to validate it on its own anyway, so what is the point of validation in
| advance?

Remember when C was so young and machines so small that the compiler
could not be expected to do everything and we all studiously ran "lint"
on our programs? It was a fascinating time, I can tell you.

Thaddeus L Olczyk

Mar 5, 2002, 3:49:27 PM

On Tue, 05 Mar 2002 16:20:55 GMT, Erik Naggum <er...@naggum.net> wrote:

> XML, being the single suckiest syntactic invention in the history of
> mankind,

APL.

David Golden

Mar 5, 2002, 3:50:27 PM

Tim Bradshaw wrote:

> I have a system which currently reads an sexp based config file
> syntax, for which I need to provide (and in fact have provided) an
> alternative XML-based syntax for political reasons.
>

> But then, is there any sense in which


> XML is more than an incredibly complex and somehow less functional
> version of sexprs? Surely it can't be this bad?


XML is an incredibly complex and somehow less functional
version of sexprs. It is that bad.

XML thoroughly sucks, but if you have to deal with it, there is an
excellent Scheme library for dealing with it, and a defined mapping of the
XML "infoset" to scheme, in the form of SXML. It'll go XML to sexprs and
vice versa.

I know it's not common lisp, but, in theory, it could be ported with
relatively little effort, and it should be food for thought.

See http://ssax.sourceforge.net/

About the only vaguely interesting features of XML to me are probably
certain aspects of XML-Schema (the replacement for DTDs), and perhaps
certain aspects of the extended hyperlinking (xlink/xpointer).

I've occasionally pondered the similarities of XML-Schema to syntax-rules
in Scheme, giving some sort of
datatyping-of-tree-structures-based-on-their-structure, or some similarly
wooly concept - i.e. checking whether a given sexpr
would match a given complicated macro definition is vaguely akin to
validating an XML document against an XML schema.
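The analogy can be made concrete: a toy structural matcher over nested lists behaves much like validating element content against a content model. In this hypothetical mini pattern language (invented here, not syntax-rules or XML Schema), "_" matches any atom and "..." repeats the preceding pattern zero or more times, greedily:

```python
def matches(pattern, form):
    # Structural check: does `form` fit the shape of `pattern`?
    if pattern == "_":
        return not isinstance(form, list)
    if not isinstance(pattern, list):
        return pattern == form          # literal atom must match exactly
    if not isinstance(form, list):
        return False
    pi = fi = 0
    while pi < len(pattern):
        if pi + 1 < len(pattern) and pattern[pi + 1] == "...":
            # the preceding sub-pattern repeats zero or more times
            while fi < len(form) and matches(pattern[pi], form[fi]):
                fi += 1
            pi += 2
        else:
            if fi >= len(form) or not matches(pattern[pi], form[fi]):
                return False
            pi += 1
            fi += 1
    return fi == len(form)
```

The pattern ["load-patches", "_", "..."] plays the role of a content model like (load-patches file*): it accepts the form name followed by any number of atoms.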

--

Don't eat yellow snow.

Eduardo Muñoz

Mar 5, 2002, 3:58:50 PM

Erik Naggum <er...@naggum.net> writes:

> Remember when C was so young and machines so small that the compiler
> could not be expected to do everything and we all studiously ran "lint"
> on our programs?

Probably I wasn't born yet, so what is "lint"?

> It was a fascinating time, I can tell you.

I'm sure. I love when KMP (or someone else) talks
about ancient (for me :) software or hardware
(PDP's, VAX, TOPS, Lisp Machines, ITS and the
like).

--

Eduardo Muñoz

Dr. Edmund Weitz

Mar 5, 2002, 4:00:08 PM

David Golden <qnivq....@bprnaserr.arg> writes:

> XML thoroughly sucks, but if you have to deal with it, there is an
> excellent Scheme library for dealing with it, and a defined mapping
> of the XML "infoset" to scheme, in the form of SXML. It'll go XML
> to sexprs and vice versa.
>
> I know it's not common lisp, but, in theory, it could be ported with
> relatively little effort, and it should be food for thought.

If it's just about getting the job done maybe this will help:

<http://www.ccs.neu.edu/home/dorai/scm2cl/scm2cl.html>

Edi.

--

Dr. Edmund Weitz
Hamburg
Germany

The Common Lisp Cookbook
<http://cl-cookbook.sourceforge.net/>

Christopher C. Stacy

Mar 5, 2002, 4:30:58 PM

>>>>> On Tue, 05 Mar 2002 20:49:27 GMT, Thaddeus L Olczyk ("Thaddeus") writes:

Thaddeus> On Tue, 05 Mar 2002 16:20:55 GMT, Erik Naggum <er...@naggum.net> wrote:
>> XML, being the single suckiest syntactic invention in the history of
>> mankind,

Thaddeus> APL.

APL syntax is simpler than that of Lisp.
Do you program in APL?

Marco Antoniotti

Mar 5, 2002, 4:34:32 PM


I beg to differ. APL *is* weird, but its syntax is amazingly
simple and regular. It is the net effect that is unreadable. This
net effect is due to the special glyphs required and to the fact that
operators have different "semantics" if monadic or dyadic.

Cheers

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.

David Golden

Mar 5, 2002, 4:42:32 PM

Thaddeus L Olczyk wrote:

Strange that you'd say that. Most people I know who like
Lisp also like APL and Forth (if they know about them in the first place).

Both Forth and APL have simple, elegant, syntax. Kinda like...
oh... Lisp...

Note that I'm not talking about asciified APL abominations, which are a
royal pain in the backside to read... APL is unusual in that if
you DON'T use single-symbol identifiers for things, it gets less readable.

Also, APL programs can look as indecipherable as idiomatic
Perl - if you don't know the language. However, like Perl,
if you take a little time to learn the language, it all makes much
more sense (O.K. a little more sense...)

Marco Antoniotti

Mar 5, 2002, 4:43:55 PM


David Golden <qnivq....@bprnaserr.arg> writes:

> Tim Bradshaw wrote:
>
> > I have a system which currently reads an sexp based config file
> > syntax, for which I need to provide (and in fact have provided) an
> > alternative XML-based syntax for political reasons.
> >
>
> > But then, is there any sense in which
> > XML is more than an incredibly complex and somehow less functional
> > version of sexprs? Surely it can't be this bad?
>
>
> XML is an incredibly complex and somehow less functional
> version of sexprs. It is that bad.
>
> XML thoroughly sucks, but if you have to deal with it, there is an
> excellent Scheme library for dealing with it, and a defined mapping of the
> XML "infoset" to scheme, in the form of SXML. It'll go XML to sexprs and
> vice versa.
>
> I know it's not common lisp, but, in theory, it could be ported with
> relatively little effort, and it should be food for thought.
>
> See http://ssax.sourceforge.net/
>

Of course, people who do not know Common Lisp are bound to mess things
up.

How do you justify something written as

(*TOP*
 (urn:loc.gov:books:book
  (urn:loc.gov:books:title "Cheaper by the Dozen")
  (urn:ISBN:0-395-36341-6:number "1568491379")
  (urn:loc.gov:books:notes
   (urn:w3-org-ns:HTML:p "This is a "
    (urn:w3-org-ns:HTML:i "funny") " book!"))))
?

Erik Naggum

Mar 5, 2002, 4:49:18 PM

* "Eduardo Muñoz"

| Probably I wasn't born yet, so what is "lint"?

No big loss. "lint" was a program that would compare actual calls and
definitions of pre-ANSI C functions because the language lacked support
for prototypes, so header files were not enough to ensure consistency
and coherence between separately compiled files, probably not even
within the same file, if I recall correctly -- my 7th edition Unix
documentation is in natural cold storage somewhere in the loft, and it
is too goddamn cold tonight. "lint" also ensured that some of the more
obvious problems in C were detected prior to compilation. It was
effectively distributing the complexity of compilation among several
programs because the compiler was unable to remember anything between
each file it had compiled. ANSI C does not prescribe anything useful
to be stored after compiling a file, either, so manual header file
management is still necessary, even though this is probably the single
most unnecessary thing programmers do in today's world of programming.
"lint" lingers on.

Kenny Tilton

Mar 5, 2002, 4:59:57 PM


Marco Antoniotti wrote:
> operators have different "semantics" if monadic or dyadic.

ah yes, I saw that in the K language, I was wondering what possessed
them, thx for clearing that up. :)

--

kenny tilton
clinisys, inc
---------------------------------------------------------------
"Be the ball...be the ball...you're not being the ball, Danny."
- Ty, Caddy Shack

Tim Bradshaw

Mar 5, 2002, 5:08:02 PM

* David Golden wrote:

> XML is an incredibly complex and somehow less functional
> vertsion of sexprs. It is that bad.

Thanks for this and the other followups. I now feel kind of better
about the whole thing.

The really disturbing thing is that huge investments in `web services'
are being predicated on using XML, something which (a) is crap and (b)
is so complicated that almost no-one will be able to use it correctly
(`CORBA was too complicated and hard to use? hey, have XML, it's *even
more* complicated and hard to use, it's bound to solve all your
problems!'). Papers like The Economist are busy writing
plausible-sounding articles about how all this stuff might be the next
big thing.

--tim

Christopher Browne

Mar 5, 2002, 5:27:13 PM


> APL.

What's wrong with the syntax of APL?

If there's anything simpler and more regular than Lisp, it's APL.

It's fair to say that a lot of APL code depends on the "abuse" of
quasi-perverse interpretations of matrix operations, but that's not
syntax, that's "odd math."
--
(reverse (concatenate 'string "ac.notelrac.teneerf@" "454aa"))
http://www3.sympatico.ca/cbbrowne/linuxxian.html
Oh, no. Not again.
-- a bowl of petunias

Christopher Browne

Mar 5, 2002, 5:42:45 PM

In an attempt to throw the authorities off his trail, David Golden <qnivq....@bprnaserr.arg> transmitted:

> XML thoroughly sucks, but if you have to deal with it, there is an
> excellent Scheme library for dealing with it, and a defined mapping
> of the XML "infoset" to scheme, in the form of SXML. It'll go XML
> to sexprs and vice versa.

> I know it's not common lisp, but, in theory, it could be ported with
> relatively little effort, and it should be food for thought.

When I have need to do so, I use Pierre Mai's C interface to expat.
It uses the expat XML parser, and generates sexp output that can be
read in using READ.

(defun xml-reader (filename)
  (let ((xml-stream (common-lisp-user::run-shell-command
                     (concatenate 'string *xml-parser* " <" filename)
                     :output :stream)))
    (prog1
        (read xml-stream)
      (close xml-stream))))

It would arguably be nicer to have something paralleling SAX which
would generate closures and permit lazy evaluation. But I haven't
found cases yet where the "brute force" of XML-READER was
unsatisfactory to me.

Note that this has the HIGHLY attractive feature of keeping all
management of "ugliness" in a library (/usr/lib/libexpat.so.1) that is
_widely_ used (including by such notables as Apache, Perl, Python, and
PHP) so that it is likely to be kept _quite_ stable.

I'd argue that expat significantly beats doing some automagical
conversion of Scheme code into CL...
--
(concatenate 'string "aa454" "@freenet.carleton.ca")
http://www3.sympatico.ca/cbbrowne/xml.html
Black holes are where God divided by zero.

Kenny Tilton

Mar 5, 2002, 5:52:51 PM


Tim Bradshaw wrote:
> Papers like The Economist are busy writing
> plausible-sounding articles about how all this stuff might be the next
> big thing.

I haven't seen what the Economist has to say, but XML /will/ be the next
big thing if it works out as a lingua franca for data exchange. Not
saying XML does not suck from the syntax standpoint, just that syntax
can be fixed or (more likely) hidden.

Marco Antoniotti

Mar 5, 2002, 5:57:26 PM


Kenny Tilton <kti...@nyc.rr.com> writes:

> Marco Antoniotti wrote:
> > operators have different "semantics" if monadic or dyadic.
>
> ah yes, I saw that in the K language, I was wondering what possessed
> them, thx for clearing that up. :)

Yep. Turns out that K is a language that heavily borrows from APL.

Christopher Browne

Mar 5, 2002, 6:47:30 PM

The world rejoiced as Erik Naggum <er...@naggum.net> wrote:
> * "Eduardo Muñoz"
> | Probably I wasn't born yet, so what is "lint"?

> No big loss. "lint" was a program that would compare actual calls
> and definitions of pre-ANSI C functions because the language lacked
> support for prototypes, so header files were not enough to ensure
> consistency and coherence between separately compiled files,
> probably not even within the same file, if I recall correctly --
> my 7th edition Unix documentation is in natural cold storage
> somewhere on the loft, and it is too goddamn cold tonight. "lint"
> also ensured that some of the more obvious problems in C were
> detected prior to compilation. It was effectively distributing
> the complexity of compilation among several programs because the
> compiler was unable to remember anything between each file it had
> compiled. ANSI C does not prescribe anything useful to be stored
> after compiling a file, either, so manual header file management
> is still necessary, even though this is probably the single most
> unnecessary thing programmers do in today's world of
> programming. "lint" lingers on.

There are new variations on lint, notably "LCLint" which has become
"Splint" which stands for "Secure Programming Lint." It does quite a
bit more than lint used to do.

Chances are that you'd be better off redeploying the code in OCAML
where type signatures would catch a whole lot more mistakes...
--
(concatenate 'string "cbbrowne" "@acm.org")
http://www.ntlug.org/~cbbrowne/lisp.html
``What this means is that when people say, "The X11 folks should have
done this, done that, or included this or that", they really should be
saying "Hey, the X11 people were smart enough to allow me to add this,
that and the other myself."'' -- David B. Lewis <d...@motifzone.com>

Christopher Browne

Mar 5, 2002, 7:02:35 PM


The thing is, you don't actually _write_ any XML unless you're the guy
writing the library/module/package that _implements_ XML-RPC/SOAP.

Here's a bit of Python that provides the "toy" of allowing you to
submit simple arithmetic calculations to a SOAP server. (Of course,
that's a preposterously silly thing to do, but it's easy to
understand!)

def add(a, b):
    return a + b

def add_array(e):
    total = 0
    for el in e:
        total = total + el
    return total

A bit of Perl that calls that might be thus:

    $a = 100;
    $b = 15.5;
    $c = $soap->add($a, $b)->result;
    print $soap->add($a, $b), "\n";

    @d = (1, 2, 3, 4, 7);
    print $soap->add_array(@d), "\n";

I've omitted some bits of "client/server setup," but there's no
visible XML in any of that.

The problems with SOAP have to do with it being inefficient almost
beyond the wildest dreams of 3Com, Cisco, and Intel (the main
beneficiaries of the inefficiency in this case).

It should be unusual to need to look at the XML. Pretend it's like
CORBA's IIOP, which you generally don't look too closely at.

The place where you _DO_ look at or write some XML is with the "WSDL"
service description scheme, which is more or less similar to CORBA
IDL.

But I'd think CLOS/MOP would provide some absolutely _WONDERFUL_
opportunities there; it ought to be possible to write some CL that
would generate WSDL given references to classes and methods...


--
(concatenate 'string "aa454" "@freenet.carleton.ca")

http://www.ntlug.org/~cbbrowne/finances.html
I have this nagging fear that everyone is out to make me paranoid.

Erik Naggum

Mar 5, 2002, 7:37:01 PM

* Kenny Tilton

| I haven't seen what the Economist has to say, but XML /will/ be the next
| big thing if it works out as a lingua franca for data exchange. Not
| saying XML does not suck from the syntax standpoint, just that syntax
| can be fixed or (more likely) hidden.

XML would not be so bad as it is if it were possible to pin down how to
represent it in the memory of a computer. At this time, the most common
suggestion is _vastly_ worse than anything a Common Lisp programmer would
whip up in a drunken stupor. DOM, SAX, XSLT, whatever the hell these
morons are re-inventing, XML _could_ have been a pretty simple and
straight-forward syntax for a pretty simple and straight-forward external
representation of in-memory objects. This is manifestly _not_ the case,
since so much irrelevant crap has to be carried around in order to output
the same XML you read in.

There are certain mistakes people who have been exposed to Common Lisp
are not likely to make when it comes to mapping internal and external
representations of various object types. Every single one of those
mistakes has been made by the XML crowd, which is not very surprising,
considering the intense disdain for computer science that underlies the
SGML community -- they wanted to save their documents from the vagaries
of application programmers! Instead, they went into exactly the same
trap as every retarded application programmer has fallen into with their
eyes closed. And of _course_ Microsoft thinks it is so great -- XML
embodies the same kinds of mistakes that they are known for in their
proprietary unreadable "document" formats. All in all, a tragedy that
could have been avoided if they had only listened to people who knew how
computer scientists had been thinking about the same problem before them
-- but they would never listen, SGML was a political creation from before
it was born, and nobody should tell them how to do their stuff after it
had been standardized, lest it be deemed to have errors and mistakes...
Instead, we get anti-computer anti-scientists meddling with stuff they
have no hope of ever getting right, and certainly no hope of ever being able to fix.

XML will go down with Microsoft, whose Steve Ballmer has now threatened
to withdraw Windows XP from the market and not do any more "innovation"
because of the demands made by the government lawsuits! Next, we will
see organized crime barons around the world threaten to stop trafficking
drugs if the police do not stop harassing them. That would certainly
stop the world economy! Steve Ballmer has once again demonstrated why
the evil that is Microsoft must be stopped _before_ it acquires enough
power to actually hurt anyone by making such threats.

Christopher C. Stacy

Mar 6, 2002, 12:00:35 AM
>>>>> On 05 Mar 2002 16:34:32 -0500, Marco Antoniotti ("Marco") writes:

Marco> olc...@interaccess.com (Thaddeus L Olczyk) writes:

>> On Tue, 05 Mar 2002 16:20:55 GMT, Erik Naggum <er...@naggum.net> wrote:
>>
>> > XML, being the single suckiest syntactic invention in the history of
>> > mankind,
>> APL.

Marco> I beg to differ. APL *is* weird, but its syntax is amazingly
Marco> simple and regular. It is the net effect that is unreadable. This
Marco> net effect is due to the special glyphs required and to the fact that
Marco> operators have different "semantics" if monadic or dyadic.

You're right about the regularity, but I don't think that the glyphs
make it _less_ readable - it could be argued that they make it _more_
readable. Of course, if you don't know APL, it will be unreadable,
just like if you don't know Lisp it will be unreadable (in any
very meaningful way).

One can write impenetrable programs in any language, but with dense
languages like APL you pack more trouble on a single line.
Professional APL programmers strive to make their code readable.

Kenny Tilton

Mar 6, 2002, 12:11:31 AM
Here is my question to Ye Who Know XML: can you say what you want to say
in XML? I gather one huge objection is that there are N ways to say it.
As long as (plusp N), we're OK. No one other than a compiler author
should be thinking about RISC code, so all we need is.... XCL! Or XMCL:
better syntax compiled into some (doesn't matter what) legal XML.

Since Erik and Tim have bitched and moaned the most about XML, I think
they have to do this latest CL contrib. I mean, what self-respecting
c.l.l contributor cannot point to a pro bono CL contrib?

:)

The Xtraordinary thing about XML is that the world has pretty much
agreed we should all Xchange data in some (doesn't matter what)
universal /teXt/ format. That's the baby, syntaX is the bath water.
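Just to make the "doesn't matter what" concrete, here is a toy sketch in Python of the compiler direction Kenny is asking for (every name in it is invented for illustration): an s-expression-ish nested list in, some legal XML out.

```python
from xml.sax.saxutils import escape, quoteattr

def sexp_to_xml(form):
    """Compile a nested-list form such as ['doc', {'title': 'foo'}, ...]
    into well-formed XML text.  Strings become character data; an
    optional dict in second position becomes the attribute list."""
    if isinstance(form, str):
        return escape(form)
    tag, rest = form[0], list(form[1:])
    attrs = ""
    if rest and isinstance(rest[0], dict):
        attrs = "".join(" %s=%s" % (k, quoteattr(v))
                        for k, v in rest[0].items())
        rest = rest[1:]
    body = "".join(sexp_to_xml(child) for child in rest)
    return "<%s%s>%s</%s>" % (tag, attrs, body, tag)

print(sexp_to_xml(["doc", {"title": "foo"}, ["p", "text text"]]))
# → <doc title="foo"><p>text text</p></doc>
```

The point being that the "XCL" front end can be as pleasant as you like, as long as the back end emits something every XML tool can read.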

Jeff Greif

Mar 6, 2002, 12:20:23 AM
Yes, there are things wrong with XML; however, if you are forced to use
it, one of its good uses is for configuration files. When used for such
an application, validating with a DTD (or XML Schema or some other
structure constraint language) helps, at least if people ever modify the
configuration files by hand, and if you use off-the-shelf XML software
to provide some sort of parsed and validated representation for your
application. Why?

The main reason is that you get several layers of error checking and
reporting free:
-- well-formedness checking
-- primitive validity checking of the structure and attribute values
against the DTD or other constraint language

I've found that this greatly reduces the defensive parts of the
application (the possible erroneous conditions your code has to handle).
It wouldn't always be this way, but often when there is an error in the
configuration file, the application is not going any farther until it's
fixed, and the error reporting from the off-the-shelf software is just
good enough to enable the user to fix the file and retry.

Another thing you get free is default values for attributes. This
reduces the amount of editing by users (and may make filling out the
config file more palatable), and still allows the application defaults
to be changed without changing running code. For most users of an app,
typing in XML is a nuisance without editor support, particularly, for
more naive users, without the kind of support provided by an editor that
knows the structure desired from the DTD and basically only lets the
user construct something both well-formed and valid. If your users are guaranteed to
use one of these, you could skip the runtime validation against the DTD
and let the editor handle it. But it probably can't be guaranteed that
your users will use a structured editor.

Of course, you don't need any of this if your application flawlessly
generates the configuration file from some UI you present to the user.

Finally, this mode of operation allows a certain amount of version skew
between configuration file and your application. The parsing tools are
completely generic and deliver all the standard info from the XML file,
validated or not as you choose. What parts of it your application
chooses to grab is a separate decision, as is what it decides to do with
that information.

For your particular application, you could simply translate the XML file
to your sexpr format as early in the process as possible, leaving all
the validation stuff as it was before XML got into the picture. You
could do this before the inclusion processing was done (presumably the
included files would be XML also, and would have to be translated also).
You could probably do this simple translation with SAX event handlers
atop expat (or any other suitable parser). I'm pretty sure the Xerces
parser that comes with Apache can both validate according to a DTD or
XML Schema and also deliver SAX events. However, you'd have to have
separate DTDs for the outer file and the conditional inclusions.
Alternatively, you could use some kind of XML inclusion processor (I'm
not up on what's available) and validate after the entire structure was
assembled. Given how much you already have invested in handling the
sexpr-based format, this is probably the wrong choice.

I don't think it's doomed or hopeless, and it shouldn't cause the size of
the system to double.

Jeff


Christopher Browne

Mar 6, 2002, 12:26:54 AM

There is a counterargument to this...

APL programmers often strive HARD to avoid having explicit loops in
their code. (Not quite like Graham's "LOOP Considered Harmful;" more
like "Diamond Considered Harmful"...)

The legitimate reason for this is that if you keep the APL environment
humming on big matrix operations, it takes advantage of all the
Vectorized Power of APL.

Stopping for a while to interpret a loop is a substantial shift of
gears.

This has the result that code too often goes to near-pathological
extremes to do Matrix Stuff that replaces loops. The result of that
is that some "inscrutability" is introduced.
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/apl.html
God is dead. -Nietzsche
Nietzsche is dead. -God

Raymond Wiker

Mar 6, 2002, 2:56:20 AM
"Jeff Greif" <jgr...@spam-me-not.alumni.princeton.edu> writes:

> Yes, there are things wrong with XML; however, if you are forced to use
> it, one of its good uses is for configuration files. When used for such
> an application, validating with a DTD (or XML Schema or some other
> structure constraint language) helps, at least if people ever modify the
> configuration files by hand, and if you use off-the-shelf XML software
> to provide some sort of parsed and validated representation for your
> application. Why?
>
> The main reason is that you get several layers of error checking and
> reporting free:
> -- well-formedness checking
> -- primitive validity checking of the structure and attribute values
> against the DTD or other constraint language

I *really* disagree with this. Editing XML files is a royal
pain, and the only way to get rid of this pain is if you don't
actually see the XML. The only way not to see the XML is if the editor
hides the XML, which means that you have to have some smarts in the
editor. The XML format may (or may not) make the editor easier to
write, but you still have to augment the checking that the XML
machinery gives you.

--
Raymond Wiker Mail: Raymon...@fast.no
Senior Software Engineer Web: http://www.fast.no/
Fast Search & Transfer ASA Phone: +47 23 01 11 60
P.O. Box 1677 Vika Fax: +47 35 54 87 99
NO-0120 Oslo, NORWAY Mob: +47 48 01 11 60

Try FAST Search: http://alltheweb.com/

Espen Vestre

Mar 6, 2002, 3:45:13 AM
Erik Naggum <er...@naggum.net> writes:

> eyes closed. And of _course_ Microsoft thinks it is so great -- XML
> embodies the same kinds of mistakes that they are known for in their
> proprietary unreadable "document" formats.

The only thing I like about XML is the fact that XML versions of
Word documents so brilliantly expose how incredibly broken the software
producing them is.

Sigh. Even the NeXT people at Apple have started moving their property
list file format from a reasonable curly-bracket style to XML:

[macduck:/] root# cat /System/Library/StartupItems/SystemTuning/Resources/no.lproj/Localizable.strings
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist SYSTEM "file://localhost/System/Library/DTDs/PropertyList.dtd">
<plist version="0.9">
<dict>
<key>Tuning system</key>
<string>Stiller inn system</string>
</dict>
</plist>

--
(espen)

Tim Bradshaw

Mar 6, 2002, 7:24:03 AM
* Kenny Tilton wrote:
> Since Erik and Tim have bitched and moaned the most about XML, I think
> they have to do this latest CL contrib. I mean, what self-respecting
> c.l.l contributor cannot point to a pro bono CL contrib?

Well, if I had *time* I have this thing called DTML which is kind of
pointy-bracket-compliant lisp:

<doc :title "foo"
<tocifying :levels 3
<indexifying
<h1|This is a header, stuff in here is just text>
<p|text text>>>>

<module :type c++
<parameters
<build-parameters
<param :for (gcc c++-source-processor) :name :include-path
:value ("/local/include/" "/usr/local/fub/include")>>>>

It can emit XML or various other formats, and it has hacky but
functional tree-rewriting macros (very like the CL macro system) and
so on. We use it a lot, but only seriously for text, as I'd much
rather use sexps for data stuff (see second example above). I have to
write a manual and clean it up and make it work in more lisps, but one
day. Unfortunately not being an academic or independently wealthy
it's seriously non-trivial to find time, but one day.

(It was based on an idea Erik mentioned on cll, I suspect he may have
a better version of the same idea.)

--tim

Tim Bradshaw

Mar 6, 2002, 7:38:55 AM
* Jeff Greif wrote:
> The main reason is that you get several layers of error checking and
> reporting free:
> -- well-formedness checking
> -- primitive validity checking of the structure and attribute values
> against the DTD or other constraint language

You seem to have not understood my article. *If* my application was a
great static thing where I could sit down once and for all and write
some DTD, then I could get DTD validation. But I *can not have* a
single DTD (at a useful level), since someone can load modules at
runtime which allow new syntax and declare new things as valid. Think
Lisp.

Even if I could have a DTD the metasyntax means that the DTD needs to
be incredibly lax because I have to allow metasyntax everywhere.

So what I end up with is well-formedness and some idiot trivia like
default values for attributes (which are basically useless because you
can't put structured data there).

Well, I had that already. Now, after 1500 lines of extra code
(fortunately I didn't write all of it) and expat I have a kind of
worse version of the same thing. The previous config file reader
(including all the validation, and preprocessing) was 500 lines.

--tim

Julian Stecklina

Mar 6, 2002, 8:51:23 AM
"Eduardo Muñoz" <e...@jet.es> writes:

[...]

> > It was a fascinating time, I can tell you.
>
> I'm sure. I love when KMP (or someone else) talks
> about anciente (for me :) software or hardware
> (PDP's, VAX, TOPS, Lisp Machines, ITS and the
> like).


There is a German movie called "23" that deals with some "hackers" in
the mid-'80s. It was amazing when they bought a PDP as large as a
washing machine. :)

--
My homepage: http://julian.re6.de

To get my public key:
http://math-www.uni-paderborn.de/pgp/

Espen Vestre

Mar 6, 2002, 9:25:59 AM
Julian Stecklina <der_j...@web.de> writes:

> There is a German movie called "23" that deals with some "hackers" in
> the mid-'80s. It was amazing when they bought a PDP as large as a
> washing machine. :)

Gee, I forgot that there were PDPs that small. I remember the DEC-10
system I used to use, which, including all its extra equipment - was a
large room full of refrigerator-looking cabinets and some of those
lovely small top-loading washing machines (which really were disk
cabinets with - what a novelty! - removable hard drives!).
--
(espen)

Tim Bradshaw

Mar 6, 2002, 10:22:15 AM
* Kenny Tilton wrote:

> I haven't seen what the Economist has to say, but XML /will/ be the next
> big thing if it works out as a lingua franca for data exchange. Not
> saying XML does not suck from the syntax standpoint, just that syntax
> can be fixed or (more likely) hidden.

Sure, it will be the next big thing in the same sense that the web was
the last big thing: a lot of money will get spent on it, and there
will be a feeding frenzy and a few people will get rich, and then it
will all fall apart when people realise that it isn't actually
transforming the economy. Or perhaps people will still remember the
various web frenzies and it actually won't be the next big thing at
all.

What it won't do, I think, is *technically* transform things or make
life better. Here's a theory as to why:

A lot of commercial computing is all about reduction of friction in
various real-world processes. It's not a useful product in itself,
but it might make various other things less expensive to do. So the
whole e-commerce hype was predicated on the fact that e-commerce could
reduce the cost of transactions and give better access to information
thus enabling the market to work more efficiently.

This is a nice idea, and it ought to work. It's a bit of a let-down
for the web spin doctors that that's all it comes down to, but
actually frictional costs are often very high - a disturbingly large
amount of your phone bill goes to frictional costs of creating bills
for instance, so potentially it's a big win.

But this idea is only a win if the friction added by the computing
solution is lower than that it takes away. In particular these
systems should work. Currently a lot of the problem is that we just
can't write software that works reliably generally (some people can,
but you can't hire people and expect them to produce software that
works).

One cause of this is complexity. If you have a system that is complex
to use and understand, then most people will not use it correctly or
understand it. This will mean that software which uses it is
unreliable, or that it is very expensive to write and maintain.
Software complexity is friction.

CL people should be familiar with this - CL was a pretty complex
language for the 80s and this has historically meant that people find
it hard to use correctly and good CL programmers are expensive
(because they need to have read and internalised ~1000 pages of
specification).

Complexity - and its associated friction - is often necessary. I
think CL is an example: I don't think the language is much more
complex than it needs to be to do what it does. This is just an
opinion of course, others may differ. But in any case systems that do
complicated things need to be complicated.

But complexity that is *not* needed is pure friction, and is just a
cost.

I think that computing systems are becoming much more complex than
they need to be and thus much more frictional. In particular I think
this is true of XML in spades, and it looks like other people agree
with me. I'm not completely clear why this is happening, but my
hypothesis is that complexity is a kind of disease of people's minds.
The reason for this, I think is that people can only think about a
finite amount of stuff at once. If they get lured into a complicated
system, then they tend to have to spend all their time and energy
coping with the complexity, and they completely lose track of what the
system is actually for. So problems tend to get solved by adding more
complexity, because they can't step back and see the actual problem
any more. So once systems become sufficiently complex, people get lost
inside them and become unable to do anything but add yet more
complexity to the system. Occasionally people get so stressed by the
complexity that they revolt against it, and create systems whose sole
aim is to be *simple* - I think scheme is an example of such. These
people haven't escaped from the complexity disease: they are still
obsessed with complexity and have lost sight of the problem.

I think that this situation is fairly desperate. As more people
become lost in complexity and thus stop doing anything useful but
simply create more complexity, the complexity of systems increases
enormously. People who have managed to not fall into the trap but
still have some overview of the problems they are *actually* trying to
solve rather than the problems created by complexity, now have a
problem. In order to interact with these hugely overcomplex systems,
they need to understand how they work, so they have to devote
increasing resources to understanding the complexity until eventually
it overwhelms them and they too become trapped and stop doing anything
useful. This looks pretty toxic: complexity is a virus which is going
to get us all in the end unless we can find a way of simply not
interacting with the systems which contain the virus.

However it's not quite as bad as it looks because there are external
factors: these systems are meant to be used for people's financial
benefit. Overcomplex systems are more expensive to deploy and
maintain and less reliable than merely sufficiently complex systems.
So they make less money for people. This will put the brakes on
complexity: eventually you won't be able to get funding any more to
produce yet another 1000+-page spec for something to `patch' some
deficiency in XML but which actually simply makes it worse. It's not
clear to me whether the system will then equilibrate, in the way
that, say, Word probably has, at a level where it is merely expensive
but not crippling, or whether there will be a real backlash as people
decide they'd like to spend money on something other than yet more
software.

--tim

PS: you can see the kind of effect that complexity has on people as
they get lost in it in some of the followups to my original article.
I basically said that XML was an overcomplex nightmare, and at least
one response suggested that I could fix this by learning and using
some yet other encrustations on top of XML which would let me generate
s-expressions, at the cost of a few more thousand pages of
documentation to understand. But if you have an XML tokenizer, it's
basically *trivial* to generate s-expressions from XML.
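To put something behind "trivial", here is a sketch in Python against its stdlib expat binding (the nested-list shape used here is an arbitrary choice, not anyone's standard):

```python
import xml.parsers.expat

def xml_to_sexp(data):
    """Turn an XML document into nested lists of the form
    [tag, {attrs}, child-or-text, ...] using a bare expat tokenizer."""
    root = []
    stack = [root]
    p = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        node = [name, attrs]        # expat delivers attributes as a dict
        stack[-1].append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(text):
        if text.strip():            # drop ignorable whitespace
            stack[-1].append(text)

    p.StartElementHandler = start
    p.EndElementHandler = end
    p.CharacterDataHandler = chars
    p.Parse(data, True)
    return root[0]

print(xml_to_sexp('<config><param name="x">1</param></config>'))
# → ['config', {}, ['param', {'name': 'x'}, '1']]
```

Thirty-odd lines, no extra thousand-page spec required.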

Kent M Pitman

Mar 6, 2002, 10:55:48 AM
Julian Stecklina <der_j...@web.de> writes:

> "Eduardo Munoz" <e...@jet.es> writes:
>
> [...]
>
> > > It was a fascinating time, I can tell you.
> >
> > I'm sure. I love when KMP (or someone else) talks
> > about ancient (for me :) software or hardware
> > (PDP's, VAX, TOPS, Lisp Machines, ITS and the
> > like).
>
>
> There is a German movie called "23" that deals with some "hackers" in
> the mid-'80s. It was amazing when they bought a PDP as large as a
> washing machine. :)

Ha. Sometimes as SMALL as that. Mostly that's the size (and even
look) of a disk drive of the era. Except for the KS-10, which is a
latter day model that was very small, I think PDP10's, fully
configured with memory, etc. took up a LOT more space than that.
Certainly the ones we had at MIT did. I've seen some small
laundromats that weren't as big as the PDP10's we had...

Kenny Tilton

Mar 6, 2002, 11:30:34 AM

Tim Bradshaw wrote:
> But [XML] is only a win if the friction added by the computing
> solution is lower than that it takes away.

Agreed. From the beginning I have concentrated on making Cells friendly
as well as powerful, having seen many an innovation which seemed great
to its developers not catch on with a wider audience because it was such
a pain to use.

But that was my point about hiding the complexity of XML. I have a
compiler that lets me program RISC from "C" and WYSIWYG layout tools
that write HTML for me. I want the same from XCML. Or more likely a
product such as XIS from Excelon, a native XML database sitting atop
their C++ ODB.

The business world spends a lot on programs that do nothing but convert
the output of one program into a format comprehensible to a second
program, even where both programs were developed by the same
company. These conversion programs are a pain to write where the two
programs see the world differently, and they require steady maintenance
to keep up with changes to either the input or output format.

XML seems to me like it can minimize the sensitivity to file formats,
and since reaping this benefit requires folks to sit down and agree on
domain specific data structures, folks may get drawn into making their
apps see the world more uniformly. Call it self-fulfilling hype: the
world responds to the hype of a data lingua franca by taking steps they
could have taken twenty years ago, and voila the hyped product gets the
credit (and deserves it?)

Kenny Tilton

Mar 6, 2002, 11:36:58 AM

Tim Bradshaw wrote:
>
> * Kenny Tilton wrote:
> > Since Erik and Tim have bitched and moaned the most about XML, I think
> > they have to do this latest CL contrib. I mean, what self-respecting
> > c.l.l contributor cannot point to a pro bono CL contrib?
>
> Well, if I had *time* I have this thing called DTML which is kind of
> pointy-bracket-compliant lisp:

I wonder if that is what we need for CL-PDF. I want to look at marrying
the Cells project with CL-PDF, but unless I do a WYSIWYG document editor
(should I?) I will need a markup language parser.

> Unfortunately not being an academic or independently wealthy
> it's seriously non-trivial to find time

true, true... if I were not so keen on Cells I certainly would have
trouble finding the time.

Jeff Greif

Mar 6, 2002, 11:39:55 AM
Perhaps my response was not clear. I suggested that you provide a DTD
for each includable section, or module, and that the include directives,
or conditional include directives be a part of the XML. You parse the
XML files recursively from includer to included in each case using the
DTD of the file you're parsing (which knows nothing about what is in the
included files), and convert to sexpr form. Your application logic
evaluates the conditions, etc., and then decides whether to carry out
the specified inclusion, which in turn will be validated against its own
DTD and then converted to sexprs for deeper inclusions if any.

If these superficial DTDs that only validate the outer structure (down
to the includes) of each file aren't helpful, leave them out, if your
client will let you. If the client requires them, they shouldn't be all
that difficult to produce. They should only reflect the inclusion
syntax, not its semantics. An inclusion should be a leaf in a DTD.
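Concretely, such a superficial outer-structure DTD might look like this (the element names are invented for illustration; it describes only the inclusion syntax, not its semantics):

```dtd
<!-- Hypothetical DTD covering only the outer file structure.
     Included files are validated separately, against their own DTDs. -->
<!ELEMENT config (setting | include | platform-case)*>
<!ELEMENT setting (#PCDATA)>
<!ATTLIST setting name CDATA #REQUIRED>
<!-- The inclusion directive is a leaf: -->
<!ELEMENT include EMPTY>
<!ATTLIST include href CDATA #REQUIRED
                  platform CDATA #IMPLIED>
<!ELEMENT platform-case (include)+>
```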

Jeff

"Tim Bradshaw" <t...@cley.com> wrote in message
news:ey36649...@cley.com...

Paolo Amoroso

Mar 6, 2002, 11:41:37 AM
On 05 Mar 2002 21:58:50 +0100, "Eduardo Muñoz" <e...@jet.es> wrote:

> I'm sure. I love when KMP (or someone else) talks
> about anciente (for me :) software or hardware
> (PDP's, VAX, TOPS, Lisp Machines, ITS and the
> like).

You can get emulators for some of them (PDPs, VAX and more):

SIMH - Computer History Simulation Project
http://simh.trailing-edge.com

Erik and Kent recently commented that knowledge of early file systems is
useful for understanding Common Lisp pathnames. Those emulators may be an
occasion to learn more about those file systems. By the way, a Lisp
implementation for a PDP (PDP-6?) is also available at that site.


Paolo
--
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://www.paoloamoroso.it/ency/README
[http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/]

Frederic Brunel

Mar 6, 2002, 12:27:22 PM
Christopher Browne <cbbr...@acm.org> writes:

> When I have need to do so, I use Pierre Mai's C interface to expat.
> It uses the expat XML parser, and generates sexp output that can be
> read in using READ.
>
> (defun xml-reader (filename)
> (let ((xml-stream (common-lisp-user::run-shell-command
> (concatenate 'string *xml-parser* " <" filename)
> :output :stream)))
> (prog1
> (read xml-stream)
> (close xml-stream))))
>

> Note that this has the HIGHLY attractive feature of keeping all
> management of "ugliness" in a library (/usr/lib/libexpat.so.1) that is
> _widely_ used (including by such notables as Apache, Perl, Python, and
> PHP) so that it is likely to be kept _quite_ stable.

I think it's an acceptable solution for most systems and it could be
useful for me. Where could I find this piece of code before I write
my own? :)

--
Frederic Brunel
Software Engineer
In-Fusio, The Mobile Fun Connection

Ray Blaak

Mar 6, 2002, 12:26:15 PM

"Jeff Greif" <jgr...@spam-me-not.alumni.princeton.edu> writes:
> [DTD/Schema validation is good because]

> The main reason is that you get several layers of error checking and
> reporting free:
> -- well-formedness checking
> -- primitive validity checking of the structure and attribute values
> against the DTD or other constraint language
>
> I've found that this greatly reduces the defensive parts of the
> application (the possible erroneous conditions your code has to handle).

It doesn't. Since the application cannot assume that the input has already
been validated (there is nothing stopping a user from giving the application
complete garbage, after all), it needs to check anyway. The alternative is
uncontrolled crashes.

> It wouldn't always be this way, but often when there is an error in the
> configuration file, the application is not going any farther until it's
> fixed, and the error reporting from the off-the-shelf software is just
> good enough to enable the user to fix the file and retry.

The application can use the same off-the-shelf software to report errors as
well. Alternatively, the application, when looking for required elements or
finding malformed elements, can easily report the offending locations using
the services provided by standard XML tools.

> Another thing you get free is default values for attributes.

These default values can also be assumed directly by the application, given
the same benefits to the user.

> [...] and still allows the application defaults to be changed without
> changing running code.

This is one advantage of a DTD. However, if this is what one is after, then
default values can avoid being hardcoded in the usual way by being read in
from a vastly simpler configuration file (which can also be in XML, by the way,
but that is not the point).

> Finally, this mode of operation allows a certain amount of version skew
> between configuration file and your application. The parsing tools are
> completely generic and deliver all the standard info from the XML file,
> validated or not as you choose. What parts of it your application
> chooses to grab is a separate decision, as is what it decides to do with
> that information.

This can be done anyway. Application processing needs to be fairly tolerant so
that future or obsolete file versions can be accommodated. E.g. unexpected
elements can be ignored (perhaps with warnings) and only missing required
elements or malformed required elements are reported as errors.

Note that every grammar rule in a DTD or Schema implies some corresponding
code in the application to process it semantically. That is, the application
necessarily has the knowledge of the grammar hard coded within it. I prefer to
avoid the maintenance problem of keeping the grammar and the application in
synch.

The point is that what really matters is not the results of a prevalidation
against a DTD/Schema. In the end what matters is how the application processes
the file. The prevalidation can give you reasonable confidence, sure, but the
final validity of a file is not known until given to the application.

It's like dealing with fortune tellers: it's nice to know the future, but one
can't actually be sure of what will happen until the future actually becomes
now.

Similarly, it's nice to have some measure of confidence about an XML file, but
it will be given to the application anyway, so why not just skip a step?

> I'm pretty sure the Xerces parser that comes with Apache can both validate
> according to a DTD or XML Schema and also deliver SAX events.

Or you can simply ignore DTD/Schema validation, and your parsing is that much
faster.

If one does not worry about DTDs and Schemas there is a vast simplification in
usage (the client does not have to prevalidate, application parsing is
simpler), and complexity (no DTD/Schema needs to be developed and maintained),
with no real disadvantages (the same crucial error checking will be done
anyway).

So, why bother? It is just artificial work.
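A sketch of the skipped step in Python (stdlib ElementTree; the element names and the default value are invented): well-formedness checking comes free from the parser, and the checks that finally matter live in the application itself.

```python
import xml.etree.ElementTree as ET

def load_config(text):
    """Parse a config file without any DTD/Schema: the parser only
    checks well-formedness; required elements and defaults are the
    application's own business."""
    root = ET.fromstring(text)                    # raises on malformed XML
    errors = []
    host = root.findtext("host")
    if host is None:
        errors.append("missing required element: <host>")
    port = root.findtext("port", default="8080")  # application-level default
    if errors:
        raise ValueError("; ".join(errors))
    return {"host": host, "port": int(port)}

print(load_config("<config><host>example.org</host></config>"))
# → {'host': 'example.org', 'port': 8080}
```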

--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
bl...@telus.net The Rhythm has my soul.

Tim Bradshaw

Mar 6, 2002, 12:14:14 PM
* Jeff Greif wrote:
> Perhaps my response was not clear. I suggested that you provide a DTD
> for each includable section, or module, and that the include directives,
> or conditional include directives be a part of the XML. You parse the
> XML files recursively from includer to included in each case using the
> DTD of the file you're parsing (which knows nothing about what is in the
> included files), and convert to sexpr form. Your application logic
> evaluates the conditions, etc., and then decides whether to carry out
> the specified inclusion, which in turn will be validated against its own
> DTD and then converted to sexprs for deeper inclusions if any.

I suppose I could do this. It would mean that I'd have many times
more config files than I have currently because modules can no longer
declare syntax valid for a single file. And any module writer would
have the burden of writing a DTD as well, but I suppose that's
inevitable.

--tim

Jeff Greif

Mar 6, 2002, 1:05:43