* Fabrice Popineau <Fabrice.Popin...@supelec.fr>
| This is not the problem. You stated that 'there is no consensus on
| what an XML document means'.
I'm sorry, but could you please pay attention to what I'm saying so I don't have to reestablish the entire context _every_ time I say something you apparently are not going to accept and keep bickering about?
To be blunt: *ML documents derive meaning from sources external to the documents. Even if you use XSL to obtain meaning as far as _presentation_ is concerned, you still don't have a clue what you're dealing with unless you're actually the _same_ application as the writer of the XML document. *ML is no better than random chunks of binary data, but it also is no worse -- it could easily have been.
| The DOM is a recommendation of the W3C, so it is a consensus, even if
| you do not like it.
That's the worst non sequitur this newsgroup has suffered in a while. If you can't argue better than this, go back to school and shut up.
| From the 'parser problem' point of view, it is the recommended way
| to access the document and any parser should ideally follow it.
I'm glad you're providing evidence of your understanding that DOM is essentially no more than an access mechanism, which I called merely an alternate representation, not actually representing a _meaning_. Can you please make the effort to grasp the difference?
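As a minimal sketch of what "access mechanism" means here, the following uses Python's standard `xml.dom.minidom`. The `<order>` document and its element names are invented for illustration and do not come from the thread. The DOM hands back nodes and strings; nothing in it says what an "item" or a "qty" is supposed to mean to the application.

```python
from xml.dom.minidom import parseString

# Hypothetical document; the vocabulary is made up for this example.
dom = parseString("<order><item qty='2'>widget</item></order>")

# The DOM lets us navigate and read, but every value is just a string.
item = dom.getElementsByTagName("item")[0]
qty = item.getAttribute("qty")        # attribute value as text
name = item.firstChild.data           # character data of the first child node
parent = item.parentNode.tagName      # structural position, nothing more
```

Whether `qty` is a count, a weight, or a discount code is knowledge the reading application has to bring along itself.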
| From a practical point of view, I have found several DOM modules for
| Perl, C/C++ that quickly allowed me to hack XML documents but I have
| not been able to find the same thing for Lisp (any hint there ?).
| And even if DOM does not follow an ideally good design, it is
| already useful.
I was not talking about your ability to find useful tools to access *ML documents via DOM "API"'s, OK? Now, _get_ the idea, damnit!
| If you have better proposals, just submit them to the W3C.
Oh, Christ, another one of those. Just go away. If you don't like that response, please submit your suggestions for improvements to the Norwegian government, or better yet: NATO. Wait, try EU! No, make that the United Nations.
#:Erik -- If this is not what you expected, please alter your expectations.
Fabrice Popineau <Fabrice.Popin...@supelec.fr> wrote:
> If you have better proposals, just submit them to the W3C.
Irrespective of programming language I find it pretty tiresome to deal with XML on a low level, be it SAX or DOM. This level may be appropriate for applications that work _on_ XML. For applications that only _use_ XML for externally representing objects I'd much prefer a direct mapping between internal and external representation. If I'm not mistaken, this is very familiar to Lisp people. Incidentally, Sun is working on something like this for Java (keyword: XML data binding).
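The kind of direct mapping described above can be sketched as follows. This is only an illustration in Python using the standard `dataclasses` and `xml.etree.ElementTree` modules; the `Point` class and the `to_xml`/`from_xml` helpers are hypothetical names, and the sketch handles only flat objects with simple field types. It is not how Sun's data binding work or any Lisp library does it.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints
import xml.etree.ElementTree as ET


@dataclass
class Point:
    # Hypothetical application object, invented for this example.
    x: int
    y: int


def to_xml(obj) -> str:
    """Serialize a flat dataclass as <ClassName><field>value</field>...</ClassName>."""
    root = ET.Element(type(obj).__name__)
    for f in fields(obj):
        ET.SubElement(root, f.name).text = str(getattr(obj, f.name))
    return ET.tostring(root, encoding="unicode")


def from_xml(cls, text: str):
    """Rebuild a flat dataclass from XML in the shape produced by to_xml."""
    root = ET.fromstring(text)
    if root.tag != cls.__name__:
        raise ValueError(f"expected <{cls.__name__}>, got <{root.tag}>")
    hints = get_type_hints(cls)  # field name -> annotated type, used as converter
    return cls(**{name: conv(root.findtext(name)) for name, conv in hints.items()})


p = Point(3, 4)
text = to_xml(p)
q = from_xml(Point, text)
```

The application only ever touches `Point` objects; the XML is an external form the two helpers agree on, which is the sense in which the meaning lives in the code and not in the document.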
* Fabrice Popineau <Fabrice.Popin...@supelec.fr>
Fabrice> This is not the problem. You stated that 'there is no
Fabrice> consensus on what an XML document means'.

Erik> I'm sorry, but could you please pay attention to what I'm
Erik> saying so I don't have to reestablish the entire context
Erik> _every_ time I say something you apparently are not going to
Erik> accept and keep bickering about?
I apologize for not having taken your first assertion to its basic meaning. From my point of view, it has always been obvious that an XML document does not convey any meaning by itself (except if it is a standardized application of XML like MathML) and each of the writer and reader applications should be aware of the document's semantics. So I guess we agree on this point.
Erik> To be blunt: *ML documents derive meaning from sources external
Erik> to the documents. Even if you use XSL to obtain meaning as far
Erik> as _presentation_ is concerned, you still don't have a clue
Erik> what you're dealing with unless you're actually the _same_
Erik> application as the writer of the XML document. *ML is no
Erik> better than random chunks of binary data, but it also is no
Erik> worse -- it could easily have been.
I agree. You might expect to describe more semantics using metadata: RDF and schema descriptions of your document. But you will still be far from describing how to generate data structures (say, in Lisp) from an unknown XML document, even if it has associated metadata. That is why the DOM is lacking in semantics. By the way, do you know of any clear way to specify the semantics of generic documents? What would you like to find there?
Erik> I'm glad you're providing evidence of your understanding that
Erik> DOM is essentially no more than an access mechanism, which I
Erik> called merely an alternate representation, not actually
Erik> representing a _meaning_. Can you please make the effort to
Erik> grasp the difference?
I perfectly grasp the difference. Nobody ever said that an XML document should convey meaning, and that's why your first assertion was misleading.
> <program>
>   <function-definition>foo<arglist></arglist>
>     <application>display "I am paren-challenged"</application>
>     <application>newline</application>
>   </function-definition>
>   <application>foo</application>
>   <application>exit 0</application>
> </program>
Hot Damn! Without those parentheses, it suddenly becomes orders of magnitude more readable! Why didn't we think of this before? Do you have a DTD for this?
Oh, just noticed a typo (no doubt because it is so much easier to read):
<application>display "I am paren-challenged"</application>
And of course, we have to consider the crucial question of indentation. So allow me to be the first to point out that unless the </function-definition> tag is lined up with the body of the function, you will be excommunicated.
> * Tim Bradshaw <t...@cley.com>
> | Can entities also expand to syntactically/lexically-nonsensical
> | things?

> Yes. There are some feeble attempts to restrict the nonsense in
> SGML and some less feeble, but not particularly strong, attempts at
> same in XML.
The kind of splicing enmacrofurbulation made famous by the infamous string-munching preprocessor of C (and PL/I in 1966) is not allowed in XML. I don't know how strictly various parsers enforce this requirement. Any that don't enforce it shouldn't be used, since they encourage creative misuse of the language.
The XML specification 4.3.2 specifically says:
A consequence of well-formedness in entities is that the logical and physical structures in an XML document are properly nested; no start-tag, end-tag, empty-element tag, element, comment, processing instruction, character reference, or entity reference can begin in one entity and end in another.
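The nesting rule quoted above can be tried out with a short sketch. It uses Python's standard `xml.etree.ElementTree`, which is built on the expat parser; the two test documents and the `is_well_formed` helper are invented for illustration, and other parsers may report the error differently.

```python
import xml.etree.ElementTree as ET

# An entity whose replacement text is itself balanced markup.
BALANCED = '<!DOCTYPE d [<!ENTITY e "<b>bold</b>">]><d>&e;</d>'

# An entity that opens <b> and leaves the closing tag to the document,
# in the style of C preprocessor splicing: the element would begin in
# one entity and end in another.
SPLICED = '<!DOCTYPE d [<!ENTITY open "<b>">]><d>&open;bold</b></d>'


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


balanced_ok = is_well_formed(BALANCED)
spliced_ok = is_well_formed(SPLICED)
```

With an expat-based parser the first document is accepted and the second is rejected, which is the behaviour section 4.3.2 asks for.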
I agree with Erique that the way this requirement is expressed is indirect and feeble, and to me it seems the result of a committee compromise or an afterthought. To read the XML specification is to realize that nothing at all has been learned in the past 30 years of computer science.
"One learns from one's failures, not one's successes."
Centuries ago, Nostradamus foresaw a time when Erik Naggum would say:
>* Fabrice Popineau <Fabrice.Popin...@supelec.fr>
>| This is not the problem. You stated that 'there is no consensus on
>| what an XML document means'.

> I'm sorry, but could you please pay attention to what I'm saying so
> I don't have to reestablish the entire context _every_ time I say
> something you apparently are not going to accept and keep bickering
> about?

> To be blunt: *ML documents derive meaning from sources external to
> the documents. Even if you use XSL to obtain meaning as far as
> _presentation_ is concerned, you still don't have a clue what you're
> dealing with unless you're actually the _same_ application as the
> writer of the XML document. *ML is no better than random chunks of
> binary data, but it also is no worse -- it could easily have been.
Don't Lisp programs suffer from the same problem?
(CAR WHATEVER) derives meaning from whatever external meaning you've attached to whatever is in the sequence WHATEVER.
To be sure, DTDs are not as useful in determining semantics as one might _want_ them to be, but they _do_ provide _some_ indication of meaning.
--
cbbro...@ntlug.org - <http://www.ntlug.org/~cbbrowne/lsf.html>
Where do you *not* want to go today? "Confutatis maledictis, flammis
acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>)
* Christopher Browne
| Don't Lisp programs suffer from the same problem?
No. Lisp programs do not exist outside of the language definition.
| (CAR WHATEVER) derives meaning from whatever external meaning you've
| attached to whatever is in the sequence WHATEVER.
Nonsense. car has defined meaning regardless of what whatever is, and the whole form has defined meaning regardless of which operator is in the first position.
| To be sure, DTDs are not as useful in determining semantics as one
| might _want_ them to be, but they _do_ provide _some_ indication of
| meaning.
Like what?
#:Erik -- If this is not what you expected, please alter your expectations.
> Centuries ago, Nostradamus foresaw a time when Erik Naggum would say:
> >* Christopher Browne
> >| Don't Lisp programs suffer from the same problem?

> > No. Lisp programs do not exist outside of the language definition.

> >| (CAR WHATEVER) derives meaning from whatever external meaning you've
> >| attached to whatever is in the sequence WHATEVER.

> > Nonsense. car has defined meaning regardless of what whatever is,
> > and the whole form has defined meaning regardless of which operator
> > is in the first position.

> Sure, there's _a_ meaning.

> But the _intended_ meaning can vary considerably, depending on the
> context of what data I stuck into WHATEVER, and what Lisp form this
> reference is embedded into.

> Based on looking at a bit of code that says (car a1), I can't tell
> much about what it means.

> In contrast, if I look at an SGML document fragment:

> <sect1> <title> Introduction </title>

> it is reasonably likely that, even without knowing anything about the
> DTD, we can readily guess something about the intent of <sect1> and
> <title>.
Yes, the tags convey some information, because they were deliberately created that way. That can be done with Lisp, too.
> >| To be sure, DTDs are not as useful in determining semantics as one
> >| might _want_ them to be, but they _do_ provide _some_ indication of
> >| meaning.

> > Like what?

> Whether it's you writing the code that processes the FOS or sosofo, or
> me, we're likely to have _some_ common realization of the structure of
> the results that should come out of something like:
Centuries ago, Nostradamus foresaw a time when Erik Naggum would say:
>* Christopher Browne
>| Don't Lisp programs suffer from the same problem?
> No. Lisp programs do not exist outside of the language definition.
>| (CAR WHATEVER) derives meaning from whatever external meaning you've
>| attached to whatever is in the sequence WHATEVER.

> Nonsense. car has defined meaning regardless of what whatever is,
> and the whole form has defined meaning regardless of which operator
> is in the first position.
Sure, there's _a_ meaning.
But the _intended_ meaning can vary considerably, depending on the context of what data I stuck into WHATEVER, and what Lisp form this reference is embedded into.
Based on looking at a bit of code that says (car a1), I can't tell much about what it means.
In contrast, if I look at an SGML document fragment:
<sect1> <title> Introduction </title>
it is reasonably likely that, even without knowing anything about the DTD, we can readily guess something about the intent of <sect1> and <title>.
>| To be sure, DTDs are not as useful in determining semantics as one
>| might _want_ them to be, but they _do_ provide _some_ indication of
>| meaning.
> Like what?
Whether it's you writing the code that processes the FOS or sosofo, or me, we're likely to have _some_ common realization of the structure of the results that should come out of something like:
<sect1>
<title> Introduction </title>
<para> ... stuff ... </para>
</sect1>
--
cbbro...@hex.net - <http://www.hex.net/~cbbrowne/linux.html>
Roses are red
Violets are blue
Some poems rhyme
But this one doesn't.
* Christopher Browne wrote:
> Based on looking at a bit of code that says (car a1), I can't tell
> much about what it means.
> In contrast, if I look at an SGML document fragment:
> <sect1> <title> Introduction </title>
> it is reasonably likely that, even without knowing anything about the
> DTD, we can readily guess something about the intent of <sect1> and
> <title>.
Yes, but this is an entirely different thing. That you can *guess* something about the intent is entirely different from saying that the DTD tells you what the intent is.
If you start worrying about just what exactly it means to `have an intent' or `have a meaning' you will rapidly fall into a quagmire of philosophy and probably be doomed to spend the rest of your life as an embittered cognitive scientist or something. But you can stay away from that by asking much more specific questions.
Take the string.
"(lambda () (let ((x '(1 2))) (car x)))"
Then there are several things you can do:
READ (well, READ-FROM-STRING) will accept this and return an object. So you know that it's well-formed as a lisp form.
COMPILE (something like (compile nil ...)) will accept what READ gave you and return another object. So you know that it's well-formed as a lisp program.
FUNCALL will accept that object and return 1. So you know that that lisp program actually does something.
Cognitive scientists will disagree with all this, because CL probably isn't formally enough specified, but I don't care about them. And language lawyers will point out that you have to be in the right package and the readtable has to be sane, and I've carefully chosen the string not to have anything that might be a macro in it which makes it possibly-indeterminate whether it's a well-formed program, but I don't care about them either.
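The three stages listed above have a rough counterpart in most languages that expose their own reader and compiler. As a hedged illustration only, here is the same progression in Python rather than Common Lisp, using the standard `ast` module and the built-in `compile` and `eval`. The source string is a loose Python rendering of the Lisp form, not a translation anyone in the thread supplied, and Python splits the work slightly differently: `compile` yields a code object, and evaluating that yields the function.

```python
import ast

# Loose Python counterpart of "(lambda () (let ((x '(1 2))) (car x)))".
SOURCE = "lambda: [1, 2][0]"

# Stage 1 (READ): text -> syntax tree. Raises SyntaxError if malformed,
# so success means the string is well-formed as an expression.
tree = ast.parse(SOURCE, mode="eval")

# Stage 2 (COMPILE): syntax tree -> code object.
code = compile(tree, filename="<string>", mode="eval")

# Stage 3 (FUNCALL): evaluate to obtain the function object, then call it.
fn = eval(code)
result = fn()
```

Each stage accepts the output of the previous one and can fail independently, which is the point being made about how much of a program's meaning the language definition pins down.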
Now the point is that SGML and XML only give you the first two stages, at best. In fact I think they give only partial bits of them:
I'm not sure (someone will know) if either assigns a structure to the string rather than just saying that it's well-formed. I presume they do.
SGML only gives you the second stage in general: without the grammar, you can't even tell if a string is readable the way you can in Lisp. XML, I think, aimed to give both first and second stages, so you should be able to check an XML document for first-stage well-formedness even without a grammar. I don't know if it succeeds.
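For XML at least, the first stage can be tried without any grammar. The sketch below uses Python's standard `xml.etree.ElementTree`, a non-validating parser, to check well-formedness and to show that a structure is assigned to the string. The `outline` helper is a hypothetical name, and the sample fragment is borrowed from the `<sect1>` example elsewhere in this thread.

```python
import xml.etree.ElementTree as ET


def outline(text: str):
    """Return the element structure as nested (tag, children) tuples.

    Raises ET.ParseError if the text is not well-formed. No DTD is consulted.
    """
    def walk(elem):
        return (elem.tag, [walk(child) for child in elem])
    return walk(ET.fromstring(text))


GOOD = "<sect1><title>Introduction</title><para>stuff</para></sect1>"
BAD = "<sect1><title>Introduction</sect1>"   # <title> is never closed

structure = outline(GOOD)

try:
    outline(BAD)
    bad_rejected = False
except ET.ParseError:
    bad_rejected = True
```

The parser yields a tree of tags, but whether `<para>` is allowed inside `<sect1>` is a second-stage question it cannot answer without a grammar.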
All this will not satisfy people who care about formal semantics and so on. All I'm trying to get at is that it's clear that Lisp programs do have a whole bunch more `meaning' than *ML documents, in a sense that can be made formal.
* Christopher Browne
| But the _intended_ meaning can vary considerably, depending on the
| context of what data I stuck into WHATEVER, and what Lisp form this
| reference is embedded into.
Nonsense. Failure to include information about an enclosing form does not constitute a change of semantics for the form so enclosed. It is simply not useful to communicate with other people with a fear that everything they say might have been enclosed in a `not' form, and it is not useful to blame the recipient for not having taken such into account when interpreting the meaning of what they say.
| Based on looking at a bit of code that says (car a1), I can't tell
| much about what it means.
No, obviously _you_ can't, since you have made up your mind that you can enclose a form in any form at all to _rob_ it of meaning, a pretty silly move, but necessary in order to argue that SGML _has_ meaning, since SGML has meaning _only_ relative to external sources and that hypothetical-mythical enclosing form has the same status for the Lisp forms: The Great Unknown Semantic Modifier.
Once again, an SGML fan is displaying his lack of clue. Boring!
#:Erik -- If this is not what you expected, please alter your expectations.
Clint Hyde wrote:
> I'm sure this is a FAQ by now...I'm out of touch...
> I want an XML parser written in lisp...is there such a thing? available > free? where?
Dan Barlow pointed me at two parsers. I wasn't able to get at the first URL at the time, so I went with:
Scott ?*, who pointed me at his XML parser. it's ok--won't read/use a DTD (i.e., it's not a validating parser), does require well-formed XML (close-tags are required, empty tags will break it), has a couple of quirks, at least one of which I eliminated (text-strings (like comments) weren't allowed to have commas in them ?!, just required a quickie fix to the custom read-table the parser uses, to remove the reader-macro on the comma).
I was able to fire it up without too much trouble (well, a lot, actually, because it is written to use things I don't/won't use, like the mkant defsystem--portability is NOT my concern, but build efficiency is).
it may be that the other xml parser is/will-be better, apparently it will read a DTD...
> Erik Naggum must have written this by now if no one else has :)
and to my surprise, he hadn't. he did point out that he wanted a tree of objects coming out of one, so do I, and that's what I have. he still might not find it satisfactory...
I used it to build a quickie CLIM app, which you can download if you're interested:
all that's there is a zip with the application (Win-NT only, since that's the PC ACL I have), and a zip with the source needed to compile this and build it. I modified the classes in the xml-parser to support being drawn in my clim app. that might be unsatisfactory to others, you'd have to build a shadow tree for use in the window where each node would point to a corresponding node in the xml. the source should be easy enough to build in some other clim. I could make a Solaris app if someone wanted it.
the app: uses a real tree view of the xml-structure (using clim's format-graph-from-root). you can drag-n-drop to re-arrange the structure, you can add/remove nodes. you can load/save, and if you save and reload, you do get back what you wrote out.
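The rearrange, save and reload cycle described above can be sketched outside CLIM as well. The following is an illustration in Python with the standard `xml.etree.ElementTree`, not the application's code; the `<book>` document and the `shape` helper are invented for the example.

```python
import xml.etree.ElementTree as ET


def shape(elem):
    """Comparable summary of an element tree: tag, attributes, text, children."""
    return (elem.tag, dict(elem.attrib), (elem.text or "").strip(),
            [shape(child) for child in elem])


doc = ET.fromstring(
    "<book><chapter id='1'>one</chapter><chapter id='2'>two</chapter></book>"
)

# Rearrange: move the first chapter to the end, as a drag-and-drop might.
first = doc[0]
doc.remove(first)
doc.append(first)

# Save to a string and reload it.
saved = ET.tostring(doc, encoding="unicode")
reloaded = ET.fromstring(saved)

order = [chapter.get("id") for chapter in reloaded]
round_trip_ok = shape(reloaded) == shape(doc)
```

Comparing a structural summary, and not the raw text, is a deliberate choice here: serializers are free to change quoting and whitespace, so "you get back what you wrote out" is best checked on the tree.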
if you try to open a DTD, it will break. fortunately, for any errors, thanks to Jeff Morrill, this is handled cleanly: you get a popup menu-choose, and you get to pick standard lisp proceed-choices...i.e., you can recover back to where the app is waiting for you to click on something.
I did this in about 12 hours total. very nearly all MY code in it was cut-n-paste from other projects over the past few years. the app includes a sample XML file...if you have another one that is bigger/different/breaks-the-program, I'd like to know what I've missed/left-out.
of course: GPL applies here. give credit where it is due...feel free to take the code and modify as you like. send me any improvements :) or bug-reports :(
--
please reply direct to Clint Hyde <ch...@bbn.com>
I don't have enough time to scan everything I'd like to, and don't want to miss your answers...
Note that this list includes one parser not mentioned in this thread so far: James Anderson's CL-XML. This is the most complete of the four I have listed and even contains a DOM implementation.