Processing XML-formatted documents in CL
Ron Parker  
Newsgroups: comp.lang.lisp
From: Ron Parker <rdpar...@gmail.com>
Date: Tue, 13 Jan 2009 19:57:47 -0800 (PST)
Local: Tues, Jan 13 2009 10:57 pm
Subject: Processing XML-formatted documents in CL
I need to analyze and manipulate several multi-megabyte documents that
are stored in various XML formats. There appears to be a fairly long
list of XML parsers and tools on the CLiki but I have no frame of
reference to judge them.

Some of the documents I need to access will probably contain Unicode,
although I am not positive of this.

Half of me wants to hack it from scratch, but this is my first CL
project so I don't know if this would be wise even though I've hacked
Elisp for a couple decades off and on.

Any recommendations in this area would be appreciated.


 
Xah Lee  
Newsgroups: comp.lang.lisp
From: Xah Lee <xah...@gmail.com>
Date: Tue, 13 Jan 2009 22:43:51 -0800 (PST)
Local: Wed, Jan 14 2009 1:43 am
Subject: Re: Processing XML-formatted documents in CL
On Jan 13, 7:57 pm, Ron Parker <rdpar...@gmail.com> wrote:

> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.

> Some of the documents I need to access will probably contain Unicode,
> although I am not positive of this.

> Half of me wants to hack it from scratch, but this is my first CL
> project so I don't know if this would be wise even though I've hacked
> Elisp for a couple decades off and on.

> Any recommendations in this area would be appreciated.

Thought I'd mention nxml-mode, written by the XML expert James Clark.
It features an XML parser and validates XML as you type.

Since it contains a complete parser, I think it could be used for your
project, but I am not sure whether the code is structured to be used
that way. (The code is over 10k lines.)

(James also wrote expat, the widely used XML parser in C, which
happened to be part of the app server software we wrote in Perl back
in 1999; I only realized it was his work around 2007.)

  Xah
http://xahlee.org/



 
Volkan YAZICI  
Newsgroups: comp.lang.lisp
From: Volkan YAZICI <volkan.yaz...@gmail.com>
Date: Wed, 14 Jan 2009 00:26:41 -0800 (PST)
Local: Wed, Jan 14 2009 3:26 am
Subject: Re: Processing XML-formatted documents in CL
On Jan 14, 5:57 am, Ron Parker <rdpar...@gmail.com> wrote:

> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.

> Some of the documents I need to access will probably contain Unicode,
> although I am not positive of this.

I think Closure XML[1] is pretty good; it has good support and a good
community.

You didn't specify much about what you mean by "multi-megabyte", but
the size could be a problem. Anyway, let's just try it and see. If it
fails for memory-related reasons, you can first convert your XML
files into s-expressions using some sort of XSLT and then easily
parse these s-expressions from Lisp.

Regards.

[1] http://common-lisp.net/project/cxml/
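
The second half of the fallback described above (reading the converted
s-expressions from Lisp) could look roughly like the sketch below. It
uses only standard Common Lisp. The XSLT step is not shown, and the
shape of its output, (tag (attributes) child...), along with the names
*converted*, read-document, child-elements and book-titles, is made up
for illustration.

```lisp
;; Hypothetical output of the XSLT step: one s-expression per document,
;; shaped as (tag (attribute-plist) child...). This shape is an assumption.
(defparameter *converted*
  "(library ()
     (book (:id \"1\") (title () \"PAIP\"))
     (book (:id \"2\") (title () \"SICP\")))")

(defun read-document (stream)
  "Read one s-expression from STREAM with read-time evaluation disabled."
  (with-standard-io-syntax
    (let ((*read-eval* nil))   ; never evaluate #. forms found in data files
      (read stream))))

(defun child-elements (node tag)
  "Return the children of NODE that are elements named TAG (a string)."
  (remove-if-not (lambda (child)
                   (and (consp child)
                        (symbolp (first child))
                        ;; Compare by name so the current package does not matter.
                        (string-equal (first child) tag)))
                 (cddr node)))

(defun book-titles (document)
  "Collect the text of every title element under every book element."
  (loop for book in (child-elements document "book")
        append (loop for title in (child-elements book "title")
                     collect (third title))))

;; Usage:
;; (with-input-from-string (in *converted*)
;;   (book-titles (read-document in)))
```

Binding *read-eval* to NIL matters when the files come from elsewhere,
because READ would otherwise execute any #. form it finds in the data.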


 
Andy Chambers  
Newsgroups: comp.lang.lisp
From: Andy Chambers <achambers.h...@googlemail.com>
Date: Wed, 14 Jan 2009 02:26:14 -0800 (PST)
Local: Wed, Jan 14 2009 5:26 am
Subject: Re: Processing XML-formatted documents in CL
On Jan 14, 8:26 am, Volkan YAZICI <volkan.yaz...@gmail.com> wrote:

> On Jan 14, 5:57 am, Ron Parker <rdpar...@gmail.com> wrote:

> > I need to analyze and manipulate several multi-megabyte documents that
> > are stored in various XML formats. There appears to be a fairly long
> > list of XML parsers and tools on the CLiki but I have no frame of
> > reference to judge them.

> > Some of the documents I need to access will probably contain Unicode,
> > although I am not positive of this.

> I think Closure XML[1] is pretty good; have a good support and
> community.

> You didn't specify much about what you mean with "multi-megabyte", but
> this issue could be a problem.

Not really.  Closure can process documents as streams or trees, and
its stream processor has a really nice interface (klacks).

--
Andy
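
As a rough illustration of the stream interface mentioned above, the
sketch below walks a document with klacks and counts start tags without
ever building a tree, so memory use should not grow with document size.
It assumes cxml is already loaded (for example through ASDF); the
function name count-elements is made up for this example.

```lisp
;; Assumes the cxml system has been loaded before this form is read.
(defun count-elements (input)
  "Count start tags by local name without building a tree.
INPUT is anything CXML:MAKE-SOURCE accepts, such as a pathname."
  (let ((counts (make-hash-table :test #'equal)))
    (klacks:with-open-source (source (cxml:make-source input))
      ;; KLACKS:PEEK returns the next event keyword, or NIL at the end.
      (loop for event = (klacks:peek source)
            while event
            do (when (eq event :start-element)
                 (incf (gethash (klacks:current-lname source) counts 0)))
               ;; Advance the source past the event just inspected.
               (klacks:consume source)))
    counts))

;; Usage (the file name is hypothetical):
;; (maphash (lambda (name n) (format t "~A: ~D~%" name n))
;;          (count-elements #p"big-document.xml"))
```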


 
GP lisper  
Newsgroups: comp.lang.lisp
From: GP lisper <spamb...@CloudDancer.com>
Date: Wed, 14 Jan 2009 02:52:38 -0800
Local: Wed, Jan 14 2009 5:52 am
Subject: Re: Processing XML-formatted documents in CL

On Tue, 13 Jan 2009 19:57:47 -0800 (PST), <rdpar...@gmail.com> wrote:
> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.

If it is standards-compliant XML, any of the fancy code will work.

When I faced this problem about 3 years ago, "s-xml" solved my
non-standard XML problems nicely.  I still use it for everything,
since I didn't need to learn any buzzwords and specs to apply it.

You'll probably try a few parsers anyway; it sounds like speed will be
an issue.

--
"Most programmers use this on-line documentation nearly all of the
time, and thereby avoid the need to handle bulky manuals and perform
the translation from barbarous tongues."  CMU CL User Manual


 
Zach Beane  
Newsgroups: comp.lang.lisp
From: Zach Beane <x...@xach.com>
Date: Wed, 14 Jan 2009 08:26:22 -0500
Local: Wed, Jan 14 2009 8:26 am
Subject: Re: Processing XML-formatted documents in CL

Ron Parker <rdpar...@gmail.com> writes:
> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.

> Some of the documents I need to access will probably contain Unicode,
> although I am not positive of this.

> Half of me wants to hack it from scratch, but this is my first CL
> project so I don't know if this would be wise even though I've hacked
> Elisp for a couple decades off and on.

> Any recommendations in this area would be appreciated.

I've been a happy user of Closure XML for some time now. It's very
capable and the documentation is good.

Zach


 
game_designer  
Newsgroups: comp.lang.lisp
From: game_designer <alex.repenn...@gmail.com>
Date: Thu, 15 Jan 2009 07:03:39 -0800 (PST)
Local: Thurs, Jan 15 2009 10:03 am
Subject: Re: Processing XML-formatted documents in CL
On Jan 13, 8:57 pm, Ron Parker <rdpar...@gmail.com> wrote:

> Any recommendations in this area would be appreciated.

If the goal is to read the file and map XML elements onto similarly
structured CLOS objects, you may want to explore XMLisp. A new version
was released recently:

http://www.agentsheets.com/lisp/XMLisp/

If it works, great! If not, let me know why not.

Alex


 
Chaitanya Gupta  
Newsgroups: comp.lang.lisp
From: Chaitanya Gupta <m...@chaitanyagupta.com>
Date: Tue, 20 Jan 2009 13:50:12 +0530
Local: Tues, Jan 20 2009 3:20 am
Subject: Re: Processing XML-formatted documents in CL

Ron Parker wrote:
> I need to analyze and manipulate several multi-megabyte documents that
> are stored in various XML formats. There appears to be a fairly long
> list of XML parsers and tools on the CLiki but I have no frame of
> reference to judge them.

Others have mentioned CXML. It is quite good for most of your XML needs,
but since you mention multi-megabyte XML documents, its performance
might be an issue[1].

If that is the case, consider using S-XML:
http://common-lisp.net/project/s-xml/

I don't know if S-XML will satisfy your Unicode needs or not, but its
DOM parser is faster than CXML's.

Chaitanya

1. http://common-lisp.net/pipermail/cxml-devel/2008-September/000444.html
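
For comparison, a minimal sketch of tree parsing with S-XML, assuming
the s-xml system is already loaded. The wrapper names load-document and
parse-snippet are invented here; the :output-type keyword selects the
tree representation, and :lxml asks for plain nested lists.

```lisp
;; Assumes the s-xml system has been loaded before this form is read.
(defun load-document (pathname)
  "Parse the XML file at PATHNAME into a nested-list (LXML) tree."
  (s-xml:parse-xml-file pathname :output-type :lxml))

(defun parse-snippet (string)
  "Parse XML held in STRING into the same nested-list representation."
  (s-xml:parse-xml-string string :output-type :lxml))

;; Usage:
;; (parse-snippet "<a><b>hi</b></a>")
```

Because the result is ordinary list structure, it can be inspected with
the usual list functions, which makes it easy to try on a sample file
before committing to one parser.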


 
David Lichteblau  
Newsgroups: comp.lang.lisp
From: David Lichteblau <usenet-2...@lichteblau.com>
Date: 20 Jan 2009 08:57:28 GMT
Local: Tues, Jan 20 2009 3:57 am
Subject: Re: Processing XML-formatted documents in CL
On 2009-01-20, Chaitanya Gupta <m...@chaitanyagupta.com> wrote:

> Others have mentioned CXML. It is quite good for most of your XML needs,
> but since you mention multi-megabyte XML documents, its performance
> might be an issue[1].
[...]
> 1. http://common-lisp.net/pipermail/cxml-devel/2008-September/000444.html

Hey, I don't claim that cxml is the fastest XML implementation around.

But in that mailing list post above, you had issues with Allegro CL's
default scheduler configuration, not cxml speed.

So MP:*DEFAULT-PROCESS-QUANTUM* is very high.
That's not cxml's fault.

d.


 
Chaitanya Gupta  
Newsgroups: comp.lang.lisp
From: Chaitanya Gupta <m...@chaitanyagupta.com>
Date: Tue, 20 Jan 2009 17:25:04 +0530
Local: Tues, Jan 20 2009 6:55 am
Subject: Re: Processing XML-formatted documents in CL

David Lichteblau wrote:
> Hey, I don't claim that cxml is the fastest XML implementation around.

Right, you don't.

> But in that mailing list post above, you had issues with Allegro CL's
> default scheduler configuration, not cxml speed.

> So MP:*DEFAULT-PROCESS-QUANTUM* is very high.
> That's not cxml's fault.

We did play around with mp:*default-process-quantum*, but it didn't help
much.

And we discovered that S-XML's DOM parser was faster and, IIRC, had
lower memory needs. It also didn't cause any of the image-hanging
issues that we faced with CXML. In the end, we decided to switch to
S-XML for that particular service, and (so I've heard, since I had left
by then) there haven't been any sleepless nights since. ;)

Mind you, CXML is still my tool of choice for XML parsing needs, and I
am particularly grateful for the extensions built on top of it
(cxml-rng, plexippus-xpath, etc.), but its performance did bite us once.
So I am just letting the OP know of a good alternative in case he feels
the performance pinch too.

Chaitanya


 