Adding XmlWriter API


Marco Rogers

Jan 13, 2012, 7:45:00 PM
to libx...@googlegroups.com
There's an open pull request to support the XmlWriter api in libxmljs.


I'm not super familiar with this type of api, so I wanted to open it up for discussion. Why would you say this is a good idea? I'm not against it. But I also don't want to support every feature of libxml2. I want to be selective about what people really need from xml when they're writing javascript.

What are some examples where people have used this api?

:Marco

--
Marco Rogers
marco....@gmail.com | https://twitter.com/polotek

Life is ten percent what happens to you and ninety percent how you respond to it.
- Lou Holtz

Roman Shtylman

Jan 13, 2012, 8:33:40 PM
to libx...@googlegroups.com
The biggest use case I can think of is to avoid string concatenation when creating an xml document (to avoid errors with mismatched brackets, formatting, etc).

Some additional comments on the matter:

Another library that does xml writing:

Marco Rogers

Jan 13, 2012, 11:16:06 PM
to libx...@googlegroups.com
Ah okay. So it's just for building up a document piece by piece and knowing that it will come out valid. I'm gonna take a closer look at the code. I would love to make the api nicer. It's pretty verbose right now. There are other libraries with a writer/builder type api. See https://github.com/racker/node-elementtree.

:Marco

Lorenz Schori

Jan 15, 2012, 4:14:17 AM
to libx...@googlegroups.com
I'm using the SAXParser and XmlWriter interfaces from libxmljs in https://github.com/znerol/node-xmlshim. I thought this project was very much in the spirit of the statement you made in this post: https://groups.google.com/d/msg/libxmljs/SBY46kDx-sk/Coy6PJfvBnQJ

I'm okay with API changes. It would just be easier for people familiar with this or that or even libxml itself if the methods worked as expected. Sticking to what people expect would also help in avoiding situations like this one: https://groups.google.com/d/msg/libxmljs/awbDtHawHLQ/hqvCofllIeoJ

Marco Rogers

Feb 8, 2012, 12:49:12 AM
to libx...@googlegroups.com
Where did we land on this? I see that @znerol has updated his pull request. I still think we can come up with a better api than what is here. But since I don't have the time to work on that, I don't want to hold this up if others think it's a good thing. Roman?

The issue here is that we will be committing to an API for a while. Or if we ever do decide to change it, we are in the position of breaking backwards compatibility. I don't like doing that, which is why I'm a fan of thinking hard about API up front. Maybe I should spend some time thinking about an API I'd be more comfortable with.

:Marco

Roman Shtylman

Feb 9, 2012, 12:27:26 AM
to libx...@googlegroups.com
The more I look at the api, the less clear it becomes to me why the existing api can't be used to build a document. Furthermore, I am not sure that xml writing is one of those things that is worth doing in c/c++. Parsing was worth it because of the complexities in xml edge cases and html brokenness, but writing it out seems like something that a js library could do.

Those hesitations aside, I think the api presented matches pretty closely by name with the libxml2 c api (not sure if that matters).

Again, I did not have a use case personally for it (and think there are probably pure js solutions that are adequate). I guess my general sentiment is that I am not sold on it. @znerol, what are your thoughts on pure js solutions out there? There must have been some reason you felt that extending the c api was easier than using an existing solution? Or was it just a proof of concept?

marco....@gmail.com

Feb 9, 2012, 12:40:30 AM
to libx...@googlegroups.com, libx...@googlegroups.com
I can imagine that if you already need the robust parsing of libxmljs then you don't want to use a whole other lib for building.

To me this is a question of intent. If our intent is to provide nice js interfaces to some of the useful bits of libxml2, then we should do these bindings. Perhaps behind a nicer API. If our intent is to make an awesome and comprehensive solution for XML in node, then maybe we should just provide our own pure js implementation for building. Or even merge in a nice existing implementation.

One other thing to consider is that libxmljs is still one of the biggest addon modules in the node community. People use it for reference. So that's another small reason to stick to our binding roots.

Sent from my iPhone

Nicholas Campbell

Feb 9, 2012, 7:57:34 AM
to libx...@googlegroups.com
Simply because the addon is a "reference" seems _to me_ to be a poor
reason to not change things. Of course it'll still be binding to C++
land, I think the question is really, as you noted, just how much of
the API is bound to C++ land vs done in JS land.

I suspect the best thing to do would be to see if what the API is
trying to accomplish is just as fast (or close enough) to C++ land and
if so, then keep it in JS. It makes it much easier for others who are
not familiar with the native side of things to reason about and
understand should they need to.

Also, if you can make the API more intuitive in the process, then that
might also be worthwhile. Deprecation is easy enough...just add some
printout to the old APIs and then have them call the new APIs. People
will quickly roll to the newer API.
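
A minimal sketch of that deprecation approach, for illustration only. The `deprecate` helper and the `addChild`/`appendNode` method names are made up for this example and are not taken from libxmljs:

```javascript
// Hypothetical helper: exposes a new function under an old name and
// prints a warning the first time the old name is used.
function deprecate(oldName, newName, fn) {
  let warned = false;
  return function (...args) {
    if (!warned) {
      warned = true;
      console.warn(`${oldName} is deprecated; use ${newName} instead`);
    }
    // Forward to the new implementation with the same receiver and arguments.
    return fn.apply(this, args);
  };
}

// Illustrative API object; these method names are assumptions.
const api = {
  addChild(name) {
    return `<${name}/>`;
  },
};

// The old name keeps working but now delegates to the new method.
api.appendNode = deprecate('appendNode', 'addChild', api.addChild);
```

Node's built-in `util.deprecate(fn, message)` does much the same thing and could probably be used instead of a hand-rolled wrapper.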

Also, if any of this doesn't make sense or I missed the point of the
thread then just ignore me. ;D

- Nick Campbell

http://digitaltumbleweed.com

Marco Rogers

Feb 9, 2012, 2:05:34 PM
to libx...@googlegroups.com
I'm not suggesting it's a large reason. I'm just laying everything on the table. My point is that what we do with this depends heavily on what we think is important about the project. If we decided that being a good reference for node addons was a core goal, then that would factor into the decision. It's not a big consideration for me, but I don't consider it unimportant. I think node addons are an important, but underserved part of the node community. I want people to write more of them. There is something to the idea of having a project that promotes those ideas.

But you guys are full maintainers now, so my opinion isn't the only one that counts. Based on some of Nick's comments on the patch, it sounds like we should scrutinize the API more closely though. I'll do that soon.

:Marco

Lorenz Schori

Feb 9, 2012, 5:31:23 PM
to libxmljs
On Feb 9, 6:27 am, Roman Shtylman <shtyl...@gmail.com> wrote:
> The more I look at the api, the more unclear it becomes to me why the
> existing api can't be used to build a document? Furthermore, I am not sure
> that xml writing is one of those things that is worth doing in c/c++.
> Parsing was worth it because of the complexities in xml edge cases and html
> brokenness, but writing it out seems like something that a js library could
> do.

A pure JavaScript implementation is not necessarily easier to code/
maintain than the c++ bridge. Special characters are escaped
automatically (entities) by libxml and the document is guaranteed to
be well formed after a call to endDocument. Also the API takes care of
namespace declarations when writing attributes. It also would be easy
to expose methods which ensure proper indentation or base64 encoding
etc.
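
To illustrate the escaping point, here is a rough sketch of what a pure JavaScript builder would have to do by hand. The `escapeText` and `escapeAttr` helpers are hypothetical names, and this only covers the basic markup characters; a real serializer also has to deal with invalid characters, encodings and namespaces:

```javascript
// Escape character data for use as element content.
function escapeText(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Escape a value for use inside a double-quoted attribute.
function escapeAttr(value) {
  return escapeText(value).replace(/"/g, '&quot;');
}

const title = 'Q&A <draft>';

// Plain concatenation yields markup that is not well formed.
const naive = '<note title="' + title + '">' + title + '</note>';

// Escaping every inserted value keeps the output well formed.
const safe =
  '<note title="' + escapeAttr(title) + '">' + escapeText(title) + '</note>';
```

With a writer API this bookkeeping happens inside the library instead of at every call site.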

I was in need of a robust XML serialization method and for me it was a
no-brainer to just reuse what has been proven to work well in c-land. I
had to work with jsdom on that project and did not want to write an
adapter translating the DOM to the native libxmljs document model,
effectively duplicating the trees in memory.

> Those hesitations aside, I think the api presented matches pretty closely
> by name with the libxml2 c api (not sure if that matters).

I think the API was originally introduced by Microsoft and then spread
to some other projects like libxml and PHP. I've only implemented the
bare minimum. The original API also has some convenience methods like
writeElement(name, content) or writeComment(string) etc. They can be
implemented easily on top of the exposed native methods.
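
As a rough sketch of how such convenience methods could be layered on in JavaScript: the stub below stands in for the native writer, and the primitive names (`startElement`, `writeString`, `endElement`, `startComment`, `endComment`) are assumptions for the example rather than the exact methods in the pull request. The stub does no escaping.

```javascript
// Stand-in for the native writer: turns primitive calls into markup.
function createStubWriter() {
  const out = [];
  const open = []; // names of elements that are still open
  return {
    startElement(name) { open.push(name); out.push(`<${name}>`); },
    endElement() { out.push(`</${open.pop()}>`); },
    writeString(text) { out.push(text); },
    startComment() { out.push('<!--'); },
    endComment() { out.push('-->'); },
    toString() { return out.join(''); },
  };
}

// Convenience methods implemented purely on top of the primitives.
function addConvenience(writer) {
  writer.writeElement = function (name, content) {
    this.startElement(name);
    this.writeString(content);
    this.endElement();
    return this; // allow chaining
  };
  writer.writeComment = function (text) {
    this.startComment();
    this.writeString(text);
    this.endComment();
    return this;
  };
  return writer;
}

const w = addConvenience(createStubWriter());
w.writeElement('title', 'Hello').writeComment(' done ');
```

Keeping these helpers in JavaScript means the native surface area stays small while callers still get the shorter calls.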

Cheers
Lorenz

Marco Rogers

Feb 9, 2012, 7:22:57 PM
to libx...@googlegroups.com
One thing I know I don't want is a huge surface area for the API. I want the minimum to allow people to do what they need to do with XML. If you want something fancier, you should build on top of this. That has been my feeling, but I want to hear what others think. I really wanna lock down what we think the focus of this project is.

:Marco

Roman Shtylman

Feb 9, 2012, 7:30:11 PM
to libx...@googlegroups.com
For me personally the use case was processing xml/html since no other library could do it as fast and correctly. Creating it is not something I do often (or ever). I mostly deal with json these days so I am not sure I am best to speak intelligently about the proper api for creation. I thought the existing dom builder api was pretty good for creating documents (all the test cases do it). But that might not be true for larger documents and certainly not true if you are output only and don't need the dom kept around.

As an aside, I think a streaming xml generator api would be more interesting. It would basically write out as soon as it could and has determined that a particular string will no longer be changed. Maybe this api can do that; again my inexperience with the issue shines through here :)

Lorenz Schori

Feb 11, 2012, 7:21:52 AM
to libx...@googlegroups.com
On Friday, February 10, 2012 1:30:11 AM UTC+1, Roman Shtylman wrote:
> As an aside, I think a streaming xml generator api would be more interesting. It would basically write out as soon as it could and has determined that a particular string will no longer be changed. Maybe this api can do that; again my inexperience with the issue shines through here :)

XMLWriter is exactly what you describe. You can retrieve and clear the content of the buffer at any time. After calling endDocument and retrieving the remaining content, you get a well formed XML document. Imho the XMLWriter API can be regarded as the counterpart of the SAX parsing API.
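
A small pure JavaScript sketch of that retrieve-and-clear pattern, to show the idea only. `StreamingWriter` and its `drain` method are hypothetical names, not the libxmljs binding, and escaping is left out for brevity:

```javascript
// Hypothetical streaming writer: output accumulates in a buffer that the
// caller can retrieve and clear at any point.
class StreamingWriter {
  constructor() {
    this.buffer = '';
    this.open = []; // names of elements that are still open
  }
  startElement(name) {
    this.open.push(name);
    this.buffer += `<${name}>`;
  }
  writeString(text) {
    this.buffer += text; // escaping omitted in this sketch
  }
  endElement() {
    this.buffer += `</${this.open.pop()}>`;
  }
  endDocument() {
    // Close whatever is still open so the combined output is well formed.
    while (this.open.length > 0) this.endElement();
  }
  drain() {
    // Hand back what has been produced so far and clear the buffer.
    const chunk = this.buffer;
    this.buffer = '';
    return chunk;
  }
}

const writer = new StreamingWriter();
const chunks = [];

writer.startElement('log');
writer.startElement('entry');
writer.writeString('first');
chunks.push(writer.drain()); // could be written to a socket or file here

writer.endElement();
writer.startElement('entry');
writer.writeString('second');
writer.endDocument();
chunks.push(writer.drain());
```

Each drained chunk on its own is only a fragment; it is the concatenation of all chunks after the document has been ended that forms the complete document.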