
Q/VB.NET: Append data to XML file without loading complete file?


Jonathan Buckland

Feb 20, 2004, 8:04:30 AM
Can someone give me an example how to append data without having to
load the complete XML file.

Is this possible?

Jonathan

Derek Harmon

Feb 20, 2004, 10:42:25 PM
"Jonathan Buckland" <jona...@theinformationpeople.com> wrote in message news:94e0c304.04022...@posting.google.com...

> Can someone give me an example how to append data without having to
> load the complete XML file.

The ability to append data to an XML file without processing the entire
document greatly depends on the structure of the XML, and the place
in the file where you want to append.

As a simple example to get started with, consider the following relatively
flat XML instance document that contains the records of several fictitious
healthcare providers:

- - - Providers.xml (before)
<?xml version="1.0" encoding="iso-8859-1" ?>
<Providers>
<Provider specialty="Endocrinologist" id="NYC00001">
<Name>Dr. No</Name>
<Phone kindOf="pager">212-555-9876</Phone>
</Provider>
<Provider specialty="GeneralPractitioner" id="PHL00001">
<Name>Dr. Who</Name>
<Phone kindOf="mobile">215-555-4567</Phone>
</Provider>
</Providers>
- - -

What do we know about this file? It's ISO-8859-1 text, one byte
to a character. If what we need to do is append another provider
to this file, then the additional record can be inserted immediately
before the </Providers> closing tag. Counting one byte per character
of the text representation "</Providers>" (12 bytes), plus the newline
after it (the CR and LF control codes, 2 bytes), we know that this
insert position is 12 + 2 (vbCrLf) = 14 bytes before the end of the
file.
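
The 14-byte figure holds only while the file ends in exactly
"</Providers>" followed by one CR LF pair. A minimal sketch, assuming
a single-byte encoding and that the closing tag sits within the last
few dozen bytes of the file, could locate the insert position by
scanning the tail instead. The module name, the function name and the
tailSize parameter are hypothetical, chosen for illustration only.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Public Module InsertOffsetSketch

    ' Returns the byte offset (from the start of the stream) at which the
    ' last occurrence of closingTag begins, or -1 if it is not found in
    ' the final tailSize bytes.
    ' Assumption: a single-byte encoding such as ISO-8859-1, so that one
    ' character in the decoded tail corresponds to one byte in the file.
    Public Function FindClosingTagOffset(ByVal source As Stream, _
            ByVal closingTag As String, ByVal tailSize As Integer) As Long

        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim count As Integer = CInt(Math.Min(CLng(tailSize), source.Length))
        Dim tail(count - 1) As Byte

        ' Read only the tail of the stream, never the whole document.
        source.Seek(-count, SeekOrigin.End)
        Dim bytesRead As Integer = 0
        While bytesRead < count
            Dim n As Integer = source.Read(tail, bytesRead, count - bytesRead)
            If n = 0 Then Exit While
            bytesRead += n
        End While

        Dim tailText As String = latin1.GetString(tail, 0, bytesRead)
        Dim index As Integer = _
            tailText.LastIndexOf(closingTag, StringComparison.Ordinal)
        If index < 0 Then Return -1

        Return source.Length - count + index
    End Function

End Module
```

The returned value could then be passed to Seek with SeekOrigin.Begin
in place of the hard-coded -14. Note that reading the tail needs a
stream opened with read access (FileAccess.ReadWrite rather than
FileAccess.Write).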

Hmmm... now this is a text file in the file system, so in addition to
writing to it with an XmlWriter object (which is helpful when emitting
well-formed XML, that is, XML nodes with matching start and end
tags) we can also write to it using a conventional StreamWriter from
the System.IO namespace.

This analysis lends itself to the following example VB.NET code,

- - - Append.vb
Imports Microsoft.VisualBasic
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Public Class AppendXmlSample

    Public Shared Sub Main()

        ' Open the XML document as an ordinary text file.
        Dim fileOut As New FileStream( "Providers.xml", _
            FileMode.Open, FileAccess.Write)

        ' Move back 14 bytes from the end of the stream, to the start of
        ' the </Providers> closing tag, and write away.
        fileOut.Seek( -14, SeekOrigin.End)

        ' Wrap the file output stream in a StreamWriter, I name it closer
        ' because you will see I'll need it to finish off the XML document.
        ' StreamWriter defaults to UTF-8, so I ask for the ISO-8859-1
        ' encoding that the XML declaration promises.
        Dim closer As New StreamWriter( fileOut, _
            Encoding.GetEncoding( "iso-8859-1"))

        ' Wrap the stream writer in an XmlTextWriter, which I use to emit
        ' the well-formed XML records being appended into the file, before
        ' the document element's closing tag.
        Dim writer As New XmlTextWriter( closer)

        ' Pretty-print the XML output.
        writer.Formatting = Formatting.Indented
        writer.IndentChar = ControlChars.Tab
        writer.Indentation = 1

        ' This next block of code just emits a well-formed XML fragment
        ' representing a fictitious physician.
        writer.WriteStartElement( String.Empty, "Provider", String.Empty)
        writer.WriteStartAttribute( String.Empty, "specialty", String.Empty)
        writer.WriteString( "Cardiologist")
        writer.WriteEndAttribute()
        writer.WriteStartAttribute( String.Empty, "id", String.Empty)
        writer.WriteString( "LAX00001")
        writer.WriteEndAttribute()
        writer.WriteStartElement( String.Empty, "Name", String.Empty)
        writer.WriteString( "Dr. Love")
        writer.WriteEndElement()
        writer.WriteStartElement( String.Empty, "Phone", String.Empty)
        writer.WriteStartAttribute( String.Empty, "kindOf", String.Empty)
        writer.WriteString( "home")
        writer.WriteEndAttribute()
        writer.WriteString( "310-555-1234")
        writer.WriteEndElement()
        writer.WriteEndElement()

        ' Flush the XML content to the file, because at this point I am
        ' done with the XmlTextWriter. I don't Close because I don't
        ' want to close the file output stream quite yet.
        writer.Flush()

        ' I've already overwritten the document end tag, and I can't emit
        ' this end tag with XmlTextWriter because it's unbalanced (I did
        ' not write the document start tag with the XmlTextWriter).
        '
        ' What I do is just emit the text representation of the end tag,
        ' angle brackets and all.
        '
        closer.Write( vbCrLf + "</Providers>" + vbCrLf)
        closer.Flush()

        ' Now I am done, I can close. Closing the StreamWriter also
        ' closes the underlying file stream.
        closer.Close()

    End Sub

End Class
- - -

If you build the AppendXmlSample like this,

vbc Append.vb /r:System.Xml.dll

and then run the resulting Append executable from the command-line
in the directory containing a copy of Providers.xml, the resulting XML
file afterwards will look like the following with the additional record for
Dr. Love:

- - - Providers.xml (after)
<?xml version="1.0" encoding="iso-8859-1" ?>
<Providers>
<Provider specialty="Endocrinologist" id="NYC00001">
<Name>Dr. No</Name>
<Phone kindOf="pager">212-555-9876</Phone>
</Provider>
<Provider specialty="GeneralPractitioner" id="PHL00001">
<Name>Dr. Who</Name>
<Phone kindOf="mobile">215-555-4567</Phone>
</Provider>
<Provider specialty="Cardiologist" id="LAX00001">
<Name>Dr. Love</Name>
<Phone kindOf="home">310-555-1234</Phone>
</Provider>
</Providers>
- - -

The flat XML instance document described above illustrates one way
of appending content to the end of the file, in a manner that preserves
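the well-formedness of the document. More complex schemas tend
to rule out this approach where the requirement is to append
child nodes that are deeply nested, because the insert position is
then no longer a fixed distance from the end of the file and all of
the content following it has to be moved.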

A more sophisticated technique (perhaps too low-level for VB.NET)
would involve analyzing the NTFS file system and the arrangement of
physical sectors within the file system. It's conceivable you could edit
the sector containing the piece of serialized XML at which you want to
append a deeply nested child node. The sector could be quickly located
if you maintained an index mapping sectors to locations within the XML
document. Next you would append the data within the file, and relocate
the disturbed data following it to one or more other sectors, fixing up
the allocation table to maintain the appropriate linkages between sectors.
XML content relocated to other sectors could have insignificant
whitespace injected within it, to pad it back out to a whole multiple
of the sector size.

For extremely large and complex XML files with the requirement to
append low-level nodes, this second technique is usually necessary
to achieve expeditious updates. It can be seen in some XML DB
(or "accelerator") implementations, but it is not a solution for the
faint-of-heart.


Derek Harmon


"Jeffrey Tan[MSFT]"

Feb 21, 2004, 3:17:05 AM

Hi Jonathan,

We have reviewed your post, and will do some research on this issue.

Thanks for your understanding.

Best regards,
Jeffrey Tan
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.

Kevin Yu [MSFT]

Feb 23, 2004, 1:44:07 AM
Hi Jonathan,

Thank you for posting in the community!

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to process the XML file
without reading the whole document into memory. If there is any
misunderstanding, please feel free to let me know.

As far as I know, this cannot be done with the DOM. Because of the DOM's
tree-based model, most implementations demand that the entire XML document
be held in memory while processing. So we have to achieve this using SAX.

SAX stands for the Simple API for XML. SAX models the Infoset through a
linear sequence of well-known method calls. Because SAX doesn't demand
resources for an in-memory representation of the document, it's a
lightweight alternative to the DOM. It is implemented in MSXML.
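
Within the .NET Framework itself, the forward-only XmlTextReader plays
a similar role to SAX. The following is only a rough sketch under
stated assumptions, not a definitive implementation: it copies the
document node by node to a new output and writes one extra record just
before the root element's end tag. The module and method names are
hypothetical, the Provider record is borrowed from the example earlier
in this thread, and node types such as DOCTYPE and entity references
are simply skipped. It also assumes the root element is not written in
the empty form <Providers/>.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Public Module StreamingAppendSketch

    ' Copies source to target one node at a time and writes a new
    ' Provider element immediately before the root element's end tag.
    Public Sub CopyWithAppendedProvider(ByVal source As TextReader, _
            ByVal target As TextWriter)

        Dim reader As New XmlTextReader(source)
        Dim writer As New XmlTextWriter(target)

        While reader.Read()
            ' The root element's end tag is the EndElement at depth 0.
            If reader.NodeType = XmlNodeType.EndElement _
                    AndAlso reader.Depth = 0 Then
                WriteNewProvider(writer)
            End If
            CopyCurrentNode(reader, writer)
        End While

        writer.Flush()
    End Sub

    ' Emits the record to append (values taken from the earlier example).
    Private Sub WriteNewProvider(ByVal writer As XmlWriter)
        writer.WriteStartElement("Provider")
        writer.WriteAttributeString("specialty", "Cardiologist")
        writer.WriteAttributeString("id", "LAX00001")
        writer.WriteElementString("Name", "Dr. Love")
        writer.WriteStartElement("Phone")
        writer.WriteAttributeString("kindOf", "home")
        writer.WriteString("310-555-1234")
        writer.WriteEndElement()
        writer.WriteEndElement()
    End Sub

    ' Writes only the node the reader is positioned on (not its children).
    Private Sub CopyCurrentNode(ByVal reader As XmlReader, _
            ByVal writer As XmlWriter)
        Select Case reader.NodeType
            Case XmlNodeType.Element
                writer.WriteStartElement(reader.Prefix, reader.LocalName, _
                    reader.NamespaceURI)
                writer.WriteAttributes(reader, True)
                If reader.IsEmptyElement Then
                    writer.WriteEndElement()
                End If
            Case XmlNodeType.Text
                writer.WriteString(reader.Value)
            Case XmlNodeType.Whitespace, XmlNodeType.SignificantWhitespace
                writer.WriteWhitespace(reader.Value)
            Case XmlNodeType.CDATA
                writer.WriteCData(reader.Value)
            Case XmlNodeType.XmlDeclaration, XmlNodeType.ProcessingInstruction
                writer.WriteProcessingInstruction(reader.Name, reader.Value)
            Case XmlNodeType.Comment
                writer.WriteComment(reader.Value)
            Case XmlNodeType.EndElement
                writer.WriteFullEndElement()
        End Select
    End Sub

End Module
```

Memory use stays small because only one node is held at a time, but
the whole file is still read once and written out again, so this is a
different trade-off from seeking to the end and overwriting in place.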

You can find many resources if you search in MSDN with the keyword SAX.
Here I have listed some of them. Hope this helps:

http://msdn.microsoft.com/msdnmag/issues/1100/xml/default.aspx
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk/htm/sax_starter_2szc.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk/htm/sax_devgd_hdi_domtosax_56bb.asp

Does this answer your question? If anything is unclear, please feel free to
reply to the post.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
