I am an XML novice trying out how to manipulate WordML files.
I open and create the WordML files using the FSO (FileSystemObject) API.
Under VB6 that API offers only full Unicode (UTF-16) or ASCII. How can I
handle the WordML files as UTF-8, which is their usual encoding?
Studi1
You may use the save method of the XML document instead.
..
' Load the transformation result into a DOM document, then let MSXML write the file.
Set xmlOutput = CreateObject("MSXML2.FreeThreadedDOMDocument.5.0")
xmlOutput.async = False               ' load synchronously
xmlOutput.preserveWhiteSpace = True
xmlOutput.validateOnParse = False
xmlOutput.resolveExternals = False
xmlOutput.loadXML objXSLTProc.output
xmlOutput.save newFileName
willib
"Studi1" <Stu...@discussions.microsoft.com> wrote in message
news:C9E73F39-7E60-4889...@microsoft.com...
Don't use the FileSystemObject. Use MSXML to load and save your XML.
MSXML 3 is available everywhere IE 6 or later is installed, and on most
systems by now even MSXML 6 is part of the OS or a service pack (e.g.
Vista, Windows 7, Windows XP SP3).
The MSXML APIs are documented here:
http://msdn.microsoft.com/en-us/library/ms763742(VS.85).aspx
That's my suggestion based on the XML APIs Microsoft offers.
Whether VB6 has better UTF-8 support for file system access than the
FileSystemObject I don't know; you should ask in a VB6 group about that.
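To illustrate the advice above: going through an XML parser means the parser reads whatever encoding the XML declaration names and writes whatever encoding you ask for, so your own code never handles raw bytes. Below is a minimal sketch of that idea using Python's standard-library xml.dom.minidom as a stand-in for MSXML. It is not the MSXML API, and the function name and file names are hypothetical.

```python
import os
import tempfile
import xml.dom.minidom as minidom


def rewrite_as_utf8(src_path, dst_path):
    """Parse src_path in its declared encoding and save it again as UTF-8."""
    doc = minidom.parse(src_path)  # the parser detects the input encoding
    try:
        with open(dst_path, "wb") as out:
            # toxml(encoding=...) returns bytes and writes a matching XML declaration
            out.write(doc.toxml(encoding="utf-8"))
    finally:
        doc.unlink()  # release the DOM tree


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "in.xml")    # hypothetical file names
    dst = os.path.join(workdir, "out.xml")
    with open(src, "wb") as f:
        # A UTF-16 input file (with byte order mark) containing non-ASCII text
        f.write('<?xml version="1.0" encoding="UTF-16"?><doc>Gr\u00fc\u00dfe</doc>'.encode("utf-16"))
    rewrite_as_utf8(src, dst)
    with open(dst, "rb") as f:
        print(f.read())
```

The input here is UTF-16 and the output is UTF-8, yet the code never converts anything itself; the parser and serializer do it.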
--
Martin Honnen --- MVP XML
http://msmvps.com/blogs/martin_honnen/
Now: I would like to manipulate the XML data of Word 2003 documents of type
'xml'. What does usage of MSXML practically mean in this case?
- Does it mean that Word will have to build up the XML structure every time
the user opens a Word document?
-- I consider this a time-consuming step, in particular with large Word
documents (let's say of 1000 A4 pages).
-- Would it not be much faster to manipulate the XML file directly, i.e.
like a normal text file, and to open the modified Word document afterwards?
- On the other hand, is it possible, without MSXML, to manipulate the Word
file of the active word document?
And a final administrative question: Why do I not get an EMAIL notification
of replies or new questions even though I asked for one?
- My profile contains the EMAIL address.
- Do I have to do anything else?
Best regards - Studi1
This newsgroup is about XSLT and XPath. Your question seems to be about Word
and XML; you could try the newsgroup microsoft.public.office.xml for that.
I have asked my technical question in newsgroup
'microsoft.public.office.xml' now. Thank you.
Still, I do not understand why the EMAIL reply function of your newsgroup
does not work. Could you answer this question (third try :-))?
Best regards - Studi1
>OK: I asked in a VB6 forum and have now discovered that UTF-8 support seems to be a
>delicate point in VB6 and XML API offers ...
>
>Now: I would like to manipulate the XML data of Word 2003 documents of type
>'xml'. What does usage of MSXML practically mean in this case?
>- Does it mean that Word will have to build up the XML structure every time
>the user opens a Word document?
>-- I consider this a time-consuming step, in particular with large Word
>documents (let's say of 1000 A4 pages).
>-- Would it not be much faster to manipulate the XML file directly, i.e.
>like a normal text file, and to open the modified Word document afterwards?
>- On the other hand, is it possible, without MSXML, to manipulate the Word
>file of the active word document?
There are a few ways to open an XML document, and not all of them
require you to read the entire document tree into memory and process
it first.
"SAX" and XMLTextReader are two possibilities. They are forward-only
document readers which read each node (or XML tag) and raise events
(such as "reached XML node", "reached XML attribute", "reached closing
XML node", "reached XML text node").
Programmatically, you can write code which responds to each 'event' and
does some work (for example, chooses to output the text node, or ignores
the attribute).
These methods don't require building a large, memory-intensive data
structure, but the reader won't know (until it has finished reading
every line) whether the document is "valid XML", so you can't easily
use them in conjunction with XML validation using XSD.
http://msdn.microsoft.com/en-us/library/system.xml.xmltextreader.aspx
http://msdn.microsoft.com/en-us/library/9khb6435.aspx
http://php.net/manual/en/book.xmlreader.php
A corresponding XML writer is usually available in most programming
languages.
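As a rough sketch of the event-driven approach described above, here is a small handler written with Python's standard-library xml.sax module (a stand-in for the .NET and PHP readers linked above). It assumes the WordML text runs are elements named w:t with the usual "w" prefix; the class and function names are made up for the example.

```python
import io
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collect the text of each target element while the parser streams past it."""

    def __init__(self, target="w:t"):
        super().__init__()
        self.target = target  # qualified element name to watch for (an assumption)
        self.inside = False
        self.chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        # Event: an opening tag was reached.
        if name == self.target:
            self.inside = True
            self.chunks = []

    def characters(self, content):
        # Event: text was reached. One text node may arrive in several pieces.
        if self.inside:
            self.chunks.append(content)

    def endElement(self, name):
        # Event: a closing tag was reached.
        if name == self.target:
            self.texts.append("".join(self.chunks))
            self.inside = False


def extract_texts(source, target="w:t"):
    """Stream through source (file name or binary file object); return the texts."""
    handler = TextCollector(target)
    xml.sax.parse(source, handler)
    return handler.texts


# A tiny hand-written WordML-like sample, encoded as UTF-8 bytes.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">'
    '<w:body><w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t>W\u00f6rld</w:t></w:r></w:p></w:body>'
    '</w:wordDocument>'
).encode("utf-8")

if __name__ == "__main__":
    print(extract_texts(io.BytesIO(SAMPLE)))
```

No document tree is built; only the collected strings are kept. A well-formedness error surfaces as xml.sax.SAXParseException, and only once the parser reaches the offending position.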
HTH
Cheers - Neil
------------------------------------------------
Digital Media MVP : 2004-2010
http://mvp.support.microsoft.com/mvpfaqs
Thank you for answering. I will ask in newsgroup
'microsoft.public.office.xml' ...
Still, could you please answer my administrative question: Why do I not get
an EMAIL notification of replies or new questions even though I asked for one?
- My profile contains the EMAIL address.
- Do I have to do anything else?
Best regards - Studi1
Thank you for answering. This gives me an entry point to my problem.
I am interested in a good understanding of the workloads for (1) an
MSXML-based method, with XML structure buildup for each (large) Word document
plus individual XML data access afterwards, as compared to (2) direct access
to the individual XML data each time I (or the user) need it.
Do you know of a comparison paper on this question suitable for an XML beginner?
Best regards - Studi1
>Hello Neil
>
>Thank you for answering. This gives me an entry point to my problem.
>
>I am interested in a good understanding of the workloads for (1) an
>MSXML-based method, with XML structure buildup for each (large) Word document
>plus individual XML data access afterwards, as compared to (2) direct access
>to the individual XML data each time I (or the user) need it.
http://www.hanselman.com/blog/CommentView.aspx?guid=276
http://www.nearinfinity.com/blogs/joe_ferner/performance_linq_to_sql_vs.html
HTH
Cheers - Neil
> Still, I do not understand why the EMAIL reply function of your newsgroup
> does not work. Could you answer this question (third try :-))?
I am sorry; for me this is a newsgroup on news.microsoft.com that I read
with an NNTP news agent. If you use the web interface to the Microsoft
newsgroups, it might have some functionality I am not aware of or
familiar with, so I can't help you with that interface.