At the core of EWD is a lightweight HTML/XML DOM parser that can be
used as a tool in its own right. XML DOM processing is a very
powerful technique that can be used for all sorts of tasks. Of course
it's the main task involved in EWD custom tag definition, but there
are many more uses for EWD's DOM parser. You could even use it to create
a Mumps-based Native XML Database.
The parser is very lenient - it was designed to cater for the "lazy"
format of HTML, so it won't refuse to parse a document that isn't
correctly structured XML. It will do its best to tidy it up.
However, once parsed, the document is handled as correctly structured
XML, and if you output it again, it will be properly structured XML.
The parser doesn't worry about namespace declarations - it just
handles them as attributes. Prefixed names are treated as just a
standard tag name. So don't expect EWD's parser to validate your XML
documents! It's up to you to get it right. But by not having to
worry about all that XML "bureaucracy", you'll find that EWD's parser
is a lot easier and quicker to deal with than most.
The DOM parser is all about its APIs. All the main W3C XML DOM APIs
are available in EWD and the entire suite of available APIs is
described in detail in the ewdMgr (EWD's portal) application. Click
on the Documentation tab and you'll find all the information you need,
complete with examples.
However, I thought it would be helpful to provide the key starting
point, which is "how do you instantiate or create a DOM in the first
place?" and, of course, "having created a DOM, how do I convert it
back into an XML file?"
There are several ways of instantiating/building a DOM:
1) Building a DOM from scratch.
This is pretty cool - you can actually generate a complete XML
document entirely programmatically. You kick it off with a single API call:
s docOID=$$newXMLDocument^%zewdDOM(docName,outerTagName,addProcessingInstruction)
where:
docName = the document name (DOM name) you want to assign to the DOM
you're going to create
outerTagName = the tag name of the outermost tag in the XML document
addProcessingInstruction = 1 if you want to add an initial default
<?xml version='1.0' encoding='UTF-8'?> to the XML document. (If you want
a different encoding, specify 0 and add your own)
docOID = the document (DOM) OID that will be assigned by EWD to the
DOM that is created.
Before we go any further, a bit of explanation about DOM docNames and
docOIDs. Each DOM has 2 unique identifiers - the OID, which is an
opaque, automatically generated identifier, and a meaningful name that
you assign. Each must be unique - no other DOM may already exist
with these values. Somewhat confusingly you'll find that some APIs
require the docOID, some require the docName. This really goes back
to the original W3C API definitions.
The $$newXMLDocument function will delete any existing DOM with the
docName you specify.
So if you call the following:
s docOID=$$newXMLDocument^%zewdDOM("demo","xxx",1)
you'll create a DOM named "demo" which should look like this:
<?xml version='1.0' encoding='UTF-8'?>
<xxx />
So having run this API, how can we see what the DOM looks like? The
answer is the $$outputDOM function. Try this:
s ok=$$outputDOM^%zewdDOM("demo",1,2)
and you should see:
<?xml version='1.0' encoding='UTF-8'?>
<xxx />
The outputDOM function is used for viewing the current state of your
DOM, and also for spitting out the DOM into a file. Just specify the
outputLocation as "file" and add the location path, eg:
s ok=$$outputDOM^%zewdDOM("demo",1,2,"file",,"/tmp/demo.xml")
The 2 controls the layout of the XML document: 2 = "prettified"
indented output. Set it to 0 and it will spit out the DOM as a
single unbroken stream:
s ok=$$outputDOM^%zewdDOM("demo",1,0)
<?xml version='1.0' encoding='UTF-8'?><xxx />
So we have a simple DOM instantiated. Now what?
Well you'll probably want to add new tags into the DOM. That's really
easy. Just use the "macro" API addElementToDOM. However, we need to
find out something first.
In a DOM, all the tags and attributes etc are represented as "nodes",
each with their own OID known as the nodeOID. In our document we've
just created, that outer tag (<xxx />) is known as the
"documentElement" and before we do anything we need to discover its
nodeOID:
s docName="demo"
s deOID=$$getDocumentElement^%zewdDOM(docName)
We can check that it's what we expect:
w $$getTagName^%zewdDOM(deOID)
xxx
Now we can add a new child tag into the DOM, using the documentElement
as the parentNode:
s attr("hello")="world"
s newOID=$$addElementToDOM^%zewdDOM("yyy",deOID,,.attr,"Bingo!")
Let's check what it's done:
s ok=$$outputDOM^%zewdDOM("demo",1,2)
<?xml version='1.0' encoding='UTF-8'?>
<xxx>
<yyy hello="world">
Bingo!
</yyy>
</xxx>
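Here's a rough sketch that pulls the steps so far into a single routine.
It only uses the calls shown above, with the same argument positions.
The routine name buildDemo, the zzz tag and the "inner" id are made up
for illustration, and I'm assuming that the OID returned by
addElementToDOM can itself be used as the parent of a further child, so
check that against the ewdMgr documentation before relying on it.

```mumps
buildDemo ; illustrative routine: build a small DOM from scratch and list it
 n attr,deOID,docOID,ok,yyyOID,zzzOID
 ; create (or replace) the DOM named "demo" with <xxx /> as its outer tag
 s docOID=$$newXMLDocument^%zewdDOM("demo","xxx",1)
 ; find the nodeOID of the documentElement
 s deOID=$$getDocumentElement^%zewdDOM("demo")
 ; add <yyy hello="world">Bingo!</yyy> under the documentElement
 s attr("hello")="world"
 s yyyOID=$$addElementToDOM^%zewdDOM("yyy",deOID,,.attr,"Bingo!")
 ; assumption: the returned OID can be the parent of a further child
 k attr
 s attr("id")="inner"
 s zzzOID=$$addElementToDOM^%zewdDOM("zzz",yyyOID,,.attr,"Nested")
 ; display the result as prettified XML
 s ok=$$outputDOM^%zewdDOM("demo",1,2)
 q
```

Run it with d ^buildDemo (assuming you saved it as its own routine) and
compare the listing with the one above.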
OK so that's one way to get a DOM started. How about if we want to
process an XML (or HTML) file? Just use the API call
$$parseXMLFile^%zewdAPI (note that this is in ^%zewdAPI, not ^%zewdDOM), eg:
s ok=$$parseXMLFile^%zewdAPI("/tmp/demo.xml","secondDOM")
If it worked OK, ok="". If not, it will tell you what went wrong. For
example, had the path been mistyped as /tmp/demox.xml:
w ok
The file path /tmp/demox.xml does not exist
If it parsed OK, you can now list the document with
$$outputDOM^%zewdDOM("secondDOM",1,2)
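In a routine you'd normally test the return value before going any
further. Something along these lines should do it. This is a minimal
sketch: the routine name parseDemo is made up, and the only thing it
relies on is the behaviour described above (an empty string on success,
an error message otherwise).

```mumps
parseDemo ; illustrative routine: parse a file and stop if it fails
 n ok
 s ok=$$parseXMLFile^%zewdAPI("/tmp/demo.xml","secondDOM")
 ; a non-empty return value is the error message
 i ok'="" w "Parse failed: ",ok,! q
 ; parsed OK, so list the new DOM
 s ok=$$outputDOM^%zewdDOM("secondDOM",1,2)
 q
```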
And finally, what about if we want to grab some HTML from a web site
and turn it into a DOM so we can process it? Just use
$$parseURL^%zewdAPI, eg:
s ok=$$parseURL^%zewdAPI("www.mgateway.com","/","third",,1)
then list the DOM it created:
s ok=$$outputDOM^%zewdDOM("third",1,2)
Note the 1 at the end of the parameter list for parseURL. That tells
the parser that the content needs to be processed as XHTML, not XML.
When you set it to 1, all tag names and attribute names are converted
to lower case. If you want to retain the exact names in their
original case, specify 0.
You've probably realised by now that you can also use EWD's DOM in
conjunction with REST services that you set up using m_apache. Just
write out the HTTP header records including
Content-type:text/xml
then add a call to outputDOM and the contents of the DOM you've
created will be transmitted to the awaiting client system.
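As a very rough sketch of what such a handler might look like: the
routine name restDemo is made up, and I'm assuming here that your
m_apache function writes the raw HTTP status and header lines itself,
each terminated with CR LF, followed by a blank line. Check how your own
m_apache set-up expects headers to be returned before copying this.

```mumps
restDemo ; illustrative handler: send the "demo" DOM as an XML response
 n crlf,ok
 s crlf=$c(13,10)
 ; assumption: the handler writes the raw HTTP header lines itself
 w "HTTP/1.1 200 OK",crlf
 w "Content-type: text/xml",crlf
 ; blank line separates the headers from the body
 w crlf
 ; stream the DOM to the client (layout 0 = no prettifying)
 s ok=$$outputDOM^%zewdDOM("demo",1,0)
 q
```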
So that's it really! You now have a DOM and you can use any of the
DOM API methods to manipulate it.
A few more tricks:
How do I find the nodeOID of a particular tag?
The easiest way is if the tag has an id attribute:
s nodeOID=$$getElementById^%zewdDOM("myId",docOID)
Otherwise you can return a local array of all tags matching the name:
s ok=$$getElementsArrayByTagName^%zewdDOM("yyy",docName,,.nodes)
zwr nodes
nodes("12-4")=""
If you know that there's just one tag with that name:
s nodeOID=$$getTagOID^%zewdDOM("yyy",docName)
zwr nodeOID
nodeOID="12-4"
There are many other APIs for navigating around in a DOM - consult the
ewdMgr documentation.
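If you've used getElementsArrayByTagName, you'll usually want to loop
through the array it returns. Here's a minimal sketch using $order. The
routine name listTags is made up, and it assumes the array is
subscripted by nodeOID exactly as in the zwr listing above.

```mumps
listTags ; illustrative routine: show every <yyy> node in the "demo" DOM
 n nodeOID,nodes,ok
 s ok=$$getElementsArrayByTagName^%zewdDOM("yyy","demo",,.nodes)
 s nodeOID=""
 ; walk the array one subscript (nodeOID) at a time
 f  s nodeOID=$o(nodes(nodeOID)) q:nodeOID=""  d
 . w nodeOID," = ",$$getTagName^%zewdDOM(nodeOID),!
 q
```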
How do you get rid of a DOM once you're done?
s ok=$$removeDocument^%zewdDOM(docName)
Can I clear down all my DOMs in one go?
d clearDOMs^%zewdDOM
Can I clear down DOMs that start with a particular prefix?
d clearDOMsByPrefix^%zewdDOM("myPrefix")
Can I get a list of my DOMs?
d listDOMs^%zewdDOM(.listOfDOMs)
zwr listOfDOMs
listOfDOMs("demo")=""
listOfDOMs("third")=""
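The same $order technique works on the list of DOMs. This sketch (the
routine name showDOMs is made up) prints each DOM name and then lists
that DOM, assuming the array is subscripted by docName as shown in the
listing above.

```mumps
showDOMs ; illustrative routine: name and list every DOM currently held
 n docName,listOfDOMs,ok
 d listDOMs^%zewdDOM(.listOfDOMs)
 s docName=""
 ; walk the array one subscript (docName) at a time
 f  s docName=$o(listOfDOMs(docName)) q:docName=""  d
 . w "DOM: ",docName,!
 . s ok=$$outputDOM^%zewdDOM(docName,1,2)
 q
```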
Where are DOMs physically held?
In the global ^zewdDOM
Please DON'T manipulate this global yourself. ALWAYS use the API
methods.
That's enough to get you going! Have fun with EWD's DOM parser!