
Java Sun XML parser weird behaviour


Dennis

Feb 2, 2006, 12:38:21 AM
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE database [
<!ELEMENT database (item)*>
<!ELEMENT item (name,sequence)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT sequence (#PCDATA)>
]>
<database>
<item>
<name>YBL014c</name>
<sequence>ATGAGTG</sequence>
</item>
<item>
<name>YBL025w</name>
<sequence>ATGCTGA</sequence>
</item>
</database>

Here's the example I'm working with. When I call
getElementsByTagName("item"), it returns 2 items as expected. But when I
call getChildNodes on an item, it returns a NodeList of length 5, even
though each item has only two child elements. Two of the nodes are
the "name" and "sequence" elements, and the other 3 are "#text". Also,
getting to the values in the "name" and "sequence" nodes is not
intuitive. I can reach them, but the method calls seem
awkward (see the code below for tempName and tempSeq). If there were more
than two elements in each "item", how would I reach the ones in the
middle, since in this example I'm coming at the values from opposite ends?

Oh, I'm using the DOM parser from the javax.xml.parsers package
(DocumentBuilderFactory -> DocumentBuilder -> Document).

In my code: nl is a NodeList from getElementsByTagName("item");

...
Node temp = nl.item(i);

NodeList tl = temp.getChildNodes();
Node tn = nl.item(i);

tempName =
    tn.getFirstChild().getNextSibling().getFirstChild().getNodeValue();

tempSeq =
    tn.getLastChild().getPreviousSibling().getFirstChild().getNodeValue();
...


Dennis

Martin Honnen

Feb 2, 2006, 7:46:42 AM

Dennis wrote:


> <database>
> <item>
> <name>YBL014c</name>
> <sequence>ATGAGTG</sequence>
> </item>
> <item>
> <name>YBL025w</name>
> <sequence>ATGCTGA</sequence>
> </item>
> </database>
>
> Here's the example I'm working with. When I call
> getElementsByTagName("item"), it returns 2 items as expected. Then when I
> call getChildNodes, it returns a NodeList of length 5 for some reason,
> even though there are only two child nodes per item? Two of them are
> the "name" and "sequence" nodes, and 3 of them are "#text".

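The DOM contains more than element nodes. In your case the parser also
creates text nodes for the white space (line breaks) between the element
nodes, so each item has three white-space text nodes plus two element
nodes, five children in total.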
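
To see where the five children come from, here is a minimal sketch that prints the kind of each child of an item. The class name ChildNodeKinds and its helper methods are made up for illustration, and the XML is just the first item from the post without the DTD, so treat it as an assumption-laden demo rather than the one right way to inspect a DOM:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original post.
class ChildNodeKinds {
    // First <item> from the post, keeping the line breaks between elements.
    static final String XML =
        "<database>\n<item>\n<name>YBL014c</name>\n"
        + "<sequence>ATGAGTG</sequence>\n</item>\n</database>";

    // Parses XML and returns the first <item> element.
    static Node firstItem() {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagName("item").item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts the direct children of parent that have the given node type.
    static int countChildren(Node parent, short nodeType) {
        int count = 0;
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Node item = firstItem();
        NodeList kids = item.getChildNodes();
        // Each line break between tags shows up as its own "#text" node.
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            System.out.println(kid.getNodeName() + " (type " + kid.getNodeType() + ")");
        }
        System.out.println("text nodes: " + countChildren(item, Node.TEXT_NODE));
        System.out.println("element nodes: " + countChildren(item, Node.ELEMENT_NODE));
    }
}
```

Checking getNodeType() against Node.ELEMENT_NODE is one way to skip the white-space text nodes when walking getChildNodes() directly.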

Access element nodes using getElementsByTagName, e.g.

NodeList items = xmlDocument.getElementsByTagName("item");
for (int i = 0; i < items.getLength(); i++) {
    Element item = (Element)items.item(i);
    Element name = (Element)item.getElementsByTagName("name").item(0);
    if (name != null) {
        // access contents of name element
        // e.g. with Sun Java 1.5:
        //   name.getTextContent()
        // with 1.4:
        //   name.getFirstChild().getNodeValue()
    }
    Element sequence =
        (Element)item.getElementsByTagName("sequence").item(0);
    if (sequence != null) {
        ...
    }
}

If an item has several child elements of the same name, you can of course
loop over the NodeList returned by, e.g.,
    item.getElementsByTagName("name")
or
    item.getElementsByTagName("sequence")
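
For an item with more than two child elements, one possible approach that avoids navigating by sibling position is to walk the children once and keep only the element nodes. The sketch below assumes Java 5 or later (it uses getTextContent() and generics); the class name ItemFields and the methods parse and fieldsOf are hypothetical names chosen for this example, and it assumes each child element name occurs at most once per item:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original post.
class ItemFields {
    // The two items from the post, without the DTD.
    static final String XML =
        "<database>\n"
        + "<item>\n<name>YBL014c</name>\n<sequence>ATGAGTG</sequence>\n</item>\n"
        + "<item>\n<name>YBL025w</name>\n<sequence>ATGCTGA</sequence>\n</item>\n"
        + "</database>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Maps each child element's tag name to its text, in document order.
    // White-space text nodes between the elements are skipped.
    static Map<String, String> fieldsOf(Element item) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (Node n = item.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(n.getNodeName(), n.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        NodeList items = parse(XML).getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Map<String, String> fields = fieldsOf((Element) items.item(i));
            System.out.println(fields.get("name") + " -> " + fields.get("sequence"));
        }
    }
}
```

If a child name can repeat within an item, a map keyed by name would keep only the last value, so looping over item.getElementsByTagName(...) is the safer choice in that case.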


--

Martin Honnen
http://JavaScript.FAQTs.com/
