<item>
............
</item>
<item>
............
</item>
</info>
I want to extract each <item> in its entirety; thus, in the above, I want
to create 3 files,
each containing just
<item>
............
</item>
I tried using XPath, but it didn't help; I'm not sure how to read off the actual
tags.
Thanks.
String xpathExpr = "/info/item";
String inputFilename = "yourFile.xml";
XPath xpath = XPathFactory.newInstance().newXPath();
InputSource inputSource = new InputSource(inputFilename);
NodeList nodes = (NodeList)xpath.evaluate(xpathExpr, inputSource,
XPathConstants.NODESET);
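To get the actual tags back out of each node in that NodeList, one option is to push every node through an identity Transformer from javax.xml.transform. This is only a sketch using the JAXP classes that ship with the JDK; the class name ItemSplitter, the method split and the inline sample XML are made up for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class ItemSplitter {
    // Returns one serialized XML fragment per node matched by xpathExpr.
    public static List<String> split(InputSource input, String xpathExpr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, input, XPathConstants.NODESET);

        // a Transformer created without a stylesheet copies its input as-is,
        // so each DOM node comes out as text with its tags
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> fragments = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            fragments.add(out.toString());
        }
        return fragments;
    }

    public static void main(String[] args) throws Exception {
        // made-up sample document
        String xml = "<info><item>hello</item><item>world</item></info>";
        for (String fragment : split(new InputSource(new StringReader(xml)), "/info/item")) {
            System.out.println(fragment);
        }
    }
}
```

To write files instead of strings, the StringWriter can be swapped for something like new StreamResult(new File("item" + (i + 1) + ".xml")), one file per node.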
Regards
DM
- Import the library in your project / favorite IDE
Given an XML file called items.xml with the following
contents:
<?xml version="1.0"?>
<info>
<item>hello</item>
<item>world</item>
<item>!</item>
</info>
We will be producing 3 files, each named item1.xml, item2.xml,
item3.xml with the following piece of code using the JDOM library:
import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;
import java.io.*;
import java.util.*;
public class XMLItemManipulator {
private List<Element> items;
public XMLItemManipulator() {
items = null;
}
public void readItems(File xmlFile) throws FileNotFoundException,
IOException {
// make sure the file exists and can be read
if(!xmlFile.exists())
throw new FileNotFoundException("cannot find the xml file");
if(!xmlFile.canRead())
throw new IOException("file exists but does not have *read* permission");
// now that we have made sure we got the file, just get the objects
// necessary to read it and create an XML doc out of it
SAXBuilder builder = new SAXBuilder();
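Document doc;
try {
doc = builder.build(xmlFile);
} catch(JDOMException e) {
// without a parsed document there is nothing to read, so fail here
// instead of hitting a NullPointerException on getRootElement() below
throw new IOException("An error occurred while building the XML doc", e);
}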
// get the root element, in your case this would be <info>
Element root = doc.getRootElement();
// get the list of children of the root element
// which have the "item" tag.
// meaning that even if you had other tags that
// were children of the root, we really wouldn't care
// perfect for a heterogeneous xml file containing more
// than the "item" elements
items = root.getChildren("item");
}
// now that you got the items you might want to manipulate them
// it depends on what you wanna do with them while they're in
// memory. I recommend you have a look at the JDOM doc for more info.
public void manipulateItems() {
// put some code here
}
// once you have manipulated them or since you got the items,
// you can now decide to write them separately to files.
// To do this, it's very simple.
public void writeItems() throws IOException {
XMLOutputter out = new XMLOutputter();
int size = items.size();
for(int counter = 0; counter < size; counter++) {
// clone the whole element so that its attributes are kept too,
// not just its content
Element root = (Element) items.get(counter).clone();
Document doc = new Document(root);
// files are numbered from 1: item1.xml, item2.xml, item3.xml
FileWriter writer = new FileWriter(new File("item" + (counter + 1) + ".xml"));
try {
out.output(doc, writer);
out.output(doc, System.out);
} finally {
// close every file as soon as it is written, not just the last one
writer.close();
}
}
}
// testing all of this with a main method (normally you'd write
// a full test case to do this, but that's your decision)
public static void main(String[] args) {
XMLItemManipulator manip = new XMLItemManipulator();
File file = new File("items.xml");
try {
manip.readItems(file);
manip.manipulateItems(); // this is optional
manip.writeItems();
} catch(Exception e) {
e.printStackTrace();
}
}
}
There you go. Let us know how it goes.
Regards,
Jean-Paul H.
Thanks, but
it didn't work.
Item is not the root tag; the items are scattered through the doc...
<Info>
<Item>
..............
</Item>
etc
</Info>
Any suggestions?
Thanks.
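Going by the snippet above, two things might be biting here. The tags are written Info and Item with capitals, and XML names are case-sensitive, so item will not match Item. And if the items are not direct children of the root, a child lookup only goes one level down. Here is a rough sketch with plain JAXP, assuming the element really is named Item; the class name ScatteredItems and the sample document are made up:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
public class ScatteredItems {
    // Counts the nodes matched by an XPath expression in the given XML text.
    public static int count(String xml, String xpathExpr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(xml));
        NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, source, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // made-up sample: one Item directly under Info, one nested deeper
        String xml = "<Info><Item>a</Item><Group><Item>b</Item></Group></Info>";
        System.out.println(count(xml, "/info/item")); // wrong case: matches nothing
        System.out.println(count(xml, "/Info/Item")); // direct children of the root only
        System.out.println(count(xml, "//Item"));     // Item elements at any depth
    }
}
```

If you stay with JDOM, I believe the closest equivalent is root.getDescendants(new ElementFilter("Item")) rather than root.getChildren("item"), but check that against the JDOM docs for your version.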
What are you trying to do? (I missed the thread.)
The JDK has reference implementations of DOM and SAX, all in JAXP which
shares ancestry with Xerces. I prefer DOM4J but I can't give you an
intelligent rationale other than, "it's always worked well when I've
used it".
Since 1.5, it seems like it should be unnecessary to use anything
additional for xml processing, unless you need a particular
implementation for performance or compatibility reasons. But I must
admit, I didn't see the original question and I might be being naive.
org.xml.sax.SAXParseException: Invalid byte 2 of 3-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(Unknown Source)
at com.touchgraph.amazoncache.io.AmazonParser.parse(AmazonParser.java:33)
at com.touchgraph.amazoncache.io.AmazonCacheReader.readCache(AmazonCacheReader.java:35)
at com.touchgraph.amazoncache.io.AmazonCacheStore.getBooksFromCache(AmazonCacheStore.java:185)
at com.touchgraph.amazoncache.io.AmazonCacheStore.loadSimilarFromCache(AmazonCacheStore.java:131)
at com.touchgraph.amazoncache.io.AmazonCacheStore.getSimilarBooks(AmazonCacheStore.java:44)
at com.touchgraph.amazoncache.io.AmazonDataModel.addSimilarBooks(AmazonDataModel.java:70)
at com.touchgraph.amazoncache.io.AmazonCacheFrame$1.actionPerformed(AmazonCacheFrame.java:85)
at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
at java.awt.Component.processMouseEvent(Unknown Source)
at javax.swing.JComponent.processMouseEvent(Unknown Source)
at java.awt.Component.processEvent(Unknown Source)
at java.awt.Container.processEvent(Unknown Source)
at java.awt.Component.dispatchEventImpl(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Window.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
null
thanks
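That SAXParseException usually means the parser is decoding the file as UTF-8 (the default when the file carries no encoding declaration) while the bytes were actually written in some other encoding. One workaround is to hand the parser a Reader that already decodes with the right charset. The sketch below assumes the file is really ISO-8859-1, which is only a guess; check how the cache file was written. The class name EncodingAwareParse, the handler and the sample bytes are made up:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
public class EncodingAwareParse {
    // Collects character data so we can see what the parser decoded.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Parses the stream, decoding the bytes with the given charset
    // instead of letting the parser detect the encoding itself.
    public static String parse(InputStream in, String charsetName) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TextCollector handler = new TextCollector();
        Reader reader = new InputStreamReader(in, charsetName);
        // when an InputSource carries a character stream, the parser reads
        // characters from it directly and does no byte decoding of its own
        parser.parse(new InputSource(reader), handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // made-up sample: e-acute stored as the single ISO-8859-1 byte 0xE9,
        // which is not a valid UTF-8 sequence on its own
        byte[] latin1 = "<title>caf\u00e9</title>".getBytes("ISO-8859-1");
        System.out.println(parse(new ByteArrayInputStream(latin1), "ISO-8859-1"));
    }
}
```

If you control how the file is produced, the other route is to write it as real UTF-8, or to put the true encoding in the XML declaration at the top of the file so the parser picks it up by itself.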