
how to detect two or more roots in xml


mike

Feb 23, 2021, 12:23:01 PM
Hi,

I will have files like:

<get>
  <filter>
    <configure xmlns="urn:nokia.com:sros:ns:yang:sr:conf">
      <system>
        <security>
          <user-params>
            <local-user>
              <user>
                <user-name /> <!-- selection node in context A -->
              </user>
            </local-user>
          </user-params>
        </security>
      </system>
    </configure>
    <state xmlns="urn:nokia.com:sros:ns:yang:sr:state">
      <system>
        <security>
          <user-params>
            <local-user>
              <user>
                <attempted-logins /> <!-- selection node in context B -->
              </user>
            </local-user>
          </user-params>
        </security>
      </system>
    </state>
  </filter>
</get>

I need to parse the following as XML:

<configure xmlns="urn:nokia.com:sros:ns:yang:sr:conf">
  <system>
    <security>
      <user-params>
        <local-user>
          <user>
            <user-name /> <!-- selection node in context A -->
          </user>
        </local-user>
      </user-params>
    </security>
  </system>
</configure>
<state xmlns="urn:nokia.com:sros:ns:yang:sr:state">
  <system>
    <security>
      <user-params>
        <local-user>
          <user>
            <attempted-logins /> <!-- selection node in context B -->
          </user>
        </local-user>
      </user-params>
    </security>
  </system>
</state>

However, that is not well-formed XML, since it has two root elements. How can I detect that there are two roots? Currently I only find out when the parse fails (see below), and the message is too general to tell that there are multiple (in this case 2) roots.
If I can detect multiple roots, I can split the string, parse each part, and combine the results later (I think).

Any ideas?
br,

//mike


<subtreeA></subtreeA><subtreeB ></subtreeB>
at com.ericsson.commonlibrary.netconf.AbstractNetconfBase.parseXmlString(AbstractNetconfBase.java:409)
at com.ericsson.commonlibrary.netconf.AbstractNetconfBase.get(AbstractNetconfBase.java:728)
at com.ericsson.commonlibrary.netconf.NetconfTest.testGet(NetconfTest.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:596)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:732)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1006)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
at org.testng.TestRunner.privateRun(TestRunner.java:651)
at org.testng.TestRunner.run(TestRunner.java:508)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:458)
at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:453)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:418)
at org.testng.SuiteRunner.run(SuiteRunner.java:367)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:84)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1213)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1142)
at org.testng.TestNG.runSuites(TestNG.java:1050)
at org.testng.TestNG.run(TestNG.java:1018)
at org.testng.remote.AbstractRemoteTestNG.run(AbstractRemoteTestNG.java:115)
at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:251)
at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:77)
Caused by: org.apache.xmlbeans.XmlException: error: The markup in the document following the root element must be well-formed.
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3448)
at org.apache.xmlbeans.impl.store.Locale.parse(Locale.java:708)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:692)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:679)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:208)
at org.apache.xmlbeans.XmlObject$Factory.parse(XmlObject.java:639)
at com.ericsson.commonlibrary.netconf.AbstractNetconfBase.parseXmlString(AbstractNetconfBase.java:407)
... 27 more
Caused by: org.xml.sax.SAXParseException; systemId: file://; lineNumber: 1; columnNumber: 23; The markup in the document following the root element must be well-formed.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$TrailingMiscDriver.next(XMLDocumentScannerImpl.java:1395)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3422)
... 33 more




Arne Vajhøj

Feb 23, 2021, 9:27:30 PM
On 2/23/2021 12:22 PM, mike wrote:
> I will have files like:

> I will need to parse to xml:

> However it is not valid xml since it is two roots. How can I detect
> that I have two roots. Currently I am only seeing this when parse
> fails ( see below) And message is too general to know there are
> multiple ( in this case 2) roots. If I can detect multi roots I can
> split string and parse each and add them together later ( I think).
Parsing and catching is probably the safest approach.
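
A minimal sketch of one way to make parse-and-catch informative, not something posted in the thread: wrap the text in a synthetic root element, parse it with the JDK's DOM parser, and count the element children. The class name `TopLevelCounter` and the `wrapper` element name are hypothetical. It assumes the input carries no XML declaration or DOCTYPE, since those are not allowed inside an element.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TopLevelCounter {
    // Hypothetical wrapper name; any legal XML element name would do.
    private static final String WRAPPER = "wrapper";

    /** Returns the number of top-level elements, or -1 if the text is malformed even when wrapped. */
    static int countTopLevelElements(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            String wrapped = "<" + WRAPPER + ">" + xml + "</" + WRAPPER + ">";
            Document doc = builder.parse(new InputSource(new StringReader(wrapped)));
            int count = 0;
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    count++;
                }
            }
            return count;
        } catch (SAXException e) {
            return -1; // not well-formed for some reason other than multiple roots
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countTopLevelElements("<subtreeA></subtreeA><subtreeB ></subtreeB>")); // 2
        System.out.println(countTopLevelElements("<a><b/></a>")); // 1
        System.out.println(countTopLevelElements("<a>")); // -1
    }
}
```

If the count is greater than 1, the children of the wrapper element are already parsed as separate subtrees, so no string splitting is needed.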

If you want a string hack then regex may be the easiest approach.

For inspiration:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlSniffer {
    // Heuristic only: '.' does not match line breaks, so a tag spanning
    // several lines is missed, and end tags with a ':' prefix are not matched.
    // Any start tag (including self-closing) or end tag.
    private static final Pattern either =
        Pattern.compile("(\\<[\\w-]+.*?\\>)|(\\</[\\w-]+\\>)");
    // Start tag, e.g. <a> or <a x="1">; also matches self-closing tags.
    private static final Pattern start = Pattern.compile("\\<[\\w-]+.*?\\>");
    // End tag, e.g. </a>.
    private static final Pattern end = Pattern.compile("\\</[\\w-]+\\>");
    // Self-closing tag, e.g. <a/>.
    private static final Pattern both = Pattern.compile("\\<[\\w-]+.*?/\\>");

    // Returns true if at most one element starts at nesting depth 0.
    public static boolean check(String xmlstr) {
        Matcher e = either.matcher(xmlstr);
        int depth = 0;
        boolean first = true;
        while (e.find()) {
            String piece = e.group();
            if (start.matcher(piece).matches()) {
                if (depth == 0) {
                    if (first) {
                        first = false;
                    } else {
                        return false; // a second element opens at top level
                    }
                }
                if (!both.matcher(piece).matches()) {
                    depth++; // self-closing tags do not change the depth
                }
            } else if (end.matcher(piece).matches()) {
                depth--;
            }
        }
        return true;
    }

    private static void test(String xmlstr) {
        System.out.println(xmlstr);
        System.out.println(check(xmlstr));
    }

    public static void main(String[] args) {
        // Single root.
        test("<a/>");
        test("<a>xxxx</a>");
        test("<a>\r\n <b>xxxx</b>\r\n</a>");
        test("<a>\r\n <b/>\r\n</a>");
        // Two roots.
        test("<a>xxxx</a>\r\n<c>yyyy</c>");
        test("<a>\r\n <b>xxxx</b>\r\n</a>\r\n<c>yyyy</c>");
        test("<a>\r\n <b/>\r\n</a>\r\n<c>yyyy</c>");
        test("<a>xxxx</a>\r\n<c/>");
        test("<a>\r\n <b>xxxx</b>\r\n</a>\r\n<c/>");
        test("<a>\r\n <b/>\r\n</a>\r\n<c/>");
        test("<a/>\r\n<c/>");
    }
}


Arne

mike

Feb 24, 2021, 3:33:48 AM
Thanks Arne! This gave me a concrete head start. I will continue working on this.

br,

//mike

Daniele Futtorovic

Feb 25, 2021, 9:29:41 AM
This looks like a self-inflicted problem. Don't do string manipulations
with XML.

The first document you posted (starting with a "get" element) is
correct. Parse _that_ document. Then extract the children you need --
for instance using XPath. Or build it as a Document and find the
children you need.
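
As a rough sketch of that suggestion, assuming the JDK's built-in DOM and XPath APIs: parse the whole `get` document and select the element children of `filter`. The class name `FilterSubtrees` and the shortened sample document are made up for illustration. The expression uses `local-name()` so that it also matches if `get` and `filter` carry a namespace, which the posted document does not show.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FilterSubtrees {
    /** Parses a complete get document and returns the element children of its filter element. */
    static List<Element> subtrees(String getXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(getXml)));
            // '*' selects element children only, whatever their namespace.
            String expr = "/*[local-name()='get']/*[local-name()='filter']/*";
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<Element> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add((Element) nodes.item(i));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse or query the document", e);
        }
    }

    public static void main(String[] args) {
        // Shortened stand-in for the document in the original post.
        String xml = "<get><filter>"
                + "<configure xmlns=\"urn:nokia.com:sros:ns:yang:sr:conf\"><system/></configure>"
                + "<state xmlns=\"urn:nokia.com:sros:ns:yang:sr:state\"><system/></state>"
                + "</filter></get>";
        for (Element e : subtrees(xml)) {
            System.out.println(e.getNamespaceURI() + " " + e.getLocalName());
        }
    }
}
```

Each returned `Element` is already a parsed subtree, so the two-roots string never has to exist.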

(Apologies, an earlier reply was sent directly to the OP instead of the newsgroup.)

--
DF.