Requirement:
return results of openxml() to an Edge Table.
Function body:
openxml(@handle, '//Charge', 2)
[where 'Charge' is a parent element in the .xml doc)
Question:
My openxml() fails to parse an .xml doc with the root element attributes
shown below (function returns -0- rows).
If I add a character to the 2nd "xmlns" in the .xml doc root attributes (or
if I delete a character, or replace that 2nd "xmlns" with "xmlns:AnyWord"),
then the function works fine.
Anybody know why the .xml doc root element attributes are causing openxml()
to fail? Also, is it 'best practice' to alter the attributes so the function
works?
Thanks in advance ...
Bill
Root element of .xml doc:
<UPS_EBR_BILL
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.ups.com/XMLSchema/EBR/Billing/v1"
xsi:schemaLocation="http://www.ups.com/XMLSchema/EBR/Billing/v1
Billing_XML_Schema.xsd">
[body of .xml doc goes here]
<UPS_EBR_BILL>
You need to take the namespace into account, here is an example:
DECLARE @hdoc int
DECLARE @doc nvarchar(1000)
SET @doc = N'<UPS_EBR_BILL
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.ups.com/XMLSchema/EBR/Billing/v1"
xsi:schemaLocation="http://www.ups.com/XMLSchema/EBR/Billing/v1
Billing_XML_Schema.xsd">
[body of .xml doc goes here]
</UPS_EBR_BILL>'
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @hdoc OUTPUT, @doc, N'<root
xmlns:v1="http://www.ups.com/XMLSchema/EBR/Billing/v1"/>'
SELECT * FROM openxml(@hdoc, '/v1:UPS_EBR_BILL', 2)
-- Remove the internal representation.
exec sp_xml_removedocument @hdoc
--
Martin Honnen --- MVP XML
http://msmvps.com/blogs/martin_honnen/
That worked! Thank you ...!
Your answer contains a lot of new info for me, so a few more questions:
1)How did you know to use
'xmlns:v1="http://www.ups.com/XMLSchema/EBR/Billing/v1"/>' as the last
parameter in sp_xml_preparedocument (instead of using one of the other root
element attributes)? Was it the 'schemalocation' key word?
2) For the sp_xml_prepareddocument, how did you know to rephrase the
attribute as 'xmlns:v1=' ( instead of using "xsi:schemalocation ...")?
3) Because I now need to add the namespace parameter to my
sp_xml_preparedocument, is it best practice to simply hard code it? Or
should I be searching for the applicable string in the original .xml doc for
each new doc? We gert these docs regularly.
4) I am getting the original .xml doc into an xml variable instead of
varchar() and I am not using N'. I assume that is okay?
5) Is sql server actually going out to that schemalocation url to get schema
info, or is the openxml() parse strictly internal?
Thank you again for all your help. Greatly appreciated ...
Will
> Your answer contains a lot of new info for me, so a few more questions:
>
> 1)How did you know to use
> 'xmlns:v1="http://www.ups.com/XMLSchema/EBR/Billing/v1"/>' as the last
> parameter in sp_xml_preparedocument (instead of using one of the other root
> element attributes)? Was it the 'schemalocation' key word?
No, http://www.ups.com/XMLSchema/EBR/Billing/v1 is the namespace in the
default namespace declaration (e.g. xmlns="...") of the root element of
your posted XML sample. With that default namespace declaration the root
element and its descendants elements are in the namespace and for XPath
1.0 expressions to select elements in a namespace you need to bind a
prefix to the namespace and use that prefix. That is why
> 2) For the sp_xml_prepareddocument, how did you know to rephrase the
> attribute as 'xmlns:v1=' ( instead of using "xsi:schemalocation ...")?
there is a xmlns:v1="http://www.ups.com/XMLSchema/EBR/Billing/v1" in
that fourth argument to preparedocument, it simply defines any prefixes
you want to use in the XPath expressions with openxml.
> 3) Because I now need to add the namespace parameter to my
> sp_xml_preparedocument, is it best practice to simply hard code it? Or
> should I be searching for the applicable string in the original .xml doc for
> each new doc? We gert these docs regularly.
Do all documents use the same namespace? In that case you can hard code
it. If you need to process documents with different namespaces then you
will need to extract the namespace.
> 4) I am getting the original .xml doc into an xml variable instead of
> varchar() and I am not using N'. I assume that is okay?
I think so. I just prefer nvarchar to varchar for XML stuff as XML is
meant to build on and support Unicode, that is why the sample I posted
used nvarchar and N'...'.
> 5) Is sql server actually going out to that schemalocation url to get schema
> info, or is the openxml() parse strictly internal?
I don't think openxml attempts any validation against a schema. It
simply checks that the markup is well-formed. Thus as far as I am aware
openxml does not attempt to read in any schema in a schemaLocation
attribute.
Your answers are great. Thank you. Based on them I did some reading on xml
name spaces and figured I could basically replace whatever xmlns was in the
original .xml doc with my own. So I tried replacing the root element with
the following <UPS_EBR_BILL xmlns="http://sam/v1">, and it worked fine.
Your answer below: "With that default namespace declaration the root
element and its descendants elements are in the namespace and for XPath
1.0 expressions to select elements in a namespace you need to bind a
prefix to the namespace and use that prefix. " was perfect.
For XPath expressions to work, there needs to be a declared namespace
attribute in the root element, but looks like any name that fits the required
xmlns syntax will work. I need to do a little more experimenting with that
syntax to see how forgiving it is.
Thank you again ... very helpful ...
Will