In the example below, I am trying to get the MOREXML out into a string
so that I can then load it and manipulate it, write it back to the
original xml document, and then continue processing.
<test>
<entry1>
<data><![CDATA[<?xml version="1.0"
encoding="UTF-8"?><MOREXML>....</MOREXML>]]></data>
</entry1>
</test>
Thanks
Chris
I just tried this and I see the same effect, i.e. the resulting SDO
field is superficially empty. I'll take a look and see if I can work
out what's going on.
Regards
Simon
void sdo_cdataBlock(void *ctx, const xmlChar *value, int len)
{
}
We base the PHP SDO implementation on the Tuscany C++ SDO implementation
so I've just posted to the Tuscany list (will appear here
http://www.mail-archive.com/tuscany-dev%40ws.apache.org/index.html at
some point) to work out why this is. If it's just missing by mistake
I'll propose a patch.
Regards
Simon
1) Duplicate the CDATA string as is, including the "<![CDATA[" and
"]]>" markers, to the appropriate property in the data object hierarchy
2) Duplicate the CDATA string excluding the "<![CDATA[" and "]]>"
markers and introduce a special flag to indicate that CDATA is present.
CDATA is a specific concern of XML, i.e. the character entities that
CDATA protects an XML parser from are of no concern to SDO because SDO
is not intended to be tied directly to XML. So given the example
options above we either expose the specifics of XML to the SDO core (2)
or to the SDO user (1).
Neither is particularly attractive.
1) appears to be the simplest approach to implement because it gives
the user a mechanism to read and create CDATA without much special
support in SDO. 2) is more involved, particularly because CDATA can
appear mixed in with other text strings, so a sequence may need to be
used to represent properties that have a mixture of text and CDATA,
marking those sequence entries that are CDATA.
1) does require changes (at least in C++ SDO) because XML parsers tend
to be too helpful in this case for processing CDATA. XML parsers,
libxml2 in particular, recognize the "<![CDATA[" and "]]>" sequence as
a special indicator and throw it away returning just the text it
includes. We would have to reintroduce it and store it in the parameter
value in question. The C++ SDO implementation uses a lot of XML string
handling before the parameter value is actually stored which URL
encodes parts of the CDATA markers so this would have to be fixed. When
writing out the CDATA strings any string typed properties would have to
be scanned for the markers so that the appropriate libxml2 functions
can be called to get the CDATA sections in the right place.
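To make the scanning step concrete, here is a minimal C++ sketch. It is
not the Tuscany code: it uses only the standard library and splits a
string into plain-text and CDATA segments. The Segment struct and the
splitCData name are made up for illustration, and the actual libxml2
write calls are left out.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type: one run of text, flagged if it came from
// inside a CDATA section.
struct Segment {
    std::string text;
    bool isCData;
};

// Splits `value` into plain-text and CDATA segments, dropping the
// "<![CDATA[" and "]]>" markers themselves.
std::vector<Segment> splitCData(const std::string& value) {
    static const std::string start = "<![CDATA[";
    static const std::string end = "]]>";
    std::vector<Segment> segments;
    std::string::size_type pos = 0;
    while (pos < value.size()) {
        std::string::size_type open = value.find(start, pos);
        if (open == std::string::npos) {
            // No more markers: the rest is ordinary text.
            segments.push_back({value.substr(pos), false});
            break;
        }
        std::string::size_type close = value.find(end, open + start.size());
        if (close == std::string::npos) {
            // Unterminated marker: treat the remainder as ordinary text.
            segments.push_back({value.substr(pos), false});
            break;
        }
        if (open > pos) {
            // Text that precedes the CDATA section.
            segments.push_back({value.substr(pos, open - pos), false});
        }
        segments.push_back(
            {value.substr(open + start.size(), close - open - start.size()), true});
        pos = close + end.size();
    }
    return segments;
}
```

A writer could then walk the segments and emit each one either as
escaped text or as a CDATA section. The same segment list is roughly
what option 2) would have to keep in a sequence.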
I have a test implementation of 1) which needs some more work before I
could check it in, but let me know if this is going in the right
direction and I'll do the work. In particular, when a CDATA section
appears in an XML file, are you happy to have this appear verbatim in a
data object property? The result is that, as a user, you would have to
read the property, parse out the CDATA markers and perform whatever
processing you require. When putting the string back you would have to
make sure that the CDATA markers appear correctly if you want this to
write back to XML without error.
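As a rough illustration of what that user-side handling might look
like, here is a small C++ sketch. The helper names unwrapCData and
wrapCData are hypothetical and not part of SDO, and the sketch assumes
the property value is exactly one CDATA section whose text does not
itself contain "]]>".

```cpp
#include <string>

// Marker strings as they would appear verbatim in the property value.
const std::string kCDataStart = "<![CDATA[";
const std::string kCDataEnd = "]]>";

// If `value` is exactly one CDATA section, stores the text between the
// markers in `out` and returns true. Otherwise returns false and leaves
// `out` untouched.
bool unwrapCData(const std::string& value, std::string& out) {
    const std::string::size_type overhead = kCDataStart.size() + kCDataEnd.size();
    if (value.size() < overhead) {
        return false;
    }
    if (value.compare(0, kCDataStart.size(), kCDataStart) != 0) {
        return false;
    }
    if (value.compare(value.size() - kCDataEnd.size(), kCDataEnd.size(),
                      kCDataEnd) != 0) {
        return false;
    }
    out = value.substr(kCDataStart.size(), value.size() - overhead);
    return true;
}

// Puts the markers back around `text` before it is stored in the
// property again. Assumes `text` does not contain "]]>".
std::string wrapCData(const std::string& text) {
    return kCDataStart + text + kCDataEnd;
}
```

The idea would be to call unwrapCData on the value read from the
property, process the inner text, and pass the result through wrapCData
before setting the property again.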
Thoughts?
Simon
I am fine with #1, especially since it's less work for you guys and you
have it mostly done.
Ok Chris, thanks, I'll push on and complete the changes
Simon
Simon,
It's surprising that neither the Java nor the C++ SDO specs mention
this issue, and I noticed that there is an issue open against the C++
SDO spec about it. I mostly agree with you about how it should be
handled - a CDATA section is simply text which has been marked not for
formatting, so depending on its position in the document, it should be
presented as either the value of an SDO Property or as a text element
in an SDO Sequence. However, I don't think we should go for your option
(1). It would be out of keeping for SDO to present the tags to the
user; they must be stripped out first.
You say you are working on a solution - presumably this is in the
Tuscany C++ library? I don't think the PHP SDO Core should need to
change at all. You need to ensure that the C++ code does not process
the data between the CDATA start and end tags.
As you mention, the C++ code may need to mark the property internally
so that it knows to write out the data as a CDATA section if necessary.
Firstly, the really difficult case is when a CDATA section appears as
part of an element of simple type string. In this case I have nothing
to hang a flag off to indicate that only part of the string is not
parsed. This is less of a problem for mixed types, as the sequence can
be used to separate parseable text from CDATA and some suitable flag
introduced.
Secondly, this issue has been raised in the SDO spec group but no
discussion has taken place to date. On this basis I would favour the
simplest and least invasive solution, in terms of changes to SDO
infrastructure, pending any decision, which we can feed into of course,
on what the spec group is going to do. This has the benefit of being
better than throwing away the CDATA, as we do at the moment, and of
being quick to implement, and it allows us to experiment a little with
CDATA before committing to more radical SDO core changes.
Simon
The solution I have gone with for now is my solution #1 as previously
suggested, i.e. I make the CDATA strings available in SDO properties
marked with the XML CDATA markers. I have submitted a patch to the
Tuscany C++ SDO project but time will tell whether they find this
acceptable.
Simon
/home/cdouglas/phpsdo/pecl/sdo/commonj/sdo/SDOString.cpp:37: error:
explicit specialization of 'SDOString std::basic_string<char,
std::char_traits<char>, std::allocator<char> >::toLower(unsigned int,
unsigned int)' must be introduced by 'template <>'
/home/cdouglas/phpsdo/pecl/sdo/commonj/sdo/SDOString.cpp:37: error: no
member function 'toLower' declared in 'std::basic_string<char,
std::char_traits<char>, std::allocator<char> >'
/home/cdouglas/phpsdo/pecl/sdo/commonj/sdo/SDOString.cpp:37: error:
invalid function declaration
make: *** [commonj/sdo/SDOString.lo] Error 1
Sorry about that
Simon
make: *** No rule to make target
`/downloads/SDO-1.0.4/commonj/sdo/SDOString.cpp', needed by
`commonj/sdo/SDOString.lo'. Stop.
I've just been through the same process and encountered this on
Windows. The problem is SDOString.cpp is no longer required, but is
still referred to by the config.w32 and config.m4 files. Try removing
the line "commonj/sdo/SDOString.cpp \" from config.m4 and then running
through your build from the beginning again.
Hope this fixes it.
config.m4 - linux
config.w32 - windows
I will just test my change to the linux file and check it in. I'll
repost here when I've done it.
Regards
Simon
I then cd pecl/sdo and ran the following:
phpize
./configure
make
make install
The new sdo.so is being loaded, but the version reported by phpinfo()
still says 1.0.4.
Save the following to a file called test.xsd
======================================================
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema targetNamespace="testNS" xmlns="testNS"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="doc">
<xs:complexType>
<xs:sequence>
<xs:element name="v1" type="xs:string" />
<xs:element name="v2" type="xs:string" />
<xs:element name="v3" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
======================================================
Save the following to a file called test.in.xml
======================================================
<?xml version="1.0" encoding="utf-8"?>
<doc xmlns="testNS">
<v1><![CDATA[
<p>
A bit of escaped content
</p>
]]></v1>
<v2><![CDATA[<?xml version="1.0"
encoding="UTF-8"?><MOREXML>....</MOREXML>
]]></v2>
<v3>Value3</v3>
</doc>
======================================================
Create a php script with the following:
======================================================
<?php
$xmldas = SDO_DAS_XML::create('test.xsd');
$doc = $xmldas->loadFile('test.in.xml');
$xmldas->saveFile($doc, 'test.out.xml', 4);
?>
======================================================
When you run the script, you should see a file created called
test.out.xml in which the CDATA sections are preserved. If the CDATA
sections are lost then it's probably your build and we can go from
there. If they are preserved, then it may be your XML schema, and we
can start discussing that.
I hope this helps.
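void sdo_cdataBlock(void *ctx, const xmlChar *value, int len)
{
    // Skip the callback once the parser has reported an error.
    if (!((SAX2Parser*)ctx)->parserError)
    {
        // libxml2 passes only the text inside the CDATA section,
        // so put the markers back around it before storing it.
        SDOXMLString valueAsString(value, 0, len);
        SDOXMLString cdata(PropertySetting::CDataStartMarker);
        cdata = cdata + valueAsString;
        cdata = cdata + PropertySetting::CDataEndMarker;
        // Hand the marked-up string on as ordinary character data.
        ((SAX2Parser*)ctx)->characters(cdata);
    }
}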
If this method exists but is empty then you don't have the right code.
The branch you need is BRANCH_1_0_5. The Root_BRANCH_1_0_5 tag marks
the point at which we branched to add this code (and some other stuff),
so Root_BRANCH_1_0_5 won't have the change.
Graham has been playing with this also and has spotted a situation
where it doesn't apparently work. Something to do with schema
correctness. He is going to post also.
If you find that you have the code and that whatever Graham says checks
out then you may have straight away found a corner case that I didn't
take account of.
Here is a very simple sample that I just ran:
XSD
===
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.org/test"
xmlns:tns="http://www.example.org/test">
<complexType name="TestType">
<sequence>
<element name="entry" type="string"/>
</sequence>
</complexType>
<element name="test" type="tns:TestType"/>
</schema>
XML
===
<?xml version="1.0" encoding="UTF-8"?>
<tns:test xmlns:tns="http://www.example.org/test"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.org/test cdata1.xsd ">
<entry>xxx<![CDATA[some data and some <MoreXML></MoreXML>]]></entry>
</tns:test>
These are pretty similar to your original example so I'm hoping that we
can get you up and running. If you suspect that your examples are
different in some important way that we might not have considered, can
you distil out the difference and we'll make an example and try it here.
Regards
Simon