Problem reading simple xml string with XmlMapper


Hugo Vandeputte

May 24, 2016, 12:16:41 PM
to jackson-user
Hello,
I'm trying to read this simple string:
"<windSpeed units=\"kt\">27<radius>20</radius></windSpeed>"
using XmlMapper.

I use this "WindSpeed" class:
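import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlElementWrapper;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlProperty;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlText;
import java.util.List;

public class WindSpeed {

    // Child element <radius sector="..." units="...">value</radius>
    public static class Radius {
        @JacksonXmlProperty(isAttribute = true)
        private String sector;
        @JacksonXmlProperty(isAttribute = true)
        private String units;
        @JacksonXmlText
        private int value;
        // ... getters and setters ...
    }

    @JacksonXmlProperty(isAttribute = true)
    private String units;
    @JacksonXmlProperty(isAttribute = true)
    private String source;
    // Text content of <windSpeed> itself (the "27" in the example)
    @JacksonXmlText
    private int value;
    // Repeated <radius> children, without a wrapper element
    @JacksonXmlElementWrapper(useWrapping = false)
    private List<Radius> radius;
    // ... getters and setters ...
}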
And this reading code:
    WindSpeed speed = xmlMapper.readValue(cxmlWindSpeed, WindSpeed.class);
    System.out.println(speed.getValue());
I get the value 0 instead of 27.
When I use this other string "<windSpeed units=\"kt\">27</windSpeed>"
the correct value 27 is read.

If I try a custom deserializer, I find that with the first string jp.getCodec().readTree(jp) returns a node with only 2 children.
I'm using the Jackson 2.7.4 jars with stax2-api-3.1.1.jar and woodstox-core-asl-4.4.1.jar.

Did I miss something?
Thanks for your help.

Hugo Vandeputte

May 25, 2016, 9:36:37 AM
to jackson-user
Problem solved:
After digging through the jackson-dataformat-xml source, I found that XmlTokenStream explicitly discards text unless it comes immediately before an end element:

           
    // otherwise need to find START/END_ELEMENT or text
    String text = _collectUntilTag();
    // If it's START_ELEMENT, ignore any text
    if (_xmlReader.getEventType() == XMLStreamReader.START_ELEMENT) {
        return _initStartElement();
    }
    // For END_ELEMENT we will return text, if any
    if (text != null) {
        _textValue = text;
        return (_currentState = XML_TEXT);
    }
    return _handleEndElement();
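
To see why that branch drops the "27", it can help to look at the raw StAX events for the original string. The following is only a minimal sketch using the JDK's built-in javax.xml.stream reader, not Jackson or Woodstox; the class name MixedContentEvents and its events method are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not part of jackson-dataformat-xml.
class MixedContentEvents {

    // Returns one short label per element/text event, in document order.
    static List<String> events(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Merge adjacent character events so each text run is reported once.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        out.add("START " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        out.add("TEXT " + reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        out.add("END " + reader.getLocalName());
                        break;
                    default:
                        break; // ignore END_DOCUMENT, comments, etc.
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<windSpeed units=\"kt\">27<radius>20</radius></windSpeed>";
        events(xml).forEach(System.out::println);
    }
}
```

The text event for "27" is immediately followed by the start of radius, which is exactly the case the "If it's START_ELEMENT, ignore any text" branch throws away. The "20" is followed by an end element, so it survives.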

Tatu Saloranta

May 25, 2016, 12:43:58 PM
to jackson-user
Sorry, I was about to reply that "mixed content" is explicitly not supported: that is, mixing non-empty character data segment(s) with child elements. @JacksonXmlText is only supported if there are no child elements.

-+ Tatu +-



Hugo Vandeputte

May 26, 2016, 4:38:09 AM
to jackson-user
Hello Tatu,
Thank you for this information. As a short-term solution I will use a custom deserializer.
Changing the XML structure would take years, if it is possible at all, because it is part of an international standard.
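
One possible shape for the parsing logic behind such a workaround, as a rough sketch only: it reads the element with the JDK's plain StAX reader instead of XmlMapper, keeps the element's own text, and collects the radius children. WindSpeedStaxReader and ParsedWindSpeed are hypothetical names, the holder is deliberately simpler than the annotated WindSpeed class, and wiring this into a Jackson deserializer is left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch; names are illustrative and not part of any library.
class WindSpeedStaxReader {

    // Simplified holder: only units, the element's own text, and radius values.
    static final class ParsedWindSpeed {
        String units;
        int value;
        final List<Integer> radii = new ArrayList<>();
    }

    static ParsedWindSpeed parse(String xml) {
        ParsedWindSpeed result = new ParsedWindSpeed();
        StringBuilder ownText = new StringBuilder();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            int depth = 0; // 1 = inside the root element, 2 = inside a child
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 1) {
                        result.units = reader.getAttributeValue(null, "units");
                    } else if (depth == 2 && "radius".equals(reader.getLocalName())) {
                        // getElementText() reads up to the matching end tag,
                        // so the depth is restored by hand afterwards.
                        result.radii.add(Integer.parseInt(reader.getElementText().trim()));
                        depth--;
                    }
                } else if (event == XMLStreamConstants.CHARACTERS && depth == 1) {
                    // Text that belongs directly to the root element (the "27").
                    ownText.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        // Throws NumberFormatException if the element has no numeric text.
        result.value = Integer.parseInt(ownText.toString().trim());
        return result;
    }

    public static void main(String[] args) {
        ParsedWindSpeed ws = parse("<windSpeed units=\"kt\">27<radius>20</radius></windSpeed>");
        System.out.println(ws.value + " " + ws.units + " " + ws.radii);
    }
}
```

The sketch ignores the sector and units attributes on radius and the source attribute on windSpeed; those would be read the same way with getAttributeValue.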

Jackson FasterXml is doing a great job. Do you think a patch providing improved "mixed content" support would be accepted by the developers?
It could be based on a modification of the XmlTokenStream class, or the possibility to provide a custom XmlTokenStream.
I could spend some "home work" time on this.

Hugo.

Hugo Vandeputte

May 26, 2016, 5:36:23 AM
to jackson-user
** xml structure of the "windspeed element" **
sorry for the ambiguity.

Tatu Saloranta

May 26, 2016, 1:34:56 PM
to jackson-user
On Thu, May 26, 2016 at 1:38 AM, Hugo Vandeputte <hugo.va...@bleu-pastel.org> wrote:
> Hello Tatu,
> Thank you for this information. As a short-term solution I will use a custom deserializer.
> Changing the XML structure would take years, if it is possible at all, because it is part of an international standard.

Ah. Yes, I can understand that being difficult.
It is a bit unusual, since mixed content is most commonly found in textual markup use cases rather than in data-oriented usage. That is why data-binding tools often do not support it: as far as I know, JAXB also has trouble with such content models.
 

> Jackson FasterXml is doing a great job. Do you think a patch providing improved "mixed content" support would be accepted by the developers?
> It could be based on a modification of the XmlTokenStream class, or the possibility to provide a custom XmlTokenStream.
> I could spend some "home work" time on this.

I would very much like additional support, if a suitable solution can be found.
Some support has been discussed in the context of existing issues, but I cannot see any open issue that deals specifically with mixed content. So filing a new one would probably make sense.

The code that handles the intermediate conversion from XML stream tokens into JsonTokens is quite involved, and has to make a few assumptions. So I guess one big question is how to express "unwrapped" text, and whether this can be done without look-ahead to see what comes next in the stream. This may or may not be problematic, since checks are already made for all-space vs. not-all-space content, and for cases where text is followed by a start element vs. an end element.
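
The two checks mentioned above (all-space or not, and what kind of tag follows) can be illustrated outside Jackson with one step of look-ahead over plain StAX events. This is only a sketch under that assumption: TextRunClassifier and classify are invented names, and the real XmlTokenStream logic is more involved than this.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration; not the XmlTokenStream implementation.
class TextRunClassifier {

    // Labels each text run as blank or non-blank, plus the tag that follows it.
    static List<String> classify(String xml) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.SPACE) {
                    // Hold the text back until the next tag is known.
                    pending.append(reader.getText());
                } else if (event == XMLStreamConstants.START_ELEMENT
                        || event == XMLStreamConstants.END_ELEMENT) {
                    if (pending.length() > 0) {
                        String trimmed = pending.toString().trim();
                        String kind = trimmed.isEmpty() ? "blank" : "text '" + trimmed + "'";
                        String tag = (event == XMLStreamConstants.START_ELEMENT ? "START " : "END ")
                                + reader.getLocalName();
                        out.add(kind + " before " + tag);
                        pending.setLength(0);
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<windSpeed units=\"kt\">27<radius>20</radius> </windSpeed>";
        classify(xml).forEach(System.out::println);
    }
}
```

Only the "non-blank text before a start element" case is the mixed-content one; a token stream that wanted to expose it would have to decide how to name that text before it knows whether child elements follow.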

-+ Tatu +-