Hello,
In Mesh4j, I got an exception when trying to read a well-formed XML
feed (using UTF-8 encoding) that contains French accented characters in
an item's payload:
Caused by: com.mesh4j.sync.validations.MeshException: org.xml.sax.SAXParseException: Invalid byte 2 of 3-byte UTF-8 sequence.
    at com.mesh4j.sync.utils.XMLHelper.canonicalizeXML(XMLHelper.java:130)
    at com.mesh4j.sync.model.Content.refreshVersion(Content.java:49)
    at com.mesh4j.sync.model.Content.<init>(Content.java:22)
    at com.mesh4j.sync.adapters.feed.XMLContent.<init>(XMLContent.java:17)
    at com.mesh4j.sync.adapters.feed.FeedReader.readItem(FeedReader.java:150)
    at com.mesh4j.sync.adapters.feed.FeedReader.read(FeedReader.java:115)
    at com.mesh4j.sync.adapters.feed.FeedReader.read(FeedReader.java:71)
    at com.mesh4j.sync.adapters.feed.FeedReader.read(FeedReader.java:103)
    at com.mesh4j.sync.adapters.feed.FeedAdapter.<init>(FeedAdapter.java:74)
    at FeedSyncController$_closure7.doCall(FeedSyncController.groovy:83)
    at FeedSyncController$_closure7.doCall(FeedSyncController.groovy)
The exception is raised by one of the XMLHelper.canonicalizeXML
methods when it reads an item that contains accented characters.
Items without accented characters are read without any problem.
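String.getBytes() with no argument encodes with the platform's default
charset, which is presumably not UTF-8 on my machine. So to resolve this
problem, I replaced:
byte[] result = c.engineCanonicalize(xml.getBytes());
with:
byte[] result = c.engineCanonicalize(xml.getBytes("UTF-8"));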
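For anyone who wants to see why the charset matters, here is a minimal standalone sketch. It is not Mesh4j code: the class name EncodingDemo, the helper isValidUtf8 and the sample payload are made up for illustration, and ISO-8859-1 is only an assumed stand-in for a non-UTF-8 platform default. It encodes the same string both ways and checks whether the resulting bytes are valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {

    // Returns true if the bytes form a valid UTF-8 sequence.
    // A fresh CharsetDecoder reports malformed input by throwing,
    // much like a strict XML parser rejects it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "\u00e9" is e-acute, written as an escape so that the encoding
        // of this source file does not matter.
        String xml = "<payload>\u00e9t\u00e9</payload>";

        // ASSUMPTION: ISO-8859-1 stands in for a non-UTF-8 platform default,
        // i.e. for what xml.getBytes() may produce on such a machine.
        byte[] legacy = xml.getBytes(StandardCharsets.ISO_8859_1);

        // Explicit charset, equivalent in effect to xml.getBytes("UTF-8").
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("Default charset here: " + Charset.defaultCharset());
        System.out.println("ISO-8859-1 bytes valid as UTF-8? " + isValidUtf8(legacy));
        System.out.println("UTF-8 bytes valid as UTF-8?      " + isValidUtf8(utf8));
    }
}
```

In ISO-8859-1 the e-acute is the single byte 0xE9, which has the bit pattern of the first byte of a three-byte UTF-8 sequence. The next byte is a plain ASCII letter rather than a continuation byte, which is consistent with the "Invalid byte 2 of 3-byte UTF-8 sequence" message above. The first check therefore reports false and the second true.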
Hope this helps!
Bertrand.
http://www.odelia-technologies.com/