XML export force character encoding


Nick Batt

Sep 18, 2015, 10:44:33 AM
to Lucee
I'm trying to export the content of a CMS whose MySQL database uses Latin1 encoding, but it looks like the server settings have been forcing UTF-8 encoding on the data submitted by the CMS.

So when I export the XML, some records break it. This is the error from the browser where I check the data:
error on line 24 at column 19: Input is not proper UTF-8, indicate encoding ! Bytes: 0x03 0x6F 0x66 0x20

There are 26k records, so manually finding and editing the culprits is not really an option.
My question is: is there a way to process the data beforehand, or to force a character encoding for specific fields as they are put into the XML structure?

Thanks


Nick Batt

Sep 18, 2015, 11:14:22 AM
to Lucee
I did just find that wrapping the particular output in htmlparse(myfield) stops it breaking the XML, though it also wraps the value in HTML page tags. I imagine I can remove these in a MySQL query. Still interested in a cleaner solution though.
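
One possibly cleaner route, sketched here under assumptions: the 0x03 byte in the error is a control character, which XML 1.0 does not allow in any encoding, so removing such characters from each field before it goes into the XML may be enough. `stripInvalidXmlChars` is a hypothetical helper name, `myQuery` is a hypothetical query, and the sketch relies on Lucee strings being Java strings so that `replaceAll` is available.

```cfml
<cfscript>
// Hypothetical helper (not part of Lucee): removes the control characters
// that XML 1.0 forbids, keeping tab (0x09), line feed (0x0A) and carriage return (0x0D).
function stripInvalidXmlChars(required string input) {
    // The pattern goes to Java's String.replaceAll, which understands \xhh escapes.
    return javaCast("string", arguments.input).replaceAll("[\x00-\x08\x0B\x0C\x0E-\x1F]", "");
}

// Assumed usage when the XML is assembled as a string:
// clean the field first, then escape XML special characters with xmlFormat().
cleanValue = xmlFormat(stripInvalidXmlChars(myQuery.myfield));
</cfscript>
```

If the XML is built with XML objects rather than string concatenation, the `xmlFormat()` call is probably unnecessary and only the stripping step would apply.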

Adam Chapman

Sep 21, 2015, 2:37:21 AM
to Lucee
On your cffile (or equivalent) use the charset="" argument?
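
A minimal sketch of that suggestion, assuming the export is assembled as a string in a hypothetical variable `xmlString` and written to a placeholder path. The `charset` attribute sets the encoding used when writing the file, and it should agree with the encoding named in the XML declaration.

```cfml
<!--- xmlString and the file path are placeholders for illustration --->
<cffile action="write"
        file="#expandPath('./export.xml')#"
        output="#xmlString#"
        charset="utf-8">
```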

Nando Breiter

Sep 21, 2015, 10:05:41 AM
to lu...@googlegroups.com
You can convert the charset on the MySQL database itself using the following:

<cfquery name="convert" datasource="yourDatasource">
ALTER DATABASE yourDatabase CHARACTER SET utf8;
</cfquery>

<cfquery name="convert" datasource="yourDatasource">
ALTER TABLE yourTable CONVERT TO CHARACTER SET utf8;
</cfquery>

Run the alter table statement on each table.

I'd suggest making a backup of the database, just in case, but I've used this on production systems to solve character set issues, and it's worked out well for me.
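
To avoid typing the statement for every table, the table names could be read from `information_schema` and looped over. This is only a sketch: it reuses the placeholder names `yourDatasource` and `yourDatabase` from above, assumes the datasource user may read `information_schema.TABLES` and alter the tables, and is meant to be run after the backup has been taken.

```cfml
<!--- List the base tables (not views) of the placeholder database --->
<cfquery name="tables" datasource="yourDatasource">
SELECT TABLE_NAME
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'yourDatabase'
  AND TABLE_TYPE = 'BASE TABLE'
</cfquery>

<!--- Run the same conversion statement once per table --->
<cfloop query="tables">
    <cfquery name="convert" datasource="yourDatasource">
    ALTER TABLE `#tables.TABLE_NAME#` CONVERT TO CHARACTER SET utf8;
    </cfquery>
</cfloop>
```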

