Encodings and Source issue

Ulf Dunkel

Oct 23, 2013, 11:20:56 AM
to iloc...@googlegroups.com
Hi Jean et al.,

how are file encodings treated in the current version of iLocalize?

I had an issue in an Xcode project where some .strings files were
included in UTF-16LE format, although all our Xcode .strings files should
be encoded, and exported in the build phases, as UTF-8. As a result, some
of these .strings files first went into the relevant iLocalize project in
UTF-16LE encoding, and even after I changed the encoding in my sources
to UTF-8, every project update in iLocalize still exports them in
UTF-16LE. When I check one of these files in the File inspector of
iLocalize, it is correctly shown as UTF-8.

Do I have to drop the project and start over, because iLocalize still
exports files from an old app version in its Source folder? I also saw
that the last app version in the Source folder still had files and
folders which were no longer in the latest app version that the iLocalize
project had been updated from.

Example: The current version of the app has only five own frameworks
(one was dropped), but the app in the Source folder still has six
frameworks. The relevant .strings files in the app of the Source folder
really are UTF-16LE while the same files as shown in the iLocalize
editor are UTF-8.

I guess it would be best to move the last app from the Source folder
into the /Source/History/ folder but somehow this didn't happen in my case.

Best regards,
---Ulf Dunkel

Jean Bovet

Nov 4, 2013, 10:33:03 PM
to iloc...@googlegroups.com
Hi Ulf,

iLocalize will try to preserve the encoding of the original file as much as possible. You can force the encoding to be of a certain type by choosing Edit > Encoding. Hope this helps,

Jean

Ulf Dunkel

Nov 5, 2013, 4:53:44 AM
to iloc...@googlegroups.com
Hi Jean.

I guess there is something else going wrong. As I tried to describe, it
seems that the encoding of the file as it was first imported is always
used for export, even if the encoding has since changed because the
project was updated with new file versions in a different encoding.

Does iLocalize pick the encoding to use from the file in the current
project or from a file in the history section?

- - - - -

Jean-Pierre Kuypers

Nov 6, 2013, 9:55:59 AM
to iloc...@googlegroups.com
I have a problem too…

I have an English .strings file that iLocalize reports as "Unicode (UTF-16BE)".
I translate it into French and take care that the French .strings file is "Unicode (UTF-16BE)" too.

When I open the file in my French.lproj folder with Xcode, I see garbage:
¿"¿u¿r¿l¿-¿f¿r¿a¿n¿c¿a¿i¿s¿e¿"¿ ¿=¿ ¿"¿h¿t¿t¿p¿:¿/¿/¿p¿e¿r¿s¿o¿.¿u¿c¿l¿o¿u¿v¿a¿i¿n¿.¿b¿e¿/¿j¿e¿a¿n¿-¿p¿i¿e¿r¿r¿e¿.¿k¿u¿y¿p¿e¿r¿s¿/¿l¿o¿g¿i¿c¿i¿e¿l¿T¿r¿a¿d¿u¿i¿t¿/¿W¿h¿a¿t¿R¿o¿u¿t¿e¿/¿"¿;¿

With TextWrangler, I see:
" u r l - f r a n c a i s e "   =   " h t t p : / / p e r s o . u c l o u v a i n . b e / j e a n - p i e r r e . k u y p e r s / l o g i c i e l T r a d u i t / W h a t R o u t e / " ; 

After the TextWrangler "File -> Reopen Using Encoding -> Unicode (UTF-16)" command, I see:
Ok!
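
The "spaced-out" text above is what one would expect when BOM-less UTF-16BE bytes are read with a single-byte encoding: every ASCII character is preceded by a NUL byte, which an editor may render as a blank or a placeholder glyph. A minimal Python sketch of that effect follows. The function name and the sample string are made up for illustration, and Latin-1 is only an assumed stand-in for whatever single-byte encoding the editor actually fell back to.

```python
def misread_utf16be(text: str) -> str:
    """Encode text as UTF-16BE without a BOM, then decode it the wrong way.

    Latin-1 is an assumption here; it simply maps every byte to one
    character, so the NUL high bytes become visible as separate characters.
    """
    raw = text.encode("utf-16-be")   # this codec never writes a BOM
    return raw.decode("latin-1")     # wrong on purpose


if __name__ == "__main__":
    sample = '"url" = "http";'       # hypothetical .strings line
    wrong = misread_utf16be(sample)
    # Every second character is NUL; stripping them recovers the text.
    print(repr(wrong))
    print(wrong.replace("\x00", ""))
```

Reopening the same bytes as UTF-16, as TextWrangler did, corresponds to `raw.decode("utf-16-be")` and gives back the original line.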

Using the Terminal "file -I" command, I get:
for the original English file: text/plain; charset=utf-16be
for the French file from iLocalize: application/octet-stream; charset=binary
for the French file saved as "Unicode (UTF-16)" using TextWrangler: text/plain; charset=utf-16be

I put this file in my French.lproj folder and open my project, and iLocalize says that the file has the "Unicode (UTF-16BE) - avec BOM" encoding.

When the .strings file is reported as "charset=binary", it can be neither opened in Xcode nor used by the application.

So my question is: which encoding(s) should I choose in iLocalize to get the right encoding for the other apps (Xcode, TextWrangler, the app itself…)?

And sorry for using a native language with all those accents and other hieroglyphics...

Ulf Dunkel

Nov 6, 2013, 12:27:43 PM
to iloc...@googlegroups.com
Salut Jean-Pierre, et al.

> So my question is: which encoding(s) should I choose in iLocalize to get
> the right encoding for the other apps (Xcode, TextWrangler, the app itself…)?

We keep all our .strings files in UTF-8 (no BOM) and configure each
Xcode project not to convert them back to the (default) UTF-16 format
when Xcode builds the app.
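
For anyone who wants to normalize existing files the same way outside of Xcode or iLocalize, here is a rough Python sketch. It is not how either tool works internally; the function name is hypothetical, and for files without a BOM the caller has to state the source encoding, because it cannot be detected reliably.

```python
import codecs


def to_utf8_no_bom(data: bytes, assumed: str = "utf-8") -> bytes:
    """Re-encode the raw bytes of a .strings file as UTF-8 without a BOM.

    A leading BOM decides the source encoding. Without a BOM, `assumed`
    is used (for example "utf-16-be" for a BOM-less big-endian file).
    UTF-32 is deliberately not handled in this sketch.
    """
    if data.startswith(codecs.BOM_UTF8):
        text = data[len(codecs.BOM_UTF8):].decode("utf-8")
    elif data.startswith(codecs.BOM_UTF16_LE):
        text = data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    elif data.startswith(codecs.BOM_UTF16_BE):
        text = data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    else:
        text = data.decode(assumed)
    return text.encode("utf-8")      # the "utf-8" codec writes no BOM


if __name__ == "__main__":
    line = '"clé" = "valeur";\n'     # made-up sample entry
    utf16le = codecs.BOM_UTF16_LE + line.encode("utf-16-le")
    print(to_utf8_no_bom(utf16le))
```

To convert a file on disk, read it with `open(path, "rb")`, pass the bytes through the function, and write the result back with `open(path, "wb")`.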

Apple recommends UTF-16 (but doesn't talk about which exact UTF-16
format, Big Endian, Little Endian, BOM, no BOM) for speed reasons.

But because .strings files aren't time-critical at all, and because
Apple also recommends saving "weight" in your apps, UTF-8 is by far the
more compact format. It can also be edited with many more editors. As
you have described, even Xcode wasn't able to handle your UTF-16BE
file properly.

For most languages, UTF-8 is almost 1 byte per character. And some guys
still do not know that UTF-8, too, is able to encode ALL AVAILABLE
Unicode characters.
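
The size argument is easy to check. The small Python sketch below (helper name and sample strings are made up) counts the bytes each encoding needs, without a BOM. It also shows the limit of the rule of thumb: ASCII-heavy text is half the size in UTF-8, accented Latin letters take 2 bytes in both, and CJK text is actually larger in UTF-8 (3 bytes per character against 2).

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte count of text in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    samples = [
        '"Cancel" = "Abbrechen";',   # plain ASCII
        '"Cancel" = "Annulé";',      # one accented letter
        '"Cancel" = "キャンセル";',    # Japanese
        "😀",                        # outside the Basic Multilingual Plane
    ]
    for s in samples:
        # Every sample survives a UTF-8 round trip unchanged.
        assert s.encode("utf-8").decode("utf-8") == s
        print(s, encoded_sizes(s))
```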

Just my 2 cents.
---Ulf Dunkel

Jean-Pierre Kuypers

Nov 7, 2013, 10:29:46 AM
to iloc...@googlegroups.com
Thanks for your note.

On my side, I ran some tests with several encodings.
Here is my little summary:

 saved by TextWrangler:      seen by iLocalize:                   seen by "file -I":

 UTF-8.strings:              Unicode (UTF-8)                      text/plain; charset=utf-8
 UTF-8 BOM.strings:          Unicode (UTF-8) - avec BOM           text/plain; charset=utf-8
 UTF-16.strings:             Unicode (UTF-16BE) - avec BOM        text/plain; charset=utf-16be
 UTF-16 LE.strings:          Unicode (UTF-16LE) - avec BOM        text/plain; charset=utf-16le
 UTF-16 no BOM.strings:      Unicode (UTF-16BE)                   application/octet-stream; charset=binary
 UTF-16 LE no BOM.strings:   Unicode (UTF-16LE)                   application/octet-stream; charset=binary


It seems that UTF-16 without BOM is to be avoided.
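
The pattern in the table is consistent with a tool that only looks at the first bytes of a file. The Python sketch below classifies a file by its BOM in that spirit. It is an illustration only, not a description of how "file -I" or iLocalize are implemented, and the function name and labels are made up.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Classify raw file bytes by their leading byte order mark.

    UTF-32 is not handled: a UTF-32LE BOM starts with the same two bytes
    as the UTF-16LE BOM and would be misreported by this sketch.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be with BOM"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le with BOM"
    if b"\x00" in data:
        # Typical for BOM-less UTF-16, but the byte order is only a guess.
        return "no BOM, NUL bytes present"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8 without BOM"


if __name__ == "__main__":
    line = '"key" = "value";\n'      # made-up sample entry
    print(sniff_bom(line.encode("utf-8")))
    print(sniff_bom(codecs.BOM_UTF16_BE + line.encode("utf-16-be")))
    print(sniff_bom(line.encode("utf-16-be")))
```

The last case is the problematic one: without a BOM nothing in the file says "UTF-16", so a cautious tool can only report it as binary data.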


Ulf Dunkel

Nov 7, 2013, 10:52:36 AM
to iloc...@googlegroups.com
Hi Jean-Pierre.

> It seems that UTF-16 without BOM is to be avoided.

It's no big secret that all UTF-16 formats should use BOM.
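
If a BOM-less UTF-16 file has to be repaired by hand, prepending the BOM is enough, provided the byte order is known. A minimal Python sketch, assuming the file is big-endian like the one discussed above (the function name is hypothetical):

```python
import codecs


def ensure_utf16be_bom(data: bytes) -> bytes:
    """Prepend a big-endian BOM to UTF-16BE bytes that lack one.

    Assumption: the caller already knows the data is UTF-16BE. Data that
    starts with either UTF-16 BOM is returned unchanged.
    """
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return data
    data.decode("utf-16-be")         # raises UnicodeDecodeError if invalid
    return codecs.BOM_UTF16_BE + data


if __name__ == "__main__":
    line = '"clé" = "valeur";\n'     # made-up sample entry
    fixed = ensure_utf16be_bom(line.encode("utf-16-be"))
    # The generic "utf-16" codec reads the BOM and strips it again.
    print(fixed.decode("utf-16") == line)
```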

---Ulf Dunkel

Jean-Pierre Kuypers

Nov 8, 2013, 10:54:41 AM
to iloc...@googlegroups.com
On Thursday, November 7, 2013 at 16:52:36 UTC+1, UlfDunkel wrote:
> It's no big secret that all UTF-16 formats should use BOM.

Indeed, and the (English) file I received to translate was in UTF-16 without BOM.

I'll try to explain the problem to the (English speaking) developer...

Ulf Dunkel

Nov 8, 2013, 11:15:03 AM
to iloc...@googlegroups.com
>> It's no big secret that all UTF-16 formats should use BOM.
>
> Indeed, and the (English) file I received to translate was in UTF-16
> without BOM.
>
> I'll try to explain the problem to the (English speaking) developer...

Maybe that's the Xcode default. :-(

Because we always switch all projects to output and handle all .strings
files in UTF-8 format, I do not know, and do not want to know, which
UTF-16 format Xcode uses.

---Ulf Dunkel