Converting from utf-16le to ascii or utf-8

Simon Geard
Aug 30, 2023, 2:38:04 PM

So I have a data file containing ASCII characters which is UTF-16LE
encoded (output from PowerShell). I'd like to do two things with this
file from Tcl on both Windows and Linux:

1) detect that it is utf-16le
2) convert it to ascii (or utf-8)

Looking at the output from [encoding names] on Linux, there is no
utf-16le. Does it have a different name? It looks to me as if I could
just read the file two bytes at a time and drop the second byte, but I
was hoping I could use fconfigure and encoding.

Thanks for any ideas.

Simon

Rich
Aug 30, 2023, 3:33:24 PM

Simon Geard <si...@whiteowl.co.uk> wrote:
> So I have a data file containing ASCII characters which is UTF-16LE
> encoded (output from PowerShell). I'd like to do two things with this
> file from Tcl on both Windows and Linux:
>
> 1) detect that it is utf-16le

For detection, a UTF-16 encoded file is /supposed/ to begin with a Byte
Order Mark (https://en.wikipedia.org/wiki/Byte_order_mark). Assuming it
has one, the first two bytes of a UTF-16LE file are FF FE (a UTF-16BE
file starts with FE FF), so checking those bytes is how to detect it.
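
A rough sketch of that BOM check in Tcl might look like the following.
The proc name detectUtf16Bom is made up for illustration, and the check
only works if the file really does start with a BOM. It is also not
conclusive, since a UTF-32LE file begins with FF FE as well.

```tcl
# Hypothetical helper: report which UTF-16 BOM, if any, starts the file.
proc detectUtf16Bom {path} {
    set f [open $path rb]        ;# binary mode: no encoding or EOL translation
    set head [read $f 2]         ;# the first two bytes, if present
    close $f
    set hex ""
    binary scan $head H* hex     ;# bytes as lowercase hex, e.g. "fffe"
    switch -- $hex {
        fffe    { return utf-16le }
        feff    { return utf-16be }
        default { return unknown }
    }
}
```

Calling [detectUtf16Bom $path] then gives a name to branch on before
deciding how to read the rest of the file.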

> 2) convert it to ascii (or utf-8)

It looks like there is no encoding named utf-16le yet. There is a TIP
that proposes UTF-16 encodings
(https://core.tcl-lang.org/tips/doc/main/tip/547.md), but it is marked
as targeting Tcl 8.7.
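
In the meantime, the "read two bytes at a time" idea can be done with
[binary scan] instead of dropping bytes, which also keeps non-ASCII
characters. This is only a minimal sketch: the proc names
utf16leToString and convertUtf16leFile are made up for illustration, it
assumes the whole file fits in memory, and it makes no attempt to
combine surrogate pairs, so treat it as covering the Basic Multilingual
Plane only.

```tcl
# Hypothetical helper: decode UTF-16LE bytes by hand (BMP characters only).
proc utf16leToString {raw} {
    set codes {}
    binary scan $raw su* codes   ;# little-endian unsigned 16-bit code units
    set out ""
    set first 1
    foreach c $codes {
        # Drop a leading BOM (U+FEFF) if there is one.
        if {$first && $c == 0xFEFF} { set first 0; continue }
        set first 0
        append out [format %c $c]
    }
    return $out
}

# Hypothetical helper: read a UTF-16LE file and write it back out as UTF-8.
proc convertUtf16leFile {inPath outPath} {
    set f [open $inPath rb]
    set raw [read $f]
    close $f
    set f [open $outPath w]
    # -translation lf leaves the line endings in the data untouched.
    fconfigure $f -encoding utf-8 -translation lf
    puts -nonewline $f [utf16leToString $raw]
    close $f
}
```

As far as I know, Tcl 8.6 also lists an encoding called "unicode" in
[encoding names], which is 16-bit in the machine's native byte order, so
on a little-endian machine [fconfigure $f -encoding unicode] may be
enough. The hand-rolled version above does not depend on that.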

> Looking at the output from [encoding names] on Linux, there is no
> utf-16le. Does it have a different name? It looks to me as if I could
> just read the file two bytes at a time and drop the second byte, but I
> was hoping I could use fconfigure and encoding.

Since you have Linux, the iconv command appears to handle converting
from both UTF-16LE and UTF-16BE, so you might be able to convert the
file on Linux and then use the converted copy afterward.

Peter Dean
Aug 31, 2023, 1:03:50 AM

https://wiki.tcl-lang.org/page/Unicode+file+reader

but I just use Notepad++ and save as UTF-8

Peter