how to `read-char` from port using `latin-1` encoding?


Matthew Butterick

Mar 31, 2017, 3:55:59 PM
to Racket Users
IIUC when `read-char` reads from a port it assumes the port uses UTF-8 encoding. [1] 

(My Racketuition suggests there might be a parameter called `current-port-encoding` that controls what encoding is applied to the port, but apparently not.)

So instead, one must convert the port explicitly. OK, this seems to work:

(open-input-string (bytes->string/latin-1 (port->bytes port)))

The problem is that I'm reading all the bytes first, which defeats the port-ishness of the operation.

But if I try the promising-sounding `reencode-input-port`:

(reencode-input-port port "latin-1")

This doesn't work, because it relies on `bytes-open-converter`, which apparently doesn't recognize the name `latin-1` (a teeny bit surprising, since other racket/base functions such as `bytes->string/latin-1` use that name).

Hence the question: is there a smarter way to read characters from a `latin-1` encoded port?


Matthew Flatt

Mar 31, 2017, 4:05:04 PM
to Matthew Butterick, Racket Users
Try "latin1" (no hyphen).

I'm not sure what the standard is for these names, but Racket's `latin-1` functions are probably misnamed.

Jon Zeppieri

Mar 31, 2017, 4:08:38 PM
to Matthew Butterick, Racket Users
Try using "iso-8859-1" as the name of the encoding, instead of "latin-1." -J

Matthew Butterick

Mar 31, 2017, 4:15:11 PM
to Racket Users
OK thanks. Both "latin1" and "iso-8859-1" do work with `reencode-input-port`.
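
For anyone who finds this thread later, here is a minimal sketch of that approach. It assumes the platform's converter (iconv, via `bytes-open-converter`) recognizes the name "latin1", as reported above. The helper name `latin-1-input-port` is made up for illustration and is not part of Racket.

```racket
#lang racket/base
(require racket/port)

;; Hypothetical helper (not part of Racket): wraps `in` so that its bytes
;; are decoded as Latin-1 lazily, as characters are requested.
(define (latin-1-input-port in)
  ;; "iso-8859-1" was also reported to work in this thread.
  (reencode-input-port in "latin1"))

;; The bytes for "café" in Latin-1; #xE9 on its own is not valid UTF-8.
(define raw (open-input-bytes (bytes #x63 #x61 #x66 #xE9)))
(define decoded (latin-1-input-port raw))

;; Read one character at a time, without slurping the whole port first.
(define chars
  (let loop ([acc '()])
    (define c (read-char decoded))
    (if (eof-object? c)
        (reverse acc)
        (loop (cons c acc)))))

(list->string chars)
```

Unlike the `port->bytes` version, this keeps the port-ishness: the wrapped port converts bytes as `read-char` asks for them, so it should also work on ports that cannot be read to the end up front.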