> (read-line (reencode-input-port (open-input-bytes #"\xA3")
"windows-1252"))
"£"
For handling e-mail, see also `generalize-encoding` from `net/unihead`.
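
A minimal sketch of how the two suggestions might be combined when decoding a mail body, assuming the platform's converter supports the generalized encoding name (the set of supported encodings is platform dependent). `decode-mail-bytes` is a hypothetical helper name, not part of any library; `generalize-encoding`, `reencode-input-port`, and `port->string` are the existing functions from `net/unihead` and `racket/port`.

```racket
#lang racket/base
(require racket/port   ; reencode-input-port, port->string
         net/unihead)  ; generalize-encoding

;; Hypothetical helper: decode BSTR using the charset label taken from a
;; mail header. generalize-encoding widens labels that mailers commonly
;; misuse (e.g. Latin-1 and ASCII labels) to a more permissive encoding.
(define (decode-mail-bytes bstr charset-name)
  (define enc (generalize-encoding charset-name))
  ;; reencode-input-port expects a string name, so convert if we got bytes.
  (define enc-str (if (bytes? enc) (bytes->string/latin-1 enc) enc))
  (port->string (reencode-input-port (open-input-bytes bstr) enc-str)))

;; Usage: the byte from the example above, labeled as Latin-1 by the mailer.
(decode-mail-bytes #"\xA3" "iso-8859-1")
```

If the converter for the generalized name is not available on a given platform, `reencode-input-port` raises an exception, so real code would probably want a fallback.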
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users
On Mar 3, 2015, at 4:31 PM, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> You can use "windows-1252" as an encoding name with, for example,
> `reencode-input-port`:
>
>> (read-line (reencode-input-port (open-input-bytes #"\xA3")
> "windows-1252"))
> "£"
Perfect!
I went looking for a place where I might add a “windows-1252” search term, but it looks like it might be hard, since the list of supported encodings is apparently platform dependent. Would it make sense simply to attach a free-floating search tag of “windows-1252” to this part of the documentation?
>
> For handling e-mail, see also `generalize-encoding` from `net/unihead`.
That probably saved me another half-hour of searching and head-scratching.
Thanks!
John
(p.s.: no one whose mailer checks DMARC records will get this e-mail, sadly. Can’t wait to change to google groups.)
I see that the documentation suggests that (entity-charset) is supposed to return a symbol. However, it nearly always returns a string. In particular, it appears to me that it returns a symbol only when it returns its default, 'us-ascii.

I feel compelled to repair this, but there are two ways to fix it:

1) make it match the docs and always return a symbol, or
2) change the docs and the default to return a string.

It looks to me like #2 will break (less) code, though it's certainly possible that people depend on the default value's being a string.
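
Until one of those fixes lands, calling code could defend itself by normalizing whatever comes back. This is only a sketch: `charset->string` is a hypothetical helper, not something provided by `net/mime`, and it assumes the value is always either a symbol or a string, as described above.

```racket
#lang racket/base

;; Hypothetical helper: accept either representation of a charset name
;; and always hand back a string.
(define (charset->string cs)
  (cond
    [(symbol? cs) (symbol->string cs)]   ; e.g. the default 'us-ascii
    [(string? cs) cs]                    ; the usual case, already a string
    [else (raise-argument-error 'charset->string
                                "(or/c symbol? string?)"
                                cs)]))

;; Usage with the two shapes of value mentioned above:
(charset->string 'us-ascii)
(charset->string "utf-8")
```

Normalizing to a string rather than a symbol matches option 2; the same idea works in the other direction with `string->symbol`.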