On Fri, 7 Jun 2024 at 17:26, Rett Berg <goog...@gmail.com> wrote:
> On lua 5.3 strings don't format invalid utf8
> > ('%q'):format'\xF3'
> ""
> > ('%q'):format'\x80'
> ""
Maybe the problem is in how your terminal displays the output.
Using "terminator" on Debian Linux:
$ lua5.3
Lua 5.3.6 Copyright (C) 1994-2020 Lua.org, PUC-Rio
> string.byte("abc",1,-1)
97 98 99
> string.byte(('%q'):format'\x80',1,-1)
34 128 34
> ('%q'):format'\x80'
"�"
I do not know how it will survive transmission; on my side it shows as
mojibake: the generic replacement character (a question mark on a
square tilted by pi/4) between the quotes, >>"?"<<
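A quick way to confirm that the terminal, not Lua, is substituting the character is to ask Lua whether the string is valid UTF-8. This is a minimal sketch, assuming Lua 5.3 or later (where the standard utf8 library exists):

```lua
-- Assumes Lua 5.3+, which ships the standard utf8 library.
local s = ('%q'):format('\x80')   -- quote, byte 0x80, quote

-- utf8.len returns a false value plus the position of the first
-- invalid byte when the string is not valid UTF-8.
local len, badpos = utf8.len(s)
print(len, badpos)

-- The length operator counts bytes, so the lone 0x80 is still counted.
print(#s)
```

A lone 0x80 is a UTF-8 continuation byte with no lead byte, so a UTF-8 terminal has nothing sensible to draw for it.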
> It looks like there IS a byte inside the quotes, it just doesn't display.
> > s = ('%q'):format'\x80'
> > #s
> 3
When in doubt, use string.byte to dump the real contents.
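For longer strings a hex dump can be easier to read than the decimal list from string.byte. A small sketch follows; dump_bytes is a hypothetical helper name, not part of the standard library:

```lua
-- dump_bytes is a hypothetical helper, not a standard Lua function.
-- It returns the bytes of s as space-separated two-digit hex values.
local function dump_bytes(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = ('%02X'):format(s:byte(i))
  end
  return table.concat(out, ' ')
end

print(dump_bytes(('%q'):format('\x80')))
```

Because the output is plain ASCII, it does not depend on the terminal's encoding.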
> Shouldn't the result of ('%q'):format'\x80' be "\x80" instead?
IIRC Lua is "8 bit clean", so it can read that back (although my
terminal cannot display it properly; it probably could if I set it to
Latin-1 or some other full 8-bit encoding).
The manual states "Lua is 8-bit clean: strings can contain any 8-bit
value, including embedded zeros ('\0'). Lua is also encoding-agnostic;
it makes no assumptions about the contents of a string." Since
compilation (load) takes its source from a string, it should work:
> s = ('%q'):format'\x80'
> f = load("return "..s)
> t = f()
> string.byte(t,1,-1)
128
Works for me at least.
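The same check can be run over every byte value instead of only 0x80. This is a sketch, assuming Lua 5.2 or later (where load accepts a string chunk); roundtrips is a hypothetical helper name:

```lua
-- Assumes Lua 5.2+, where load accepts a string chunk.
-- roundtrips is a hypothetical helper: it quotes s with %q, compiles
-- the result back with load, and compares it with the original.
local function roundtrips(s)
  local quoted = ('%q'):format(s)
  local f = assert(load('return ' .. quoted))
  return f() == s
end

local failures = 0
for b = 0, 255 do
  if not roundtrips(string.char(b)) then
    failures = failures + 1
  end
end
print('bytes that failed to round-trip:', failures)
```

If the count is zero, %q output is safe to read back even when it is not printable.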
Francisco Olarte.