apparently duplicate integer keys into a table

jay veedeeh

Dec 31, 2024, 5:50:46 AM
to lua-l
very sporadic (== inexperienced) lua user here, so apologies in advance if the issue is considered trivial.

when porting my tcl-solution to this fun problem

https://adventofcode.com/2024/day/11

to lua I stumbled over this: while building a lookup table of translations from "old integer value" (the key) to "new single or pair of integer values" (the value) the same key (always representing an integer) can occur via different routes. E.g. the number/integer "1" might be the result of the transition 0 → 1 (according to the given task specified in that problem) or come about by splitting, say, the number/string "91" into "9" and "1" (where the splitting is performed with string.sub()).

while I understand (hopefully correctly) that lua handles the interpretation of number vs. string depending on context, it still caught me by surprise that I ended up with a results table (counting occurrences of the different possible keys) that, when printed, had entries like

...
x[1] = 123
...
x[1] = 456

i.e. apparently duplicate keys pointing to separate values (exactly the thing which cannot happen with hash arrays ;)). I guess the issue at hand is that one "1" is interpreted as number/integer and the other "1" is interpreted as a string holding the character "1".
indeed the problem goes away if I enforce numerical interpretation after the string.sub() based splitting by adding +0 to the prospective key.

my real question is this: should not lua categorically interpret keys that are valid integer (or float, for that matter) representations as numbers and thus prevent such "pseudo duplicates"?

a secondary question: how to discriminate between a key that is interpreted as a number and a typographically identical key that seems to be interpreted as a string?

thank you
joerg


Pierre Chapuis

Dec 31, 2024, 6:17:53 AM
to lu...@googlegroups.com
Hello joerg.

Lua values are typed; you can check the type of a key with the type function (it will be "number" or "string").

What did you use to print the table? In general libraries should print in valid Lua format, which means that for a number key it would be x[1] = 123 but for a string key it would be x["1"] = 123.
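
A minimal sketch of the type check, assuming a table like the one in the original post (the helper name `numeric_string_keys` and the sample table are made up for illustration). It lists the keys that are strings but look like numbers, which are the ones that show up as "duplicates" under a plain print:

```lua
-- Hypothetical diagnostic: collect keys that are strings but parse as numbers.
local function numeric_string_keys(t)
  local found = {}
  for k in pairs(t) do
    if type(k) == "string" and tonumber(k) ~= nil then
      found[#found + 1] = k
    end
  end
  table.sort(found)   -- pairs() order is unspecified, so sort for stable output
  return found
end

local x = { [1] = 123, ["1"] = 456, ["80"] = 4, name = "stones" }

for _, k in ipairs(numeric_string_keys(x)) do
  -- %q is only given strings here, so this behaves the same on every Lua version.
  print(string.format("x[%q] is a string key", k))
end
```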

-- 
Pierre Chapuis
--
You received this message because you are subscribed to the Google Groups "lua-l" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.

Francisco Olarte

Dec 31, 2024, 6:39:39 AM
to lu...@googlegroups.com
On Tue, 31 Dec 2024 at 11:50, jay veedeeh <veede...@gmail.com> wrote:
> to lua I stumbled over this: while building a lookup table of translations from "old integer value" (the key) to "new single or pair of integer values" (the value) the same key (always representing an integer) can occur via different routes. E.g. the number/integer "1" might be the result of the transition 0 → 1 (according to the given task specified in that problem) or come about by splitting, say, the number/string "91" into "9" and "1" (where the splitting is performed with string.sub()).

You can force numeric strings to numbers using tonumber, which has
the added bonus of getting rid of leading zeroes:

Lua 5.3.6 Copyright (C) 1994-2020 Lua.org, PUC-Rio
> tonumber("0123")
123

> while I understand (hopefully correctly) that lua handles the interpretation of number vs. string depending on context, it still caught me by surprise that I ended up with a results table (counting occurrences of the different possible keys) that, when printed, had entries like
>
> ...
> x[1] = 123
> ...
> x[1] = 456
> i.e. apparently duplicate keys pointing to separate values (exactly the thing which cannot happen with hash arrays ;)). I guess the issue at hand is that one "1" is interpreted as number/integer and the other "1" is interpreted as a string holding the character "1".

There are not two "1" keys: there is one <1> and one <"1">, and they are
different values. They both convert to the same string and to the same number.
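
To make that concrete, here is a small sketch (the table `t` and its contents are made up for illustration) showing that the two keys occupy separate slots even though they convert to the same string and the same number:

```lua
local t = {}
t[1]   = "stored under the number 1"
t["1"] = "stored under the string \"1\""

-- Two separate slots: key lookup never coerces between number and string.
print(t[1])
print(t["1"])

-- Yet both keys convert to the same string and to the same number.
print(tostring(1) == tostring("1"))   -- true
print(tonumber(1) == tonumber("1"))   -- true
```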

> indeed the problem goes away if I enforce numerical interpretation after the string.sub() based splitting by adding +0 to the prospective key.

Relying on implicit string->number (or vice versa) coercions is asking
for trouble. And, in your sample, you are probably printing them with
the wrong conversion. A recent Lua should accept numbers as well as
strings in %q, which is enough if you only have strings and numbers:

> x={[1]="num",["1"]=42}
> for k,v in pairs(x) do print(string.format("x[%q]=%q",k,v)) end
x["1"]=42
x[1]="num"

> my real question is this: should not lua categorically interpret keys that are valid integer (or float, for that matter) representations as numbers and thus prevent such "pseudo duplicates"?

IMO? Never. I work a lot with phone numbers and I have enough trouble
with Excel and similar stuff (note that numeric strings preserve leading
zeroes and sort differently than numbers, i.e., "34" > "123" but 34 <
123).
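
A short sketch of those two differences (the sample values are arbitrary):

```lua
-- Strings compare character by character, numbers by value.
print("34" > "123")   -- true, because "3" sorts after "1"
print(34 < 123)       -- true

-- Converting to a number drops leading zeroes, so distinct strings can collapse.
print("0123" == "123")                      -- false
print(tonumber("0123") == tonumber("123"))  -- true

-- The same three values sort differently as strings and as numbers.
local as_strings = { "34", "123", "5" }
local as_numbers = { 34, 123, 5 }
table.sort(as_strings)
table.sort(as_numbers)
print(table.concat(as_strings, " "))   -- 123 34 5
print(table.concat(as_numbers, " "))   -- 5 34 123
```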

> a secondary question: how to discriminate between a key that is interpreted as a number and a typographically identical key that seems to be interpreted as a string?

They are typographically identical only if you print them wrong; in
my previous sample I printed both of them without problems. I
haven't done Tcl since the 7.6 days, but if everything is still a
string (conceptually) you may be getting bitten by assuming Lua does
something similar. It does in part (arithmetic and concatenation
coerce), but not for equality or for table keys:
> 1=="1"
false

For that problem you should probably do the split step explicitly, and
ensure you only have numbers outside it:
> function split(n)
    local s=tostring(n)
    if #s % 2 == 0 then
      return { tonumber(string.sub(s,1,#s/2)), tonumber(string.sub(s,#s/2+1)) }
    else
      return { n*2024 }
    end
  end
> print(table.concat(split(1000),"/"))
10/0
> print(table.concat(split(100),"/"))
202400

Note I cheated and relied on the implicit number-to-string conversion
in table.concat to keep the example short.
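
As a rough sketch of how a counting table stays free of string keys with this approach: the helper `step`, the sample input, and the rule that 0 becomes 1 (taken from the description in the original post) are assumptions for illustration, not a full solution to the puzzle.

```lua
-- Split step as in the example above: always returns numbers.
local function split(n)
  local s = tostring(n)
  if #s % 2 == 0 then
    local half = math.floor(#s / 2)
    return { tonumber(s:sub(1, half)), tonumber(s:sub(half + 1)) }
  end
  return { n * 2024 }
end

-- Hypothetical helper: apply one step to a table of counts keyed by number.
local function step(counts)
  local result = {}
  for n, count in pairs(counts) do
    local produced = (n == 0) and { 1 } or split(n)   -- assumed rule: 0 becomes 1
    for _, m in ipairs(produced) do
      result[m] = (result[m] or 0) + count
    end
  end
  return result
end

-- 0 becomes 1 and 91 splits into 9 and 1, so both routes land on the same key.
local after = step({ [0] = 1, [91] = 1 })
print(after[1], after[9], after["1"])   -- 2  1  nil
```

Because strings only exist inside split, every key that reaches the counting table is a number, and the two routes to 1 accumulate in a single entry.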

Francisco Olarte.

jay veedeeh

Dec 31, 2024, 7:37:47 AM
to lua-l
hi pierre,

thank you for your reply. indeed type() reports the difference between the two different `1' keys. should have thought of that myself. regarding printing: I naively just issued print(k, v) to print the k/v pairs in the for loop. ...

br/joerg

jay veedeeh

Dec 31, 2024, 7:59:00 AM
to lua-l
hi francisco,

thanks for the comprehensive explanation. all understood now. will keep the advice in mind not to rely on the implicit conversions (and not to extrapolate from tcl behaviour...) and to use tostring()/tonumber() where appropriate. regarding the printing and "%q": for some reason this does not work with my table as it does in your example: both strings and numbers are reported in double quotes:

   for k,v in pairs(buf.stones) do
      print(type(k),k, type(v), v)
      print(string.format("%q,%q", k,v))
   end

leads to

...

number  1       number  11
"1","11"
string  80      number  4
"80","4"

...

will probably have to read the docs :).

br/joerg

jay veedeeh

Dec 31, 2024, 8:13:40 AM
to lua-l
correction regarding %q: that happened because I was inadvertently running the script with luajit rather than lua proper. using the latter, "%q" behaves as you said :)

Philippe Verdy

Jan 2, 2025, 10:59:09 PM
to lu...@googlegroups.com
So this is a bug in LuaJIT's internal library: it does not handle "%q" for number-type arguments the way vanilla Lua does, but unconditionally converts the argument to a string and then quotes it in the output of string.format(). Also check how Lua and LuaJIT differ for `string.format("%q", nil)`, `string.format("%q", 1.23)` and `string.format("%q", {})`, i.e. for types other than integers and strings. Note that integers may be special-cased internally compared to other number values; short internal strings, user types and object references are worth checking too. Then report it to the LuaJIT maintainers.


Lars Müller

Mar 12, 2025, 7:03:45 PM
to lu...@googlegroups.com

This is not a bug. LuaJIT does not implement Lua 5.4; it is primarily based on 5.1 / 5.2 and cherry-picks features otherwise. And the Lua 5.1 reference manual says:

"The q option formats a string in a form suitable to be safely read back by the Lua interpreter: the string is written between double quotes, and all double quotes, newlines, embedded zeros, and backslashes in the string are correctly escaped when written. [...] whereas q and s expect a string."

So LuaJIT is correctly implementing 5.1 behavior here (and in fact it is necessary that LuaJIT implements it like this for existing 5.1 code that runs on LuaJIT to continue working as-is).
%q also serializing floats precisely as hex is a newer thing.
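
For code that has to run on both LuaJIT and newer Lua versions, one option is to avoid passing numbers to %q at all and branch on type() instead. A minimal sketch (the helper name `literal` is made up for illustration, and tostring on a float is not guaranteed to round-trip exactly, unlike the hex output mentioned above):

```lua
-- Hypothetical helper: write a value roughly as it would appear in Lua source,
-- without relying on %q accepting anything other than strings.
local function literal(v)
  local tv = type(v)
  if tv == "string" then
    return string.format("%q", v)   -- quoted and escaped on every Lua version
  elseif tv == "number" or tv == "boolean" or tv == "nil" then
    return tostring(v)              -- no quotes, so 1 and "1" stay distinguishable
  end
  error("cannot write a " .. tv .. " as a literal")
end

local x = { [1] = "num", ["1"] = 42 }
for k, v in pairs(x) do
  print("x[" .. literal(k) .. "]=" .. literal(v))
end
```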

- Lars
