Proposal: C API function to check whether a table is a sequence

195 views
Skip to first unread message

Lars Müller

unread,
Sep 15, 2025, 7:43:10 AM (12 days ago) Sep 15
to lu...@googlegroups.com
Most languages, especially the system programming languages where Lua is
typically embedded, do not have a single "table" concept, but rather two
distinct concepts for "arrays" and (hash) "maps".

This means we often want to decide whether a table is a sequence.

It is currently both cumbersome and inefficient to do this: As far as I
know, the only way to do this reliably is to iterate the table using
lua_next,

which I don't think is really satisfactory for the very common case of
checking that a table with t[1] ~= nil is indeed an array.

This means that both Lua and, say, C code embedding Lua typically
refrains from doing proper runtime type checking, weakening Lua's strong
typing,

and resulting in incorrect behavior, e.g. when tables with holes are
treated as sequences and cut off. [1]

I would propose the addition of a new C API function: int
lua_issequence(lua_State *L, int index),
which tells you whether the value at the given index is (1) a table and
(2) more specifically, (only) a sequence, meaning it has keys 1, 2, ...,
#t, all with associated non-nil values.

I believe that on the Lua side of things, this could be computed very
efficiently [2], especially for the common case:

Compute #t, then check that t[1], ..., t[#t] are non-nil. Check that
t[#t+2], ..., t[arrlen] are nil. Iterate the hash part to verify that
all keys are valid integers in the range from [1, #t].

It could even be considered to expose this in the table library, say, as
"table.issequence", which would be useful - e.g. serializers often want
to make decisions on whether a table is a sequence - but that seems less
important and as if it might be too "oddly specific" a feature.

Kind regards,
Lars



[1]: e.g. https://github.com/mlua-rs/mlua/issues/640
[2]: Now that type tags live in a separate slice, you just have to check
that all the type tags in that range are non-nil. Especially if the
compiler can vectorize that, you can probably achieve quite some
impressive throughput. But even if it can't, you can check at least 8
type tags at a time on a 64-bit machine. The most common case is
probably that there is no hash part; and if there is a hash part, we
will likely hit a non-sequence key very soon.

Tomás Guisasola

unread,
Sep 15, 2025, 8:12:44 AM (12 days ago) Sep 15
to lu...@googlegroups.com
Hi Lars

> It is currently both cumbersome and inefficient to do this: As far as I
> know, the only way to do this reliably is to iterate the table using
> lua_next,

Yes, that is the only reliable way to make sure a table is also a sequence.

> I would propose the addition of a new C API function: int
> lua_issequence(lua_State *L, int index),
> which tells you whether the value at the given index is (1) a table and
> (2) more specifically, (only) a sequence, meaning it has keys 1, 2, ...,
> #t, all with associated non-nil values.
>
> I believe that on the Lua side of things, this could be computed very
> efficiently [2], especially for the common case:

I don't think this could be "computed very efficiently" even in C,
unless Lua implements it using some internal magic...

> Compute #t, then check that t[1], ..., t[#t] are non-nil. Check that
> t[#t+2], ..., t[arrlen] are nil. Iterate the hash part to verify that
> all keys are valid integers in the range from [1, #t].

All that checks will have to iterate the whole table; no way to avoid that!

Regards,
Tomás

Patrick Kurth

unread,
Sep 15, 2025, 8:30:40 AM (12 days ago) Sep 15
to lu...@googlegroups.com


On Sep 15, 2025 13:43, 'Lars Müller' via lua-l <lu...@googlegroups.com> wrote:

This means we often want to decide whether a table is a sequence.

Is this true? Why? I use lua for a lot and I never wanted to do this. Furthermore, there are many discussions about the length operator and sequences vs. non-sequences, arrays with holes etc. Why is this such a problem? I follow the gigo-principle for this (garbage in, garbage out). If a function expects an array-like table and you hand it something else then that's wrong usage of the function. How do all these arrays with holes come into existence anyway? I always wondered. If you use table.insert to build arrays they stay arrays. Could you provide me with real-world code where this is a problem?

Kind regards, 
Patrick 

Martin Eden

unread,
Sep 15, 2025, 10:10:48 AM (12 days ago) Sep 15
to lu...@googlegroups.com

On 2025-09-15 13:43, 'Lars Müller' via lua-l wrote:
> I would propose the addition of a new C API function: int
> lua_issequence(lua_State *L, int index),
> which tells you whether the value at the given index is (1) a table
> and (2) more specifically, (only) a sequence, meaning it has keys 1,
> 2, ..., #t, all with associated non-nil values.

Hello Lars,

Instead I would propose "sequence" datatype.

* Sequence is base datatype
* Sequence is list of objects of base datatypes
* Index of sequence element is natural number (integer >= 1)

Tables envelop sequences by adding indexing by any base datatype
(aka "associative arrays" aka "hash part") instead of natural number
indexing.

This way we can solve "why # returns wrong length of my table?!"
("#" is applied to sequence part of table) and get rid of weird
"select('#', ...)": "..." will become sequence and "#..." be
length of it.

Strings are sequences already. Most of "table" module functions
are designed for sequences.

Note that this structure allows "{ 1, nil, { 3 }}" but not
"{ 128, 0, 128, Type = 'color' }".

-- Martin

bil til

unread,
Sep 15, 2025, 10:30:11 AM (12 days ago) Sep 15
to lu...@googlegroups.com
Adding a new datatype is a BIG thing.

Adding a new function, or possibly also adding a flag into the table
properties would be a relatively easy "add-on" task I think. Such flag
could be set to 1 for an empty table (by default on table creation),
and if a hash index / a named index is added, it is set to zero. (so I
assume, it would do this on any such "named" addition, but such "named
addition" I assume costs anyway quite a bit of CPU cycles, so killing
a flag to 0 in such cases will not add any remarkable overhead).

I also think, it would be nice to have such a function as the proposed
"lua-issequence", as table iteration "array-like" via index of course
is faster than by the next functionality I think. (and also just due
to usage, many users might just personally prefer to use the
"array-like" number indexing, if this is possible... ).

Am Mo., 15. Sept. 2025 um 16:10 Uhr schrieb 'Martin Eden' via lua-l
<lu...@googlegroups.com>:
> --
> You received this message because you are subscribed to the Google Groups "lua-l" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/lua-l/5ea21a79-f780-425a-bdeb-294a82e5e6ec%40disroot.org.

Sean Conner

unread,
Sep 15, 2025, 3:29:03 PM (12 days ago) Sep 15
to lu...@googlegroups.com
It was thus said that the Great Patrick Kurth once stated:
One such case is serializing a Lua table into JSON or CBOR. Both formats
make a distinction between an array (a list of values) and a map (name/value
pairs) and there are existing Lua modules available to do this. But using a
general purpose module to encode JSON/CBOR can run into issues with picking
the right format for a Lua table.

For my own CBOR module [1] I rely upon some heuristics to encode a Lua
table as an array or map but it's not always 100% reliable. Yes, there is a
way to enforce a particular encoding for a table (by setting a field in a
metatable) but another CBOR module might do it differently (or not at all).

It might be temping to say, "Just pick the array or map function for a
given table," but what is a JSON/CBOR encoder going to do with a table like:

data =
{
path = "/path/to/some/file",
sizes = { 100 , 200 , 400 , 800 },
}

-spc

[1] https://github.com/spc476/CBOR

Sainan

unread,
Sep 15, 2025, 3:45:48 PM (12 days ago) Sep 15
to lu...@googlegroups.com
I wrote a json.encode function myself and I think iterating through the whole table is simply the correct thing to do. If you think that's too expensive, I can imagine `t[1] ~= nil` a valid compromise.

I've ran an intensive task involving json.decode -> processing -> json.encode with a profiler attached and the 'is index based table' part of it is so insignificant that the profiler says it used no CPU time at all.

So, I think this a solution in search of a problem.

-- Sainan

André Naef

unread,
Sep 15, 2025, 3:47:41 PM (12 days ago) Sep 15
to lua-l
I use Lua a lot with JSON, and while #t > 0 usually does the trick, an issue emerges in the distinction of an empty object vs. empty array. Empty objects are rare, but are sometimes required, such as with certain LLM function calling when the function takes no arguments.

Lars Müller

unread,
Sep 16, 2025, 7:06:58 AM (12 days ago) Sep 16
to lu...@googlegroups.com
Happy to expand on the rationale. Indeed one big reason are various forms of serialization and conversion, for example the linked mlua issue [1].
I've written various pieces of such code in Lua and C++, among them also a serializer and deserializer for JSON (as well as custom serialization formats).
I would expect that usually, especially for Lua code, iterating through the table is probably not that big of a deal;
especially not if you do heavier work (such as serializing) later anyways, which is why I would usually bite the bullet.
But even in Lua you can get a ~2x difference on some simple examples [2].
t[1] ~= nil is generally not a sufficient compromise because it may result in unacceptable bugs such as a serializer losing data for valid Lua tables.
If you want to check the precondition of something being a sequence, t[1] ~= nil simply does not do that.
It doesn't tell you anything about holes or remaining keys.
Only in the case that t[1] is nil do you get certain decision that something is *not* a sequence.
(Indeed this is probably a common case; but the proposed C API function can, and my proposed implementation would, leverage that:
If t[1] is nil, #t is 0. Therefore, if the table is nonempty, it will not be considered a valid sequence.)

Which is why I am proposing a C API function, focusing specifically on C API users, not Lua programmers.
I think this would fit in very well with the existing API functions:
The C API has always been geared towards offering more flexible, more performant (if used properly) APIs.
This is why you could preallocate tables, for example - long before there was a corresponding Lua library function.
At the same time, it also strives to provide (few) convenience helpers in the auxiliary library.
The proposed function ticks all the boxes:
I believe it can be provided at relatively small development cost [3], it is useful, it is future-proof,
it can exploit implementation details to achieve superior performance.
Lua wants to allow C API users to do things efficiently, and it provides convenience helpers in the auxiliary library [4].
The proposed feature is both.

From another angle, I can tell you what is likely to happen if such an API function is not provided.
If systems programmers (or even just Lua programmers) are asked to write code that deals with sequences,
the most likely outcome given the current API is that they will produce lax and/or buggy code.

If they need to determine whether a table is a sequence, they will probably check for table length (or emptiness), and then do something with t[1], ..., t[#t].
At best this results in a crash for tables with holes (as they hit a nil), at worst you get a quiet bug, with part of the table being quietly ignored / "cut off".
(This also means that you can pass a table completely unsuitable as a sequence, say {foo = 1, bar = 2}, usually a clear bug, and it will be treated as if it was just an empty sequence.)
The mlua issue was about mlua quietly throwing away parts of tables in serialization.
Systems programmers are not fond of having to do a full table iteration, through the C API,
with limited potential for optimization, as implementation details are hidden,
just to decide whether a table is a sequence (array), the most common container type in systems programming.

So what happens is: They don't. And the resulting APIs have avoidable flaws because of it.

"Garbage in, garbage out" ("gigo") is not the paradigm of Lua [5].

To sum it up: I disagree strongly about this being "a solution in search of a problem".
We can talk about how big the problem is and how it should be addressed best, but it undeniably exists and is relevant.

- Lars



Footnotes:

[1]: https://github.com/mlua-rs/mlua/issues/640

[2]: Benchmarking the following code against the equivalent code with the is_sequence call removed, I can measure a ~2x speedup on Lua 5.4.7; and even on LuaJIT, the version without is_sequence is ~1.3x faster. Clearly no blanket statement along the lines of "checking whether a table is a sequence has negligible cost" can be made, even in Lua, even if you do linear time operations on the table. And mind you, my proposal is primarily about processing Lua-provided tables in a systems programming language, though there would also be potential merit for a Lua library function as an aside. Unsurprisingly, an entire second iteration of a table - with non-negligible work per iteration - comes at a cost. How much cost remains with a Lua-provided API function would remain to be seen, but I'm fairly sure there could be significant speedups in some cases.

local function is_sequence(t)
    local len = #t
    local got = 0
    for k in pairs(t) do
        if type(k) ~= "number" or k < 1 or k > len or math.floor(k) ~= k then
            return false -- strict check: don't want any extra keys
        end
        got = got + 1
    end
    return got == len
end
local function tmin(t)
    assert(is_sequence(t)) -- remove this
    local min = 0
    for i = 1, #t do
        min = math.min(min, t[i])
    end
    return min
end
local t = {}
local j = 1
for i = 1e6, 1, -1 do
    t[j] = i
    j = j + 1
end
print(tmin(t))

[3]: I would have provided a patch myself, but I've read that the Lua team does not accept patches. It's just a simple utility function that is barely coupled with the rest of the codebase, except of course slightly with the table representation (but that is unlikely to change substantially and if so, the function is easy to update).

[4]: These helpers have the added benefit of nudging C API users in the right direction.

[5]: In general there is a tradeoff to make here - runtime type checks come at a cost, both in terms of effort and performance -
and in Lua land we do like strong typing (runtime type checking) to a reasonable extent, especially when we're writing Lua APIs in a systems programming language.
If you take "gigo" to its logical extreme, you end up with a weakly typed language like JS where you can do whatever and in the end some "undefined[object Object]"
falls out and you're left wondering how in tarnation that happened.
Lua did this much better.

Sainan

unread,
Sep 16, 2025, 7:16:52 AM (12 days ago) Sep 16
to lu...@googlegroups.com
To be clear, in my case, the json.{encode,decode} functions are implemented in C/C++. I disabled inlining for the respective function, attached a debugger, and after ~10s of intensive work, I checked the profiler stats.

-- Sainan

涵曦

unread,
Sep 16, 2025, 8:16:27 AM (11 days ago) Sep 16
to lu...@googlegroups.com
You can take a look at the implementation of this BSON library.

If it has the __len metamethod, it’s treated as an array.

If it has the __pairs metamethod, it’s treated as a hash table.

In other cases, the is_rawarray function is used to determine the type.



'Sainan' via lua-l <lu...@googlegroups.com> 于 2025年9月16日周二 19:16写道:
To be clear, in my case, the json.{encode,decode} functions are implemented in C/C++. I disabled inlining for the respective function, attached a debugger, and after ~10s of intensive work, I checked the profiler stats.

-- Sainan

--
You received this message because you are subscribed to the Google Groups "lua-l" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.

Lars Müller

unread,
Sep 16, 2025, 9:06:31 AM (11 days ago) Sep 16
to lua-l
That does not look like correct logic to test whether a table is a sequence I'm afraid.
I see multiple potential bugs at a single glance, e.g. no proper check for holes?
It also relies on lua_next giving you the "first" key - assuming there is an array part - which it need not do.

The metamethod checks also don't make a lot of sense, in my opinion.
It's completely reasonable for an "array-like" data structure to implement __pairs, for example.

- Lars


(p.s. if anyone is wondering why i'm sending from a gmail address now, it's because the interface at https://groups.google.com/g/lua-l is more convenient)

Andrey Dobrovolsky

unread,
Sep 16, 2025, 10:32:46 AM (11 days ago) Sep 16
to lu...@googlegroups.com
Hi Lars,

'#' operator should be relied on only if You are already sure that it
is applied to the sequence. In case Your table is the sequence with
holes it may return the position of one of these holes. So in some
cases Your implementation of is_sequence() will fail - rare enough but
surprising a lot. I mean the case when the actual number of alive
elements is equal to the hole position returned by '#'.

I'd like to propose to find the maximum index and compare it with an
actual number of elements present.

local is_sequence = function(t)
local max = 0
local got = 0
for k in pairs(t) do
if math.type(k) ~= "integer" then
return false -- strict check: integers only
end
if k > max then max = k end
got = got + 1
end
return got == max
end

Regards,
Andrew

вт, 16 вер. 2025 р. о 16:06 Lars Müller <appgu...@gmail.com> пише:
> To view this discussion visit https://groups.google.com/d/msgid/lua-l/f14f5728-124d-49de-bb9e-ca2e9c885c7en%40googlegroups.com.
Message has been deleted

Andrey Dobrovolsky

unread,
Sep 16, 2025, 11:21:47 AM (11 days ago) Sep 16
to lu...@googlegroups.com
Ops, update: index must be positive integer

if math.type(k) ~= "integer" or k < 1 then
return false -- strict check: integers only
end

-- Andrew

вт, 16 вер. 2025 р. о 17:32 Andrey Dobrovolsky
<andrey.dobro...@gmail.com> пише:

Sainan

unread,
Sep 16, 2025, 11:33:14 AM (11 days ago) Sep 16
to lu...@googlegroups.com
To detect holes, you really ought to have a 'k' initialised alongside the iteration (k = 1) and for each key, check for equality to k then increment k.

-- Sainan

Thijs Schreijer

unread,
Sep 16, 2025, 11:34:18 AM (11 days ago) Sep 16
to lu...@googlegroups.com

On Tue, 16 Sep 2025, at 15:06, Lars Müller wrote:
> That does not look like correct logic to test whether a table is a sequence I'm afraid.
> I see multiple potential bugs at a single glance, e.g. no proper check for holes?
> It also relies on lua_next giving you the "first" key - assuming there is an array part - which it need not do.
>
> The metamethod checks also don't make a lot of sense, in my opinion.
> It's completely reasonable for an "array-like" data structure to implement __pairs, for example.
>
> - Lars

I think this is the exact problem; any quick check is not fail-safe. And being fail-safe is expensive.
Then again; when do you need to be fail-safe? If your application is passing a table with both hash and array keys, to a JSON encoder, then the problem is primarily with your application.

So in my experience the simple checks suffice.

Thijs

PS. The one thing that has always been a pain point is how to encode an empty table, and that's a problem none of the solutions offered solves (except for a strict new array type).

Andrey Dobrovolsky

unread,
Sep 16, 2025, 11:56:45 AM (11 days ago) Sep 16
to lu...@googlegroups.com
Hi Sainan,

> To detect holes, you really ought to have a 'k' initialised alongside the iteration (k = 1) and for each key, check for equality to k then increment k.

But according to Reference Manual 5.4 section 6.1 about next():

The order in which the indices are enumerated is not specified, even
for numeric indices. (To traverse a table in numerical order, use a
numerical for.)

-- Andrew

вт, 16 вер. 2025 р. о 18:34 'Thijs Schreijer' via lua-l
<lu...@googlegroups.com> пише:
> --
> You received this message because you are subscribed to the Google Groups "lua-l" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/lua-l/b1218ebb-f7e8-46bf-9e8f-552063c51ff8%40app.fastmail.com.

Andrey Dobrovolsky

unread,
Sep 16, 2025, 12:25:02 PM (11 days ago) Sep 16
to lu...@googlegroups.com
> _VERSION
Lua 5.4
> t={}
> for i=1,100 do t[math.random(1000000)]=1 end
> for k in pairs(t) do io.write(k, " ") end
615951 600078 427613 204475 796169 500388 886469 421397 266713 91835
940450 449596 819422 484277 320702 146078 543716 97059 27466 744636
249592 373042 399968 572689 997505 347901 353236 121970 770941 529261
335714 405692 577525 54161 620201 616900 250254 993530 49088 492845
730934 529659 345637 516057 640914 832535 917377 538682 95834 681305
23446 301577 242396 788849 649052 104373 459153 111210 571332 447751
910227 527947 125257 90896 931256 659731 986376 743866 228569 152624
975578 721934 239241 120145 837174 682323 59667 407647 155173 163528
653531 412224 156223 552879 338694 14193 96951 208135 98541 422011
488306 77145 584193 461828 398267 614295 824950 812916 174878 166625 >

-- Andrew

вт, 16 вер. 2025 р. о 18:56 Andrey Dobrovolsky
<andrey.dobro...@gmail.com> пише:

Sainan

unread,
Sep 16, 2025, 12:44:31 PM (11 days ago) Sep 16
to lu...@googlegroups.com
Well, there's definitely a few holes there, so I see no problem. :^)

-- Sainan

Sainan

unread,
Sep 16, 2025, 12:49:02 PM (11 days ago) Sep 16
to lu...@googlegroups.com
This holds, that's good enough for me:

local t = {}
for i = 1000000, 1, -1 do
t[i] = i
end
local i = 1
for k in pairs(t) do
assert(i == k)
i = i + 1
end

-- Sainan

Andrey Dobrovolsky

unread,
Sep 16, 2025, 1:51:22 PM (11 days ago) Sep 16
to lu...@googlegroups.com
> t={}
> for i=1,3000 do t[math.random(1000)+500]=1 end
> for i=1,5000 do t[math.random(998) + 500] = nil end
> n = 0 for k in pairs(t) do n = n + 1 io.write(k, " ") end print("n = ", n)
1106 1299 1499 1500 575 605 699 758 939 n = 9
> for i=1,500 do t[i]=1 end
> n = 0 for k in pairs(t) do n = n + 1 io.write(k, " ") end print("n = ", n)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81 82 1106 84 85 86 87 88 89 90 91 92 93 94 95
96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113
114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130
131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147
148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181
182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198
199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215
216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232
233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249
250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266
267 268 269 270 271 272 273 274 275 1299 277 278 279 280 281 282 283
284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300
301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317
318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334
335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351
352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368
369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385
386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402
403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419
420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436
437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453
454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470
471 472 473 474 475 1499 1500 478 479 480 481 482 483 484 485 486 487
488 489 490 491 492 493 494 495 496 497 498 499 500 575 605 699 758
477 476 276 939 83 n = 509
>
Here we can see a contiguous 1-500 segment broken. Lua team is not joking.

-- Andrew

вт, 16 вер. 2025 р. о 19:49 'Sainan' via lua-l <lu...@googlegroups.com> пише:
> --
> You received this message because you are subscribed to the Google Groups "lua-l" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/lua-l/I38LdQrz2cNDhGCYp6Scm_z66M4fDILzLoM5LOJ3OZ9nBftHDoV7cndJKF-XeSeXQ_0wt7JDFvnnniyvqqUW5MXtpqZ3MDmva6-4K8q80VQ%3D%40calamity.inc.

Sainan

unread,
Sep 16, 2025, 2:11:17 PM (11 days ago) Sep 16
to lu...@googlegroups.com
Okay, but that's not a contiguous table in whole, so it's irrelevant if the index assumption breaks, we need to return false anyway.

-- Sainan

Xmilia Hermit

unread,
Sep 16, 2025, 2:17:52 PM (11 days ago) Sep 16
to lu...@googlegroups.com

> This holds, that's good enough for me:

Until it doesn't:

local t = {}
for i = 19, 1, -1 do
t[i] = i
end
local i = 1
for k in pairs(t) do
assert(i == k)
i = i + 1
end

Regards,
Xmilia


Sainan

unread,
Sep 16, 2025, 2:20:30 PM (11 days ago) Sep 16
to lu...@googlegroups.com
> Until it doesn't

Which version of Lua? Because as far as I can tell, it holds with any size table across various versions.

-- Sainan

Xmilia Hermit

unread,
Sep 16, 2025, 2:29:27 PM (11 days ago) Sep 16
to lu...@googlegroups.com
HEAD(9ea06e61f20ae34974226074fc6123dbb54a07c2) of the GitHub mirror.

Regards,
Xmilia

Sainan

unread,
Sep 16, 2025, 2:32:02 PM (11 days ago) Sep 16
to lu...@googlegroups.com
Damn, a breaking change in Lua 5.5. Well, maybe something to consider before upgrading.

-- Sainan

Lars Müller

unread,
Sep 16, 2025, 2:38:19 PM (11 days ago) Sep 16
to lua-l
Hi Andrey,

I'm a little confused here. I'm fairly sure my implementation is correct.
I know how the length operator works: It gives you any valid border. This is not some kind of "undefined behavior".
It is well defined for all tables, though there may be multiple valid borders for tables with holes.
But that is completely fine for my purposes, because I simply count the (distinct) integer keys in the range 1, 2, ..., len,
where len is the table length "claimed" by #t. If there is a hole, there will be a missing key, hence got < max and the function will return false.
If conversely there is an entry at a higher index, the check "k > len" will succeed, and thus it will also return false.
In either case it works correctly. Am I missing something?
Do you have a concrete example where you believe it would not be guaranteed to be correct as per the reference manual?

There are also alternative implementations. If you don't mind iterating the table twice,
you can simply count the iterations of an ipairs loop (which stops at the first nil) and compare with the count produced using pairs.

By the way, there is a bug in your implementation: is_sequence({[-1] = 1, [1] = 2, [3] = 3}) is true.
You need to reject keys < 1, at least under my strict sequence definition where I want keys 1, 2, ..., n.
Otherwise it looks like a valid alternative.

Best regards,
Lars

Sewbacca

unread,
Sep 16, 2025, 2:50:01 PM (11 days ago) Sep 16
to lu...@googlegroups.com
On 9/16/2025 8:11 PM, 'Sainan' via lua-l wrote:
Okay, but that's not a contiguous table in whole, so it's irrelevant if the index assumption breaks, we need to return false anyway.

-- Sainan

For randomly ordered valid sequences this assumption also fails, idk if that's a usecase to be considered.

-- test.lua
local M = 5000

local function test(n)
	local indicies = {}

	for i = 1, n do
		indicies[i] = i
	end

	local res = { }
	for i = 1, n do
		local index = table.remove(indicies, math.random(1, #indicies))
		res[index] = index
	end

	assert(#res == n)
	for i = 1, #res do assert(res[i] == i) end

	local i = 0
	for k in pairs(res) do
		i = i + 1
		assert(k == i, string.format("#res = %d inconsistent key i=%d k=%d", #res, i, k))
	end
end

for i = 1, M do
	test(i)
end

-- bash
$ lua test.lua
test.lua:23: #res = 10 inconsistent key i=9 k=10
stack traceback:
        [C]: in function 'assert'
        test.lua:23: in local 'test'
        test.lua:28: in main chunk
        [C]: in ?

-- Sewbacca

Sainan

unread,
Sep 16, 2025, 2:59:54 PM (11 days ago) Sep 16
to lu...@googlegroups.com
Thank you. Now we have a solid test case for any 'is array' implementation:

local t = {}
t[4] = 4
t[5] = 5
t[10] = 10
t[1] = 1
t[7] = 7
t[2] = 2
t[9] = 9
t[6] = 6
t[3] = 3
t[8] = 8

-- Sainan

Xmilia Hermit

unread,
Sep 16, 2025, 3:47:36 PM (11 days ago) Sep 16
to lu...@googlegroups.com
How about iterating over the whole array. If one key is not an integer
or smaller than 1, it is not a contiguous array.
We also record the maximum value. Since all keys are unique, one can
compare the number of elements with the maximum, and if they are the
same, it is a contiguous array.
It could be implemented as follows:

local pairs, type = pairs, type
local is_int
if math.tointeger then
local tointeger = math.tointeger
is_int = function(x) return tointeger(x) == x end
else
is_int = function(x) return x%1 == 0 end
end

local function is_contiguous_array(t)
local num, max = 0, 0
for k in pairs(t) do
if type(k) ~= 'number' or k < 1 or not is_int(k) then return
false end
num = num + 1
if k > max then max = k end
end
return max == num
end

In case empty tables should not be accepted the max value can be
initialized to 1.

Regards,
Xmilia

Andrey Dobrovolsky

unread,
Sep 16, 2025, 3:49:51 PM (11 days ago) Sep 16
to lu...@googlegroups.com
Hi Lars,

> I'm a little confused here. I'm fairly sure my implementation is correct.

Agreed, sorry for the noise. Comparison k > len really triggers on
indices out of 1..len interval in case len is the lower border of one
of the holes. Moreover, Your function will detect irregularity in the
supposed sequence the faster the lower hole will be caught by '#',
while mine function will be able to make a conclusion only after full
traverse cycle.

> You need to reject keys < 1,

Agreed once again, there was the following message with update.

Best regards,
Andrew

вт, 16 вер. 2025 р. о 21:59 'Sainan' via lua-l <lu...@googlegroups.com> пише:
> --
> You received this message because you are subscribed to the Google Groups "lua-l" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/lua-l/pIe6F9xnp6u1pPK6lQ_-2KO-zxP9c73DjMZ4_8Np0V1BFUdIAqsmPReX-pLLvR0CkAsA3WGDZJLOKicUHWuYi2uU3GF6_KG1Czd2WbT-oTU%3D%40calamity.inc.

Sewbacca

unread,
Sep 16, 2025, 4:35:52 PM (11 days ago) Sep 16
to lu...@googlegroups.com
On 9/16/2025 9:47 PM, Xmilia Hermit wrote:
    local num, max = 0, 0
    for k in pairs(t) do
        if type(k) ~= 'number' or k < 1 or not is_int(k) then return false end
        num = num + 1
        if k > max then max = k end
    end
    return max == num

For a faster rejection, you might wanna add len = #t and in the body check if k > len then break end

-- Sewbacca

Sainan

unread,
Sep 16, 2025, 5:02:47 PM (11 days ago) Sep 16
to lu...@googlegroups.com
This should work:

inline lua_Integer isIndexBasedTable (lua_State *L, int i) {
lua_pushvalue(L, i);
lua_pushnil(L);
lua_Integer k = 1;
for (; lua_next(L, -2); ++k) {
if (lua_type(L, -2) != LUA_TNUMBER) {
lua_pop(L, 3);
return 0;
}
if (!lua_isinteger(L, -2)) {
lua_pop(L, 3);
return 0;
}
if (lua_tointeger(L, -2) <= 0) {
lua_pop(L, 3);
return 0;
}
lua_pop(L, 1);
}
lua_pop(L, 1);
return k;
}

static int tisarray (lua_State *L) {
luaL_checktype(L, 1, LUA_TTABLE);
if (const auto n = isIndexBasedTable(L, 1)) {
for (lua_Integer k = 1; k != n; ++k) {
if (l_unlikely(lua_geti(L, 1, k) == LUA_TNIL)) {
lua_pop(L, 1);
lua_pushboolean(L, false);
return 1;
}
lua_pop(L, 1);
}
lua_pushboolean(L, true);
}
else lua_pushboolean(L, false);
return 1;
}

-- Sainan

Andrey Dobrovolsky

unread,
Sep 17, 2025, 9:47:50 AM (10 days ago) Sep 17
to lu...@googlegroups.com
Hi Lars and discussion participants,

A few words about possible optimization of the algorithm discussed
above. Taking into consideration implementation of the luaH_next() -
searching in array part first - we can guess that starting scanning at
#t index should significantly decrease the number of steps necessary
for finding out-of-sequence indices. In case the table includes the
correct sequence plus some amount of non-sequence elements it will
take the single step to conclude that the table is not a pure
sequence. The same for tables missing array part.For other cases like:
- a sequence with holes with or without non-sequence elements
such a kind of first stage should not take many steps too.
If the table is a canonical sequence then next(t,#t) will return nil
and the second search stage can be started from nil to #t according to
the conditions discussed earlier in this thread.
Hope this may help.

Regards,
Andrew

ср, 17 вер. 2025 р. о 00:02 'Sainan' via lua-l <lu...@googlegroups.com> пише:
> --
> You received this message because you are subscribed to the Google Groups "lua-l" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to lua-l+un...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/lua-l/eM37IJTdV7ZmRE9Lht1koOq5KRcNQu_7ja-23iL0W-5ouXjMs-1P1PVzkXpojj3Ia_8x_4WBFBOH7ZoxkLE_-gOny0WL8Xr_3fFxX9_cYho%3D%40calamity.inc.

Roberto Ierusalimschy

unread,
Sep 17, 2025, 1:36:13 PM (10 days ago) Sep 17
to 'Sainan' via lua-l
> Damn, a breaking change in Lua 5.5. Well, maybe something to consider before upgrading.

What exactly is being broken?

-- Roberto

Sainan

unread,
Sep 17, 2025, 1:51:40 PM (10 days ago) Sep 17
to lu...@googlegroups.com
> What exactly is being broken?

I was being facetious about this apparently failing on Lua 5.5 as reported by Xmilia:

local t = {}
for i = 19, 1, -1 do
t[i] = i
end
local i = 1
for k in pairs(t) do
assert(i == k)
i = i + 1
end

But obviously as per the Lua manual there is no guarantee about this even in current versions, just a bit of 'UB' being made more obvious now.

-- Sainan

Gé Weijers

unread,
Sep 18, 2025, 10:44:19 AM (9 days ago) Sep 18
to lu...@googlegroups.com
On Tue, Sep 16, 2025 at 8:32 PM 'Sainan' via lua-l <lu...@googlegroups.com> wrote:
Damn, a breaking change in Lua 5.5. Well, maybe something to consider before upgrading.

-- Sainan

 
That's not a breaking change, that's trying to rely on an undocumented implementation behavior that is not guaranteed. Lua 5.4.8 also has access patterns that break your assumption.

This one fails on Lua 5.4.8 (this is the build from 'homebrew' on MacOS). After the first loop ends the table contains the values 1..68 such that t[i] == i, as 59 and 68 are relative prime.

local t = {}
for i = 68, 1, -1 do
        local j = i*59 % 68 + 1
        t[j] = j

end
local i = 1
for k in pairs(t) do
     assert(i == k)
     i = i + 1
end

The keys 65, 66, 67, 68 show up in order 66, 67, 68, 65. They are likely part of the hash part of the table. It's still a sequence.

Lars Müller

unread,
Sep 19, 2025, 10:35:26 AM (8 days ago) Sep 19
to lua-l
I have now written a proof of concept, and as expected it is much faster than a user iterating the table via lua_next for a table living in the array part (the expected common case where you do not get to bail out early).
I'm measuring two orders of magnitude speedup for a large sequence on my machine. [1]

I have put my proof-of-concept branch on GitHub [2]. Attached is also a full diff against current Lua master (9ea06e6).
Simply running ./etc/quick_test.sh should run the small test + benchmark.

- Lars



[1]: Of course it is still linear time as is necessary, but in comparison to whatever linear time processing you do with the table during conversion / serialization that pales. (Additionally I suppose some details relating to very small tables would need to be fleshed out.)
[2]: https://github.com/appgurueu/lua/compare/master...appgurueu:lua:is-seq
lua_issequence.diff
Reply all
Reply to author
Forward
0 new messages