tables constructor nesting limit

Andrey Dobrovolsky

Jun 23, 2024, 4:18:43 PM
to lu...@googlegroups.com
Greetings!

It might be worth adding a note about the table constructor nesting
depth limit to the Lua reference manual, for example at the end of the
"3.4.9 – Table Constructors" section.

Regards!

Sainan

Jun 23, 2024, 4:41:02 PM
to lu...@googlegroups.com
I don't think there is any such limit. You may run into register limitations, tho.

Andrey Dobrovolsky

Jun 23, 2024, 6:22:16 PM
to lua-l
Unfortunately such a limit exists, and you're absolutely right: it is derived from the limit on the number of registers. I don't mean the nesting level of data structures in memory, but the nesting depth limit of the table constructor {} in Lua source code.
Documenting such a warning may prevent unpleasant surprises for those who intend to use Lua sources as a data format.

On Sunday, June 23, 2024 at 23:41:02 UTC+3, Sainan wrote:

Sainan

Jun 23, 2024, 7:09:12 PM
to lu...@googlegroups.com
In some of my applications, heavy data is stored in Lua source files, and I have yet to run into any issues.

The only issue I could even imagine is if your data were stored in a Lua source file and such data required more than 255 registers; and if this is actually an issue you are facing, you have a lot of easy solutions:

  1. Work around it by just using intermediate locals, e.g.: local inner_t = {{{{{{}}}}}}; local outer_t = {{{{{{inner_t}}}}}};
  2. Use a data exchange format that is not Turing-complete.
  3. Format your data more pragmatically.
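
A small illustration of option 1, as a sketch only: in PUC-Lua, temporaries are released at the end of each statement, so a table built in its own `local` statement no longer ties up the registers of the constructor that later refers to it. The names `inner` and `outer` and the values are made up for illustration.

```lua
-- Build the deep part first; afterwards it lives in a single local.
local inner = { "leaf", { "deeper" } }

-- The outer constructor only references that local instead of nesting further.
local outer = { 1, 2, 3, { "middle", inner } }

print(outer[4][2][2][1])
```

The resulting data structure is the same as with one big nested constructor; only the way the source is split into statements differs.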

Andrey Dobrovolsky

Jun 23, 2024, 8:20:09 PM
to lua-l
I appreciate all the proposed solutions. Of course it is not an unsolvable problem, and the straightforward way is the second one: use another data format, which means using some additional stuff. In my case it was a matter of my wish to use Lua as widely as possible :-) I've satisfied my needs with a patch extending the guaranteed nesting depth up to 200 (the C stack limit). But I think some warning would be fine in the reference, because the original guaranteed nesting level is quite humble - 5 - and is data order (!) dependent, which can cause very tough debugging. Or maybe a note discouraging the use of Lua as a data parser would be nice, I guess.

Regards!

On Monday, June 24, 2024 at 02:09:12 UTC+3, Sainan wrote:

Rett Berg

Jun 24, 2024, 5:53:12 PM
to lua-l
This feels to me like a solution looking for a problem. Most folks wouldn't want such a large nested literal: create variables or functions and reuse them as needed, and the problem goes away.

Andrey Dobrovolsky

Jun 24, 2024, 11:19:25 PM
to lua-l
Thanks for the proposal, but it works well for hand-written sources, while I mean using Lua as a data exchange format. Of course, for hand-written code deeply nested literals only hurt readability and understandability, and indisputably must not be recommended. I meant software-generated data formatted as Lua tables, which can be used without any preprocessing. Probably such usage is attractive for small utilities only, because for large software it doesn't matter whether it has N or N+1 dependencies.
I understand that making the Lua parser bulletproof against external code, which may carry malicious intentions, would make the parser more complicated and enlarge the code. Still, "load()" and friends are essential for Lua; that's why I think the ways to interrupt normal Lua functioning should be observed and discussed. At the very least, a warning in the Lua reference concerning the parser limitations may be helpful.

Regards!
On Tuesday, June 25, 2024 at 00:53:12 UTC+3, Rett Berg wrote:

Sainan

Jun 24, 2024, 11:51:05 PM
to lu...@googlegroups.com
JSON, YAML, XML, and TOML aren't the only data exchange formats on the block. Make your own if you think existing solutions are so complex that you can't implement a parser for them in like 50 LOC.

Sean Conner

Jun 25, 2024, 12:54:55 AM
to 'Sainan' via lua-l
It was thus said that the Great 'Sainan' via lua-l once stated:
> JSON, YAML, XML, and TOML aren't the only data exchange formats on the
> block. Make your own if you think existing solutions are so complex that
> you can't implement a parser for them in like 50 LOC.

There's also CBOR [1], which is a binary encoding system that makes
distinctions between integers and floating point, text strings and binary
data, and you can semantically tag data as well.

-spc

[1] http://cbor.io/
also RFC-8949

Sainan

Jun 25, 2024, 1:03:36 AM
to lu...@googlegroups.com
Yes, data exchange formats exist. I'm just saying you don't need to use one that is too complex to parse in about 50 lines of code.

Also, I don't understand the rationale behind CBOR. It's always good to have a human-viewable stream of data for sanity's sake. And it's not like a JSON string can't contain binary data (there is an escaping mechanism after all), it's just that sanity usually wins.

Sean Conner

Jun 25, 2024, 1:19:58 AM
to 'Sainan' via lua-l
It was thus said that the Great 'Sainan' via lua-l once stated:
> Also, I don't understand the rationale behind CBOR. It's always good to
> have a human-viewable stream of data for sanity's sake. And it's not like
> a JSON string can't contain binary data (there is an escaping mechanism
> after all), it's just that sanity usually wins.

There are other data-exchange formats that are binary (DNS comes to mind).
Also, CBOR has defined how to encode integers, floating point (including NaN
and +-inf, something that JSON lacks for instance), binary data that doesn't
inflate the size by 33%, and don't discount the semantic tagging of data.
The RFC also includes a text-readable version of CBOR for examples, so it's
not like it *has* to be binary.

It's also interesting that HTTP/2 and HTTP/3 have evolved beyond text
encoding. And it's obviously better, because Google says so [1]. All Hail
Google!

-spc

[1] Sarcasm. Heavy sarcasm here.

Sainan

Jun 25, 2024, 2:11:43 AM
to lu...@googlegroups.com
And yet, we still use HTTP/1.1. I'm not saying binary formats are bad, but they are not a good fit if you want humans to "learn" a format and then teach their computers to speak it.

And especially for data exchange, these formats are often used for config files, so it helps to allow humans to create and modify such data (and also it helps to not have stupid rules about trailing commas, so ideally don't use JSON).

Sean Conner

Jun 25, 2024, 4:15:15 AM
to 'Sainan' via lua-l
It was thus said that the Great 'Sainan' via lua-l once stated:
> And especially for data exchange, these formats are often used for config
> files, so it helps to allow humans to create and modify such data (and
> also it helps to not have stupid rules about trailing commas, so ideally
> don't use JSON).

To keep this on topic for this list, I, in fact, use Lua for
configuration files. Comments, and sane rules about trailing commas.

-spc

Andrey Dobrovolsky

Jun 25, 2024, 6:35:35 AM
to lua-l
Sean Conner wrote:
> Comments, and sane rules about trailing commas.

Plus painless interleaving of sequenced and indexed fields.

Lua is simply the best.

Regards!

On Tuesday, June 25, 2024 at 11:15:15 UTC+3, Sean Conner wrote:

Martin Eden

Jun 25, 2024, 11:23:31 AM
to lu...@googlegroups.com
The table constructor's nesting limit is an implementation detail.
It does not belong in the language specification.

But it would not be bad to have an "Implementation limits" document
for the PUC-Lua implementation. It would be useful for setting up a code generator.

What if I want a variable name several yobibytes long? A googol-length
integer literal?

(IIRC, BASIC limited variable names to two characters.
With such a thing in the spec you can implement the name -> address mapping
using just a table with 64 Ki keys. That was the age before hashes spread.)

-- Martin

Martin Eden

Jun 25, 2024, 11:34:26 AM
to lu...@googlegroups.com
Regarding data formats for representing trees.

Have you guys considered bencoding [1]? Sane, practical and easy to
implement. Open any .torrent file to see it.

[1]: https://wiki.theory.org/BitTorrentSpecification#Bencoding

-- Martin

Rett Berg

Jun 25, 2024, 1:57:32 PM
to lu...@googlegroups.com
> Had you guys considered bencoding

A string is encoded like this:

4:spam

This is NOT human writable. Requiring humans to write down their string length is a recipe for pain and disaster.


--
You received this message because you are subscribed to a topic in the Google Groups "lua-l" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/lua-l/UYrcWiHKU4I/unsubscribe.
To unsubscribe from this group and all its topics, send an email to lua-l+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lua-l/5e620b31-b23b-4c10-adaf-3ae7dc2ae519%40disroot.org.

Roberto Ierusalimschy

Jun 25, 2024, 2:06:49 PM
to lu...@googlegroups.com
Please, let us keep this list for discussions about Lua.

Many thanks,

-- Roberto

Roberto Ierusalimschy

Jun 25, 2024, 4:03:42 PM
to lu...@googlegroups.com
> [...] But I think
> some warning would be fine in the reference because original guaranteed
> nesting level is quite humble - 5 - [...]

It should be emphasised that this limit hits in very specific
circumstances. Not only do you need 5 constructors nested inside each
other, but each one should have dozens of elements. (Just for curiosity,
did you reach that limit?)


> [...] which
> can cause very tough debugging.

I don't understand that point. Why would debugging be tough?

-- Roberto

Andrey Dobrovolsky

Jun 25, 2024, 4:47:19 PM
to lua-l
> It should be emphasised that this limit hits in very specific
> circumstances. Not only do you need 5 constructors nested inside each
> other, but each one should have dozens of elements. (Just for curiosity,
> did you reach that limit?)

After I noticed that the maximum depth depends on the number of sequence members, I turned to the sources and saw LFIELDS_PER_FLUSH defined in lopcodes.h, which is 50. So opening a new nesting level after LFIELDS_PER_FLUSH * X - 1 sequential fields (X is any positive integer) leaves those fields in registers, unflushed, and substantially decreases the remaining nesting pool.

> I don't understand that point. Why would debugging be tough?

Because the nesting limit is data order dependent.
The case
{0, ... , 0, {0, ... , 0, {0, ... , 0, {0, ... , 0, {0, ... , 0, {0, ... , 0, {0, ... , 0}}}}}}}
will cause an error, while the case
{{{{{{{0, ... , 0}, 0, ... , 0}, 0, ... , 0}, 0, ... , 0}, 0, ... , 0}, 0, ... , 0}, 0, ... , 0}
will not.
If the table is software-generated, in certain applications the data order is not reproducible, which will lead to headaches :-)
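
The two cases above can be probed with a short script. This is a sketch that assumes PUC-Lua 5.3 or 5.4, where `load` accepts a source string and returns nil plus a message when compilation fails. The helper names `nested_last` and `nested_first` are made up for illustration, and 49 items per level is chosen so that each nested table opens as the 50th list item, that is, after LFIELDS_PER_FLUSH * 1 - 1 pending fields.

```lua
-- Sketch, assuming PUC-Lua 5.3/5.4; other implementations may behave differently.
local DEPTH = 7      -- nesting levels, as in the two cases above
local PENDING = 49   -- LFIELDS_PER_FLUSH - 1 list items per level

local zeros = string.rep("0, ", PENDING)

-- {0, ..., 0, {0, ..., 0, { ... }}}: every nested table follows 49 items.
local function nested_last(depth)
  return "return " .. string.rep("{" .. zeros, depth) .. string.rep("}", depth)
end

-- {{{ ... {0, ..., 0}, 0, ..., 0}, 0, ..., 0}: every nested table comes first.
local function nested_first(depth)
  return "return " .. string.rep("{", depth) .. zeros
      .. string.rep("}, " .. zeros, depth - 1) .. "}"
end

-- load() only compiles the chunk; nothing is executed here.
local chunk_last, err_last = load(nested_last(DEPTH))
local chunk_first, err_first = load(nested_first(DEPTH))

print("nested table last: ", chunk_last and "compiles" or err_last)
print("nested table first:", chunk_first and "compiles" or err_first)
```

On those versions the first source should be rejected by the compiler and the second accepted, although both describe the same amount of data at the same depth. The exact threshold is an implementation detail and may differ in other versions or implementations.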

I realize that Lua is not used as a data exchange format; at most it is used for configs, so the described deviation remains hidden. Maybe a warning about the guaranteed nesting depth limit will cool down the hotheads hunting for something not very common :-)

On Tuesday, June 25, 2024 at 23:03:42 UTC+3, Roberto Ierusalimschy wrote:

Luiz Henrique de Figueiredo

Jun 25, 2024, 5:39:05 PM
to lu...@googlegroups.com
> I realize that Lua is not used as a data exchange format; at most it is used for configs

When Lua was created, 30 years ago, writing data files was already a
design goal. It supported large files, for that time.
During its evolution, Lua has supported increasingly larger data files.
But that does not mean huge arbitrarily nested data.

Roberto Ierusalimschy

Jun 25, 2024, 5:39:17 PM
to lu...@googlegroups.com
Again, have you ever hit this problem "in the wild", with any real
application?

-- Roberto

Andre Leiradella

Jun 25, 2024, 5:47:35 PM
to lua-l
FWIW, in my Pascal to Lua transpiler, I generate nested table constructors when generating the Pascal classes, structures and arrays. For some programs, I would hit the limit.

I switched to initializing them separately, and then setting the required fields accordingly, i.e. obj1.field = obj2.

Andre
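
The approach Andre describes, initializing tables separately and then linking them, could look roughly like this in a generator. This is a minimal sketch, not code from his transpiler: `emit_flat`, the table name `T` and the sample data are hypothetical. It assumes PUC-Lua 5.3 or 5.4, a tree without cycles, number or string keys, and finite number, string or boolean leaves. Intermediate tables are kept in one table `T` rather than in separate locals, because PUC-Lua also limits a function to 200 local variables.

```lua
-- Hypothetical serializer: emits one statement per table and per field,
-- so no table constructor is ever nested inside another.
local function emit_flat(root)
  local lines = { "local T = {}" }
  local count = 0

  -- Render a key or leaf value as Lua source text.
  local function literal(v)
    if type(v) == "string" then
      return string.format("%q", v)
    end
    return tostring(v)
  end

  -- Emit the statements for one table and return its index in T.
  local function visit(tbl)
    count = count + 1
    local id = count
    lines[#lines + 1] = string.format("T[%d] = {}", id)
    for k, v in pairs(tbl) do
      local rhs
      if type(v) == "table" then
        -- Child statements are emitted before the line that links the child.
        rhs = string.format("T[%d]", visit(v))
      else
        rhs = literal(v)
      end
      lines[#lines + 1] = string.format("T[%d][%s] = %s", id, literal(k), rhs)
    end
    return id
  end

  visit(root)
  lines[#lines + 1] = "return T[1]"
  return table.concat(lines, "\n")
end

-- Usage: generate the source, load it back, and read a nested field.
local src = emit_flat({ name = "obj1", field = { name = "obj2", 10, 20 } })
local obj1 = load(src)()
print(obj1.field.name, obj1.field[2])
```

Because every statement is short, the register use of the generated chunk does not grow with the nesting depth or with the order of the data.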

Andrey Dobrovolsky

Jun 25, 2024, 6:33:07 PM
to lua-l
> Again, have you ever hit this problem "in the wild", with any real
> application?

Of course, otherwise how could I know about it? In my fork of redo-c the common build log is written by different processes, and the nested structure reflects the parent-child relations while sequence fields represent dependencies. Named fields record timings.
Using Lua for such a structure is natural. I'm not sure which nesting level caused the unexpected failures, but if I am not mistaken it was in the range 20 - 30. And it was changing for unknown reasons. Then I started to dig into what was going wrong.
 
On Wednesday, June 26, 2024 at 00:39:17 UTC+3, Roberto Ierusalimschy wrote:

Andrey Dobrovolsky

Jun 25, 2024, 6:48:47 PM
to lua-l
> But that does not mean huge arbitrarily nested data.

But hitting the described limit doesn't require a huge amount of data or even a huge nesting depth. It may appear suddenly and disappear just as suddenly. 5 levels are guaranteed; anything above that is luck, or something of that kind.

On Wednesday, June 26, 2024 at 00:39:05 UTC+3, Luiz Henrique de Figueiredo wrote:

Roberto Ierusalimschy

Jun 26, 2024, 4:25:23 PM
to lu...@googlegroups.com
> > Again, have you ever hit this problem "in the wild", with any real
> > application?
>
> Of course, otherwise how could I know about it? [...]

For instance looking at the Lua code generator. (I imagine that is how
you arrived at the "5" limit.)


> I'm not sure which was nesting
> level to cause unexpected fails, but if I am not mistaken it was in the
> range 20 - 30.

Well, 20-30 is a little larger than 5. :-)

Many thanks for the reply.

-- Roberto