Observations from a bytecode validation feasibility experiment


nobody

Feb 21, 2026, 10:47:13 AM
to lu...@googlegroups.com
Hi,

I recently tried to build a bytecode validator for Lua 5.4, mainly to
see how the internals changed compared to earlier versions and where the
main sources of difficulty are now. My main conclusion is that it's
gotten a lot harder, and a complete verifier would likely be as big as
(if not larger than) the luaV_execute core loop. In particular:


1) luaP_opmodes no longer works as "ground truth"

Many 5.1/5.2 validators relied on luaP_opmodes to handle most opcodes
generically, with a switch/case only covering the more complex cases. In
5.4, there are "inaccuracies" (from the perspective of post-hoc
consumers) that make it unsuitable as ground truth.

As an example, VARARGPREP is tagged as A ("instruction sets register A")
even though it doesn't write to it. This means that e.g. a
`function(...) return end` (which has maxstacksize == 0) would fail a
generic luaP_opmodes-based test for supposedly writing an out-of-bounds
register.

I assume the table is exactly in the form the parser / code generator
needs; it just means it's no longer easily reusable for other purposes.
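
A minimal sketch of that false positive, assuming a toy two-opcode
table. All sk_* names, the enum values and the bit position are
hypothetical stand-ins for illustration, not the real lopcodes.h
definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the real lopcodes.h contents. */
enum { SK_MOVE, SK_VARARGPREP, SK_NUM_OPS };

#define SK_SETS_A (1u << 3)  /* "instruction sets register A" (bit position assumed) */

static const unsigned char sk_opmodes[SK_NUM_OPS] = {
    [SK_MOVE]       = SK_SETS_A,  /* really writes R[A] */
    [SK_VARARGPREP] = SK_SETS_A,  /* tagged as setting A, but writes no register */
};

typedef struct { int op; int a; } SkInstr;

/* Generic check: an opcode tagged "sets A" must target a register in the frame. */
static bool sk_generic_ok(SkInstr i, int maxstacksize) {
    assert(i.op >= 0 && i.op < SK_NUM_OPS);
    if (sk_opmodes[i.op] & SK_SETS_A)
        return i.a < maxstacksize;
    return true;
}

/* Same check with an explicit override for the opcode the table mis-describes. */
static bool sk_checked_ok(SkInstr i, int maxstacksize) {
    if (i.op == SK_VARARGPREP)
        return true;  /* A is not a destination register here */
    return sk_generic_ok(i, maxstacksize);
}
```

With A == 0 and maxstacksize == 0, sk_generic_ok rejects VARARGPREP
while sk_checked_ok accepts it. Every such exception needs its own
case, which is how a generic table walk grows into a full switch.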


2) Type assumptions are the main source of complexity

Instructions like SETLIST assuming a type (try e.g. `return
{debug.setlocal(1,1,0)}`) mean that validation requires tracing register
types, which is much more involved than sanity-checking instructions in
isolation. For this particular one, 5.1 had a runtime check macro that
silently skipped the instruction on type mismatch, 5.2 had that macro do
nothing unless explicitly defined, and 5.3+ just dropped the test entirely.

Having optional runtime checks (via luaconf.h / compile-time flag) that
raise catchable errors could be useful for hardening in general, and
would also make a vastly simpler bytecode validator "good enough". (A
practical way to identify the places that need them would be combining
a minimal bounds-check validator with a fuzzer, though this can
probably be defined more rigorously if desired.)
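
A rough sketch of what such an optional check could look like, using a
toy value type rather than Lua's TValue. SK_HARDEN, sk_check_type and
the other sk_* names are hypothetical; no such option exists in
luaconf.h today, and a real version would raise a Lua error rather
than return a status code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time switch; not an existing luaconf.h option. */
#ifndef SK_HARDEN
#define SK_HARDEN 1
#endif

typedef enum { SK_TNIL, SK_TNUMBER, SK_TTABLE } SkType;
typedef struct { SkType tt; int nitems; } SkValue;  /* toy tagged value */

typedef enum { SK_OK, SK_ERR_TYPE } SkStatus;

#if SK_HARDEN
/* Hardened build: bail out with a catchable status on a type mismatch. */
#define sk_check_type(v, t) \
    do { if ((v)->tt != (t)) return SK_ERR_TYPE; } while (0)
#else
/* Default build: the check compiles away and costs nothing. */
#define sk_check_type(v, t) ((void)0)
#endif

/* Toy SETLIST-like handler: stores n items into the table in register ra. */
static SkStatus sk_setlist(SkValue *ra, int n) {
    assert(ra != NULL && n >= 0);
    sk_check_type(ra, SK_TTABLE);  /* the instruction assumes a table here */
    ra->nitems += n;
    return SK_OK;
}
```

With SK_HARDEN set, a non-table in ra yields SK_ERR_TYPE and the value
is left untouched, so the validator only has to guarantee that
register indices are in bounds.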


3) Block / scope debug information

One of the hardest parts in practice is reconstructing block and scope
structure. Lua must have access to this information in some form during
code generation, but only records instruction → line mappings.

Explicit block / scope information would be useful not just for
validation, but likely also for profiling, debugging, fuzzing (of the
Lua code, not Lua itself), etc. (Even if code is written "more
vertically" to make line numbers more informative, there will still be
cases where multiple basic blocks sit on the same line.)
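
To make the idea concrete, here is one possible shape for such a
record, with a lookup from instruction to innermost block. SkBlockInfo
and sk_innermost_block are hypothetical names; nothing like this is
part of Lua's Proto, and the layout is only an assumption:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical debug record; Lua's function prototypes have no such field. */
typedef struct {
    int startpc;  /* first instruction of the block */
    int endpc;    /* one past the last instruction of the block */
    int parent;   /* index of the enclosing block, -1 for the function body */
} SkBlockInfo;

/* Returns the index of the innermost block containing pc, or -1 if none.
   Assumes blocks are properly nested, so the narrowest match is innermost. */
static int sk_innermost_block(const SkBlockInfo *blocks, size_t n, int pc) {
    assert(blocks != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (pc < blocks[i].startpc || pc >= blocks[i].endpc)
            continue;  /* pc lies outside this block */
        if (best < 0 ||
            blocks[i].endpc - blocks[i].startpc <
                blocks[best].endpc - blocks[best].startpc)
            best = (int)i;
    }
    return best;
}
```

A profiler or validator could then attribute an instruction to a block
even when several blocks share one source line.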


It's clear that bytecode validation isn't a priority for the community,
but the features that would likely enable it (optional
runtime type checks in critical places, block/scope debug info) both
seem small and potentially useful for other purposes. So I'm
curious if others have actually felt a need for these in other contexts,
or whether these are just things that look like they might be useful but
actually aren't all that relevant.

Cheers,
nobody

Luiz Henrique de Figueiredo

Feb 21, 2026, 2:38:09 PM
to lu...@googlegroups.com
> My main conclusion is that it's
> gotten a lot harder, and a complete verifier would likely be as big as
> (if not larger than) the luaV_execute core loop.

A complete verifier is a hopeless task (halting problem etc.).
We dropped even the simple integrity test in Lua 4.0, way back in 2000.

> Many 5.1/5.2 validators relied on luaP_opmodes to handle most opcodes
> generically, with a switch/case only covering the more complex cases.

We did that for bytecode listing in luac.c, but it was not clear.
Since Lua 5.4, luac.c uses a complete switch; it's much easier to
maintain even if there is much repetition.

> It's clear that bytecode validation isn't a priority for the community,

It's an impossible task.
Our take is that if you don't trust bytecode, don't allow Lua to load it.
--lhf