Implementing non-blocking IO in a C application that embeds Lua using coroutines while supporting user coroutines

LordMZTE

Jul 14, 2025, 5:25:35 AM
to lu...@googlegroups.com
Hello everyone!

I have a daemon-like program in C (actually, Zig, but we can pretend it's C for the sake of
argument) which executes user-provided Lua code. I want to provide an API to this Lua code that
enables potentially slow IO operations in a non-blocking fashion. Suppose this is our user-provided
Lua code:

print "Started"
local result = do_io_operation()
print "Done"

I would like the C code which invokes this Lua code to return back to some event loop when
`do_io_operation` is called, allowing the program (which is single-threaded in nature due to Lua
also being single-threaded) to do other work while we wait for an IO response. The event loop should
then continue executing this Lua code once we have the result, ending up at a C callback once it
finishes. My current idea is using coroutines, where the C code would look something like this
(though this should be treated as pseudocode):

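#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

/* Provided by the host's event loop (not shown). */
void start_io_operation_on_event_loop(lua_State *L);
void run_my_event_loop(void);

static int lua_do_io_operation(lua_State *L) {
    /* The event loop will lua_resume this thread with the IO result. */
    start_io_operation_on_event_loop(L);
    return lua_yield(L, 0);
}

int main(void) {
    lua_State *L = luaL_newstate();
    luaL_openlibs(L); /* usercode.lua calls print */
    lua_register(L, "do_io_operation", lua_do_io_operation);

    lua_State *thread = lua_newthread(L);
    luaL_loadfile(thread, "usercode.lua");
    lua_resume(thread, 0); /* This should return once the user code invokes do_io_operation. */

    run_my_event_loop();

    lua_close(L);
    return 0;
}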

This seems to me like it should work, but I believe this will break if the user chooses to also use
coroutines. The user may expect something like this to work:

local coro = coroutine.create(function()
    local result = do_io_operation()
    doStuff(result)
end)

-- The user expects this to make the coroutine run to completion, whereas do_io_operation will
-- suspend the coroutine too early, making this cause undesired behavior and making our event
-- loop resume a coroutine that probably doesn't even exist anymore.
coroutine.resume(coro)

I would like this to work, but for that to work, I would probably need the ability to resume to the
"outermost" coroutine, as they're being nested here. Thus, my final question: How can I implement
this sort of non-blocking IO in such a fashion that the above code works as expected?
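
A minimal sketch of the failure mode described above, in plain Lua with no host program involved. The `do_io_operation` here is only a stand-in that yields with no values, which is an assumption about how the real host function behaves, and the `stuff_done` flag stands in for `doStuff`:

```lua
-- Stand-in for the host-provided function (assumption): it yields no values
-- and expects the event loop to resume the coroutine with the IO result.
local function do_io_operation()
    return coroutine.yield()
end

local stuff_done = false

local coro = coroutine.create(function()
    local result = do_io_operation()
    stuff_done = true -- stands in for doStuff(result)
end)

-- resume returns as soon as do_io_operation yields, not when coro finishes.
coroutine.resume(coro)
print(coroutine.status(coro), stuff_done) -- prints "suspended" and "false"
```

The yield lands in the user's own `coroutine.resume` call rather than in the host's `lua_resume`, so the user code carries on while `coro` is still suspended.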

Please note that I'm on LuaJIT 2.1 (Lua 5.1), so no `lua_yieldk` and the like.

Thijs Schreijer

Jul 14, 2025, 8:33:49 AM
to lu...@googlegroups.com


On Mon, 14 Jul 2025, at 11:01, LordMZTE wrote:
> Hello everyone!
>
> I have a daemon-like program in C (actually, Zig, but we can pretend it's C for the sake of
> argument) which executes user-provided Lua code. I want to provide an API to this Lua code that
> enables potentially slow IO operations in a non-blocking fashion.

<snip>

> This seems to me like it should work, but I believe this will break if the user chooses to also use
> coroutines. The user may expect something like this to work:
>
> local coro = coroutine.create(function()
>     local result = do_io_operation()
>     doStuff(result)
> end)
>
> -- The user expects this to make the coroutine run to completion, whereas do_io_operation will
> -- suspend the coroutine too early, making this cause undesired behavior and making our event
> -- loop resume a coroutine that probably doesn't even exist anymore.
> coroutine.resume(coro)
>
> I would like this to work, but for that to work, I would probably need the ability to resume to the
> "outermost" coroutine, as they're being nested here. Thus, my final question: How can I implement
> this sort of non-blocking IO in such a fashion that the above code works as expected?
>
> Please note that I'm on LuaJIT 2.1 (Lua 5.1), so no `lua_yieldk` and the like.

Looks like the "nested coroutine problem" to me, see http://lua-users.org/lists/lua-l/2023-10/msg00139.html

bil til

Jul 15, 2025, 12:15:57 AM
to lu...@googlegroups.com
On Mon, Jul 14, 2025 at 11:25, LordMZTE <lo...@mzte.de> wrote:
>
> This seems to me like it should work, but I believe this will break if the user chooses to also use
> coroutines...

In my experience with resume/yield (though it has been a year since I
last programmed this), you have to be very strict and should not mix
different approaches.

In my lib I finally solved it by NOT allowing the user to use
coroutines; instead I programmed my own threads.

I think coroutines can run nicely as long as you do not interfere from
your C side. But if you use resume/yield processing in your own
software anyway, then it is better to also present your own thread
system to the user.

Yan

Jul 15, 2025, 1:14:10 AM
to lua-l
I agree.

Think about why you want to use `coroutine.create`.

With `coroutine.create`:

```
... -- code1

local coro = coroutine.create(function()
    local result = do_io_operation()
    doStuff(result)
end)
... -- code2
```

In this case, code2 does not wait for `doStuff`.

msg handler: code1 -> | -> code2
                      +-> do_io_operation(doStuff)

Without `coroutine.create`:

```
--- code1

local result = do_io_operation()
doStuff(result)
--- code2
```
In this case, code2 must wait for `doStuff`.

msg handler: code1 -> do_io_operation(doStuff) -> code2
 
So you can manage the coroutines yourself. For example, you can implement a `fork` function.

With `fork`:

```
... -- code1
fork(function()
    local result = do_io_operation()
    doStuff(result)
end)
... -- code2
```

Inside the `fork` function:

1. Call `coroutine.create`, and have the framework push the returned coroutine onto a queue.
2. When the message handler has finished and control returns to the framework, pop the fork queue and resume the coroutine.
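
The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: the names `fork_queue`, `run_forked` and `handler` are hypothetical, and a real framework would also have to keep track of forked coroutines that suspend while waiting for I/O.

```lua
-- Hypothetical sketch of `fork`; names and queue layout are assumptions.
local fork_queue = {}

-- Step 1: called by user code. Wrap fn in a coroutine and queue it.
local function fork(fn)
    fork_queue[#fork_queue + 1] = coroutine.create(fn)
end

-- Step 2: called by the framework once the message handler has returned.
-- Resumes queued coroutines in FIFO order, including any forked meanwhile.
local function run_forked()
    while #fork_queue > 0 do
        local co = table.remove(fork_queue, 1)
        local ok, err = coroutine.resume(co)
        if not ok then
            io.stderr:write("forked task failed: ", tostring(err), "\n")
        end
        -- A coroutine that is still "suspended" here is waiting for I/O;
        -- the event loop would resume it later.
    end
end

-- Usage: the handler does not wait for the forked function.
local log = {}
local function handler()
    log[#log + 1] = "code1"
    fork(function() log[#log + 1] = "forked" end)
    log[#log + 1] = "code2"
end

handler()
run_forked()
print(table.concat(log, " ")) -- prints "code1 code2 forked"
```

Because the framework, not the user, calls `coroutine.resume`, a yield from an I/O function inside a forked function always lands in framework code.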

Renato Maia

Jul 20, 2025, 5:00:36 PM
to lu...@googlegroups.com
> I would like this to work, but for that to work, I would probably need the ability to resume to the
> "outermost" coroutine, as they're being nested here.

You want to resume to the "outermost" coroutine to avoid breaking
other coroutines running in the stack, right? Consider the following
code as an illustration of the issue. The `system.listdir` plays the
role of your yielding I/O function `do_io_operation`, and the
`system.run` is the Lua equivalent of your `run_my_event_loop`.

```lua
local system = require "coutil.system"

local function combine_files_from(dir1, dir2)
    return coroutine.wrap( -- coroutine C2
        function () -- function F2
            for path1 in system.listdir(dir1) do
                for path2 in system.listdir(dir2) do
                    coroutine.yield(path1, path2)
                end
            end
        end
    )
end

coroutine.resume(
    coroutine.create( -- coroutine C1
        function() -- function F1
            for path1, path2 in combine_files_from("/tmp/dir1", "/tmp/dir2") do
                print(path1, path2)
            end
        end)
)

system.run()
```

If we execute this code, when the first `system.listdir` is called we
would have a stack with something more or less like:

```
[main thread]
  - call function 'coroutine.resume' (or 'lua_resume' underneath)
    [coroutine C1]
      - call function F1
      - call function by 'coroutine.wrap' (or 'lua_resume' underneath)
        [coroutine C2]
          - call function F2
          - call function 'system.listdir' (or 'lua_yield' underneath)
```

Since `system.listdir` simply yields the current coroutine without any
values, this will break the "yield signature" of the iterator
coroutine C2 created by `combine_files_from`, which should yield two
path values until the end of its iterations. Thus you would like to
skip C2 and instead resume C1 that does not yield values to the main
thread, so it would be compatible with the yield from
`system.listdir`.
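
The broken "yield signature" can be reproduced without any library. In this sketch, `fake_io` is a hypothetical stand-in for a yielding I/O function such as `system.listdir`, and `numbers` is a hypothetical iterator factory in the style of `combine_files_from`:

```lua
-- Stand-in for a yielding I/O function (hypothetical): it yields no values,
-- expecting an event loop to resume it later with the result.
local function fake_io()
    return coroutine.yield()
end

local function numbers()
    return coroutine.wrap(function()
        for i = 1, 3 do
            fake_io()          -- this yield is meant for the event loop...
            coroutine.yield(i) -- ...this one is meant for the `for` loop
        end
    end)
end

local seen = {}
for n in numbers() do
    seen[#seen + 1] = n
end
print(#seen) -- prints 0
```

The generic `for` receives the empty yield from `fake_io` first, reads it as `nil`, and ends the iteration before any value is produced.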

Unlike your original post that guarantees that `usercode.lua` always
runs on a coroutine with a yield signature that you can honor from
`do_io_operation`, it is easy to see in the example above that C1
could be modified to yield values, or even not be created altogether
if we move the `for` to the script's main chunk. So there might not
always be a coroutine in the stack that you can yield to without
breaking its "yield signature".

Anyway, even if you can make sure in your particular code that all
coroutines are rooted in the main thread by a `lua_resume` that you
control and therefore can safely yield to that point, I don't believe
it is possible in Lua right now to skip some coroutines when yielding.
But there might be a solution in some future versions of Lua as hinted
[here](https://groups.google.com/g/lua-l/c/TmsDXBslnxk/m/ZMRAeUFZBgAJ).

> Please note that I'm on LuaJIT 2.1 (Lua 5.1), so no `lua_yieldk` and the like.

What I would do to deal with this, which you might be able to
reproduce with old Lua, is to avoid using "yield signatures"
altogether. Or more precisely, design your coroutines and functions
implementing multithreading to rely on the pattern described
[here](https://github.com/renatomaia/coutil/blob/master/doc/manual.md#await-function).

It is fine for coroutines to rely on yielding values to their immediate
resumer (like the ones from the chapter about coroutines in the PiL
book), as long as they don't call functions like `system.listdir` or
your `do_io_operation`, which should be documented as functions that
might yield and thus break the caller's "yield signature". For instance,
the script above will work as expected if `system.listdir` executes in
blocking mode and does not yield (which can be done by passing `"~"` as
the second argument, as described
[here](https://github.com/renatomaia/coutil/blob/master/doc/manual.md#systemlistdir-path--mode)).

But for coroutines that we want to be able to yield at "more
arbitrary" points (for this non-blocking I/O support for instance),
stick to the "yields no values" signature and avoid using the standard
coroutine resume/yield. Instead, use other mechanisms to switch
execution and exchange data between coroutines. For instance, the same
functionality of the example above could be rewritten as follows:

```lua
local event = require "coutil.event"
local system = require "coutil.system"

local function combine_files_from(dir1, dir2) -- factory of coroutines C2
    local e = {} -- unique value as the event (could be coroutine C2 below as well)
    coroutine.resume(
        coroutine.create( -- coroutine C2
            function () -- function F2
                for path1 in system.listdir(dir1) do
                    for path2 in system.listdir(dir2) do
                        event.emitone(e, path1, path2)
                    end
                end
                event.emitone(e, nil) -- to signal iterator's end
            end
        )
    )
    return function () return select(2, event.await(e)) end
end

coroutine.resume(
    coroutine.create( -- coroutine C1
        function() -- function F1
            for path1, path2 in combine_files_from("/tmp/dir1", "/tmp/dir2") do
                print(path1, path2)
            end
        end
    )
)

system.run()
```

Here the iterator coroutine using resume/yield is replaced by a
"coroutine thread" that sometimes suspends on calls to the I/O
function and other times produces values as an event for the iterator
to consume. The iterator returned by `combine_files_from` now just
repeatedly suspends, waiting for the events.
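
For readers without coutil at hand, the same idea can be sketched in plain Lua. This is not coutil's implementation: `await`, `emitone`, `fake_io`, `run` and `numbers` are hypothetical stand-ins, events are dropped when nobody is waiting, and `await` returns the emitted values directly.

```lua
local waiting = {} -- event -> list of coroutines suspended in await(event)

local function await(e)
    local list = waiting[e] or {}
    waiting[e] = list
    list[#list + 1] = coroutine.running()
    return coroutine.yield() -- always yields no values
end

local function emitone(e, ...)
    local list = waiting[e]
    if not list or #list == 0 then return false end -- nobody waiting
    local co = table.remove(list, 1)
    assert(coroutine.resume(co, ...)) -- hand the values to the awaiter
    return true
end

-- Fake non-blocking I/O plus a toy event loop (both hypothetical).
local ready = {}
local function fake_io(value)
    ready[#ready + 1] = { co = coroutine.running(), value = value }
    return coroutine.yield() -- yields no values as well
end

local function run()
    while #ready > 0 do
        local task = table.remove(ready, 1)
        assert(coroutine.resume(task.co, task.value))
    end
end

local function numbers()
    local e = {} -- unique value used as the event
    local producer = coroutine.create(function()
        for i = 1, 3 do
            emitone(e, fake_io(i * 10))
        end
        emitone(e, nil) -- signals the end
    end)
    assert(coroutine.resume(producer))
    return function() return await(e) end
end

local seen = {}
local consumer = coroutine.create(function()
    local next_value = numbers()
    while true do
        local n = next_value()
        if n == nil then break end
        seen[#seen + 1] = n
    end
end)
assert(coroutine.resume(consumer))
run()
print(table.concat(seen, ",")) -- prints "10,20,30"
```

Every yield in this sketch carries no values, so it does not matter which coroutine happens to be the resumer. A plain `while` loop is used instead of a generic `for` so that the sketch should also run on stock Lua 5.1, which cannot yield from inside a `for` iterator call; LuaJIT and Lua 5.2+ do not have that restriction.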

The `event.*` functions are implemented as a Lua script
[here](https://github.com/renatomaia/coutil/blob/master/lua/coutil/event.lua),
and I believe it should work in Lua 5.1 with very few changes. Its
documentation is
[here](https://github.com/renatomaia/coutil/blob/master/doc/manual.md#events).

Also, if you're interested, see
[here](https://groups.google.com/g/lua-l/c/TmsDXBslnxk/m/a-eCIIdsAQAJ)
for another example of this solution approach for this issue.

--
Renato Maia

LordMZTE

Jul 21, 2025, 9:10:54 AM
to lu...@googlegroups.com
Thank you for that detailed answer!

My main concern currently isn't really how the `usercode.lua` script could be modified in order to
support this API, but more that this code isn't written by me, but by an unsuspecting user - it
should ideally "just work" without the user even having to understand what's going on with
coroutines here.

I do have a somewhat cursed idea of how to possibly achieve this - my actual code is written in Zig,
and [Zig should get async again soon](https://github.com/ziglang/zig/issues/23446). This works by
saving stack frames, jumping out of a function and restoring the stack frame and jumping back in
later. Perhaps I could simply create a Lua thread (to have a separate stack) and then use this
method of suspending my `do_io_operation` function without even using Lua's own coroutines. I have
thought about this a little and I'm, let's say, semi-convinced that it should work, considering that
to the Lua library it should look just like some strangely-ordered API calls using multiple threads.

Something similar should already be possible to pull off with C++20 coroutines. Do we have any
record of someone having tried that before?

Cheers
Moritz

bil til

Jul 22, 2025, 2:35:24 AM
to lu...@googlegroups.com
On Mon, Jul 21, 2025 at 15:10, LordMZTE <lo...@mzte.de> wrote:
> My main concern currently isn't really how the `usercode.lua` script could be modified in order to
> support this API, but more that this code isn't written by me, but by an unsuspecting user - it
> should ideally "just work" without the user even having to understand what's going on with
> coroutines here.

But you do know Roberto's book "Programming in Lua", especially
chapter 24 "Coroutines", and also chapter 33 "Threads and states" if
you are considering your own yield/resume handling?

Such things are discussed there at length (if I understand your
application problem correctly; I did not have time to read through
all of your text, sorry).

Rett Berg

Jul 23, 2025, 5:02:10 PM
to lua-l
My fd library solves this by running the C code in a separate thread. The separate thread stores any output values in a struct and updates a boolean to mark completion. All yielding logic happens in pure Lua, the C code never blocks when the file is in "async" mode.


Basically the "FDT" (file-descriptor-thread) type handles running file operations on a separate thread and communicating state.

Note: There's also the "pollList" which allows access to the unix poll method for filedescriptors, but that might be beside the point for your use-case.

- Rett

Thijs Schreijer

Jul 24, 2025, 1:35:25 PM
to lu...@googlegroups.com
On Wed, 23 Jul 2025, at 23:02, Rett Berg wrote:
> My fd library solves this by running the C code in a separate thread. The separate thread stores any output values in a struct and updates a boolean to mark completion. All yielding logic happens in pure Lua, the C code never blocks when the file is in "async" mode.
>
> Basically the "FDT" (file-descriptor-thread) type handles running file operations on a separate thread and communicating state.
>
> Note: There's also the "pollList" which allows access to the unix poll method for filedescriptors, but that might be beside the point for your use-case.
>
> - Rett

If that is an option then also Copas-Async [1] might be an option. Copas is a coroutine scheduler, and Copas-Async is an add-on that uses LuaLanes under the hood to do sync stuff like this without blocking the coroutine loop. So this would be a more generic version of what Rett mentioned.

Thijs
