On Friday, April 26, 2024 at 6:05:18 AM UTC-3 Thijs Schreijer wrote:
> This works in the first test. In the second test the sleep is called
> from the fibonacci-producer. And this is where the coroutine stack
> comes into play. The sleep function now yields and passes the delay.
> But this doesn't end up in the main scheduler loop (2 levels up the stack),
> but instead ends up in the fibonacci-counter (1 level up the stack).
> ...
I'm pretty late to this discussion, and I'm probably missing the general issue here, but for the particular problem in this example my considerations are:
- When someone resumes a coroutine passing or expecting some values, they are defining a contract with the coroutine, much like when we call a function. Just like with functions, we should expect that changing the coroutine's contract will likely break the code it interacts with. When we add a call to a yielding function like `sleep` in the `fibonacciProducer` coroutine, we basically change its "coroutine contract", since it is now resumed by clients expecting incompatible contracts (e.g. the code printing numbers and the scheduler waking up coroutines).
- The best approach I found to provide such "flexibility" is to adopt the most minimal contract possible. For instance, avoid passing or expecting values altogether when we want the coroutine to be resumed by, and yield to, multiple clients/threads. To illustrate this using your example, you can change the `sleep` function to avoid yielding values to the `run_scheduler` like:
```diff
local function sleep(delay) -- delay in ticks
if type(delay) ~= "number" then
error("bad argument #1 for 'sleep' expected number, got: " .. type(delay), 2)
end
- delay = math.max(0, math.floor(delay))
- coroutine.yield(delay)
+ local coro = coroutine.running()
+ tasks[coro] = math.max(0, math.floor(delay))
+ coroutine.yield()
end
```
And also change the scheduler to not expect a yielded value:
```diff
else
-- we're up!
- success, new_delay = coroutine.resume(coro)
+ local success, err = coroutine.resume(coro)
if not success then
- print("error resuming coroutine: ", new_delay)
+ print("error resuming coroutine: ", err)
end
- if coroutine.status(coro) ~= "dead" then
- tasks[coro] = new_delay
- end
end
end
end
```
Now we only need to find a way for the code that prints the numbers to communicate with the `fibonacciProducer` coroutine without passing values through `resume` and `yield` as well. For instance, we can use the `queue` object below:
```lua
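local queue = { thread = nil, value = nil }
do
  -- Resume whichever coroutine is waiting on the queue.
  local function resume(self, ...)
    local thread = self.thread
    self.thread = nil
    coroutine.resume(thread, ...)
  end
  -- Register the running coroutine as the waiter and yield without values.
  local function suspend(self)
    self.thread = coroutine.running()
    return coroutine.yield()
  end
  function queue:push(value)
    if self.thread then
      -- a consumer is already waiting in `pop`: hand the value over
      resume(self, value)
    else
      -- nobody is waiting: store the value and wait for a consumer
      self.value = value
      suspend(self)
    end
  end
  function queue:pop()
    if self.thread then
      -- a producer is waiting in `push`: take its value and wake it up
      local value = self.value
      resume(self)
      return value
    end
    -- nothing available yet: wait for a producer
    return suspend(self)
  end
end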
```
And then we change the code to use it:
```diff
@@ -71,7 +101,7 @@
local function fibCoroutine()
local a, b = 0, 1
for i = 1, n do
- coroutine.yield(a) -- Yield the current Fibonacci number
+ queue:push(a) -- Yield the current Fibonacci number
a, b = b, a + b -- Update to the next Fibonacci number
sleep(number_ticks) -- yield to allow other threads to run <-- SLEEP WAS MOVED HERE
end
@@ -83,9 +113,10 @@
add_task(function()
local i = 0
local fibo = fibonacciProducer(100)
+ coroutine.resume(fibo)
while i < 10 do
i = i + 1
- print("fibonacci:", coroutine.resume(fibo)) -- do work
+ print("fibonacci:", queue:pop()) -- do work
end
end, number_ticks)
```
Note that now we resume the `fibo` coroutine ignoring any values yielded. `fibo` could be on a `sleep`, `queue:push` or any other yielding operation that registers it to be resumed later.
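As a rough, self-contained sketch of how these pieces fit together, the script below combines the modified `sleep`, a scheduler that ignores yielded values, and the `queue` object. The bodies of `tasks`, `add_task` and `run_scheduler` are assumptions reconstructed for illustration (a tick is just one pass of the scheduler loop), not the code from the original test, and `results` is a hypothetical table added only so the outcome can be inspected.

```lua
-- Sketch only: the bodies of `tasks`, `add_task` and `run_scheduler` are
-- assumptions reconstructed for illustration, not the original test code.
local tasks = {} -- coroutine -> ticks left before the scheduler resumes it

local function add_task(fn, delay)
  tasks[coroutine.create(fn)] = delay or 0
end

-- Registers the running coroutine and yields WITHOUT values, so it does
-- not matter who resumed it last or who resumes it next.
local function sleep(delay)
  tasks[coroutine.running()] = math.max(0, math.floor(delay))
  coroutine.yield()
end

local function run_scheduler()
  while next(tasks) do
    local due = {}
    for coro, delay in pairs(tasks) do
      if delay > 0 then
        tasks[coro] = delay - 1
      else
        due[#due + 1] = coro
      end
    end
    -- Resume outside the traversal: a resumed coroutine may add new tasks.
    for _, coro in ipairs(due) do
      tasks[coro] = nil -- `sleep` registers it again if needed
      local ok, err = coroutine.resume(coro)
      if not ok then print("error resuming coroutine:", err) end
    end
  end
end

-- Same logic as the `queue` object shown earlier.
local queue = { thread = nil, value = nil }
do
  local function resume(self, ...)
    local thread = self.thread
    self.thread = nil
    coroutine.resume(thread, ...)
  end
  local function suspend(self)
    self.thread = coroutine.running()
    return coroutine.yield()
  end
  function queue:push(value)
    if self.thread then
      resume(self, value)
    else
      self.value = value
      suspend(self)
    end
  end
  function queue:pop()
    if self.thread then
      local value = self.value
      resume(self)
      return value
    end
    return suspend(self)
  end
end

local results = {} -- hypothetical: values seen by the consumer

add_task(function()
  local fibo = coroutine.create(function()
    local a, b = 0, 1
    while true do
      queue:push(a)
      a, b = b, a + b
      sleep(2) -- yields to whoever resumed us; no values involved
    end
  end)
  coroutine.resume(fibo) -- start the producer; ignore what it yields
  for _ = 1, 10 do
    local value = queue:pop()
    results[#results + 1] = value
    print("fibonacci:", value)
  end
end, 1)

run_scheduler()
```

The producer is sometimes resumed by the consumer (inside `queue:pop`) and sometimes by the scheduler (after `sleep`), and it works either way because neither resumer passes or expects values.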
I wrote the multi-threading library Coutil (https://github.com/renatomaia/coutil) following this basic approach. With Coutil your example could be rewritten as:
```lua
local channel = require "coutil.channel"
local system = require "coutil.system"
print("\nStarting second test using nested coroutines\n")
coroutine.resume(coroutine.create(function ()
  -- Fibonacci counter thread
  coroutine.resume(coroutine.create(function ()
    system.suspend(.1) -- emulate the `add_task` initial "sleep"
    -- Fibonacci number generator thread
    coroutine.resume(coroutine.create(function ()
      local ch<close> = channel.create("fibonacci")
      local a, b = 0, 1
      for i = 1, 10 do
        system.awaitch(ch, "in", a) -- Yield the current Fibonacci number
        a, b = b, a + b -- Update to the next Fibonacci number
        system.suspend(.1) -- yield to allow other threads to run <-- SLEEP WAS MOVED HERE
      end
    end))
    local ch<close> = channel.create("fibonacci")
    for i = 1, 10 do
      print("fibonacci:", system.awaitch(ch, "out")) -- do work
    end
  end))
  -- character counter thread
  coroutine.resume(coroutine.create(function ()
    system.suspend(.2) -- emulate the `add_task` initial "sleep"
    for i = 1, 3 do
      print("character:", string.char(64+i)) -- do work
      system.suspend(.2) -- wait till we're up again
    end
  end))
end))
system.run()
```
You can move all coroutines to the main chunk (thus avoiding nesting coroutines) and the script should behave the same.
I used the `coroutine.resume(coroutine.create(fun...end))` pattern to be clear about what's going on. But ideally you would use something like `coutil.spawn.catch(print, fun...end)`, which wraps the coroutine's function in an error handler that prints any errors regardless of how the coroutine is resumed.
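To illustrate the idea of an error handler that travels with the coroutine, here is a minimal plain-Lua sketch. `spawn_catch` and `record` are hypothetical names made up for this example; this is not Coutil's implementation, only an approximation of the behaviour described, and it assumes Lua 5.2 or later (where a coroutine can yield inside `xpcall`).

```lua
-- Hypothetical plain-Lua helper (NOT Coutil's implementation): runs `f`
-- in a new coroutine under xpcall, so `handler` sees any error no matter
-- which piece of code happens to resume the coroutine later.
local function spawn_catch(handler, f, ...)
  local co = coroutine.create(function(...)
    return xpcall(f, handler, ...)
  end)
  coroutine.resume(co, ...)
  return co
end

local errors = {} -- hypothetical: collected error messages
local function record(msg)
  errors[#errors + 1] = msg
  print("caught:", msg)
end

local co = spawn_catch(record, function()
  coroutine.yield()  -- suspend; anyone may resume us
  error("boom", 0)   -- level 0: no position information is added
end)

-- Some unrelated resumer that does no error handling of its own:
coroutine.resume(co)
```

The resumer on the last line never inspects the results of `coroutine.resume`, yet the error still reaches `record`, because the handler is installed inside the coroutine rather than at the resume site.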
Finally, the use of `coutil.channel` here is not ideal because it is designed for communication among system threads. In particular, if the `fibo` coroutine produces more values than are consumed, the coroutine will wait on the channel forever, in case someone outside the Lua state consumes the values. To let `fibo` be collected when suspended, you can use the `queue` object written above instead of the values created by `coutil.channel`.
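The collection behaviour can be checked with a small sketch like the one below. `new_queue` is a hypothetical constructor with the same logic as the `queue` object (so each producer gets its own instance), and `alive` and `abandon_producer` are names invented for this example. It assumes a standard Lua 5.3/5.4 interpreter, where a suspended coroutine that is referenced only by its own queue is ordinary garbage.

```lua
-- Hypothetical constructor: same logic as the `queue` object, one
-- instance per producer/consumer pair.
local function new_queue()
  local q = { thread = nil, value = nil }
  local function resume(...)
    local thread = q.thread
    q.thread = nil
    coroutine.resume(thread, ...)
  end
  local function suspend()
    q.thread = coroutine.running()
    return coroutine.yield()
  end
  function q:push(value)
    if self.thread then
      resume(value)
    else
      self.value = value
      suspend()
    end
  end
  function q:pop()
    if self.thread then
      local value = self.value
      resume()
      return value
    end
    return suspend()
  end
  return q
end

-- Weak-keyed table: an entry disappears once its coroutine is collected.
local alive = setmetatable({}, { __mode = "k" })

local function abandon_producer()
  local q = new_queue()
  local producer = coroutine.create(function()
    for i = 1, 100 do q:push(i) end -- produces more than is consumed
  end)
  alive[producer] = true
  coroutine.resume(producer) -- producer suspends inside q:push(1)
  local value = q:pop()      -- consume one value only
  return value               -- `q` and `producer` become unreachable
end

local first = abandon_producer()
collectgarbage()
collectgarbage()
local collected = next(alive) == nil
print("first value:", first, "producer collected:", collected)
```

The producer is still suspended in `q:push` when `abandon_producer` returns, but since only the queue refers to it (and only the producer refers to the queue), the pair forms an unreachable cycle that the garbage collector can reclaim.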
In summary, I fail to understand the actual issue a `coroutine.{goto,yieldto}` is trying to solve from this example.
> But if you write an application and compose it of multiple libraries, then
> the requirement is that all those libraries use the same mechanism, which
> is something users cannot rely on nor check. Hence the only way to properly
> solve this, is by providing a solution in Lua itself.
Coutil adopts a simple and probably sensible approach that could be shared by others to allow interoperability: if your coroutine wants to suspend from anywhere and be resumed from anywhere, place as few requirements on the caller/resumer as possible (e.g. no values, no error handling, can be resumed from anywhere, anytime). We adapted the `sleep` function from the example to follow part of this approach, and we could use it with the `queue` object defined above, and likely with all Coutil functions.
One tricky part might be how to mix multiple event loops, like `run_scheduler` and `coutil.system.run` for instance. One possibility with Coutil is to run them in different threads and propagate events from the other loop to the `coutil.system.run` event loop using one of Coutil's thread communication mechanisms, like in this example using "state coroutines":
https://github.com/renatomaia/coutil/blob/master/demo/stateco.lua
Coutil exposes most, if not all, of the features from https://libuv.org/, so you should be able to write code similar to these examples that not only waits for some time, as with `sleep`, but can also wait on files, sockets, processes, threads, and more. And with some effort, you might even integrate other libraries with it when necessary.
--
Renato Maia