Control.
=====
Felix programs begin with a main thread for which an scheduler is constructed.
The thread registers with the thread control object.
The main thread may complete operation before other threads however
it does not return control until all other threads have terminated.
The scheduler consists of two parts. The async and sync schedulers.
The operation begins by the construction of a scheduler data object which
is initially an empty set of active fibres and a single running fibre which
executes a continuation of the initial procedure representing the user program.
New fibres can be spawned by the initial fibre. In principle the spawner and spawnee
are added to the active set and on is picked by the scheduler to as the currently
running fibre, however Felix always selects a newly constructed fibre so that
a spawn has the same behaviour as a subroutine call.
A fibre may also launch a coroutine as a process. In this case a new pthread
is created with a scheduler, but it spawnee and spawner share the same active
set. However now, two pthreads are pulling fibres out of the active set to
run instead of one, so the running fibres are now executing concurrently.
The initial scheduler cannot return until the active set is empty and
all processes, including itself, are hung waiting, and, there are no
asynchronous events pending for the (now shared) scheduler data
object. Spawned process scheduler also cannot terminate until
these conditions are met. When there is no work to do the spawned
processes terminate and the main (initial) scheduler then returns.
A fibre can also run a new scheduler. In this case a new scheduler and
scheduler data object is created and runs the initial coroutine as a fibre by a
subroutine call which only returns control when there is no work to do,
as above. Note that during the call the launcher fibre remains the currently
running fibre which suspends the scheduler which is running it.
However another process thread for that schedulers data object
sharing it may continue processing fibres in the scheduler data object.
In other words schedulers can be stacked, with a single thread running
the top of the stack, and the rest of the schedulers in the stack suspended
until it returns, popping the stack. Each of these schedulers is associated
with a distinct scheduler data object (however any one may still be active
if there are other threads sharing access).
Memory
======
Felix manages most objects using a world stop mark sweep garbage collector.
The key to understanding the memory management system is to understand
reachability of objects. Unreachable objects are occasionally reaped by the
collector.
The set of active fibres of a scheduler data object are reachable if one or more
schedulers sharing that data object are reachable. The running fibre of
a scheduler is reachable if the scheduler is reachable. A channel is reachable
if one or more reachable fibres can reach it, and a fibre is also reachable
if a channel on which it is suspended is reachable.
When all the fibres of a scheduler data object are neither running,
active, nor rooted by a pending asynchronous event, there is no work
for that data object or any scheduler sharing the data object, then if no
channels are known other than by fibres of that object, all the fibres
and channels can be deleted. Note that this condition is is best achieved
if channels are only known to fibres of the scheduler object group.
If a channel is saved in a global variable, the channel and fibres
suspended on it cannot be reaped. in the usual case, after construction
of a network, only the fibres which read or write channels should know them.
NOTE: Go-lang gets the correct operation of channels completely wrong.
Felix gets it right. It is not possible to test a channel to see if there is data
available, nor to test if there is a reader for data being written. If there
is no data a reader will suspend permanently: the need to read is
called hunger, and the eternal lack of supply called starvation.
Similarly the need to write when no reader will come is
permanent blockage (constipation .. :-)
Starvation and blockage are the CORRECT way to terminate a fibre
because provided channels are anonymous, it makes the channels
and fibres unreachable.
Asynchronous Events
=================
Felix handles asynchronous events by removing the calling fibre
as the running fibre, making it a root, and handing the fibre
over to an asynchronous event service thread. These are not Felix threads,
Felix supports two async operations at present.
A clock object provides an interface to a thread that manages timed
sleeps for fibres. The thread organises for the OS to signal it when
the earliest timeout will occur, and then unroots the fibre and places
it back on its original scheduler data object. Felix provides a single
system clock, but the user can create any number of clocks.
Each clock runs a separate thread.
CAVEAT: during garbage collection, the final action or returning
a fibre to its original scheduler data object must be inhibited
because the state of the schedulers and data objects must remain
unchanged or the GC may fail. [I don’t think I am checking this!
CHECK AND FIXME IF NECESSARY. Locking the root stl map object
may suffice]
Felix also provides a thread to manage socket events using the OS
native socket notification service.
Synchronisation
============
Channel I/O operations are atomic, and use system spinlocks to ensure atomicity.
Channel I/O can be across threads and schedulers. One machine word is transfered
each time a reader and writer match up. No allocations occur. Channels are the primary
synchronisation device which enable high speed user space context switching
between fibres.
Non-preemptable spinlocks ensure correct operation if available. A pre-emption
in the middle of a spinlock mayl leave on CPU spinning for an unbounded time.
System mutex can be used with caution. If there is a contention, it should
be rapidly resolved by the lock holder. A pre-emption is possible and will
delay lock release. Allocations are NOT permitted inside a system spinlock
because they may trigger a GC which in turn calls for a world stop which
the contending thread cannot respond to because it is waiting on the
lock release. Felix provides a GC aware mutex which is actually a slow
spinlock. After a period, if the lock has not been release, the attempt
at acquisition is abandoned, the world stop flag checked, and if not
set, the a timed wait is again performed on the lock. In addition,
the Felix lock registers the timed wait for notification by a signal if
a world stop is called for, to wake it up more quickly to check
the world stop flag. A thread detecting a world stop suspends until
notified the GC cycle is complete, saving the bounds of it machine
stack in the thread control object so that the GC may conservatively
scan it. This allows allocations in functions, which would not otherwise
allow them. The suspension also save the machine registers on
the stack first, using setjump().
Felix provides condition variables using the same machinery.
Note that locks and condition variables lock the WHOLE PTHREAD not just
a fibre.
Real time threads
==============
A thread can use OS facilities to put it in real time mode.
This usually consists of requesting a certain amount of CPU
with a given time period. The OS should provide that amout
of CPU withing ANY such time period.
A HARD realtime thread also disables preemptions.
Real time threads MUST NOT respond to world stop requests
and so cannot use Felix mutex or condition variables.
RT threads can synchronise with ordinary threads using condition variables:
(a) if the waiter is RT, a standard condition variable is used
(b) if the waiter is Felix, a Felix condition variable is used
This is because signalling is a non-blocking operation and does not involve
a delay or world stop, the signaller does not need to hold the a lock to signal,
although it does need to hold a lock to set the condition being signalled.
[INVESTIGATE!] The same lock must be used. The crucial thing is NOT the holding
of the lock, but (a) atomicity of setting the condition and (b) a barrier to ensure the
condition propagates to the other thread when it comes out of its wait.
NOTES
======
The spinning of condition variables to check world stop should NOT be necessary because
they should be registering for a signal on world stop.
Locks on the other hand are a pain.In principle a mutex is a device which is used to serialise
selected operations so that these operations do not interleave with certain other operations
of other threads. In principle, the period for which a thread holds a lock must be bounded.
In this sense, world stop checks should not be needed for locks.
What this means is that allocations done whilst holding a lock should not trigger a GC.
Unfortunately whilst a no-collecion can be enforced by an explicit memory request,
some memory requests, such as closure formation, are implicit.
More generally the REAL pain with Felix world stop is a fault in OS design:
if a collection is required the collector should be able to forcibly freeze all
threads. Posix does not support this. Llinux, however, does provide thread
stopping signals.
Purpose
======
The main purpose of this article is to highlight the current Felix architecture.
At present, some data like the kind of thread is stored in the scheduler
instead of being associated with a thread. Therefore introducting a “realitime”
thread type is not going to work properly, because the data is in the wrong place.
In addition, code in a coroutine cannot access a lot of data because the pointers
go in the wrong direction: a fibre owns a continuation, but a continuation
cannot find its fibre. Similarly it cannot directly find its scheduler or scheduler
data object. Every continuation and fibre is owned by a particular scheduler
data object however it can be shared by scheduler and thread so it doesn’t
make sense for a coroutine to discover its thread, since the property
is transient not persistent.
—
John Skaller
ska...@internode.on.net