Multithreaded filesystem using cooperative scheduling

Marven Lee

unread,

Sep 16, 2011, 6:06:01 AM9/16/11

to

I started thinking about send/receive message passing
microkernels again instead of the migraring thraad/protection
ring ideas that I've often considered. This lead me to think
about how to add concurrency to a single threaded filesystem.
From reading old posts I gather there was a multithreaded
filesystem for Minix but couldn't find out anything about it.

So a few ways I thought:

1) State machine.. FS queues messages and handle them
like a state machine, but that gets omplicated espeically
with lots of places to block, nesting and looping in each
request.

2) Multithreaded FS. Each thread handles a different request,
needs kernel-mode threads and a number of sync primitives.

3) Migrating Threads. Allow a thread to cross from one
protection domain to another. More or less like call-gates and
protection rings. This is what I tended to concentrate on. It still
needs sync primitives like mutexes/condvars or sleep/wakeup
mechanisms.

So last night I did some reading of old posts hoping to find info
about Minix's multithreaded FS when I thought of cooperative
multitasking on a single server thread.

Filesystem runs on single thread but uses separate stacks for each
request and cooperatively schedules them by yielding. Filesystem
then acts like a giant monitor/condition variable. Might be similar
to Unix's sleep()/wakeup() and having only one thread in the
kernel at a time.

Having a stack is much easier than a state machine and storing
state in a separate structure for each request, makes it easier
to have loops and nesting of code. In Pseudocode it might
look something like this:

// Main co-op thread, responsible for receiving messages
// from clients and sending and receiving requests to
// device drivers. Manages a list of io requests created
// by

void Main()
{
for (;;)
{
msg = Receive(RECEIVE_ANY);

if (msg.type == client_request)
{
// Alloc co-op thread and stack, simple
// with a static table of Threads/stacks,
//
// Copy msg onto stack
// Initialse thread
// Place thread on co-op list of threads.

Yield(); // Starts running co-op threads.

// Optimization would run only this new
// co-op thread instead of all
}
else if (msg.type == driver_reply)
{
// Start running the co-op threads that handle the
// requests.

Yield();
}

// Process list of IORequests (read_block/write_block etc)
// created by the other co-op threads.
// Could remove redundant IORequests wantint to read
// same block.

for (each IO Request)
{
Send (driver, IORequst); // Non blocking Send.
}
}
}

// A co-op thread/stack is created to handle a Read request

void Read (struct Msg msg)
{
// Used for sending ReadBlock/WriteBlock requests to
// main() which then deals with sending the real requests
// to the devices.

struct IORequest ioreq;

if (block->in_cache == true)
{
MemCopy_To_Client();
ReplyMsg();
}
else
{
ioreq.device = xyz;
ioreq.block = block_nr;
EnqueueIORequest (&ioreq);

While (block->in_cache == false)
{
Yield();
}

MemCopy_To_Client();
ReplyMsg();
}

// The wrapper code upon return will remove this
// thread from the co-op list and free the stack.
}

It is pseudo-code so things like Read() would be more
complicated, it would be a loop until nbytes are read
for example.

Needs co-op thread for each filesystem request, so possibly
statically allocate co-op thread/stack for each real thread/
process supported by OS. Could possibly do it with fewer
co-op threads, just need to figure out where to queue the
requests.

Some optimizations possible, don't run through all threads when
placing new thread on co-op list. Code above wakes up
all co-op threads when a message is returned from a driver,
they then have to see if the message was from them, otherwise
they Yield(). It is analogous to dealing with level-triggered
interrupts or a monitor/condition-variable using CondBroadcast()
to wake up all threads.

Could maintain queues of threads for each cache block and
only wake up those co-op threads whose data has arrived,
a bit like having a condition variable for each cache block.

Maybe it is still a bottleneck with only one real thread running
lots of cooperative threads. Maybe it is possible to share some
of the cooperative threads among several kernel threads, but that
would get a lot more complicated and need more sync primitives.

PS: Haven't thought it through completely, thought about it last
night and wrote it down :)

--
Marv

wolfgang kern

unread,

Sep 16, 2011, 4:58:54 PM9/16/11

to

Your ideas about may not be wrong at all, but if you just got a
single core CPU (somehow rare anyway today) then I'd renounce of
msg-passing-interrupt scheduls and chose a time-sliced OS instead.
__
wolfgang

Rod Pemberton

unread,

Sep 17, 2011, 3:37:46 AM9/17/11

to

"Marven Lee" <marv...@gmail.com> wrote in message
news:9dglbc...@mid.individual.net...
> [snip]

> From reading old posts I gather there was a multithreaded
> filesystem for Minix but couldn't find out anything about it.

> [snip]

> PS: Haven't thought it through completely, thought about it last
> night and wrote it down :)
>

Yeah, information overload ... I didn't read that too thoroughly. I may do
so later. I'm no help on the Minix query. What I know:

If a sector is in memory and is to be read, then it can be read repeatedly,
or read by multiple cores/threads/processes since they are all accessing
shared memory.

If a sector is in memory and is to be written, that's when you could have
conflicts. You can have multiple cores/threads/processes attempting to
write different versions of the same sector, instead of a combined
result. One core/process/thread must identify that the sector has changed
since it was read, fail to write, reload the updated sector, modify it, and
retry the write.

Multiple read or write requests directly to the hardware, instead of from/to
memory, will need to be serialized or sequentially passed.

Rod Pemberton

Marven Lee

unread,

Sep 17, 2011, 10:02:45 AM9/17/11

to

Wolfgang Kern wrote:
> Marven Lee wrote:
>> I started thinking about send/receive message passing
>> microkernels again instead of the migraring thraad/protection
>> ring ideas that I've often considered. This lead me to think
>> about how to add concurrency to a single threaded filesystem.
>> From reading old posts I gather there was a multithreaded
>> filesystem for Minix but couldn't find out anything about it.
>>

> Your ideas about may not be wrong at all, but if you just got a
> single core CPU (somehow rare anyway today) then I'd renounce of
> msg-passing-interrupt scheduls and chose a time-sliced OS instead.

I was thinking of how it could be done in a server with a single thread,
not throughout the whole OS. It was a little thought I had the other
night as I was thinking about my old OS and how I could restructure
it as I made a mess of the FS and haven't touched it for some time.

My OS had a kernel thread per mount point and a thread for each driver.
A mount point could only exist in the root directory and each mount point
had to manage its own cache. I believe the root directory was protected
by a reader-writer lock, read lock for opening, closing,reading and writing
files, write lock for mounting and unmounting but can't be sure. The last
time I did any serious coding on it was a few years ago so can't remember
what I was changing.

Actually the mount points could be either handled by a thread or just by
function calls. Mount points were like /con /fd0 and /fd0. With the mounts
being handled by separate threads then access to each mount point could be
handled in parallel, but each mount point was single threaded, so if the
mount point had to call a driver to read a block, no other thread could
access
the mount's cache at the same time.

The kernel was monolithic even though it had all these drivers and
filesystem mounts implemented as threads. I basically cloned the Amiga's
IPC mechanisms and driver interface but I'm sure there are still a lot of
differences.

So I've been thinking on and off about my OS while doing other things
like learning C# and WPF. If I were to rewrite my FS I wouldn't have
the mount points as separate threads, I'd probably structure the whole
FS as a giant monitor with condition variables to wait for cache blocks
to be loaded, possibly with driver threads performing the cond_signal to
wake up blocked threads or a separate thread to maintain the buffer
cache. I'm guessing using sleep/wakeup in Unix did something similar.
I guess I need to get a copy of The Magic Garden or The Design of the
Unix Operating System to understand things better.

I was also thinking of the migrating threads/call-gates/protection ring
way of structuring a kernel. I turned my attention back to message
passing and started thinking about Minix's single threaded servers
and such like. I'm sure there are lots of gotchas in what I posted, no real
multithreading using kernel threads, possible problems queuing messages
out to drivers, lots of stacks/co-op threads needed and such like. I had
been reading old posts from about 1991 that mentioned a multithreaded
filesystem server for Minix but never found out anything about it or how
it worked..

--
Marv

Marven Lee

unread,

Sep 18, 2011, 11:23:13 AM9/18/11

to

Marven Lee wrote:

> Wolfgang Kern wrote:
>> Your ideas about may not be wrong at all, but if you just got a
>> single core CPU (somehow rare anyway today) then I'd renounce of
>> msg-passing-interrupt scheduls and chose a time-sliced OS instead.
>
> I was thinking of how it could be done in a server with a single thread,
> not throughout the whole OS.

'Fibers' is the word I was looking for, couldn't remember what to call
cooperative threads running on a single real thread. When I first read
about Windows having them I couldn't think of a reason why they
would be useful.

--
Marv

Marven Lee

unread,

Sep 18, 2011, 11:48:58 AM9/18/11

to

Marven Lee

> 'Fibers' is the word I was looking for, couldn't remember what to call
> cooperative threads running on a single real thread. When I first read
> about Windows having them I couldn't think of a reason why they
> would be useful.

Did some more searching and found the paper "Multimedia support
for Minix 3" and it seems to describe a similar technique of using
fibers and non-preemptive scheduling to make the filesystem server
multithreaded.

--
Marv

Marven Lee

unread,

Sep 18, 2011, 1:01:20 PM9/18/11

to

Marven Lee

Or coroutines

It appears that QNX used coroutines in the FSys server, not sure
if they still do or if they've moved to real threads

As mentioned in this thread in comp.os.research on threads
and coroutines:

http://tinyurl.com/6avthka

Oh and AmigaDOS used coroutines as well it would seem:

http://tinyurl.com/5rtpdyn

Blast! I must've been the last person to know about filesystems
and coroutines. I thought I had worked something out that
nobidy else had thought of, lol.

Heck, I didn't know what a coroutine was until today, I always
thought they were some obscure thing to do with languages like
lisp or bcpl.

I'm going to give these coroutines some more thinking and
decide if I want to restart my OS programming and use them
in the FS.

--
Marv

Rod Pemberton

unread,

Sep 19, 2011, 4:20:32 AM9/19/11

to

"Marven Lee" <marv...@gmail.com> wrote in message

news:9dmmd9...@mid.individual.net...

> Marven Lee
> > Marven Lee
>
> >> 'Fibers' is the word I was looking for, couldn't remember what to call
> >> cooperative threads running on a single real thread. When I first read
> >> about Windows having them I couldn't think of a reason why they
> >> would be useful.
> >
> > Did some more searching and found the paper "Multimedia support
> > for Minix 3" and it seems to describe a similar technique of using
> > fibers and non-preemptive scheduling to make the filesystem server
> > multithreaded.
>
> Or coroutines
>

It may be the solution to your problem, but ... here is my "two cents". The
problem with coroutines is that they violate the principles of structured
programming. Structured programming only allows a single entry and exit to
a block of code such as a loop or procedure. With structured programming,
you cannot jump into and out of loops or procedures or code blocks at will
or repeatedly. Coroutines allow for multiple entry and exit. That means
you *can* jump into and out of code at will. "Jumping around" is the
fundamental characteristic of "spaghetti" code. This eventually leads to
programming errors. Non-structured "spaghetti" code plagues HLLs without
structured control-flow, and also poorly written assembly.

> It appears that QNX used coroutines in the FSys server, not sure
> if they still do or if they've moved to real threads
>
> As mentioned in this thread in comp.os.research on threads

> and coroutines: [link]

>
> Oh and AmigaDOS used coroutines as well it would seem:

> [link]
>

Unfortunately, neither of those say why coroutines were used, or how. They
could've just been merging two existing code pieces to work together,
instead of writing the driver from scratch. Or, not ...

Rod Pemberton

wolfgang kern

unread,

Sep 19, 2011, 5:36:40 AM9/19/11

to

Marven Lee wrote:

>>> 'Fibers' is the word I was looking for, couldn't remember what to call
>>> cooperative threads running on a single real thread. When I first read
>>> about Windows having them I couldn't think of a reason why they
>>> would be useful.

I have to commit that I never understood this 'threading' idea.
How can a single CPU do several things at the very same time ? :)

It either must be kind of multiplexing by timeslice/taskswitch
or merging two functions within one code path.
But the latter seems quite awful inpractible to me.

My Os is HW-event driven (timer,keybd,mouse,etc) w/o hw-taskswitches,
and of course the user get the feeling to see several things happen
in parallel, but as long I havent implemented multicore options, all
code lines will work 'one after the other' :)

But I also allow other 'tasks' like printout, net-communication and
similar during the otherwise wasted wait-period after a disk-command.
I got a global system job-queu as part of the main idle routine.

>> Did some more searching and found the paper "Multimedia support
>> for Minix 3" and it seems to describe a similar technique of using
>> fibers and non-preemptive scheduling to make the filesystem server
>> multithreaded.

> Or coroutines

Never heard about it, reminds me on 'ON a GOSUB a*b+t'.

SATA NCQ could be of help in what you search for ?
I still use SATA-drives in native mode, but NCQ is on my ToDo-list.
__
wolfgang

Marven Lee

unread,

Sep 19, 2011, 8:41:54 AM9/19/11

to

Rod Pemberton wrote:
> Marven Lee wrote:

>>> Marven Lee wrote:
>>>> 'Fibers' is the word I was looking for, couldn't remember what to call
>>>> cooperative threads running on a single real thread. When I first
>>>> read about Windows having them I couldn't think of a reason why
>>>> they would be useful.

[snip]

>> Or coroutines
>
> It may be the solution to your problem, but ... here is my "two cents".
> The problem with coroutines is that they violate the principles of
> structured programming. Structured programming only allows a single
> entry and exit to a block of code such as a loop or procedure. With
> structured programming, you cannot jump into and out of loops or
> procedures or code blocks at will or repeatedly. Coroutines allow
> for multiple entry and exit. That means you *can* jump into and out
> of code at will. "Jumping around" is the fundamental characteristic
> of "spaghetti" code. This eventually leads to programming errors.
> Non-structured "spaghetti" code plagues HLLs without structured
> control-flow, and also poorly written assembly.

I'm not too keen on coroutines either, I'd prefer to use normal
multithreading, but was thinking of solutions to use when multithreading
is limited. Even with multithreading I find things look complicated.

I was doing some thinking last week of the different ways to structure an
OS. Currently mine is monolithic but with drivers and filesystems as threads
within the kernel, these normally communicate with Amiga-style message
passing, ports and signals (which are a sleep/wakeup mechanism).

Technically drivers and filesystems don't need to be threads, they can just
be called on the context of the caller, it's upto the driver whether to use
a thread or not. The last time I touched the code it more or less worked
but I would've liked to allow threads to read from a cache while another
thread was blocked on IO to a driver. Sounds simple but it appears
difficult in this case due to the driver structure and different synch
primitives like message passing and condition variables being hard to work
together. Perhaps having the cache managed by it's own task and confining
everything to message passing would've made it work or I could've
completely redesigned the drivers and filesystems not to use threads..

I imagine early Unix's sleep/wakeup mechanisms and no-preemption in the
kernel was a bit like using coroutines and simpler than using threaded
drivers and other sync primitives.

As I said I was thinking of other ways to structure an OS, not that I'm
going to start developing my OS again, just wanted to thiink about things.
Originally I was thinking about OSes structured as protection rings and
subjects unrelated to filesystems. Then turned my attention to message
passing microkernels and how they could support some form of
multithreading/concurrency on a single thread.

>> It appears that QNX used coroutines in the FSys server, not sure
>> if they still do or if they've moved to real threads
>>

>> Oh and AmigaDOS used coroutines as well it would seem:
>

> Unfortunately, neither of those say why coroutines were used, or how.
> They could've just been merging two existing code pieces to work
> together, instead of writing the driver from scratch. Or, not ...

Could be.the case, but from reading other QNX posts from that time
period I gather that QNX didn't have multithreading so they used coroutines
in FSys to allow requests to block on IO but still be able to service other
requests.

--
Marv

Antoine Leca

unread,

Sep 21, 2011, 1:01:07 PM9/21/11

to

Marven Lee wrote:
> 2) Multithreaded FS. Each thread handles a different request,
> needs kernel-mode threads and a number of sync primitives.

No about kernel-mode threads. In fact, current MINIX did implement
recently (less than a month ago) an "asynchronuous VFS", with a
(cooperative) thread mechanism in that server but without any kernel
thread support (of which there are none currently in MINIX anyway.)

It does require "a number" of synchronisations though.

Antoine