Proposal for an IO API

Christoph

Aug 11, 2016, 1:34:40 PM

to BareMetal OS

Hello,

I really like the idea of BareMetal OS and have been following its development for quite a while now.

The main problem I see is the lack of a high performance IO API.

In an operating system that uses only a single thread per execution unit, blocking a whole core
to do a basic file read/write is a huge problem when writing high performance code. Imagine
a server that has more assets to serve than the main memory can hold.

That's why I spent some time thinking about how to implement an IO API that avoids this blocking.

The main idea is to have an event driven system, which cooperates quite well with the job queue of
BMOS because it already has the main loop. Moreover, the focus is on never blocking, so data
being accessed should always be locked in memory.

I've written some simple pseudo C code to present the idea:

https://gist.github.com/akamiru/822a921b6b3c8110d4cf72fba2fa2b38

The presented version includes a buffer cache, but a version without it could be made by adding
a simple target address pointer to some functions. This would mean the user would be responsible
for allocating the main memory. Maybe something in between would be right, like a default cache
which can be deactivated?
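
To make the "something in between" option concrete, here is a rough sketch in C. It is not taken from the gist: io_read_into, io_cache and IO_CACHE_SIZE are made-up names, and the "device" is just a memory buffer. Passing a target pointer bypasses the cache, passing NULL falls back to the default cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IO_CACHE_SIZE 64   /* hypothetical size of the default cache */

/* Stand-in for the buffer cache owned by the IO layer. */
static unsigned char io_cache[IO_CACHE_SIZE];

/*
 * Copy `len` bytes from `src` (a stand-in for the device) either into the
 * caller's `target` or, when `target` is NULL, into the default cache.
 * Returns the address where the data ended up, or NULL if the data does
 * not fit into the cache.
 */
const void *io_read_into(const void *src, size_t len, void *target)
{
    assert(src != NULL);
    if (target == NULL) {
        if (len > IO_CACHE_SIZE)
            return NULL;          /* too big for the default cache */
        target = io_cache;
    }
    memcpy(target, src, len);
    return target;
}
```

With a caller-supplied target the user is responsible for the memory, which is the cache-free variant described above; with NULL the call behaves like the cached version.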

Another idea I had would reduce the number of system calls:

Every request, like reading a file, starting a timer or sending data over a socket, gets pushed on a stack
and the next io_poll() call would fire them in the same order. That is, if it helps performance, of course.

An additional version which fires directly could be added for priority requests, e.g. io_read() and
io_pread().
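
A minimal sketch of the batching idea in C, under a few assumptions: apart from io_poll(), all names (io_submit, io_submit_now, struct io_queue, io_log_fire) are invented for illustration and do not come from the gist or from BMOS, and "firing" a request is reduced to calling a handler where a real implementation would issue the command to the device.

```c
#include <assert.h>
#include <stddef.h>

#define IO_QUEUE_CAP 16            /* hypothetical fixed capacity */

/* Handler that stands in for issuing the request to the hardware. */
typedef void (*io_fire_fn)(int request_id, void *user);

struct io_request { int id; io_fire_fn fire; void *user; };

struct io_queue {
    struct io_request pending[IO_QUEUE_CAP];
    size_t count;
};

/* Queue a request without firing it. Returns 0 on success, -1 if full. */
int io_submit(struct io_queue *q, int id, io_fire_fn fire, void *user)
{
    assert(q != NULL && fire != NULL);
    if (q->count == IO_QUEUE_CAP)
        return -1;
    q->pending[q->count++] = (struct io_request){ id, fire, user };
    return 0;
}

/*
 * Fire every queued request in submission order and empty the queue.
 * Returns the number of requests fired. In this sketch a handler must
 * not call io_submit() on the same queue while io_poll() is running.
 */
size_t io_poll(struct io_queue *q)
{
    assert(q != NULL);
    size_t fired = q->count;
    for (size_t i = 0; i < fired; i++)
        q->pending[i].fire(q->pending[i].id, q->pending[i].user);
    q->count = 0;
    return fired;
}

/* Priority variant: fires immediately and bypasses the queue. */
void io_submit_now(int id, io_fire_fn fire, void *user)
{
    assert(fire != NULL);
    fire(id, user);
}

/* Example handler that only records the order in which requests fired. */
struct io_log { int ids[IO_QUEUE_CAP]; size_t len; };

void io_log_fire(int request_id, void *user)
{
    struct io_log *log = user;
    if (log->len < IO_QUEUE_CAP)
        log->ids[log->len++] = request_id;
}
```

Several io_submit() calls followed by one io_poll() cost a single trip into the IO layer, while io_submit_now() trades that saving for lower latency on a single request.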

I'd really appreciate your thoughts on my ideas.

Christoph

42Bastian

Aug 12, 2016, 1:45:11 AM

to bareme...@googlegroups.com

Christoph

> The main problem I see is the lack of a high performance IO API.

Isn't this a contradiction in itself? "High performance" and IO?

> In an operating system that uses only a single thread per execution
> unit blocking a whole core

The basic principle behind BMOS ;-)

AFAI understand, the basic idea behind BMOS is to avoid the pain of
multithreading.

With today's multicore CPUs with up to 12 cores (commercially available), I
think having a dedicated IO core is easier to implement.

You can use IPC to queue new IO jobs to this core.

If you have different kinds of IO, like disk and network, spend one core per IO.
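
As a rough sketch of what queueing jobs to a dedicated IO core could look like, here is a single-producer/single-consumer ring buffer in C11. This is only an assumption about one possible shape of the IPC, not how BMOS does it; struct io_job, struct job_ring and the job_ring_* functions are hypothetical names, and one ring would be needed per worker core.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define JOB_RING_SIZE 8   /* hypothetical capacity, must be a power of two */

struct io_job { int kind; unsigned long block; };

struct job_ring {
    struct io_job slots[JOB_RING_SIZE];
    atomic_size_t head;   /* next slot to read, advanced by the IO core */
    atomic_size_t tail;   /* next slot to write, advanced by the worker */
};

void job_ring_init(struct job_ring *r)
{
    assert(r != NULL);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Worker core: enqueue a job. Returns false when the ring is full. */
bool job_ring_push(struct job_ring *r, struct io_job job)
{
    assert(r != NULL);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == JOB_RING_SIZE)
        return false;
    r->slots[tail % JOB_RING_SIZE] = job;
    /* Publish the slot only after its contents are written. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* IO core: dequeue the oldest job. Returns false when the ring is empty. */
bool job_ring_pop(struct job_ring *r, struct io_job *out)
{
    assert(r != NULL && out != NULL);
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *out = r->slots[head % JOB_RING_SIZE];
    /* Hand the slot back to the producer after it has been copied out. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

Neither side ever waits on a lock: a full ring makes the push fail and an empty ring makes the pop fail, so each core decides for itself what to do in the meantime.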

Cheers

--
42Bastian

Christoph

Aug 12, 2016, 6:03:41 AM

to BareMetal OS

On Friday, August 12, 2016 at 07:45:11 UTC+2, 42Bastian wrote:

> Christoph
>
> > The main problem I see is the lack of a high performance IO API.
>
> Isn't this a contradiction in itself? "High performance" and IO?

Why should it be? You can write high performance servers or databases where IO is clearly
a big focus while still needing extremely fast sorts. In my eyes high performance means doing
your given task as efficiently as possible. What's a high performance calculation worth if it
doesn't save or share any results? Of course, things like cracking a password don't need
much IO, so IO performance is quite unimportant there, but machine learning, for example,
has to load large amounts of training data.

> > In an operating system that uses only a single thread per execution
> > unit blocking a whole core
>
> The basic principle behind BMOS ;-)
>
> AFAI understand, the basic idea behind BMOS is to avoid the pain of
> multithreading.

If you use a dedicated core for each task, e.g. a hash table isn't shared but rather only
one core has access to it, some cores will have to wait for others if it's not well engineered
or simply if the work patterns change, which clearly isn't high performance. Moreover, you
enter the domain of the CAP theorem this way, so you'll already have to deal with
multithreading problems.

> With today's multicore CPUs with up to 12 cores (commercially available), I
> think having a dedicated IO core is easier to implement.
>
> You can use IPC to queue new IO jobs to this core.
>
> If you have different kinds of IO, like disk and network, spend one core per IO.

Even if I waste 2 of the 12 cores on IO, the current way disk IO is done
(blocking the whole core until the job is done) doesn't allow the disk to be used to its
full potential, because the disk is not able to reorder and prepare commands in the queue.
So this is the very opposite of high performance.

> Cheers
>
> --
> 42Bastian

With an API like the proposed one, which allows for real non-blocking disk IO, IO doesn't
contradict high performance in any definition, because the time spent on an IO is only a few
external register accesses to issue the command. That is the minimal overhead to get your data/results
to disk or the network.

In an API like the one Linux uses there is no way to actually do non-blocking file IO and there's no great

demand for it because it already has the overhead of a scheduler and a blocking read simply means