Initial thoughts

Jon Taylor

Jul 4, 2011, 8:09:30 PM
to perlos
Hi all,

I created this group because I think that the Perl language and
environment is a much more powerful candidate for creation of a
virtual machine based operating system than Java, which is at this
time the obvious leading candidate. I have been exploring the JNode
OS (www.jnode.org) recently, and in the course of doing so I realized
that I could not overcome my dislike for the Java language itself,
regardless of how obviously powerful the VM/JIT OS concept is. The
obvious retort is to go backwards and deny the need for any sort of
high-level language at the basic systems level, let alone one which is
JIT-recompiled and/or OOP-based. Most OSes and systems software
libraries are written in a mix of native assembly and C for this
reason. Perl usually runs as an executor process on top of such basic
OSes. Yet this sort of ultra-basic, static approach is flawed for a
variety of reasons:

* It isn't 100% portable, hence the need for specified assembly
sections and a mess of #ifdefs throughout the C code itself.
* It isn't (easily) resource-parallelizable and JIT-recompilable,
which severely limits scalability in arbitrary environments. It also
has no high-level object/class abstraction, which creates a
requirement for concurrency to be specified via threading abstractions
instead of flowing naturally and automatically.
* It requires sophisticated, modular OS designs (microkernel and RTOS
style) to wring the most performance out of the hardware. However
these modern OS designs are still light-years less powerful than a
full VM with JIT recompilation, for the simple reason that they cannot
be resource-specced to the underlying hardware environment well. They
all end up being the same as any other bare-metal, statically compiled
OS ends up being: a collection of device drivers wrapped up in some
(usually badly designed) library interface.

A systems environment should be based on as high a level of
abstraction and the most advanced and feature-rich language as
possible, and it should be a scripting type of language which allows
for dynamic reflow and recompilation. It should be object oriented as
a capability and not a required style (so that abstract resources can
be cleanly specced to the hardware resource basis), contain a natural
and inherent templating/forms system (to allow for a real modular form
hierarchy WRT abstraction of hardware interfaces), and run on top of a
virtual machine to allow for a clean break between the native systems
environment and the abstract language executor (to allow for dynamic
reflow of native code generation and hardware resource timeslicing).
In general the more abstract the language, the better. Perl is many,
many times as abstract and feature-rich a language as Java - it is one
of the most powerful languages ever designed in this regard, AFAIK -
and therefore it seems to me a logical choice to replace Java at the
systems level.

But why? Why create a Perl-based operating system virtual machine,
device drivers, systems JITter, and all the other stuff that will be
needed? What will it gain anyone? I think it will gain many people
an awful lot. Perl is widely used as a website scripting language,
for example. It would be nice in many cases to replace a traditional
server OS in a server rack with a small embedded Perl OS which can run
the scripts directly in a secure VM instead of a UNIX or MS Windows
OS. Performance should be quite a lot better, too - the Perl OS will
be optimized for executing Perl code at high speed, the system
environment will be JIT-recompiled for even more performance, and if
even more performance is required, arbitrary numbers of hardware
resource modules should be addable with a simple connection of a
network cable because the environment will be parallelizable to the
extent that a single Perl OS VM can spill over across multiple
arbitrary hardware resource spaces, optimized down to the clock cycle,
bus bandwidth and address range level.

Theoretically-maximal code reoptimization at runtime ("Speccing out")
of the hardware like this hasn't been done with Java VMs, because the
language spec simply doesn't allow for this type of fine-grained
resource respeccing at runtime. Perl isn't tied to the concept of an
object or class, so this should allow a Perl OS to run equivalent code
at a much higher level of reoptimization than the equivalent Java
code. Scripting-type languages are perfect for this, due to their
high level of abstraction. As hardware becomes more heterogeneous,
diverse, complex in terms of capabilities and interfaces, and more and
more asynchronous and in need of software tuning of IO balance and
priority at the bus level, fine-grained respeccing capacity is
absolutely vital to have maximal performance come out of the
hardware. The higher-performance the hardware, the more disastrous
the performance hit that even small levels of deoptimization, due to
imperfect speccing of the software virtual environment, can cause.

So, let us move the discussion to what is needed in terms of code. I
personally am only an average Perl developer, and it has been a while
since I used the language. But I know OS-level principles very well
now, and there are not too many areas which need to be covered AFAIK,
JIT compiler and VM design as well as overall device driver systems
design being the two main sections:

Device driver systems architecture

Using JNode as an example from open source, one can see that the
temptation is to use a traditional hierarchical object model to
classify and reclassify interfaces down to the basic IO interface
level. This is IMHO far, far too heavyweight a system to use in a
dynamic environment. Objects in general need instantiation, which is
resource-heavy and hard to spec generically. It often causes cache
and TLB flushes because of this, which is death as far as high-speed
streaming IO is concerned. The existing PerlIO classes implement
layered sections, which are perfect for implementing modular
transforms in a stack. Compiler support for this form would probably
be extremely optimizable - exactly what is needed when as few layers
of translation as possible between high-level IO requests from a
command stream and low-level register writes or DMA channel ops are
desired.
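
To make the layered-transform idea concrete, here is a minimal sketch
of a single custom layer stacked on top of the default file layer,
written with the PerlIO::via module from the core distribution. The
layer name "Upcase" and the command string are hypothetical and chosen
only for illustration; a real driver layer would translate requests
rather than change case, and nothing here talks to actual hardware.

```perl
use strict;
use warnings;
use PerlIO::via;

# Hypothetical layer: upper-cases everything written through it.
package PerlIO::via::Upcase;

# Called when the layer is pushed onto a handle; returns the
# per-handle state object.
sub PUSHED {
    my ($class, $mode, $fh) = @_;
    return bless {}, $class;
}

# Called for each write; $fh is the layer below this one.
sub WRITE {
    my ($self, $buf, $fh) = @_;
    print {$fh} uc $buf or return -1;
    return length $buf;    # number of bytes consumed from $buf
}

package main;

use File::Temp qw(tempfile);

# A temporary file stands in for the low-level end of the stack.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Push the custom layer on top of the default layers for this handle.
open my $out, '>:via(Upcase)', $path or die "open: $!";
print {$out} "dma start ch0\n";
close $out or die "close: $!";

# Read the result back without the custom layer.
open my $in, '<', $path or die "open: $!";
my $line = <$in>;
close $in;
print $line;
```

The read side would be handled the same way with a FILL method. The
point is only that each transform is a small, self-contained unit that
gets pushed onto a handle, with no class hierarchy required beyond the
one package per layer.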

JIT recompilation

Usually, the only question here is whether and how to allow for
multiprocessor and multicore spillover and scheduling. The easiest
way to do this is to require the use of monitor locks and to support
these at the VM level, disallowing the use of soft-scheduled threading
models outside of special scheduling pools and not supporting hard
threading primitives at all in the VM. Unfortunately, perldoc doesn't
show any results for a search on the 'monitor' keyword, so I am forced
to assume that this will need to be specced and designed. Hopefully
this will allow a fully modern monitor class to be added to the
language spec. If I am missing something, someone should let me
know. As for the actual design of the monitors, usually you use a
stack graph of some kind and push and pop the monitor conditions via
critical sections embedded into the VM and driver code, tumbling the
code sections around until one or more float to the top and run out.
Simple to design, hard to optimize - but that is the way of all
systems code.
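
For reference, the closest existing building blocks I am aware of are
the lock, cond_wait and cond_broadcast primitives in threads::shared.
Below is a minimal sketch of a monitor-style bounded queue built from
them, assuming a perl built with ithreads support. The put/take names
and the capacity of 2 are hypothetical, and this is ordinary
thread-level code, not the VM-level monitor design described above.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical bounded queue guarded by a single monitor lock.
my @queue :shared;
my $capacity = 2;

sub put {
    my ($item) = @_;
    lock(@queue);                                  # enter the monitor
    cond_wait(@queue) while @queue >= $capacity;   # wait for free space
    push @queue, $item;
    cond_broadcast(@queue);                        # wake any waiters
}                                                  # lock released at scope exit

sub take {
    lock(@queue);                                  # enter the monitor
    cond_wait(@queue) while !@queue;               # wait for an item
    my $item = shift @queue;
    cond_broadcast(@queue);                        # wake any waiters
    return $item;
}

# One producer thread; the main thread acts as the consumer.
my $producer = threads->create(sub { put($_) for 1 .. 5 });
my @got = map { take() } 1 .. 5;
$producer->join;
print "@got\n";
```

All shared state is touched only while the lock is held, and waiting
happens on a condition rather than by polling, which is the monitor
discipline in miniature. What is missing is exactly what would need to
be specced: VM-level support and a proper monitor abstraction.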

VM design

Perl's existing VM design appears to be minimal and focused around
providing wrapper interfaces to underlying system abstractions. Most
of the abstractions will obviously need to be pulled up and through,
being reimplemented in native Perl code in whole or in part. It is
impossible for me to comprehensively size up this task, given the size
of the existing CPAN codebase - hopefully it won't be a major problem.
This will result in a minimal VM, mostly concerning itself with
managing monitor section reflow. Hopefully the concepts of garbage
collection and code recompilation and JITting can be genericized into
abstract resource management systems (i.e. CPU drivers and memory
drivers). The idea is that there should not actually be an
overarching virtual machine concept driving some sort of master
hierarchy of abstraction of objects, but rather a series of virtual
layers in typed sections encapsulating a set of state management
abstractions which can be reflowed to reoptimize as hard conditions
impinge on the global system monitors and cause specced reflow of
resource locks - i.e., the whole thing is one big reaction system for
resolving sectional locks, which in turn are specced around resolving
hardware conditions. Ideally there won't be a whole lot of object
abstraction down in these layers of code - see the discussion of using
PerlIO layers above.

Most of the rest of any sort of design should involve what types of
object-level abstraction to wrap around the lower layers. Most of
this already exists in CPAN, but some of it will need to be designed
from scratch. The most stressful section will be the need to deal
with monitors. Monitor synchronization will need to be promoted as
the preferred method of systems IPC in Perl in general, and threading
will need to be relegated to thread-specific monitor sections and will
probably require back-end rewrites of traditional IPC mechanisms such
as locks, mutexes and semaphores, to allow IPC state to reflow through
monitor sections which will be priority-subsumal to the IPC layers.
Hopefully there won't be much need to run non-monitor-aware systems
code with a threading model.

One final issue involves the use of basic assembly code for
implementing trap sections, interrupt handlers, basic fast-
interlocking and boot code. A native interface to allow arbitrary
binary objects to be "linked" into the Perl code execution environment
and called into at the entry point level should also be implemented.
Everything else in the native systems codebase can be implemented in C
or some other compiled language, but doesn't technically need to be -
this is one area where JNode's design really shines and should be
emulated - the JIT compiler and general language support are written in
Java and compiled to native code, IIRC. It removes the need for a
third language, requiring only minimal assembly glue per processor
type for the small critical code sections like trap and exception
handling and boot code. Everything else can be written in pure Perl5.

Once all of this is in place, all legacy code should be supported.
The only remaining issues will be packaging related: creating a
distribution architecture, overall Perl OS build system, configuration
and installation scripts, templates and utilities, and of course
systems utilities, shells and scripts a la UNIX. Obviously there are
other specific issues such as GUI and windowing toolkits, compatibility
with other languages, compatibility with other OSes and other such
"legacy" API emulation and translation needs, etc etc etc. The answer
to the question of production of such things is "go write or port the
code to Perl", because at this point everything else needed should
pretty much be present. Issues of "design a new Perl-style GUI
system" or "do we clone the POSIX shell or not" I choose to ignore for
now, in favor of focusing on pure OS level abstractions.

I think that this should be enough of an introduction. I am convinced
that Perl can serve as the foundation of a hardware abstraction system
that can be superlative in every respect, definitely including code
performance. Please give me some feedback, Perl 5 experts, and let me
know what you think about this idea and what you would be willing to
contribute.

Jon