> Is Matthew Dillon subscribed yet? I would like him to be our guide from
> how to get from this prototype to something a BSD maintainer would like.
> Please Matthew.
> My thinking is that he should do what he thinks is right to one of the drivers
> to move it to user space, we adjust the kit API until that works. Then
> test again. But I want him to lead since he drives the source and we
> consume the kit API.
I'm on the group now! My biggest strength is specifying and building
the API that partitions the chipset driver from the rest of the kernel.
I mentioned this in the meeting but I think we need to start with the
most well-defined interfaces, which are those for a standard network
interface (not wifi) and a disk driver.
I already have some ideas of the requirements based on the drivers I
wrote for AHCI and the Sili 3132, and I have a pretty good idea what
requirements there will need to be for a basic network driver.
What I am thinking is that we need a Wiki in addition to the group to
flesh these things out, since it is going to be an incremental process
as people think of new issues which need to be solved. Right at the
outset these are the issues I see. I am going to throw these out with
the intent that we move this stuff into a Wiki in fairly short order:
* Generally speaking we want the driver to have to interact with the
kernel/library as little as possible for mundane tasks such as BUSDMA
presync/postsync, locks, buffer management, and timeout management.
Basically we want the driver to specify the requirements and have the
kernel/library be responsible for implementing the requirements.
* Bulk data alignment, maximum DMA chain size, maximum segment size,
maximum total size requirements. Traditionally specified via the
BUSDMA subsystem in BSDland. I believe these requirements, including
bounce buffer operation and pre-sync/post-sync operations, must be
specified by the driver but implemented by the common kernel/library
layer. That is, the kernel needs to pass data to the driver which is
already properly aligned, and the kernel should be responsible for
the pre-sync and post-sync operations.
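As a sketch of the "driver specifies, kernel implements" idea for DMA
constraints (every name here is hypothetical, not an existing BSD API),
the driver could fill in one record at attach time and the kernel/library
would honor it when building chains:

```c
#include <stdint.h>

/*
 * Hypothetical constraint record: the driver declares its requirements
 * once and the kernel/library does alignment, bounce buffering, and
 * pre/post-sync on its own when building DMA chains.
 */
struct dkit_dma_constraints {
    uint32_t align;         /* required start alignment (power of two) */
    uint32_t max_seg_size;  /* largest single segment, in bytes */
    uint32_t max_segs;      /* longest chain the chipset accepts */
    uint64_t max_total;     /* largest total transfer, in bytes */
    uint64_t addr_limit;    /* highest DMA-able physical address */
};

/* Kernel-side sanity check before accepting the registration. */
int
dkit_dma_constraints_valid(const struct dkit_dma_constraints *c)
{
    if (c->align == 0 || (c->align & (c->align - 1)) != 0)
        return 0;               /* alignment must be a power of two */
    if (c->max_seg_size == 0 || c->max_segs == 0)
        return 0;
    if ((uint64_t)c->max_seg_size > c->max_total)
        return 0;               /* inconsistent limits */
    return 1;
}
```

The point of the validity check living on the kernel side is that the
driver never sees a buffer violating its declared constraints.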
* Passing of data via physical DMA chains. This is something which
does not currently exist in BSDland. For maximum portability the
kernel needs to be responsible for passing physical DMA chains to the
driver. For something like Sparc or other cpus with I/O MMUs the
kernel would be responsible for the translation, passing the
appropriate physical or semi-physical chains to the driver such that
the driver can just poke them into the chipset. For embedded systems
with no MMU the addresses would be passed straight through (though
still with alignment/bounce-buffer requirements met). If the driver
needs access to addressable data buffers the kernel/library would
provide a callback and map the chains to KVA (again, almost a NOP on
an embedded system), but in this case we have to be careful to specify
just exactly what the driver can expect in terms of how the data is
broken down since the segments may not be contiguous.
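A minimal sketch of what a physical chain handed to the driver might
look like, plus the per-segment callback for the addressable-data case
(all names hypothetical):

```c
#include <stdint.h>

/* One physical (or IOMMU-translated) segment, hypothetical layout. */
struct dkit_seg {
    uint64_t paddr;     /* address the driver pokes into the chipset */
    uint32_t len;       /* segment length in bytes */
};

/*
 * Callback the kernel/library would invoke per segment after mapping
 * the chain into KVA, for drivers that must touch the data directly.
 * Segments are not guaranteed to be virtually contiguous, which is
 * exactly the caveat about how the data is broken down.
 */
typedef void (*dkit_seg_cb)(void *kva, uint32_t len, void *arg);

/* Kernel-side helper: total bytes described by a chain. */
uint64_t
dkit_chain_total(const struct dkit_seg *segs, int nsegs)
{
    uint64_t total = 0;
    for (int i = 0; i < nsegs; i++)
        total += segs[i].len;
    return total;
}
```

On an embedded no-MMU system `paddr` and the KVA mapping would both be
near-NOPs, which is what makes this portable in both directions.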
* Timeouts. Should the driver be responsible for implementing
timeouts passed from the kernel or should the kernel be responsible
for it? I think the kernel should be responsible for timeouts and
issue a callback (see threading abstraction below), but the driver is
responsible for enforcing any minimum timeouts imposed by hardware and
actually completing or aborting the I/O.
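The split of responsibility could look like this (hypothetical names):
the kernel owns the timer and makes the callback, while the driver only
declares a hardware-imposed floor and actually aborts or completes the
I/O when called back:

```c
#include <stdint.h>

/*
 * Timeout callback the kernel invokes in the driver's designated
 * abstract thread (see the threading abstraction).  The driver is
 * responsible for completing or aborting the I/O here.
 */
typedef void (*dkit_timeout_cb)(void *softc, void *ioreq);

/*
 * Kernel-side helper: clamp a requested timeout to the minimum the
 * driver declared its hardware can tolerate.
 */
uint32_t
dkit_effective_timeout_ms(uint32_t requested_ms, uint32_t hw_min_ms)
{
    return requested_ms < hw_min_ms ? hw_min_ms : requested_ms;
}
```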
* I/O request structure. I'm thinking a SCSI abstraction should be
used for disk drivers since it is the most well-defined interface.
ATA pass-through can be implemented (there's a SCSI abstraction for
it). I don't think CAM is appropriate... CAM is unnecessarily complex
though we would still want the library/kernel to implement bus target/
lun scanning style features and not the device. It is unclear to me
whether we should require a BIO interface but I'm thinking we should
not because it opens a big can of worms in terms of assumptions on
availability of fields in the data structure. If not, though, the
SCSI abstraction would definitely need to provide a rolled-up block
number in the passed structure so the driver doesn't have to extract
it from the command. For network drivers we want something similar
which supports passing multiple packets in a single request, with
fields to support vlan and other features.
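To illustrate the rolled-up block number point: the kit would
pre-extract the LBA from the CDB so the chipset driver never parses
commands itself. The struct and function names below are hypothetical;
the READ(10)/WRITE(10) field layout (big-endian LBA in bytes 2-5,
transfer length in bytes 7-8) is standard SCSI:

```c
#include <stdint.h>

/* Hypothetical disk request: SCSI CDB plus pre-extracted fields. */
struct dkit_disk_req {
    uint8_t  cdb[16];   /* SCSI command descriptor block */
    uint8_t  cdb_len;
    uint64_t lba;       /* rolled-up block number, filled by the kit */
    uint32_t nblocks;   /* transfer length in blocks */
};

/*
 * Kernel/library-side helper: pull the LBA and block count out of a
 * READ(10)/WRITE(10) CDB so the driver gets them for free.
 */
int
dkit_rollup_rw10(struct dkit_disk_req *r)
{
    const uint8_t *c = r->cdb;

    if (c[0] != 0x28 && c[0] != 0x2a)   /* READ(10) / WRITE(10) only */
        return -1;
    r->lba = ((uint64_t)c[2] << 24) | ((uint64_t)c[3] << 16) |
             ((uint64_t)c[4] << 8)  | (uint64_t)c[5];
    r->nblocks = ((uint32_t)c[7] << 8) | c[8];
    return 0;
}
```

A network-side analog would carry an array of packet buffers plus vlan
fields in one request rather than a CDB.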
* Threading/blocking/locking abstraction. The driver would specify to
the OS the various entry points. For example, each interrupt vector
associated with the driver is a separate entry point. Callbacks for
timeout handling are an entry point. Directly queued commands are an
entry point (and in the critical path). Probing and power management
are entry points as well. I'm thinking
the driver would specify the number of discrete 'threads' (in its
abstraction) and then associate the various calls into the driver with
a thread, plus have a fast path for direct calls which would only be
used for command queueing. Similarly the driver would be able to
associate a lock or set of locks with each entry point which the
kernel would acquire before making the call (see note 1). The actual
implementation in the kernel or library might be with FEWER actual
threads than the driver specified... all the way down to a single
thread context for all operations (making separate callbacks to the
abstractions defined by the device) if the kernel desires.
note 1: The driver can acquire locks as well, but giving the kernel/
library the ability to acquire primary locks before calling entry
points also gives the kernel/library the ability to deal with blocking
or scheduling issues based on how it actually decides to implement the
threading abstraction. Strict lock ordering would be a requirement.
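A registration table for this could be as simple as the following
(every name hypothetical): each entry point names the abstract thread
it runs in and the lock the kernel acquires before the call, and the
kernel is free to back N abstract threads with fewer real ones:

```c
/* Kinds of calls the kernel can make into the driver. */
enum dkit_ep_kind {
    DKIT_EP_INTR,       /* one per interrupt vector */
    DKIT_EP_TIMEOUT,    /* timeout callback */
    DKIT_EP_QUEUE,      /* direct command queueing, fast path */
    DKIT_EP_PROBE       /* probe/attach/detach/power management */
};

/*
 * The driver registers each entry point with the abstract thread it
 * must run in and the lock the kernel should hold across the call.
 */
struct dkit_entry_point {
    enum dkit_ep_kind kind;
    int  thread_id;     /* driver-defined abstract thread */
    int  lock_id;       /* lock to acquire first, -1 for none */
    void (*fn)(void *softc);
};

/*
 * Kernel-side helper: number of real threads needed if the kernel
 * chooses one real thread per abstract thread id (it may use fewer).
 */
int
dkit_thread_count(const struct dkit_entry_point *eps, int n)
{
    int max = -1;
    for (int i = 0; i < n; i++)
        if (eps[i].thread_id > max)
            max = eps[i].thread_id;
    return max + 1;
}
```

The fast-path queue entry would typically carry `lock_id = -1` so a
lock-free multi-cpu queueing driver pays nothing for the abstraction.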
* The threading abstraction is extremely important because for this to
work devices must be able to operate efficiently both in uni-processor
and SMP systems, in particular devices with the capability to queue in
parallel from multiple cpus without locking. The kernel/library
support can default to a single lock (kinda like a mini-giant-lock but
on a per-device basis) but must be flexible enough to handle the more
complex threading / multiple-lock abstractions that high-end GigE and
10GigE (and even AHCI - per SATA port) drivers might desire.
* Polling. Interrupts specified by the driver would also flag whether
polling is allowed or not. Default would be that it is allowed (only
really old ATA chipsets can't be polled due to a lack of an interrupt
status register). The kernel would be responsible for turning polling
on and off, using control entry points into the driver to enable or
disable hardware interrupt generation.
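The per-vector registration could carry the pollable flag and the
hardware interrupt control hook directly (hypothetical names; pollable
would be the default, as described above):

```c
/* Per-vector interrupt registration, hypothetical layout. */
#define DKIT_INTR_POLLABLE  0x0001  /* set by default */

struct dkit_intr {
    int      vector;
    unsigned flags;
    void     (*handler)(void *softc);   /* also serves as the poll hook */
    void     (*hw_intr_enable)(void *softc, int on);  /* kernel control */
};

/* Kernel-side check before switching a device to polling mode. */
int
dkit_intr_can_poll(const struct dkit_intr *it)
{
    return (it->flags & DKIT_INTR_POLLABLE) != 0;
}
```

A driver for one of the old ATA chipsets without an interrupt status
register would simply clear the flag at registration time.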
* Probe sequence. FreeBSD & DragonFly use NewBus, but the API has
always been ill-defined in terms of its operating context. We would
again want to use the threading abstraction and have the driver
register which thread it wants various synchronous commands to run in
for probe, attach, detach, power management, polling control, and so
forth.
* Periodic operations (Phy state polling and such) would also use the
thread abstraction and the kernel/library would be responsible for
making the calls into the driver.
Those are my thoughts so far. In thinking about this I was
specifically thinking about how I would port something like our AHCI
disk driver to the new scheme, or something like the Intel network
(IF_EM) driver. Note that I am specifically not considering the far
more complex network protocol drivers yet (tcp, udp, etc). The USB
driver would be applicable though as it essentially uses a SCSI
interface. Well, for bulk storage devices anyhow. I don't just want
to duplicate an existing kernel interface... in fact, ALL existing
kernel interfaces currently put a much larger burden on the chipset
driver than they should. To do this right we need to offload as much
as possible to the middle layer.
Specifying the API is more suitable for a Wiki than for a conversation
in Google Groups, but I wanted to get it down before I lose track of
it.