
Software Considerations for PCI H/W Designers


Chuck Berg

Apr 19, 1999, 3:00:00 AM
I have been asked to present "Software Considerations for PCI H/W Designers"
during the PCI Design Conference at the PC Developers Conference. Since no
individual has as much experience with PCI driver design as this collective
group, I am soliciting input from you.

I would like to present "The Top 10 Things Every PCI H/W Designer Should
Know About Drivers."

I would also like to present "The Top 10 Things PCI H/W Designers Do That
Bug Driver Developers."

Please send your input directly to me. If people are interested, I would be
happy to post the results back to the group.

TIA.

--
Chuck

--------------------------------------------------------
Charles R. Berg
Director of Development
The Software Studio, Inc
21580 Stevens Creek Blvd; Suite 204
Cupertino, CA 95014
(408) 777-0818 ext 25
(408) 777-0820 FAX
CB...@SoftwareStudio.SG818.com
www.TheSoftwareStudio.com
"Specialists in Core Technology for Windows®"


Brian Catlin

Apr 19, 1999, 3:00:00 AM
Chuck Berg <cb...@SoftwareStudio.SG818.com> wrote in message
news:eeKS2.71$DW4....@client.news.psi.net...

> I have been asked to present "Software Considerations for PCI H/W Designers"
> during the PCI Design Conference at the PC Developers Conference. Since no
> individual has as much experience with PCI driver design as this collective
> group, I am soliciting input from you.
>
> I would like to present "The Top 10 Things Every PCI H/W Designer Should
> Know About Drivers."

Assume that your device will be plugged into a multiprocessor (SMP) system.
This means that you should try to provide as much parallelism in your
hardware as you can, e.g. if possible, make send and receive separate.

Reduce the number of operations needed to program a device. Instead of
writing bits in some sequence, have the driver build a command packet in
memory and then write the physical address of the command packet to some HW
register. The HW would then DMA the command packet from host memory and
process the command.
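
A rough sketch of the command-packet idea in C, not taken from any real device. The packet layout (`cmd_packet`), the opcode value, and the single doorbell register are all made up for illustration, and the "physical address" is faked with an ordinary pointer value so the sketch can run in user mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command packet that the driver builds in host memory. */
struct cmd_packet {
    uint32_t opcode;      /* what to do (made-up encoding) */
    uint32_t flags;
    uint64_t buffer_addr; /* physical address of the data buffer */
    uint32_t length;      /* bytes to transfer */
    uint32_t status;      /* written back by the device on completion */
};

/* Simulated device: one doorbell register instead of many control bits. */
struct fake_device {
    uint64_t doorbell;        /* driver writes the packet address here */
    unsigned register_writes; /* counts single-cycle register accesses */
};

#define OP_SEND 1u /* hypothetical opcode */

/* Fill in the whole command in ordinary memory; no device access yet. */
void build_send(struct cmd_packet *p, uint64_t buf, uint32_t len)
{
    memset(p, 0, sizeof *p);
    p->opcode = OP_SEND;
    p->buffer_addr = buf;
    p->length = len;
}

/* One register write hands the whole command to the device. */
void ring_doorbell(struct fake_device *dev, struct cmd_packet *p)
{
    dev->doorbell = (uint64_t)(uintptr_t)p; /* stand-in for a physical address */
    dev->register_writes++;
}

/* What the hardware side would do: fetch the packet and execute it. */
uint32_t device_process(struct fake_device *dev)
{
    struct cmd_packet *p = (struct cmd_packet *)(uintptr_t)dev->doorbell;
    assert(p != NULL);
    p->status = 1; /* mark the command complete */
    return p->opcode;
}
```

However many fields the command has, the driver touches the device exactly once per command in this model.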

Spend your money on a fast bus-master scatter-gather capable DMA controller
that can access all of host memory. Many otherwise good HW designs fail
because data cannot be moved to and from the host quickly enough. If your
DMA controller doesn't support scatter-gather you will pay a HUGE
performance penalty. Likewise, any addressing limitation (the most common
is 24-bit, i.e. 16MB) will cost you BIG.
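
To make the two costs concrete, here is a small user-mode sketch in C. It assumes 4KB pages and a made-up `sg_element` layout; real hardware defines its own descriptor format. It builds a scatter-gather list from the physical pages behind a buffer, then checks whether a DMA engine with a given number of address bits can reach each element. Anything it cannot reach would have to be copied through a bounce buffer by the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SG_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical scatter-gather element: one physically contiguous run. */
struct sg_element {
    uint64_t phys_addr;
    uint32_t length;
};

/* Build a scatter-gather list from per-page physical addresses, merging
 * pages that happen to be physically adjacent. Returns the number of
 * elements written, or 0 if the list does not fit in max_elems. */
size_t build_sg_list(const uint64_t *page_addrs, size_t npages,
                     struct sg_element *sg, size_t max_elems)
{
    size_t n = 0;
    for (size_t i = 0; i < npages; i++) {
        if (n > 0 && sg[n - 1].phys_addr + sg[n - 1].length == page_addrs[i]) {
            sg[n - 1].length += SG_PAGE_SIZE; /* extend the previous run */
        } else {
            if (n == max_elems)
                return 0;
            sg[n].phys_addr = page_addrs[i];
            sg[n].length = SG_PAGE_SIZE;
            n++;
        }
    }
    return n;
}

/* Can a DMA engine with addr_bits address lines reach this whole element? */
int dma_reachable(const struct sg_element *e, unsigned addr_bits)
{
    assert(addr_bits >= 1 && addr_bits <= 64);
    assert(e->length > 0);
    uint64_t limit = (addr_bits == 64) ? UINT64_MAX
                                       : ((UINT64_C(1) << addr_bits) - 1);
    uint64_t last = e->phys_addr + e->length - 1;
    return last <= limit;
}
```

With a 24-bit engine, any element above the 16MB line fails `dma_reachable`, and that is exactly the data the driver would have to copy.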

Do not use I/O ports - they are much slower than memory-mapped registers.

One of the biggest problems I've seen is that HW engineers, in an effort to
reduce the number of device registers, try to group all the device's control
and status bits in as few registers as possible. On a multiprocessor system
(assuming the driver is written to take advantage of SMP), this can be a
large bottleneck when the same register needs to be accessed by driver
routines running at different IRQLs (e.g., from the ISR and a DPC). In this
case, you will need to call KeSynchronizeExecution from the DPC to
synchronize with the ISR. The problem, of course, is that if the ISR is
already running, then KeSynchronizeExecution will "spin-wait" on the
interrupt object's spinlock. So, as far as a generic design rule goes:
"Don't share resources between routines that run at different IRQLs". This
means that the system designer must understand the hardware AND the NT
driver model.

HW engineers also try to reduce device register counts by aliasing read and
write registers. For example, reading a register returns status, while
writing the same register controls the device. This is a problem in SMP
systems for the same reasons stated above. Generally it is better to
separate registers.

Another problem is hardware that performs a side-effect as a result of
reading a register. Having an index register automatically increment on a
read is OK, but any effect that can "consume" or change data is to be
avoided.
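
A small C model of why destructive reads hurt. The register and its bit names are invented for illustration. The first accessor clears the status on every read, so any extra read (a debugger peek, say) eats the information. The second pair leaves reads side-effect free and has software clear only the bits it has handled. That "write one to clear" scheme is one common alternative, offered here as an assumption rather than as the only fix.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated status register; the bit layout is made up. */
struct fake_status_reg {
    uint32_t bits;
};

#define STATUS_RX_DONE 0x1u
#define STATUS_TX_DONE 0x2u
#define STATUS_ALL     (STATUS_RX_DONE | STATUS_TX_DONE)

/* Destructive design: every read returns the bits and clears them. */
uint32_t read_to_clear(struct fake_status_reg *r)
{
    uint32_t v = r->bits;
    r->bits = 0; /* side effect: the information is consumed */
    return v;
}

/* Safer design: a read changes nothing... */
uint32_t read_status(const struct fake_status_reg *r)
{
    return r->bits;
}

/* ...and software clears exactly the bits it has dealt with. */
void write_one_to_clear(struct fake_status_reg *r, uint32_t mask)
{
    assert((mask & ~STATUS_ALL) == 0);
    r->bits &= ~mask;
}
```

In the first design, whoever reads the register first wins and everyone else sees zero. In the second, the ISR, a DPC, and a debugger can all look at the same status without stepping on each other.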

Again, taking the NT driver architecture into consideration, place control
and status bits in registers that will require the fewest spinlock waits on
an SMP system. For example, if you have a network card that has essentially
independent receive and transmit rings, DO NOT put the ring status and
control bits for both rings in the same register.
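
As a sketch of what that looks like from the driver side, here is a hypothetical register block in C. The names, offsets, and the `RING_ENABLE` bit are assumptions, not any real NIC. Each ring has its own control and status registers, so the receive path and the transmit path can each use their own lock and never do a read-modify-write on a register the other one owns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block with per-ring registers. */
struct nic_regs {
    volatile uint32_t rx_control;
    volatile uint32_t rx_status;
    volatile uint32_t tx_control;
    volatile uint32_t tx_status;
};

#define RING_ENABLE 0x1u /* made-up bit */

/* Called only from the receive path, under the receive lock. */
void rx_enable(struct nic_regs *regs)
{
    assert(regs != NULL);
    regs->rx_control |= RING_ENABLE;
}

/* Called only from the transmit path, under the transmit lock. */
void tx_enable(struct nic_regs *regs)
{
    assert(regs != NULL);
    regs->tx_control |= RING_ENABLE;
}
```

If both enable bits lived in one register, each of those `|=` operations would be a read-modify-write on shared state, and the two paths would have to serialize on a single lock.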

Reduce interrupts as much as possible. Excessive interrupts will kill system
performance. For devices with high throughput requirements, use ring buffers
as much as possible; then you only need interrupts when a ring transitions
from "empty to not empty" and "not empty to full".
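
A minimal C model of that interrupt policy, assuming a 4-slot ring and a simple counter standing in for the interrupt line (both are illustrative choices). The producer plays the device and the consumer plays the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 4u /* assumed ring size */

struct ring {
    uint32_t head;       /* next slot the producer fills */
    uint32_t tail;       /* next slot the consumer drains */
    uint32_t count;      /* entries currently queued */
    unsigned interrupts; /* interrupts raised so far */
};

/* Producer side (the device). Returns false if the ring is full. */
bool ring_produce(struct ring *r)
{
    assert(r->count <= RING_SLOTS);
    if (r->count == RING_SLOTS)
        return false;
    bool was_empty = (r->count == 0);
    r->head = (r->head + 1) % RING_SLOTS;
    r->count++;
    /* Interrupt only on "empty to not empty" and "not empty to full". */
    if (was_empty || r->count == RING_SLOTS)
        r->interrupts++;
    return true;
}

/* Consumer side (the driver). Returns false if the ring is empty. */
bool ring_consume(struct ring *r)
{
    assert(r->count <= RING_SLOTS);
    if (r->count == 0)
        return false;
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count--;
    return true;
}
```

Filling the ring from empty costs two interrupts in this model instead of one per entry, and the saving grows with the ring size.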

> I would also like to present "The Top 10 Things PCI H/W Designers Do That
> Bug Driver Developers."

Allow registers to be accessed using byte, word, and longword operations.
This is helpful during debugging, especially if you are writing a WinDbg
extension DLL to display/write your device's registers (WinDbg currently
does not have a way to specify register access at a specific width, although
I understand this will change).
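
For illustration, a C sketch of a register space that tolerates all three access widths. The register space is just a byte array here (a real device would be a mapped BAR), and the helper names are made up. Bytes are assembled in little-endian order, which is how PCI lays them out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_SPACE_BYTES 16u /* assumed size of the simulated register space */

struct fake_regs {
    uint8_t bytes[REG_SPACE_BYTES];
};

/* Accept only naturally aligned 1-, 2-, or 4-byte accesses in range. */
static void check_access(size_t offset, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4);
    assert(offset % width == 0);
    assert(offset + width <= REG_SPACE_BYTES);
}

/* Read a byte, word, or longword at offset (little-endian). */
uint32_t reg_read(const struct fake_regs *r, size_t offset, unsigned width)
{
    uint32_t value = 0;
    check_access(offset, width);
    for (unsigned i = 0; i < width; i++)
        value |= (uint32_t)r->bytes[offset + i] << (8 * i);
    return value;
}

/* Write a byte, word, or longword at offset (little-endian). */
void reg_write(struct fake_regs *r, size_t offset, unsigned width, uint32_t value)
{
    check_access(offset, width);
    for (unsigned i = 0; i < width; i++)
        r->bytes[offset + i] = (uint8_t)(value >> (8 * i));
}
```

A debugger extension that can only issue byte reads still sees sensible values from a device built this way. A device that decodes longword accesses only would return garbage or nothing.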

> Please send your input directly to me. If people are interested, I would be
> happy to post the results back to the group.

Posting to the group, just in case you don't get a chance to post the
summary :)

-Brian

--
Brian Catlin, Sannas Consulting 310-798-8930
Windows NT internals, WDM, and device driver consulting and training
See WWW.SOLSEM.COM for scheduling public and on-site seminars


Tim Keating

Apr 20, 1999, 3:00:00 AM
On Mon, 19 Apr 1999 14:07:14 -0700, "Brian Catlin" <BCa...@AOL.COM> wrote:

>Chuck Berg <cb...@SoftwareStudio.SG818.com> wrote in message
>news:eeKS2.71$DW4....@client.news.psi.net...
>> I have been asked to present "Software Considerations for PCI H/W Designers"
>> during the PCI Design Conference at the PC Developers Conference. Since no
>> individual has as much experience with PCI driver design as this collective
>> group, I am soliciting input from you.
>>
>> I would like to present "The Top 10 Things Every PCI H/W Designer Should
>> Know About Drivers."
>
>

>Do not use I/O ports - they are much slower than registers.

Even worse:

1. Don't design features that require frequent configuration space accesses.
(Very bad design, but I've seen it.)

Configuration space accesses are even slower than I/O ports and require
holding spin locks for SMP usage.

I.e., things like interrupt, DMA engine, and device control should be available as
registers accessed through a BAR mapping.

2. Don't stall the PCI bus by filling up pipelines and forcing the CPU into retries or by
holding off TRDY#. Allow the writer to queue requests and/or check pipeline depth; that
way the driver can stop submitting requests before a stall condition occurs. It wouldn't
hurt to add a programmable low-water pipeline interrupt so the driver doesn't have to poll
for a restart condition.

3. Use DMA engines with deep FIFOs so PCI bursts can get maximum performance from the PCI bus.

4. Don't restrict DMA lists to reside in device localized memory. System memory is the
best place for these lists.

5. Try to use DMA engines to control as much of the device as possible, where feasible.
I.e., reduce the number of single-cycle accesses to the board. Also note: writes to a
PCI board (write posting/pipelining) are several times more efficient than reads.

6. Avoid being stuck with only the I2O standard; it's got some of the worst
queuing schemes I've ever seen, absolute junk. You'll be stuck with extra overhead, plus
i960 microprocessors forever. Most device-to-device data flow is an exception case and not the rule.
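
Item 2 above (checking pipeline depth and a low-water interrupt) can be modelled in a few lines of C. Everything here is hypothetical: the `cmd_fifo` fields, the idea that the free-slot count is readable by the driver, and the counter standing in for the interrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated device command pipeline. */
struct cmd_fifo {
    unsigned depth;          /* total slots in the pipeline */
    unsigned used;           /* slots currently occupied */
    unsigned low_water;      /* programmable threshold */
    unsigned low_water_irqs; /* low-water interrupts raised so far */
};

/* The driver can ask how much room is left before submitting. */
unsigned fifo_free_slots(const struct cmd_fifo *f)
{
    assert(f->used <= f->depth);
    return f->depth - f->used;
}

/* Driver side: hold the request back rather than stall the bus. */
bool driver_submit(struct cmd_fifo *f)
{
    if (fifo_free_slots(f) == 0)
        return false; /* caller queues the request in software */
    f->used++;
    return true;
}

/* Device side: finishing a command may drain down to the low-water mark. */
void device_complete(struct cmd_fifo *f)
{
    if (f->used == 0)
        return;
    f->used--;
    if (f->used == f->low_water)
        f->low_water_irqs++; /* tells the driver it can resume submitting */
}
```

The driver never writes into a full pipeline, so the bus never sees a retry, and it learns when to restart from a single interrupt instead of polling.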

That's all for now. Good luck.


Tim Keating
ktcn...@mediaone1.net,
(Note: remove numeric digits from email address before responding.)
