windows PCIe driver related

Yongqiang Shi

Nov 20, 2009, 9:07:41 AM
I have been given the task of developing a Windows driver for a PCIe data
capture device.
With WDK7200, I chose KMDF as the driver model.
The PCIe module of the device is implemented in an FPGA.
The hardware developer found that when my driver reads data from the
device, the FPGA receives a bunch of scattered requests. My driver maps the
memory space with "MmMapIoSpace" and uses "RtlCopyMemory" for reads.
For example, when I read 2 DWORDs from the device, the FPGA receives 2 read
requests!
I can only imagine that the PCI bus driver splits my read request, but this
causes an efficiency problem.
The driver only gets data at 4 MBps, which is far from the required
rate.
Can anyone point out what's wrong with my driver? Or is something
wrong on the FPGA side?

Also, I want to use DMA for acceleration, but I don't know how to use the
PC's DMA adapter in KMDF.
The PC's DMA adapter is attached to the ISA bus, which is attached to the PCI
bus.

Thanks for your attention. I look forward to your reply.

Don Burn

Nov 20, 2009, 9:19:02 AM
First, you should be using READ_REGISTER_XXX and READ_REGISTER_BUFFER_XXX, not
RtlCopyMemory, for reads. Second, you say you ask for 2 DWORDs and you get 2
read requests; what were you expecting? Does your hardware allow 64-bit
access to these registers? If so, you need to issue the access as a single
64-bit read that covers both items.
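
To illustrate the point about access width, here is a minimal user-mode sketch, not driver code. It assumes a simplified model in which every CPU load aimed at the device becomes exactly one read request at the FPGA. The names sim_device, sim_read32, sim_read64 and the two read_two_dwords_* helpers are hypothetical; in a real driver the 32-bit access would be READ_REGISTER_ULONG on the address returned by MmMapIoSpace, and a single 64-bit access is only an option where both the platform and the hardware support one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a device BAR: plain memory plus a counter that
 * stands in for the read requests the FPGA would see. */
typedef struct {
    uint8_t  regs[16];       /* simulated register space */
    unsigned read_requests;  /* one per load issued to the "device" */
} sim_device;

/* One 32-bit load = one read request in this model. */
static uint32_t sim_read32(sim_device *dev, size_t offset)
{
    uint32_t value;
    assert(offset % 4 == 0 && offset + 4 <= sizeof dev->regs);
    memcpy(&value, dev->regs + offset, sizeof value);
    dev->read_requests++;
    return value;
}

/* One 64-bit load = one read request, assuming the hardware accepts
 * 8-byte reads at this offset. */
static uint64_t sim_read64(sim_device *dev, size_t offset)
{
    uint64_t value;
    assert(offset % 8 == 0 && offset + 8 <= sizeof dev->regs);
    memcpy(&value, dev->regs + offset, sizeof value);
    dev->read_requests++;
    return value;
}

/* Reads two adjacent DWORDs with two loads; returns the requests generated. */
static unsigned read_two_dwords_separately(sim_device *dev, uint32_t out[2])
{
    unsigned before = dev->read_requests;
    out[0] = sim_read32(dev, 0);
    out[1] = sim_read32(dev, 4);
    return dev->read_requests - before;
}

/* Reads the same 8 bytes with one load; returns the requests generated. */
static unsigned read_two_dwords_combined(sim_device *dev, uint32_t out[2])
{
    unsigned before = dev->read_requests;
    uint64_t both = sim_read64(dev, 0);
    memcpy(out, &both, sizeof both);  /* byte-for-byte, so order is preserved */
    return dev->read_requests - before;
}
```

Under this model, two 32-bit reads produce two requests and one 64-bit read produces one, which matches what the hardware developer observed.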


As far as PC DMA goes, no, you cannot use it, and if you could, it would be a
lot slower than what you are getting now. As far as performance goes, what is
the goal here? If you have a large block of data to get each time, the FPGA
should have on-chip DMA; if you are collecting only a little data, i.e. a
few words, then your driver is the place to look.


--
Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply

Yongqiang Shi

Nov 21, 2009, 2:57:44 AM
Thanks for replying.
My OS is Windows XP 32-bit SP2. I changed the code from RtlCopyMemory
to READ_REGISTER_BUFFER_ULONG, but that still doesn't speed up the
read operation.
When I send a read IRP for 2 DWORDs, the FPGA gets 2 separate
requests, each asking for only one DWORD. This makes my reads too
slow, only 4 MBps (standard PCIe can reach 5 GBps for a 4x device).
Why can't the PCI bus driver send a single read request for 2 DWORDs at
a time?

For "READ_REGISTER_BUFFER_XXX" and "RtlCopyMemory", I thought the
first function was for physical addresses and the second was
for virtual addresses.
Since I used "MmMapIoSpace" to map the PA to a VA, why can't I use
the second function? And why do the two functions behave the same?

eagersh

Nov 25, 2009, 3:56:58 PM
It is not the PCI bus driver that does that. It is your driver that initiates
these transactions. You are transferring the data with the CPU, not with DMA.

> For "READ_REGISTER_BUFFER_XXX" and "RtlCopyMemory", I thought the
> first function is used for Physical Address, the second function is
> for Virtual Address.

No, both functions use virtual addresses.


> Since I used "MmMapIoSpace" function map the PA to VA, Why can't I use
> the second function? And Why the two functions behave the same?

You could, but the WDK documentation recommends using the first one. This
recommendation relates to compatibility across different versions of
Windows.
To get higher speed you need to use DMA transactions. The WDK has a sample
driver that uses DMA to transfer data:
\WinDDK\XXX\src\general\PLX9x5x

Igor Sharovar


Tim Roberts

Nov 26, 2009, 10:56:46 PM
Yongqiang Shi <zelt...@gmail.com> wrote:
>
>Thanks for replying.
>My OS is windows XP 32bit SP2. I changed the code from RtlCopyMemory
>to READ_REGISTER_BUFFER_ULONG, but still can't accelerate the whole
>reading operation.
>When I send a read IRP for 2 DWORDs, the FPGA get 2 separate
>requests, each requires only one DWORD. This make my reading too
>slow, only 4MBps (standard PCIe can reach 5GBps for a 4x device).
>Why the pci bus driver can't send only one request read for 2 DWORD at
>a time.

You're not thinking about this sequentially. The PCI bus driver doesn't
KNOW that you want 2 DWORDs. The x86 instruction set simply does not have
any mechanism for the CPU to tell the bus about that.

RtlCopyMemory and READ_REGISTER_BUFFER_ULONG both compile to "rep movsd"
instructions (well, for only 2 dwords they won't, but the point is the
same). That instruction, after some set up, runs one "movsd" per CPU
cycle. Each "movsd" moves one dword from source to destination, which goes
out as one bus cycle. The two bus cycles are separate and unrelated.

It would be possible to design a PCI bus controller that waits after
receiving a read to see if it gets another read for the next consecutive
address, but that's a lot of trouble. Reads take a long time, so it's
usually best to forward the read cycles on immediately. (For memory,
caching takes care of this.) That kind of thing IS commonly done for
writes -- that's called "write combining".
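
As a back-of-the-envelope check on the numbers in this thread, the sketch below works out what per-read cost the reported throughput implies. It assumes each DWORD read is a separate, fully serialized request and that "4MBps" means 4 x 10^6 bytes per second; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Time per read, in microseconds, implied by an observed throughput,
 * assuming every read is issued and completed one at a time. */
static double implied_read_latency_us(double bytes_per_read,
                                      double bytes_per_second)
{
    assert(bytes_per_read > 0.0 && bytes_per_second > 0.0);
    return bytes_per_read / bytes_per_second * 1e6;
}

/* Throughput, in bytes per second, when each request moves
 * bytes_per_read bytes and takes latency_us microseconds. */
static double throughput_bytes_per_s(double bytes_per_read, double latency_us)
{
    assert(bytes_per_read > 0.0 && latency_us > 0.0);
    return bytes_per_read / (latency_us * 1e-6);
}
```

With 4 bytes per read at 4 MBps, the implied cost is about 1 microsecond per read. Doubling the width of each read at that same cost would only reach about 8 MBps, which is consistent with the advice elsewhere in the thread that large transfers need DMA on the device side.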

>For "READ_REGISTER_BUFFER_XXX" and "RtlCopyMemory", I thought the
>first function is used for Physical Address, the second function is
>for Virtual Address.

No. CPU instructions NEVER work with physical addresses. Those two
functions are exactly identical (for x86, anyway).

>Since I used "MmMapIoSpace" function map the PA to VA, Why can't I use
>the second function? And Why the two functions behave the same?

On the x86, it turns out that you CAN use RtlCopyMemory. However, there
are some CPUs where you need to have a "memory barrier" before you start
reading from memory-mapped I/O. The READ_REGISTER_* functions will do that
where necessary. So, it's a "best practice" to use them.
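
The idea behind those functions can be sketched in portable C. This is not the WDK implementation: sketch_read_register_buffer_u32 is a hypothetical name, the C11 atomic_thread_fence call only stands in for whatever barrier a given platform actually requires for memory-mapped I/O, and the loop simply reads consecutive 32-bit locations through a volatile pointer so the compiler may not merge or drop the accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a register-buffer read: copies `count`
 * 32-bit values from consecutive device locations into `buffer`. */
static void sketch_read_register_buffer_u32(const volatile uint32_t *reg,
                                            uint32_t *buffer, size_t count)
{
    assert(reg != NULL && buffer != NULL);

    /* Placeholder for the platform barrier; on x86 nothing extra is
     * needed here, which is why RtlCopyMemory happens to work there. */
    atomic_thread_fence(memory_order_seq_cst);

    for (size_t i = 0; i < count; i++) {
        buffer[i] = reg[i];  /* one volatile 32-bit access per element */
    }
}
```

In this sketch every element is still fetched by its own access, so using it instead of a plain memory copy changes nothing about the number of requests the device sees.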
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
