Porting to a new USB controller


audi...@gmail.com

Jun 19, 2013, 11:29:15 AM
to fpgalin...@googlegroups.com
 
I am working on a project that uses an NXP LPC1347 as the USB controller, connected to an Altera Cyclone 5.
 
Is there a document describing the interface between the host library and the USB controller? I would like to
get FPGALink to work with my board.
 
Any general tips or hints for me to get this to work?
 
Thanks Carl

Chris McClelland

Jun 19, 2013, 12:27:49 PM
to fpgalin...@googlegroups.com
The LUFA library[1], which is the foundation of the AVR firmware, is also
available for the NXP LPC* devices[2], so you should probably start from
the AVR firmware[3].

The host<->micro connection is straightforward. There are a bunch of
custom vendor commands[4] for orchestration of GPIO and FPGA
programming, but the actual data transfer is done on IN/OUT pairs of
bulk endpoints. The pair to be used for programming operations and the
pair to be used for communication are requested by the firmware on
startup, as a response to the CMD_MODE_STATUS command. In [5], the first
call of the CMD_MODE_STATUS (0x80) command goes to a board with the FX2
firmware. The response tells the host to use endpoints 2&4 (offset 6)
for programming and 6&8 (offset 7) for communications. The second call
of the CMD_MODE_STATUS command goes to an AVR, which requests the use of
endpoints 3&4 for both programming and communications.
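
As a rough illustration of how a host might pick the two endpoint pairs out of the status response: the thread only says that the programming pair sits at offset 6 and the communication pair at offset 7. How a pair is packed into one byte is not stated, so the sketch below simply assumes the OUT endpoint is in the high nibble and the IN endpoint in the low nibble. The type and function names are hypothetical, not part of libfpgalink.

```c
#include <stdint.h>

/* Offsets taken from the description above. */
enum { STATUS_PROG_OFFSET = 6, STATUS_COMM_OFFSET = 7 };

/* Hypothetical type: one OUT/IN pair of bulk endpoints. */
typedef struct {
    uint8_t out_ep;
    uint8_t in_ep;
} EndpointPair;

/* ASSUMPTION: OUT endpoint in the high nibble, IN endpoint in the low
 * nibble. Check the firmware source before relying on this packing. */
EndpointPair decode_endpoint_pair(uint8_t packed) {
    EndpointPair pair;
    pair.out_ep = (uint8_t)(packed >> 4);
    pair.in_ep = (uint8_t)(packed & 0x0F);
    return pair;
}

/* The status buffer must hold at least 8 bytes. */
EndpointPair status_prog_pair(const uint8_t *status) {
    return decode_endpoint_pair(status[STATUS_PROG_OFFSET]);
}

EndpointPair status_comm_pair(const uint8_t *status) {
    return decode_endpoint_pair(status[STATUS_COMM_OFFSET]);
}
```

Under that assumption, an FX2 response would carry 0x24 at offset 6 and 0x68 at offset 7.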

The comm_fpga writes (i.e. flWriteChannel()) work by sending a five-byte
header, giving the channel number 0-127 in byte 0 (top bit clear)
followed by a 32-bit big-endian length, followed by the data to be
written.

The comm_fpga reads (i.e. flReadChannel()) work by sending a five-byte
request (same format as for writes, but with the top bit of byte 0 set,
not clear). The micro then sends the requested data back on the IN endpoint.
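
The five-byte header described above can be sketched in a few lines of C. This is only an illustration of the byte layout as described in this message; the function name and the FL_HEADER_SIZE constant are made up for the example and are not taken from libfpgalink.

```c
#include <stdint.h>

enum { FL_HEADER_SIZE = 5 };  /* hypothetical name */

/* Fill a five-byte comm_fpga header.
 * Byte 0: channel 0-127, top bit clear for a write, set for a read.
 * Bytes 1-4: 32-bit length, big-endian.
 * Returns 0 on success, -1 if the channel is out of range. */
int build_channel_header(uint8_t header[FL_HEADER_SIZE], uint8_t channel,
                         int is_read, uint32_t length) {
    if (channel > 127) {
        return -1;
    }
    header[0] = is_read ? (uint8_t)(channel | 0x80) : channel;
    header[1] = (uint8_t)(length >> 24);
    header[2] = (uint8_t)(length >> 16);
    header[3] = (uint8_t)(length >> 8);
    header[4] = (uint8_t)length;
    return 0;
}
```

For a write, the host would send these five bytes followed by the data on the OUT endpoint; for a read, it would send only the header and then collect the requested number of bytes from the IN endpoint.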

I'm considering reducing the message lengths from four bytes to just
two, because in the process of adding support for endless streams of
data, it makes sense to split large blocks of data into streams of
smaller blocks which are dispatched to the USB controller
asynchronously.

A particular micro may support several different conduits over which the
host may communicate with the FPGA. As of the latest binary
distribution, the FX2 firmware only supports its own high-speed FIFO
interface and the AVR firmware only supports the Enhanced Parallel Port
(IEEE1284) protocol. But recently I've been working on the AVR code for a
half-duplex synchronous serial protocol (good for applications where
FPGA I/Os are precious), and someone may want to implement the
comms-over-JTAG approach that is common in the Altera world. So the host
has to have some way to choose which conduit to use - hence the
flFifoMode() function, which is changing from accepting a bool to
accepting an integer, allowing the host to choose which conduit to use.

So for your micro, you need to choose what kind of communication to use.
The EPP protocol is simple, but requires a lot of I/O. A dedicated
high-speed synchronous FIFO interface like the FX2's would be nice, but
I don't think the LPC chips have that. The LPCs certainly have decent
serial comms support, so that's an option too, but my serial protocol
code is not yet in a usable state.

You might want to join the #fpgalink IRC channel on irc.freenode.net
(which Peter kindly created earlier today), where we can discuss
further.

Chris

[1]http://www.fourwalledcubicle.com/LUFA.php
[2]http://www.lpcware.com/content/project/nxpusblib
[3]https://github.com/makestuff/libfpgalink/tree/master/firmware/avr
[4]https://github.com/makestuff/libfpgalink/blob/master/firmware/avr/main.c#L276
[5]http://pastebin.com/raw.php?i=dWjyXFKv


audi...@gmail.com

Jun 20, 2013, 12:13:32 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx


That's a long and thorough answer. I need some time to absorb all the information and browse the sources before I continue with my project.

Thanks Carl
 

audi...@gmail.com

Jun 20, 2013, 5:06:25 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
 
 
Is the README file in the firmware/avr folder up to date?
It has a paragraph about neroJtag; is that a copy-paste from
a previous project?
 
Regards ///Carl
 
 
 
 
Fetch neroJtag and build:
  cd makestuff/apps
  ../common/msget.sh neroJtag
  cd neroJtag/
  make

Chris McClelland

Jun 20, 2013, 5:17:52 PM
to audi...@gmail.com, fpgalin...@googlegroups.com
You're right. It's an anachronism from the days when the firmware was in
a separate project. I'll fix it.

The dependencies listed are correct. Once installed, you can build the
firmware on Linux like this:

http://pastebin.com/raw.php?i=u6jfme90

Chris

Frank Buss

Jun 29, 2013, 3:40:45 AM
to fpgalin...@googlegroups.com

Carl, any news on your port? I would like to use FPGALink with an LPC11U24. I guess the USB hardware is the same.

audi...@gmail.com

Jun 29, 2013, 1:32:55 PM
to fpgalin...@googlegroups.com
I can't say that I've gotten anywhere yet; this project had to go on the back burner for a while.
I am fully immersed in writing a Linux device driver for my TV tuner card. When I am done
with that, or have given up, this is next on my to-do list.

Regards /// Carl

Frank Buss

Jul 3, 2013, 12:16:32 AM
to fpgalin...@googlegroups.com
I've started porting it to the LPC; just letting you know to avoid duplicate work. I think I can finish it today or tomorrow. It looks like the hardware and the board are abstracted in a way that should make it easy for you to add your LPC type.

audi...@gmail.com

Jul 3, 2013, 2:05:29 PM
to fpgalin...@googlegroups.com

Do you plan on using the LPCOpen framework?

Regards /// Carl

audi...@gmail.com

Jul 3, 2013, 2:35:47 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx

I am designing a universal device programmer. The general idea is to use an LPC1347 for USB comms and to control and monitor the VCC and VPP rails, and to load the FPGA with glue logic for the different devices to be programmed. Is it possible to have a virtual serial device connected to the LPC1347 while running FPGALink at the same time? I am not very familiar with USB; I have only used the reference virtual serial port examples and Prolific converters before.

Or would it be better to expand the vendor-specific commands to include what I need to set up VCC, VPP, etc.?

Thanks /// Carl



Chris McClelland

Jul 3, 2013, 3:04:23 PM
to fpgalin...@googlegroups.com
Why not just use the flPortAccess() function?

https://github.com/makestuff/libfpgalink/blob/master/libfpgalink.c#L257

Its purpose is to allow you to:

1) Set the direction of the bits on a specified 8-bit port.
2) Choose drive-low or drive-high for the bits selected as outputs.
3) Return the readback state of the port.

If all you want to do is set a list of individual port bits each to
either drive-high, drive-low or high-Z, you can use the more
user-friendly flPortConfig() function:

https://github.com/makestuff/libfpgalink/blob/master/prog.c#L893
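
To make the three steps above concrete, here is a small software model of a single 8-bit port. It is only a sketch of the semantics described in this message (direction mask, drive value, readback); the Port8 type and port_access() function are hypothetical and do not mirror the real flPortAccess() signature.

```c
#include <stdint.h>

/* Hypothetical model of one 8-bit port; not FPGALink's real data structure. */
typedef struct {
    uint8_t direction;  /* 1 = output, 0 = input (high-Z) */
    uint8_t drive;      /* level driven on bits configured as outputs */
    uint8_t external;   /* level applied from outside on each pin */
} Port8;

/* Set the direction and drive values, then return the readback state:
 * output bits read back the driven level, input bits read the level
 * applied from outside. */
uint8_t port_access(Port8 *port, uint8_t direction, uint8_t drive) {
    port->direction = direction;
    port->drive = drive;
    return (uint8_t)((drive & direction) |
                     (port->external & (uint8_t)~direction));
}
```

In this model, something like "D7+" would correspond to calling port_access() on port D with bit 7 set in both the direction and the drive argument.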

It is implemented in terms of the lower-level flPortAccess(). It's used
by the programming algorithms, but it's also useful for general-purpose
control and monitoring. For example, the Digilent Nexys2 has a novel
arrangement where the FPGA can be powered from USB via a small power FET
which is under the control of the FX2 micro, on port D7. That's why all
the Nexys2 flcli examples you see have things like "-w D7+"; it means
"set port D7 as an output and drive it high". That -w option in the
flcli utility is implemented like this:

https://github.com/makestuff/flcli/blob/master/main.c#L414
https://github.com/makestuff/flcli/blob/master/main.c#L509

Ultimately the flPortAccess() function is implemented with a USB vendor
command, which is handled in the AVR firmware here:

https://github.com/makestuff/libfpgalink/blob/master/firmware/avr/main.c#L356

...and in the FX2 firmware here:

https://github.com/makestuff/libfpgalink/blob/master/firmware/fx2/app.c#L263

Presumably that is all you need for the VCC and VPP control and
monitoring?

Chris



Frank Buss

Jul 3, 2013, 3:11:07 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Yes, I use lpcopen. My first step was to compile the GenericHID example and then adjust all the paths and settings in the LPCXpresso project files, so that I could move the source code of the GenericHID example outside of the c:\nxp\lpcopen directory (sometimes Eclipse is a mess; a clean Makefile would be easier, maybe someone can do this later). This worked.

Then I tried to modify the code, with the AVR code as a guide. It should all be ported now: the device is detected, with two bulk endpoints, but the FPGALink C example code on Windows says "flOpen(): usbOpenDevice(): LIBUSB_ERROR_NOT_FOUND". So it does something, because when I unplug the device, it says "flOpen(): usbOpenDevice(): device not found". I guess now I have to learn how it works to find the bugs :-)

Regarding your question: you can implement a USB composite device, with one interface for FPGALink and one for a virtual serial port. But if you plan to use your own PC software anyway, I would expand the FPGALink vendor-specific commands, which is much easier than implementing another USB device.

Chris McClelland

Jul 3, 2013, 3:44:11 PM
to fpgalin...@googlegroups.com
Hi Frank,

Many thanks for doing the port! It's great to have more people looking
at the code and making changes & additions.

I believe that LIBUSB_ERROR_NOT_FOUND error means that the
device was unable to select the USB device configuration requested by
FPGALink.

Firstly, make sure you have used Zadig.exe to load either a WinUSB or a
libusbK driver for the 1D50:602B device (the FPGALink VID:PID).

Secondly, it's possible you have not ported the descriptor file
correctly. I have a small utility lsep which will dump the descriptors.
You can try it like this:

http://pastebin.com/raw.php?i=5zk7AGhJ

Compare the descriptors I get for the AVR8 firmware with what your LPC
firmware gives and we'll go from there.

Chris

Frank Buss

Jul 3, 2013, 4:39:04 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Thanks for the test program, Chris; it showed the same error. I didn't know about Zadig.exe. Somehow I had installed libusb-win32-bin-1.2.6.0, which has a nice GUI for creating a driver .inf file, but this was the reason for the error message, because it looks like it is not compatible. With WinUSB, installed with Zadig.exe, your test program works and I can see the descriptors. There are minor differences, but now it shouldn't be a problem anymore.

All these programs, forks and versions are a bit confusing. I hope the libusbx project will solve this.

Frank Buss

Jul 3, 2013, 5:30:53 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
OK, now the descriptors are identical, except for the bRefresh and bSynchAddress fields, but it looks like they are not specified in the firmware. When I try "flcli -v 1d50:602b -c" from "libfpgalink-20120621\win32\rel", I get the error "Could not open FPGALink device at 1d50:602b and no initial VID:PID was supplied" (maybe you should add the error message from libusb, too). My thought was that it could be another version problem, so I compiled it from GitHub using your build environment, with "../common/msget.sh flcli" and "make deps". Compiling worked (after commenting out the fx2 lib, because it looks like there is no "basename" command and no Perl installed), and now I get this:

Frank@64bit$ flcli.exe -v 1d50:602b -c
Attempting to open connection to FPGALink device 1d50:602b...

Entering CommFPGA command-line mode:
flFifoMode(): usbControlWrite(): LIBUSB_ERROR_PIPE

Maybe a firmware problem now, but maybe you have another good idea. I'm still trying to avoid reading the libusb documentation :-)

Chris McClelland

Jul 3, 2013, 5:48:09 PM
to fpgalin...@googlegroups.com
Again, version pain. You should try the 2013-03-21 release, described
here:

https://groups.google.com/d/msg/fpgalink-users/U8Ex30Q8RDw/a2OteBVOFCkJ

If that doesn't work, you may need to use some lower-level tools. Here
are a couple of useful utilities, ucm.exe ("USB Control Message") and
hxd ("Hex Dump"):

http://pastebin.com/raw.php?i=8PCrP1Yh

The ucm.exe tool allows you to call arbitrary vendor commands on a
device. In the example above, it calls the CMD_MODE_STATUS (0x80)
command on the device. The response should look similar to mine.

The situation on Windows *is* confusing, I agree. But it's partly my
fault for not updating the FPGALink user manual more regularly.

The old libusb-win32 project included both user-space code and the
kernel-space driver. But the libusb-1.0 project only includes the
user-space code. The kernel-space driver (called libusbK) is a separate
project. In time, the old libusb-win32 project will hopefully fall into
obscurity and this confusion will no longer arise.

Chris

Chris McClelland

Jul 3, 2013, 6:06:54 PM
to fpgalin...@googlegroups.com
Sanity-check...my advice below assumes you based your firmware on the
AVR code on the GitHub master branch:

https://github.com/makestuff/libfpgalink/tree/master/firmware/avr

Chris

Frank Buss

Jul 4, 2013, 1:08:06 AM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Thanks, ucm.exe works and shows the same result with your cat's name. I found the problem with LIBUSB_ERROR_PIPE: The libusb documentation says "LIBUSB_ERROR_PIPE if the control request was not supported by the device", and it was right, because my ported version of the AVR firmware didn't handle this request, because I've ported only the USB side so far, not the port functions.
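
The failure mode described above can be pictured with a tiny request dispatcher. This is a hypothetical sketch, not the AVR or LPC firmware code: the only value taken from the thread is CMD_MODE_STATUS = 0x80, and the function and enum names are invented. Any vendor request without a matching case ends up stalled, which is what the host then reports as LIBUSB_ERROR_PIPE.

```c
#include <stdint.h>

enum { CMD_MODE_STATUS = 0x80 };  /* value quoted in the thread */

/* Hypothetical result type for the sketch. */
typedef enum { REQUEST_HANDLED, REQUEST_STALLED } RequestResult;

/* Hypothetical dispatcher: requests that are not ported yet fall through
 * to the default case and are answered with a stall. */
RequestResult handle_vendor_request(uint8_t request) {
    switch (request) {
    case CMD_MODE_STATUS:
        /* A real firmware would send its status block here. */
        return REQUEST_HANDLED;
    default:
        return REQUEST_STALLED;
    }
}
```

A port that implements only the USB plumbing behaves like the default branch for every command the host library issues after the status query.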

BTW: A nice tool for USB debugging is the USB sniffer software Bus Hound ( http://perisoft.net/bushound ). The free version has some limitations, e.g. you see only the first 4 bytes of each transaction, but most of the time that's all you need. The problem with my firmware showed up in the capture window as "stall pid".

The next problem was a "stall pid" again. In commit https://github.com/makestuff/libfpgalink/commit/f939a60d210b90ca26039e3025d054608c5f7ea7 you changed the protocol of CMD_MODE_STATUS, but in the AVR firmware (all branches) it still expects 1 instead of 0 (but looks like it was changed for the FX2 firmware). I've fixed this in my port, now trying to make the rest work.

Maybe you should add a protocol version byte to the status response, in one of the reserved bytes, so that the host can detect whether the device uses a compatible protocol. Another useful byte would be a "machine type" byte for AVR or LPC (or other microcontrollers in the future).

Chris McClelland

Jul 4, 2013, 5:56:51 AM
to fpgalin...@googlegroups.com
Apologies for the protocol change. The preliminary flSetFifoMode()
function used to accept a bool (i.e. you can either talk to the FPGA or
you can program it), but then I started adding alternative communication
mechanisms to the firmwares. For example in the AVR firmware, I'm adding
a synchronous serial connection that uses far fewer FPGA I/Os than the
existing EPP connection but is not that much slower. I think it would be
good to add a virtual JTAG transport as well, and an FPGA-side comm_fpga
implementation that uses the sld_virtual_jtag megafunction on Altera and
the BSCAN core on Xilinx. So I'm leaning towards renaming
flSetFifoMode() to something like flSelectConduit() to choose
comms-over-JTAG, comms-over-EPP, comms-over-serial,
comms-over-something-else etc.

The protocol number and machine type bytes are good ideas. I'll add
them.

I tend to develop everything on Linux, and then port to Windows and
MacOSX. USB debugging is trivial on Linux. :-)

http://www.makestuff.eu/wordpress/enabling-linux-usb-debugging/

Chris

Frank Buss

Jul 4, 2013, 1:59:40 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Good idea to use Linux for development. Maybe I should do this, too.

The USB communication works now. First I've implemented the flPortAccess function. I couldn't use the original function, because the LPC architecture uses 32 bits per port, and I didn't want to implement workarounds with 8 bit segments and "virtual" ports.

My first idea was to change the old function to 32 bits, but this would be overkill for the AVR architecture and not backward compatible, if we need that. The solution is a new function: flPortBitAccess. This function configures just one pin. For debugging you usually need only a few pins, and usually they are not on the same port, which would require multiple transfers anyway, so this should not be a problem. And if you use it for other things, most of the time you also need only a few pins, e.g. when implementing SPI over USB. This is the repository with the changes and the new LPC firmware implementation, cloned from your repository, on the master branch:


The separate commit steps might help Carl to see what needed to be changed from the example projects for the new project. And the modified flcli:


I've implemented a new parser (implementing recursive descent parsers is one of my hobbies :-) and it can be used like this:

Frank@64bit$ win.x64/rel/flcli.exe -v 1D50:602B -n a18-,a1
Attempting to open connection to FPGALink device 1D50:602B...
A18 = 0
A1 = 0

The n command (piN) is similar to the "write" command, but it combines reads and writes: if you don't specify a '+' or '-' suffix, the port will automatically be configured as an input. I don't see a reason to use "?" for it. With '+' or '-' the level will be set to 1 or 0 and the port will be configured as an output. For all listed ports it will read the current level and show it. It would be easy to implement the same function for the AVR microcontrollers, but we can first discuss whether the new interface makes sense. In the LPC datasheet the GPIO ports are numbered, e.g. GPIO0_15, but I think it is clearer to use A instead of 0 and B instead of 1 (there are only 2 ports in the LPC11Uxx chips).
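
For readers who want to see the pin syntax spelled out, here is a minimal sketch of parsing one item such as "a18-" or "a1". It is not the parser from the modified flcli; the PinSpec type, the PinMode enum and parse_pin_spec() are hypothetical names, and the limit of 32 bits per port is taken from the LPC description earlier in this message.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical types for the sketch. */
typedef enum { PIN_INPUT, PIN_DRIVE_LOW, PIN_DRIVE_HIGH } PinMode;

typedef struct {
    int port;      /* 0 for 'A', 1 for 'B', ... */
    int bit;       /* bit number within the port, 0-31 */
    PinMode mode;  /* no suffix = input, '-' = drive low, '+' = drive high */
} PinSpec;

/* Parse one item such as "a18-", "A1" or "b3+". Returns a pointer just
 * past the parsed item, or NULL on a syntax error. */
const char *parse_pin_spec(const char *text, PinSpec *spec) {
    char *end;
    long bit;
    if (!isalpha((unsigned char)*text)) {
        return NULL;
    }
    spec->port = toupper((unsigned char)*text) - 'A';
    text++;
    if (!isdigit((unsigned char)*text)) {
        return NULL;
    }
    bit = strtol(text, &end, 10);
    if (bit > 31) {
        return NULL;
    }
    spec->bit = (int)bit;
    if (*end == '+') {
        spec->mode = PIN_DRIVE_HIGH;
        end++;
    } else if (*end == '-') {
        spec->mode = PIN_DRIVE_LOW;
        end++;
    } else {
        spec->mode = PIN_INPUT;
    }
    return end;
}
```

A caller would invoke it repeatedly, skipping the comma between items, so "a18-,a1" yields A18 driven low followed by A1 as an input.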

Now some more information about the development and porting for Carl, if he wants to add support for his microcontroller.

I've tested it with LPCXpresso_5.2.4_2122.exe and lpcopen_v1.03.zip. The project files for lpcopen require both to be installed and unpacked to c:\nxp, so that you have the following directories:

c:\nxp\lpcopen
c:\nxp\LPCXpresso_5.2.4_2122

First I moved the project file out of the lpcopen directory, for easier version management. I started with the HID example (still in the repository for testing) and copied these directories and files:

C:\nxp\lpcopen\applications\LPCUSBlib\lpcusblib_GenericHIDDevice
C:\nxp\lpcopen\applications\lpc11xx\xpresso_projects\nxp_xpresso_11u14\lpcusblib\nxp_xpresso_11u14_usb_GenericHIDDevice
C:\nxp\lpcopen\applications\lpc11xx\startup_code\cr_startup_lpc11xx.c
C:\nxp\lpcopen\software\lpc_core\lpc_board\boards_11xx\nxp_xpresso_11u14
C:\nxp\lpcopen\applications\lpc11xx\xpresso_projects\nxp_xpresso_11u14\nxp_xpresso_11u14_board_lib

In LPCXpresso you can compile the HID example like this:

File->Import->General->Existing Projects into Workspace, Next, Browse:
C:\nxp\lpcopen\applications\lpc11xx\xpresso_projects\nxp_xpresso_11u14\nxp_xpresso_11u14_usblib_device
ok, Finish

File->Import->General->Existing Projects into Workspace, Next, Browse:
[your Github project directory]\firmware\lpc
selected nxp_xpresso_11u14_board_lib and nxp_xpresso_11u14_usb_GenericHIDDevice for import
ok, Finish

If you want the .bin file: right-click on the nxp_xpresso_11u14_usb_GenericHIDDevice project: Properties->C/C++ Build->Settings->Build Steps->Command. There is a "#", which comments out the .bin file generation. Remove the "#". For faster testing I've added a "cp [path see below]\nxp_xpresso_11u14_usb_GenericHIDDevice.bin k:\firmware.bin". When you power on the device in USB mode (PIO0_1 low during power-on or reset), a ROM loader on the chip registers itself with Windows as a thumb drive, so compiling the firmware flashes it automatically.

Then press ctrl-b to rebuild all. If something went wrong, try "Project->Clean->Clean all Projects", which builds the projects, too, if the checkbox "Start a build immediately" is checked. The firmware file is at [your Github project directory]\firmware\lpc\nxp_xpresso_11u14_usb_GenericHIDDevice\Debug\nxp_xpresso_11u14_usb_GenericHIDDevice.bin. Start C:\nxp\lpcopen\applications\LPCUSBlib\lpcusblib_GenericHIDDevice\HIDClient.exe on Windows, connect your microcontroller to the USB bus of your PC and it should detect one device in the "Device" combobox, named "LPCUSBLib Generic HID Demo".

The debug firmware is 14,756 bytes big. To build the "release" version: For each project left click it and then select Project->Build Configurations->Set Active->Release in the menu. The release version is 6,292 bytes big.

If still something doesn't work, try deleting the Eclipse workspace: C:\Users\[your username]\Documents\LPCXpresso_5.2.4_2122\workspace
There are lots of temporary files in it, which sometimes cause build errors that can't be fixed with a "clean" (or at least I don't know how), especially if you change the build settings.

Another tip: don't add files to a project with "Import->File System"; it looks like it doesn't work. It doesn't change the project file, only a local copy in the workspace. Instead, edit the linkedResources section in the .project file. Then right-click and "Close Project" in LPCXpresso and open it again,
because the update doesn't work otherwise, even if you restart LPCXpresso.

If the HID example works, remove the project from the workspace and add the nxp_xpresso_11u14_usb_FPGALink project from firmware\lpc. Then you could create a new directory for the LPCXpresso project files; you might need to take the project files for your microcontroller and merge them with my project files, I don't know how much they differ. The source files are in lpcusblib_FPGALink and should be the same for all LPC microcontrollers. But your microcontroller might need its own "board" project. The LPCXpresso project files for the board project are in nxp_xpresso_11u14_board_lib and the source code is in nxp_xpresso_11u14.

Currently the FPGALink firmware is 17,128 bytes in debug mode and 7,216 in release mode. Not much different from the HID example. It looks like there is a lot of overhead from lpcopen and the USB library itself.

BTW: To me it doesn't look like a good idea for the lpcopen project to use so many duplicated and only slightly changed project files, and even duplicated source code files. Some nice makefiles would be much better. I think LPCXpresso supports the use of makefiles, too, instead of their arcane XML project files.

Next I'll finish the JTAG interface port to start the CPLD on my board (a Lattice chip, but more the size of a small FPGA) and then I can implement and test the VHDL part for the FIFO communication on the CPLD and the microcontroller.

Frank Buss

Jul 5, 2013, 2:12:52 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Chris, JTAG is implemented now. I've refactored some of your macros to functions and separated the low level hardware layer more cleanly from the logic level. GCC should know how to optimize it :-) I don't know if it works, but at least it shows something for scanning the JTAG chain:

Frank@64bit$ flcli.exe -v 1D50:602B -q A1A2A3A4

Attempting to open connection to FPGALink device 1D50:602B...
The FPGALink device at 1D50:602B scanned its JTAG chain, yielding:
0x01FFFFFF
0x2BFFFFFF
0xC0000000
0x43FFFFFF

But when I try to program it, it hangs:

Frank@64bit$ flcli.exe -v 1D50:602B -p J:A1A2A3A4:CrazyCartridge.svf

Attempting to open connection to FPGALink device 1D50:602B...
Programming device...

I don't know how to set up the Visual Studio debugger with your makestuff build environment, but some printfs indicate that it hangs in the flLoadSvfAndConvertToCsvf function. This is the SVF file: http://www.frank-buss.de/tmp/CrazyCartridge.zip I guess you will be faster at finding the bug.

What do you think about the JTAG port configuration? I don't know if it is necessary to configure the pins, because usually the firmware is for a given hardware with fixed connections for the JTAG pins. In the LPC implementation I used fixed pins and implemented a dummy stub for the CMD_PORT_IO, which is not needed, but still called from the flcli software, so that the original flcli software should work for JTAG.

Chris McClelland

Jul 5, 2013, 3:06:11 PM
to fpgalin...@googlegroups.com
Hi Frank,

Those IDCODEs look wrong to me. If your board's JTAG chain contains just
one Lattice FPGA, the scan-chain should return just one 32-bit number,
which should be the correct IDCODE from your device's BSDL file.

Your .svf file uncovered some bugs in my SVF parser. I worked around
them to create this .csvf file:

www.swaton.ukfsn.org/temp/frank.csvf

Once you have the correct IDCODE, see if it works on your board. In the
meantime I'll fix the parser.

I had hoped to do the same port configuration on the AVR as I did on the
FX2, but the FX2 port config works by modifying its own code at runtime,
which is not possible on the AVR because of its Harvard architecture. It
would be possible without self-modifying code, but a lot slower. So I
came to the conclusion it'd be better to leave it hard-coded at firmware
compile-time.

Can you explain the meaning of the flPortBitAccess() parameters?

We should work out a good way for you to submit code back to the main
branch. The usual method is to use a GitHub pull request.

Chris

Frank Buss

Jul 6, 2013, 4:06:14 PM
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Right, should be just one number. I have analyzed it on the scope and there it is the right ID. The Visual Studio debugger is now working (well, I can attach to a running process and then break all threads), so I found this tricky bug in the 64 bit version:


Maybe you should use the standard uint32_t types from stdint.h (not available for older Visual Studio versions). libusb.h does it like this:

/* stdint.h is also not usually available on MS */
#if defined(_MSC_VER) && (_MSC_VER < 1600) && (!defined(_STDINT)) && (!defined(_STDINT_H))
typedef unsigned __int8 uint8_t;
typedef unsigned __int16 uint16_t;
typedef unsigned __int32 uint32_t;
#else
#include <stdint.h>
#endif

I've tried to change it in makestuff.h, but then it looks like I get the bits mirrored. The BSDL file says this:

attribute IDCODE_REGISTER of LCMXO2_4000HC_XXTG144 : entity is
"0000" & --Version number
"0001001010111100" & --Device specific number
"000001000011"; --Company code

But I get this:

Frank@64bit$ win.x64/dbg/flcli.exe -v 1D50:602B -q A1A2A3A4

Attempting to open connection to FPGALink device 1D50:602B...
The FPGALink device at 1D50:602B scanned its JTAG chain, yielding:
0x01D403C2

Some other BSDL files suggest that the bit order in the BSDL file is really from most significant bit to least significant bit, e.g. this one from a Cyclone FPGA:

attribute IDCODE_REGISTER of EP1C3T100 : entity is
"0000"& --4-bit Version
"0010000010000001"& --16-bit Part Number (hex 2081)
"00001101110"& --11-bit Manufacturer's Identity
"1"; --Mandatory LSB

And there is a delay of 10 ms between the calls, is this right? One scan sequence looks like this:


You can see the flPortBitAccess() parameters in the header file: https://github.com/FrankBuss/libfpgalink/blob/master/libfpgalink.h But if there is such a high delay between two USB commands, and maybe even some more communication overhead, it could be better to implement just a 32 bit version of flPortAccess. This would at least reduce the overhead for the direction bits, and for e.g. SPI sometimes you can set two bits in parallel.

BTW: For better debugging, I've added a tiny printf implementation to the firmware project (and enhanced it to support %i, not just %d, because I use %i all the time). Makes the code 700 bytes bigger. There are problems with the newlib stubs in the lpcopen framework, so I renamed it to "uprintf". Might be useful for the AVR project, too, instead of the difficult to read multiple lines of debugSendLongHex and debugSendFlashString.


Good idea with the pull requests. But first I should implement the rest. Thanks for the CSVF file conversion, this would be the next step. Maybe we are lucky and I can fix some more bugs :-)

Chris McClelland

Jul 6, 2013, 7:17:16 PM
to fpgalin...@googlegroups.com
Sorry, what am I supposed to look at on that Visual Studio screen-grab?
I could be going blind, but I don't see bugs there. In any event, I
don't advise changing makestuff.h. Everything depends on it, not just
FPGALink, all my other projects, on umpteen platforms, umpteen different
compilers, etc etc.

The JTAG IDCODEs are printed by FPGALink in the bit-order specified in
the BSDL (i.e. MSB appears textually first, LSB textually last), so
Xilinx devices always end in 0x93, Altera devices always end in 0xDD.
Also, since the LSB of the IDCODE is guaranteed to be '1', the IDCODE
will always be an odd number, never even.
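
These rules make for a cheap sanity check on a scanned IDCODE. The helpers below are an illustrative sketch with invented names; the field positions follow the Cyclone BSDL quoted earlier in the thread (4-bit version, 16-bit part number, 11-bit manufacturer identity, mandatory LSB of 1).

```c
#include <stdint.h>

/* The mandatory LSB of an IDCODE is 1, so a valid IDCODE is odd. */
int idcode_lsb_ok(uint32_t idcode) {
    return (idcode & 1u) != 0;
}

/* The last two hex digits as printed, e.g. 0xDD for the Cyclone example. */
uint8_t idcode_low_byte(uint32_t idcode) {
    return (uint8_t)(idcode & 0xFFu);
}

/* The 11-bit manufacturer identity sits in bits 11..1. */
uint16_t idcode_manufacturer(uint32_t idcode) {
    return (uint16_t)((idcode >> 1) & 0x7FFu);
}
```

Applied to the readbacks in this thread, 0x01D403C2 fails the LSB check because it is even, whereas the value built from the Lattice BSDL, 0x012BC043, passes.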

10ms seems a little slow. I have never tried benchmarking control
requests but at a guess I'd say one message per poll interval, which is
usually 2ms. What you suggest (doing up to 32 bit reads/writes in one
command) did occur to me. There's no reason why the existing 8-bit
vendor command couldn't be trivially extended to 32-bit ports; for
devices with 8-bit ports, the top 24 bits would be "don't care" on
writes, and returned as 0 on reads.

Chris

Chris McClelland

Jul 6, 2013, 7:24:07 PM
to fpgalin...@googlegroups.com
Actually, regarding the IDCODE readback, it looks like you had it almost
right before. From the Lattice BSDL:

> attribute IDCODE_REGISTER of LCMXO2_4000HC_XXTG144 : entity is
> "0000" & --Version number
> "0001001010111100" & --Device specific number
> "000001000011"; --Company code

That's:
0000 0001 0010 1011 1100 0000 0100 0011
   0    1    2    B    C    0    4    3

And "01 2B C0 43" are precisely the bytes returned in the MSB of each
32-bit word, from your previous email:

> The FPGALink device at 1D50:602B scanned its JTAG chain, yielding:
> 0x01FFFFFF
> 0x2BFFFFFF
> 0xC0000000
> 0x43FFFFFF
    ^^
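
The arithmetic above is easy to check mechanically. The following sketch (hypothetical helper names, not FPGALink code) converts the BSDL bit strings into a 32-bit number and rebuilds the same value from the top byte of each of the four words in the earlier scan output.

```c
#include <stdint.h>

/* Convert a string of '0'/'1' characters, MSB first, into a 32-bit value.
 * Any other characters (spaces, quotes, '&') are skipped. */
uint32_t bits_to_u32(const char *bits) {
    uint32_t value = 0;
    for (; *bits != '\0'; bits++) {
        if (*bits == '0' || *bits == '1') {
            value = (value << 1) | (uint32_t)(*bits - '0');
        }
    }
    return value;
}

/* Rebuild one IDCODE from the most significant byte of each of four
 * 32-bit words, first word supplying the top byte. */
uint32_t idcode_from_top_bytes(const uint32_t words[4]) {
    return (words[0] & 0xFF000000u)
         | ((words[1] >> 8) & 0x00FF0000u)
         | ((words[2] >> 16) & 0x0000FF00u)
         | (words[3] >> 24);
}
```

Both routes give 0x012BC043 for the Lattice device, which suggests the earlier firmware was returning one IDCODE byte per 32-bit word.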

Chris



Chris McClelland

Jul 6, 2013, 7:32:08 PM
to fpgalin...@googlegroups.com
Last word, I promise: there's no need to attach the Visual Studio
debugger to a running process; you can run a binary in the debugger by
launching the exe itself. Quoting from [1]:

On Windows, you can launch the Visual Studio debugger on an exe
with VCExpress.exe win32/dbg/mul.exe. You will then need to
right-click the mul project in Solution Explorer, select
Properties and fill in the Arguments property with 10 2. Now you
can right-click mul again and choose Debug->Step Into New
Instance to start debugging at main().

Chris

[1]http://www.makestuff.eu/wordpress/software/build-infrastructure/



Frank Buss

Jul 7, 2013, 1:35:00 AM7/7/13
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Sorry, I thought sizeof(uint32) was 8, but I was just confused by sizeof(char*) being 8. Meanwhile I had changed the firmware too, so I thought this was the bug, but it was just my firmware.

JTAG scanning is working now; it was a really silly bug in my firmware port.

Frank Buss

Jul 7, 2013, 12:10:05 PM7/7/13
to fpgalin...@googlegroups.com, fpgal...@m3.ath.cx
Now it looks like the JTAG programming function is running. First it timed out in jtagIsSendingIsReceiving, but after I moved "Endpoint_ClearOUT" immediately after "Endpoint_Read_Stream_LE", the function finished. I'm not sure why :-)

But now I get this error message (I've enabled debug mode to see more information, but I get the error without debug, too)


Maybe a bug with the CSVF file conversion? At least it detects the chip ID, so I guess the rest should work, too. Maybe you have an idea; otherwise I can verify it with the scope, but that would be a bit of work.

Chris McClelland

Jul 7, 2013, 3:03:18 PM7/7/13
to fpgalin...@googlegroups.com
There's no easy answer to that, unfortunately. Basically it's saying it was expecting the FPGA to respond with 02, but the FPGA actually responded with 00.
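For readers unfamiliar with how such a mismatch arises: SVF-style players shift data into the chain, capture what comes back on TDO, and compare it with an expected value under a mask. A minimal sketch of that comparison follows; the function name and byte-array layout are illustrative assumptions and are not taken from the FPGALink sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first byte where the captured TDO data differs
 * from the expected data in a bit selected by the mask, or -1 if every
 * masked bit matches. Names and layout are illustrative only. */
int tdoFirstMismatch(const uint8_t *captured, const uint8_t *expected,
                     const uint8_t *mask, size_t numBytes) {
	for (size_t i = 0; i < numBytes; i++) {
		if (((captured[i] ^ expected[i]) & mask[i]) != 0) {
			return (int)i;
		}
	}
	return -1;
}
```

An expected byte of 02 against a captured 00 fails this check whenever bit 1 is set in the mask.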

This could be due to a number of things:

1) A bug in my conversion of the Lattice .svf file into .csvf.

2) A bug in the .csvf player logic (unlikely, but possible I guess).

3) A bug in your firmware - I'd be happy to send you a suitable AVR board for testing with a "known-good" firmware if you like. Alternatively you could try using your firmware to program an Altera or Xilinx FPGA, which is known to work.

I think (1) is the most likely cause of the problems. One of the sanity-checks I used to do in the integration test suite was to verify that the .csvf file generated from the source .svf is the same irrespective of whether you go directly from the .svf using my xsvf2csvf utility, or you go indirectly via an .xsvf file generated by the Xilinx svf2xsvf tool. I stopped doing that because it was taking several hours to build all the HDL in the various combinations.

So just now I tried doing it for the VHDL version of the cksum example on all platform combinations, and for your .svf:

http://pastebin.com/0Uk2sBjW

The result is not particularly impressive:

http://pastebin.com/raw.php?i=DYQvvtRp

The most common problem seems to be the buffer-size check which fails for large XSDRTDO commands. I'll try to find the root cause of these problems and get back to you.

Chris



Peter Stuge

Jul 7, 2013, 8:05:17 PM7/7/13
to fpgalin...@googlegroups.com
Hi Frank, list,

I have circa 9 years experience with the libusb project (ie. the
project that created libusb-0.1, libusb-1.0 and libusb.org) and
I was asked by two succeeding libusb maintainers to take over
after them.

The first time I declined, because I did not feel that I knew the
code well enough to take responsibility for it. The project went
a year or so without any maintainer, we worked on creating the
libusb-1.0 API during that time. The second time I was asked in
summer of 2010 I accepted, after having spent another few years
with the project.


Frank Buss wrote:
> All these programs, forks and version are a bit confusing. I hope
> the libusbx project will solve this.

libusbx and the people there are rather the problem, not the solution.
(libusb-win32 is also a problem. It always was.) The emperor has no
clothes.

It's well worth remembering, and I think quite telling, that libusbx
very clearly proclaimed themselves to be "a hostile fork" in their
announcement email thread.

Anyway, I wish that you would look more closely into the facts of the
libusb situation. The history is indeed complicated, but if you take
some time to look into it you will have the satisfaction of knowing
what really happened, as opposed to knowing only what others say.
I can tell you all that you would ever want to know and more, but I
think it's much better to read independently and form your own opinion.

The libusbx people are quite persistent with propaganda and self-
promotion, and my time is simply too limited to fight lies, name-calling
and ad hominems all over the internet, while at the same time working
with untainted downstreams (one libusbx maintainer is a Red Hat employee
and most distributions switched to libusbx when he switched the Fedora
package source) as well as doing qualified user support and training,
and sometimes even working on actual code. I spent well over 1000
hours on libusb-1.0.9 over the course of circa two years. All of my
libusb work is voluntary.

I have high standards. The vocal majority on the libusb mailing list
disagree with this and want a feelgood sense of progress, regardless
of quality. None of them produce significant amounts of commits. Few
of them produce any commits at all. I naïvely tried to reason about
this, but that of course only made things worse.

There are many reasons for my high standards; part of it is that I
am uninterested in dealing with many users experiencing some problem
if I could preempt that by being more careful during development and
review (one-to-many benefit), another part is that I feel that it is
our damn responsibility to produce the very best quality software
that we can possibly accomplish since we are in the privileged
position of being able to create software which the entire world
can reuse.

Especially with open source I do not accept that deadlines should be
a driving factor, because deadlines will always result in mediocre
quality. We can do better than mediocre as a community, but obviously
people will also have to wait longer to get something that they can
try out, if they will not get anything at all until something exists
which is good enough to actually perform reliably.

Experience shows that almost nobody shares my values, libusb only
drives every printer, scanner, digital camera and MTP or iPod MP3
player on every Linux system in the world, and none of those things
are very important. Some of the medical systems might be, though.

People get frustrated, scared, angry or annoyed when I tell them that
I think that they can do even better than their first try. That's sad,
but all I can really do is to continue working according to my own
standards and keep rejecting mediocrity. Maybe some day someone will
appreciate that.

Please excuse this digression, as you may understand this is a topic
that I care about, and I believe that I have unique experience from
being the maintainer of libusb for several years and having been with
the project for such a long time, but I also don't want to go too far
off topic on this list. If you or anyone else wants to discuss libusb
with me, please let's do so privately or maybe on IRC.

If you feel like debating with me, please first have a look at this:
http://www.netbooknews.com/wp-content/2011/07/the-pyramid-of-debate-550x417.jpg

This list is about FPGALink, so let's get back on topic:


I'm a big fan of LPC1300, they're really cute controllers! I'm happy
to hear that you're working on adding support for them in FPGALink! :)

The LPC11U24 and other LPC11U controllers will probably work without
significant changes to the code. Awesome!

But is LPCOpen really the right choice? FPGALink is clearly a very
open source project. LPCOpen not so much. The LPCUSBlib license with
its restriction on use with NXP Microcontrollers is pretty daft, and
a polar opposite of an open source license. The only thing "open" about
LPCOpen is the four letters in its name, I'm afraid - there aren't even
Makefiles for building with GCC.

I for one don't think this is such a good fit in a real open source
project such as FPGALink. I'd welcome suggestions for other sources
of a USB stack. I've taught several workshops with the LPC1343 using
modified code from 32bitmicro's old GNU toolchain port of NXP
examples, but that code isn't so much better from a licensing point
of view.

I would very much welcome ideas for how to easily improve this. Maybe
it would actually make sense to write a simple replacement for the
NXP code, it's actually not doing all that much..


Hoping for good ideas

//Peter

Peter Stuge

Jul 7, 2013, 9:26:45 PM7/7/13
to fpgalin...@googlegroups.com
Frank Buss wrote:
> The solution is a new function: flPortBitAccess. This function
> configures just one pin.

Keep in mind that round trips over USB will take much longer than
one might intuitively expect. USB offers high throughput but not low
latency..


> implementing SPI over USB

..so don't try to do generic bitbanging over USB, all you will get is
absolutely horrible performance. :\

The key to successful use of USB is creating an application-specific
protocol using native USB primitives in the best possible ways.
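A rough back-of-envelope calculation shows why. The sketch below assumes one USB round trip per pin change and a round-trip time that you supply; both are illustrative assumptions, not measured FPGALink figures, and the function name is made up.

```c
#include <stdint.h>

/* Estimate bits per second for bit-banging where every pin change costs
 * one USB round trip. roundTripUs is the assumed round-trip time in
 * microseconds; changesPerBit is the number of pin changes per bit. */
uint32_t bitbangBitsPerSecond(uint32_t roundTripUs, uint32_t changesPerBit) {
	if (roundTripUs == 0 || changesPerBit == 0) {
		return 0;  /* avoid division by zero on nonsensical input */
	}
	return 1000000u / (roundTripUs * changesPerBit);
}
```

With an assumed 1 ms round trip and two pin changes per bit (clock high, clock low), this works out to 500 bits per second, whereas a protocol that sends a whole SPI transaction in one bulk transfer pays the round-trip cost only once.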


> In the LPC datasheet the GPIO ports are numbered, e.g. GPIO0_15, but
> I think it is more clearer to use A instead of 0 and B instead of 1
> (there are only 2 ports in the LPC11Uxx chips).

The established naming both in NXP documentation and otherwise is
P0.15 for GPIO0 bit 15. Please don't invent a new, different, naming
system. Work with the existing, established, one. Otherwise you'll
just create unnecessary confusion. (An example of this that comes to
mind is the many libusb derivatives.)


> Now some more information about the development and porting for Carl, if he
> wants to add support for his microcontroller.
>
> I've tested it with LPCXpresso_5.2.4_2122.exe and lpcopen_v1.03.zip.

LPCOpen is a dead end IMNSHO. NXP has chosen to place everything
related to software development on partners who only work with
proprietary IDEs (are all of them based on Eclipse these days?)
which isn't so useful for open source projects. It's possible to
use basically the same code with GCC, and I think it makes much more
sense to only require a minimal toolchain in order to build the
firmware, rather than hundreds of megabytes, even gigabytes, of
node-locked, for-pay only, or 30-day trial IDE software..

I agree 100% that a Makefile is the way to go!


//Peter

Frank Buss

Jul 8, 2013, 1:49:25 AM7/8/13
to fpgalin...@googlegroups.com
Hi Peter,

are you sure that LPCOpen is dead? It's still referenced as the library to use from "LPCZone": http://www.nxp.com/techzones/microcontrollers-techzone/design-resources.html and then on "LPCware": http://www.lpcware.com (we all love cool marketing names), with an active forum. And much useful code is integrated: all the USB examples, FreeRTOS, filesystem support for mass storage on SD cards etc. It would mean quite some work if you need this and have to port it yourself. And the only limitation of the free version of the Eclipse-based IDE is 32 kB, so it is no problem for many small microcontrollers.

But on the other hand, it may be better to write our own simple version for the FPGALink project. The Eclipse build process and the source code are really bloated; a Makefile process, really open source and with no restrictions, would be nice. And I'm not sure about the code quality. Some examples in LPCOpen which make me nervous:

for (i = 0; i < Length; i++) {
  #if defined(__LPC175X_6X__) || defined(__LPC177X_8X__) || defined(__LPC407X_8X__)
  if (endpointselected[corenum] != ENDPOINT_CONTROLEP) {
    while (usb_data_buffer_OUT_size[corenum] == 0) ; /* Current Fix for LPC17xx, havent checked for others */
  }
  #endif
  ((uint8_t *) Buffer)[i] = Endpoint_Read_8(corenum);
}

uint8_t Endpoint_Read_Control_Stream_LE(uint8_t corenum, void *const Buffer,
uint16_t Length)
{
  while (!Endpoint_IsOUTReceived(corenum)) ; // FIXME: this safe checking is fine for LPC18xx
  Endpoint_Read_Stream_LE(corenum, Buffer, Length, NULL); // but hangs LPC17xx --> comment out
  Endpoint_ClearOUT(corenum);
  return ENDPOINT_RWCSTREAM_NoError;
}

static inline void Delay_MS(uint16_t Milliseconds)
{
  while (Milliseconds--)
  {
    volatile uint32_t i;
    for (i = 0; i < (4 * 1000); i++) { /* This logic was tested. It gives app. 1 micro sec delay */
      // [yeah, but what about new GCC versions or different optimizations?]
      ;
    }
  }
}

And I don't like the polling concept of LUFA. It would be better to have an interrupt-driven implementation, because then it is easier to write longer-running programs in the main process, or even go into some CPU sleep mode when supported, and it runs faster. When you implement your own logic inside the "task" concept of LUFA and code like this from LPCUSBlib is called:

uint8_t Endpoint_Write_Stream_LE(uint8_t corenum,
  const void *const Buffer,
  uint16_t Length,
  uint16_t *const BytesProcessed)
{
  uint16_t i;
  while ( !Endpoint_IsINReady(corenum) ) { /*-- Wait until ready --*/
    Delay_MS(2);
  }
  for (i = 0; i < Length; i++)
    Endpoint_Write_8(corenum, ((uint8_t *) Buffer)[i]);
  return ENDPOINT_RWSTREAM_NoError;
}

then it can cause some serious performance problems. Another point is that the LPCXpresso IDE doesn't support C++, which I would like to use in my projects.

That said, it would be a lot of work for me to learn all the low-level USB details, so first I'll try to finish it with the current LPCOpen implementation. Maybe when I have some time later, I'll try to write my own simple interrupt-driven implementation.

Regards,

Frank

Peter Stuge

Jul 8, 2013, 3:34:30 PM7/8/13
to fpgalin...@googlegroups.com
Hi Frank,

Frank Buss wrote:
> are you sure that LPCOpen is dead?

Sorry, I meant "dead end" as in it doesn't look like the right approach.

You're completely right that it is what NXP is pushing as software
development environment for their parts.


> But on the other hand maybe better to write our own simple version
> for the FPGALink project. The Eclipse build process and the source
> code is really bloated; a Makefile process, really open source and
> no restriction would be nice.

I agree 100%.


> And I'm not sure about the code quality.

Uff, yes.. Not so nice.


> And I don't like the polling concept of LUFA. Would be better to have an
> interrupt driven implementation

Also agree 100%.


> That said, would be a lot of work for me to learn all the USB
> low-level details, so first I try to finish it with the current
> LPCOpen implementation. Maybe when I have some time later, I'll
> try to write my interrupt-driven simple implementation.

Take a look at http://stuge.se/lpc-p1343_buttons.tar.bz2 (or .zip)
which is a simple-ish code example for the LPC-P1343 Olimex board.

The licensing situation is not completely kosher - the USB code is
from the GCC port of NXP example code made by 32bitmicro, but at
least it is not terribly bloated, and it builds with a single make
command.


//Peter

Carl-Fredrik Sundström

Jul 8, 2013, 3:51:31 PM7/8/13
to fpgalin...@googlegroups.com
 
I was planning on using this project as the basis for a Makefile-based build environment.

I agree that LUFA polling is bad and probably won't work very well with, for example, an RTOS.
I prefer interrupt-driven code as much as possible, with macros to support the different yields of different RTOSes.
 
Regards /// Carl

Chris McClelland

Jul 8, 2013, 4:56:31 PM7/8/13
to fpgalin...@googlegroups.com

I'm not so sure that LUFA's polled architecture is a bad thing. USB bulk endpoints are reliable transports (meaning data packets are CRC'd and resent if in error), so the arrival and transmission of data is very bursty, and is furthermore subject to the (from the firmware's perspective) arbitrary request schedule defined by the host. USB is optimized for throughput, not for latency.

I did a lot of work on this to try to optimize the forthcoming synchronous serial transport in the AVR firmware, and even for cases where I had artificially simplified the ISR (i.e only enabling interrupts when the main thread of execution is inside a function to eliminate the need for register-saves in the ISR), the interrupt overhead was significant.

The conclusion I came to was "only use interrupts when there are many asynchronous external stimuli, and latency is crucial". Where throughput is important, interrupt overhead starts to hurt. And USB, by design[1], is not suitable for solving problems requiring low-latency or even those requiring merely well-defined latencies.

But I wholeheartedly agree about the build environment. Preparing an FPGALink binary release already takes a lot of effort because of all the host platforms (win.x86, win.x64, lin.x86, lin.x64, lin.ppc, lin.armel, lin.armhf, osx). It would be good if the LPC firmware builds could be scripted in the same way the FX2 and AVR builds are.

Chris

[1] Except, possibly USB's isochronous endpoints which are specifically designed for more predictable latency at the expense of transport reliability. But even so the latency timeframe we're talking about is in the region of milliseconds whereas the latency timeframe of optimized firmware code is four or five orders of magnitude shorter. Also, FPGALink is unlikely to ever use isochronous endpoints.


Frank Buss

Jul 8, 2013, 5:04:34 PM7/8/13
to fpgalin...@googlegroups.com
Chris, there were too many unknown factors, so first I tried another SVF player (http://www.clifford.at/libxsvf/) to avoid the intermediate step to CSVF. With this player I created a bit-banging file, with just TDI, TMS and TCK in each byte. This is the file, if you want to use it to verify your parser: http://www.frank-buss.de/tmp/CrazyCartridge2.zip (the SVF file is the same). For the bit definitions, see prog.c:

#define PIN_JTAG_TDI (1 << 1)
#define PIN_JTAG_TCK (1 << 2)
#define PIN_JTAG_TMS (1 << 3)

Then I implemented a new function in the firmware, fastJTAG (my hope was to send the file really fast over USB). It turned out that Endpoint_Read_Stream_LE is limited to 0x200 bytes; after that it just reads green kobolds from memory instead of reading more data from the USB bus, until the next control packet. So I wrote my own function for reading USB bulk data, with direct access to the hardware registers (see readUsbPacket in prog.c; just for testing it is synchronous, with interrupts disabled to avoid problems with the interrupts used by the LPCOpen implementation). Now I can program my CPLD and it works: I routed the input clock to an output pin, instantiated a PLL for 4x the input clock and routed it to another pin, and I can measure both signals with my scope. This means we now have a known-working bit-banging stream, created from the SVF file.

Now I can test the USB side of your JTAG programming functions, to see if there are more bugs in the LPCOpen USB library or how I used it. Once I verified that the data transfer itself works, I can try to test the CPLD programming with your generated CSVF file again, to verify your parser.

The data transfer is now faster, too. The problem was too many printf outputs, which slowed it down. Now I can measure 2 ms between the packets. Maybe it can be improved to 1 ms, if I use the double-buffer mechanism of the LPC chip, but this is not supported by the LPCOpen implementation.

One question for the USB and libusb experts here: I've read that with Bulk transfers more than one packet per millisecond is possible, up to about 1 MB per second for full speed, if the bus is not busy with transfers of other devices. Do I need to do something special on the PC side with libusb to have this nice speed? Currently the data is transferred like this:

int uStatus = usbBulkWrite(
  handle->device,
  handle->progOutEP, // write to out endpoint
  sendPtr,           // write from send buffer
  chunkSize,         // write this many bytes
  5000,              // timeout in milliseconds
  error
);

Chris McClelland

Jul 8, 2013, 6:23:42 PM7/8/13
to fpgalin...@googlegroups.com
Fantastic, thanks for the raw data, it will be a useful comparison.

Regarding the host-side throughput question, I assume you're talking
about the communication with the comm_fpga_* module on the FPGA (i.e
flWriteChannel() and flReadChannel())? The variable with the biggest
impact on throughput is the size of the transfer. The code snippet you
posted uses the libusb synchronous API (i.e it blocks until it's
certain the data has been received by the device), and so sending lots
of small chunks will introduce overhead.

I am in the process of re-writing the comm_fpga API functions to use
the libusb async APIs which will improve performance for cases where
there are many small reads and writes.

Chris

Peter Stuge

Jul 8, 2013, 7:49:27 PM7/8/13
to fpgalin...@googlegroups.com
Frank Buss wrote:
> One question for the USB and libusb experts here: I've read that with Bulk
> transfers more than one packet per millisecond is possible, up to about 1
> MB per second for full speed, if the bus is not busy with transfers of
> other devices. Do I need to do something special on the PC side with libusb
> to have this nice speed? Currently the data is transferred like this:
>
> int uStatus = usbBulkWrite(
> handle->device,
> handle->progOutEP, // write to out endpoint
> sendPtr, // write from send buffer
> chunkSize, // write this many bytes
> 5000, // timeout in milliseconds
> error
> );

Yes, send *all* data in a single usbBulkWrite() call.


//Peter

Frank Buss

Jul 8, 2013, 11:45:44 PM7/8/13
to fpgalin...@googlegroups.com
I've changed it, which makes the code easier to read, because libfpgalink doesn't need to do all the chunk logic, but there is still a 2 ms delay between the packets.

But that doesn't matter for now, because I could compress the JTAG programming file from 4 MB to 42 kB. All JTAG commands from the libxsvf player first set TMS and TDI, and then clock TCK. The new format for each byte is bits 0..5 for the number of TCK pulses (stored as the count minus one, going by the playback function below), bit 6 for TMS and bit 7 for TDI (similar to Chris' jtagNotSendingNotReceiving function). The playback function on the microcontroller for one byte:

static void jtagFastByte(uint8_t b) {
  uint32_t count = (b & 63) + 2;
  setTMS(b & 64);
  setTDI(b & 128);
  while (--count) {
    setTCK(true);
    setTCK(false);
  }
}

It is interesting that it is faster to use "--count" than "count--". And GCC ignores "inline", so I inlined the TCK=1/TCK=0 sequence manually. The result is bit-banging at 2.5 MHz, with a TCK high time of 100 ns and a low time of 300 ns. The maximum speed for my Lattice part is 25 MHz; it might be too fast for other JTAG interfaces. Programming time over USB is 1.5 seconds. Now the name "fast JTAG" is right :-) Of course, this will change when I fill the CPLD with more logic, so it would still be interesting to find out how to avoid the 2 ms delays between the packets. But for now I will finish the port of the firmware first, and implement the flWriteChannel and flReadChannel functions, with the implementation on the FPGA.
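A host-side encoder for this byte format could look like the sketch below. It is an illustration under stated assumptions, not the tool actually used here: it takes one (TMS, TDI) pair per TCK pulse and run-length encodes them. Reading jtagFastByte() above, `(b & 63) + 2` combined with `while (--count)` yields `(b & 63) + 1` pulses, so the sketch stores the pulse count minus one in bits 0..5. The function and parameter names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack per-clock TMS/TDI levels into bytes: bits 0..5 = pulses - 1,
 * bit 6 = TMS, bit 7 = TDI. Returns the number of bytes written, or
 * SIZE_MAX if the output buffer is too small. Hypothetical helper. */
size_t jtagPack(const uint8_t *tms, const uint8_t *tdi, size_t numClocks,
                uint8_t *out, size_t outSize) {
	size_t written = 0;
	size_t i = 0;
	while (i < numClocks) {
		const uint8_t curTms = tms[i] ? 1 : 0;
		const uint8_t curTdi = tdi[i] ? 1 : 0;
		size_t run = 1;
		/* Extend the run while both levels stay the same, up to 64 pulses. */
		while (run < 64 && i + run < numClocks &&
		       (tms[i + run] ? 1 : 0) == curTms &&
		       (tdi[i + run] ? 1 : 0) == curTdi) {
			run++;
		}
		if (written >= outSize) {
			return SIZE_MAX;
		}
		out[written++] = (uint8_t)((run - 1) | (curTms << 6) | (curTdi << 7));
		i += run;
	}
	return written;
}
```

Long runs with constant TMS and TDI, which dominate when shifting padding or idling in a TAP state, collapse to one byte per 64 pulses, which is consistent with the large size reduction reported above.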

Chris, the new packed JTAG test data is at http://www.frank-buss.de/tmp/CrazyCartridge3.zip and I guess it is easier for you to use than the data with the explicit TCK signals.

Looks like zip packs it even more, but the limiting factor is the programming time now. It wouldn't make sense to pack it even more, except maybe if I want to store it in the microcontroller flash.

Chris McClelland

Jul 9, 2013, 3:51:23 AM7/9/13
to fpgalin...@googlegroups.com
Oh, wait a sec. If this just specifies the data to be sent TO the
FPGA, it must be ignoring the parts of the SVF file which specify
readback and verify FROM the FPGA. Actually that raises a question -
when you generated the SVF, was there an explicit option for "verify"
that you selected? Is it possible to unselect it?

Frank Buss

Jul 9, 2013, 4:45:50 AM7/9/13
to fpgalin...@googlegroups.com
You are right, I just ignored the verify part, and I don't think that I can unselect it in the Lattice Diamond IDE. Maybe I should have named it "fastAndDirtyJtag" :-)

I think a good middle way between this fast but possibly dangerous approach and your concept with many write/read round trips would be to send the whole CSVF file to the microcontroller, which then replies at the end with only "ok", or with "verify failed" and the position in the CSVF file where it failed. This would even make it possible to implement "frequency" and the microsecond parameter for "runtest" very accurately, for a fast general-purpose JTAG programmer. But it might need too much flash space on small microcontrollers.

The CSVF file you gave me is 244 kB. Even with the bad 2 ms delays between USB packets, it would be on the microcontroller in 8 seconds, which is bearable for development.
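The 8-second estimate can be reproduced with simple arithmetic. The sketch assumes 64-byte bulk packets (the full-speed maximum) and a fixed time cost per packet; the function name and parameters are illustrative.

```c
#include <stdint.h>

/* Estimate transfer time in milliseconds when each packet of packetSize
 * bytes costs msPerPacket milliseconds. Rounds the packet count up. */
uint32_t transferTimeMs(uint32_t numBytes, uint32_t packetSize,
                        uint32_t msPerPacket) {
	if (packetSize == 0) {
		return 0;  /* avoid division by zero on nonsensical input */
	}
	const uint32_t packets = (numBytes + packetSize - 1) / packetSize;
	return packets * msPerPacket;
}
```

Taking 244 kB as 244 x 1024 bytes, 64-byte packets at 2 ms each give 3904 packets and 7808 ms, in line with the figure above.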

Chris McClelland

Jul 9, 2013, 6:36:47 AM7/9/13
to fpgalin...@googlegroups.com
Your message inspired me to do some JTAG throughput investigation. The
FX2 JTAG bit-bang code for handling XSDR commands (i.e no readback -
the bulk of most SVF files) is written in 8051 assembler, and for each
bit, it does this[1]:

  rrc a          (1cyc: rotate accum right, LSB->Carry flag)
  clr _TCK       (2cyc: drive TCK low)
  mov _TDI, c    (2cyc: write C->TDI)
  setb _TCK      (2cyc: drive TCK high)

That takes a total of 7 instruction cycles. One FX2 instruction cycle
is 4 clocks. So we have 28 clocks at 48MHz, giving a theoretical
maximum throughput of 1.71Mb/s. There will obviously be some overhead
from executing this code in a loop, and more overhead from reading
64-byte chunks over USB.

We can get an idea of the actual throughput by running a real
programming operation and timing it:

http://pastebin.com/raw.php?i=9LXyVGHq

The vast majority of the bytes in a .csvf file for a Xilinx FPGA is
raw bitstream (i.e bytes to be clocked into JTAG). In the example
above, it clocks 480KB in 2.818s, giving a throughput of 170KB/s
(1.36Mb/s). There's no compression going on, so that 2.8s will remain
more or less constant irrespective of how complex the design is.
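The two figures above can be cross-checked with a few lines of C. The helper names are made up for this sketch, and the second function is meant to be fed byte counts where 1 KB is 1000 bytes, which is the convention that reproduces the 170KB/s and 1.36Mb/s figures quoted above.

```c
#include <stdint.h>

/* Theoretical bit rate in bits per second when each bit costs
 * cyclesPerBit instruction cycles of clocksPerCycle clocks each. */
uint32_t theoreticalBitRate(uint32_t clockHz, uint32_t cyclesPerBit,
                            uint32_t clocksPerCycle) {
	if (cyclesPerBit == 0 || clocksPerCycle == 0) {
		return 0;  /* avoid division by zero */
	}
	return clockHz / (cyclesPerBit * clocksPerCycle);
}

/* Measured bit rate in bits per second for numBytes sent in elapsedMs. */
uint32_t measuredBitRate(uint32_t numBytes, uint32_t elapsedMs) {
	if (elapsedMs == 0) {
		return 0;  /* avoid division by zero */
	}
	return (uint32_t)(((uint64_t)numBytes * 8u * 1000u) / elapsedMs);
}
```

Calling theoreticalBitRate(48000000, 7, 4) gives about 1.71 Mb/s and measuredBitRate(480000, 2818) about 1.36 Mb/s, so the measured run reaches roughly 80% of the theoretical inner-loop rate, with the remainder going to loop and USB overhead.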

Where the microcontroller can do raw JTAG operations significantly
faster than the FX2 firmware (e.g in your LPC case, or an AVR using
the 8MHz SPI port for JTAG), then some optimization of the host-side
code will make a big difference. In the common case (at least for
Xilinx and Altera; perhaps not so common for Lattice) where the
majority of JTAG operations are send-only, with only a small amount of
status information being read back, the current host-side send-only
code[2] is rather inefficient; it would do better using the libusb
async API, and/or being lumped into bigger chunks. For the send&verify
operations[3], the chunks can't be made bigger, but they could be made
to use the async APIs to improve the speed.

Chris

[1]https://github.com/makestuff/libfpgalink/blob/master/firmware/fx2/prog.c#L125
[2]https://github.com/makestuff/libfpgalink/blob/master/prog.c#L735
[3]https://github.com/makestuff/libfpgalink/blob/master/prog.c#L722

Chris McClelland

Jul 9, 2013, 6:48:54 AM7/9/13