slow GPIO access

jmelson

Sep 30, 2009, 5:23:15 PM
to Beagle Board
I hate to keep flogging a dead horse, but I hope somebody knows
something about this area.
I want to interface to an existing parallel device using the BB's
GPIO. Some early tests indicate the GPIO6 section is running with a 4
MHz clock, so even though the OMAP CPU is running at 600 MHz, I can't
make anything happen on the GPIO pins faster than every 250 ns. It
also appears that accessing the GPIO pins probably freezes the CPU for
250 ns or 150 clock cycles. The 250 ns is very clearly some kind of
clock, as the timing edges are ROCK SOLID on a scope. I am using mmap
to directly map the IO ports to user space.
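
The arithmetic behind those figures can be checked in a few lines of C. This is only a sketch: the function names are made up for illustration, and the 4 MHz and 600 MHz inputs are simply the values quoted above.

```c
#include <stdint.h>

/* Period of a clock in nanoseconds, e.g. 4 MHz -> 250 ns. */
static uint64_t clock_period_ns(uint64_t freq_hz)
{
    return 1000000000ULL / freq_hz;
}

/* CPU cycles that elapse while the core is stalled for stall_ns. */
static uint64_t stall_cycles(uint64_t cpu_hz, uint64_t stall_ns)
{
    return cpu_hz * stall_ns / 1000000000ULL;
}
```

With these, `clock_period_ns(4000000)` gives 250 and `stall_cycles(600000000, 250)` gives 150, which matches the 250 ns and 150 cycle figures above.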

So, is there some way to turn up this clock without fouling up the
rest of the devices that use GPIO? I only need to speed up about 13
lines connected to the expansion header. I have read the GPIO section
of sprufxx.pdf thoroughly, and don't think there is anything in there
that can be adjusted. There are debounce timers and such, but my
reading is that those relate to inputs only. If I'm wrong on that,
please correct me.

I have tried to plow through the PRCM section, but it is VERY complex
and difficult to figure out what side effects would happen if I
changed anything. I am using the USB and SD card interfaces, and
don't want to kill them.

I see that the CPU clock can be divided by 1 to 4 by a pair of chained
divide-by-1/divide-by-2 circuits, but I can't find anywhere that the
CPU clock could be divided by 150 before being fed to the GPIO, so
obviously I'm missing something. Also, the doc seems to indicate the
WHOLE GPIO system runs from one clock, and I think there MUST be stuff
like USB and SD cards that get a higher clock. That too seems to say
I've missed something.

Thanks,

Jon

Jerry Johns

Sep 30, 2009, 7:18:20 PM
to Beagle Board
What kind of parallel interface is this? Is it n-bit data + clock, or
something custom?
Any chance you can use the stock peripherals on the OMAP?

Charles Krinke

Oct 1, 2009, 10:55:12 AM
to beagl...@googlegroups.com
Dear Jon:

In thinking about your post and your need for timely toggling of hardware pins, I would say that working from user space is probably always going to be a problem.

If one wants the closest control over hardware, one needs to get down to the kernel driver level. At that point, you can tailor your software to toggle at machine language speed (to the limit of the available bandwidth, and to the extent you wish to stop all other programs while toggling pins).

Personally, if I were to tackle a project like this, I would make a small FPGA and put a few 8- or 16-bit registers in it and just read/write to the registers. Then I would let the FPGA do the toggling at bus speed. But that's just how I might tackle such a project.

I have always found that toggling GPIO ends up being a drain on system resources on any project.

Charles


From: jmelson <el...@pico-systems.com>
To: Beagle Board <beagl...@googlegroups.com>
Sent: Wednesday, September 30, 2009 2:23:15 PM
Subject: [beagleboard] slow GPIO access

jmelson

Oct 1, 2009, 1:00:13 PM
to Beagle Board


On Oct 1, 9:55 am, Charles Krinke <c...@pacbell.net> wrote:
> Dear Jon:
>
> In thinking about your post and your need for timely toggling of hardware pins, I would say that working from user space is probably always going to be a problem.
>
These are just tests, although there will always be a user-mode
diagnostic program.
The real driver is to be a kernel module running as a real-time
process, under the RTAI scheduler.

> If one wants the closest control over hardware, one needs to get down to the kernel driver level. At that point, you can tailor your software to toggle at machine language speed (to the limit of the available bandwidth, and to the extent you wish to stop all other programs while toggling pins).
>
Well, that is the problem, this is NOWHERE NEAR "machine language"
speeds, about 150 X slower.

> Personally, if I were to tackle a project like this, I would make a small FPGA and put a few 8 or 16 bit registers in it and just read/write to the registers. Then I would let the FPGA do the toggling at bus speed. But thats just how I might tackle such a project.
>
I am going to attach an FPGA to the OMAP. That is a mature product,
and uses the EPP mode of a PC parallel port for communication. Since
the EPP handshaking is done by hardware in the PC, it is fairly fast.
I can do a byte every 600 - 800 ns depending on whether it is a
motherboard or PCI port card. I need to do at least as well as this,
but it seems the OMAP should be able to go quite a bit faster. To
control a CNC machine, the CPU reads position 1000 times a second and
sends new velocity commands to the output section. The communication
overhead needs to be kept to a reasonable minimum.
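
To put rough numbers on that overhead, here is a small budget calculation. It is a sketch only: the helper name is hypothetical, and the 24-byte transfer size used in the note below is an assumed example rather than a measured figure.

```c
#include <stdint.h>

/* Share of each second spent on bus traffic, in parts per million.
 *   bytes:       bytes moved per servo cycle (assumed example value)
 *   ns_per_byte: time per byte, e.g. 600-800 ns for the PC EPP port
 *   loop_hz:     servo update rate, e.g. 1000 per second
 */
static uint64_t bus_load_ppm(uint64_t bytes, uint64_t ns_per_byte,
                             uint64_t loop_hz)
{
    uint64_t busy_ns_per_second = bytes * ns_per_byte * loop_hz;
    return busy_ns_per_second / 1000;  /* 1e9 ns per second -> ppm */
}
```

For instance, `bus_load_ppm(24, 800, 1000)` returns 19200, i.e. a little under 2% of every 1 ms servo period would go to moving bytes at the slower PC parallel port rate.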

Thanks,

Jon

jmelson

Oct 1, 2009, 1:05:16 PM
to Beagle Board
I want to stay with the Beagle Board, but its expansion header only
brings out a limited number of balls from the OMAP. One annoying thing
is that it does not bring out 8 contiguous bits that are byte-aligned.
There is a non-aligned run of 8 contiguous bits, though, so I can
shift my 8-bit byte over to match that group.
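
The shifting itself might look like the sketch below. The bit position is an ASSUMPTION picked purely for illustration (the real offset depends on which expansion header pins are used), and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION: the 8 contiguous data pins start at this bit of the bank's
 * 32-bit register. Replace with the real offset for the pins in use. */
#define DATA_SHIFT 2u
#define DATA_MASK  (0xFFu << DATA_SHIFT)

/* Place a byte into the data field, leaving the other 24 bits untouched. */
static uint32_t pack_byte(uint32_t reg, uint8_t value)
{
    return (reg & ~DATA_MASK) | ((uint32_t)value << DATA_SHIFT);
}

/* Recover the byte from a value read back from the input register. */
static uint8_t unpack_byte(uint32_t reg)
{
    return (uint8_t)((reg & DATA_MASK) >> DATA_SHIFT);
}
```

Masking before OR-ing keeps the strobe and direction bits that share the register from being disturbed when only the data byte changes.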

There is a data strobe and an address strobe, plus a write/read bit,
and an acknowledge. Read up on the IEEE-1284 (EPP mode) protocol to
see what it looks like.
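
For readers who don't want to dig out the spec, a data-write cycle can be sketched roughly as below. This is a simplified software model, not a driver: the struct, the function names and the `device_step` stand-in for the attached device are all made up for illustration, and the signal sequence follows the usual description of an EPP data write, so it should be checked against IEEE-1284 before relying on it.

```c
#include <stdint.h>

/* Signal levels of a simulated EPP link. The *_n signals are active low. */
struct epp_sim {
    int write_n;      /* host: 0 = write cycle, 1 = read cycle */
    int dstrobe_n;    /* host: data strobe */
    int wait_n;       /* device: acknowledge */
    uint8_t data;     /* host-driven data lines */
    uint8_t latched;  /* last byte the device accepted */
};

/* Stand-in for the attached device (hypothetical model). */
static void device_step(struct epp_sim *s)
{
    if (!s->dstrobe_n && !s->wait_n) {
        if (!s->write_n)
            s->latched = s->data;   /* accept the byte */
        s->wait_n = 1;              /* acknowledge the strobe */
    } else if (s->dstrobe_n && s->wait_n) {
        s->wait_n = 0;              /* ready for the next cycle */
    }
}

/* One data-write cycle. Returns 0 on success, -1 on timeout. */
static int epp_write_data(struct epp_sim *s, uint8_t value, int max_polls)
{
    int i;

    s->write_n = 0;                     /* select a write cycle */
    s->data = value;                    /* drive the data lines */
    s->dstrobe_n = 0;                   /* assert the data strobe */
    for (i = 0; i < max_polls; i++) {   /* wait for the acknowledge */
        device_step(s);
        if (s->wait_n)
            break;
    }
    if (i == max_polls)
        return -1;

    s->dstrobe_n = 1;                   /* release the strobe */
    for (i = 0; i < max_polls; i++) {   /* wait for acknowledge to drop */
        device_step(s);
        if (!s->wait_n)
            break;
    }
    s->write_n = 1;
    return (i == max_polls) ? -1 : 0;
}
```

On real hardware each assignment to a host signal would become a GPIO register write and each poll of `wait_n` a GPIO read, so the number of steps in this sequence is what sets the per-byte cost.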

I will need to provide a voltage level translator, plus a direction
control signal for it to turn the bus around.

It SHOULD be pretty simple, but I'd like to get a bit more performance
out of it if that is possible.

Thanks,

Jon

Keith Williams

Oct 1, 2009, 2:26:29 PM
to beagl...@googlegroups.com

Since what you really seem to want is an IEEE-1284 port, have you looked
into using a USB-Parallel adapter?

jmelson

Oct 1, 2009, 5:09:45 PM
to Beagle Board


On Oct 1, 1:26 pm, Keith Williams <esp...@linuxinstruments.com> wrote:
> Since what you really seem to want is a IEEE-1284 port have you looked
> into using a USB-Parallel adapter?
USB does not guarantee real-time delivery of packets. Oh, if it were
only that simple! We need to have a real-time module make the device
sample the current encoder position, and that is a pretty tight
real-time constraint. Once sampled, we then have to read a bunch of
bytes, process them and send back a bunch of velocity command bytes,
all within a few hundred us for sure, and we would prefer to do it
even faster, if at all possible. You really can't do this
back-and-forth with USB very easily. Making the program dispatch when
the hard-real-time external hardware clock ticks would be even better,
but at the moment the program is not set up to do that; the program
wants to be the master.

Jon

Keith Williams

Oct 1, 2009, 5:46:15 PM
to beagl...@googlegroups.com

I thought that you might say that. So, here is my other thought.....

There are three other fairly inexpensive OMAP3 platforms: the Overo, one by
Cogent Computing, and the Embest board.

All of those offer different IO interfaces and some even allow for
direct Address/Data bus connection. If shoe-horning the Beagle into
your application is too much of a headache, but you still want to use
the 3530, then maybe one of those would be a better fit?

jmelson

Oct 1, 2009, 6:14:00 PM
to Beagle Board


On Oct 1, 4:46 pm, Keith Williams <esp...@linuxinstruments.com> wrote:
> I thought that you might say that.  So, here is my other thought.....
>
> There are three other fairly inexpensive OMAP3 platforms Overo, one by
> Cogent Computing, and the Embest board.
>
> All of those offer different IO interfaces and some even allow for
> direct Address/Data bus connection.  If shoe-horning the Beagle into
> your application is too much of a headache, but you still want to use
> the 3530, then maybe one of those would be a better fit?
>
Well, I already have the Beagle, and it works as far as I have gone
with it. All I want to do is turn up the clock that seems to be
limiting GPIO6 by a modest factor. I am going to write a program to
read out a bunch of the clock selection registers to find out what the
default settings are; some of them look like they might be responsible
for this slow clock. I think I have figured out how to make the
Beagle's expansion header work for this application, and it will only
take a couple lines of code to deal with the non-aligned byte.

I have seen the other OMAP boards, and I don't think they really offer
any great advantage over the Beagle. I have this thing running Debian,
and I am totally blown away by the possibilities! I also have some
other less time-critical applications in mind for it that also require
the parallel port emulation; they should be able to use the same
voltage translator board.

Thanks,

Jon

jlee...@gmail.com

Oct 1, 2009, 5:34:40 PM
to Beagle Board
Hi Jon,
I actually have a similar issue where I need to place a byte
(hopefully two) on the GPIO with as fast a signaling rate as
possible. I actually got my board (Rev C3) just yesterday, so I am
still in the process of getting everything set up for development.
Would you mind sending me a few commented code snippets on how you're
configuring and using the GPIO pins? The sooner I get up and running,
the sooner I can help make progress with this issue.
--Jacob

P.S. I was going over the TRM and it looks like the GPIO gets its
clock speed from the L4_IFACE clock, but I can't find where that speed
is set, or how GPIO speed is controlled.

jmelson

Oct 2, 2009, 12:09:03 AM
to Beagle Board


On Oct 1, 4:34 pm, "jleem...@gmail.com" <jleem...@gmail.com> wrote:
> Would you mind sending me a few commented code snippets on how your
> configuring and using the GPIO pins?
>
> P.S. it was going over the TRM and it looks like the GPIO gets its
> clock speed from L4_IFACE clock, but i can't find where that speed is
> set. or how GPIO speed is controlled.
Yes, that seems right. There are a bunch of registers in the PRCM
section that generate these clocks. The worrisome part is that there
is only ONE clock for the whole GPIO system, so changing it might have
wide-ranging side effects on the USB and SD card interfaces.

I have found a divider in the GPIO_CTRL register, but have not found
that it has any effect on output speed.

Here's a short program that toggles pin 24 of the expansion header:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>

int main(void) {
    uint32_t x;
    int fd = open("/dev/mem", O_RDWR | O_SYNC);

    if (fd < 0) {
        printf("Could not open memory\n");
        return 1;
    }

    // Pad configuration registers
    volatile uint32_t *pinconf;
    pinconf = (volatile uint32_t *) mmap(NULL, 0x10000, PROT_READ | PROT_WRITE,
                                         MAP_SHARED, fd, 0x48000000);
    if (pinconf == MAP_FAILED) {
        printf("Pinconf Mapping failed\n");
        close(fd);
        return 1;
    }

    // Configure the expansion header pad for GPIO 168.
    x = pinconf[0x21bc/4];
    printf("pinconf[0x21bc/4] = %x\n", (unsigned) x);
    x = x & 0x0000ffff; // mask off high bits for GPIO 168
    x = x | 0x011C0000; // set pulltype, bi-dir and mux mode 4
    pinconf[0x21bc/4] = x;

    // GPIO registers; the GPIO6 bank sits at offset 0x8000 in this mapping
    volatile uint32_t *gpio;
    gpio = (volatile uint32_t *) mmap(NULL, 0x10000, PROT_READ | PROT_WRITE,
                                      MAP_SHARED, fd, 0x49050000);
    if (gpio == MAP_FAILED) {
        printf("Gpio Mapping failed\n");
        close(fd);
        return 1;
    }

    // Configure bit 8 of GPIO bank 6 (GPIO_168) as output.
    x = gpio[0x8034/4];
    printf("gpio[0x8034/4] = %x\n", (unsigned) x);
    x = x & 0xfffffeff; // change bit 8 to zero (0 = output), keep the rest
    gpio[0x8034/4] = x;

    // Toggles GPIO_168 (expansion connector pin 24)
    int dataout = 0x803c/4; // index of gpio data out reg
                            // (0x8090/4 is the clear gpio reg, unused here)

    for (;;) {
        gpio[dataout] = 0x00000100; // set bit 8
        gpio[dataout] = 0x00000000; // clear bit 8
        gpio[dataout] = 0x00000100; // set bit 8
        gpio[dataout] = 0x00000000; // clear bit 8
        gpio[dataout] = 0x00000100; // set bit 8
        gpio[dataout] = 0x00000000; // clear bit 8
        gpio[dataout] = 0x00000100; // set bit 8
        gpio[dataout] = 0x00000000; // clear bit 8
    }
}
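
Writing the whole data-out register also rewrites every other pin in the bank. A variant that only touches selected pins could use the separate set and clear registers, sketched below. The 0x90 offset matches the 0x8090 clear register address noted in the program above; the 0x94 set offset is recalled from the TRM and is an assumption to verify, and the function name is hypothetical.

```c
#include <stdint.h>

/* Byte offsets within one GPIO bank. SETDATAOUT at 0x94 is an ASSUMPTION
 * to be checked against the TRM register table. */
#define GPIO_CLEARDATAOUT 0x90u
#define GPIO_SETDATAOUT   0x94u

/* Drive the pins selected by mask to the levels given in value, without
 * touching any other pin. bank points at the bank's first register. */
static void gpio_write_masked(volatile uint32_t *bank, uint32_t mask,
                              uint32_t value)
{
    bank[GPIO_CLEARDATAOUT / 4] = mask & ~value;  /* pins that go low  */
    bank[GPIO_SETDATAOUT / 4]   = mask & value;   /* pins that go high */
}
```

With the mapping used in the program, the GPIO6 bank pointer would be `gpio + 0x8000/4`. Note that this takes two register writes per update, so it trades speed for not disturbing neighbouring pins.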

Jerry Johns

Oct 2, 2009, 12:15:53 AM
to Beagle Board
Based on a cursory glance at the EPP spec, it seems you can use the
GPMC peripheral for this: 8-bit asynchronous data bus, with addr/data
strobe, write enable, ....
Since the GPMC also lets you add non-deterministic waits using the
WAIT pin, this fits right down the EPP alley.
And you get completely controlled behaviour from this peripheral, with
microsecond accuracy in terms of transaction timing.

Jerry

jmelson

Oct 2, 2009, 11:40:26 AM
to Beagle Board
Is the GPMC brought out to any pins I can get to on the Beagle Board?
If not, then it doesn't do me much good. "Microsecond accuracy"
doesn't sound very good; I've got 4 times better than that already
with the GPIO pins that are available. Looking in the docs, GPMC is
the General Purpose Memory Controller. It seems like it might be a bit
more specific than I was looking for, with a lot of fixed protocol
built into the logic. Anyway, if I can't get to the OMAP balls from
the Beagle Board, it is not much use.

Thanks,

Jon

Gerald Coley

Oct 2, 2009, 12:14:39 PM
to beagl...@googlegroups.com
There are GPIO pins brought out onto the expansion header of the Beagle. You can find the information in the System Reference Manual http://beagleboard.org/hardware/design .
 
Gerald

Jerry Johns

Oct 2, 2009, 12:45:49 PM
to Beagle Board
When I said microsecond accuracy, I meant the accuracy with which you
can trigger a transaction. The timing of the actual transaction
waveforms can be adjusted to within 5 ns, more than sufficient for a
parallel port. Truly, this is how you should be doing it instead of
loading down the ARM with GPIO toggling, which is not at all
efficient. And the GPMC peripheral is actually much more customizable
than you think: since any and all timing parameters are configurable
with 5 ns clock period accuracy, you can choose to match the EPP
standard quite closely. The only thing that might be tricky is the
distinction between the address and data phases, both of which might
be coupled together in one transaction in the GPMC.

However, it looks like the pins are not brought out on the Beagle, in
which case.....unless you're willing to go to the OMAP3EVM...

The PRCM clock module changes would be your best bet if you still
stick with GPIOs - see clock34xx.h in arch/arm/mach-omap2/ for
tweaking details. You're looking to change the EMU_PER_ALWON_CLK
output speed coming from dpll4m6.

Jerry

jmelson

Oct 3, 2009, 12:05:15 AM
to Beagle Board


On Oct 2, 11:14 am, Gerald Coley <ger...@beagleboard.org> wrote:
> There are GPIO pins brought out onto the expansion header of the Beagle. You
> can find the information in the System Reference Manual http://beagleboard.org/hardware/design .

Yes, of course, a small selection of GPIO6 pins, roughly from GPIO135
to GPIO161, with a number missing in the middle of this range, are
brought out. I don't believe any of these can be MUX'ed to the GPMC
section.

jmelson

Oct 3, 2009, 12:10:42 AM
to Beagle Board


On Oct 2, 11:45 am, Jerry Johns <jerry.jo...@gmail.com> wrote:
> When i meant microsecond accuracy, i meant the accuracy with which you
> can trigger a transaction. The timing of the actual transactional
> waveforms can be adjusted to within 5ns of accuracy, more than
> sufficient for a parallel port. Truly, this is how you should be doing
> it instead of loading down the ARM for GPIO toggling, which it not at
> all efficient. And the GPMC peripheral is actually much more
> customizable than you think - since any and all timing parameters are
> configurable with 5ns clock period accuracy, you can choose to match
> the EPP standard quite closely. The only thing that might be tricky is
> the distinction between the address and data phases, both of which
> might be coupled together in one transaction in the GPMC.
>
Well, this is interesting, but what I am doing is not long block
transfers, but blocks of 12 - 24 bytes at a time. Setting up a DMA
transfer for this is not efficient. There are a lot of single-byte
reads and writes, also.

> However, it looks like the pins are not brought out on the Beagle, in
> which case.....unless you're willing to go to the OMAP3EVM...

Yes, that makes me want to stay with the Beagle for the moment.

> The PRCM clock module changes would be your best bet if you still
> stick with GPIOs - see clock34xx.h in arch/arm/mach-omap2/ for
> tweaking details. You're looking to change the EMU_PER_ALWON_CLK
> output speed coming from dpll4m6.
GREAT, thanks for the pointer! I will look this up, it may be exactly
the info I needed. I never thought it would be the EMU clock I needed
to work with. Sheesh, the GPIO is sure complicated!

Thanks again,

Jon

jmelson

Oct 3, 2009, 2:03:11 AM
to Beagle Board


On Oct 2, 11:45 am, Jerry Johns <jerry.jo...@gmail.com> wrote:

> The PRCM clock module changes would be your best bet if you still
> stick with GPIOs - see clock34xx.h in arch/arm/mach-omap2/ for
> tweaking details. You're looking to change the EMU_PER_ALWON_CLK
> output speed coming from dpll4m6.
Looking in register CM_IDLEST_CKGEN (4800 4D20), bit 13 is zero, which
indicates that clock EMU_PER_ALWON_CLK is not active. Wish I'd checked
that first; I tried all sorts of things with DPLL4 that had no effect,
and now I know why. For instance, I changed the DIV_DPLL4 field of
CM_CLKSEL1_EMU (4800 5140) from 3 to 2, 1 and 16 with no effect. So,
the GPIO system must be clocked off something else, but I haven't been
able to track down where the clock actually comes from. I am pretty
sure I changed the multiplier setting of DPLL4, so it must be coming
from a different DPLL.

clock34xx.h and clock34xx.c are not small files, it will take a while
to dig through them. Poking the registers directly is certainly more
dangerous, but I can also read them first to see what they are set to.
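
A read-only peek at such a register might be sketched as follows. It assumes root access and a kernel that allows `/dev/mem` mappings of that region; the helper names are hypothetical, and only the register addresses come from the message above.

```c
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Split a physical address into a page-aligned base for mmap() and the
 * byte offset of the register within that page. */
static void split_phys(uint32_t phys, uint32_t page_size,
                       uint32_t *base, uint32_t *offset)
{
    *base = phys & ~(page_size - 1);
    *offset = phys & (page_size - 1);
}

/* Read one 32-bit register through /dev/mem. Returns 0 on success. */
static int read_phys_reg(uint32_t phys, uint32_t *value)
{
    uint32_t base, offset;
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("/dev/mem", O_RDONLY | O_SYNC);

    if (fd < 0)
        return -1;
    split_phys(phys, (uint32_t)page, &base, &offset);
    void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd,
                     (off_t)base);
    close(fd);                     /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    *value = *(volatile uint32_t *)((volatile uint8_t *)map + offset);
    munmap(map, (size_t)page);
    return 0;
}

/* Nonzero if the given bit is set, e.g. bit 13 of CM_IDLEST_CKGEN. */
static int bit_is_set(uint32_t value, unsigned bit)
{
    return (int)((value >> bit) & 1u);
}
```

Calling `read_phys_reg(0x48004D20, &v)` followed by `bit_is_set(v, 13)` would reproduce the check described above without writing anything to the chip.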

And, maybe what is throttling the GPIO pins has nothing to do with the
peripheral clock, but is some other part of the chip that is delaying
things.

Thanks,

Jon

Jacob Leemaster

Oct 7, 2009, 7:44:46 PM
to beagl...@googlegroups.com
One thing that might be worth trying (I haven't had a chance to test this, still getting my stuff set up amid midterms) is speeding up the CPU clock. I think I read somewhere on the Angstrom wiki that the BeagleBoard CPU defaults to 500 MHz operation and a certain flag needs to be set in u-boot (before the kernel boots) to switch to 600 MHz operation. I'll see if I can dig up the original doc.
--Jacob

jmelson

Oct 8, 2009, 12:19:51 AM
to Beagle Board


On Oct 7, 6:44 pm, Jacob Leemaster <jleem...@gmail.com> wrote:
> one thing that might be worth trying (I haven't had a chance to test this,
> still getting my stuff set up amid midterms) is speeding up the CPU clock.
> i think i read somewhere on the angstrom wiki that the beaglboard cpu
> defaults to 500MHz operation and a certain flag need to be set in u-boot
> (before the kernel boots) to switch to 600MHz operation.  I'll see if i can
> dig up the original doc
Going from 500 to 600 MHz is only a 20% speed increase, hardly worth
the effort. I was hoping to find a way to get a 2X up to 5X speedup of
the GPIO pins. It just doesn't make sense for such a fast CPU to be
saddled with such slow I/O. It may have been set this way for
battery-operated systems, but my applications are line powered for the
most part.

I have sent two messages to TI technical support, but haven't heard
anything back from them.

Jon

Vladimir Pantelic

Oct 8, 2009, 3:02:51 AM
to beagl...@googlegroups.com
jmelson wrote:
>
>
>
> On Oct 2, 11:45 am, Jerry Johns<jerry.jo...@gmail.com> wrote:
>
>> The PRCM clock module changes would be your best bet if you still
>> stick with GPIOs - see clock34xx.h in arch/arm/mach-omap2/ for
>> tweaking details. You're looking to change the EMU_PER_ALWON_CLK
>> output speed coming from dpll4m6.

according to the TRM, GPIO banks 2-6 are driven from the PER_L4_ICLK
which is L4_ICLK. so no surprise EMU_PER_ALWON_CLK is not used...

> clock34xx.h and clock34xx.c are not small files, it will take a while
> to dig through them.

may I suggest looking into the TRM 1st, which is also no small file,
but might be a better starting point.

John (USP)

Oct 8, 2009, 12:34:16 PM
to beagl...@googlegroups.com
Have you tried changing the gating ratio (TRM P3428)? The default is set to
0x01, which means the GPIO functional clock is the interface clock divided by 2.
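
Decoding that field from a raw register value might look like the sketch below. The bit layout is an ASSUMPTION recalled from the TRM (bit 0 as module disable, bits 2:1 as the gating ratio) and the 1/2/4/8 encoding is likewise assumed; this thread only confirms that a field value of 1 means divide by 2. Check the cited TRM page before using it.

```c
#include <stdint.h>

/* ASSUMED layout of GPIO_CTRL: bit 0 = DISABLEMODULE,
 * bits 2:1 = GATINGRATIO. Verify against the TRM. */
#define GPIO_CTRL_DISABLEMODULE   (1u << 0)
#define GPIO_CTRL_GATINGRATIO(x)  (((x) >> 1) & 0x3u)

/* Divider applied to the interface clock. Assumed encoding:
 * field values 0..3 select divide by 1, 2, 4, 8. */
static unsigned gating_divider(uint32_t gpio_ctrl)
{
    return 1u << GPIO_CTRL_GATINGRATIO(gpio_ctrl);
}
```

Under these assumptions a register value of 0 means the functional clock runs undivided, and a field value of 1 (register value 0x2) gives the divide-by-2 case mentioned above.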

John (USP)

Oct 8, 2009, 12:56:43 PM
to beagl...@googlegroups.com
Never mind, in gpio.c line 1435, GPIO_CTRL is set to 0.

Jacob Leemaster

Oct 8, 2009, 6:43:24 PM
to beagl...@googlegroups.com
I noticed someone was playing with DPLL4 to speed up L4_ICLK and therefore the GPIO.
After going over the TRM for the 3530, it looks like L4_ICLK comes from DPLL3, not DPLL4 (pg 185 of the 3530 TRM), unless I'm misunderstanding something. Also, I found the section of the TRM that discusses how to reconfigure the L4 clock (section 1.7.8.2 of the PRCM chapter, page 231 of the OMAP 353x TRM). I haven't had a chance to dig into how to utilize this, but I think it might help.
Let me know what you guys figure out and I'll do the same.
Also, has anyone checked that the GPIO clock is free-running and not gated? I think it's set in GPIO_SYSCONFIG bit 0 (AUTO_IDLE).
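
Checking and clearing that bit could be sketched as below. The bit position is the one suggested above; the 0x10 register offset within the bank is an ASSUMPTION to be confirmed in the TRM, and the function names are hypothetical.

```c
#include <stdint.h>

/* Byte offset of GPIO_SYSCONFIG within a bank (ASSUMED; check the TRM).
 * Bit 0 is the auto-idle bit, as suggested above. */
#define GPIO_SYSCONFIG      0x10u
#define SYSCONFIG_AUTOIDLE  (1u << 0)

/* Report whether automatic clock gating is enabled for a bank. */
static int gpio_autoidle_enabled(const volatile uint32_t *bank)
{
    return (bank[GPIO_SYSCONFIG / 4] & SYSCONFIG_AUTOIDLE) != 0;
}

/* Keep the interface clock free-running by clearing the auto-idle bit,
 * leaving the other SYSCONFIG fields as they were. */
static void gpio_disable_autoidle(volatile uint32_t *bank)
{
    bank[GPIO_SYSCONFIG / 4] &= ~SYSCONFIG_AUTOIDLE;
}
```

Here `bank` is a pointer to the first register of the mapped GPIO bank, for example the GPIO6 block obtained through an mmap of `/dev/mem`.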

Hope this helps
--Jacob

jmelson

Oct 9, 2009, 12:43:52 PM
to Beagle Board


On Oct 8, 2:02 am, Vladimir Pantelic <p...@nt.tu-darmstadt.de> wrote:
>
> according to the TRM, GPIO banks 2-6 are driven from the PER_L4_ICLK
> which is L4_ICLK. so no surprise EMU_PER_ALWON_CLK is not used...
>
> > clock34xx.h and clock34xx.c are not small files, it will take a while
> > to dig through them.
>
> may I suggest looking into the TRM 1st, which is also no small file,
> but might be a better starting point.
Right, and my attempt to adjust the L4_ICLK crashed the system, which
wasn't much of a surprise. It seems the L4_CLK which creates
PER_L4_ICLK runs a LOT of different parts of the chip, so changing it
may affect some timing, like for the SD card or USB, that can't be
changed.

jmelson

Oct 9, 2009, 12:47:31 PM
to Beagle Board


On Oct 8, 11:34 am, "John \(USP\)" <jsyne...@us-power.com> wrote:

> Have you tried changing the gating ratio (TRM P3428). The default is set to
> 0x01 which means the GPIO functional clock is interface clock divided by 2.

It is set to a ratio of 1 on my system when Debian is running, but
changing it had no effect. I thought that was curious, but I can't
really determine what it does, even though I have read the GPIO
section several times. It may also be that the bottleneck is somewhere
else, in the L3 interconnect, L4 interconnect, or IO firewall. Geez,
why do they need a FIREWALL in there, isn't this all supposed to be
trusted software?

Jon

jmelson

Oct 9, 2009, 12:53:35 PM
to Beagle Board


On Oct 8, 5:43 pm, Jacob Leemaster <jleem...@gmail.com> wrote:
> I noticed someone was playing with DPLL4 to speed up L4_ICLK and therefore,
> the GPIO.
> After going over the TRM for the 3530, it looks like L4_ICLK come from
> DPLL3, not DPLL4 (pg 185 of the 3530 TRM) unless i'm misunderstanding
> something.
That was my understanding, too.

> Also, I found the section of the TRM that discusses how to
> reconfigure the L4 clock (section 1.7.8.2 of the PRCM chapter, page 231 of
> the OMAP 353x TRM).  I haven't had a chance to dig into how to utilize this
> but i think it might help.
> let me know what you guys figure out and i'll do the same
> Also, has anyone checked that the GPIO clock is free_running and not gated,
> i think it's set in GPIO_SYSCONFIG bit 0 (AUTO_IDLE)
All of this is quite confusing. So, if GPIO6 was in AUTO_IDLE, it
would power down after every access, and then take 250 ns to power
back up? That could be it, I will have to see what mode it is in on
my system. (I know I have looked at it, but didn't write it down.)

Thanks,

Jon

Jacob Leemaster

Oct 9, 2009, 8:50:29 PM
to beagl...@googlegroups.com
Looks like my AUTOIDLE idea didn't pan out.  I tried turning off AUTOIDLE for GPIO6 but it had no effect
--Jacob

jmelson

Oct 10, 2009, 2:13:46 PM
to Beagle Board


On Oct 9, 7:50 pm, Jacob Leemaster <jleem...@gmail.com> wrote:
> Looks like my AUTOIDLE idea didn't pan out.  I tried turning off AUTOIDLE
> for GPIO6 but it had no effect
So, you are seeing the same 250 ns limit on the GPIO? What program
are you using? (I think I included my test program in an earlier
message in this thread.) I really hope the 250 ns speed is not a
fundamental limitation of the L3 and L4 interconnect hardware. It
seems like maybe the system could run at that speed, with the
intelligent peripherals making most things work acceptably, but it is
hell for embedded sorts of jobs.

I was wondering if it is the particular kernel I was using, so what
kernel are you testing with? I also have Angstrom here, but couldn't
get the mmap function to compile on that; it probably needs some
additional set of include files to compile.

Thanks,

Jon

jmelson

Oct 10, 2009, 2:15:41 PM
to Beagle Board
The GPMC turns out to be used to communicate with the NAND flash
memory chip that is piggybacked on top of the CPU, so it is not
available for other things.

Jon

Jacob Leemaster

Oct 12, 2009, 2:51:21 PM
to beagl...@googlegroups.com
Oh, and I forgot to mention, I'm using the CodeSourcery toolchain with the pre-built rootfs from Koen's blog, not the OpenEmbedded build system.
--J


On Mon, Oct 12, 2009 at 2:49 PM, Jacob Leemaster <jlee...@gmail.com> wrote:
Yeah, I'm seeing the same 250 ns switching rate, with a lot of ringing as well (that's probably just from the cable), and I'm using the example code that you sent me earlier. (I still haven't found a good reference for mapping the GPIO pin numbers to the bits in the GPIO6 register; I can't find it in the TRM.)
I'm using the latest kernel from Koen's blog (kernel version 2.6.29-r44).

I think the next step for me is to go over the gpio.txt and gpio.h files in the kernel tree and see if they reveal anything. As a last resort, I think I'm going to just load the entire distro into RAM (yay ramdisk) so no other system IO is needed, and then mess around with DPLL3, 4, etc.

Let me know if you figure anything out
--Jacob

Koen Kooi

Oct 12, 2009, 8:04:12 PM
to beagl...@googlegroups.com
On Mon, Oct 12, 2009 at 8:51 PM, Jacob Leemaster <jlee...@gmail.com> wrote:
oh, and i forgot to mention, i'm using the code-sorcery toolchain with pre-built rootfs from koen's blog, not the openembedded build system

As I said earlier today:

I feel compelled to repeat that using CodeSourcery when targeting Angstrom is a *bad* idea. Your development toolchain should match the one the system was built with, as should all the C and LD FLAGS. Unless you know what you're doing (hi mru!), you shouldn't be mixing toolchains.

regards,

Koen 

jmelson

Oct 13, 2009, 1:18:04 PM
to Beagle Board


On Oct 12, 1:51 pm, Jacob Leemaster <jleem...@gmail.com> wrote:
> oh, and i forgot to mention, i'm using the code-sorcery toolchain with
> pre-built rootfs from koen's blog, not the openembedded build system
> --J
>
> On Mon, Oct 12, 2009 at 2:49 PM, Jacob Leemaster <jleem...@gmail.com> wrote:
> > Yeah, I'm seeing the same 250ns switching rate, with a lot of ringing as
> > well (thats probably just from the cable) and i'm using the example code
> > that you sent me earlier
OK, glad to hear this is a reliable measurement, not something due to
bad system configuration on just my board. I have a scope probe poked
into the expansion header holes and the signals look fine that way.
> > (i still haven't found a good reference for mapping
> > the GPIO pin numbers to the bits in the GPIO6 register, can't find it in the
> > TRM)
Yeah, the TRM is a totally INSANE piece of work! 3700 pages! Nobody
can possibly wade through all of that.

The pin mapping is in 3 places. There is a GPIO chapter, I think the
next to last one in the TRM. Page 30 of that chapter (section 1.6.1)
lists all the register addresses. GPIO1 is pins 0-31, GPIO2 is 32-63,
etc. So, GPIO6 is pins 160 - 191, and the register addresses are in
the block 0x4905 8xxx. If you write to bit zero of 0x4905 803C (the
dataout register for GPIO6) it will come out on GPIO 160, assuming you
have that pad set to the raw GPIO rather than some other I/O module.
So, there are also the pad configuration registers, which are in a
different chapter, the System Control Module. You have to set the pad
multiplexer to select each I/O ball to one of several possible modules
for that specific ball.
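
The numbering rule described above (32 pins per bank, banks counted from 1) can be written down as a small lookup. This is a sketch with hypothetical names. Only the GPIO6 base address appears in this thread; the other five base addresses are recalled from the TRM and are assumptions to verify there.

```c
#include <stdint.h>

/* Base addresses of GPIO banks 1..6. Only the last entry (0x4905 8000)
 * is confirmed in this thread; the rest are ASSUMED from the TRM. */
static const uint32_t gpio_bank_base[6] = {
    0x48310000u, 0x49050000u, 0x49052000u,
    0x49054000u, 0x49056000u, 0x49058000u
};

#define GPIO_DATAOUT 0x3Cu  /* dataout register offset, as in 0x4905 803C */

struct gpio_loc {
    unsigned bank;     /* 1..6 */
    unsigned bit;      /* 0..31 within the bank's registers */
    uint32_t dataout;  /* physical address of the bank's dataout register */
};

/* Map a GPIO number (0..191) to its bank, bit and dataout address.
 * Returns 0 on success, -1 if the number is out of range. */
static int gpio_locate(unsigned gpio, struct gpio_loc *loc)
{
    if (gpio > 191)
        return -1;
    loc->bank = gpio / 32 + 1;
    loc->bit = gpio % 32;
    loc->dataout = gpio_bank_base[gpio / 32] + GPIO_DATAOUT;
    return 0;
}
```

For GPIO 168 this yields bank 6, bit 8 and address 0x4905803C, which agrees with the 0x00000100 mask used in the test program earlier in the thread.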

And, finally, the beagle board's own manual has two pages that are
somewhat confusing. Table 20 and table 30 both show the mapping from
specific I/O functions to expansion header pins. I think table 20 is
the one that has been updated for the Rev C Beagle.

> > I'm using the latest kernel from koen's blog (kernel version 2.6.29-r44)
>
> > I think the next step for me is to go over the gpio.txt and gpio.h files in
> > the kernel tree and see if they reveal anything.  last resort i think is
> > going to be to just load the entire disto into RAM (yay ramdisk) so no other
> > system IO is needed and then mess around with DPLL3, 4, etc
Well, that is an interesting thought. People have revved up the CPU
clock, I wonder if that would show a speed-up in the GPIO as well. I
think these speed-up methods have been published.

Jon

Søren Steen Christensen

Nov 4, 2009, 10:27:00 AM
to beagl...@googlegroups.com
Hi Jon,

Sorry for only now joining this thread, but better late than never
:-) - I have been away from my BeagleBoard-list-account for about 6 weeks and
I'm now slowly catching up on the previous ~1,500 emails :-)

> Yeah, the TRM is a totally INSANE piece of work! 3700 pages! Nobody
> can possibly wade through all of that.

You are right - It's insane - Nevertheless I have been through most parts
of it at least twice (some of the chapters many more times) - But I have also
spent the last ~7 years doing this - Starting with OMAP1 a long time ago
:-)

> The GPMC turns out to be used to communicate with the NAND flash memory
> chip that is piggybacked on top of the CPU, so it is not available for
> other things.

Even though the GPMC is connected to the NAND, it can be used to communicate
with other devices as well, since it has several Chip Selects, which can
be mapped to different memory regions and configured individually. But you
are right: You won't have exclusive access to the GPMC - That being said, the
GPMC isn't accessible on the Beagle, so it isn't relevant :-)

> It may also be that the bottleneck is somewhere else, in the L3
> interconnect, L4 interconnect, or IO firewall.

The L4, L3 and IO firewalls won't be the bottlenecks. It's unfortunately the
GPIO module itself, which isn't made for the purpose you would like. I.e.
it's *not* supposed to be used as an 8-bit parallel IO bus interface, but
as individual GPIOs, each used for control on its own.

In case you need a bus interface on OMAP3, you need to use either the GPMC
(IO), ISP (I), LCD (O), or MMC (IO) interface. The newly introduced OMAP
L138 has a Universal Parallel Port (uPP), which is basically what you want,
but to be honest I don't know that much about this one yet, since it's still
relatively new in the OMAP world and I haven't dealt with it yet :-)

So the short answer to your GPIO trouble is: You won't (unfortunately) be
able to go higher than the ~4MHz you have found - It's limited by the IP
block design AFAIK - I know this isn't the answer you are searching for, but
unfortunately it's the truth :-)

Hope you find another way around this. Using the MMC interface together with
an FPGA for converting into the format you need should bring you the ability
to go to 8-bit@48MHz minus the MMC protocol overhead. This might be a
solution while still utilizing the BeagleBoard? Alternatively you can access
the GPMC on a Gumstix Overo board, but again this requires you to add
extra/other HW to your setup.

Best regards - Sorry about the bad news :-) - Good luck
Søren

---
SSC Solutions ApS - Denmark - www.ssc-solutions.dk


jmelson

Nov 4, 2009, 1:12:02 PM11/4/09
to Beagle Board


On Nov 4, 9:27 am, Søren Steen Christensen <li...@ssc-solutions.dk>
wrote:

> So the short answer to your GPIO trouble is: You won't (unfortunately) be
> able to go higher than the ~4MHz you have found - It's limited by the IP
> block design AFAIK - I know this isn't the answer you are searching for, but
> unfortunately it's the truth :-)

Well, that certainly is not the best news. I have a number of options. One
is that the protocol I'm using was designed for long cables, and the Beagle
would be mounted to an adaptor board that will plug directly into the target
device. So, I can likely just skip the handshaking and know that the target
device will respond within one or two I/O clocks.

> Hope you find another way around this. Using the MMC interface together with
> a FPGA for converting into the format you need should bring you the ability
> to go to 8-bit@48MHz minus the MMC protocol overhead. This might be a
> solution while still utilizing the BeagleBoard? Alternative you can access
> the GPMC on an Gumstix Overo board, but again this requires you to add
> extra/other HW to your setup.
>
The GPIO that is there will work for the initial experiments, but doesn't
provide any improved performance above a PC's already slow parallel port.
Oh, I see you CAN get to all 8 data bits on MMC2 of the Beagle's expansion
header. That would certainly handle the data rate, but I don't know ANYTHING
about the interface. This is not a job for long block transfers, some will
be as short as write one address, read one byte. The longest block transfer
will be write one address, read 12 bytes. So, the MMC may not be a great
help there, if the setup of the controller takes a lot of time.
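
To put rough numbers on the GPIO route, here is a small sketch of how long a "write one address, read 12 bytes" transfer would take at the 250 ns per GPIO access measured earlier in the thread. The number of GPIO accesses needed per bus byte (2 without handshaking, 4 with) is purely an assumption for illustration, not something measured on the Beagle.

```python
GPIO_ACCESS_NS = 250  # one GPIO register access, as measured earlier in the thread


def transfer_time_us(n_bytes, accesses_per_byte):
    """Time in microseconds to move n_bytes when each byte costs
    accesses_per_byte GPIO register accesses (an assumed figure)."""
    return n_bytes * accesses_per_byte * GPIO_ACCESS_NS / 1000.0


# Longest transfer described above: 1 address byte + 12 data bytes.
BYTES_IN_LONGEST_TRANSFER = 13

for accesses in (2, 4):  # hypothetical: without / with handshaking
    t = transfer_time_us(BYTES_IN_LONGEST_TRANSFER, accesses)
    print(f"{accesses} accesses per byte -> {t} us")
```

Under these assumptions, dropping the handshake halves the transfer time, which is in the X2 range.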

Thanks for your info! I can't believe TI tech support has taken 3
weeks so far and can't tell me this! But, maybe they know something
that can help, and are trying to code it up. I don't need a massive
speedup, just X2 or X4 would make me happy in this first experiment.

Jon

Søren Steen Christensen

Nov 4, 2009, 2:30:26 PM11/4/09
to beagl...@googlegroups.com
Hi Jon,

> Oh, I see you CAN get to all 8 data bits on MMC2 of the Beagle's expansion
> header. That would certainly handle the data rate, but I don't know
> ANYTHING about the interface. This is not a job for long block transfers,
> some will be as short as write one address, read one byte. The longest
> block transfer will be write one address, read 12 bytes. So, the MMC may
> not be a great help there, if the setup of the controller takes a lot of
> time.

Hmm - MMC is really designed for longer transfers, although you can do short
transfers as well. It is not, however, designed for a write/ack kind of
communication, and you will take a severe protocol overhead hit (around 5-10
bytes per write AFAIR => You are again down around the 4MHz). The way you
communicate in MMC is that you write all the data in one MMC packet, and
afterwards the recipient indicates whether the data was received correctly
(by a CRC calculation). This is a very simplified and rough explanation, but
in short it doesn't fit your type of EPP communication very well, and you
would need an FPGA in between doing some kind of protocol conversion.
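
As a rough illustration of the overhead argument, the sketch below estimates the payload rate when every transfer carries a fixed overhead. It assumes the 5-10 bytes of overhead can be counted as extra byte-times on an 8-bit bus at 48 MHz; real MMC command and response timing is not modelled, so treat the numbers as a ballpark only.

```python
BUS_BYTES_PER_SEC = 48_000_000  # 8-bit bus at 48 MHz, figure from the thread


def effective_rate(payload_bytes, overhead_bytes, bus_rate=BUS_BYTES_PER_SEC):
    """Payload bytes per second when each transfer adds a fixed overhead,
    with the overhead counted in bus byte-times (a simplifying assumption)."""
    return bus_rate * payload_bytes / (payload_bytes + overhead_bytes)


for overhead in (5, 10):       # overhead range quoted above
    for payload in (1, 12):    # shortest and longest reads in this application
        rate = effective_rate(payload, overhead)
        print(f"payload={payload:2d} B overhead={overhead:2d} B "
              f"-> {rate / 1e6:.1f} MB/s")
```

With a single payload byte and 10 bytes of overhead this comes out near 4.4 MB/s, which is where the "down around the 4MHz" figure comes from; batching 12 bytes per transfer recovers most of the bus rate.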

Setting up and restarting the MMC module between transfers doesn't take
long. AFAIR it's just a matter of feeding the data you want to send into the
MMC FIFO, programming the MMC packet type and setting a start-bit (again a
very rough simplification), so you should be able to send packets at a
reasonable rate. Unfortunately, the packet overhead itself will "kill you".

That being said, I think your best option for something like this really
is to use either the GPMC bus (on a Gumstix Overo or similar) or the
uPP interface on an OMAP L138, which is designed for this kind of
communication AFAIK...

The Beagleboard really isn't very capable of a task like the one you need -
Unfortunately...

jmelson

Nov 5, 2009, 11:54:58 AM11/5/09
to Beagle Board


On Nov 4, 1:30 pm, Søren Steen Christensen <li...@ssc-solutions.dk>
wrote:

>
> Setting up and restarting the MMC module between transfers doesn't take
> long. AFAIR it's just a matter of feeding the data you want to send into the
> MMC FIFO, programming the MMC packet type and setting a start-bit (again a
> very rough simplification), but you should be able to do packages at a
> reasonable speed, but the package overhead itself will "kill you" -
> Unfortunately

Well, this might not be so bad. If the transfer rate is quite fast on the
MMC side, I could have a packet format that looks like address:r/w:data for
each byte or contiguous block to be transferred, and an FPGA or CPLD on the
interface board would do the conversion to a sped-up EPP transfer to the
target device.

The way the PC driver works now, it wraps up all the reads to be done in one
group and does them, then the program performs calculations for the motion
control and sends data to the driver to be written all in one block to the
target. If I set up the FPGA (Hmm, now it seems a CPLD might not have enough
memory) so you could pre-load a format, then every read request would send
you a packet containing data corresponding to that format: you just ask for
a read, and the data comes all pre-formatted as required. Putting all the
transfers together in blocks is already done by the existing driver, so this
wouldn't be hard to do. With appropriate programming of the FPGA, this
should all boil down to two blocks to be transferred, first a read, then a
write.
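
A minimal sketch of what such an address:r/w:data record could look like when several requests are batched into one block. The byte layout (address byte, r/w byte, length byte, then data for writes) and the names `pack_request` and `unpack_requests` are hypothetical, chosen only to illustrate the idea; nothing here is an existing driver or FPGA format.

```python
import struct

READ, WRITE = 0, 1  # hypothetical r/w flag values


def pack_request(address, rw, data=b"", read_len=0):
    """Build one record: address byte, r/w byte, length byte,
    followed by the data bytes when the record is a write."""
    if rw == WRITE:
        return struct.pack("BBB", address, WRITE, len(data)) + bytes(data)
    return struct.pack("BBB", address, READ, read_len)


def unpack_requests(buf):
    """Split a block of concatenated records into
    (address, rw, length, data) tuples, as the FPGA side would."""
    records, i = [], 0
    while i < len(buf):
        address, rw, length = struct.unpack_from("BBB", buf, i)
        i += 3
        data = b""
        if rw == WRITE:
            data = buf[i:i + length]
            i += length
        records.append((address, rw, length, data))
    return records


# One block holding a 12-byte read request and a 2-byte write.
block = pack_request(0x10, READ, read_len=12) + pack_request(0x20, WRITE, b"\x01\x02")
print(unpack_requests(block))
```

Batching records like this means the fixed per-transfer overhead is paid once per block rather than once per byte.
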

> The above being said I think you best option for something like this really
> is to utilize either the GPMC bus (on an Gumstix Overo or similar) or the
> uPP interface on an OMAP L138 which is designed for this kind of
> communication AFAIK...

So, is there going to be an affordable, Beagle-like board for the L138? If
the Beagle can be a stepping stone to a better system, that will be OK.
Right now we don't even have an RTAI kernel, but that is being worked on.

Thanks,

Jon