Could the PRU read these signals?


PAk Ys

May 26, 2020, 5:49:33 AM
to BeagleBoard

Hello. 

I am quite new to the PRU, so I apologize in advance for any mistakes or incorrect assumptions.


I need to read an LVDS signal consisting of 4 data channels plus CLK and frame sync, which looks like this:




As you can see, the signals change on every clock edge (both falling and rising).

Of course, we would place a differential LVDS receiver ahead of the BeagleBone PRU pins (so the signals will look exactly like the black + traces in the figure). Our intention is to send the data to memory afterwards, and the ARM would probably build UDP packets to send it over Ethernet.


I would like to know whether the PRU can receive these signals and process them correctly.

Would it be a Direct Connection Mode configuration?


If not, would it be possible to poll the CLK and FRAME signals fast enough and, on every change, sample the other 4 channels simultaneously?


Thank you all.


rpau...@gmail.com

May 26, 2020, 8:30:08 AM
to beagl...@googlegroups.com

You don’t indicate how fast your clock is.  Everything in the PRU is polled, so you would have to construct a loop that looks at each pin and makes a determination if the clock has changed state.  The PRU operates at 200 MHz and has simple instructions, so you would have to calculate how many instructions you have for each clock to determine if it is even possible.  Then, of course, you have to put the data somewhere.   Typically one PRU is used to poll I/O lines and assemble the data into chunks that are passed to the 2nd PRU which can either put them in ARM DDR memory, send out, etc.  If you get the voltages on the PRU input pins correct and the data rate is within what the PRU can handle, then this can be done.

--
For more options, visit http://beagleboard.org/discuss
---
You received this message because you are subscribed to the Google Groups "BeagleBoard" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beagleboard...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beagleboard/d9898dc2-a2b5-4a00-b8d2-bf1165e28e62%40googlegroups.com.

PAk Ys

May 26, 2020, 11:04:50 AM
to BeagleBoard
First of all, let me thank you for your answer.

Our signal will be in the tens of MHz (from 5 MHz to 50 MHz max), depending on configuration.

Polling is what I expected; however, the manual (spruh73q) describes three methods (Direct Input, Parallel Mode and 28-bit Shift Register), which confuses me. In your proposal, is Direct Input used? What is the maximum speed in this mode, 100 MHz?
What is the difference between Parallel Mode and polling? Is it the automatic clocking?

Sorry for my lack of deeper knowledge (yet). Thank you again.


TJF

May 26, 2020, 11:22:19 AM
to BeagleBoard
Hi!


On Tuesday, May 26, 2020 at 17:04:50 UTC+2, PAk Ys wrote:
> Our signal will be in the tens of MHz (from 5Mhz to 50MHz max), depending on configuration.
>
> The polling method is what I expected, however the manual (spruh73q) states there are three methods (Direct Input, Parallel Mode and 28bit shift Register), which confuses me. In your proposal, is Direct Input used? What is the max speed at this mode, 100MHz?
> What is the difference between using Parallel mode and Polling, the automatic clocking?

The shift register is limited to a single pin.

GPIO is limited to 0 V or 3.3 V, so you have to adapt the input voltage.

You have to use direct PRU GPIO (up to 16 pins) in order to avoid L3 latency.

Polling the clock signal needs only one cycle -> 200 MHz, but reading the lines and writing to memory consumes three further cycles -> 50 MHz main loop. So it should run reliably up to 25 MHz.

In order to buffer the data you can use SRam (12 kB) or DRam (up to 16 kB).

Regards

PAk Ys

May 26, 2020, 11:35:20 AM
to BeagleBoard
Thank you TJF, this is quite helpful.
Could you point me to a code example for a similar solution? I understand the lines would be read in parallel from the register, right?

pru_GPI.PNG



We were thinking about delaying one of the clocks by half a cycle and using an AND gate (so the output is effectively a clock with twice the frequency), and then reading with the parallel method. Is there any benefit in doing that (besides the max 50 MHz frequency)? Is it easier to handle?

Best regards

Dennis Lee Bieber

May 26, 2020, 12:09:44 PM
to Beagleboard
On Tue, 26 May 2020 08:04:49 -0700 (PDT), in
gmane.comp.hardware.beagleboard.user PAk Ys
<frasalyu-Re5JQE...@public.gmane.org> wrote:

>First of all, let me thank you for your answer.
>
>Our signal will be in the tens of MHz (from 5Mhz to 50MHz max), depending
>on configuration.
>
>The polling method is what I expected, however the manual (spruh73q) states
>there are three methods (Direct Input, Parallel Mode and 28bit shift
>Register), which confuses me. In your proposal, is Direct Input used? What
>is the max speed at this mode, 100MHz?

How many PRU instructions will be required to...

Sample the clock

Determine clock state changed (eg: compare to previous value and branch
back to top if same)

Read data pin(s)

Write the data to memory

Return to top


... oh, and if you need to first look for the frame synch, that adds
another loop around the above with its sample/test

At best, that looks to be 5 to 7 (frame synch version) instructions. A
five instruction loop with 200MHz processor results in 40MHz "baud rate".
If you need more instructions -- increment a memory pointer, say, that will
reduce the effective rate. If the worst case cycle (and you WILL have to do
that worst-case evaluation!) consumes 10 instructions, your effective rate
will only be 20MHz.

That is based upon the PRU assembler instruction set -- if using a
C-level source, you may need to have the compiler dump an assembly listing
so you can study the instructions needed by the loops...
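
The budget above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes every instruction costs exactly one 5 ns cycle, which is optimistic for memory stores, and the function names are made up for illustration. For a signal whose data change on both clock edges, the usable clock frequency is half the sample rate.

```python
PRU_CLOCK_HZ = 200_000_000  # PRU core clock as stated in the thread


def max_sample_rate_hz(worst_case_instructions: int) -> float:
    """Samples per second a polling loop can sustain if every pass
    through the loop costs `worst_case_instructions` cycles.
    Assumption: one cycle per instruction, no memory stalls."""
    if worst_case_instructions <= 0:
        raise ValueError("instruction count must be positive")
    return PRU_CLOCK_HZ / worst_case_instructions


def max_ddr_clock_hz(worst_case_instructions: int) -> float:
    """Data changing on both clock edges means two samples per clock
    period, so the clock may run at half the sample rate at most."""
    return max_sample_rate_hz(worst_case_instructions) / 2


if __name__ == "__main__":
    for n in (4, 5, 7, 10):
        print(f"{n:2d} instructions: "
              f"{max_sample_rate_hz(n) / 1e6:5.1f} Msamples/s, "
              f"both-edge clock up to {max_ddr_clock_hz(n) / 1e6:5.1f} MHz")
```

These are upper bounds with no margin; the worst-case path through the real loop is what counts.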



--
Dennis L Bieber

TJF

May 28, 2020, 1:57:43 AM
to BeagleBoard
There are 17 input lines in each R31 (bits [0:16]). It would be best to use a set of eight (bits [0:7] or bits [8:15]), because the data to write would then be only one byte.

Anyhow, the following ASM code is for word data (bit [0:15] range), written to DRam:

#define CLKB 5         // define the clock bit# for polling

...

  LDI  r0, 0           // counter init
HIGH:
  QBBC HIGH, r31, CLKB // wait for clk bit getting high
  SBBO r31, r0, 0, 2   // save data
  ADD  r0, r0, 2       // increment counter
  QB?? ??, OUT         // termination
LOW:
  QBBS LOW, r31, CLKB  // wait for clk bit getting low
  SBBO r31, r0, 0, 2   // save data
  ADD  r0, r0, 2       // increment counter
  QB?? ??, HIGH        // reverse termination
OUT:

// Note:
// In order to get higher frequency the SBBO + ADD instructions can
// get replaced by MVIW for buffering the data in the register file,
// but this is limited to 30*2=60 sets of data.

The main loop contains two similar sub-loops, one starting after the clk line goes high, the other starting after the clk line goes low.
If the initial state of the clk line is undefined, you have to add an initial QBB? before the main loop, in order to start in the right sub-loop.
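
The control flow of the two sub-loops can be tried out on a host machine before writing PRU firmware. The following is only a sketch in Python, not PRU code: `capture_both_edges` is a made-up name, the list of R31 readings stands in for successive polls of the register, and it assumes the clock starts low. Unlike the real loop, it stores the same reading that triggered the edge detection rather than a reading taken one cycle later.

```python
CLKB = 5  # clock bit number in R31, same value as in the listing


def capture_both_edges(r31_samples, max_words):
    """Emulate the HIGH/LOW sub-loops: store one 16-bit word after each
    clock transition. `r31_samples` is an iterable of successive R31
    readings (hypothetical test data, one entry per polling pass)."""
    captured = []
    waiting_for_high = True  # assumption: the clock line starts low
    for r31 in r31_samples:
        clk = (r31 >> CLKB) & 1
        wanted = 1 if waiting_for_high else 0
        if clk == wanted:
            captured.append(r31 & 0xFFFF)         # corresponds to SBBO r31, r0, 0, 2
            waiting_for_high = not waiting_for_high
            if len(captured) >= max_words:        # corresponds to the termination branch
                break
    return captured


if __name__ == "__main__":
    # Bit 5 (0x20) is the clock; the low bits carry data.
    readings = [0x01, 0x01, 0x22, 0x22, 0x03, 0x24]
    print([hex(w) for w in capture_both_edges(readings, max_words=8)])
```

Feeding it readings in which the clock toggles too fast relative to the polling rate shows how edges get missed once the loop is slower than the signal.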

As far as I understand your signal diagram, you're dealing with redundant data. Only three lines provide information.

Find example code in the libpruio documentation. Example pruss_toggle defines an output line and loads firmware to toggle that line. You can adapt that code for your input lines.

Andrew P. Lentvorski

May 28, 2020, 4:48:56 AM
to BeagleBoard
What are you actually trying to do?  That's a 50Mbit sustained transfer rate if bit time is 20ns.  100Mbit if 10ns.  And 4 channels.  That's 200-400Mbps. You're moving a *TON* of data even at slower clock rates--there is no resource on the BBB that can keep that up very long.
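
To put numbers on that, here is a small sketch of the arithmetic. The function names are made up for illustration, and the 12 KB buffer size is the PRU shared RAM figure quoted earlier in the thread; it assumes one bit per channel per bit time and no protocol overhead.

```python
def aggregate_bit_rate(bit_time_ns: float, channels: int = 4) -> float:
    """Aggregate rate in bit/s when each channel delivers one bit
    every `bit_time_ns` nanoseconds."""
    return channels * 1e9 / bit_time_ns


def seconds_to_fill(buffer_bytes: int, bit_rate: float) -> float:
    """Time until a buffer of `buffer_bytes` is full at `bit_rate` bit/s."""
    return buffer_bytes * 8 / bit_rate


if __name__ == "__main__":
    shared_ram = 12 * 1024  # 12 KB PRU shared RAM, as quoted in the thread
    for bit_time in (20, 10):
        rate = aggregate_bit_rate(bit_time)
        print(f"{bit_time} ns bit time: {rate / 1e6:.0f} Mbit/s, "
              f"12 KB fills in {seconds_to_fill(shared_ram, rate) * 1e3:.2f} ms")
```

At 20 ns per bit the four channels would fill 12 KB in roughly half a millisecond, which is why the downstream path matters as much as the sampling loop.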

This seems quite questionable on the BBB ... even more so if your timing diagram is valid (which I suspect it is not).

First, given your diagram, the sampling point would have to be recovered.  Note that the Clock, Data, and Frame are all coincident.  Normally the clock falls in the middle of the data window.  That's not easy to recover without a PLL.  I doubt the BBB has enough resolution to hit the data stable point on the data with any reliability at 50MHz.

Second, are you *actually* running at 50MHz?  Are you considering every *cross* as the frequency (high-to-low on a single line) or are you considering an individual signal rise-to-rise as the frequency.  Is that data changing every 20ns or every *10ns*?  If it's 10ns, it's probably not possible.

I would *STRONGLY* recommend that you use an FPGA.  Even an incredibly cheap FPGA would deserialize that with ease and then you could put it into a form in which you might be able to abuse something like the RMII from the Ethernet peripheral to transfer it out.

But it's still a *LOT* of data.  You need something that can handle gigabit speeds if you keep this up even for a couple milliseconds.

Your best bet would be to use that FPGA to also decimate the data as well so that you're working with a reasonable amount of data.

PAk Ys

May 28, 2020, 5:50:37 AM
to BeagleBoard
Thank you TJF for that detailed answer and thank you too Andrew for your insights.

We are trying to extract radar data from a radar board. The goal is not to have it working at 50 MHz (which is the system's maximum rate), but to have that possibility if needed. We will probably work at 10 to 20 MHz, and the data on the BBB is only sent via Ethernet to another device on the same network. We only need to extract the data from the 4 data lines (channels) whenever CLK changes; the Frame line is only used to synchronize at the beginning, since the format doesn't change.

We have already done this setup with DSPs and MCUs with integrated Ethernet with no issues. However, for my new design I can only use LVDS, so a fast device (like the PRU) is needed. The BeagleBone would give me much more flexibility than an FPGA, because I could decimate, as you rightly said, and/or implement a simple radar data viewer on the ARM core.


Steve Lentz

May 28, 2020, 8:21:32 AM
to beagl...@googlegroups.com
I too was skeptical of FPGAs until recently. I think you owe it to yourself to check out Zynq and Altera, which combine FPGAs with ARM cores.  The PL (FPGA part of these chips) can align the clock edges and write the data into the memory of the ARM processor.  You might find it is possible to do the decimation in the PL, saving the ARM for other things.  You could potentially even generate UDP packets in the PL, if all you want to do is move the data someplace else.  The infamous FPGA learning curve is quite real, but not insurmountable.  There are lots of tutorials available, both from the vendors and from other sources.



PAk Ys

May 28, 2020, 1:13:37 PM
to BeagleBoard
Thank you for your point of view, Steve. The thing is, I prefer to have a community behind the platform, as with the BeagleBone, and a smaller form factor. Besides, my goal is to build a radar board that people could use in their own projects.  If the PRU is enough, that would be fantastic; a little AI from the BeagleBone AI could help a lot in developing new killer applications.
Don't you think?
Anyway, I will keep the Zynq Z-turn board in mind.


Andrew P. Lentvorski

May 28, 2020, 7:22:40 PM
to BeagleBoard
You can certainly persist, but I'm going to point out the existence of chips like the AWR1843--"Single-chip 76-GHz to 81-GHz automotive radar sensor integrating DSP, MCU and radar accelerator":

This is about $30, and does all the RF-y things while sending your ADC data straight to a DSP and Cortex R4F with extra Radar-y things to accelerate analysis.  This allows you to focus on analyzing the results instead of the guts of "implementing a radar".

Anyways, good luck.  Sounds like an interesting project.

PAk Ys

May 29, 2020, 5:35:56 AM
to BeagleBoard
Thank you Andrew. I have been a radar engineer for more than 15 years myself.
We have designs more interesting than TI's AWR series, which is quite interesting for some applications but not so much for others. The costs are a little higher than those $30, especially if you have to develop your own PCB antenna.



My idea is to create a simpler, better device able to work with  the beaglebone (or the beaglewire), allowing users to create their own radar applications in a very fast way (including the UI).

TJF

May 29, 2020, 8:39:27 AM
to BeagleBoard
On Friday, May 29, 2020 at 11:35:56 UTC+2, PAk Ys wrote:
> My idea is to create a simpler, better device able to work with  the beaglebone (or the beaglewire), allowing users to create their own radar applications in a very fast way (including the UI).

You can use the second PRU to de-serialize the data, while the first PRU is fetching them. Meanwhile the ARM CPU can provide the data transfer over the network (or PRU IEP module?), or it can do simple evaluations like pre-selection of relevant data sections, or GUI output ...
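
What the de-serializing step has to do can be sketched on a host in Python. This is only an illustration under assumptions: the four data lines are taken to sit on R31 bits 0 to 3, the first captured bit is treated as the most significant one, and the function names are made up. The real bit positions and bit order depend on the pin wiring and on the radar's frame format.

```python
DATA_BITS = (0, 1, 2, 3)  # hypothetical R31 bit numbers of the four data lines


def split_channels(words):
    """Return one list of bits per data line, in capture order.
    `words` are the 16-bit values stored by the sampling PRU."""
    return [[(w >> b) & 1 for w in words] for b in DATA_BITS]


def pack_msb_first(bits):
    """Pack a list of bits into an integer, first captured bit as MSB
    (an assumption about the serial format)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


if __name__ == "__main__":
    captured = [0b0001, 0b0011, 0b0100, 0b1001]  # made-up capture of four edges
    for line, bits in enumerate(split_channels(captured)):
        print(f"data line {line}: bits {bits} -> {pack_msb_first(bits)}")
```

On the PRU itself the same work would be done with shifts and masks in registers; the point here is only the data layout.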

PAk Ys

May 29, 2020, 1:22:27 PM
to BeagleBoard
Thank you TJF, that was exactly my idea from the beginning, with that exact architecture.
If you think it is feasible, then I will go for it.

Best regards

Gerhard Hoffmann

May 29, 2020, 3:13:20 PM
to beagl...@googlegroups.com
Hi,

I'm reading LTC2500-32 ADCs with the PRU. The LTC2500 delivers 32 bits every microsecond as a 320 nsec burst with 100 MHz bit rate. I receive the burst with a shift register in a Xilinx CoolRunner-II CPLD and read the shift register bytewise with PRU2.

I think I could read 3 ADCs with the current timing (minimum requirement for me), maybe 4 with some optimization. The PRU writes the data into this 12 KW shared RAM, organized as a ping-pong buffer. The ARM continuously reads one half of the buffer while the other half is written by the PRU. That also solves the problem that it is hard to allocate REALLY big buffers in the virtual address space of Linux and to fix their location somewhere for the PRU, in addition to the unpredictable duration of a memory cycle when competing with the ARM for access.
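
The ping-pong scheme can be sketched in a few lines of Python. This is a toy model, not the actual firmware: the class and method names are made up, a `bytearray` stands in for the shared RAM, and producer and consumer run in the same thread, so none of the real synchronization between the PRU and the ARM is shown.

```python
class PingPongBuffer:
    """Toy model of a two-half buffer: the producer fills one half
    while the consumer drains the other (hypothetical names)."""

    def __init__(self, half_size: int):
        self.half_size = half_size
        self.mem = bytearray(2 * half_size)  # stands in for the shared RAM
        self.write_half = 0                  # half currently owned by the producer
        self.fill = 0                        # bytes already written into that half
        self.ready_half = None               # completed half waiting for the consumer

    def produce(self, byte: int) -> None:
        """PRU side: store one byte, swap halves when the current one is full."""
        base = self.write_half * self.half_size
        self.mem[base + self.fill] = byte
        self.fill += 1
        if self.fill == self.half_size:
            # A real system would flag an overrun here if ready_half is still set.
            self.ready_half = self.write_half
            self.write_half ^= 1
            self.fill = 0

    def consume(self):
        """ARM side: return the completed half as bytes, or None if none is ready."""
        if self.ready_half is None:
            return None
        base = self.ready_half * self.half_size
        data = bytes(self.mem[base:base + self.half_size])
        self.ready_half = None
        return data


if __name__ == "__main__":
    buf = PingPongBuffer(half_size=4)
    for b in range(4):
        buf.produce(b)
    print(buf.consume())  # first half, while the producer moves on to the second
```

The consumer has one half-buffer period to drain its half; if it is slower than that, data is overwritten, so the half size sets the latency the ARM side must meet.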

I have currently paused that software development to first fix the analog part.


Cheers, Gerhard



PAk Ys

May 29, 2020, 5:35:58 PM
to BeagleBoard
That is really interesting Gerhard!!

Let me ask you, why did you decide to use a CPLD instead of any other device?

Actually this signal configuration is very common in ADC+SerDes devices (ADCs with a DDR bit clock); for example, you can take a look at these application notes:


My idea is, if possible, to do just that with the PRUs, as a more powerful, lower-cost, better-integrated setup.

Gerhard Hoffmann

May 29, 2020, 8:14:25 PM
to beagl...@googlegroups.com

On 29.05.20 at 23:35, PAk Ys wrote:
> That is really interesting Gerhard!!
>
> Let me ask you, why did you decide to use a CPLD instead any other device?
>
The CPLD is the ideal thing to collect a few leftover logic bits and for experimenting. I have used the CoolRunners for a very long time. They would be considered "mature" by now. In this $2.50 device, you get 64 flip-flops and enough combinatorial logic for

- generating the 1 MHz sample clock from the 100 MHz Xosc, 30 nsec wide pulses just like the ADC loves it

- a state machine to read out the ADC in a burst. The ADC wants to be left alone in 2/3 of each cycle to avoid coupling dirt during conversion. In the last 1/3 you have to hurry to get your data bits,

- a 32 bit shift register to collect the SPI data

- a 4*8 bit mux and interface to the BBB with 2 byte select lines set by the BBB

- data_available from ADC / ready to BBB handling, and allowing the PRU to bit bang the ADC for setting up filters, decimation rate etc.

The 3 channels would not fit together; one might use a Spartan or whatever finally. But for first tests, it is fine. The CPLD is a stamp-sized mini-board with a core voltage regulator; it remembers its programming and can be programmed in the usual way via JTAG.

The other stamp is the LTC2500-32 ADC, with low noise LT3042 regulators for analog & digital VCC, a negative regulator, an LT6655 reference and the analog ADC driver.

That's all the hardware. With the BBB and its software it would be a complete Fourier analyzer with cross correlation and things. BTW I could compile FFTW, the fastest FFT in the West, just so. To talk with the analyzer, you just open port 5005 on 192.168.178.33 and dump GPIB-style commands. Just like my Agilent 89441A. :-)

Methinks that the BBB can have a SRAM-like 16 bit bus interface. That would be very interesting for FPGA device registers, FIFOs, DMA buffers and such. Reasonably wide single cycle accesses, but I'm not sure what has to be given up for it, let alone how to get positively rid of these competing features. And how to switch on and place that memory window. I think it would cost a 16 bit CMOS transceiver and an address latch, but it might ease a lot of things.


cheers,

Gerhard


Auswahl_004www.png

TJF

May 30, 2020, 4:11:17 AM
to BeagleBoard
On Saturday, May 30, 2020 at 02:14:25 UTC+2, Gerhard Hoffmann wrote:
> Methinks that the BBB can have a SRAM-like 16 Bit bus interface. That would be very interesting for FPGA device registers, FIFOs, DMA buffers and such.
>
> Reasonably wide single cycle acesses, but I'm not sure what has to be given up for it, ...

On BBB one can configure a 17 bit unidirectional interface on PRU-0 (perhaps also bidirectional by run-time pinmuxing). The SD card slot has to be given up for it. Find details in section PRU fast GPIO 16 bit

Regards