Weird behaviour with pins and PRUs

Loïc Droz

Nov 8, 2017, 3:27:20 PM
to BeagleBoard
Hello,

I am a beginner in PRU and embedded programming in general, so I apologize if I am missing something obvious.

I am trying to implement an audio processing algorithm (CIC filter) to run on the PRUs. To do so, I need a CLK signal for the microphone: I generate it on PRU0, output it on pin P8.11, and connect the mic's CLK pin directly to that pin. However, I also need to be able to read the CLK signal from PRU1, so I figured I would plug the CLK signal generated by PRU0 to pin P8.46 which I then poll from PRU1.

Doing this gives strange results, though. As soon as I connect P8.11 (output CLK) to P8.46 (input CLK), the CLK signal from P8.11 gets stuck at Vdd. If I unplug it, it remains stuck at Vdd. Even restarting both programs (the clock-generating one on PRU0 and the audio processing one on PRU1) does not fix the issue. Only restarting the BeagleBone does. I checked the voltage of P8.46 and it is always at 0.

I tried running only the PRU0 CLK program and the issue does not happen, so it looks like the PRU1 program is causing it.

Here is the code for the PRU0 CLK program:




/* Code for the clock generated by PRU0 and sent to the microphone. */
.origin 0
.entrypoint TOP

#include "prudefs.hasm"

#define CYCLES 39
#define CLK_PIN r30.b1

TOP:
    MOV     r0, CYCLES
_LOOP:
    SUB     r0, r0, 1
    QBNE    _LOOP, r0, 0

    // Toggle CLK signal
    XOR     CLK_PIN, CLK_PIN, 1 << 7
    QBA     TOP
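
For reference, the clock frequency this loop produces can be estimated by counting instructions. The sketch below is a rough model, not a measurement: it assumes a 200 MHz PRU core clock (5 ns per cycle, consistent with the 125 ns = 25 cycles figure used later in the thread) and one cycle per instruction for the register-only instructions in the listing. The function names are made up for illustration.

```python
PRU_CLOCK_HZ = 200_000_000  # assumed PRU core clock (5 ns per cycle)


def toggle_loop_cycles(cycles_define):
    """Cycles between two toggles of CLK_PIN, assuming one cycle per instruction."""
    mov = 1                     # MOV r0, CYCLES
    delay = 2 * cycles_define   # SUB + QBNE, executed CYCLES times
    toggle_and_jump = 2         # XOR + QBA
    return mov + delay + toggle_and_jump


def clock_frequency_hz(cycles_define):
    # A full clock period needs two toggles (one high half, one low half).
    return PRU_CLOCK_HZ / (2 * toggle_loop_cycles(cycles_define))


print(toggle_loop_cycles(39))         # 81
print(round(clock_frequency_hz(39)))  # 1234568
```

Under these assumptions, CYCLES = 39 gives a half period of 81 cycles (405 ns) and a clock of roughly 1.23 MHz.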




And here is the code for the PRU1 audio program (sorry, it's a little long):




#define PRU1_ARM_INTERRUPT 20

// Input pins offsets
#define CLK_OFFSET 1
#define DATA_OFFSET 0

// Register aliases
#define IN_PINS r31
#define SAMPLE_COUNTER r5
#define WAIT_COUNTER r6
#define TMP_REG r7
#define BYTE_COUNTER r8

#define HOST_MEM r20
// Host mem size is multiple of 8, this is ensured on the host side
#define HOST_MEM_SIZE r21
#define LOCAL_MEM r22
// Defined in page 19 of the AM335x PRU-ICSS Reference guide
#define LOCAL_MEM_ADDR 0x2000

#define INT0 r0
#define INT1 r1
#define INT2 r2
#define INT3 r3
#define LAST_INT r4

#define COMB0 r10
#define COMB1 r11
#define COMB2 r12
//#define COMB3 r13
#define LAST_COMB0 r14
#define LAST_COMB1 r15
#define LAST_COMB2 r16

// DEBUG (assumes pin P8.44)
#define SET_LED SET r30, r30, 3
#define CLR_LED CLR r30, r30, 3

.origin 0
.entrypoint TOP

TOP:
    //MOV     r31.b0, PRU1_ARM_INTERRUPT + 16
    SET_LED
    // ### Memory management ###
    // Enable OCP master ports in SYSCFG register
    // It is okay to use r0 here (which we also use later) because it only temporarily holds the value read from C4 + 4; the OCP master ports are enabled by writing the modified value back to C4 + 4
    LBCO    r0, C4, 4, 4
    CLR     r0, r0, 4
    SBCO    r0, C4, 4, 4
    // Load the local memory address in a register
    MOV     LOCAL_MEM, LOCAL_MEM_ADDR
    // From local memory, grab the address of the host memory (passed by the host before this program started)
    LBBO    HOST_MEM, LOCAL_MEM, 0, 4
    // Likewise, grab the host memory length
    LBBO    HOST_MEM_SIZE, LOCAL_MEM, 4, 4

    // ### Set up start configuration ###
    // Setup counters to 0 at first
    LDI     SAMPLE_COUNTER, 0
    LDI     BYTE_COUNTER, 0
    // Set all integrator and comb registers to 0 at first
    LDI     INT0, 0
    LDI     INT1, 0
    LDI     INT2, 0
    LDI     INT3, 0
    LDI     COMB0, 0
    LDI     COMB1, 0
    LDI     COMB2, 0
    //LDI     COMB3, 0
    LDI     LAST_INT, 0
    LDI     LAST_COMB0, 0
    LDI     LAST_COMB1, 0
    LDI     LAST_COMB2, 0

    // ### Signal processing ###
wait_edge:
    // First wait for CLK = 0
    WBC     IN_PINS, CLK_OFFSET
    // Then wait for CLK = 1
    WBS     IN_PINS, CLK_OFFSET

    // Wait for the t_dv time: since it can be at most 125 ns, we have to wait for 25 cycles (5 ns each)
    LDI     WAIT_COUNTER, 12 // Because 25 = 1 + 12*2 and the loop takes 2 one-cycle ops
wait_signal:
    SUB     WAIT_COUNTER, WAIT_COUNTER, 1
    QBNE    wait_signal, WAIT_COUNTER, 0

    // Retrieve data from DATA pin (only one bit!)
    AND     TMP_REG, IN_PINS, 1 << DATA_OFFSET
    LSR     TMP_REG, TMP_REG, DATA_OFFSET
    // Do the integrator operations
    ADD     SAMPLE_COUNTER, SAMPLE_COUNTER, 1
    ADD     INT0, INT0, TMP_REG
    ADD     INT1, INT1, INT0
    ADD     INT2, INT2, INT1
    ADD     INT3, INT3, INT2

    // Branch for oversampling
    QBNE    wait_edge, SAMPLE_COUNTER, 64

    // Reset sample counter once we reach R
    LDI     SAMPLE_COUNTER, 0

    // 4 stage comb filter
    SUB     COMB0, INT3, LAST_INT
    SUB     COMB1, COMB0, LAST_COMB0
    SUB     COMB2, COMB1, LAST_COMB1
    SUB     TMP_REG, COMB2, LAST_COMB2

    // Output the result to memory
    // We write one word (4 B) from TMP_REG to HOST_MEM with an offset of BYTE_COUNTER
    SBBO    TMP_REG, HOST_MEM, BYTE_COUNTER, 4
    // Increment the written bytes counter once the write operation is done
    ADD     BYTE_COUNTER, BYTE_COUNTER, 4
    // First, check if we are about to overrun the buffer, that is, if HOST_MEM_SIZE - BYTE_COUNTER < 4
    // If yes, send an interrupt to the host, and reset the byte counter/offset back to 0
    // TODO: since HOST_MEM_SIZE is a multiple of 8, maybe we could just do an equality check ?
    //SUB     TMP_REG, HOST_MEM_SIZE, BYTE_COUNTER
    //QBGE    check_half, 4, TMP_REG  // Jump to "check_half" if HOST_MEM_SIZE - BYTE_COUNTER >= 4
    QBNE    check_half, HOST_MEM_SIZE, BYTE_COUNTER
    MOV     r31.b0, PRU1_ARM_INTERRUPT + 16  // Interrupt the host, TODO: could be done in a safer way by writing to the host memory which buffer we're in
    LDI     BYTE_COUNTER, 0  // Reset counter/offset, which will make us write to the beginning of host memory again
    QBA     continue_comb

// TODO: could be done in a more efficient way, by storing the half value in a register
check_half:
    // If we have filled half of the host buffer, send an interrupt. TMP_REG holds the host buffer size divided by 2; since the host memory length is a multiple of 8, half of it is a multiple of 4
    LSR     TMP_REG, HOST_MEM_SIZE, 1  // Shift right by 1 to halve the size
    QBNE    continue_comb, TMP_REG, BYTE_COUNTER
    // Interrupt the host to tell it we wrote to half of the buffer
    MOV     r31.b0, PRU1_ARM_INTERRUPT + 16

continue_comb:
    // Update LAST_INT value and LAST_COMBs
    // TODO: check this is correct, and this could perhaps be done in fewer instructions
    MOV     LAST_INT, INT3
    MOV     LAST_COMB0, COMB0
    MOV     LAST_COMB1, COMB1
    MOV     LAST_COMB2, COMB2

    // Branch back to wait edge
    QBA     wait_edge

    // Not reached at the moment (the loop above never exits): interrupt the host so it knows we're done
    MOV     r31.b0, PRU1_ARM_INTERRUPT + 16

    HALT
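
The filter arithmetic in the listing above can be checked off-target with a small host-side model. The following is only a sketch: it mirrors the four integrators, the decimation by 64 and the four comb stages as they appear in the PRU1 code, and masks every result to 32 bits on the assumption that the PRU registers simply wrap on overflow. The function name `cic_decimate` is made up for illustration.

```python
MASK = 0xFFFFFFFF  # assumed: 32-bit registers that wrap on overflow


def cic_decimate(bits, ratio=64):
    """Model of the loop above: 4 integrators, decimate by `ratio`, 4 combs."""
    int0 = int1 = int2 = int3 = 0
    last_int = last_comb0 = last_comb1 = last_comb2 = 0
    out = []
    for n, bit in enumerate(bits, start=1):
        # Integrator cascade, run once per input bit
        int0 = (int0 + bit) & MASK
        int1 = (int1 + int0) & MASK
        int2 = (int2 + int1) & MASK
        int3 = (int3 + int2) & MASK
        if n % ratio:
            continue  # same role as QBNE wait_edge, SAMPLE_COUNTER, 64
        # Comb cascade, run once per `ratio` input bits
        comb0 = (int3 - last_int) & MASK
        comb1 = (comb0 - last_comb0) & MASK
        comb2 = (comb1 - last_comb1) & MASK
        out.append((comb2 - last_comb2) & MASK)
        last_int, last_comb0, last_comb1, last_comb2 = int3, comb0, comb1, comb2
    return out


steady = cic_decimate([1] * (64 * 6))
print(steady[-1] == 64 ** 4)  # True
```

A constant input of ones settles to 64^4 = 2^24 after the first few outputs, which is the expected DC gain of a 4-stage CIC with decimation ratio 64 and still fits in a 32-bit word.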



If you don't feel like reading everything, these are all the instructions that mention r31, the input pins register:

WBC     IN_PINS, CLK_OFFSET
WBS     IN_PINS, CLK_OFFSET
AND     TMP_REG, IN_PINS, 1 << DATA_OFFSET
MOV     r31.b0, PRU1_ARM_INTERRUPT + 16


I suppose the last instruction might cause problems, since the bit corresponding to pin P8.46 is in r31.b0. As far as I understand, though, writing to these bits does not actually change the value of the pins but only triggers an interrupt; I may be wrong. I also tried using pin P8.30 for the CLK input, which has an offset of 11 and so is not in r31.b0, but I still have the same problem.

I made sure to run these commands before running my programs:
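
To make the value written to r31.b0 concrete, here is a small decoding sketch. It rests on an assumption about the PRU-ICSS event interface that is not stated in this thread: that bit 5 of a write to r31 acts as a strobe, that bits 3..0 select a channel, and that channels 0..15 map to system events 16..31. The constant and function names are made up for illustration.

```python
PRU1_ARM_INTERRUPT = 20  # value from the PRU1 listing above
STROBE_BIT = 1 << 5      # assumed: bit 5 of an r31 write validates the event
CHANNEL_MASK = 0x0F      # assumed: bits 3..0 select the event channel


def decode_r31_write(value):
    """Return (strobe, system_event) for a value written to r31.b0."""
    strobe = bool(value & STROBE_BIT)
    system_event = 16 + (value & CHANNEL_MASK)
    return strobe, system_event


print(decode_r31_write(PRU1_ARM_INTERRUPT + 16))  # (True, 20)
```

Under that assumption, 20 + 16 = 36 sets the strobe bit and selects channel 4, that is, system event 20, which fits the understanding that the write signals the interrupt controller rather than driving pins.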

config-pin -a P8.11 pruout
config-pin -a P8.46 pruin
config-pin -a P8.30 pruin


Feel free to ask for more information.

Greetings,
Loïc

mike.ma...@gmail.com

Nov 9, 2017, 7:32:02 AM
to BeagleBoard
How about the pin-multiplexing of your two PRU-pins? Are they really enabled as PRU in/outputs or are they still mapped to some other functions of the main core?


On Wednesday, November 8, 2017 at 9:27:20 PM UTC+1, Loïc Droz wrote:
However, I also need to be able to read the CLK signal from PRU1, so I figured I would plug the CLK signal generated by PRU0 to pin P8.46 which I then poll from PRU1.

This seems to be a bad idea in general. To me it looks like you waste computing power in PRU1 to retrieve information PRU0 already has. Can't you do both operations on one single PRU? Or use an EHRPWM output to generate the clock, in order not to waste a complete PRU on such a stupid task?

Loïc Droz

Nov 9, 2017, 8:00:37 AM
to BeagleBoard
Yes, I checked that they are all enabled by calling config-pin -q on each of them, and it showed the right configuration. I started working on the setup again today, though, and now it works fine, even though I didn't change anything in my code or my pinmux setup. Weird...

And I agree this is a bad idea because it is a huge waste of computing power. I have tried using the BBB built-in PWM pin which, as far as I know, can be configured to output a stable CLK signal with a 50% duty cycle by following this link: ... But so far I haven't been able to make it work because the files mentioned there are not present in my kernel (Linux beaglebone 4.4.91-ti-r133 #1 SMP Tue Oct 10 05:18:08 UTC 2017 armv7l GNU/Linux). I would like to use that PWM though; is there a way to do it on more recent kernels?

I could also do both operations on one PRU, yes. However, since the audio processing steps involve memory writes to the host, I am worried these might not take deterministic time, which would prevent my clock from being stable.

mike.ma...@gmail.com

Nov 9, 2017, 9:15:35 AM
to BeagleBoard
On Thursday, November 9, 2017 at 2:00:37 PM UTC+1, Loïc Droz wrote:
And I agree this is a bad idea because it is a huge waste of computing power. I have tried using the BBB built-in PWM pin which, as far as I know, can be configured to output a stable CLK signal with a 50% duty cycle by following this link: ... But so far I haven't been able to make it work because the files mentioned there are not present in my kernel (Linux beaglebone 4.4.91-ti-r133 #1 SMP Tue Oct 10 05:18:08 UTC 2017 armv7l GNU/Linux). I would like to use that PWM though; is there a way to do it on more recent kernels?

I have no idea. I do not use the hardware from within Linux, but from bare-metal firmware...
 
I could also do both operations on one PRU, yes. However, since the audio processing steps involve memory writes to the host, I am worried these might not take deterministic time, which would prevent my clock from being stable.


There is some shared memory available that both the PRU and the main core can access. Since the PRU has priority when accessing this RAM, you can use it for a ring buffer with defined timing that the main core can also read.
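
As an illustration of the ring buffer idea, here is a minimal single-producer, single-consumer sketch. It is a generic host-side model, not PRU or driver code: the class name, slot count and the policy of dropping samples when the buffer is full are all assumptions made for the example. On the real system the data array and the two indices would live in the shared RAM, with the PRU as producer and the main core as consumer.

```python
class RingBuffer:
    """Single-producer, single-consumer ring of fixed-size slots."""

    def __init__(self, slots):
        self.data = [0] * slots
        self.head = 0  # next slot to write; only the producer changes it
        self.tail = 0  # next slot to read; only the consumer changes it

    def push(self, sample):
        nxt = (self.head + 1) % len(self.data)
        if nxt == self.tail:
            return False  # full: drop the sample rather than stall the producer
        self.data[self.head] = sample
        self.head = nxt  # publish the index only after the slot is written
        return True

    def pop(self):
        if self.tail == self.head:
            return None  # empty
        sample = self.data[self.tail]
        self.tail = (self.tail + 1) % len(self.data)
        return sample


rb = RingBuffer(4)
for value in (1, 2, 3):
    rb.push(value)
print(rb.pop(), rb.pop(), rb.pop(), rb.pop())  # 1 2 3 None
```

Because each index is written by only one side, the producer never has to wait for the consumer, which is what keeps its timing predictable; one slot is left unused so that a full buffer can be told apart from an empty one.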

 

TJF

Nov 9, 2017, 10:08:16 AM
to BeagleBoard


On Thursday, November 9, 2017 at 2:00:37 PM UTC+1, Loïc Droz wrote:
I could also do both operations on one PRU, yes. However, since the audio processing steps involve memory writes to the host, I am worried these might not take deterministic time, which would prevent my clock from being stable.

On one PRU, you could use the PRU-internal eCAP subsystem to generate the CLK output, and synchronize your software by reading back the eCAP counter.

BR

Loïc Droz

Nov 16, 2017, 10:16:15 AM
to BeagleBoard
I could, yes, but I read somewhere (I can't remember where exactly, but it was in some TI PowerPoint presentation) that reading the eCAP register takes at least 4 cycles, which is not ideal for my purposes. In the meantime, I found a way to use the BBB built-in PWM as a CLK signal, and it works now. Check this link if you're interested. Thanks for your help nonetheless.

TJF

Nov 16, 2017, 10:57:00 AM
to BeagleBoard


On Thursday, November 16, 2017 at 4:16:15 PM UTC+1, Loïc Droz wrote:
I could, yes, but I read somewhere (I can't remember where exactly, but it was in some TI PowerPoint presentation) that reading the eCAP register takes at least 4 cycles, which is not ideal for my purposes.

Read or write access from PRU to the PRU internal eCAP module takes exactly one cycle. You can find this information in the PRU TRM.

Please don't confuse other readers.