aer-spinnaker interface issue


Vaggelis Ntouros

Feb 28, 2022, 11:10:45 AM
to SpiNNaker Users Group
Hello all,

I am trying to communicate with SpiNNaker through an FPGA. To test my FPGA design I use a microcontroller to "simulate" a DVS, so I can send 16 bits to the FPGA just like a DVS would (including the 4-phase handshake). The FPGA then connects to the SpiNNaker link through the 7 data signals plus the acknowledge signal.

First I send the start command, which is sent correctly (acknowledge signal in start_cmd.jpg).

Then, if I try to send "0x1234030701" (virtual key = 1234, message = 0307, header = 01), the acknowledge signal looks as shown in 0307_1.jpg. Four complete transitions occur with the acknowledge bit toggling correctly. After that, L[0...6] change one more time but acknowledge does not toggle. Moreover, if I reset the SpiNN-3 board, communication resumes and the acknowledge signal continues as normal (0307_2.jpg), and the rest of the data is transmitted correctly.

This happens every time. Do you have any advice to help me debug this? I also have a few questions:
    1) I see that the acknowledge signal toggles approximately every 50 ns. However, app-note-7 states that the speed is about 50 MSamples/sec, meaning 20 ns per symbol.
    2) Looking at 0307_1.jpg I see that the 3rd transition of the acknowledge bit happens too soon. Could this be a hint for the issue I am facing?
    3) Finally, I am debugging without any code. Do I need to send the start command at this stage of debugging, or is this something that only the software needs?
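The gap raised in question 1 can be put in numbers. This is only a unit-conversion sketch: it treats "samples" as symbols, as the question does, and the function names are made up for illustration.

```python
def period_ns(rate_msymbols_per_s):
    """Time per symbol in nanoseconds for a rate given in Msymbols/s."""
    return 1e3 / rate_msymbols_per_s


def rate_msymbols_per_s(period_in_ns):
    """Symbol rate in Msymbols/s for a symbol period given in nanoseconds."""
    return 1e3 / period_in_ns


BITS_PER_SYMBOL = 4  # each data symbol carries one nibble

quoted_period = period_ns(50)            # figure quoted from the app note
observed_rate = rate_msymbols_per_s(50)  # acknowledge toggling every ~50 ns

print(f"quoted:   {quoted_period:.0f} ns per symbol")
print(f"observed: {observed_rate:.0f} Msymbols/s, "
      f"about {observed_rate * BITS_PER_SYMBOL:.0f} Mbit/s of nibble data")
```

So the observed handshake runs at roughly 40% of the quoted rate.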

Thank you  
start_cmd.jpg
0307_2.jpg

Vaggelis Ntouros

Feb 28, 2022, 11:12:13 AM
to SpiNNaker Users Group
Uploaded 0307_1.jpg since there was an issue.
0307_1.jpg

Andrew Rowley

Feb 28, 2022, 11:28:21 AM
to Vaggelis Ntouros, SpiNNaker Users Group

Hi,

I am not certain what you mean by the “start command”. If I am thinking correctly, this is something that the software can send *from* the SpiNNaker board to the external device (your FPGA in this case) to tell it that the simulation has started. This will just be a multicast packet with a particular key that you can recognise (the key is configurable in the software).

In terms of sending messages into SpiNNaker, it is worth confirming that, for each value in the table, a 1 represents a toggle between high and low and a 0 means no toggle. So you don’t send 1 as high and 0 as low. I mention this just to make sure, as it has caught people out in the past!

So if you start with all wires low, you can send 0x1 as (for links L[6-0]):

Low Low High Low Low High Low

Following this you would receive a single acknowledge, which would go to High. Then you should send the 0x0 as:

Low Low Low Low Low High High

i.e. for 0x0 you toggle L4 and L0: L4 returns to Low, L0 goes High, and L1, which was High from sending 0x1, stays High (no transition). You should then get another acknowledge transition, back to Low, to indicate reception again.
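The toggle rule described above can be sketched as an XOR against the current wire state. This is a minimal illustration, not a full encoder: only the two codes spelled out in this message are listed, the remaining entries would have to come from the table in app note 7, and the names `TOGGLE_MASK`, `send_symbol` and `levels` are made up here.

```python
# Bit i of a mask or of the wire state stands for wire L[i].
# Only the two codes given in the message above are included.
TOGGLE_MASK = {
    0x0: 0b0010001,  # toggle L4 and L0
    0x1: 0b0010010,  # toggle L4 and L1
}


def send_symbol(wires, value):
    """Return the new wire levels after sending `value`.

    A 1 in the mask flips that wire; a 0 leaves it where it was.
    """
    return wires ^ TOGGLE_MASK[value]


def levels(wires):
    """Render the wire state as High/Low, from L6 down to L0."""
    return " ".join(
        "High" if (wires >> i) & 1 else "Low" for i in range(6, -1, -1)
    )


wires = 0                        # all wires start Low
wires = send_symbol(wires, 0x1)
print(levels(wires))             # Low Low High Low Low High Low
wires = send_symbol(wires, 0x0)
print(levels(wires))             # Low Low Low Low Low High High
```

Driving the wires with the mask itself (1 as High, 0 as Low) would produce the same first symbol from an all-Low start but a different second one, which is why this mistake can go unnoticed at first.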

 

Can you confirm that you are doing this? I should point out that I am definitely more of a software person myself, but I think this is how the protocol goes…

Thanks,

Andrew :)

 


ntouev

Feb 28, 2022, 11:47:13 AM
to Andrew Rowley, SpiNNaker Users Group
I guess there was a misunderstanding on my side regarding the start command. In any case, as I understand it, the message content (virtual key, header, ...) should not affect the board's acknowledge. I mean that sending 32 dummy bits plus a correct header (7 zeros and 1 odd-parity bit) should not matter at this point of development. SpiNNaker should acknowledge every symbol it receives, right?

I am aware of the protocol. What I have done so far is send the dummy bits "0x0000ffff01", which SpiNNaker acknowledged 11 times (once for each of the 10 data symbols and once for the EOP). When I sent a second dummy message, "0x1234030701", the transfer stopped after 4 symbols (only 4 acks from SpiNNaker). After pressing reset, however, SpiNNaker correctly resumed the transmission, toggling the acknowledge bit 7 more times.
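The numbers in this message can be checked with a short sketch: the odd-parity bit for each test packet and the count of symbols that need an acknowledge. The packet layout follows the way the values are written in this thread (payload followed by an 8-bit header whose lowest bit is the parity bit). Sending the least-significant nibble first is an assumption made here, and the function names are illustrative.

```python
def odd_parity_bit(packet_with_parity_cleared):
    """Bit that makes the total number of 1s in the packet odd."""
    ones = bin(packet_with_parity_cleared).count("1")
    return 0 if ones % 2 == 1 else 1


def nibbles_lsb_first(packet, n_bits=40):
    """Split a packet into 4-bit values, lowest nibble first (assumed order)."""
    return [(packet >> shift) & 0xF for shift in range(0, n_bits, 4)]


for payload in (0x0000FFFF, 0x12340307):
    packet = payload << 8                 # header byte with parity still 0
    packet |= odd_parity_bit(packet)      # parity sits in header bit 0
    data = nibbles_lsb_first(packet)
    print(f"0x{packet:010x}: nibbles {[hex(n) for n in data]}, "
          f"{len(data)} data symbols + 1 EOP = {len(data) + 1} acknowledges")
```

Both packets come out with header 0x01, and each needs 11 acknowledges, which agrees with the count reported for the first message.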

Luis Plana Cabrera

Feb 28, 2022, 12:03:29 PM
to ntouev, Andrew Rowley, SpiNNaker Users Group
The receiver controls transmission speed with the acknowledge. If the receiver runs out of resources, for example buffer space, it can stop transmission by not acknowledging. The sender must be capable of handling whatever delay the receiver imposes. This can happen at any time, not only between packets.

The receiver will not acknowledge an unrecognised symbol. This would be a symbol that is not a correct 2-of-7 symbol, as listed in app note 7.
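The backpressure behaviour described above can be pictured with a toy model. The buffer size and drain rate are invented for illustration and say nothing about SpiNNaker's real resources; the point is only that the sender's pace is set by when acknowledges arrive.

```python
def simulate(n_symbols, buffer_slots, drain_period):
    """Return the tick at which each symbol is acknowledged.

    Toy receiver: it holds up to `buffer_slots` symbols and consumes one
    every `drain_period` ticks. A symbol is acknowledged only when a slot
    is free; until then the sender holds the wires and waits.
    """
    used = 0
    ack_ticks = []
    tick = 0
    while len(ack_ticks) < n_symbols:
        tick += 1
        if tick % drain_period == 0 and used > 0:
            used -= 1                 # receiver consumed one buffered symbol
        if used < buffer_slots:
            used += 1                 # symbol accepted, acknowledge toggles
            ack_ticks.append(tick)
        # otherwise: no acknowledge this tick, the sender keeps waiting
    return ack_ticks


print(simulate(n_symbols=5, buffer_slots=2, drain_period=3))
# [1, 2, 3, 6, 9]: full speed until the buffer fills, then one symbol per drain
```

A sender that times out or moves on without the acknowledge would break this contract, so the wait has to be unbounded on the sender side.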

Can you test what happens if you send the first set of "dummy" bits and then send the same set again? This could help us understand if the problem is an incorrect symbol in the second set.





Vaggelis Ntouros

Mar 1, 2022, 12:26:57 PM
to SpiNNaker Users Group
I sent the same message, "0000ffff01", twice. The acknowledge responses to the first and second transmissions are shown in first.jpg and second.jpg.

The first time, "0000ffff01" is sent correctly and the acknowledge seems fine. The second time, the data are transmitted but the acknowledge waveform looks odd. I tried a number of different messages but could not find any specific pattern.
  • Sometimes the acknowledge was OK, like first.jpg.
  • Sometimes the acknowledge looked odd but the data were transmitted correctly.
  • Sometimes communication got stuck in the middle of the transfer. Pressing reset or waiting (for many seconds) usually resumed the transfer to the end.
Any advice would be highly appreciated.
Thank you
second.jpg
first.jpg

Andrew Rowley

Mar 3, 2022, 3:12:53 AM
to Vaggelis Ntouros, SpiNNaker Users Group

Hi,

If waiting a number of seconds resumes the transfer, one thought would be that the routers on the machine are not configured to do anything with the received packets. The packets might then end up being “default routed” around the board, possibly looping forever, particularly on a 4-chip board, which might then clog up the board. One option is to load some routes onto the machine that either send the packets to a core or tell the router not to send the packets anywhere. If the packets are sent to a core, you probably want the core to actually accept them, since a core that does not accept them can also lead to delays while the router drops the packets.

 

An example of a “simple” application on SpiNNaker is here:

https://github.com/SpiNNakerManchester/spinnaker_tools/tree/master/apps/simple

 

This does more than you need, but you could probably set up the route here to be key 0, mask 0, which should then route everything to the core:

https://github.com/SpiNNakerManchester/spinnaker_tools/blob/master/apps/simple/simple.c#L104-L107
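Why key 0, mask 0 catches everything can be seen from the masked comparison a multicast routing entry performs: an entry matches when the packet key ANDed with the mask equals the entry key. This is a sketch of that comparison only, not the spin1 API; the function name and the narrower example entry are made up for illustration.

```python
def route_matches(packet_key, entry_key, entry_mask):
    """True if a routing entry with this key and mask matches the packet."""
    return (packet_key & entry_mask) == entry_key


packet_key = 0x12340307  # payload of the test message used in this thread

# Catch-all entry: the mask clears every bit, so every key compares equal to 0.
print(route_matches(packet_key, entry_key=0x0, entry_mask=0x0))  # True
print(route_matches(0xFFFFFFFF, entry_key=0x0, entry_mask=0x0))  # True

# A narrower, hypothetical entry that only looks at the top 16 bits.
print(route_matches(packet_key, 0x12340000, 0xFFFF0000))         # True
print(route_matches(0xABCD0307, 0x12340000, 0xFFFF0000))         # False
```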

 

You could then disable these events, so only the multicast packet event is active:

https://github.com/SpiNNakerManchester/spinnaker_tools/blob/master/apps/simple/simple.c#L399-L401

 

You could also disable the activation of the DMA when a packet is received and replace it with an io_printf:

https://github.com/SpiNNakerManchester/spinnaker_tools/blob/master/apps/simple/simple.c#L311-L315

 

Let us know if you need any other help.

 

Andrew :)

 

Vaggelis Ntouros

Mar 4, 2022, 7:32:49 AM
to SpiNNaker Users Group
Mr Cabrera and Mr Rowley,

First of all, thank you for your advice.

I am coming back to you with a more targeted description of my issue. Most of the problems with the random behaviour of the output are gone now that I have fixed some connection issues. I am now facing one final problem, which does not seem to be related to connections, since I have tested many different hardware setups. I describe it below.

Sending the flit "0111" or "1101", meaning 0x7 or 0xd, causes the acknowledge to stop toggling.
    - This happens regardless of the current state of the 7 wires.
    - It happens every time.
    - The strange part is that SpiNNaker acknowledges receipt of 0x7 or 0xd, but the next change does not produce any acknowledge, no matter what that change is.

I wonder whether this could be related to the software or to something in the acknowledge mechanism in general that I am missing.

Note that
    - neither the acknowledge signal nor the 7 wires do anything strange before the acknowledge stops toggling.
    - the design works robustly with every other symbol, even at high speeds (1 packet per us).

Vaggelis

Vaggelis Ntouros

Mar 7, 2022, 9:40:29 AM
to SpiNNaker Users Group
Hello,

Any advice on my issue would be really helpful, since I have been stuck here for many days. I would mostly like to know whether something in the software or the acknowledge mechanism that I am not aware of could produce this behaviour.

Thank you 
Vaggelis

Luis Plana Cabrera

Mar 7, 2022, 10:13:56 AM
to Vaggelis Ntouros, SpiNNaker Users Group
Hi,

> Sending the flit "0111" or "1101", meaning 0x7 or 0xd causes the acknowledge to stop toggling.

We are not aware of any issues related to the transmission of specific nibbles.

We have FPGA boards that transmit millions of packets per second to SpiNNaker and I can guarantee that all possible nibbles have been transmitted. Additionally, the same exact SpiNNaker links are used for chip-to-chip communication on the SpiNNaker board without any problems.

Communications on SpiNNaker links are completely brokered by hardware, so software should not be an issue.

As I mentioned earlier, SpiNNaker can apply backpressure on the link to reduce speed but it will not deadlock the link. SpiNNaker will eventually throw away stuck packets to allow new ones in. I can also guarantee that backpressure is *not* triggered by specific nibbles.

I cannot see in your setup picture if you connected all the GND terminals. You probably noticed that GND and signals alternate in the cable/connector. Making sure that all GND signals are properly connected on both ends of the cable usually helps with signal integrity.

If you are certain that there are no signalling issues then I can only suggest checking the 2-of-7 NRZ encoder. It might be sending the wrong code for those nibbles.



Luis Plana Cabrera

Mar 7, 2022, 10:26:04 AM
to Vaggelis Ntouros, SpiNNaker Users Group
SpiNN-3 boards export two SpiNNaker links. Have you tried both of them? If not, you can try the other one to rule out a defective SpiNNaker link.

I have to say that, so far, we have not seen any defective SpiNNaker links.



ntouev

Mar 8, 2022, 12:50:48 PM
to Luis Plana Cabrera, SpiNNaker Users Group
Thank you for your advice. The error was actually a reversed connection of L[0...6] in the ribbon cable.

Vaggelis
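For anyone who hits the same symptom, here is a sketch of why swapping L[0...6] end to end could break only some nibbles. The toggle-mask table below is an assumption: it is reconstructed from app note 7, it is not given anywhere in this thread, and it should be checked against the app note before being relied on. The sketch reverses each mask and lists the symbols whose reversed pattern is no longer in the table.

```python
# ASSUMED 2-of-7 toggle masks (bit i = wire L[i]); verify against app note 7.
SYMBOL_MASK = {
    0x0: 0b0010001, 0x1: 0b0010010, 0x2: 0b0010100, 0x3: 0b0011000,
    0x4: 0b0100001, 0x5: 0b0100010, 0x6: 0b0100100, 0x7: 0b0101000,
    0x8: 0b1000001, 0x9: 0b1000010, 0xA: 0b1000100, 0xB: 0b1001000,
    0xC: 0b0000011, 0xD: 0b0000110, 0xE: 0b0001100, 0xF: 0b0001001,
    "EOP": 0b1100000,
}
VALID_MASKS = set(SYMBOL_MASK.values())


def reverse7(mask):
    """Mirror a 7-bit pattern: wire L[i] ends up on L[6 - i]."""
    return sum(((mask >> i) & 1) << (6 - i) for i in range(7))


def broken_by_reversal():
    """Symbols whose mirrored toggle pattern is not a known symbol."""
    return [
        symbol for symbol, mask in SYMBOL_MASK.items()
        if reverse7(mask) not in VALID_MASKS
    ]


print(broken_by_reversal())  # [7, 13] under the assumed table
```

Under that assumed table every other symbol still mirrors onto some valid code (a different one in most cases), so the receiver would keep acknowledging while decoding the wrong data, and only 0x7 and 0xd would arrive as unrecognised symbols. Those are the two nibbles reported earlier in the thread.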