Corruption of all received data after sending

Tutman

Sep 4, 2012, 11:19:39 AM
to rf22-a...@googlegroups.com
I have an issue that I have been trying to solve.  Here is my setup:

Sender: RFM22B (915 MHz)
Receiver: RFM22B (915 MHz)

Both are on breadboards with a good power supply.  I am running at 124 kbps on 910 MHz at 17 dBm.  The RF22 library is set up in standard communication mode.

I can send data from Sender at a very high rate, and receiver receives all packets fine.  I can ask the receiver to reply to a message.  These are (usually) received fine.  Eventually, a reply will cause the receiver to start receiving garbage.  I believe that it is due to timing.  I believe that if the receiver is in the process of sending and it receives a message, all data from then on is corrupted until I reset the radio.  If I transmit messages slower from the sender, it is much more difficult to reproduce, but still possible.

The message length is still correct (read from the packet length register), but the data received is now random.  I can reset the receiver side, and valid data is received again.

I added debugging print statements in the RF22's ISR, and can confirm that the data is corrupt immediately from the spiBurstRead command to read in the data.

I also added code in the library to disable the interrupts during a send operation, thinking that it was reading and writing to the SPI from both loop code and the ISR.  But that didn't help.

So to me, it seems like a corruption problem in the radio itself, and not the library.  I may try different radio settings to see if I can get it to go away.  I looked at the spec sheet and I can see the timing requirements for switching between receive and transmit, and it looks like that is already being handled.

By the way, I have tried swapping out the radios and they all exhibit the same behavior.

Can you help, Mike?

Tutman

Sep 4, 2012, 12:53:37 PM
to rf22-a...@googlegroups.com
Above, I wrote that I modified the library to disable the interrupts during send.  I now see that this was already done:-)

However, I did modify the library so that none of the cli/sei code can re-enable interrupts while the ISR is running, which would allow the interrupt to become re-entrant.

For instance, in spiWrite...the code was:
cli();
...
sei();

I modified it to:
uint8_t saveSREG = SREG;
cli();
...
SREG = saveSREG;

Here is a note about why I did this:

I made the changes to each of the spi commands.  However, this didn't solve my issue.
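
A minimal sketch of the difference between the two patterns above. The function names are hypothetical, and the non-AVR branch is only a stand-in so the snippet compiles on a desktop; on an AVR target the real cli(), sei() and SREG from <avr/interrupt.h> are used instead.

```cpp
#include <stdint.h>

#if defined(__AVR__)
#include <avr/interrupt.h>   // real cli(), sei() and SREG
#else
// Host stand-ins (assumption, for off-target compilation only).
static uint8_t SREG = 0x80;  // bit 7 models the global interrupt enable flag
static inline void cli() { SREG &= (uint8_t)~0x80; }
static inline void sei() { SREG |= 0x80; }
#endif

static bool interruptsEnabled() { return (SREG & 0x80) != 0; }

// Pattern 1: always re-enables interrupts on exit, even if they were
// already disabled by the caller (for example inside an ISR).
static void guardedWithSei()
{
    cli();
    // ... SPI transfer would go here ...
    sei();
}

// Pattern 2: restores whatever state the caller had, so a caller that
// entered with interrupts off also leaves with interrupts off.
static void guardedWithRestore()
{
    uint8_t saveSREG = SREG;
    cli();
    // ... SPI transfer would go here ...
    SREG = saveSREG;
}
```

Called with interrupts already off, guardedWithRestore() leaves them off, while guardedWithSei() turns them back on.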

Tutman

Sep 4, 2012, 1:38:30 PM
to rf22-a...@googlegroups.com
I have found and fixed the issue.

Mike, the issue that I was having was fixed by modifying the code that disables/enables the interrupts.  You have it in the spi methods.  If code has to do multiple spi operations, the interrupt gets enabled at the end of every spi operation, which gives the interrupt a chance to run and potentially issue spi commands while the main loop has a series of spi operations going on its own.  The interrupt code itself does not need cli/sei because it is already handled for you by the processor.  I added the disable/enable interrupt code to the ::send and ::recv functions instead, and now my issue is gone.  There are other public functions that need this as well, but that is all that I was calling.  Your addition of disabling the interrupts is good...except you need it around entire series of spi operations...not just single ones.

Agree?
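
The interleaving described above can be illustrated with a small host-side model. This is a sketch only: BusModel and all of its members are hypothetical and none of it is RF22 library code. The log string records the order in which SPI operations (A, B, C) and the radio ISR touch the bus when the radio raises its interrupt during the first operation.

```cpp
#include <string>

// Hypothetical model of a shared SPI bus with one interrupt source.
struct BusModel {
    bool irqEnabled;   // models the global interrupt enable flag
    bool irqPending;   // the radio has raised its interrupt line
    int  irqAtOp;      // index of the SPI operation during which that happens
    int  opCount;
    std::string log;   // order of bus activity

    BusModel() : irqEnabled(true), irqPending(false), irqAtOp(0), opCount(0) {}

    void runIsrIfAllowed() {
        if (irqEnabled && irqPending) {
            irqPending = false;
            log += "[isr]";          // the ISR would issue its own SPI reads here
        }
    }
    void disable() { irqEnabled = false; }
    void enable()  { irqEnabled = true; runIsrIfAllowed(); }  // like sei()

    void spiOp(char name) {
        log += name;
        if (opCount++ == irqAtOp) irqPending = true;
        runIsrIfAllowed();
    }

    // Guard around each single SPI operation: the ISR can slip in between.
    void sendGuardedPerOp() {
        const char ops[] = {'A', 'B', 'C'};
        for (int i = 0; i < 3; ++i) { disable(); spiOp(ops[i]); enable(); }
    }

    // Guard around the whole series: the ISR runs only after the last one.
    void sendGuardedAsWhole() {
        const char ops[] = {'A', 'B', 'C'};
        disable();
        for (int i = 0; i < 3; ++i) spiOp(ops[i]);
        enable();
    }
};
```

With the per-operation guard the log reads A[isr]BC; with the guard around the whole series it reads ABC[isr]. In real code the guard should restore the previous interrupt state rather than enable unconditionally as this model does.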

Mike McCauley

Sep 4, 2012, 7:39:15 PM
to rf22-a...@googlegroups.com
Hello,

On Tuesday, September 04, 2012 10:38:30 AM Tutman wrote:
> I have found and fixed the issue.
>
> Mike, the issue that I was having was fixed by modifying the code that
> disables/enables the interrupts. You have it in the spi methods. If code
> has to do multiple spi operations, the interrupt gets enabled at the end of
> every spi operation, which gives the interrupt a chance to run and
> potentially issue spi commands while the main loop has a series of spi
> operations going on its own. The interrupt code itself does not need
> cli/sei because it is already handled for you by the processor. I added
> the disable/enable interrupt code to the ::send and ::recv functions
> instead, and now my issue is gone. There are other public functions that
> need this as well, but that is all that I was calling. Your addition of
> disabling the interrupts is good...except you need it around entire series
> of spi operations...not just single ones.
>
> Agree?

Hmmm, could be.
I wonder if this might explain Frank's issues too?

Can you please send your modified code so I can check?

And how did you test the before and after? Actual test code would be helpful.

Cheers.
--
Mike McCauley mi...@open.com.au
Open System Consultants Pty. Ltd
9 Bulbul Place Currumbin Waters QLD 4223 Australia http://www.open.com.au
Phone +61 7 5598-7474 Fax +61 7 5598-7070

Radiator: the most portable, flexible and configurable RADIUS server
anywhere. SQL, proxy, DBM, files, LDAP, NIS+, password, NT, Emerald,
Platypus, Freeside, TACACS+, PAM, external, Active Directory, EAP, TLS,
TTLS, PEAP, TNC, WiMAX, RSA, Vasco, Yubikey, MOTP, HOTP, TOTP,
DIAMETER etc. Full source on Unix, Windows, MacOSX, Solaris, VMS, NetWare etc.

Mike McCauley

Sep 4, 2012, 8:04:07 PM
to rf22-a...@googlegroups.com
Hi,


After closer examination, I can see there is a problem: the cli() and sei()
calls in Arduino are not stackable, so when the interrupt service routine
calls spiBurstRead and spiBurstRead calls sei(), interrupts are enabled
again *inside* the ISR :-( But the ISR is not intended to be reentrant.

If Tutman can send me his mods and test cases, I will test his fix.

Cheers.




Roland Mieslinger

Sep 5, 2012, 5:31:14 AM
to rf22-a...@googlegroups.com


On Wednesday, September 5, 2012 02:02:50 UTC+2, mikem wrote:
> After closer examination, I can see there is a problem: the cli() and sei()
> calls in Arduino are not stackable, so when the interrupt service routine
> calls spiBurstRead, when spiBurstRead calls sei(), interrupts are enabled
> again *inside* the ISR :-( But the ISR is not intended to be reentrant.

Maybe the ATOMIC_BLOCK(ATOMIC_RESTORESTATE) macro is what you are looking for.

--
Roland
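
For reference, a minimal sketch of how that macro is typically used. ATOMIC_BLOCK and ATOMIC_RESTORESTATE come from avr-libc's <util/atomic.h>; the helper name guardedSpiAccess and the counter are hypothetical, and the non-AVR branch is only a rough stand-in so the snippet compiles on a desktop.

```cpp
#include <stdint.h>

#if defined(__AVR__)
#include <util/atomic.h>     // avr-libc: ATOMIC_BLOCK, ATOMIC_RESTORESTATE
#else
// Host stand-ins (assumption, for off-target compilation only).
static uint8_t SREG = 0x80;  // bit 7 models the global interrupt enable flag
struct AtomicRestoreGuard {
    uint8_t saved;
    bool runOnce;
    AtomicRestoreGuard() : saved(SREG), runOnce(true) { SREG &= (uint8_t)~0x80; }
    ~AtomicRestoreGuard() { SREG = saved; }   // put the old state back
};
#define ATOMIC_RESTORESTATE 0
#define ATOMIC_BLOCK(mode) \
    for (AtomicRestoreGuard guard_; guard_.runOnce; guard_.runOnce = false)
#endif

static volatile uint8_t spiTransfers = 0;

// Hypothetical helper that may be called from the main loop or from an ISR.
// Interrupts are off inside the block and return to their previous state
// afterwards, so calling it from an ISR does not re-enable them.
static void guardedSpiAccess()
{
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
    {
        // ... SPI register access would go here ...
        spiTransfers = spiTransfers + 1;
    }
}
```

The block behaves like the manual save-SREG, cli(), restore-SREG sequence, with the restore happening on every exit path from the block.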

Paul Martinsen

Sep 5, 2012, 5:58:14 AM
to rf22-a...@googlegroups.com
We've been troubleshooting lockups too. Last night we encountered situations where the last 5 bytes of the transmit buffer might not be sent from the FIFO. This meant all subsequent transmits included the last 5 bytes from the previous message + N-5 bytes from the new message of length N. Calling resetTxFifo() from send resolved this.

Also, I just noticed that _mode is not declared volatile, but is changed in the ISR. Should it be volatile for this reason?
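
A minimal sketch of why volatile matters for a variable that an ISR changes while the main loop polls it. The names below are hypothetical stand-ins, not the library's actual members. Without volatile, the compiler is allowed to read the variable once and reuse that value for every iteration of the wait loop.

```cpp
#include <stdint.h>

// Hypothetical mode values, standing in for the library's mode constants.
enum Mode { MODE_IDLE, MODE_RX, MODE_TX };

// Written by the ISR, read by the main loop: must be volatile so that
// every loop iteration performs a fresh load from memory.
static volatile Mode radioMode = MODE_TX;

// Stands in for the packet-sent branch of the radio ISR.
static void onPacketSent() { radioMode = MODE_IDLE; }

// Busy-wait until the radio leaves TX mode, giving up after maxSpins
// iterations so a missed interrupt cannot hang the caller forever.
static bool waitWhileTransmitting(uint32_t maxSpins)
{
    while (radioMode == MODE_TX) {
        if (maxSpins-- == 0) return false;   // timed out
    }
    return true;
}
```

The spin limit is an addition for illustration; a time-based timeout would serve the same purpose.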

Tutman

Sep 5, 2012, 6:47:24 AM
to rf22-a...@googlegroups.com
I have been doing additional testing.  Right now, I've only been testing in my application.  I may write new code just for testing with the library.

I have been trying to pin down some additional problems.  They seemed to stem from sending multiple messages quickly.  I'm also doing a better job of just listening after I send a request.  I'm playing around with just disabling INT0 (in my case) instead of globally disabling interrupts.

I didn't want to wait on the completion of a send after every send.  Instead, I want to wait for the send to complete right before I send.  My application has work to do after a send, and the radio will likely complete it in the background by the time I'm ready to send again.  In order to support this pattern, I had to make some changes to some of the logic.  I'm sure that this would break existing applications using the library.

One additional thing that I am in the process of testing, is in send...including a boolean variable indicating if it should go to RX mode after the send is completed.  The interrupt handler will switch the mode automatically upon send completion.

I'll take a look at the _mode variable.  I hadn't noticed that.

Once I'm certain that the interrupt disabling is working properly, I definitely don't mind sharing.  I just don't want to share something that I'm not sure about yet.

Mike McCauley

Sep 5, 2012, 7:12:24 AM
to rf22-a...@googlegroups.com
Hi,

On Wednesday, September 05, 2012 02:58:14 AM Paul Martinsen wrote:
> We've been trouble shooting lockups too. Last night we encountered
> situations where the last 5 bytes of the transmit buffer might not be sent
> from the fifo. This meant all subsequent transmits included the last 5
> bytes from the previous message + N-5 bytes from the new message of length
> N. Calling resetTxFifo() from send resolved this.
>
> Also, I just noticed that _mode is not declared volatile, but is changed in
> the ISR. Should it be volatile for this reason?

In theory, yes. However, all the *current* uses of _mode would not be affected
by volatile. I'll change it to be safe.


Cheers.


Mike McCauley

Sep 5, 2012, 7:47:12 AM
to rf22-a...@googlegroups.com, Tutman
Hi,

On Wednesday, September 05, 2012 03:47:24 AM Tutman wrote:
> I have been doing additional testing. Right now, I've only been testing in
> my application. I may write new code just for testing with the library.

I'm mostly keen to see where exactly you disable and re-enable the interrupts.

>
> I have been trying to pin down some additional problems. They seemed to
> stem from sending multiple message quickly. I'm also doing a better job of
> just listening after I send a request. I'm playing around with just
> disabling INT0 (in my case) instead of globally disabling interrupts.
>
> I didn't want to wait on the completion of a send after every send.
> Instead, I want to wait for the send to complete right before I send. My
> application has work to do after a send, and the radio will likely complete
> it in the background by the time I'm ready to send again. In order to
> support this pattern, I had to make some changes to some of the logic. I'm
> sure that this would break existing applications using the library.

You might just need to look at _mode before transmitting?
I can see a good argument for checking that anyway at the beginning of send()

>
> One additional thing that I am in the process of testing, is in
> send...including a boolean variable indicating if it should go to RX mode
> after the send is completed. The interrupt handler will switch the mode
> automatically upon send completion.

You might consider setting _idleMode to have the receiver turned on:
Then it will be receiving whenever it's not transmitting.
something like

_idleMode = RF22_MODE_RX

>
> I'll take a look at the _mode variable. I hadn't noticed that.
>
> Once I'm certain that the interrupt disabling is working properly, I
> definitely don't mind sharing. I just don't want to share something that
> I'm not sure about yet.

Tutman

Sep 5, 2012, 4:22:58 PM
to rf22-a...@googlegroups.com, Tutman
So here is the recv command.  I commented-out the code that globally disables interrupts.  I'm trying to test to see if just disabling the one interrupt works as well.

boolean RF22::recv(uint8_t* buf, uint8_t* len)
{
    boolean result = false;
    //uint8_t saveSREG = SREG;
    //cli();
    EIMSK &= ~(1 << INT0);
    if (available())
    {
        if (*len > _bufLen)
            *len = _bufLen;

        memcpy(buf, _buf, *len);
        clearRxBuf();
        result = true;
    }
    //SREG = saveSREG;
    EIMSK |= (1 << INT0);
    return result;
}

As for the suggestion of maybe modifying _idleMode...at times I need to be able to transmit four or five packets in a row.  I don't want the receiver enabled in between sends...that would likely require more time.  I would like to have the receiver enabled automatically when the last packet is finished sending.

I'll be working some on this again tonight.

Mike McCauley

Sep 5, 2012, 7:29:25 PM
to rf22-a...@googlegroups.com, Tutman
Thanks for everyone's contributions to this.

We have now uploaded a new version 1.20 with these changes.

version 1.20

_mode is now volatile.

RF22::send() now waits until any previous transmission is complete before
sending.

RF22::waitPacketSent() now waits for the RF22 to not be in _mode ==
RF22_MODE_TX

_txPacketSent member is now redundant and removed.

Improvements to interrupt handling and blocking. Now use
ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
to prevent reenabling interrupts too soon. Thanks to Roland Mieslinger for
this suggestion.

Added some performance measurements to documentation.

It would be good for testers and modifiers to work from this base.

Cheers.


Tutman

Sep 6, 2012, 8:43:03 AM
to rf22-a...@googlegroups.com, Tutman
Checking out the changes now.

Mike, I did notice that _rxGood, _rxBad, _txGood are all declared as uint16_t, and are also declared volatile.  Since these variables are not used anywhere outside of the interrupt, the volatile keyword is not providing any benefit, and it would not make access safe anyway, because 16-bit variables are still modified one byte at a time.  You might want to consider getting rid of these variables since they aren't used.

Do you know if the transition duration between RX and TX is greater than Idle and TX?  I can't quite tell from the datasheet.  Also, do you know if the transmission just takes longer to complete or if we must somehow delay the write after switching modes?  Again, I couldn't gather this from the datasheet.

Mike McCauley

Sep 6, 2012, 4:46:43 PM
to rf22-a...@googlegroups.com
On Thursday, September 06, 2012 05:43:03 AM Tutman wrote:
> Checking out the changes now.

Thanks. I'm keen to know if the changes in 1.20 fix your corruption issues.

>
> Mike, I did notice that _rxGood, _rxBad, _txGood are all declared as
> uint16_t, but are also declared as volatile. Although these variables are
> not used outside of the interrupt anywhere, the volatile keyword is not
> benefiting anything, since 16-bit variables can still be modified 1 byte at
> a time. Might want to consider getting rid of these variables since they
> aren't used.

They are intended for statistics use by subclasses. I agree there may be an
issue with their 16 bitness, but I do want to leave them there.


>
> Do you know if the transition duration between RX and TX is greater than
> Idle and TX? I can't quite tell from the datasheet. Also, do you know if
> the transmission just takes longer to complete or if we must somehow delay
> the write after switching modes? Again, I couldn't gather this from the
> datasheet.

No, it's not clear.

AFAICT, the chip does not go direct from TX to RX or vice versa, but always
via IDLE, so I expect that RX->IDLE->TX will be slower than IDLE->TX.

Cheers.

Tutman

Sep 6, 2012, 5:13:12 PM
to rf22-a...@googlegroups.com


On Thursday, September 6, 2012 4:45:23 PM UTC-4, mikem wrote:
> Thanks. I'm keen to know if the changes in 1.20 fix your corruption issues.

I think that a lot of the corruption issues have (or had) to do with both RX and TX sharing the same buffer.  It may have been my fault for not using the library the way it was intended.  I have made several changes suited to my application, so I'm not sure if I'll be able to test 1.20.  It does seem that many of the operations (other than send and recv) really need atomicity across their SPI operations.  The only other public methods that I use are the one for temperature and the one that checks whether the radio is still in TX mode.

> They are intended for statistics use by subclasses. I agree there may be an
> issue with their 16 bitness, but I do want to leave them there.

I figured that might be the case.  I'm only using the base class.

> No, it's not clear.
>
> AFAICT, the chip does not go direct from TX to RX or vice versa, but always
> via IDLE, so I expect that RX->IDLE->TX will be slower than IDLE->TX.

In my testing today, I tried it both ways and added some time measurement logging.  It didn't seem to make a difference.  But I am having a weird lock-up in the code when I do two sends in a row.  If I add debug logging, it goes away.  So I'm not sure if we are supposed to delay the operations or not.  I hate adding delay statements, but if they are small enough, it won't matter much in my application.

 

Mike McCauley

Sep 6, 2012, 5:20:54 PM
to rf22-a...@googlegroups.com
Hi,

On Thursday, September 06, 2012 02:13:12 PM Tutman wrote:
> On Thursday, September 6, 2012 4:45:23 PM UTC-4, mikem wrote:
> > Thanks. Im keen to know if the changes in 1.20 fix your corruption
> > issues.
>
> I think that a lot of the corruption issues has(d) to do with both RX and
> TX sharing the same buffer. It may have been me for not using the library
> the way it was intended. I have made several changes suited for my
> application, so I'm not sure if I'll be able to test 1.20.

That's a shame. It would be good if you could migrate, or make a set of patches
that you can apply to 1.20 and future versions (so you can stay up-to-date).


> It does seem
> that many of the operations (other than send and recv) really need
> atomicity between SPI operations. The only other public method that I use
> is the one for temperature, and to check to see if it is still in TX mode.
>
> > They are intended for statistics use by subclasses. I agree there may be
> > an
> > issue with their 16 bitness, but I do want to leave them there.
>
> I figured that might be the case. I'm only using the base class.

OK.

>
> > No its not clear.
> >
> > AFAICT, the chip does not go direct from TX to RX or vice versa, but
> > always
> > via IDLE, so I expect that RX->IDLE->TX will be slower than IDLE->TX.
>
> In my testing today, I tried it both ways and added some time measurement
> logging. It didn't seem to make a difference. But I am having a weird
> lock-up in the code when I do two sends in a row. If I add debug logging,
> it goes away. So I'm not sure if we are supposed to delay the operations
> or not. I hate adding delay statements, but if they are small enough, it
> won't matter much in my application.

Can you give some more details about this? I did some tests yesterday where I
sent packets as fast as possible without waiting for a reply and did not see
any lockups. Saw about 330 x 13 octet packets per sec.

Code to demonstrate the problem would be welcome.

Cheers.

Joe Tuttle

Sep 6, 2012, 5:32:00 PM
to rf22-a...@googlegroups.com
I am still learning as I go, and I feel like I'm still learning a lot
about the library and the radios. As soon as I think that I've solved
my issue, I'll try to go back to the library.

The "locking up" could be caused by something I added...too early to
tell. I did add code in the packet-sent ISR section to change to RX
after a send. This might have caused it too. I looked at doing what
you suggested, and changing the _idleMode, but it looked like RX would
be OR-ed with TX when transitioning to TX, which looked wrong. So I
just explicitly called setModeRx in the ISR. Maybe this needs a 200 µs
delay after calling it, as does setModeTx??? Still trying to
understand the problem...and it is very difficult when debugging code
adds delays and the problem disappears.

Roland Mieslinger

Sep 6, 2012, 5:35:43 PM
to rf22-a...@googlegroups.com


2012/9/6 Mike McCauley <mi...@open.com.au>

> They are intended for statistics use by subclasses. I agree there may be an
> issue with their 16 bitness, but I do want to leave them there.

As far as I understand it, volatile causes the compiler to generate code that always reads/writes the value from/to SRAM and avoids caching it in a register. To get consistent reads and writes, every access has to happen in an ATOMIC_BLOCK.

--
Roland

Roland Mieslinger

Sep 6, 2012, 5:41:56 PM
to rf22-a...@googlegroups.com
2012/9/6 Joe Tuttle <joet...@gmail.com>

> Still trying to understand the problem...and it is very difficult when debugging code
> adds delays and the problem disappears.

I'm in the same newbie state.
From my limited experience, a logic analyzer is the best debugging tool for this kind of problem, as changing a pin can be done in two cycles (I can't think of anything faster or with fewer side effects).

--
Roland
 

Joe Tuttle

Sep 6, 2012, 6:02:15 PM
to rf22-a...@googlegroups.com
Great idea.

I'm thinking outside the box here just a little...I don't see a reason
why we even need to have an interrupt pin on the arduino hooked up to
the radio. As long as you are polling the interrupt status registers
often enough to keep the fifos full/empty, you shouldn't need to deal
with all of this. For my application, I can't use the wait methods
for sending or receiving. I need to poll to see if something is
available. I could see calling handleInterrupt from the main loop of
the application. Since I am already polling for data, and the fact
that I'd like to free up another interrupt, I am going to investigate
this approach.

Any thoughts?

Mike McCauley

Sep 6, 2012, 6:06:00 PM
to rf22-a...@googlegroups.com
On Thursday, September 06, 2012 11:35:43 PM Roland Mieslinger wrote:
> 2012/9/6 Mike McCauley <mi...@open.com.au>
>
> > They are intended for statistics use by subclasses. I agree there may be
> > an issue with their 16 bitness, but I do want to leave them there.
> As far as I understand it, *volatile* causes the compiler to generate
> code that will always read/write the value from/to SRAM and avoids caching
> in a register. To get consistent read/writes, every access has to happen in
> an ATOMIC_BLOCK.

Yes, that's the problem with leaving them 16 bits.
We should probably add accessor functions that use ATOMIC_BLOCK.
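
The 16-bit problem can be shown with a small host-side model of a torn read. This is a sketch under stated assumptions: CounterModel and its members are hypothetical, the byte-wise read imitates what an 8-bit CPU does, and the "interrupt" is simulated by calling isr() between the two byte loads.

```cpp
#include <stdint.h>

// Hypothetical model of a 16-bit statistics counter updated by an ISR.
struct CounterModel {
    volatile uint16_t rxGood;
    bool irqEnabled;

    CounterModel() : rxGood(0x00FF), irqEnabled(true) {}

    // What the radio ISR would do on a good packet.
    void isr() { rxGood = (uint16_t)(rxGood + 1); }

    // An 8-bit CPU loads the low and high bytes separately. If interrupts
    // are enabled, the model lets the ISR fire between the two loads.
    uint16_t readByteWise() {
        uint8_t lo = (uint8_t)(rxGood & 0xFF);
        if (irqEnabled) isr();
        uint8_t hi = (uint8_t)(rxGood >> 8);
        return (uint16_t)((hi << 8) | lo);
    }

    // Unprotected accessor: can return a value the counter never held.
    uint16_t readUnprotected() { return readByteWise(); }

    // Protected accessor: interrupts are held off for both byte loads and
    // the previous state is restored afterwards (the ATOMIC_BLOCK idea).
    uint16_t readAtomic() {
        bool saved = irqEnabled;
        irqEnabled = false;
        uint16_t value = readByteWise();
        irqEnabled = saved;
        if (irqEnabled) isr();   // the deferred interrupt runs now
        return value;
    }
};
```

Starting from 0x00FF, the unprotected read returns 0x01FF, a value that was never stored, while the protected read returns 0x00FF and the counter then advances to 0x0100.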



Mike McCauley

Sep 6, 2012, 6:09:47 PM
to rf22-a...@googlegroups.com
Hi,

On Thursday, September 06, 2012 05:32:00 PM Joe Tuttle wrote:
> I am still learning as I go, and I feel like I'm still learning a lot
> about the library and the radios. As soon as I think that I've solved
> my issue, I'll try to go back to the library.


>
> The "locking up" could be caused by something I added...too early to
> tell.

version 1.20 should wait for any previous transmit to finish before starting a
new transmit.


> I did add code in the packet-sent ISR section to change to RX
> after a send. This might have caused it too. I looked at doing what
> you suggested, and changing the _idleMode, but it looked like RX would
> be OR-ed with TX when transitioning to TX, which looked wrong. So I
> just explicitly called setModeRx in the ISR. Maybe this needs a 200 µs
> delay after calling it, as does setModeTx???

As I understand it, you can ask the chip to transmit, but if it is not ready,
it will delay until it is ready (i.e. has reached the TX state).
I don't think you should need to add artificial delays in your code to wait for
the chip to transition between states.

> Still trying to
> understand the problem...and it is very difficult when debugging code
> adds delays and the problem disappears.

Yes, that's always a pain.

Joe Tuttle

Sep 6, 2012, 8:09:09 PM
to rf22-a...@googlegroups.com
I haven't solved my lockup yet...and I'm still having it even though
the rf22 library is not using the hardware interrupt anymore. This
wasn't that big of a change. I just made handleInterrupt public,
disabled the attachInterrupt call, changed the wait inside send to
loop until the mode is no longer TX while continually calling
handleInterrupt, and called handleInterrupt from my main loop.
Everything works as before, and I'm still having the lockup. I also
added a couple of LEDs on output pins, toggled around sections of
code, to see where it locks up. The place where it locks up seems to
move around, so I'm inclined to believe it is locking up inside an
interrupt or timer service routine in another library. I'm wondering
about the SPI library. But I'm also using serial and TLC. It could be
memory, but I'm using MemoryFree, which is reporting over 600 bytes
free at startup. I am going to switch over to a fresh sketch to see if
I can reproduce it.

Paul Martinsen

Sep 7, 2012, 8:58:04 AM
to rf22-a...@googlegroups.com
Do you call waitPacketSent after each message is sent? I found calling send twice without waiting for completion can cause lockups.

Joe Tuttle

Sep 7, 2012, 9:04:21 AM
to rf22-a...@googlegroups.com
Thanks for the suggestion...the new 1.20 code does the wait at the
beginning of the send. I'm not locking up on the wait. Somewhere
else in my code. Still eluding me.

I took out the RF22 code temporarily, and I cannot get the
lockup. I put it back in, and I get a lockup somewhere. I have even
replaced the SPI code with my own bit-bang code. The lockup still
happens, and the library doesn't yet fully work in bit-bang mode.
Still debugging. But I should end up with code that allows the RF22
to work with any 4 I/O pins and no interrupt pin.

Joe Tuttle

Sep 8, 2012, 5:09:27 PM
to rf22-a...@googlegroups.com
I tried to go back to 1.20 and I couldn't get my code to work with it.
I may still try again later.

One thing that I found with my version of the library...when I was
sending a lot of packets (one send, then another...which waited on the
first to finish), I was also checking to see if serial data was
available in between the sends. What I found was that I was dropping
a lot of serial data. I assumed this had something to do with all of
the atomic blocks. I switched all of the blocks to just disabling
INT0, and I no longer have incoming serial issues. I did find that
you can't disable the interrupt during the setup routine, or your
application will lock up. I will probably test reading the interrupt
mask, and then restoring it just like the atomic block does for global
interrupts.
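
A minimal sketch of that save-and-restore idea for the INT0 mask bit, written as a scope guard. Int0MaskGuard is a hypothetical name, not part of the library. EIMSK and INT0 are the register and bit names used on the ATmega328P and similar parts (an assumption about the target; other AVRs name them differently), and the non-AVR branch is only a stand-in for desktop compilation.

```cpp
#include <stdint.h>

#if defined(__AVR__)
#include <avr/io.h>          // EIMSK and INT0 on ATmega328P-class parts
#else
// Host stand-ins (assumption, for off-target compilation only).
static uint8_t EIMSK = 0;
#define INT0 0
#endif

// Masks INT0 for the lifetime of the object and re-enables it on exit
// only if it was enabled on entry. Other interrupts (serial, timers)
// keep running while the guard is active.
class Int0MaskGuard {
public:
    Int0MaskGuard() : wasEnabled_((EIMSK & (1 << INT0)) != 0)
    {
        EIMSK &= (uint8_t)~(1 << INT0);
    }
    ~Int0MaskGuard()
    {
        if (wasEnabled_) EIMSK |= (uint8_t)(1 << INT0);
    }
private:
    bool wasEnabled_;
    Int0MaskGuard(const Int0MaskGuard&);             // not copyable
    Int0MaskGuard& operator=(const Int0MaskGuard&);  // not assignable
};
```

Because the guard never sets the bit unless it was already set, using it before the interrupt has been attached leaves INT0 masked, which may avoid the setup-time lockup mentioned above.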

I went back to using SPI...just a little faster than bit-banging.
Although I think it would make a great addition to the library...to be
able to switch it to bit-bang mode so that you can use any 4 pins you
want.

I also found that if you want RSSI, you have to use a hardware
interrupt. The way that the library gets the last rssi, is via the
interrupt when it receives a valid preamble. I found that this was
not reliable when I was servicing the interrupts from the main loop.
By the time I went to service the interrupt, the radio signal may have
been finished. But everything else seemed to work well without the
interrupt. It made debugging a lot easier. But I went back to using
the hardware interrupt.

Mike McCauley

Sep 8, 2012, 5:34:10 PM
to rf22-a...@googlegroups.com
Hi,

On Saturday, September 08, 2012 05:09:27 PM Joe Tuttle wrote:
> I tried to go back to 1.20 and I couldn't get my code to work with it.
> I may still try again later.

I think it would be good if you did it now: then you won't have such a big
difference between your code and the mainline.

>
> One thing that I found with my version of the library...when I was
> sending a lot of packets (one send, then another...which waited on the
> first to finish), I was also checking to see if serial data was
> available in between the sends. What I found was that I was dropping
> a lot of serial data. I assumed this had something to do with all of
> the atomic blocks.

In your code, is the mode test/wait at the beginning of send() inside or
outside the interrupt block?
I think it can and should be outside: I don't think it needs protection. That's
how it is in 1.20. The idea of the atomic blocks should be that they only last
a very short time, and not be operating during a busy/wait.

The recv code you sent quite a while back has the test *inside* the atomic
block.

> I switched all of the blocks to just disabling
> INT0, and I no longer have incoming serial issues. I did find that
> you can't disable the interrupt during the setup routine, or your
> application will lock up. I will probably test reading the interrupt
> mask, and then restoring it just like the atomic block does for global
> interrupts.
>
> I went back to using SPI...just a little faster than bit-banging.
> Although I think it would make a great addition to the library...to be
> able to switch it to bit-bang mode so that you can use any 4 pins you
> want.


Yes, that would be nice. Maybe need to abstract the SPI library.


Joe Tuttle

Sep 8, 2012, 5:52:42 PM
to rf22-a...@googlegroups.com
On Sat, Sep 8, 2012 at 5:34 PM, Mike McCauley <mi...@open.com.au> wrote:
> Hi,
>
> On Saturday, September 08, 2012 05:09:27 PM Joe Tuttle wrote:
>> One thing that I found with my version of the library...when I was
>> sending a lot of packets (one send, then another...which waited on the
>> first to finish), I was also checking to see if serial data was
>> available in between the sends. What I found was that I was dropping
>> a lot of serial data. I assumed this had something to do with all of
>> the atomic blocks.
>
> In your code, is the mode test/wait at the beginning of send() inside or
> outside the interrupt block?
> I think it can and should be outside: I dont think it needs protection. Thats
> how it is in 1.20. The idea of the atomic blocks should be that they only last
> a very short time, and not be operating during a busy/wait
>
> The recv code you sent quite a while back has the test *inside* the atomic
> block.
>

The wait is at the beginning of the send and before the interrupt gets
disabled. I went back and looked at my snippet from before, and you
are correct. That was a bug.
I don't have any code inside the atomic blocks that is very long in
duration. I think that it was just that global interrupts were being
disabled very often. I may try to reproduce this with just a simple
sketch. But I do know as soon as I went with just disabling int0, my
serial data didn't have any more problems.

I did have a bug at one point where I was tracking down a lockup. It
seemed that it was caused inside the interrupt where on a valid
packet, you read the packet length from the register. At that point,
it read back 255. The spiBurstRead went and read 255 bytes into a
buffer that wasn't big enough. You should limit the length to the
size of the remaining buffer. I don't remember what got the radio
into that state, but the symptom was a lockup.
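
A minimal sketch of the length check suggested above. The helper name clampReadLength and its parameters are hypothetical; the idea is simply to never read more bytes from the radio than the receive buffer can still hold, whatever the length register reports.

```cpp
#include <stdint.h>
#include <stddef.h>

// Returns the number of bytes that may safely be read into a buffer of
// bufSize bytes that already holds bufUsed bytes, given the length the
// radio reported. A bogus length such as 255 is cut down to what fits.
static uint8_t clampReadLength(uint8_t reportedLen, size_t bufUsed, size_t bufSize)
{
    size_t remaining = (bufUsed < bufSize) ? (bufSize - bufUsed) : 0;
    // The cast is safe: this branch is only taken when remaining < 255.
    return (reportedLen > remaining) ? (uint8_t)remaining : reportedLen;
}
```

The clamped value would then be passed to the burst read in place of the raw register value, for example clampReadLength(255, 0, 64) yields 64.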

Mike McCauley

Sep 8, 2012, 9:54:17 PM
to rf22-a...@googlegroups.com, Joe Tuttle
Hi Joe,

On Saturday, September 08, 2012 05:52:42 PM Joe Tuttle wrote:
> On Sat, Sep 8, 2012 at 5:34 PM, Mike McCauley <mi...@open.com.au> wrote:
> > Hi,
> >
> > On Saturday, September 08, 2012 05:09:27 PM Joe Tuttle wrote:
> >> One thing that I found with my version of the library...when I was
> >> sending a lot of packets (one send, then another...which waited on the
> >> first to finish), I was also checking to see if serial data was
> >> available in between the sends. What I found was that I was dropping
> >> a lot of serial data. I assumed this had something to do with all of
> >> the atomic blocks.
> >
> > In your code, is the mode test/wait at the beginning of send() inside or
> > outside the interrupt block?
> > I think it can and should be outside: I dont think it needs protection.
> > Thats how it is in 1.20. The idea of the atomic blocks should be that
> > they only last a very short time, and not be operating during a
> > busy/wait
> >
> > The recv code you sent quite a while back has the test *inside* the
> > atomic block.
>
> The wait is at the beginning of the send and before the interrupt gets
> disabled. I went back and looked at my snippet from before, and you
> are correct. That was a bug.
> I don't have any code inside the atomic blocks that is very long in
> duration. I think that it was just global interrupts were being
> disabled very often.

Hmmm, as long as they are reenabled soon enough, I would not think that would
be a problem.


> I may try to reproduce this with just a simple
> sketch. But I do know as soon as I went with just disabling int0, my
> serial data didn't have any more problems.
>
> I did have a bug at one point where I was tracking down a lockup. It
> seemed that it was caused inside the interrupt where on a valid
> packet, you read the packet length from the register. At that point,
> it read back 255. The spiBurstRead went and read 255 bytes into a
> buffer that wasn't big enough. You should limit the length to the
> size of the remaining buffer. I don't remember what got the radio
> into that state, but the symptom was a lockup.

Thanks, I think I have found and fixed that in the new 1.21.

Cheers.