I2C Lockup

Simon Peacock

Jun 6, 2023, 7:41:34 PM
to bcm2835
Hi All,
I've experienced several I2C lockups using a 4-line I2C LCD display with bcm2835.c,v 1.28 by Mike McCauley.
The cause is infinite loops within the code: there is no fail-safe if the hardware never signals completion. I've added a fix that uses a failsafe counter to abort and return a timeout.

I will submit a diff if there is some way of doing this.

Simon

Mike McCauley

Jun 6, 2023, 9:11:48 PM
to bcm2835, Simon Peacock
Hi,
If you send the diff to me, I will look at adding it to the mainstream.

Cheers.
--
Mike McCauley VK4AMM mi...@airspayce.com
Airspayce Pty Ltd 9 Bulbul Place Currumbin Waters QLD 4223 Australia
http://www.airspayce.com 5R3MRFM2+X6
Phone +61 7 5598-7474



Simon Peacock

Jun 6, 2023, 10:27:51 PM
to bcm...@googlegroups.com
Hi Mike

It's quite a simple change: the failsafe is simply a down counter. I've based it on a 4-character send, which takes between 800 and 1050 counts, and this seems to have fixed the issue (though I am still running lots of tests). A multiplier of 1000 may be a bit extreme if larger frames are sent, but I always send small frames, so I haven't fully tested with large ones. Maybe 500 x len?
Note: I am using a Raspberry PI Model B+.
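
For readers without the attachments, here is a minimal sketch of the down-counter idea described above. It is not the attached diff: `wait_done`, `status_reader`, `STATUS_DONE`, `COUNTS_PER_BYTE` and the two simulated readers are hypothetical names made up for illustration, and the status register is replaced by a callback so the loop can run without hardware. Only the "counts per byte times len" budget comes from the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from bcm2835.c. */
#define STATUS_DONE     0x02u   /* assumed "transfer done" bit */
#define COUNTS_PER_BYTE 1000u   /* multiplier discussed in the post */

enum i2c_result { I2C_OK = 0, I2C_TIMEOUT = 1 };

/* Stand-in for reading the status register. */
typedef uint32_t (*status_reader)(void *ctx);

/* Poll until STATUS_DONE is set or the down counter reaches zero. */
enum i2c_result wait_done(status_reader read_status, void *ctx, uint32_t len)
{
    assert(read_status != NULL);
    uint32_t failsafe = len * COUNTS_PER_BYTE;

    while (!(read_status(ctx) & STATUS_DONE)) {
        if (failsafe == 0)
            return I2C_TIMEOUT;   /* hardware never finished: give up */
        failsafe--;
    }
    return I2C_OK;
}

/* Simulated hardware that never sets the done bit. */
uint32_t stuck_status(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Simulated hardware that sets the done bit after *ctx polls. */
uint32_t done_after_polls(void *ctx)
{
    uint32_t *polls_left = ctx;
    if (*polls_left == 0)
        return STATUS_DONE;
    (*polls_left)--;
    return 0;
}
```

With a 4-byte transfer the budget is 4000 polls, so a device that finishes after about 900 polls returns `I2C_OK`, while a stuck device returns `I2C_TIMEOUT` instead of hanging. The right budget depends on how fast the CPU spins, so the numbers here are only the ones measured on the Model B+.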

There are also other lockups waiting to be found. Anywhere you wait for a bit to be set by hardware, you might end up waiting forever. I2C is notoriously hard to get right in hardware. The I2C peripherals on ST 8-bit and 32-bit processors lock up all the time due to missed timings, so this is an easy win.

Simon


bcm2835.h.dif
bcm2835.c.dif

Mike McCauley

Jun 6, 2023, 11:18:48 PM
to bcm...@googlegroups.com, Simon Peacock
Hi Simon,

OK. Please send through the diff when you think it's at production quality.

Cheers.

Arjan van Vught

Jun 7, 2023, 5:33:45 AM
to bcm2835
On Wednesday, June 7, 2023 at 05:18:48 UTC+2, mi...@airspayce.com wrote:
Simon,

Have you checked the hardware for any issues? 

I have been using this great library since 1.18 and have never experienced any I2C timing issues that result in an endless loop.

For the patch, you can do it with a single time_out variable. However, this time_out is not reliable: it depends on the CPU clock speed, and the CPU clock speed is not taken into account.
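
To illustrate the clock-speed point, here is a rough sketch of a wait loop bounded by elapsed time rather than by loop iterations. It assumes a POSIX system and uses `clock_gettime(CLOCK_MONOTONIC)` as the time source; inside the library itself a hardware timer read would probably be the more natural choice. The names `wait_done_until`, `poll_fn`, `DONE_BIT`, `always_done` and `never_done` are hypothetical and exist only for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical names for illustration only; not taken from bcm2835.c. */
#define DONE_BIT 0x02u   /* assumed "transfer done" bit */

/* Stand-in for reading the status register. */
typedef uint32_t (*poll_fn)(void *ctx);

/* Monotonic time in microseconds. */
static uint64_t monotonic_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Poll until DONE_BIT is set (returns 0) or timeout_us has elapsed (returns -1). */
int wait_done_until(poll_fn read_status, void *ctx, uint64_t timeout_us)
{
    assert(read_status != NULL);
    const uint64_t start = monotonic_us();

    while (!(read_status(ctx) & DONE_BIT)) {
        if (monotonic_us() - start >= timeout_us)
            return -1;   /* deadline passed, independent of CPU speed */
    }
    return 0;
}

/* Simulated hardware: transfer already finished. */
uint32_t always_done(void *ctx)
{
    (void)ctx;
    return DONE_BIT;
}

/* Simulated hardware: transfer never finishes. */
uint32_t never_done(void *ctx)
{
    (void)ctx;
    return 0;
}
```

The trade-off is an extra clock read per poll, in exchange for a limit expressed in microseconds that behaves the same whatever the CPU clock speed is.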

Please note that the code is using lower-case variable names.

Thanks, Arjan
 

Mike McCauley

Jun 8, 2023, 12:14:43 AM
to Simon Peacock, bcm...@googlegroups.com
Hi again Simon

Can you tell me more about the circumstances under which you might get, or do
see, a timeout?

I'm not sure I can see how there could be a timeout in the inner part of the
loop (i.e. inside the while (!Timeout && remaining && (bcm2835_peri_read(status)
& BCM2835_BSC_S_TXD ))) unless the chip itself was broken?

So: do you only see the hang in the outer loop (i.e. after all the data has been
queued, and remaining is 0)? Or in the inner loop too?

Also, could you please send the diff as a _unified_ diff or a _context_ diff,
suitable for use by patch?

Cheers.



On Wednesday, 7 June 2023 13:37:31 AEST Simon Peacock wrote:
> The diff I've sent is rock solid. I've been testing most of the
> afternoon, not a single issue so far. I've seen a few warnings of I2C
> timeout, but without consequence.
>
> My only concern is in applications other than mine. You might be slightly
> better informed than I in this respect. I haven't necessarily thought thru
> all the possible consequences.
> Usually I would have a hardware timer that can be used, SYSTICK for
> example. This gets rid of CPU clock dependencies, and I can failsafe at
> 10 ms or 100 ms, which would be well beyond normal operation.
>
> Simon