Buffer Full


Pengcheng Liu

May 13, 2021, 9:51:32 AM
to openthread-users
Hi Openthread,

After running for a few minutes, the NCP stops sending messages. It can still receive them, but the received messages are not processed by MLE (for example, "Failed to process Parent Request: Drop").

I checked bufferinfo, and I think the nearly exhausted buffer pool may be the reason:

> bufferinfo
total: 100
free: 8
6lo send: 0 0
6lo reas: 0 0
ip6: 0 0
mpl: 0 0
mle: 26 52
arp: 0 0
coap: 10 20
coap secure: 0 0
application coap: 10 20
Done

The NCP seems to hang, and the buffer state does not change over time. Could you please explain what might cause this issue and how to solve it? The commit id is 22445ab69d180f7ef1cad3f437c8c9e65a5c7049.

Thank you very much in advance,
P
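
A quick way to sanity-check a dump like the one above is to add up the buffers held by each queue and compare the sum with the pool size. The sketch below assumes that each queue row lists a message count followed by a buffer count, which is consistent with the numbers in this thread. `BufferSummary`, `Summarize`, and `IsConsistent` are hypothetical helper names, not OpenThread APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical summary of one `bufferinfo` dump; not an OpenThread type.
struct BufferSummary
{
    int total        = 0; // value of the "total:" row
    int free         = 0; // value of the "free:" row
    int usedByQueues = 0; // sum of the second number on every queue row
};

// Parses rows of the form "name: messages buffers" (assumed layout).
// Single-number rows are only recognized for "total" and "free".
// Lines without a colon (the prompt, "Done", blank lines) are skipped.
BufferSummary Summarize(const std::string &aDump)
{
    BufferSummary      summary;
    std::istringstream lines(aDump);
    std::string        line;

    while (std::getline(lines, line))
    {
        std::size_t colon = line.find(':');

        if (colon == std::string::npos)
        {
            continue;
        }

        std::string        name = line.substr(0, colon);
        std::istringstream values(line.substr(colon + 1));
        int                first  = 0;
        int                second = 0;

        if (!(values >> first))
        {
            continue;
        }

        if (values >> second)
        {
            summary.usedByQueues += second; // queue row: count the buffers
        }
        else if (name == "total")
        {
            summary.total = first;
        }
        else if (name == "free")
        {
            summary.free = first;
        }
    }

    return summary;
}

// True when every buffer is either free or accounted for by a listed queue.
bool IsConsistent(const BufferSummary &aSummary)
{
    return aSummary.free + aSummary.usedByQueues == aSummary.total;
}
```

For the dump in this post, the queues hold 52 + 20 + 20 = 92 buffers, and 92 + 8 free = 100, so every buffer is accounted for and none has leaked outside the listed queues.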

Jonathan Hui

May 13, 2021, 4:26:47 PM
to Pengcheng Liu, openthread-users
Message buffers collecting in the MLE component may indicate that the Mle::HandleDelayedResponseTimer() callback is not getting called.

Is it possible for you to use a debugger to determine if the above method is being called?

Also, can you provide details on your hardware platform and git commit id?

--
Jonathan Hui




Pengcheng Liu

May 14, 2021, 10:00:34 PM
to Jonathan Hui, openthread-users
Hi Jonathan,

Thank you very much for your response. I am using a TI CC1352.
I checked Mle::HandleDelayedResponseTimer(), and it works.
More often, the buffers are occupied by ip6, for example:

> bufferinfo
total: 300
free: 0

6lo send: 0 0
6lo reas: 0 0
ip6: 152 300
mpl: 0 0
mle: 0 0
arp: 0 0
coap: 0 0
coap secure: 0 0
application coap: 0 0
Done

I suspected that the ip6 buffer occupancy above is caused by the tasklet list operation, so I printed the addresses of all active tasklets:
MeshForwarder=20009df4, mNext=0
Ip6=200098f0,           mNext=200002a8
Notifier=200002a8,      mNext=200098f0
Mac=20009d14,           mNext=0

What I can see is that the Ip6 and Notifier tasklets have both been posted and, puzzlingly, point to each other, so the scheduler is never triggered to dequeue the messages from their buffers.
OpenThread hangs at this stage. Even after I call "thread stop", "ifconfig down", "ifconfig up", and "thread stop", nothing changes.
Could you give some suggestions about this scenario?

Thank you,
P


Just wondering what 

Jonathan Hui

May 14, 2021, 10:52:28 PM
to Pengcheng Liu, openthread-users
The tasklet queue is maintained as a circular linked list - see src/core/common/tasklet.cpp

I would check to make sure TaskletScheduler::ProcessQueuedTasklets() is getting called.

--
Jonathan Hui
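
To illustrate why two posted tasklets point at each other, here is a minimal sketch of a tail-pointer circular list. It is not the code in src/core/common/tasklet.cpp; `Tasklet`, `Scheduler`, `Post`, and `ProcessQueued` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical tasklet: a handler plus a link used while it is queued.
struct Tasklet
{
    explicit Tasklet(std::function<void()> aHandler)
        : handler(std::move(aHandler))
    {
    }

    std::function<void()> handler;
    Tasklet              *next = nullptr; // nullptr means "not posted"
};

// Hypothetical scheduler that keeps only a tail pointer; tail->next is the head.
class Scheduler
{
public:
    void Post(Tasklet &aTasklet)
    {
        if (aTasklet.next != nullptr)
        {
            return; // already queued
        }

        if (mTail == nullptr)
        {
            aTasklet.next = &aTasklet; // a single entry points to itself
        }
        else
        {
            aTasklet.next = mTail->next; // new tail points back to the head
            mTail->next   = &aTasklet;
        }

        mTail = &aTasklet;
    }

    // Runs the tasklets that were queued when the call started, head first.
    void ProcessQueued()
    {
        Tasklet *stopAt = mTail;

        while (mTail != nullptr)
        {
            Tasklet *head = mTail->next;

            if (head == mTail)
            {
                mTail = nullptr; // last entry removed
            }
            else
            {
                mTail->next = head->next; // unlink the head
            }

            head->next = nullptr;
            head->handler();

            if (head == stopAt)
            {
                break;
            }
        }
    }

    bool IsEmpty() const { return mTail == nullptr; }

private:
    Tasklet *mTail = nullptr;
};
```

With exactly two tasklets posted, each one's `next` is the other, which matches the Ip6/Notifier addresses reported earlier; the two entries with `mNext=0` are simply not queued. In this model the pointers only stay that way if nothing ever calls the processing function.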


Abtin Keshavarzian

May 15, 2021, 12:09:45 AM
to openthread-users
I have a theory for something to investigate: we have seen similar situations in the past when a radio platform layer implementation had issues (for example, if it can somehow get stuck).

After a frame tx request, the OT core expects the radio platform layer to invoke the `TxDone` callback. If it is not called, the MAC layer remains stuck. The MLE layer continues to run and add messages (e.g., link adv), which pile up in the message queue and consume all the `Message` buffers.

The debug log level should show all MAC operation changes and callbacks. The `mOperation` variable in the `Mac` class shows the currently active operation.

Abtin.
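
The failure mode described above can be sketched with a toy model: a fixed buffer pool, a send queue, and a MAC that holds one in-flight frame until the radio reports completion. `ToyStack` and all of its members are hypothetical and greatly simplified; this is not OpenThread code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a stack with a fixed pool of message buffers.
class ToyStack
{
public:
    explicit ToyStack(std::size_t aPoolSize)
        : mFree(aPoolSize)
    {
    }

    // An upper layer (think MLE) queues one message.
    // Returns false when the pool is exhausted.
    bool Enqueue()
    {
        if (mFree == 0)
        {
            return false;
        }

        --mFree;
        ++mQueued;
        TryTransmit();
        return true;
    }

    // Radio completion callback: releases the in-flight buffer
    // and starts the next transmission, if any.
    void TxDone()
    {
        if (!mTxInProgress)
        {
            return;
        }

        mTxInProgress = false;
        ++mFree;
        TryTransmit();
    }

    std::size_t Free() const { return mFree; }
    std::size_t Queued() const { return mQueued; }
    bool        IsTxInProgress() const { return mTxInProgress; }

private:
    // The MAC sends one frame at a time and waits for TxDone().
    void TryTransmit()
    {
        if (mTxInProgress || mQueued == 0)
        {
            return;
        }

        --mQueued;
        mTxInProgress = true; // buffer stays allocated until TxDone()
    }

    std::size_t mFree;
    std::size_t mQueued       = 0;
    bool        mTxInProgress = false;
};
```

If `TxDone()` follows every transmission, the pool returns to full after each message. If it is never called, the first frame stays in flight, later messages queue behind it, and `Enqueue()` starts failing once the pool is empty.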

Jonathan Hui

May 15, 2021, 12:52:22 AM
to Abtin Keshavarzian, openthread-users
Yes, message buffers getting stuck in the ip6 message queue is most often due to the radio not calling otPlatRadioTxDone() after accepting a call to otPlatRadioTransmit(). I would check on this as well. This might explain Pengcheng's most recent message.

That said, I'm not sure if that explains messages getting stuck in other queues and not the ip6 send queue (as in Pengcheng's first message).

--
Jonathan Hui
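
One way to check whether a radio driver ever fails to report completion is to record when a transmit request is accepted and flag it if the completion callback is overdue. The sketch below is a debugging aid under that assumption; `TxWatchdog` and its methods are hypothetical and would have to be called from the platform's own transmit and TxDone paths.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical debugging helper; not part of OpenThread or any radio driver.
class TxWatchdog
{
public:
    explicit TxWatchdog(uint32_t aTimeoutMs)
        : mTimeoutMs(aTimeoutMs)
    {
    }

    // Call when the driver accepts a transmit request.
    void OnTransmitAccepted(uint32_t aNowMs)
    {
        mPending = true;
        mStartMs = aNowMs;
    }

    // Call when the driver reports completion (success or failure).
    void OnTxDone() { mPending = false; }

    // True when a transmit has been pending longer than the timeout.
    // Unsigned subtraction keeps the comparison valid across timer wraparound.
    bool IsOverdue(uint32_t aNowMs) const
    {
        return mPending && (aNowMs - mStartMs) > mTimeoutMs;
    }

private:
    uint32_t mTimeoutMs;
    uint32_t mStartMs = 0;
    bool     mPending = false;
};
```

Polling `IsOverdue()` from a periodic timer and logging when it returns true gives a direct signal that completion was never reported, without relying on heavy per-frame logging.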



Pengcheng Liu

May 15, 2021, 3:20:27 AM
to Jonathan Hui, Abtin Keshavarzian, openthread-users
Hi Jonathan and Abtin,

I made some changes to the radio platform, and I do see logs such as "ack timeout retry ..." from the MAC layer (I use MAC-layer retransmission and CSMA backoff). Besides that, I occasionally see that the 6lo queue is full. I will try decreasing kAckTimeout (which I had also changed) and see whether anything improves. (I once suspected that reentrant access to the tasklet list was the cause, but that is probably not the case.)

Thank you very much, Jonathan and Abtin, for all your kind advice and I will check on these.
Kind regards,
Pengcheng.

Pengcheng Liu

May 16, 2021, 5:34:48 PM
to Jonathan Hui, Abtin Keshavarzian, openthread-users
Hi Jonathan and Abtin,

Based on my current tests, I think the issues have been resolved.
I believe the cause is what you guessed. I have a fair amount of logging in the platform module, and I had enabled all of it during the tests.
I think this logging degraded the platform's performance significantly, so otPlatRadioTxDone was not invoked in time.

May I make a suggestion (maybe not a good idea): when we call, for example, "ifconfig down", could all the buffers be cleared?
Thank you very much for your valuable advice; it has all been helpful.

Kind regards,
Pengcheng

Jonathan Hui

May 16, 2021, 6:47:50 PM
to Pengcheng Liu, Abtin Keshavarzian, openthread-users
I'm glad you are no longer experiencing the issue.

That said, I'm not yet convinced it is due to the lack of the radio driver calling "TxDone()". If that is the case, you should see the messages being queued in the "6lo" message queue. Instead, you are seeing messages queued up in places like the "ip6" message queue. In this case, as long as the task queue is getting processed, the "ip6" message queue should get serviced - see Ip6::HandleSendQueue(). Calling "ifconfig down" should not be necessary to clear the ip6 message queue.

--
Jonathan Hui
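
The reasoning above can be turned into a rough triage rule: when the pool is nearly empty, look at which queue holds the buffers. The sketch below encodes only the heuristic stated in this message; `QueueUsage`, `Suspect`, and `Classify` are hypothetical names, and the low-watermark threshold is an arbitrary assumption.

```cpp
#include <cassert>

// Hypothetical classification of where to look first.
enum class Suspect
{
    kNone,              // pool is not under pressure
    kRadioTxDone,       // buffers sit in the 6lo send queue
    kTaskletProcessing, // buffers sit in the ip6 queue
    kOther              // buffers sit elsewhere (for example mle or coap)
};

// Hypothetical subset of the `bufferinfo` numbers, in buffers.
struct QueueUsage
{
    int total;
    int free;
    int sixloSendBuffers;
    int ip6Buffers;
};

// Heuristic only: a full 6lo send queue points at the radio not reporting
// TxDone, while a full ip6 queue points at the tasklet queue not being run.
Suspect Classify(const QueueUsage &aUsage, int aLowWatermark)
{
    if (aUsage.free > aLowWatermark)
    {
        return Suspect::kNone;
    }

    if (aUsage.sixloSendBuffers > 0 && aUsage.sixloSendBuffers >= aUsage.ip6Buffers)
    {
        return Suspect::kRadioTxDone;
    }

    if (aUsage.ip6Buffers > 0)
    {
        return Suspect::kTaskletProcessing;
    }

    return Suspect::kOther;
}
```

Applied to the second dump in this thread (300 total, 0 free, all 300 buffers in ip6), the rule points at tasklet processing. Applied to the first dump (8 free, buffers in mle and coap), it returns `kOther`, which matches the observation that the radio theory does not explain that case.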


Pengcheng Liu

May 16, 2021, 7:05:42 PM
to Jonathan Hui, Abtin Keshavarzian, openthread-users
Hi Jonathan,

I am ashamed to say that the platform is not the only cause of the issue.
I have a task dedicated to processing OT tasklet events (after a tasklet posts an event, my application task calls ProcessQueuedTasklets, which then dequeues the messages from the buffers). However, the priority of this task was set much lower than the others, so it was sometimes blocked (in addition, I use a semaphore lock somewhere in the application). On such occasions, OT accumulated messages in the buffers until they filled up.

I increased the priority of my application task, and so far I have not seen the ip6 buffers fill up. I believe this is the root cause, but I will carefully check other possibilities in further tests.

Kind regards,
P
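
The effect of a starved tasklet-processing task can be shown with a small deterministic simulation. It assumes one message is queued per tick and that the processing task drains the whole queue whenever it gets to run; `Simulate` and `Result` are hypothetical names, and the numbers are illustrative only.

```cpp
#include <cassert>

// Hypothetical outcome of one simulation run.
struct Result
{
    int peakQueued;        // largest number of buffers held in the queue
    int failedAllocations; // ticks on which no buffer was available
};

// One message is queued per tick. The processing task runs only on every
// aProcessEvery-th tick (a larger value models a lower-priority task) and
// releases every queued buffer when it runs.
Result Simulate(int aPoolSize, int aTicks, int aProcessEvery)
{
    assert(aProcessEvery > 0);

    int    free   = aPoolSize;
    int    queued = 0;
    Result result = {0, 0};

    for (int tick = 1; tick <= aTicks; ++tick)
    {
        if (free > 0)
        {
            --free;
            ++queued;
        }
        else
        {
            ++result.failedAllocations;
        }

        if (queued > result.peakQueued)
        {
            result.peakQueued = queued;
        }

        if (tick % aProcessEvery == 0)
        {
            free += queued; // the processing task finally runs
            queued = 0;
        }
    }

    return result;
}
```

With a pool of 300 and processing on every tick, the queue never holds more than one buffer. If processing runs only every 500 ticks, the queue reaches all 300 buffers and allocations fail until the task runs again. In an RTOS port, this suggests that the task which calls otTaskletsProcess() after otTaskletsSignalPending() fires should not be starved by other application tasks.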