High current sleepy end device behavior on disconnect

Neal Jackson

Aug 8, 2018, 2:38:45 PM
to openthread-users
Hello, I've been having trouble with the current consumption of battery-powered sleepy end devices (SEDs) when they disconnect from the network.

I have a few nRF52840-based nodes acting as sleepy end devices, drawing only 1-2uA on average. They periodically send CoAP messages with sensor data to a server. I have a Raspberry Pi acting as a border router with a KW41Z-based NCP. The SEDs are configured to poll every 5 seconds. I've noticed that when I shut down the border router, the SEDs begin drawing 3-7mA constantly. If I bring the border router back up, sometimes these nodes don't rejoin the Thread network. Additionally, sometimes the SEDs disconnect on their own and never rejoin, even though the border router is still up. This current draw and inability to rejoin the network is killing the battery life of these devices.

I assume these devices are stuck in an always listen mode. Is this the intended behavior when a node is disconnected? If so, how can I manage this better so my nodes don't die waiting to rejoin? It seems like a bug if these devices never rejoin, even if the network is back up.

I plan on implementing a watchdog that checks for connection to a parent and the outside routable connected role, but this won't limit the current draw until the device is reset.
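
For illustration, here is a minimal sketch of the bookkeeping such a watchdog might use. Everything in it is hypothetical: the `node_role_t` enum, the `conn_watchdog_t` struct, and `conn_watchdog_expired()` are names made up for this example and are not part of OpenThread or the nRF SDK. On a real device the role would presumably be mapped from `otThreadGetDeviceRole()` and the time from a millisecond tick counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the node's role (not the OpenThread enum). */
typedef enum { ROLE_DISABLED, ROLE_DETACHED, ROLE_CHILD } node_role_t;

/* Tracks how long the node has been without a parent. */
typedef struct {
    uint32_t detached_since_ms; /* tick at which the detachment was first seen */
    bool     was_detached;      /* true while a detachment is being timed      */
} conn_watchdog_t;

/*
 * Call periodically. Returns true once the node has been without a parent
 * for at least limit_ms. Attaching as a child clears the timer.
 */
static bool conn_watchdog_expired(conn_watchdog_t *wd, node_role_t role,
                                  uint32_t now_ms, uint32_t limit_ms)
{
    assert(wd != NULL);

    if (role == ROLE_CHILD) {
        wd->was_detached = false;
        return false;
    }

    if (!wd->was_detached) {
        /* First observation of the detached state: start timing. */
        wd->was_detached      = true;
        wd->detached_since_ms = now_ms;
        return false;
    }

    /* Unsigned subtraction stays correct across tick counter wraparound. */
    return (uint32_t)(now_ms - wd->detached_since_ms) >= limit_ms;
}
```

The caller decides what to do when this returns true (for example, reset the device). As noted above, this only bounds how long a node stays detached; it does not reduce the current drawn in the meantime.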

Below is my init function and thread state callback used on the NRF devices. I'm calling init with config.sed = true, a poll period of 5 seconds, a child period of 40 seconds, and autocommission = true.

typedef struct {
  uint8_t   channel;
  uint16_t  panid;
  bool      sed; // sleepy end device
  uint32_t  poll_period;
  uint32_t  child_period;
  bool      autocommission;
} thread_config_t;

void __attribute__((weak)) thread_state_changed_callback(uint32_t flags, void * p_context)
{
    NRF_LOG_INFO("State changed! Flags: 0x%08lx Current role: %d\r\n",
                 flags, otThreadGetDeviceRole(p_context));

    if (flags & OT_CHANGED_THREAD_NETDATA)
    {
        /**
         * Whenever Thread Network Data is changed, it is necessary to check if generation of global
         * addresses is needed. This operation is performed internally by the OpenThread CLI module.
         * To lower power consumption, the examples that implement Thread Sleepy End Device role
         * don't use the OpenThread CLI module. Therefore otIp6SlaacUpdate util is used to create
         * IPv6 addresses.
         */
         otIp6SlaacUpdate(m_ot_instance, m_slaac_addresses,
                          sizeof(m_slaac_addresses) / sizeof(m_slaac_addresses[0]),
                          otIp6CreateRandomIid, NULL);
    }
}

void __attribute__((weak)) thread_init(const thread_config_t* config)
{
    otError error;

    PlatformInit(0, NULL);

    m_ot_instance = otInstanceInitSingle();
    ASSERT(m_ot_instance != NULL);


    NRF_LOG_INFO("Thread version: %s", otGetVersionString());
    NRF_LOG_INFO("Network name:   %s",
                 otThreadGetNetworkName(m_ot_instance));

    if (!otDatasetIsCommissioned(m_ot_instance) || config->autocommission)
    {
        error = otLinkSetChannel(m_ot_instance, config->channel);
        ASSERT(error == OT_ERROR_NONE);
        NRF_LOG_INFO("Thread Channel: %d", otLinkGetChannel(m_ot_instance));

        error = otLinkSetPanId(m_ot_instance, config->panid);
        ASSERT(error == OT_ERROR_NONE);
        NRF_LOG_INFO("Thread PANID: 0x%lx", (uint32_t)otLinkGetPanId(m_ot_instance));
    }

    otLinkModeConfig mode;
    memset(&mode, 0, sizeof(mode));
    if (config->sed) {
      // sleepy end device
      mode.mSecureDataRequests = true;
      mode.mRxOnWhenIdle       = false; // Join network as SED.
      otLinkSetPollPeriod(m_ot_instance, config->poll_period);
    }
    else {
      // regular end device
      mode.mRxOnWhenIdle       = true;
      mode.mSecureDataRequests = true;
      mode.mDeviceType         = true;
      mode.mNetworkData        = true;
    }

    error = otThreadSetLinkMode(m_ot_instance, mode);
    ASSERT(error == OT_ERROR_NONE);

    otThreadSetChildTimeout(m_ot_instance, config->child_period);

    error = otIp6SetEnabled(m_ot_instance, true);
    ASSERT(error == OT_ERROR_NONE);

    if (otDatasetIsCommissioned(m_ot_instance) || config->autocommission)
    {
      error = otThreadSetEnabled(m_ot_instance, true);
      ASSERT(error == OT_ERROR_NONE);

      NRF_LOG_INFO("Thread interface has been enabled.");
      NRF_LOG_INFO("802.15.4 Channel: %d", otLinkGetChannel(m_ot_instance));
      NRF_LOG_INFO("802.15.4 PAN ID:  0x%04x", otLinkGetPanId(m_ot_instance));
      NRF_LOG_INFO("rx-on-when-idle:  %s", otThreadGetLinkMode(m_ot_instance).mRxOnWhenIdle ?
          "enabled" : "disabled");
    }

    otSetStateChangedCallback(m_ot_instance, thread_state_changed_callback, m_ot_instance);
}

Jonathan Hui

Aug 8, 2018, 5:25:09 PM
to Neal Jackson, openthread-users
Thanks for raising this issue.

To reproduce, can you provide a bit more detail on your setup, including:
  1. Git commit ID (for both kw41z and nrf52840 if different)
  2. Firmware configuration / build options (for both kw41z and nrf52840)
Thanks.

--
Jonathan Hui

Neal Jackson

Aug 8, 2018, 5:49:15 PM
to openthread-users
Hi Jonathan,

1. I've lost track of the commit ID for the kw41z NCP, but the NCP version is "OPENTHREAD/20170716-00660-gdf14197c-dirty; KW41Z; Jun 19 2018 17:06:30"
The nrf52840 is using commit 41f1cc24f7235149f0ce29b4959a5db736073b90.

2. I'm using the recommended build flags for the NCP with a border router:
BORDER_AGENT=1 BORDER_ROUTER=1 COMMISSIONER=1 UDP_PROXY=1
For the nrf52840 I used:
COMMISSIONER=1 JOINER=1 COAP=1 DNS_CLIENT=1 MTD_NETDIAG=1 BORDER_ROUTER=1 MAC_FILTER=1 TMF_PROXY=1 DISABLE_SPI=1

Thanks!

Jonathan Hui

Aug 8, 2018, 6:04:56 PM
to Neal Jackson, openthread-users
Thanks for providing additional information.

When a Sleepy End Device (SED) is no longer able to communicate with its parent, it should start looking for additional parents by sending MLE Parent Request messages.

In order to reduce average power consumption in looking for a new parent, OpenThread implements an exponential backoff between each attempt.  If you enable logging, you should see logs similar to below:

[NOTE]-MLE-----: Attach attempt 8 unsuccessful, will try again in 32.154 seconds
[DEBG]-MAC-----: Idle mode: Radio sleeping

Do you see those logs?

More generally, can you provide log output for the SED?

Thanks.

--
Jonathan Hui

Neal Jackson

Aug 8, 2018, 7:55:31 PM
to openthread-users
Hi Jonathan,

I enabled logging and I am receiving the messages you mentioned. For example:
[0000171099] [INFO]-MLE-----: AttachState Announce -> Idle
[0000171174] [INFO]-MLE-----: AttachState Idle -> Start
[0000171174] [INFO]-MLE-----: Attach attempt 9 unsuccessful, will try again in 64.300 seconds
[0000171174] [DEBG]-MAC-----: Idle mode: Radio sleeping
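
The retry delays logged in this thread roughly double between attempts: about 32 seconds after attempt 8 and about 64 seconds after attempt 9. A minimal sketch of a doubling backoff that matches those two values is below. The 250 ms base is only inferred from the two log lines, and the 20-minute cap, the constant names, and `attach_backoff_ms()` are assumptions for illustration rather than OpenThread's actual implementation. The fractional part of the logged delays (0.154 s, 0.300 s) presumably comes from random jitter, which the sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, chosen so attempt 8 gives ~32 s and attempt 9 gives ~64 s. */
#define BACKOFF_BASE_MS 250u
#define BACKOFF_MAX_MS  1200000u /* hypothetical cap of 20 minutes */

/*
 * Delay in milliseconds to wait after the given unsuccessful attach attempt
 * (1-based), doubling each time and clamped to BACKOFF_MAX_MS. No jitter.
 */
static uint32_t attach_backoff_ms(uint32_t attempt)
{
    assert(attempt >= 1);

    uint32_t shift = attempt - 1;

    /* 250 << 13 already exceeds the cap; returning early avoids overflow. */
    if (shift >= 13) {
        return BACKOFF_MAX_MS;
    }

    uint32_t delay = BACKOFF_BASE_MS << shift;
    return (delay > BACKOFF_MAX_MS) ? BACKOFF_MAX_MS : delay;
}
```

With these assumed constants, `attach_backoff_ms(8)` is 32000 and `attach_backoff_ms(9)` is 64000. The radio can sleep for the whole delay between attempts, which is why the average current drops as the interval grows.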

The current draw also appears to be dropping to what I would expect in sleep. The other nodes that are part of the network don't seem to be following this behavior and remain in a high current state. They are flashed with the openthread library that doesn't include logging. It might be the case that they are using a very old version of the library, not the commit I mentioned previously. I'm going to try rebuilding the library again, but without logging and see if it fixes the issue. I'll update with results.

Thanks for the help!

Neal Jackson

Aug 8, 2018, 8:08:08 PM
to openthread-users
So yes, I must have been using a library based on an old commit. I can confirm I am seeing the expected behavior using a newly compiled library. I'll reflash all my devices.

Thanks for all the help Jonathan!