New Install of ot-br-posix

Michael Simpson

Apr 28, 2021, 10:39:04 PM
to openthread-users
Hi Jonathan

Silabs suggested I do a fresh install of OTBR to address my ongoing problem with RCP comms.

I renamed my ot-br-posix directory and then worked through the instructions from https://openthread.io/guides/border-router/build

cd ot-br-posix 
./script/bootstrap
sudo INFRA_IF_NAME=eth0 ./script/setup

My RCP is ttyACM0

I edited my /etc/default/otbr-agent file and found it set to:
OTBR_AGENT_OPTS="-I wpan0 -B eth0 spinel+hdlc+uart:///dev/ttyACM0"

I tried this and also the following line from the guide (link above), but it made no difference:
OTBR_AGENT_OPTS="-I wpan0 spinel+hdlc+uart:///dev/ttyACM0"

When I run sudo systemctl status it reports (full report attached as state.txt):
    State: degraded
     Jobs: 0 queued
   Failed: 1 units
    Since: Thu 2021-04-29 14:24:31 NZST; 2min 44s ago
   CGroup: /
...and I only see avahi-daemon.service, not otbr-agent.service or otbr-web.service

khadas@Khadas:~$ sudo systemctl --failed
  UNIT             LOAD   ACTIVE SUB    DESCRIPTION
● rc-local.service loaded failed failed /etc/rc.local Compatibility

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed.
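
A minimal shell sketch for reading a listing like the one above: it pulls out the names of the failed units and checks whether any of them belong to OTBR. The function names `failed_units` and `otbr_unit_failed` are made up for this example. Note that a unit systemd keeps restarting need not appear in this list; in this thread otbr-agent was at restart counter 97 while only rc-local.service was listed as failed.

```shell
#!/bin/sh
# Hypothetical helpers; both read `systemctl --failed` output on stdin.

# Print the names of the units whose ACTIVE column is "failed".
failed_units() {
    awk '{
        i = ($1 ~ /\.[a-z]+$/) ? 1 : 2   # skip a leading bullet if present
        if ($(i + 2) == "failed") print $i
    }'
}

# Exit status 0 if any failed unit name starts with "otbr-".
otbr_unit_failed() {
    failed_units | grep -q '^otbr-'
}
```

One way to use it is `sudo systemctl --failed | failed_units`. With the output above it prints only rc-local.service, so the degraded state on its own does not point at otbr-agent.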

When I run tail -f /var/log/syslog | grep otbr I get

Apr 29 14:33:04 localhost systemd[1]: otbr-agent.service: Scheduled restart job, restart counter is at 97.
Apr 29 14:33:04 localhost otbr-agent[6038]: Running 0.3.0-488e63969
Apr 29 14:33:04 localhost otbr-agent[6038]: Thread version: 1.2.0
Apr 29 14:33:04 localhost otbr-agent[6038]: Thread interface: wpan0
Apr 29 14:33:04 localhost otbr-agent[6038]: Backbone interface: eth0
Apr 29 14:33:04 localhost otbr-agent[6038]: [INFO]-PLAT----: RCP reset: RESET_SOFTWARE
Apr 29 14:33:04 localhost otbr-agent[6038]: [NOTE]-PLAT----: RCP API Version: 1
Apr 29 14:33:04 localhost otbr-agent[6038]: [CRIT]-PLAT----: RCP is missing required capabilities: tx-security tx-timing
Apr 29 14:33:04 localhost otbr-agent[6038]: [CRIT]-PLAT----: CheckRadioCapabilities() at ../../third_party/openthread/repo/src/lib/spinel/radio_spinel_impl.hpp:381: RadioSpinelIncompatible
Apr 29 14:33:04 localhost systemd[1]: otbr-agent.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED
Apr 29 14:33:04 localhost systemd[1]: otbr-agent.service: Failed with result 'exit-code'.
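
A small shell sketch for narrowing a log like the one above to the lines that matter: the critical otbr-agent messages and the latest restart counter. It assumes the syslog line format shown above; the function names `otbr_crit` and `otbr_restart_count` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers; both read log text on stdin.

# Print only the critical otbr-agent messages.
otbr_crit() {
    grep -F 'otbr-agent[' | grep -F '[CRIT]'
}

# Print the most recent systemd restart counter for otbr-agent.
otbr_restart_count() {
    sed -n 's/.*otbr-agent\.service: Scheduled restart job, restart counter is at \([0-9][0-9]*\)\..*/\1/p' | tail -n 1
}
```

For example, `otbr_crit < /var/log/syslog` would show the "RCP is missing required capabilities" line without the surrounding noise, and `otbr_restart_count < /var/log/syslog` shows how many times systemd has restarted the agent so far.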

Can you please help?

state.txt

Jonathan Hui

Apr 28, 2021, 10:42:03 PM
to Michael Simpson, openthread-users
It seems the RCP version is older than what the latest ot-br-posix is expecting. Can you try building a new RCP image using the latest in https://github.com/openthread/ot-efr32 ?

--
Jonathan Hui




Michael Simpson

Apr 29, 2021, 12:09:16 AM
to openthread-users
Thanks Jonathan

Things seem to be moving quite fast, which is causing sync/compatibility issues with Silabs.
I am currently using Simplicity Studio to build my RCP and REEDs and would prefer to stay with this if I can, mainly because I don't know how to build the RCP outside of Simplicity Studio.

Silabs told me to go to commit a37e299ffe7b2f444f06e1e904a27b184596ab02 or later, so for now I have just done
git checkout a37e299ffe7b2f444f06e1e904a27b184596ab02
and then followed the OTBR instructions, and everything is working again...
Except I am still getting:
Apr 29 15:48:51 localhost otbr-agent[6518]: [CRIT]-PLAT----: exit(1): ProcessRadioStateMachine line 999, radio tx timeout, Failure
Apr 29 15:48:51 localhost systemd[1]: otbr-agent.service: Main process exited, code=exited, status=1/FAILURE
Apr 29 15:48:51 localhost systemd[1]: otbr-agent.service: Failed with result 'exit-code'.
Apr 29 15:48:57 localhost systemd[1]: otbr-agent.service: Scheduled restart job, restart counter is at 2.
Apr 29 15:48:57 localhost otbr-agent[6974]: Running 0.2.0-a37e299ff
Apr 29 15:48:57 localhost otbr-agent[6974]: Thread interface wpan0
Apr 29 15:48:57 localhost otbr-agent[6974]: Backbone interface 
Apr 29 15:48:57 localhost otbr-agent[6974]: [WARN]-PLAT----: Error decoding hdlc frame: Parse
Apr 29 15:48:57 localhost otbr-agent[6974]: [INFO]-PLAT----: RCP reset: RESET_SOFTWARE
Apr 29 15:48:57 localhost otbr-agent[6974]: [INFO]-CORE----: Non-volatile: Read NetworkInfo {rloc:0x5c00, extaddr:7a3e62085c206ac5, role:Leader, mode:0x0f, version:2, keyseq:0x0, ...
Apr 29 15:48:57 localhost otbr-agent[6974]: [INFO]-CORE----: Non-volatile: ... pid:0x517b44, mlecntr:0xb237a, maccntr:0x2e24d1, mliid:93483d410609c278}
Apr 29 15:48:57 localhost otbr-agent[6974]: Set state callback: OK
Apr 29 15:48:57 localhost otbr-agent[6974]: Thread is down
Apr 29 15:48:57 localhost otbr-agent[6974]: Check if Thread is up: OK

I have spent so much time on this and am just not making any progress.

You have diagnosed the OTBR logs as "This timeout is when the RCP was requested to transmit an 802.15.4 frame, but the RCP did not signal completion of the event."

Silabs, on the other hand, are looking at it as an RCP compatibility problem with the OTBR.

I am now on the version of OTBR that Silabs suggested and still have my problem.

I note that sudo systemctl status still reports my State as degraded (see the attached state.txt for the full report):

    State: degraded
     Jobs: 0 queued
   Failed: 1 units

and sudo systemctl --failed reports:
  UNIT             LOAD   ACTIVE SUB    DESCRIPTION
● rc-local.service loaded failed failed /etc/rc.local Compatibility
LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
1 loaded units listed.

I don't know what this means or how relevant or important it is. Do you think I have some OS issue outside of OTBR and the RCP which is interfering with the serial comms? Should I go back to a fresh install of Linux and start again from scratch? This is quite a bit of work, as I have Java, my database engine, etc. to install from scratch.

Alternatively, do you think that building the latest RCP and testing with the latest OTBR would help with this specific problem? I don't know how to go about building the RCP outside of Simplicity Studio. Is there a guide which covers this?

I feel like I am "clutching at straws". It is odd that no one else has this problem except for me. I really need help to get over this hurdle.

Thanks

state.txt

Jonathan Hui

Apr 29, 2021, 12:29:11 AM
to Michael Simpson, openthread-users
There are EFR32-specific build instructions at src/README.md, which is linked from the top-level README.md.

The recent incompatibility is due to upgrading the default Thread Protocol version to 1.2 (from 1.1). This was done in preparation to support Project Connected Home over IP (CHIP).

--
Jonathan Hui


Michael Simpson

Apr 29, 2021, 1:12:10 AM
to openthread-users
Thanks for this. I will take a look.

But are you saying this latest change is not likely to help my immediate problem, or do you think moving to Thread 1.2 might help?
I would prefer to leave this until Silabs release the next SDK, which is supposed to include official support for Thread 1.2.

Do you have any further advice for me? What would you do next?
I have invested so much time and effort into this and have a working solution which performs really well most of the time, except for this regular dropout issue, which happens about 5-6 times per hour for 10 seconds or so.

Michael Simpson

May 28, 2021, 2:00:05 AM
to openthread-users
In case anyone else is following this, I finally resolved my problem.
My SBC (not a Pi) was running some host software which did not have the full login credentials for its connection. As a result it was using nearly 100% of one of the CPU's 6 cores. This was not noticeable in the operation of the overall system, but it was enough to upset the OTBR/RCP comms intermittently.
If anyone else has this problem, I would suggest using top to view the CPU usage of each individual core.
And for the doubters: yes, the OTBR does run very nicely on other SBCs, not just a Raspberry Pi.
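
The per-core check suggested above can be sketched in shell. In the procps version of top, pressing `1` toggles the per-core view; the sketch below does the same arithmetic by hand from two /proc/stat snapshots, which is easier to log or script. It assumes a Linux host, and the function names `per_core_busy` and `sample_cores` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: compare two /proc/stat snapshots and print the busy
# percentage of each core over the interval between them.
per_core_busy() {
    # $1 = earlier snapshot file, $2 = later snapshot file
    awk '
        /^cpu[0-9]+ / {
            total = 0
            # Sum user..steal; guest time is already counted in user.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6                 # idle + iowait columns
            if (FNR == NR) {               # first file: remember baseline
                t0[$1] = total; i0[$1] = idle
            } else {                       # second file: print the delta
                dt = total - t0[$1]; di = idle - i0[$1]
                if (dt > 0) printf "%s %.0f%%\n", $1, 100 * (dt - di) / dt
            }
        }
    ' "$1" "$2"
}

# Sample the live system over one second.
sample_cores() {
    a=$(mktemp) && b=$(mktemp) || return 1
    cat /proc/stat > "$a"; sleep 1; cat /proc/stat > "$b"
    per_core_busy "$a" "$b"
    rm -f "$a" "$b"
}
```

Running `sample_cores` a few times should make a single pinned core stand out even when the overall average looks low, which is the situation described above.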

Sébastien Parent-Charette

May 31, 2021, 5:52:07 PM
to openthread-users
Hi Michael,

On Friday, May 28, 2021 at 2:00:05 a.m. UTC-4 michae...@gmail.com wrote:
In case anyone else is following this, I finally resolved my problem.
My SBC (not a Pi) was running some host software which did not have the full login credentials for its connection. As a result it was using nearly 100% of one of the CPU's 6 cores. This was not noticeable in the operation of the overall system, but it was enough to upset the OTBR/RCP comms intermittently.

Thank you for the update.

If anyone else has this problem, I would suggest using top to view the CPU usage of each individual core.

That is a good idea to keep in mind for similar situations where the OTBR/RCP comms show some signs of trouble.
 
And for the doubters: yes, the OTBR does run very nicely on other SBCs, not just a Raspberry Pi.

Always good to have some confirmation. In theory, this should be a pretty straightforward process on most systems, as long as they have all the components required to run the OTBR properly and connect to your RCP. Of course, the complexity comes from the (possibly numerous) small details that may be unique to each SBC and its ecosystem.

I'm glad to learn you were able to make it work! Good luck with your project.

Sincerely,
