Thanks for the suggestions.
For now, all BRs belong to the same Thread network, so TREL would fit our use case. We're currently using the ML-EID in our application (in combination with Service ALOCs), so, if I understand correctly, using TREL would also minimize the changes we would need to make to our application.
In my first tests, I'm having some trouble getting TREL to work correctly. Here is my setup:
Router A <--TREL--> Router B <--THREAD--> Router C
In this case, Router A is a TREL-only POSIX "controller" (just for testing, IRL this will be TREL+THREAD), Router B is an intermediary TREL+THREAD POSIX router and Router C is a Nordic Thread-only lamp.
I can see OpenThread has correctly formed the network including the TREL links:
On Router A:
# ot-ctl neighbor table
| Role | RLOC16 | Age | Avg RSSI | Last RSSI |R|D|N| Extended MAC     |
+------+--------+-----+----------+-----------+-+-+-+------------------+
|   R  | 0x9400 |   2 |      -20 |       -20 |1|1|1| 2261e111773d79c8 | // Router B
Done
# ot-ctl extaddr
be0110ec20e2fa86
Done
On Router B:
# ot-ctl multiradio neighbor list
ExtAddr:be0110ec20e2fa86, RLOC16:0x4400, Radios:[TREL(255)] // Router A
ExtAddr:2203d43e64285cc1, RLOC16:0x7000, Radios:[15.4(255)] // Router C
Done
# ot-ctl extaddr
2261e111773d79c8
Done
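In case it helps to compare these tables between runs (for example before and after Child A joins), here is a rough Python sketch that parses the `multiradio neighbor list` output into a dictionary. It only assumes the line format shown above; the function name is made up for illustration, and I'm assuming the number in parentheses is the radio link preference.

```python
import re

# Matches one neighbor line as printed above, e.g.
# "ExtAddr:be0110ec20e2fa86, RLOC16:0x4400, Radios:[TREL(255)]"
NEIGHBOR_RE = re.compile(
    r"ExtAddr:(?P<extaddr>[0-9a-fA-F]{16}),\s*"
    r"RLOC16:(?P<rloc16>0x[0-9a-fA-F]{4}),\s*"
    r"Radios:\[(?P<radios>[^\]]*)\]"
)
# Matches one "name(value)" entry inside the Radios list, e.g. "15.4(255)".
RADIO_RE = re.compile(r"(?P<name>[\w.]+)\((?P<value>\d+)\)")


def parse_multiradio_neighbors(text):
    """Hypothetical helper: map each neighbor's ExtAddr to its RLOC16 and radios.

    Lines that do not look like a neighbor entry (prompts, "Done") are skipped.
    """
    neighbors = {}
    for line in text.splitlines():
        match = NEIGHBOR_RE.search(line)
        if match is None:
            continue
        radios = {
            radio.group("name"): int(radio.group("value"))
            for radio in RADIO_RE.finditer(match.group("radios"))
        }
        neighbors[match.group("extaddr").lower()] = {
            "rloc16": int(match.group("rloc16"), 16),
            "radios": radios,
        }
    return neighbors


if __name__ == "__main__":
    sample = (
        "ExtAddr:be0110ec20e2fa86, RLOC16:0x4400, Radios:[TREL(255)] // Router A\n"
        "ExtAddr:2203d43e64285cc1, RLOC16:0x7000, Radios:[15.4(255)] // Router C\n"
        "Done\n"
    )
    for extaddr, info in parse_multiradio_neighbors(sample).items():
        print(extaddr, hex(info["rloc16"]), sorted(info["radios"]))
```

Diffing two such dictionaries should make it easy to spot whether a neighbor gained or lost a radio link when the network changes.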
I'm capturing packets via Wireshark on Router A's wpan0 interface and I'm controlling the light device (Router C) by sending Confirmable CoAP messages to its ML-EID.
In this setup, everything is working correctly: the confirmable messages are received by the lamp and the corresponding ACKs are received by the controller.
However, whenever I introduce another (non-router-capable) Thread device to the network (let's call this Child A), the communication seems to break: CoAP messages to the lamp are still received, but I can't see the corresponding ACKs on the controller. The child device is also supposed to communicate with the controller over CoAP, but I can't see any of that traffic in the Wireshark capture from Router A.
When I now remove Router C (the lamp) from the network, communication with Child A starts working for a while, but after some time, it seems to stop working again and no communication is happening between Child A and Router A (over Router B).
When adding extra lamp devices instead of the Child-only device, they don't seem to cause issues.
These devices were all working correctly in a network where TREL was not configured. Is it possible there is some incompatibility or bug in the TREL implementation that could explain this behaviour?
I'll try to reproduce this behaviour using the toranj test environment and report back.