Hi all,
I'm just doing a little research into how the on-mesh prefix works with
multiple border routers.
The situation I'm considering is one where we have a mesh distributed
over a large area (think a large caravan park or marina), with a number
of border routers servicing it at strategic points.
Ordinarily, the border routers would share the same backbone (the
Thread Border Router Best Practices¹ guide assumes a shared WiFi network).
In our case, each border router will likely have its own route to the
Internet (possibly via 3G), or at least to the IPv4 subset of it, as the
distances between the nodes may make WiFi-type solutions infeasible.
Cryptography is hard to do well in a small microcontroller, and so an
attractive option is to "outsource" the bulk of the crypto work to a VPN
router. Thus one architecture would be an off-the-shelf border router
connected to an off-the-shelf 3G router that has a VPN client built in.
Using this approach, I can see two ways of achieving bi-directional
communications between the Thread nodes and "the mother ship".
1. Using NAT64 and PCP:
In this case, the Thread node asks the border router to forward ports
to the border router's off-mesh IPv4 interface. All off-mesh
communication happens over IPv4 tunnels. Party like it's 1981².
When an outbound request comes from the mesh, the Internet service
sees the external interface of the border router that routed it. If
two border routers forward a request, we'll see two instances of that
request via two distinct IPv4 addresses.
We have a challenge in knowing what IPs and port numbers a particular
node's ports got forwarded to. I'll have to look at how PCP normally
handles this.
As of my last check, I don't think PCP is part of OpenThread Border Router.
A bonus with this approach is that pretty much any off-the-shelf kit will work with
IPv4 (99% of the time³).
2. IPv6 routing:
Here, we route the entire on-mesh prefix. There are two ways we can
do it depending on how the on-mesh prefix is handled. If the prefix
is common to two or more border routers, we have a problem
identifying *WHERE* to send the reply.
In cases where the connectivity is perfect, then it may not matter,
although picking the closest border router to the originating node
would be best.
If there exists a partition though, we might send it to a border
router on another partition, thus the reply will not be heard by the
node. The presence of the partition is not known to the responding
host.
2a. Each border router has a *unique* on-mesh prefix:
In this scenario, provided that prefix does not change, we can
simply forward that prefix as-is. No need to do anything fancy.
Some smarts might be needed to have the nodes try alternate
source addresses if a request times out, but otherwise it should
JustWork™.
2b. Each border router advertises the *same* on-mesh prefix:
Now things get tricky. The only way I can see to be sure is to
have each border router use network prefix translation.
If the on-mesh prefix was fd00:aabb:ccdd:eeff::/64:
- BR1 might translate this to 2001:0db8:1111:1111::/64
- BR2 might translate this to 2001:0db8:2222:2222::/64
Thus if fd00:aabb:ccdd:eeff::9999:8888 sends an outgoing packet,
the external service will see 2001:0db8:1111:1111::9999:8888 if
the request passed through BR1, or
2001:0db8:2222:2222::9999:8888 if it passed through BR2.
Both solutions here might employ static routing or dynamic routing.
Option 2 obviously requires a VPN device that understands IPv6 and a
border router willing to "expose" its mesh network.
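
To make the translation in 2b concrete, here is a rough Python sketch. It is a plain prefix swap that keeps the interface identifier unchanged, exactly as in the example addresses above; it is not the checksum-neutral algorithm from RFC 6296 (NPTv6), which would alter some of the translated bits. The names translate_prefix, border_router_for and BR_PREFIXES are my own inventions for illustration, not part of any border router software.

```python
import ipaddress

# Example prefixes taken from scenario 2b; the table itself is hypothetical.
MESH = "fd00:aabb:ccdd:eeff::/64"
BR_PREFIXES = {
    "BR1": "2001:db8:1111:1111::/64",
    "BR2": "2001:db8:2222:2222::/64",
}


def translate_prefix(addr, inside, outside):
    """Swap the `inside` prefix of `addr` for `outside`, keeping the host bits."""
    addr = ipaddress.IPv6Address(addr)
    inside = ipaddress.IPv6Network(inside)
    outside = ipaddress.IPv6Network(outside)
    if inside.prefixlen != outside.prefixlen:
        raise ValueError("prefix lengths must match")
    if addr not in inside:
        raise ValueError(f"{addr} is not in {inside}")
    # Keep only the bits to the right of the prefix, then graft on the new prefix.
    host_bits = int(addr) & int(inside.hostmask)
    return ipaddress.IPv6Address(int(outside.network_address) | host_bits)


def border_router_for(external_addr):
    """Return the name of the border router whose prefix covers `external_addr`."""
    external_addr = ipaddress.IPv6Address(external_addr)
    for name, prefix in BR_PREFIXES.items():
        if external_addr in ipaddress.IPv6Network(prefix):
            return name
    return None


node = "fd00:aabb:ccdd:eeff::9999:8888"
for name, prefix in BR_PREFIXES.items():
    seen = translate_prefix(node, MESH, prefix)
    print(name, seen, "-> reply via", border_router_for(seen))
```

The point of the second helper is that the responding host can tell from the source address alone which border router the request came through, and so which one to send the reply to.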
The Thread documentation seems to hint at border routers somehow
allocating their own on-mesh prefixes (e.g. DHCPv6-PD, ULAs) which would
suggest they should come up with unique ones.
The OpenThread Border Router just provides a text field for the user to
specify their own.
Thus two routers can end up configured with the same on-mesh prefix,
either by accident (because by chance they "randomly" pick the same ULA,
or because of a bug in the DHCPv6 server) or deliberately (by user input).
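
To put a number on the "by chance" case, here is a small Python sketch. It assumes each border router builds its ULA with the RFC 4193 layout (fd00::/8, then a 40-bit Global ID, then a 16-bit Subnet ID) and picks the Global ID uniformly at random. I use the secrets module for the random bits rather than the hash-based procedure RFC 4193 suggests, and I have not checked how any particular border router actually picks its ULA. The function names are hypothetical.

```python
import ipaddress
import secrets

GLOBAL_ID_BITS = 40  # size of the random Global ID in an RFC 4193 ULA


def random_ula_prefix(subnet_id=0):
    """Build a /64 ULA prefix: fd + random 40-bit Global ID + 16-bit Subnet ID."""
    if not 0 <= subnet_id < 2 ** 16:
        raise ValueError("subnet_id must fit in 16 bits")
    global_id = secrets.randbits(GLOBAL_ID_BITS)
    value = (0xFD << 120) | (global_id << 80) | (subnet_id << 64)
    return ipaddress.IPv6Network((value, 64))


def collision_probability(routers):
    """Chance that at least two of `routers` random Global IDs coincide."""
    space = 2 ** GLOBAL_ID_BITS
    p_all_distinct = 1.0
    for i in range(routers):
        p_all_distinct *= (space - i) / space
    return 1.0 - p_all_distinct


print(random_ula_prefix())
print(collision_probability(2))
```

Under those assumptions an accidental clash between two routers is about one in 2^40, so in practice the deliberate (user input) and buggy-server cases are the ones that matter.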
What I want to know is, is this configuration a valid one? Should we
try to support it, or should we forbid such a configuration?
Regards,
--
Stuart Longland (aka Redhatter, VK4MSL)
I haven't lost my mind...
...it's backed up on a tape somewhere.
1. https://www.threadgroup.org/Portals/0/documents/support/ThreadBorderRouterBestPractices_2530_1.pdf
2. https://tools.ietf.org/html/rfc791 (yes, it's that old)
3. I've had configuration tools tell me that 192.168.69.0/23 or
   192.168.68.255/23 is "invalid".
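
For anyone doubting footnote 3: both addresses sit in the middle of 192.168.68.0/23, so neither is the network or the broadcast address. A quick check with Python's standard ipaddress module (the helper name is_usable_host is mine, purely for illustration):

```python
import ipaddress


def is_usable_host(cidr):
    """True if the address part of `cidr` is neither network nor broadcast."""
    iface = ipaddress.ip_interface(cidr)
    net = iface.network
    return iface.ip not in (net.network_address, net.broadcast_address)


for cidr in ("192.168.69.0/23", "192.168.68.255/23",
             "192.168.68.0/23", "192.168.69.255/23"):
    print(cidr, ipaddress.ip_interface(cidr).network, is_usable_host(cidr))
```

Only the last two (the real network and broadcast addresses of the /23) come back as unusable.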