udp / dhcpc problem with multiple interfaces

294 views
Skip to first unread message

Sebastien Lorquet

unread,
Jun 21, 2018, 7:34:55 PM6/21/18
to NuttX
I had a working setup, then something in the network happened that broke
the DHCP client.

Here is a log of what happens, this is the network debug output with
some added device addresses.

nsh> stm32_ifup: Bringing up: 0.0.0.0
stm32_ethconfig: Reset the Ethernet block
stm32_ethconfig: Initialize the PHY
stm32_phy_boardinitialize: called (intf=0)
stm32_phy_boardinitialize: PHY reset...
stm32_phy_boardinitialize: PHY reset done.
stm32_phyinit: PHYSR[30]: 0116
stm32_phyinit: Duplex: FULL Speed: 100 MBps
stm32_ethconfig: Initialize the MAC and DMA
stm32_ethconfig: Enable normal operation
stm32_macaddress: eth0 MAC: d8:80:39:9c:0c:0d
netdev_ifup: ifup method success, dev for d_flags = 20000ff8
Interface is going UP
netdev_ifr_ioctl: cmd: 1812
dhcpc_open: MAC: d8:80:39:9c:0c:0d
Starting DHCP request
netdev_ifr_ioctl: cmd: 1793 (this is GETIP)
netdev_ifr_ioctl: cmd: 1794 (this is SETIP)
dhcpc_request: Broadcast DISCOVER
psock_udp_sendto: WARNING: device (200040e4) is DOWN
psock_sendto: ERROR:  Family-specific send failed: -118
DHCP request failed: -1 errno 118

Basically the interface goes UP (via a simple netmonitor daemon) but the
logic in net/udp/udp_psock_sendto_unbuffered.c finds it DOWN

And the reason it finds it down is that the interface pointer where the
flags are checked is not the same...

Because now I have a TUN interface, which is just created BEFORE dhcp is
negotiated. This interface is still DOWN.

What happens is that in the DHCP client, even if we pass an interface
name, does not use this interface name to select the actual interface on
which the request is sent.

So, having enabled TUN (and created a tunnel, but yet uninitialized),
the DHCP request is sent to the TUN interface (which is down) and not to
the Ethernet interface (which is UP) - just because there is no
interface selection.

How can we force an UDP socket to bind on a specific interface (by MAC
address?), when no IP has been set up yet, and the two interfaces have a
INADDR_ANY address?


I had a look at the standard ISC DHCP source code for linux and it seems
that at least one of the options to do that is to implement something
called SO_BINDTODEVICE that allows to bind a UDP socket to a specific
ethernet interface without resolving the device using an IP address

See here:
https://source.isc.org/cgi-bin/gitweb.cgi?p=dhcp.git;a=blob;f=common/socket.c;h=483eb9c3bd53260a5c27f0849f8bc1c148e65777;hb=HEAD

Line 265

Also documented here https://linux.die.net/man/7/socket


What do you think of this? Maybe I'm missing something simpler? Or
another solution?

Thanks

Sebastien

patacongo

unread,
Jun 21, 2018, 8:07:11 PM6/21/18
to NuttX
Hi, Sebastien,

I had a working setup, then something in the network happened that broke
the DHCP client.  ...

Basically the interface goes UP (via a simple netmonitor daemon) but the
logic in net/udp/udp_psock_sendto_unbuffered.c finds it DOWN

And the reason it finds it down is that the interface pointer where the
flags are checked is not the same...

Because now I have a TUN interface, which is just created BEFORE dhcp is
negotiated. This interface is still DOWN.

What happens is that in the DHCP client, even if we pass an interface
name, does not use this interface name to select the actual interface on
which the request is sent.

Currently the up-ness and down-ness of an interface is controlled via logic in nuttx/net/netdev_ioctl.c, specifically by"

 910       case SIOCSIFFLAGS:  /* Sets the interface flags */
 
 it gets the device like:
 
 914           dev = netdev_ifr_dev(req);

Where netdev_ifr_dev() is in the same file.  The request, req, contains the device name string as the first parameter and it simply does a lookup for the device with the matching name:

 652       return netdev_findbyname(req->ifr_name);

I can't imagine how that would find the an interface where the names doe not match.  netdev_findbyname() just basically searches through the devices for the matching name.  The logic would have to be broken to get the wrong device.  You would have to tell me.

The interface is then taken down or brought up for that device based on flags specified in that request:

 917               if ((req->ifr_flags & IFF_UP) != 0)
 918                 {
 921                   netdev_ifup(dev);
 922                 }
 926               else if ((req->ifr_flags & IFF_DOWN) != 0)
 927                 {
 930                   netdev_ifdown(dev);
 931                 }


Other than some ICMPv6 auto-configuration logic, that is the only way that the interface goes up (it can, of course, go down for other reasons).

I don't know what the "simple netmonitor daemon" you are talking about is.  Is this the NSH netmonitor of apps/nshlib/nsh_netinit.c?  The NSH network will handle only one network device and it has a fixed NET_DEVNAME that it uses.  You can see how it selects this single device name.  "eth0"  should have priority over "tun0" so there would be a problem if you did want to use the TUN device, but it looks like it should handle the Ethernet device okay.

The actual deed is done by apps/netutils/netlib/netlib_setifstatus.c.

I don't understand how the DHCPC client has anything to do with this.  I don't think that apps/netutils/dhcpc changes that interface flags.
 
So, having enabled TUN (and created a tunnel, but yet uninitialized),
the DHCP request is sent to the TUN interface (which is down) and not to
the Ethernet interface (which is UP) - just because there is no
interface selection.

How can we force an UDP socket to bind on a specific interface (by MAC
address?), when no IP has been set up yet, and the two interfaces have a
INADDR_ANY address?

Sockets are are bound to addresses which (we hope) correspond to devices, but not to devices.  Doing the device lookup by address uses basically the same logic as doing the device lookup by device name, just using a different match key.

The type of socket or the properties of the socket are irrelevant for selecting the device to bring up or down.  Any socket will work.  The socket is only your ticket into the network.  It does not have to be connected to a device (== subnet).  It is the device name in the request that determines which device will be controlled.

Unless, of course, if something is broken.  I think we would have known such a bug by now, but nothing surprises me anymore.


I had a look at the standard ISC DHCP source code for linux and it seems
that at least one of the options to do that is to implement something
called SO_BINDTODEVICE that allows to bind a UDP socket to a specific
ethernet interface without resolving the device using an IP address

See here:
https://source.isc.org/cgi-bin/gitweb.cgi?p=dhcp.git;a=blob;f=common/socket.c;h=483eb9c3bd53260a5c27f0849f8bc1c148e65777;hb=HEAD

Line 265

Also documented here https://linux.die.net/man/7/socket

What do you think of this? Maybe I'm missing something simpler? Or
another solution?

I can't imagine why it would be necessary.  I think the first step would be to understand why the current device name is not resolving to the correct device in your case (it does in other cases).

Greg

patacongo

unread,
Jun 21, 2018, 8:13:46 PM6/21/18
to NuttX

 it gets the device like:
 
 914           dev = netdev_ifr_dev(req);

Where netdev_ifr_dev() is in the same file.  The request, req, contains the device name string as the first parameter and it simply does a lookup for the device with the matching name:

 652       return netdev_findbyname(req->ifr_name);

I can't imagine how that would find the an interface where the names doe not match.  netdev_findbyname() just basically searches through the devices for the matching name.  The logic would have to be broken to get the wrong device.  You would have to tell me.

netdev_findbyname() is mindlessly simple.  It just searches through the list an finds a device that has exactly the same name as is in the request (simplified):

 73 FAR struct net_driver_s *netdev_findbyname(FAR const char *ifname)
 74 {
 80       for (dev = g_netdevices; dev; dev = dev->flink)
 81         {
 82           if (strcmp(ifname, dev->d_ifname) == 0)
 83             {
 85               return dev;
 86             }
 87         }
 90     }
 91
 92   return NULL;
 93 }

So the only way that the device lookup can succeed if is dev->d_fname is exactly the same is ifr_name in the request.  So the first question is how can a request for ifr_name="eth0" match a device with d_name="tun0".  If you understand that, then the issue is essentially solved.

Greg

Sebastien Lorquet

unread,
Jun 21, 2018, 8:34:25 PM6/21/18
to nuttx

Hello

the netmonitor is a copy of the nsh net monitor. I am asking it to supervise eth0, which is the STM32 MAC.

https://github.com/f4grx/hn70ap/blob/master/apps/sysdaemon/sysdaemon_netmonitor.c

(this is also the file that calls DHCP whenever the link is up)


However, the order of initialization of my app happens to create a tun interface (I have named it uhf%d) BEFORE the DHCP request is sent after the link becomes ready.

Note that disabling TUN fixes the DHCP issue, so there must be some influence from having a second interface...


To send packets, the DHCP Client (apps/netutils/dhcpc/dhcpc.c) creates a standard UDP socket.

To send the request, the DHCP client does this:

      pdhcpc->sockfd = socket(PF_INET, SOCK_DGRAM, 0);

      addr.sin_family      = AF_INET;
      addr.sin_port        = HTONS(DHCPC_CLIENT_PORT);
      addr.sin_addr.s_addr = INADDR_ANY;

      ret = bind(pdhcpc->sockfd, (struct sockaddr*)&addr, sizeof(struct sockaddr_in));

The UDP socket is bound to INADDR_ANY

There is no way to tell I want to send on eth0 or on tun0

We just tell to bind on 0.0.0.0 ... But at that point, this address is used BOTH by eth0 and tun0, so there is no way to select the proper interface.

We would need something like 'bind_to_interface' (by MAC or name) instead of bind() by IP, which is ambiguous.

A bit later in dhcpc_request() we just set the selected interface to INADDR_ANY but it is of no use. We should bind to the required interface instead.

netdev_findbyname() will obviously be used within the IP stack, but we need a way to use this from userspace.


It is probably easy to test, by running DHCP on any system with two network interfaces with no IP address set at all (for example, an atmel SAM board with two MACs enabled?). In that situation the DHCP client is unable to select an interface to send UDP packets.

Sebastien
--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sebastien Lorquet

unread,
Jun 21, 2018, 8:38:35 PM6/21/18
to NuttX
Hello again,

On 22/06/2018 02:07, patacongo wrote:
> Sockets are are bound to addresses which (we hope) correspond to
> devices, but not to devices.  Doing the device lookup by address uses
> basically the same logic as doing the device lookup by device name,
> just using a different match key.
That is the problem: when two devices have the same IP address (here
0.0.0.0), the key is not unique and the intended interface cannot be
selected anymore! (the first sentence probably has a typo)

Note: NSH is not the user entry point on my board; NSH net
initialization is not executed.

Sebastien

patacongo

unread,
Jun 21, 2018, 8:40:05 PM6/21/18
to NuttX
When the STM32 Ethernet is brought up, you always see this message:

nsh> stm32_ifup: Bringing up: 0.0.0.0

But you do not when you bring the interface later.

Interface is going UP
netdev_ifr_ioctl: cmd: 1812

So certainly stm32_ifup() was never called.  That can be normal if the network thinks that the device is already up but other wise the call should occur.

Wait ... 1812 is not the ioctl command to set the IFF flags!  it is:

#define SIOCGIFHWADDR    _SIOC(0x0014)  /* Get hardware address */

The command to set the IFF flags is:

#define SIOCSIFFLAGS     _SIOC(0x001a)  /* Sets the interface flags */

Which has the value 1818

patacongo

unread,
Jun 21, 2018, 8:52:44 PM6/21/18
to NuttX


The UDP socket is bound to INADDR_ANY

There is no way to tell I want to send on eth0 or on tun0

We just tell to bind on 0.0.0.0 ... But at that point, this address is used BOTH by eth0 and tun0, so there is no way to select the proper interface.

We would need something like 'bind_to_interface' (by MAC or name) instead of bind() by IP, which is ambiguous.

You don't need to bind the socket at all.  That is irrelevant.  you just need to do socket() -> ioctl() -> close() as is done in apps/netutils/netlib/netlib_setifstatus.c.  The correct IOCTL number and the device name are all that are needed.

I actually did not see anyplace in your debug output where you sent the SIOCSIFFLAGS IOCTL command (see my previous message).  My current belief is the network did not go up because you did not send the IOCTL command (or it is missing from the debug output).  But of course, I can't know that with any certainty.


A bit later in dhcpc_request() we just set the selected interface to INADDR_ANY but it is of no use. We should bind to the required interface instead.

netdev_findbyname() will obviously be used within the IP stack, but we need a way to use this from userspace.

Somethinig similar is available in user space.  See apps/netutils/netlib/netlib_ipv4adaptor.c.  It doesn't deal with device structures, but if you provide a destination IP address on the subnet, it will return the IP address assigned to the network device.  It can even handle destination addresses that are a few hops away (provided that the information in the routing table).

Greg

patacongo

unread,
Jun 21, 2018, 8:55:27 PM6/21/18
to NuttX

That is the problem: when two devices have the same IP address (here
0.0.0.0), the key is not unique and the intended interface cannot be
selected anymore! (the first sentence probably has a typo)


NOOoo..  that is not the problem.  The IP address is not the key in the lookup.  The device name is the key in the lookup and that was uniquely assigned when the network device was registered.

Sebastien Lorquet

unread,
Jun 22, 2018, 2:08:22 AM6/22/18
to nuttx

more logs and instrumentation to make it clear that IFUP is called successfully for eth0 but DHCP tries to send data to tun0

netdev_register: Registered MAC: 00:00:00:00:00:00 as dev: tun0
Started tun RX thread
Mass Storage mounted at /data
Mounted /proc
Set MAC: D8:80:39:9C:0C:0D
netdev_ifr_ioctl: cmd: 1813
Entry
netdev_ifr_ioctl: cmd: 1827
arch_phy_irq: Attach PHY IRQ
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
Bringing the link up
netdev_ifr_ioctl: cmd:*** Launching nsh
 
1NuttShell (NSH)
8nsh> 18
netdev_ifr_ioctl: Called SIOCSIFFLAGS with pointer 20000ff8, flags 00000002


stm32_ifup: Bringing up: 0.0.0.0
stm32_ethconfig: Reset the Ethernet block
stm32_ethconfig: Initialize the PHY
stm32_phy_boardinitialize: called (intf=0)
stm32_phy_boardinitialize: PHY reset...
stm32_phy_boardinitialize: PHY reset done.
stm32_phyinit: PHYSR[30]: 0116
stm32_phyinit: Duplex: FULL Speed: 100 MBps
stm32_ethconfig: Initialize the MAC and DMA
stm32_ethconfig: Enable normal operation
stm32_macaddress: eth0 MAC: d8:80:39:9c:0c:0d
netdev_ifup: ifup method success, dev for d_flags = 20000ff8
Interface is going UP
netdev_ifr_ioctl: cmd: 1812
dhcpc_open: MAC: d8:80:39:9c:0c:0d
Starting DHCP request
netdev_ifr_ioctl: cmd: 1793

netdev_ifr_ioctl: cmd: 1794


dhcpc_request: Broadcast DISCOVER
psock_udp_sendto: WARNING: device (200040e4) is DOWN
psock_sendto: ERROR:  Family-specific send failed: -118
DHCP request failed: -1 errno 118

netdev_ifr_ioctl: cmd: 1827
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829

the dev pointer for eth0 is 20000ff8, netdev_ifup is called and successful


but after DHCP_request we see that psock_udp_sendto FAILS because it has determined that the device for 0.0.0.0 is at pointer 0x200040e4, which is TUN0

This happens even if tun0 is DOWN and eth0 is UP


The reason for this is that in udp_psock_send_unbuffered, the DEV that is used for testing interface flags has been selected a few lines before using udp_find_raddr_device(conn). I have instrumented this function (in udp_finddev) and this is what happens:

udp_find_raddr_device ->udp_find_ipv4_device -> netdev_findby_ipv4addr

So yes there is a search by IP and not by device name.

This then goes to net/netdev/netdev_findbyaddr.c

I have added more instrumentation here.

This is the output again:

Set MAC: D8:80:39:9C:0C:0D
netdev_ifr_ioctl: cmd: 1813
Entry
netdev_ifr_ioctl: cmd: 1827
arch_phy_irq: Attach PHY IRQ
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829
Bringing the link up
*** Launching nsh

NuttShell (NSH)
nsh> netdev_ifr_ioctl: cmd: 1818
netdev_ifr_ioctl: Called SIOCSIFFLAGS with pointer 20000ff8, flags 00000002


stm32_ifup: Bringing up: 0.0.0.0
stm32_ethconfig: Reset the Ethernet block
stm32_ethconfig: Initialize the PHY
stm32_phy_boardinitialize: called (intf=0)
stm32_phy_boardinitialize: PHY reset...
stm32_phy_boardinitialize: PHY reset done.

stm32_phyinit: PHYSR[30]: 0136


stm32_phyinit: Duplex: FULL Speed: 100 MBps
stm32_ethconfig: Initialize the MAC and DMA
stm32_ethconfig: Enable normal operation
stm32_macaddress: eth0 MAC: d8:80:39:9c:0c:0d
netdev_ifup: ifup method success, dev for d_flags = 20000ff8
Interface is going UP
netdev_ifr_ioctl: cmd: 1812
dhcpc_open: MAC: d8:80:39:9c:0c:0d
Starting DHCP request
netdev_ifr_ioctl: cmd: 1793

netdev_ifr_ioctl: cmd: 1794
dhcpc_request: Broadcast DISCOVER
udp_find_raddr_device: called for PF_INET with conn->u.ipv4.raddr=FFFFFFFF
netdev_findby_ipv4addr: called
netdev_findby_ipv4addr: return g_netdevices because rip=BROADCAST and lip=ANY


psock_udp_sendto: WARNING: device (200040e4) is DOWN

psock_sendto: ERROR:  Family-specific send failed: -118
DHCP request failed: -1 errno 118

netdev_ifr_ioctl: cmd: 1827
netdev_ifr_ioctl: cmd: 1819
netdev_ifr_ioctl: cmd: 1828
netdev_ifr_ioctl: cmd: 1829

So: We are in the situation where RIP=broadcast and LIP=ANY

This goes to a branch of the code where I can read:

  /* First, check if this is the broadcast IP address */

  if (net_ipv4addr_cmp(ripaddr, INADDR_BROADCAST))
    {
      /* Yes.. Check the local, bound address.  Is it INADDR_ANY? */

      if (net_ipv4addr_cmp(lipaddr, INADDR_ANY))
        {
ninfo("return g_netdevices because rip=BROADCAST and lip=ANY\n"); <-- this is my added instrumentation
          /* Yes.. In this case, I think we are supposed to send the
           * broadcast packet out ALL locally available networks.  I am not
           * sure of that and, in any event, there is nothing we can do
           * about that here.
           *
           * REVISIT:  For now, arbitrarily return the first network
           * interface in the list of network devices.  The broadcast
           * will be sent on that device only.
           */

          return g_netdevices;
        }
      else
        {
          /* Return the device associated with the local address */
ninfo("return based on lipaddr\n");
          return netdev_finddevice_ipv4addr(lipaddr);
        }
    }

Admittedly this is a bit different: I thought the problem was the duplicated local address, but in fact, DHCP requires sending a BROADCAST to eth0 but in that situation you just return g_netdevices which is whatever happens to be the last registered interface, here, tun0.

Sending on ALL interfaces does not look like a good option, and an arbitrary device is what happens if we dont bind a UDP socket to a particular interface. So again: How do we make sure that the choice is not arbitrary, but is sent to the interface passed to the dhcp client?

I insist that this is a DHCP specific issue because we want to send broadcast on a controlled interface with no bound source address. All other broadcast do not have this problem since the correct device can be selected via its correctly defined source address.

Sebastien

Sebastien Lorquet

unread,
Jun 22, 2018, 4:08:56 AM6/22/18
to nuttx
The situation is also mentioned here:
https://stackoverflow.com/questions/683624/udp-broadcast-on-all-interfaces


I have attached a patch that does two things:

-reformat many comments to fit within 80 columns (you may like it independently)

-add a value for SO_BINDTODEVICE


Sebastien
> *udp_find_raddr_device: called for PF_INET with conn->u.ipv4.raddr=FFFFFFFF**
> **netdev_findby_ipv4addr: called**
> **netdev_findby_ipv4addr: return g_netdevices because rip=BROADCAST and lip=ANY**
> **psock_udp_sendto: WARNING: device (200040e4) is DOWN**
> *psock_sendto: ERROR:  Family-specific send failed: -118
> DHCP request failed: -1 errno 118
> netdev_ifr_ioctl: cmd: 1827
> netdev_ifr_ioctl: cmd: 1819
> netdev_ifr_ioctl: cmd: 1828
> netdev_ifr_ioctl: cmd: 1829
>
> So: We are in the situation where RIP=broadcast and LIP=ANY
>
> This goes to a branch of the code where I can read:
>
>   /* First, check if this is the broadcast IP address */
>
>   if (net_ipv4addr_cmp(ripaddr, INADDR_BROADCAST))
>     {
>       /* Yes.. Check the local, bound address.  Is it INADDR_ANY? */
>
>       if (net_ipv4addr_cmp(lipaddr, INADDR_ANY))
>         {
> ninfo("return g_netdevices because rip=BROADCAST and lip=ANY\n"); <-- this is
> my added instrumentation
> *          /* Yes.. In this case, I think we are supposed to send the
>            * broadcast packet out ALL locally available networks.  I am not
>            * sure of that and, in any event, there is nothing we can do
>            * about that here.
>            *
>            * REVISIT:  For now, arbitrarily return the first network
>            * interface in the list of network devices.  The broadcast
>            * will be sent on that device only.
>            */
> *
>           return g_netdevices;
>         }
>       else
>         {
>           /* Return the device associated with the local address */
> ninfo("return based on lipaddr\n");
>           return netdev_finddevice_ipv4addr(lipaddr);
>         }
>     }
>
> Admittedly this is a bit different: I thought the problem was the duplicated
> local address, but in fact, DHCP requires sending a BROADCAST to eth0 but in
> that situation you just return g_netdevices which is whatever happens to be
> the last registered interface, here, tun0.
>
> Sending on ALL interfaces does not look like a good option, and an arbitrary
> device is what happens if we dont bind a UDP socket to a particular interface.
> So again: How do we make sure that the choice is not arbitrary, but is sent to
> the interface passed to the dhcp client?
>
> I insist that *this is a DHCP specific issue because we want to send broadcast
> on a controlled interface with no bound source address*. All other broadcast
>> <mailto:nuttx+un...@googlegroups.com>.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google Groups
> "NuttX" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to nuttx+un...@googlegroups.com
> <mailto:nuttx+un...@googlegroups.com>.
0085_SO_comments_plus_bindto.patch

Sebastien Lorquet

unread,
Jun 22, 2018, 5:08:11 AM6/22/18
to nuttx
Here is a new patch that amends the previous one and attempts a naive
implementation of SO_BINDTODEVICE for UDP.

it builds, but I do not have my board with me at the moment, so I cannot test yet.

So, please consider that this patch is mainly for discussion. It is not tested
and I do not consider it ready for inclusion, unless you confirm that this is
the correct way of doing it.


I am a network stack newbie so I may have missed some points where the device
has to be selected, eg in the receive() side.

I also have no idea how this affects ARP.

I would like your help on these points.


I have only implemented this for UDP since it does only make sense for
broadcasts, and TCP has no way to behave correctly when using broadcasts.

Maybe it could be implemented here to be useful for all protocols, but I am not
really convinced...

https://bitbucket.org/nuttx/nuttx/src/82ff00fabf1e93cdc5ad0dbabb2c83320aa13768/net/netdev/netdev_findbyaddr.c?at=master&fileviewer=file-view-default#netdev_findbyaddr.c-211


Sebastien
0086_sobindtodevice_tentative.patch

patacongo

unread,
Jun 22, 2018, 10:22:45 AM6/22/18
to NuttX

more logs and instrumentation to make it clear that IFUP is called successfully for eth0 but DHCP tries to send data to tun0


 Okay that makes a lot more sense.  So we are not talking about IFUP or IFDOWN anymore.  We are talking about how UDP broadcast packets are sent.  That is a much more interesting topic.

I am not enthusiastic SO_BINDTODEVICE.  That would let you pick one device when there are many Ethernet options available.  Its only real function is to bypass the routing table.  It has impacts throughout the network in all protocols and that is why I lack enthusiasm.  The scope of the change would be for all sending logic for all protocols.  That is not going to happen in the near future.

Here is a simple alternative:  If the address is a broadcast (or multicast) address, then use the locally bound source address (laddr) instead of the remote, destination address (raddr).  Try this and let me know if I screwed it up.  I will do a little more test later today too:

commit 1b6990b69f974b9ef299c8707713cef80b944374
Author: Gregory Nutt <gn...@nuttx.org>
Date:   Fri Jun 22 08:19:17 2018 -0600

    net/udp:  When sending a broadcast (or multicast) packet do not attempt to look up the device by the destination IP address.  Rather, use the locally bound address for these cases to select the correct network device.

Of course you do have to bind the socket to a valid local IP address before you that will work.

Greg

Sebastien Lorquet

unread,
Jun 22, 2018, 10:59:01 AM6/22/18
to NuttX
Hello,

I cannot do that, since this is about a DHCP client!

At this point I have no valid IP, neither on the tun (just created) nor on the
ethernet.

So both interfaces have a 0.0.0.0 local address.

This is really a specific situation.


We could turn that SO_ option to a UDP specific thing. This is the only case
where this shit (sorry) has to happen. Other protocols will probably deal with
initialized local addresses, and that's no problem.


The problem only happen when BOTH conditions are true:

-Destination is INADDR_BROADCAST;

-Local address is INADDR_ANY.


Sebastien
> --
> You received this message because you are subscribed to the Google Groups
> "NuttX" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to nuttx+un...@googlegroups.com
> <mailto:nuttx+un...@googlegroups.com>.

patacongo

unread,
Jun 22, 2018, 11:20:32 AM6/22/18
to NuttX

I cannot do that, since this is about a DHCP client!

At this point I have no valid IP, neither on the tun (just created) nor on the
ethernet.

So both interfaces have a 0.0.0.0 local address.

This is really a specific situation.

We could turn that SO_ option to a UDP specific thing. This is the only case
where this shit (sorry) has to happen. Other protocols will probably deal with
initialized local addresses, and that's no problem.

The problem only happen when BOTH conditions are true:

-Destination is INADDR_BROADCAST;

-Local address is INADDR_ANY.


A quick and dirty solution would be to change the order or registration so that the Ethernet device is encountered first.  You would probably have to set the LATEINIT option to change that order.

Another option would be to skip over non-Ethernet, DOWN drivers in the search.  That would allow you to support a single Ethernet DHCP which, I think, is all that is possible now anyway.

Another part of my opposition to SO_BINDTODEVICE is that it is non-standard and non-portable socket option:  http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html.   It would be the first or only such feature, however.  There is a lot of non-portable Linux-like stuff in there now.  I would not accept an incomplete UDP only feature.  The implementation would have to complete and as-close-to-perfect-as-reasonable or not at all.  In am not going to be rushed into a bad implementation.

I am looking into some options. I will get back to you a little later.

Greg

Sebastien Lorquet

unread,
Jun 22, 2018, 11:42:57 AM6/22/18
to NuttX
Hello again,

Changing the init order: In my case, at boot time, this means waiting for the
end of DHCP negotiation to start the tunnels. I can do that as a temporary
workaround, but this is not really future proof...

Also, DHCP is renegotiated every time the ethernet wire is plugged in the
board... and I cant stop the tunnels every time the ethernet is disconnected,
then proceed to DHCP negotiation and wait this to restart the tunnels...

Tunnels are really independent of the ethernet link. They all have their own
life cycles.


Skipping DOWN interfaces would work in my boot-time use case, but not in the
unplug/replug use case, since at that time the tunnel will still be up and running.

So you can't really assume that all complex network setups will only have one UP
interface when running DHCP negotiations...


I understand that you dont want a UDP only feature, even if I wonder what would
be the use of this feature in other protocols.

Would that be possible to insert the "interface selection bypass" directly in
netdev_findbyaddr(), if the socket calling this is bound to a device? This way,
you could get support for all protocols 'at once'.

I would be happy to help writing this code, or modify all places that require a
change.


I will wait for your research and decisions. Thank you for taking time to look
into this.


Sebastien


PS: There is another ugly workaround: Affect a random IP before DHCP negotiation
and use that to bind to the correct interface. But I really dont like that!
Please don't ask me to do that :)

patacongo

unread,
Jun 22, 2018, 12:38:19 PM6/22/18
to NuttX

Changing the init order: In my case, at boot time, this means waiting for the
end of DHCP negotiation to start the tunnels. I can do that as a temporary
workaround, but this is not really future proof...

I was thinking of changing the order or registration at boot time.  There is this logic in up_initialize.c.  First this initializes the STM32 driver:

270 #ifndef CONFIG_NETDEV_LATEINIT
271   /* Initialize the network */
272
273   up_netinitialize();
274 #endif

Then this initializes the TUN driver putting it at the head of the list:

282 #ifdef CONFIG_NET_TUN
283   /* Initialize the TUN device */
284
285   (void)tun_initialize();
286 #endif


up_initialize() is called before your board initialization code.   So if you enable CONFIG_NETDEV_LATEINIT then call up_netinitialize() in your board bringup up logic, then the Ethernet driver would be at the head of the list.


Also, DHCP is renegotiated every time the ethernet wire is plugged in the
board... and I cant stop the tunnels every time the ethernet is disconnected,
then proceed to DHCP negotiation and wait this to restart the tunnels...

The device will not lose its IP address when the cable is disconnected (at least I don't think so).  It has a valid IP address after the first time and so you can use the bound socket.  The logic that I just commit will support that.

The only possibility of a problem is if the lease on the IP address has expired while the cable was disconnect and the DHCP server has given that IP address to another MAC.  Most DHCP servers, however, avoid reassigning addresses quickly.

I understand that you dont want a UDP only feature, even if I wonder what would
be the use of this feature in other protocols.

There are also protocol-specific socket options which could be used that would effect only one protocol.  All of the  SOL_SOCKET socket options have global effect on all socket protocols.  But SOL_UDP would effect only UDP sockets.  The socket option could then be like UDP_BINDTODEVICE.  UDP socket options would be defined in include/netinet/udp.h which does not yet exist.

I will look around to see if there is any similar UDP socket options around.


Would that be possible to insert the "interface selection bypass" directly in
netdev_findbyaddr(), if the socket calling this is bound to a device? This way,
you could get support for all protocols 'at once'.

But you bind the device so what good does that do you?

PS: There is another ugly workaround: Affect a random IP before DHCP negotiation
and use that to bind to the correct interface. But I really dont like that!
Please don't ask me to do that :)

No and I that would have the possibility of putting bad IP addresses on the network.  Might be work looking into IPv4 link-local addresses for such a case.

I haven't come up with any other good solutions.

Greg

patacongo

unread,
Jun 22, 2018, 1:07:03 PM6/22/18
to NuttX

I understand that you dont want a UDP only feature, even if I wonder what would
be the use of this feature in other protocols.

There are also protocol-specific socket options which could be used that would effect only one protocol.  All of the  SOL_SOCKET socket options have global effect on all socket protocols.  But SOL_UDP would effect only UDP sockets.  The socket option could then be like UDP_BINDTODEVICE.  UDP socket options would be defined in include/netinet/udp.h which does not yet exist.

I will look around to see if there is any similar UDP socket options around.

can't find any discussion of UDP socket options on OpenGroup.org.  I doubt that there is any specification.



There are no commong options in either (one is functionally similar but with a different name).  But you can at least see the file and UDP socket option naming conventions.  The options names should begin with UDP (not SO_) and should begin at:

166 /* Protocol-level socket options may begin with this value */
167
168 #define __SO_PROTOCOL  16
 
Greg

Sebastien Lorquet

unread,
Jun 22, 2018, 2:26:14 PM6/22/18
to NuttX

Meanwhile please find attached a patch that reformat some comments that were longer than 80 chars per line.

Sebastien

--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+un...@googlegroups.com.
0085_socket_comments.patch

patacongo

unread,
Jun 22, 2018, 6:54:20 PM6/22/18
to NuttX


Would that be possible to insert the "interface selection bypass" directly in
netdev_findbyaddr(), if the socket calling this is bound to a device? This way,
you could get support for all protocols 'at once'.

Actually, the more I think about it, I believe that would be incorrect.  Please don't waste your time.

Binding on a UDP socket is much "weaker" thing than binding a TCP address.  Binding to a UDP socket does not have anything to do with the device that you can send UDP packets on.  That is controlled only by the address provided to sendto().  Even its address is bound to a device, it can still send outgoing UDP packets on any device.  Nothing prevents.

The local IP address does matter when it the the destination address of an incoming IP packet.  That controls the routing of the packet.  But still, that does not depend on any particular devices.  We currently support IP forwarding so that you can receive an packet with a matching destination address from a different device.  I think that can only happen with "route over' routing.  I would have to think more clearly about that.

But such a change would be flat wrong in many cases and probably wrong in certain other cases.  So let's not do that.

Greg


Sebastien Lorquet

unread,
Jun 23, 2018, 3:55:20 AM6/23/18
to NuttX

I agree with you, that's not the correct place. So what can we do? Implement the thing in the sendto() methods of protocols?

First we need a list of required changes.

It looks like it's not so complex. The presence of the option on a socket just says that the interface is predefined instead of being chosen via the normal way.

My opinion is that when the user defines this option on a socket, he becomes accountable for the new behaviour, even if things dont work as intended.

This option only makes real sense for udp broadcasts with no source IP. In all other cases it is irrelevant since the local or remote IP is enough to choose the interface, and normal routing happens.


Do you have any other idea of required changes?

Sebastien

--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+un...@googlegroups.com.

Sebastien Lorquet

unread,
Jun 25, 2018, 3:41:33 AM6/25/18
to NuttX
Greg

Can we agree on a way forward here? This is blocking my development at this
point (and an opportunity to showcase a working NuttX board at Nuit du Hack, a
french security/hacking convention).

At least can we start a branch and define this option in headers and setsockopt,
so we can start implementing it in all protocols?

I'm ready to follow whaterver path pleases you, but I need DHCP to work
correctly in this setup. The workarounds are not really practical. I can poke
code all around to make it work, but it's better to things correctly and officially.

Thanks.

Sebastien
>> <mailto:nuttx+un...@googlegroups.com>.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google Groups
> "NuttX" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to nuttx+un...@googlegroups.com
> <mailto:nuttx+un...@googlegroups.com>.

patacongo

unread,
Jun 25, 2018, 9:20:24 AM6/25/18
to NuttX

Can we agree on a way forward here? This is blocking my development at this
point (and an opportunity to showcase a working NuttX board at Nuit du Hack, a
french security/hacking convention).

At least can we start a branch and define this option in headers and setsockopt,
so we can start implementing it in all protocols?

I'm ready to follow whaterver path pleases you, but I need DHCP to work
correctly in this setup. The workarounds are not really practical. I can poke
code all around to make it work, but it's better to things correctly and officially.

I though we already had agreement.  But perhaps not.  I though we agreed that we would not implement this on all protocols, only UDP where it is needed.

And I thought we agreed to use UDP protocol specific socket options (SOL_UDP vs SOL_SOCK) with socket options defined in include/netinet/udp.h and with naming prefix UDP_ (vs. SO_) as is standard for protocol-specific socket options.

So to select the option you would do something like:

  ret = setsockopt(socfd, SOL_UDP, UDP_BINDTODEVICE, &something, sizeof(something));

You can use the TCP_KEEPALIVE protocol option as a model.

I did create a branch:  https://bitbucket.org/nuttx/nuttx/branch/bindtodev

Greg

patacongo

unread,
Jun 25, 2018, 9:29:30 AM6/25/18
to NuttX
Minor correction:


You can use the TCP_KEEPALIVE protocol option as a model.

Sorry, there is no TCP_KEEPALIVE. SO_KEEPALIVE is a global option required by OpenGroups.org: http://pubs.opengroup.org/onlinepubs/009696699/basedefs/sys/socket.h.html

I don't know why it is global; probably for historical reasons. But all of the socket options to configure the TCP keepalive are non-standard SOL_TCP socket options:

 65 #define TCP_KEEPIDLE  (__SO_PROTOCOL + 1) /* Start keeplives after this IDLE period
 66                                            * Argument: struct timeval */
 67 #define TCP_KEEPINTVL (__SO_PROTOCOL + 2) /* Interval between keepalives
 68                                            * Argument: struct timeval */
 69 #define TCP_KEEPCNT   (__SO_PROTOCOL + 3) /* Number of keepalives before death
 70                                            * Argument: max retry count */

Those are the ones that should serve as models for the non-standard UDP_BINDTODEVICE option.

Greg

Sebastien Lorquet

unread,
Jun 25, 2018, 9:32:59 AM6/25/18
to NuttX
Hello

Sorry, it was not clear if you agreed to that or not, now we're OK. I will
submit PR to this branch.

Thanks, also for the TCP keepalive specific options. That's a good example.

Sebastien

patacongo

unread,
Jun 25, 2018, 9:57:32 AM6/25/18
to NuttX

Can we agree on a way forward here?

I think that the only place that this option applies is the UDP and only to UDP send and only to the single spot the the send-to device is determined which is in net/udp/udp_finddev.c.  And I think that the only logic on the consumer size of the option is two places like:

          else
            {
              /* Not a suitable IPv6  unicast address for device lookup */

+ifdef CONFIG_UDP_BINDTODEVICE
+            /* Use the bound device, if any */
+
+            return conn->dev;
+#else
              return NULL;
+endif
            }

The device reference needs to be maintained in the UDP connection structure (not the socket structure). On the socket option provider side, the socket option received the device name as an input and just needs to check for a NULL or empty device name (which unbinds the device by setting conn-.dev to NULL) or calls netdev_findbyname() to set conn->dev.

Greg

Sebastien Lorquet

unread,
Jun 25, 2018, 10:01:48 AM6/25/18
to NuttX
Hello,

OK, will do that in the UDP connection instead of the socket

Sebastien

patacongo

unread,
Jun 25, 2018, 10:12:56 AM6/25/18
to NuttX

Thanks, also for the TCP keepalive specific options. That's a good example.

Yes, it is pretty simple.   psock_setsockopt() has a switch based on the 'level' argument of the socket option:

For SOL_TCP these are handled by:

366       case SOL_TCP:    /* TCP protocol socket options (see include/netinet/tcp.h) */
367 #ifdef CONFIG_NET_TCPPROTO_OPTIONS
368        ret = tcp_setsockopt(psock, option, value, value_len);
369        break;
370 #endif

But SOL_UDP currently generates an error:

378       case SOL_UDP:    /* TCP protocol socket options (see include/netinit/udp.h) */
379         ret = -ENOSYS;
380        break;

You will have to create a udp_sectsockopt() and make the logic above like the TCP case.

Greg


patacongo

unread,
Jun 25, 2018, 10:15:13 AM6/25/18
to NuttX

OK, will do that in the UDP connection instead of the socket

You will find that will make your life much easier anyway.  The socket structure is not available in the low level UDP code.  It cannot be accesses.  But the connection structure is always available.

patacongo

unread,
Jun 25, 2018, 10:57:54 AM6/25/18
to NuttX

OK, will do that in the UDP connection instead of the socket

One complication is that a device can be unregistered at anytime via netdev_unregister.  If that device is bound when it is unregistered, then any saved pointer will be stale invalid.

My recommendation:  Don't save the device structure pointer, save the device index and use netdev_findbyindex() to look up the device.  If the device is unregistered, the lookup will simply fail.

POSIX supports accessing network devices either by name or by index and supports a whole set of user APIs to make the conversions.  The index that is in place there now is only fragmentary but I am implementing full device indexing support on a private branch.

unsigned              if_nametoindex(const char *);
char                 *if_indextoname(unsigned, char *);
struct if_nameindex  *if_nameindex(void);
void                  if_freenameindex(struct if_nameindex *);

You can continue the way you are going now and I will backfit the device indexing into the bind-to-device logic when I have the device indexing in place.

Greg

Sebastien Lorquet

unread,
Jun 25, 2018, 11:11:21 AM6/25/18
to NuttX
Bitbucket does not allow me to create pull requests at this moment so here is a
patch that enables UDP setsockopt, along with build infrastructure.

It just defines the values and functions that will be filled later.

Sebastien
0086_udpsockopt.patch

Sebastien Lorquet

unread,
Jun 25, 2018, 11:13:50 AM6/25/18
to NuttX
forget it the pull request worked and is easier to use. Dont use this patch, thanks.

Sebastien

Sebastien Lorquet

unread,
Jun 25, 2018, 11:20:17 AM6/25/18
to NuttX
OK, no pointers, but an index, thats fine for me.

Will the lookup by index also fail if we delete an interface, then create a new
one? The index should reassigned, and valid again in this unfortunate case?

Better would be to store the full interface name and lookup that when required,
but it would be a pretty large loopup overhead each time we send data.

Sebastien


Le 25/06/2018 à 16:57, patacongo a écrit :
>

patacongo

unread,
Jun 25, 2018, 12:23:58 PM6/25/18
to NuttX


Will the lookup by index also fail if we delete an interface, then create a new
one? The index should reassigned, and valid again in this unfortunate case?

Yes.  The index could be reused in that case.

I just what to make sure that we can handle the case where a removable network device is disconnected.  There is no easy way to determine if the device is disconnected and quickly reconnected.  I think things should recover slower after a device is removed as things are (I don't have much experience with that, however).  But I am concerned about the asynchronous removal of the device while things are in progress.


Better would be to store the full interface name and lookup that when required,
but it would be a pretty large loopup overhead each time we send data.


You are making the assumption that interface names are unique too.  The minor numbers are also reused.  That is actually a more difficult problem because the minor numbers are unique for each device type.  The current way that multiple device minor numbers are assigned is kludgey.

Greg

Sebastien Lorquet

unread,
Jun 25, 2018, 12:27:00 PM6/25/18
to NuttX
I understand, the goal is to avoid a crash, and that's a good reason. If the
interface is replaced this is not cool but will not crash the whole thing.

Can you please handle my pull request instead of the patch so I can get my trees
in sync?

Thanks

Sebastien

Sebastien Lorquet

unread,
Jun 25, 2018, 1:23:49 PM6/25/18
to NuttX
Greg

Can you please push a stub and declaration for if_nametoindex in the bindtodev branch?

I see that it will collide names with sim/src/up_tapdev.c line 226...


Sebastien
--
Envoyé de mon appareil Android avec Courriel K-9 Mail. Veuillez excuser ma brièveté.

patacongo

unread,
Jun 25, 2018, 2:13:30 PM6/25/18
to NuttX

Can you please push a stub and declaration for if_nametoindex in the bindtodev branch?

I just merged what I have on my local branch.  It is the complete, initial, untested version.

Please not *not* use if_nametoindex() or if_indextoname().  Those are end-user interfaces and do things that are fobidden in the OS... specifically setting the errno variable.

There are alternative netdev_nametoindex() and netdev_indextoname() that you can use internally that do not modify the errno value.  That these are just simple wrappers around netdev_findbyname() and netdev_findbyindex().


I see that it will collide names with sim/src/up_tapdev.c line 226...

No a problem.  Those are different namespaces (long story about how the simulator is built should go here).  One is the target symbol namespace and the other is in the host symobl namespace.

Greg

Sebastien Lorquet

unread,
Jun 25, 2018, 3:57:44 PM6/25/18
to nuttx

a7c1394d broke netdev_register, net_unlock() line 400 finds g_count equals to zero and asserts.

Probably because a lock() is missing in the new interface handling functions

Sebastien

--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+un...@googlegroups.com.

Sebastien Lorquet

unread,
Jun 25, 2018, 3:58:17 PM6/25/18
to nuttx

I meant commit 7c1394d8

Sebastien

Sebastien Lorquet

unread,
Jun 25, 2018, 4:07:56 PM6/25/18
to nuttx

found it, fix will be included in incoming patch.

Meanwhile:

Interface is going UP
udp_setsockopt: UDP_BINDTODEVICE: conn 20005ce8 interface eth0, index 1
Starting DHCP request
psock_udp_sendto: conn 20005ce8: Retrieving interface via BOUNDTODEVICE option (index 1 -> dev 0)
psock_udp_sendto: ERROR: udp_find_raddr_device failed
DHCP request failed: -1 errno 114

We're close, but netdev_findbyindex does not return eth0 for index 1. Looking into this now.

Sebastien

Sebastien Lorquet

unread,
Jun 25, 2018, 4:15:21 PM6/25/18
to nuttx

IT WORKS!

Interface is going UP
udp_setsockopt: UDP_BINDTODEVICE: conn 20005ce8 interface eth0, index 1
Starting DHCP request

psock_udp_sendto: conn 20005ce8: Retrieving interface via BOUNDTODEVICE option (index 1 -> dev 20000ff8)
*** Launching nsh

NuttShell (NSH)
nsh> psock_udp_sendto: conn 20005ce8: Retrieving interface via BOUNDTODEVICE option (index 1 -> dev 20000ff8)
IP addr: 192.168.0.19
Net mask: 255.255.255.0
Default router: 192.168.0.254

Please find patch attached

-fixes two typos introduced by commit 7c1394d8

-define and use UDP_BINDTODEVICE option in UDP psock_send_*

I have left two generic _info() calls so you can easily test the option was activated and used. You can remove them.

Sebastien

0087_udpbindtodevice.patch

Sebastien Lorquet

unread,
Jun 25, 2018, 4:16:51 PM6/25/18
to nuttx

Patch for DHCP client in apps repository. Activates BINDTODEVICE if available, and break up some long lines

Sebastien


On 25/06/2018 22:07, Sebastien Lorquet wrote:
0088_APPS_dhcpc_bindtodevice.patch

patacongo

unread,
Jun 25, 2018, 4:37:07 PM6/25/18
to NuttX
There are a few things that I disagree with in this patch.  I will incorporate the the change.  For example, as we discussd before, sendo address must take precedence over the bound device address.  Didn't we agree to that?  There is another change that looks odd to me but I will need to study the code more first.

I'll get back with you.  This will probably take one more round.

Greg
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+unsubscribe@googlegroups.com.

Sebastien Lorquet

unread,
Jun 25, 2018, 4:50:43 PM6/25/18
to NuttX

I dont remember, sorry a million times...

I did that to (try to) match what linux does.

But I think I was confused because the the man 7 socket page talks about reception, not transmission. Sorry about that :(

Thanks for your help.

Sebastien

To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+un...@googlegroups.com.

patacongo

unread,
Jun 25, 2018, 5:12:50 PM6/25/18
to NuttX

There are a few things that I disagree with in this patch.  I will incorporate the the change.  For example, as we discussd before, sendo address must take precedence over the bound device address.  Didn't we agree to that?  There is another change that looks odd to me but I will need to study the code more first.

Committed.  Please review and comment:  commit 65be13bffef269264823699380b985635c24af89
Author: Sebastien Lorquet <seba...@lorquet.fr>
Date:   Mon Jun 25 15:07:53 2018 -0600

    net/udp:  Finish support for the UDP_BINDTODEVICE protocol socket option


 Another thing that was changed is the type and values of the bindtodevice field.  Per the POSIX specific for an interface index, it is an unsigned integer type with the value zero meaning no interface and the values >0 being interface indices (The implemenation has an upper bounder of 32 because a uint32_t is used to hold the indices 1..32).

Please check if I screwed anything up.  I have been doing that a lot lately.  I am getting to old for this.

Greg

patacongo

unread,
Jun 25, 2018, 5:20:06 PM6/25/18
to NuttX
Also committed.

Sebastien Lorquet

unread,
Jun 25, 2018, 5:56:39 PM6/25/18
to nuttx

Hello

you changes are good, I have chosen -1 thinking that interface 0 was valid, which it is not. 0 is a better "non enabled" value.

But it does not work. The interface is bound, but DHCP still fails because the interface is resolved to null.

See this patch against current master. You have discarded this important change: itf indexes are ONE based, but this function assumed ZERO based.

----------------------------

Also I dont agree (but it does not matter): I insist that the bound device should be tried before the search by remote address.

If we dont do that, the effect of bindtodevice is useless except when broadcasting and the socket is bound to INADDR_ANY.

The point of this option is to force the device whatever the source/destination address would select. This is an override.

But it seems that you dont want this override. That's fine for me, the important use case still works.

Sebastien
--
You received this message because you are subscribed to the Google Groups "NuttX" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nuttx+un...@googlegroups.com.
0089_fix_netdev_findbyindex.patch

patacongo

unread,
Jun 25, 2018, 6:27:17 PM6/25/18
to NuttX
Hi, Sebastien,


See this patch against current master. You have discarded this important change: itf indexes are ONE based, but this function assumed ZERO based.

I still don't understand it.  The index should be one besed everywhere in the system, but converted to zero-based only before accessing the set.  Which function is assuming zero based?

Also I dont agree (but it does not matter): I insist that the bound device should be tried before the search by remote address.

If we dont do that, the effect of bindtodevice is useless except when broadcasting and the socket is bound to INADDR_ANY.

The point of this option is to force the device whatever the source/destination address would select. This is an override.

But it seems that you dont want this override. That's fine for me, the important use case still works.

Inconsistent implementations is what we get when we go with non-standard, kludge interfaces.  I think there will be inconsistencies no matter what you do.

The sendto() syntax permits UDP to send datagrams to any available network through the device serving network.  If you send the device through the wrong network, then you have screwed up.  Other than your particular usage.  I can't see any general use for this socket option.

I prefer to keep it out of the way of the address-based packet flow.

Greg


Sebastien Lorquet

unread,
Jun 25, 2018, 7:06:25 PM6/25/18
to nuttx

Hello,

Please see my replies below.


On 26/06/2018 00:27, patacongo wrote:
Hi, Sebastien,

See this patch against current master. You have discarded this important change: itf indexes are ONE based, but this function assumed ZERO based.

I still don't understand it.  The index should be one besed everywhere in the system, but converted to zero-based only before accessing the set.  Which function is assuming zero based?
The bit test in g_devset requires a zero based index, otherwise bit 0 will never be used (since interface indexes start at 1)


The ifindex passed to this function is one based, thats OK.

But to test the g_devset bitfield, this input ifindex has to be zero based (this is confirmed by reading netdev_register). That's why the index was decremented before being used to test the interface bitfield.

However, after this step, in the FOR loop, the now-DECREMENTED ifindex (now zero based) is compared to the actual device index (d_ifindex) stored in the net_driver_s structure, which is still ONE based.

The decrementation is ONLY useful for testing the g_devset bitfield in the IF() test !

So there are two options :

-my patch (avoid decrement except in the bitfield test)
OR
-reincrement ifindex before entering the FOR loop.

Either is OK for me.

Sebastien


Also I dont agree (but it does not matter): I insist that the bound device should be tried before the search by remote address.

If we dont do that, the effect of bindtodevice is useless except when broadcasting and the socket is bound to INADDR_ANY.

The point of this option is to force the device whatever the source/destination address would select. This is an override.

But it seems that you dont want this override. That's fine for me, the important use case still works.

Inconsistent implementations is what we get when we go with non-standard, kludge interfaces.  I think there will be inconsistencies no matter what you do.

The sendto() syntax permits UDP to send datagrams to any available network through the device serving network.  If you send the device through the wrong network, then you have screwed up.  Other than your particular usage.  I can't see any general use for this socket option.

I prefer to keep it out of the way of the address-based packet flow.

Greg


Sebastien Lorquet

unread,
Jun 25, 2018, 7:08:57 PM6/25/18
to NuttX


On 26/06/2018 00:27, patacongo wrote:
> I still don't understand it.  The index should be one besed everywhere
> in the system, but converted to zero-based only before accessing the
> set.  Which function is assuming zero based?

Shorter explanation: it is converted to zero based before accessing the
set, but the FOR loop in the following code still requires a one-based
index.

So it has to be converted back to one-based (or zero-base conversion can
be kept in the set test only, this is what I did)

(ACK for the explanation about where to apply device binding)

Sebastien

patacongo

unread,
Jun 25, 2018, 7:13:21 PM6/25/18
to NuttX
Hi, Sebastien,


The bit test in g_devset requires a zero based index, otherwise bit 0 will never be used (since interface indexes start at 1)

Thanks for explanation.  I did figure that out.

It turns out that I was debugging a related problem.   nsh>ifconfig

That seems to be working find for me now, but there are still some issues with how procfs enumerates devices by an index.  I does not need anything so complex as the netdev_findbyindex(), but I tried to make that function serve all purposes.

But it is a particularly bad in in net/procfs.  I probably should back out  the changes for procfs and keep it simple.

Greg

Reply all
Reply to author
Forward
0 new messages