Environment Characteristics
- Zorin OS 15.3 64-bit
- Linux zorin-rosa 5.4.0-122-generic #138~18.04.1-Ubuntu SMP Fri Jun 24 14:14:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
- Intel® Core™ i7-8565U CPU @ 1.80GHz × 8
- NVIDIA GeForce MX150/PCIe/SSE2
- ns-3.35
- g++ (Ubuntu 11.1.0-1ubuntu1~18.04.1) 11.1.0
Simulation Characteristics
- I’m trying to simulate a smart home with 29 smart devices connected wirelessly to a server via an AP.
- 29 STA nodes, each at a fixed position within an area of approximately 40 m², connected to 1 AP
- 1 AP connected to 1 server via a CSMA channel
- The STA nodes connect to the server using TCP sockets
- The AP is used only to connect the STA nodes and forward packets between the server and the STAs
- The AP has a bridge that links its wireless and CSMA ports
The issue
I can place all 29 nodes at their fixed positions, and 28 of the 29 successfully open their TCP socket to the server. One node cannot connect, and this breaks the simulation. Its connection attempt is made at 2.4 s, and the error is only returned at 191.4 s.
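A rough check of the 189 s gap between the attempt and the error: it looks consistent with TCP SYN retransmission backoff. The sketch below is my own arithmetic, not ns-3 code. It assumes the ns-3 TcpSocket defaults as I understand them (ConnTimeout = 3 s, ConnCount = 6) are in effect and that the wait doubles after each unanswered SYN. The function name `connectFailureTime` is hypothetical.

```cpp
// Time (in seconds) at which a TCP connect would be reported as failed
// if no SYN is ever answered.
// Assumed model: the first SYN is sent at startTime; the k-th SYN
// (k = 0 .. connCount-1) waits connTimeout * 2^k before giving up on it.
double connectFailureTime(double startTime, double connTimeout, int connCount)
{
    double t = startTime;
    double wait = connTimeout;
    for (int attempt = 0; attempt < connCount; ++attempt) {
        t += wait;    // this SYN times out
        wait *= 2.0;  // exponential backoff for the next one
    }
    return t;
}
```

Under these assumptions, `connectFailureTime(2.4, 3.0, 6)` evaluates to 2.4 + (3 + 6 + 12 + 24 + 48 + 96) = 191.4 s, which matches the time of the error. That would mean none of the failing node's SYNs is ever answered.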
Things I Tried
- Changing the nodes' fixed positions: when I move the nodes, different nodes fail, but at least one always fails. By also varying the Z coordinate, I got fewer failures.
- Changing network parameters: I tried changing the remote station manager, setting multiple antennas on the AP, and changing the physical channel parameters (loss), the ARP cache parameters (timeouts, retries, queue), and the CSMA rate and delay. I also tried IPv6 instead of IPv4.
- Changing the number of nodes: with at most 8 nodes, every TCP connection succeeds; with more, at least one fails. That’s too few for what I’m trying to simulate.
- Spacing out application start times: I tried leaving a long gap between application starts, such as 10 s each, but the problem persisted.
- Removing only the one faulty node: if I do that, another node or set of nodes fails.
Suspect
I created a public repo on GitHub with minimal code to reproduce the issue, along with the relevant pcap files. I’m not sure whether I misconfigured the network or whether it’s a bug in ns-3; the former is more likely. I would really appreciate any help or pointers.
https://github.com/giovannirosa/ns3-issue
I suspect that too many packets are flowing in the physical channel, so the AP cannot handle them properly. In the pcap files, I can see that the ARP request “Who has 192.168.0.1? Tell 192.168.0.10” reaches the AP (AccessPoint-1-1.pcap), but it is never answered and never reaches the server (Server-0-0.pcap, Server-1-0.pcap). All other ARP requests are answered. For now, I’m just leaving this one faulty node out of the simulation so that it runs to the end. Does anyone know what the problem is? How can I make all nodes connect to the server consistently?
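To check the "all other ARP packets are answered" observation without reading the traces by hand, something like the following could be used. It is a minimal sketch, not part of my repo. It assumes a classic little-endian pcap file (magic 0xa1b2c3d4) with the Ethernet link type, which is what I would expect for the CSMA traces. The Wi-Fi traces may use a different link type, which this sketch rejects. The names `unansweredArpRequests` and `ArpPair` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <stdexcept>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using ArpPair = std::pair<std::uint32_t, std::uint32_t>;  // (sender IP, target IP)

// Little-endian 32-bit read (pcap headers written on a little-endian host).
static std::uint32_t le32(const Bytes& b, std::size_t o)
{
    return std::uint32_t(b[o]) | (std::uint32_t(b[o + 1]) << 8) |
           (std::uint32_t(b[o + 2]) << 16) | (std::uint32_t(b[o + 3]) << 24);
}

// Big-endian (network order) reads for Ethernet and ARP fields.
static std::uint16_t be16(const Bytes& b, std::size_t o)
{
    return std::uint16_t((b[o] << 8) | b[o + 1]);
}

static std::uint32_t be32(const Bytes& b, std::size_t o)
{
    return (std::uint32_t(be16(b, o)) << 16) | be16(b, o + 2);
}

// Returns the (sender IP, target IP) pairs of ARP requests for which the
// capture contains no reply going from the target back to the sender.
std::set<ArpPair> unansweredArpRequests(const Bytes& pcap)
{
    const std::size_t kGlobalHeader = 24, kRecordHeader = 16, kArpFrame = 42;
    if (pcap.size() < kGlobalHeader || le32(pcap, 0) != 0xa1b2c3d4u)
        throw std::invalid_argument("not a little-endian classic pcap file");
    if (le32(pcap, 20) != 1u)
        throw std::invalid_argument("link type is not Ethernet");

    std::set<ArpPair> requests, replies;
    std::size_t pos = kGlobalHeader;
    while (pos + kRecordHeader <= pcap.size()) {
        const std::size_t inclLen = le32(pcap, pos + 8);  // captured length
        const std::size_t frame = pos + kRecordHeader;
        if (inclLen > pcap.size() - frame) break;          // truncated record
        // EtherType 0x0806 marks an ARP frame.
        if (inclLen >= kArpFrame && be16(pcap, frame + 12) == 0x0806) {
            const std::uint16_t oper = be16(pcap, frame + 20);
            const ArpPair p(be32(pcap, frame + 28), be32(pcap, frame + 38));
            if (oper == 1) requests.insert(p);             // "who has"
            else if (oper == 2) replies.insert(p);         // "is at"
        }
        pos = frame + inclLen;
    }

    std::set<ArpPair> unanswered;
    for (const ArpPair& r : requests)
        if (replies.count(ArpPair(r.second, r.first)) == 0) unanswered.insert(r);
    return unanswered;
}
```

The function takes the raw bytes of one trace file, for example read with `std::ifstream` in binary mode. A request from 192.168.0.10 for 192.168.0.1 would appear in the result as the pair (0xC0A8000A, 0xC0A80001) if no matching reply is in that trace.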