Small CSMA Drop Tail Queue Causes Unreasonable Delay?

Stephen Jackson

unread,
Aug 19, 2015, 11:42:10 AM8/19/15
to ns-3-users
Hello All,

I'm modeling a distributed algorithm in ns-3 and exploring the effects of queueing on it. I've run into a problem, however: if I set the CSMA queue to be very small (1 packet), the simulation starts behaving strangely. In particular, some messages sent by the clients/server are delayed by approximately one second for no discernible reason. If someone could explain why this is happening, or confirm that this is not the expected behavior, that would be very helpful.

My application has O(n^2) message complexity. I've managed to reproduce the problem with ns-3's EchoClient and EchoServer: I created a CSMA network with a DropTailQueue of size 1 packet, created 21 clients and 1 server, and scheduled all the clients to send their message at time = 2 s. Here is the output from the simulation:

At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2s client sent 1024 bytes to 10.1.1.22 port 9
At time 2.00012s server received 1024 bytes from 10.1.1.5 port 49153
At time 2.00012s server sent 1024 bytes to 10.1.1.5 port 49153
At time 2.00027s server received 1024 bytes from 10.1.1.21 port 49153
At time 2.00027s server sent 1024 bytes to 10.1.1.21 port 49153
At time 2.00074s server received 1024 bytes from 10.1.1.10 port 49153
At time 2.00074s server sent 1024 bytes to 10.1.1.10 port 49153
At time 2.00112s server received 1024 bytes from 10.1.1.1 port 49153
At time 2.00112s server sent 1024 bytes to 10.1.1.1 port 49153
At time 2.00126s server received 1024 bytes from 10.1.1.3 port 49153
At time 2.00126s server sent 1024 bytes to 10.1.1.3 port 49153
At time 2.00139s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00177s server received 1024 bytes from 10.1.1.19 port 49153
At time 2.00177s server sent 1024 bytes to 10.1.1.19 port 49153
At time 2.00212s server received 1024 bytes from 10.1.1.8 port 49153
At time 2.00212s server sent 1024 bytes to 10.1.1.8 port 49153
At time 2.00238s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00271s server received 1024 bytes from 10.1.1.12 port 49153
At time 2.00271s server sent 1024 bytes to 10.1.1.12 port 49153
At time 2.00312s server received 1024 bytes from 10.1.1.14 port 49153
At time 2.00312s server sent 1024 bytes to 10.1.1.14 port 49153
At time 2.00386s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00412s server received 1024 bytes from 10.1.1.2 port 49153
At time 2.00412s server sent 1024 bytes to 10.1.1.2 port 49153
At time 2.00424s client received 1024 bytes from 10.1.1.22 port 9
At time 2.0045s server received 1024 bytes from 10.1.1.9 port 49153
At time 2.0045s server sent 1024 bytes to 10.1.1.9 port 49153
At time 2.00512s server received 1024 bytes from 10.1.1.4 port 49153
At time 2.00512s server sent 1024 bytes to 10.1.1.4 port 49153
At time 2.00524s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00544s server received 1024 bytes from 10.1.1.11 port 49153
At time 2.00544s server sent 1024 bytes to 10.1.1.11 port 49153
At time 2.00612s server received 1024 bytes from 10.1.1.15 port 49153
At time 2.00612s server sent 1024 bytes to 10.1.1.15 port 49153
At time 2.00627s server received 1024 bytes from 10.1.1.20 port 49153
At time 2.00627s server sent 1024 bytes to 10.1.1.20 port 49153
At time 2.00669s server received 1024 bytes from 10.1.1.17 port 49153
At time 2.00669s server sent 1024 bytes to 10.1.1.17 port 49153
At time 2.00712s server received 1024 bytes from 10.1.1.16 port 49153
At time 2.00712s server sent 1024 bytes to 10.1.1.16 port 49153
At time 2.00783s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00812s server received 1024 bytes from 10.1.1.18 port 49153
At time 2.00812s server sent 1024 bytes to 10.1.1.18 port 49153
At time 2.00912s server received 1024 bytes from 10.1.1.6 port 49153
At time 2.00912s server sent 1024 bytes to 10.1.1.6 port 49153
At time 2.00924s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00937s server received 1024 bytes from 10.1.1.13 port 49153
At time 2.00937s server sent 1024 bytes to 10.1.1.13 port 49153
At time 2.00949s client received 1024 bytes from 10.1.1.22 port 9
At time 2.00978s server received 1024 bytes from 10.1.1.7 port 49153
At time 2.00978s server sent 1024 bytes to 10.1.1.7 port 49153
At time 2.0099s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01029s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01038s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01124s client received 1024 bytes from 10.1.1.22 port 9
At time 2.0119s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01255s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01324s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01361s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01381s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01448s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01624s client received 1024 bytes from 10.1.1.22 port 9
At time 2.01824s client received 1024 bytes from 10.1.1.22 port 9
At time 3.00024s client received 1024 bytes from 10.1.1.22 port 9

As you can see, the final message in the simulation arrives nearly a second after the previous one was received. What could cause such a large delay in message delivery? I expected a certain level of contention, but this behavior does not seem reasonable to me. If I enable logging for other aspects of the simulation:

  LogComponentEnable ("UdpEchoClientApplication", LOG_LEVEL_INFO);
  LogComponentEnable ("UdpPyApplication", LOG_LEVEL_INFO);
  LogComponentEnable ("DropTailQueue", LOG_LEVEL_LOGIC);
  LogComponentEnable ("DropTailQueue", LOG_PREFIX_TIME);
  LogComponentEnable ("CsmaNetDevice", LOG_LEVEL_LOGIC);
  LogComponentEnable ("CsmaNetDevice", LOG_PREFIX_TIME);
  LogComponentEnable ("UdpSocketImpl", LOG_LEVEL_LOGIC);
  LogComponentEnable ("UdpSocketImpl", LOG_PREFIX_TIME);

Then you can see that nothing happens on the channel, socket, or queue for that second:

2.01824s UdpSocketImpl:ForwardUp(0x15c9d20, 0x15d1fa0, tos 0x0 DSCP Default ECN Not-ECT ttl 64 id 0 protocol 17 offset (bytes) 0 flags [none] length: 1052 10.1.1.22 > 10.1.1.6, 9)
2.01824s UdpSocketImpl:RecvFrom(0x15c9d20, 4294967295, 0)
2.01824s UdpSocketImpl:Recv(0x15c9d20, 4294967295, 0)
At time 2.01824s client received 1024 bytes from 10.1.1.22 port 9
2.01824s UdpSocketImpl:RecvFrom(0x15c9d20, 4294967295, 0)
2.01824s UdpSocketImpl:Recv(0x15c9d20, 4294967295, 0)
3.00012s CsmaNetDevice:GetBroadcast()
3.00012s CsmaNetDevice:GetAddress()
3.00012s CsmaNetDevice:GetBroadcast()
3.00012s CsmaNetDevice:Send(0x15cf5b0, 02-06-ff:ff:ff:ff:ff:ff, 2054)
3.00012s CsmaNetDevice:SendFrom(0x15cf5b0, 02-06-00:00:00:00:00:16, 02-06-ff:ff:ff:ff:ff:ff, 2054)
3.00012s packet =0x15cf5b0
3.00012s UID is 187)
3.00012s CsmaNetDevice:IsLinkUp()
3.00012s CsmaNetDevice:IsSendEnabled()
3.00012s CsmaNetDevice:AddHeader(0x15cf5b0, 00:00:00:00:00:16, ff:ff:ff:ff:ff:ff, 2054)
3.00012s p->GetSize () = 28
3.00012s m_encapMode = 1
3.00012s m_mtu = 1500
3.00012s Encapsulating packet as DIX (type interpretation)
3.00012s header.SetLengthType (2054)
3.00012s DropTailQueue:DoEnqueue(0x1597a70, 0x15cf5b0)
3.00012s Number packets 1
3.00012s Number bytes 64
3.00012s DropTailQueue:DoDequeue(0x1597a70)
3.00012s Popped 0x15cf5b0
3.00012s Number packets 0
3.00012s Number bytes 0
3.00012s CsmaNetDevice:TransmitStart()
3.00012s m_currentPkt = 0x15cf5b0
3.00012s UID = 187
3.00012s CsmaNetDevice:IsSendEnabled()
3.00012s Schedule TransmitCompleteEvent in 5.12e-06sec
3.00012s CsmaNetDevice:TransmitCompleteEvent()

Here is my complete example, so you can recreate my issue:

#include "ns3/core-module.h"
#include "ns3/network-module.h"
#include "ns3/internet-module.h"
#include "ns3/point-to-point-module.h"
#include "ns3/applications-module.h"
#include "ns3/csma-module.h"

using namespace ns3;

NS_LOG_COMPONENT_DEFINE ("FirstScriptExample");

int
main (int argc, char *argv[])
{
  Time::SetResolution (Time::NS);

  LogComponentEnable ("UdpEchoClientApplication", LOG_LEVEL_INFO);
  LogComponentEnable ("UdpPyApplication", LOG_LEVEL_INFO);
  LogComponentEnable ("DropTailQueue", LOG_LEVEL_LOGIC);
  LogComponentEnable ("DropTailQueue", LOG_PREFIX_TIME);
  LogComponentEnable ("CsmaNetDevice", LOG_LEVEL_LOGIC);
  LogComponentEnable ("CsmaNetDevice", LOG_PREFIX_TIME);
  LogComponentEnable ("UdpSocketImpl", LOG_LEVEL_LOGIC);
  LogComponentEnable ("UdpSocketImpl", LOG_PREFIX_TIME);

  // Number of echo clients; one extra node is created to host the server.
  const uint32_t NUM_NODES = 21;

  NodeContainer c_nodes;
  c_nodes.Create (NUM_NODES + 1);

  // CSMA channel with a deliberately tiny (1-packet) drop-tail queue.
  CsmaHelper csma;
  csma.SetQueue ("ns3::DropTailQueue", "MaxPackets", UintegerValue (1));
  csma.SetChannelAttribute ("DataRate", StringValue ("100Mbps"));
  csma.SetChannelAttribute ("Delay", TimeValue (NanoSeconds (6560)));

  NetDeviceContainer c_devices = csma.Install (c_nodes);

  InternetStackHelper stack;
  stack.Install (c_nodes);

  Ipv4AddressHelper address;
  address.SetBase ("10.1.1.0", "255.255.255.0");
  Ipv4InterfaceContainer c_interfaces = address.Assign (c_devices);

  // The last node runs the echo server on port 9.
  UdpEchoServerHelper echoServer (9);
  ApplicationContainer serverApps = echoServer.Install (c_nodes.Get (NUM_NODES));
  serverApps.Start (Seconds (1.0));
  serverApps.Stop (Seconds (10.0));

  // Every other node sends a single 1024-byte echo request at t = 2 s.
  ApplicationContainer clientApps;
  for (uint32_t i = 0; i < NUM_NODES; i++)
    {
      UdpEchoClientHelper echoClient (c_interfaces.GetAddress (NUM_NODES), 9);
      echoClient.SetAttribute ("MaxPackets", UintegerValue (1));
      echoClient.SetAttribute ("Interval", TimeValue (Seconds (0.3)));
      echoClient.SetAttribute ("PacketSize", UintegerValue (1024));
      clientApps.Add (echoClient.Install (c_nodes.Get (i)));
    }
  clientApps.Start (Seconds (2.0));
  clientApps.Stop (Seconds (10.0));

  Simulator::Run ();
  Simulator::Destroy ();
  return 0;
}


The NUM_NODES variable controls the number of client processes in the simulation. 21 is the smallest number at which I observed the effect; larger numbers cause a greater effect. For example, with NUM_NODES=100, messages are still being transmitted 6 seconds into the simulation, with 1-second gaps between each volley of messages.

Additional details if needed -- I'm using ns-3.23 on an up-to-date instance of Arch Linux.

Thanks for your help!

Stephen Jackson

Tommaso Pecorella

unread,
Aug 19, 2015, 12:54:07 PM8/19/15
to ns-3-users
Hi,

no idea. What I can say is that you have an ARP storm: all the nodes are trying to send ARP requests to the server (separated only by the channel's busy status), and the server is replying as fast as it can.
One of the replies is taking a long time. Why? As I said, no idea.
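If you want to see the storm directly, you could try also enabling the ARP-layer logging alongside your other log components (component name from memory, so double-check it against your tree):

```cpp
// ARP layer logging -- should show the requests/replies filling the "silent" gap.
LogComponentEnable ("ArpL3Protocol", LOG_LEVEL_LOGIC);
LogComponentEnable ("ArpL3Protocol", LOG_PREFIX_TIME);
```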
If you find out why, let us know, I'm curious.

Have fun,

T.

Stephen Jackson

unread,
Aug 19, 2015, 5:11:05 PM8/19/15
to ns-3-users
Hello,

Could you give me some more information on where this would happen? I'm tracing the code, but I'm not sure where the ARP storm affects the procedure of sending the packet.

Thanks

Stephen Jackson

unread,
Aug 19, 2015, 6:03:54 PM8/19/15
to ns-3-users
Update-

My problem was caused by the "WaitReplyTimeout" in the ARP cache. The default value is 1 second; changing it adjusts how large the gap is.
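For anyone else hitting this, a sketch of one way to change it before the Internet stack is installed (I'm assuming the attribute is exposed on the ns3::ArpCache TypeId; verify the path against your ns-3 version):

```cpp
// Shrink the ARP retry timeout from the default 1 s to 100 ms.
// Assumption: WaitReplyTimeout is an attribute of ns3::ArpCache.
Config::SetDefault ("ns3::ArpCache::WaitReplyTimeout",
                    TimeValue (MilliSeconds (100)));
```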

A second seems like a large wait value; is it realistic?

Tommaso Pecorella

unread,
Aug 19, 2015, 6:25:09 PM8/19/15
to ns-3-users
Hi,

yes, you're right. I was searching for that too, because I remembered seeing it but couldn't remember where.

I believe the timer is realistic: it accounts for the destination being busy with more urgent work. Honestly, I can't find an RFC or an implementation reference (usually we use the Linux one) to justify the timer's value.
However, consider that this is the timer for when the destination isn't replying. Usually, in a LAN, there is little chance that the ARP reply is lost.

Cheers,

T.