[Emulation] MPI Parallel Run: Cannot cast RealTimeScheduler to cNullMessageProtocol


Luca Barbierato

May 28, 2019, 10:06:07 AM
to OMNeT++ Users
Dear all,

I just designed an emulation testbed with OMNeT++ 5.4.1 and INET 4.1.0 in which:
  • 3 StandardHosts use ExtInterface to receive TCP traffic from real Ethernet cards
  • A network of routers routes the packets
  • 3 StandardHosts generate UDP background traffic to congest the network
A congestionRate parameter in (0.0, 1.0] controls the congestion in the network. For low values of congestionRate, the simulation runs in real time (simsec/sec = 1) and I don't experience any problems. For high values of congestionRate, I start to see a high delay in the TCP communication because of the high throughput of the 3 StandardHosts that generate UDP background traffic. simsec/sec drops to 0.1, which breaks the external TCP communication (the TCP sockets drop). So I was thinking of parallelizing the simulation to boost performance and keep simsec/sec = 1. I followed the description in the OMNeT++ Manual to configure the omnetpp.ini file. This is my configuration:

[General]
scheduler-class = "inet::RealTimeScheduler"
sim-time-limit = 3600s
cmdenv-express-mode = true
parallel-simulation = true
parsim-communications-class = "cMPICommunications"
parsim-synchronization-class = "cNullMessageProtocol"

*.configurator**.partition-id = 0
*.host1**.partition-id = 0
*.host3**.partition-id = 0
*.host5**.partition-id = 0
*.router1**.partition-id = 0
*.router2**.partition-id = 0
*.router3**.partition-id = 0
*.router4**.partition-id = 0
*.router5**.partition-id = 0
*.router6**.partition-id = 0
*.host2**.partition-id = 0
*.host4**.partition-id = 0
*.host6**.partition-id = 0

[Config ExternalCommunication]
...

First, I tried assigning all modules to the same partition 0. After compiling, I get this error during network initialization:

Starting...

$ cd /home/luca/omnetpp/omnetpp-5.4.1/samples/inet/showcases/emulation/basic
$ opp_run -m -u Cmdenv -c ExternalCommunication -n ../../../src:../../../examples:../../../tutorials:../.. --image-path=../../../images -l ../../../src/INET -s --cmdenv-redirect-output=false --record-eventlog=false --scalar-recording=false --vector-recording=false --cmdenv-express-mode=true omnetpp.ini

cMPICommunications: started as process 0 out of 1.
MPI thinks this process is the only one in the session (did you use mpirun to start this program?)
ExternalCommunication run 0: , $repetition=0

<!> Error: check_and_cast(): Cannot cast (omnetpp::cNullMessageProtocol*) to type 'inet::RealTimeScheduler *' -- in module (inet::ExtEthernetTapDevice) ExternalCommunication.host2.eth[0].tap (id=165), during network initialization
[INFO] Clear all sockets
[INFO] Clear all sockets
[INFO] Clear all sockets
[INFO] Clear all sockets
[INFO] Clear all sockets
[INFO] Clear all sockets

Simulation terminated with exit code: 1
Working directory: /home/luca/omnetpp/omnetpp-5.4.1/samples/inet/showcases/emulation/basic
Command line: opp_run -m -u Cmdenv -c ExternalCommunication -n ../../../src:../../../examples:../../../tutorials:../.. --image-path=../../../images -l ../../../src/INET -s --cmdenv-redirect-output=false --record-eventlog=false --scalar-recording=false --vector-recording=false --cmdenv-express-mode=true omnetpp.ini

Environment variables:
PATH=/home/luca/omnetpp/omnetpp-5.4.1/bin::/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
LD_LIBRARY_PATH=/home/luca/omnetpp/omnetpp-5.4.1/lib::/home/luca/omnetpp/omnetpp-5.4.1/samples/inet/src:
OMNETPP_IMAGE_PATH=/home/luca/omnetpp/omnetpp-5.4.1/images


Does anybody know whether RealTimeScheduler can work in a parallel distributed run with this configuration?

Kind regards,
Luca

Rudolf Hornig

May 29, 2019, 9:18:39 AM
to OMNeT++ Users
No, the RealTimeScheduler and the cParsimSynchronizer (which is implicitly used when parallel simulation support is turned on in OMNeT++) are mutually exclusive. Both classes are subclasses of cScheduler, and OMNeT++ can have only a single scheduler. You can configure either RealTimeScheduler or cParsimSynchronizer, but not both. If you configure parsim-synchronization-class, you implicitly select a cParsimSynchronizer (here cNullMessageProtocol) as the scheduler implementation. That's why you get a class cast error: the tap module assumes that it is running under the RealTimeScheduler.
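To illustrate why the cast fails, here is a minimal sketch in plain C++. The class names (Scheduler, RealTimeLikeScheduler, ParsimLikeScheduler, Simulation) and the helper canUseAsRealTime are hypothetical stand-ins, not the real OMNeT++ or INET classes. It only shows that a checked downcast between two sibling subclasses of a common base cannot succeed.

```cpp
#include <string>

// Hypothetical stand-in for the common scheduler base class.
struct Scheduler {
    virtual ~Scheduler() = default;
    virtual std::string name() const = 0;
};

// Hypothetical stand-in for a real-time scheduler.
struct RealTimeLikeScheduler : Scheduler {
    std::string name() const override { return "real-time"; }
};

// Hypothetical stand-in for a parallel-simulation synchronizer.
struct ParsimLikeScheduler : Scheduler {
    std::string name() const override { return "parsim"; }
};

// The simulation holds exactly one scheduler object.
struct Simulation {
    Scheduler* scheduler;
};

// Mimics a checked downcast: dynamic_cast yields nullptr when the installed
// scheduler is not a RealTimeLikeScheduler (a check_and_cast-style helper
// would report an error at that point instead).
bool canUseAsRealTime(const Simulation& sim) {
    return dynamic_cast<RealTimeLikeScheduler*>(sim.scheduler) != nullptr;
}
```

With a ParsimLikeScheduler installed, canUseAsRealTime returns false, which corresponds to the situation in the error message: the module asks for one sibling type while the other one is installed.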

The task of a scheduler class is to deliver the messages to the appropriate modules at the right time (and to hold off the simulation in real time if a message is not yet available). cParsimSynchronizer syncs with the other MPI processes, whereas RealTimeScheduler syncs with the external interface cards. You cannot configure both at the same time.
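As a rough sketch of what "holding off the simulation in real time" means, the function below computes how long a real-time scheduler would have to wait before delivering the next event. The function name realTimeSlack and its parameters are assumptions made for illustration; this is not the INET RealTimeScheduler code.

```cpp
#include <chrono>

// Hypothetical helper: time left to wait before an event scheduled at
// eventSimTimeSec (simulation seconds) is due, given the wall-clock time
// elapsed since the simulation started. Assumes 1 simsec should take 1 s.
// A negative result means the simulation is lagging behind real time.
std::chrono::duration<double> realTimeSlack(
        double eventSimTimeSec,
        std::chrono::duration<double> wallElapsed) {
    return std::chrono::duration<double>(eventSimTimeSec) - wallElapsed;
}
```

A positive slack means the scheduler can sleep (or wait for external packets) until the event is due. A negative slack is the simsec/sec < 1 situation from the original question: the simulation cannot keep up, and no waiting policy can fix that.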

In theory, it would be possible to write a scheduler that does both (i.e. it syncs with the other MPI processes and also with the external interfaces), if you were able to merge the two functionalities.

Sadly, that would not help you anyway. You are trying to speed up the simulation by using parallel simulation, but it is a common misconception that parallel simulation always improves performance. In most cases, that is not true. Marshalling/unmarshalling the messages and passing them between processes adds a lot of overhead to the communication between modules. In addition, the simulation partitions still have to run in lock-step to avoid violating causality, so syncing their simulation times adds further overhead.

You should expect a speedup ONLY if your simulation can be split into partitions whose modules are tightly coupled internally, while message exchanges between partitions are relatively rare and have a considerable delivery delay. This is not typical for an internet simulation. An additional issue with parallel simulation is that you cannot use global variables, caches, static variables etc.
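The role of the delivery delay between partitions can be sketched with the usual bound of conservative (null-message style) synchronization. The struct InLink and the function safeHorizon are hypothetical names for illustration, not the OMNeT++ implementation: a partition may only process events up to the earliest time at which a message from any neighbour could still arrive.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// One incoming link from a neighbouring partition (hypothetical model).
struct InLink {
    double neighbourTime;  // latest simulation time the neighbour has reported (s)
    double lookahead;      // minimum delay of messages on this link (s)
};

// Earliest simulation time at which a message from a neighbour could still
// arrive. Events scheduled before this time are safe to process.
// With no incoming links the partition is unconstrained (infinity).
double safeHorizon(const std::vector<InLink>& links) {
    double horizon = std::numeric_limits<double>::infinity();
    for (const auto& link : links) {
        horizon = std::min(horizon, link.neighbourTime + link.lookahead);
    }
    return horizon;
}
```

The smaller the link delay (lookahead), the less each partition can advance before it has to wait for its neighbours again. That is why short-delay, chatty links between partitions remove most of the potential speedup.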

For the above reasons, INET was never designed to run under parallel simulation (it has several global variables, caches, modules etc.).
So in short, INET itself is not usable under parallel simulation. Sadly, the only way to speed up the simulation is to use a computer with faster single-threaded performance...