Multiple parallel simulations and waf side effects

Rémy Grünblatt

Jan 22, 2018, 5:08:38 AM
to ns-3-users
Hi,
To create some "heat maps" with ns-3, I currently use a Python script that relies on subprocess and joblib to launch commands like "./waf --run scratch/80211n-mimo-buildings" in the ns-3 working directory.

However, there are problems with this approach: simulations running in parallel interfere with each other (I'm using only one ns-3 directory), for example by writing files in the build directories, or by each instance recompiling the shared object that corresponds to the simulation in the scratch directory.

So, what are my options if I want to avoid this? I don't consider copying the ns-3 directory a solution, as I do not want one directory per core (some machines I use have more than one hundred cores). Is there a way to ask waf to create standalone binaries for scratch files, with no side effects other than those specified in their source code? Is there a way to bypass waf completely? I find it utterly complicated.

Thanks,

PS: attached is an example of the random problems I get when running many instances of ns-3 in parallel from the same directory.

Rémy
log.txt

Tom Henderson

Jan 22, 2018, 9:28:51 AM
to ns-3-...@googlegroups.com
On 01/22/2018 02:08 AM, Rémy Grünblatt wrote:
> Hi,
> to create some "heat maps" using NS-3, I currently use a python script
> using subprocess and joblib to launch commands like "./waf --run
> scratch/80211n-mimo-buildings" in the working directory of ns-3.
>
> Still, there is problems with this approach, as multiple simulations
> happening in parallel interfer with each other (I'm using only one
> ns-3 directory), by writing some files in the build directories for
> example, or recompiling (each instance) the shared object
> corresponding to simulations from the scratch directory.
>
> So, what are my option if I want to avoid this ? I consider copying
> the ns3 directory not being a solution, as I do not want to have one
> directory / core (some machine I use may have more than one hundred
> cores). Is there a way to ask waf to create standalone binaries for
> scratch files, without side effects other than what is specified in
> their source code? Is their a way to completely bypass waf I find
> utterly complicated?

Remy, I'm not familiar with that specific error.  I have been running
parallel ns-3 instances via waf with a bash script, using GNU parallel,
for some time now, and I don't get such errors (I don't observe waf
writing common files in the build directory).  Are you making sure that
the project has built successfully before you launch the parallel runs,
or are your workers perhaps competing to build the project because of a
source code change?
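
A minimal sketch of the "build first, then launch" idea, assuming the workers are started from a Python driver script. The helper name build_once and its parameters are hypothetical; the only waf command used is "./waf build".

```python
import subprocess


def build_once(ns3_dir, build_cmd=("./waf", "build")):
    """Run the build a single time, before any worker is started.

    check=True raises subprocess.CalledProcessError if the build fails,
    so no parallel run is launched against a half-built tree.
    """
    subprocess.run(list(build_cmd), cwd=ns3_dir, check=True)
```

Calling build_once("path/to/ns-3") once at the top of the driver script means the workers only ever execute an up-to-date build and should have nothing left to recompile.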

You can bypass waf by directly calling the binaries, if you set your
LD_LIBRARY_PATH appropriately to pick up the ns-3 libraries.  If you
look in the build/scratch directory you will find your executable, and
if you run it you might see an error such as:

./80211n-mimo-buildings
./80211n-mimo-buildings: error while loading shared libraries:
libns3-dev-aodv-debug.so: cannot open shared object file: No such file
or directory

but that type of error can be remedied if you add the build/ directory
to your library path.

Note that calling './waf shell' will set this environment correctly, so
you could do this:

$ ./waf shell
$ cd build/scratch
$ ./80211n-mimo-buildings

In other words, running programs through waf is just a convenience; you
can run them like any other C++ executable as long as your shell knows
where to find the libraries (if you are using a shared library build).
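
The same idea can be sketched from a Python driver script instead of an interactive shell. This is only an illustration under stated assumptions: the helper names (library_env, scratch_commands, run_parallel) are hypothetical, the libraries are assumed to sit in build/ as described above, and "--RngRun" only takes effect if the scratch program parses its command line with ns-3's CommandLine class.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def library_env(build_dir, base_env=None):
    """Copy the environment and prepend build_dir to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = build_dir + (os.pathsep + existing if existing else "")
    return env


def scratch_commands(build_dir, program, runs):
    """Build one argument list per run, calling the binary directly (no waf)."""
    binary = os.path.join(build_dir, "scratch", program)
    # Assumption: the program accepts --RngRun to select an independent run.
    return [[binary, "--RngRun=%d" % run] for run in runs]


def run_parallel(commands, env, max_workers=4):
    """Run each command as its own process; return exit codes in input order."""
    def run_one(argv):
        return subprocess.run(argv, env=env).returncode

    # Threads are enough here: each one only waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))
```

For example, with build_dir = os.path.abspath("build"), calling run_parallel(scratch_commands(build_dir, "80211n-mimo-buildings", range(1, 9)), library_env(build_dir)) would start eight runs without invoking waf at all, so none of them touches the build directory.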

- Tom

richard

Feb 18, 2018, 3:05:16 PM
to ns-3-users
I had the same problem. It happens because waf tries to read/write a JSON object to the same file every time you run ./waf <your program>.

To fix that, I modified the following file:

waf-tools/clang_compilation_database.py

Just edit the beginning of the second function so that it reads:

def write_compilation_database(ctx):
    "Write the clang compilation database as JSON"
    database_file = ctx.bldnode.make_node('compile_commands.json')
    Logs.info("Build commands will be stored in %s" % database_file.path_from(ctx.path))
    try:
        print(database_file)
        root = json.load(database_file)
    except IOError:
        # the file does not exist yet
        root = []
    except ValueError:
        # the file exists but does not contain valid JSON
        root = []
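
Why the extra except clause helps can be shown in isolation. The sketch below rests on an assumption about the cause: a reader may see compile_commands.json empty or half-written while another waf process is rewriting it. The helper name load_database is hypothetical and only mirrors the fallback used in the patch.

```python
import json


def load_database(text):
    """Parse a compilation database, falling back to an empty list on bad JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # Raised for empty or truncated input; in Python 3 the concrete
        # exception is json.JSONDecodeError, a subclass of ValueError.
        return []
```

A complete file parses normally, while a truncated one such as '[{"file": "a' yields an empty list instead of aborting the run.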


I hope it helps!
richard

saumil shah

Feb 24, 2018, 10:48:50 AM
to ns-3-users
Hi All,

I also get this error sometimes, but not always. I often start simulations in parallel with a bash script, and most of the time they run perfectly for 2-3 days. Sometimes I get this error and everything stops.

I have tried the suggestion given by richard and it has worked for me so far, though I am not sure whether the error will come back.


@richard: Thanks for your suggestion.

Best Regards
Saumil