Memory footprint and wallclock time per packet

man...@gmail.com

Sep 11, 2025, 9:07:25 AM
to 5G-LENA-users
I am running ns-3 5G-LENA simulations with custom code for slicing, compute, and caching, as well as communication with a Python process via the ns3-ai package to train an RL agent.

Simulations currently consist of 24 UE nodes, 1 gNB, and 9 EPC nodes (SGW + PGW + remote host per slice), as well as 1 custom compute node and 1 caching node. For beamforming, the `QuasiOmniDirectPathBeamforming` class is used.
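
For reference, this is roughly how we select it (a minimal sketch following the stock cttc examples for 5G-LENA 3.x; `nrHelper` is assumed to be an already-created `Ptr<NrHelper>`):

```
// Sketch (5G-LENA 3.x): select quasi-omni direct-path beamforming,
// as in the stock cttc examples. Assumes nrHelper is an existing
// Ptr<NrHelper> created earlier in the scenario.
#include "ns3/ideal-beamforming-algorithm.h"
#include "ns3/ideal-beamforming-helper.h"

Ptr<IdealBeamformingHelper> idealBeamformingHelper =
    CreateObject<IdealBeamformingHelper>();
idealBeamformingHelper->SetAttribute(
    "BeamformingMethod",
    TypeIdValue(QuasiOmniDirectPathBeamforming::GetTypeId()));
nrHelper->SetBeamformingHelper(idealBeamformingHelper);
```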

Packets are sent at 1.2 MBps uplink and 23.2 MBps downlink (with the maximum throughput per gNB being 11.8 MBps and 25.7 MBps) in one specific scenario.

Memory use increases from ~250 MB to ~1200 MB over the run (about 1500 seconds of simulated time). A Valgrind test passes for the ns-3 scenario with ns3-ai communication disabled. In terms of wallclock time, 1 second of simulated time takes 2.65 seconds of wallclock time.

Looking at the literature, I only found ns-3 performance tests for distributed settings (https://digital.library.unt.edu/ark:/67531/metadc827370/m2/1/high_res_d/1073803.pdf); by that comparison, my simulations run about 10 times slower while using about an order of magnitude less memory.

Do you have your own benchmarks I can compare with? Do you have a suggestion for which example is comparable in the 5g Lena package that I can run to compare to?

man...@gmail.com

Sep 16, 2025, 1:11:37 PM
to 5G-LENA-users
I ran `ns3.42-cttc-nr-traffic-3gpp-xr-optimized` with options `--arUeNum=1 --vrUeNum=1 --cgUeNum=1 --appDuration=100000` under heaptrack, with the following summary stats

```
heaptrack stats:
allocations:           356545397
leaked allocations:   19864
temporary allocations: 66029770
```

and analyzed the output:

```
total runtime: 165.50s.
calls to allocation functions: 356545397 (2154339/s)
temporary memory allocations: 104475926 (631270/s)
peak heap memory consumption: 1.02G
peak RSS (including heaptrack overhead): 1.08G
total memory leaked: 1.90M
suppressed leaks: 20.80K
```

The above scenario has quite a high traffic demand, a total of 55 Mbps in the UL direction.
For my scenario (with much less traffic, about 9.28 Mbps UL, so a fair bit lower than the nr example above), heaptrack reported:

```
total runtime: 2995.13s.
calls to allocation functions: 7562630775 (2524979/s)
temporary memory allocations: 4439088254 (1482104/s)
peak heap memory consumption: 168.00M
peak RSS (including heaptrack overhead): 260.22M
total memory leaked: 1.55M
suppressed leaks: 23.92K
```

The memory leak appears to be insignificant, so the main issue seems to be related to the NrPhy layer.
Incidentally, I am using version 3.2.y. Maybe this has improved in the last year?

man...@gmail.com

Sep 17, 2025, 4:50:49 AM
to 5G-LENA-users
I also ran Valgrind on my example, and there were no memory leaks:

```
==547275== Memcheck, a memory error detector
==547275== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==547275== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==547275== Command: /home/ns3/ns-3-dev/build/contrib/marlslicing/examples/ns3.42-run-marl-environment-optimized --sliceConfigFile=/home/ns3/ns-3-dev/config/ues24gnbs1slices3-env/sliceConfig1757961146.json --edgeConfigFile=/home/ns3/ns-3-dev/config/edge-env/edgeConfig1757961146.json --sliceOwnershipFile=/home/ns3/ns-3-dev/config/slice-ownership-env/sliceOwnership1757961146.json --totalTxPower=17 --simTime=330100.0 --numWindows=33 --numUes=24 --numGnbs=1 --numSlices=3 --RngRun=2713 --genStartTime=100 --genStopTime=330000.0 --epcDelay=100 --dlCapacity=25500000.0 --ulCapacity=11800000.0 --timeUnit=MS --gridStep=300 --note=env --logEnabled=False --autoPilot=True
==547275==
==547275==
==547275== HEAP SUMMARY:
==547275==     in use at exit: 0 bytes in 0 blocks
==547275==   total heap usage: 7,562,630,775 allocs, 7,562,630,775 frees, 2,577,804,923,405 bytes allocated
==547275==
==547275== All heap blocks were freed -- no leaks are possible
==547275==
==547275== For lists of detected and suppressed errors, rerun with: -s
==547275== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
```
Looking at other threads and asking ChatGPT-5, it appears that the issue is the channel model configuration. Are there specific settings (`Attributes`) I can change, and what is the recommended way to do this (i.e., using the ChannelHelper or the NrHelper)? Keep in mind I am currently using version 3.2.y, because switching to 4.0+ would mean porting custom code.
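
For concreteness, these are the kinds of attributes I mean, going through the 3.x `NrHelper` API (names taken from the stock cttc examples; I have not yet confirmed they address the growth in my case):

```
// Sketch (5G-LENA 3.x): channel-model attributes set through the
// NrHelper. Attribute names follow the stock cttc examples; please
// verify them against the installed version.

// Compute each link's channel condition once and keep it, instead of
// re-evaluating it periodically (0 ms disables periodic updates).
nrHelper->SetChannelConditionModelAttribute("UpdatePeriod",
                                            TimeValue(MilliSeconds(0)));

// Disable shadow fading to cut per-link state and computation.
nrHelper->SetPathlossAttribute("ShadowingEnabled", BooleanValue(false));
```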

Gabriel Ferreira

Sep 18, 2025, 8:01:35 AM
to 5G-LENA-users
Use heaptrack-gui to investigate the issue. Follow the guide at https://www.nsnam.org/docs/manual/html/profiling.html#id3

Gabriel Ferreira

Sep 18, 2025, 8:02:54 AM
to 5G-LENA-users
Every heap allocation that is not matched by a free is reported as a leak by heaptrack, including allocations made by static initializers. Valgrind excludes these.
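
A minimal illustration of the difference (my own example, not from ns-3):

```
// A heap allocation made by a static initializer and never freed.
// heaptrack counts it among the "leaked allocations"; Valgrind's
// Memcheck reports it as "still reachable" and raises no error by
// default.
static int* g_lookupTable = new int[256](); // allocated before main()

int main()
{
    return g_lookupTable[0]; // keep the table reachable until exit
}
```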

man...@gmail.com

Sep 19, 2025, 11:28:06 AM
to 5G-LENA-users
On Wednesday we made some progress. Heaptrack did not show any leaks either.

I used the massif tool (with massif-visualizer) to analyze heap usage. It turns out the ns-3 event scheduler was growing linearly with time (by about 24 MB per minute).
At the same time, using numerology 2 drove heap growth because of the high number of slots it generates.
Switching the event scheduler from the default map-based one to the heap-based one (which, according to the docs, have different per-event memory overheads) roughly halved the growth rate, to about 12 MB per minute.
The docs mention that one may want to consider changing the scheduler when the simulated duration exceeds an hour, which ours definitely does (we would like to simulate about a day of activity).
Further decreasing the numerology from 2 to 0 dropped the rate to ~7 MB per minute. This is manageable, especially because after about 8 hours of running time the simulator frees some of this space (the amount reclaimed varies).
We consider this a fair trade-off for our approach. I am attaching a screenshot of the output in case you would like to give further feedback or think we missed something; a sketch of the two changes is included below.
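
For reference, a minimal sketch of the two changes (assuming a standard ns-3 / 5G-LENA 3.x setup; `nrHelper` is our `Ptr<NrHelper>`):

```
// 1) Switch from the default map-based scheduler to the heap-based
//    one; must run before any events are scheduled.
ObjectFactory schedulerFactory;
schedulerFactory.SetTypeId("ns3::HeapScheduler");
Simulator::SetScheduler(schedulerFactory);

// 2) Lower the numerology from 2 to 0, reducing the number of slots
//    (and thus scheduled events) per unit of simulated time.
nrHelper->SetGnbPhyAttribute("Numerology", UintegerValue(0));
```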

(Attachment: massif-output.png)