Memory Leak in MUC Component? (Big Jitsi Hostings)

Nick

Apr 30, 2020, 3:38:31 PM
to prosod...@googlegroups.com
There is already an issue on your issue tracker:
https://issues.prosody.im/1312
There is also a thread on the Jitsi community forum:
https://community.jitsi.org/t/prosody-leaking-memory/22311

Large Jitsi deployments are currently running into problems due to a
memory leak in Prosody 0.11.2-1.
The leak is probably in the MUC module.

Do you know whether this is already fixed in later versions? How do you
check for memory leaks?
I think I will try this tool: https://github.com/processone/rtb

Matthew Wild

Apr 30, 2020, 4:22:48 PM
to Prosody IM Users Group
Hi Nick,

On Thu, 30 Apr 2020 at 20:38, Nick <vin...@systemli.org> wrote:
> There is already an issue on your issue tracker:
> https://issues.prosody.im/1312
> There is also a thread on the Jitsi community forum:
> https://community.jitsi.org/t/prosody-leaking-memory/22311

Memory issues can be hard to track down, because they vary widely in cause and solution. Unfortunately many reports we receive are too hard to act on.

We are currently working with a small group of Prosody operators who are helping us gather data about their issues. This project is giving us better insight into what causes memory issues on some servers.

> Large Jitsi deployments are currently running into problems due to a
> memory leak in Prosody 0.11.2-1.

The latest release is 0.11.5.

> The leak is probably in the MUC module.

That's a random guess. I've operated Prosody deployments with hundreds of thousands of MUC rooms without any leaks. Many of the MUC improvements that were released with 0.11.x were driven by the need to efficiently support such large numbers of MUCs.

That's not to say it's impossible for certain configurations to have issues, but it's not helpful to jump to conclusions without evidence. And gathering reliable evidence requires some effort.

> Do you know whether this is already fixed in later versions?

I'm not aware of any memory leaks, fixed or unfixed, in 0.11.x.
 
> How do you check for memory leaks?

We use a range of tools, depending on the kind of leak, and where it is located.

Lua is a garbage-collected language that can't truly leak memory (the Lua VM tracks all object allocations and cleans up anything that is no longer referenced), but sometimes a bug can accidentally cause data to be kept around longer than intended. For debugging this kind of leak, we use a diagnostic tool we built that dumps the Lua state to a file, which can then be inspected in various ways.
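As a rough illustration of this first kind of "leak", here is a minimal sketch written in Python rather than Lua, since the same principle applies to any garbage-collected runtime. The `Session` class, the `registry` table and the helper functions are all hypothetical and are not taken from Prosody. The point is only that an object stays alive for as long as some long-lived table still references it.

```python
import gc
import weakref


class Session:
    """Hypothetical stand-in for per-connection state (not a Prosody type)."""

    def __init__(self, name):
        self.name = name


# A long-lived table, e.g. a module-level cache that is never pruned.
registry = {}


def open_session(name):
    session = Session(name)
    registry[name] = session  # every session is recorded here
    return session


def close_session_buggy(session):
    # The bug: the registry entry is never removed, so the collector
    # must keep the session alive even though nobody uses it any more.
    pass


def close_session_fixed(session):
    registry.pop(session.name, None)  # drop the last long-lived reference


def is_collected(ref):
    gc.collect()
    return ref() is None


s1 = open_session("a")
ref1 = weakref.ref(s1)
close_session_buggy(s1)
del s1
leaked = not is_collected(ref1)  # still reachable through the registry

s2 = open_session("b")
ref2 = weakref.ref(s2)
close_session_fixed(s2)
del s2
freed = is_collected(ref2)  # no references remain, so it is reclaimed

print(leaked, freed)  # prints: True True
```

Inspecting a dump of the live state is essentially a way of finding which table plays the role of `registry` here.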

Lua integrates with C, which is not a language with garbage collection. A number of the libraries that Prosody depends on are written in C, and are obvious candidates for leaks. Luckily there are many existing tools for catching these, such as valgrind.

Finally, there is a third kind of leak, which is caused by a combination of memory allocation patterns and the memory allocator in use. This is known as memory fragmentation. It can make the process appear to the OS to be using more memory than it is really using internally.
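To make the effect concrete, here is a toy model in Python. It is a deliberately simplified sketch, not how glibc malloc, jemalloc or the Lua VM actually manage memory. It assumes memory is handed out in fixed-size pages and that a page can only be returned to the OS once every object on it has been freed. `PAGE_SLOTS` and `simulate` are made-up names for this example.

```python
PAGE_SLOTS = 8  # objects per page in this toy model (arbitrary choice)


def simulate(num_objects, keep_every):
    """Allocate num_objects sequentially into fixed-size pages, then free
    all but every keep_every-th object.

    Returns (live_objects, pages_still_held).
    """
    pages = {}
    for obj_id in range(num_objects):
        pages.setdefault(obj_id // PAGE_SLOTS, set()).add(obj_id)

    for obj_id in range(num_objects):
        if obj_id % keep_every != 0:
            pages[obj_id // PAGE_SLOTS].discard(obj_id)

    live = sum(len(page) for page in pages.values())
    # A page goes back to the OS only when it is completely empty.
    held = sum(1 for page in pages.values() if page)
    return live, held


live, held = simulate(num_objects=8000, keep_every=8)
internal = live              # slots the program actually uses
os_view = held * PAGE_SLOTS  # slots the OS still counts against the process

print(live, held, internal / os_view)  # prints: 1000 1000 0.125
```

In this model 7 out of every 8 objects have been freed, yet no page can be released, because one survivor is left on each page. The process looks eight times larger from the outside than it is on the inside.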

After investigating a range of deployments, we've not found any leaks tied to the Prosody codebase. We did identify a small leak in a third-party library (LuaSec), but this is unlikely to affect Jitsi deployments that don't federate.

Most of our current analysis points to the third cause being the most common issue. Unfortunately it's potentially one of the hardest for us to fix, since allocation patterns are largely controlled by the Lua VM and the garbage collector.

As part of this work we recently adapted our trunk nightly packages to run on any Lua version (5.1, 5.2, 5.3, soon 5.4). Each Lua version has received changes to its algorithms, and each one has different configuration options that control the GC and alter allocation patterns. This will allow us to perform more experiments and gather more data about what may reduce fragmentation for affected deployments.
 
Another avenue we are pursuing in parallel is experimenting with different memory allocators. In the past I demonstrated that jemalloc significantly reduced fragmentation in earlier versions of Prosody. If we can reproduce that result, we may look at making it easier (or the default) to run Prosody with jemalloc. I'm not going to provide instructions here as the internet is likely full of them, but actually testing jemalloc is something anyone can do with no code changes and just a bit of sysadmin skill.

> I think I will try this tool: https://github.com/processone/rtb

Please do! Last time we ran that tool against Prosody, the author of rtb commented on how Prosody used far less memory than ejabberd under the same test :)

But if you find a way to reproduce memory issues with rtb, please share details of your setup and we'd be very interested.

Hope this info all helps!

Regards,
Matthew