High memory usage in KStream JOIN


Bobby Harsono

Nov 27, 2024, 10:25:33 PM
to rocksdb
Good day,

I am using KStream with the Java API (and Spring Boot) to correlate (LEFT JOIN) two topics, let's say topic A and topic B.

- A and B each have a TPS in the hundreds of thousands, with roughly 100 GB of event data per 3 hours
- Using Java 11 and KStream + the Topology Processor interface
- LEFT JOIN on the exact event key with a 15-minute window
- A and B have a retention of about 1 hour


The memory usage is extremely high (around 90-100 GB) and only drops after each window elapses.
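As a rough cross-check of the figures above, here is a back-of-envelope sketch of how many raw event bytes fall inside one 15-minute window. It assumes the 100 GB per 3 hours figure is per topic, uses decimal gigabytes, and assumes a uniform arrival rate; the class and method names are made up for illustration. The join's state stores can retain more than one window's worth of data, so treat the result as a lower bound on stored data, not as a prediction of process memory.

```java
// Back-of-envelope estimate only; names and inputs are illustrative assumptions.
final class JoinWindowVolume {

    /** Bytes arriving during windowSeconds, given totalBytes observed over periodSeconds. */
    static long bytesInWindow(long totalBytes, long periodSeconds, long windowSeconds) {
        // multiplyExact throws on overflow instead of silently wrapping.
        return Math.multiplyExact(totalBytes, windowSeconds) / periodSeconds;
    }

    public static void main(String[] args) {
        long totalBytes = 100_000_000_000L; // assumed: 100 GB per topic, decimal units
        long periodSeconds = 3 * 3600L;     // over 3 hours
        long windowSeconds = 15 * 60L;      // 15-minute join window

        long perTopic = bytesInWindow(totalBytes, periodSeconds, windowSeconds);
        System.out.println("bytes per topic per window: " + perTopic);
        System.out.println("bytes for both topics:      " + 2 * perTopic);
    }
}
```

Under these assumptions one window holds roughly 8.3 GB per topic, which is well below the 90-100 GB observed, so the windowed data alone may not explain the footprint.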

So I tried to tune RocksDB as below:
rocksdb.write.buffer.size=1073741824
rocksdb.max.write.buffer.number=2
rocksdb.block.cache.size=1073741824
rocksdb.increase.parallelism=2
rocksdb.level.compaction.style=true
rocksdb.min.write.buffer.number.to.merge=1
rocksdb.total.off.heap.memory=4294967296
rocksdb.total.memtable.size=2147483648
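One thing worth checking with settings like these: in RocksDB, write_buffer_size and max_write_buffer_number apply per column family, and Kafka Streams opens a separate RocksDB instance per state store partition (windowed stores are further split into segments). Unless a shared write buffer limit is actually wired in, the worst case scales with the number of instances. The sketch below only does that arithmetic; the instance count of 24 is a made-up number, and whether the rocksdb.total.* properties above are enforced across instances depends on how the application maps them, which the thread does not show.

```java
// Worst-case memtable arithmetic; the instance count is a hypothetical input.
final class MemtableBudget {

    /** Worst-case memtable bytes if every instance fills all of its write buffers. */
    static long worstCaseBytes(long writeBufferSize, int maxWriteBufferNumber, int instances) {
        long perInstance = Math.multiplyExact(writeBufferSize, (long) maxWriteBufferNumber);
        return Math.multiplyExact(perInstance, (long) instances);
    }

    public static void main(String[] args) {
        long writeBufferSize = 1_073_741_824L; // rocksdb.write.buffer.size from the post (1 GiB)
        int maxWriteBufferNumber = 2;          // rocksdb.max.write.buffer.number from the post
        int hypotheticalInstances = 24;        // assumption: not stated anywhere in the thread

        System.out.println("per instance:  " + worstCaseBytes(writeBufferSize, maxWriteBufferNumber, 1));
        System.out.println("all instances: "
                + worstCaseBytes(writeBufferSize, maxWriteBufferNumber, hypotheticalInstances));
    }
}
```

If the real instance count is large, a per-instance write buffer of 1 GiB can exceed a 2 GiB memtable target many times over, unless the total is enforced by a limit shared across instances.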


Reducing those numbers does not seem to have any effect.

Any ideas? Or have I perhaps done something wrong?

Thank you

Bobby Harsono

Nov 27, 2024, 10:36:39 PM
to rocksdb
Yesterday we tried reducing the off-heap memory to 4 GB and the memtable size to 3 GB, and also reduced the write buffer size to 4 MB and the block cache size to 60 MB.

Still the same result: memory usage keeps increasing.

Ted Mostly

Nov 28, 2024, 1:49:23 AM
to rocksdb
cache_index_and_filter_blocks=true
pin_l0_filter_and_index_blocks_in_cache=true

Optional:

optimize_filters_for_hits=true
optimize_filters_for_memory=true


Petr Langr

Nov 30, 2024, 5:58:18 AM
to rocksdb
Hi Bobby,

I had the same issue with Java and RocksDB. I went through all the configuration options and suggestions about memory usage in RocksDB. I assume you've seen https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB.

In my case the issue was memory fragmentation. I'm running my app in a distroless image that uses a Debian base image, which by default uses the glibc implementation of malloc. After switching to jemalloc, memory usage dropped by half and stopped increasing.

Best,
Petr


MARK CALLAGHAN

Nov 30, 2024, 6:13:22 PM
to Petr Langr, rocksdb

Bobby Harsono

Dec 2, 2024, 3:36:16 AM
to rocksdb
I tried jemalloc with the config below:

write-buffer-size=67108864
min-write-buffer-number-to-merge=2
max-write-buffer-number=3
block-cache-size=134217728

off-heap memory = 4 GB
memtable size = 2 GB

OPTIMIZE_FILTERS_FOR_HITS = false
OPTIMIZE_FILTERS_FOR_MEMORY = false

MALLOC_ARENA_MAX = 2


Memory usage still keeps increasing, so now I'm going to try tcmalloc.

If that does not work, I will have to look for something else.

Bobby Harsono

Dec 2, 2024, 8:04:28 AM
to rocksdb
I mistakenly used MALLOC_ARENA_MAX; that setting applies to glibc only.

With tcmalloc at its default settings, I managed to cut memory usage by 50%.

I will try jemalloc + MALLOC_CONF later.
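When swapping allocators like this, it can help to confirm from inside the JVM that the preloaded library was actually picked up. A minimal sketch, assuming Linux and a dynamically loaded allocator whose library path contains "jemalloc" or "tcmalloc": it scans /proc/self/maps for those names. The class name and the "default" label are made up for illustration, and a statically linked allocator would not be detected this way.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Linux-only heuristic: looks for allocator library names among the process's mapped files.
final class AllocatorCheck {

    /** Returns "jemalloc", "tcmalloc", or "default" when neither name appears in the given lines. */
    static String detect(List<String> mapsLines) {
        for (String line : mapsLines) {
            if (line.contains("jemalloc")) {
                return "jemalloc";
            }
            if (line.contains("tcmalloc")) {
                return "tcmalloc";
            }
        }
        return "default"; // most likely the libc allocator, e.g. glibc malloc
    }

    public static void main(String[] args) throws IOException {
        Path maps = Paths.get("/proc/self/maps");
        if (!Files.isReadable(maps)) {
            System.out.println("/proc/self/maps is not available on this system");
            return;
        }
        System.out.println("allocator: " + detect(Files.readAllLines(maps)));
    }
}
```

If this reports "default" after setting LD_PRELOAD, the preload probably did not reach the JVM process, and allocator-specific variables such as MALLOC_CONF would have no effect.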
