Hi guys,
I am running scalability tests with Twister2.
I am running 240 workers on 10 nodes. Each node is running 24 workers.
I first generated 200M tweetID-date pairs for each worker.
In total, we have 200M x 240 = 48 billion tweetID-date pairs.
I am starting all workers with 4096MB of memory.
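For reference, the arithmetic behind this setup can be sketched in a few lines of Java. This is only a back-of-the-envelope calculation, not part of the actual job: the class name ScaleEstimate is made up for illustration, and the 16 bytes per pair (an 8-byte long for the tweetID plus an 8-byte long for the date) is an assumption about the raw data size, not a measurement of what the workers really hold in memory.

```java
public class ScaleEstimate {
    // Figures taken from the setup described above.
    static final long PAIRS_PER_WORKER = 200_000_000L;
    static final int WORKERS = 240;
    static final int NODES = 10;
    static final long HEAP_MB_PER_WORKER = 4096L;

    // ASSUMPTION (illustration only): each pair is an 8-byte long ID
    // plus an 8-byte long date, with no object or buffer overhead.
    static final long ASSUMED_BYTES_PER_PAIR = 16L;

    // Total number of pairs across all workers.
    static long totalPairs() {
        return PAIRS_PER_WORKER * WORKERS;
    }

    // Number of workers placed on each node.
    static int workersPerNode() {
        return WORKERS / NODES;
    }

    // Total heap requested on one node, in MB.
    static long heapMbPerNode() {
        return HEAP_MB_PER_WORKER * workersPerNode();
    }

    // Raw size of a given number of pairs in MB (integer division),
    // under the assumed bytes-per-pair figure.
    static long rawMb(long pairs) {
        return pairs * ASSUMED_BYTES_PER_PAIR / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("total pairs          : " + totalPairs());
        System.out.println("workers per node     : " + workersPerNode());
        System.out.println("heap per node (MB)   : " + heapMbPerNode());
        System.out.println("raw MB, full input   : " + rawMb(PAIRS_PER_WORKER));
        System.out.println("raw MB, first 20M    : " + rawMb(20_000_000L));
    }
}
```

The real in-memory footprint depends on how the pairs are represented (boxed objects, strings, and intermediate buffers can all be much larger than the raw figure), so these numbers are only a lower-bound sanity check.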
They are supposed to read the input files, partition the data, and persist it.
Workers read the first 10M tweetID-date pairs without any issues.
However, before they finish reading 20M tweetID-date pairs, they throw an out-of-memory exception.
I am attaching the logs of one of the workers:
The program that I run is at:
Thanks,
Ahmet