Cluster Memory Size


Tim Wisniewski

Nov 12, 2009, 2:44:12 PM
to Hazelcast
Hi,

What is the maximum size of a hazelcast distributed collection within
the cluster? Is it related to the size of the node with the minimum
amount of memory or is it some fraction of the ram of the entire
cluster? Thanks.

-Tim

Talip Ozturk

Nov 12, 2009, 4:19:41 PM
to haze...@googlegroups.com
Yes, it is all about memory. Each entry will consume key-byte-size +
value-byte-size + (~200 bytes). Do not forget the backups.

Now, if you have 100K map entries with 1 backup, then your total memory
in the cluster should be about 2 * 100K * (key-byte-size +
value-byte-size + 200 bytes).
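
As a rough sketch of that estimate in Java: the class and method names here are made up for illustration, the ~200-byte overhead is the approximate figure quoted above, and the key and value sizes in main() are hypothetical.

```java
public class MapMemoryEstimate {
    // Approximate per-entry overhead quoted in this thread (~200 bytes); not an exact figure.
    static final long PER_ENTRY_OVERHEAD_BYTES = 200;

    /** Estimated cluster-wide bytes for a map: the primary copy plus backupCount backup copies. */
    public static long estimateClusterBytes(long entries, long keyBytes, long valueBytes, int backupCount) {
        long perEntry = keyBytes + valueBytes + PER_ENTRY_OVERHEAD_BYTES;
        return (1L + backupCount) * entries * perEntry;
    }

    public static void main(String[] args) {
        // Hypothetical sizes: 100K entries, 20-byte keys, 1000-byte values, 1 backup.
        long bytes = estimateClusterBytes(100_000, 20, 1000, 1);
        System.out.println("Estimated cluster memory: " + bytes / (1024 * 1024) + " MB");
    }
}
```

This only covers the map data itself; each JVM's own baseline heap use comes on top of it.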

-talip
> --
>
> You received this message because you are subscribed to the Google Groups "Hazelcast" group.
> To post to this group, send email to haze...@googlegroups.com.
> To unsubscribe from this group, send email to hazelcast+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/hazelcast?hl=.
>
>
>

Tim Wisniewski

Nov 13, 2009, 2:08:43 PM
to Hazelcast
thanks for your reply. what I'm actually trying to find out is can
the collection occupy more memory than any one node? sorry I was
unclear.

Talip Ozturk

Nov 13, 2009, 9:04:24 PM
to haze...@googlegroups.com
Yes, it can. Let's say you have 10 servers, each with 2GB RAM. This
cluster can hold a distributed map of about 10GB of data with 1 backup
copy (20GB in total, half of it used by the backups). So the entire
collection occupies more memory than any one node.
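
The same arithmetic as a small Java sketch. The names are hypothetical, and it deliberately ignores per-entry overhead and the memory each JVM needs for itself, so treat the result as an upper bound.

```java
public class ClusterCapacity {
    /**
     * Upper bound on the distinct data a cluster can hold when every entry
     * is stored (1 + backupCount) times across the nodes.
     */
    public static double usableDataGb(int nodes, double ramPerNodeGb, int backupCount) {
        return nodes * ramPerNodeGb / (1 + backupCount);
    }

    public static void main(String[] args) {
        // The example from this post: 10 servers with 2GB each and 1 backup copy.
        System.out.println(usableDataGb(10, 2.0, 1) + " GB of distinct data");
    }
}
```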

-talip

erikf

Dec 8, 2009, 9:42:56 PM
to Hazelcast
I'm trying out hazelcast with the intention of holding a large dataset
in memory -- in the order of 50GB, so it wouldn't fit in a single
JVM. To figure out how much heap space I need, I started a JVM with
512MB heap on five servers (i.e. total heap size is about 2.5GB) and
started loading entries into hazelcast maps. The keys are short
strings (<100 characters) and the values are com.hazelcast.nio.Data
objects.

I connect jconsole to each of the JVMs to monitor memory usage.

After loading 65389 entries with average value.size() of 1100 bytes,
the memory usage according to your formula should be 2 * 65389 * (~100
+ 1100 + 200bytes) = 175MB, but when I add up the heap sizes of the
JVMs (after invoking garbage collection a couple of times) it adds up
to 262MB, so I guess java adds some extra overhead.

The ~50GB dataset I mentioned consists of about 7.5 million records,
so to fit the whole dataset, I need 262MB * 7.5M / 65389 = ~300GB of
heap, So six servers with 64GB RAM each might be able to do it.

Has anyone out there used Hazelcast for datasets that large?

-erik

Talip Ozturk

Dec 8, 2009, 10:01:25 PM
to hazelcast
Erik,

What version of Hazelcast?

Can you post the code that puts up 65389 records with 1100-byte-values
and measures the memory used?

Regards,
-talip

erikf

Dec 9, 2009, 4:33:11 PM
to Hazelcast
I had another mistake in my math -- no wonder this was confusing me.

The actual memory use that I read from jconsole was 262MB per JVM, so
1313MB total. That's much higher than the ~175MB I would expect, and
I'm surprised at the level of overhead I'm seeing.

(The overall total memory calculation now becomes 1313MB * 7.5M /
65389 = 147GB which probably is manageable in my case)
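
A small sketch of this extrapolation, assuming heap use grows linearly with the number of entries (the sample may not guarantee that). The class and method names are made up for illustration.

```java
public class HeapExtrapolation {
    /** Scales a heap measurement taken on a sample of entries up to the full dataset. */
    public static double extrapolateMb(double measuredMb, long sampleEntries, long totalEntries) {
        return measuredMb * totalEntries / sampleEntries;
    }

    public static void main(String[] args) {
        // Numbers from this thread: 1313MB measured for 65389 entries, 7.5 million records in total.
        double totalMb = extrapolateMb(1313, 65_389, 7_500_000);
        System.out.printf("Extrapolated heap: %.0f GB%n", totalMb / 1024);
    }
}
```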

I guess it's time to grab a couple of cups of coffee and do some more
thorough experiments before I post any more incorrect numbers.

-erik

erikf

Dec 9, 2009, 3:42:28 PM
to Hazelcast
A couple of things to keep in mind regarding this:

- 64-bit JVMs appear to add more overhead than 32-bit JVMs, which is
reasonable as the pointers in the JVM's internal memory structures are
twice as big.
- The garbage collector used can also affect memory usage. For
example, -XX:+UseConcMarkSweepGC -XX:+UseParNewGC appeared to be more
memory-hungry than the default MarkSweep/Scavenge in my test.

So the numbers in my previous comments were collected on the 32-bit
version of Sun's JDK 1.6.0_14 on Linux with JVM arguments:
-Xmx512m -Xms512m -Xloggc:gc.log
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.port=8766


-erik

erikf

Dec 9, 2009, 3:28:24 PM
to Hazelcast
First of all, it looks like I'm off by a factor of 10 in my
calculation. 262MB * 7.5M / 65389 is actually ~30GB. That means my
estimate of 50 GB for the whole dataset is also wrong -- I'll have
to run some more experiments on this, but it seems like I can get away
with less hardware.


The hazelcast version is 1.7.1

My code reads data out of a database, so it's not quite "portable" as
a test case, but I can try to make a version that creates dummy data
and post that.

My entries are straightforward beans:

public class MyBean implements com.hazelcast.nio.DataSerializable {
    String s; long n;
    ... (there are 20 properties of types long, String, float)
    public void writeData(DataOutput out) throws IOException {
        out.writeUTF(s); out.writeLong(n); ...
    }
    public void readData(DataInput in) throws IOException {
        s = in.readUTF(); n = in.readLong(); ...
    }
}


The code that puts entries into the map looks like this:

IMap<String, Data> map = Hazelcast.getMap("mymap");
com.hazelcast.nio.Serializer ser = new Serializer();
long n = 0;
long totalSize = 0;
while (rs.next()) {
    MyBean bean = new MyBean(rs.getString(1), rs.getLong(2), ...);
    Data data = ser.writeObject(bean);
    Data old = map.get(bean.s);
    if (old == null || !old.equals(data)) {
        map.put(bean.s, data);
        n++;
        totalSize += data.size();
    }
}
System.out.println("n=" + n + " totalSize=" + totalSize);


That last println() statement is where I got the 65389 number and the
average size of 1100 bytes.

I start up the application with -Dcom.sun.management.jmxremote and
connect jconsole to each instance. In jconsole's "Memory" tab, I look
at the size of the memory pool "PS Old Gen" to determine the memory use.
Before each reading, I invoke the garbage collector (Full GC) to get
rid of garbage and move all the surviving objects to the Old Gen pool.
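
The same reading can be taken from inside the JVM with the standard java.lang.management API. This is only a sketch: the class name is made up, pool names depend on the collector in use ("PS Old Gen" with the parallel collector, other names elsewhere), and System.gc() is a request the JVM is free to ignore.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class OldGenUsage {
    /** Sums the used bytes of all memory pools whose name contains the given fragment. */
    public static long usedBytes(String poolNameFragment) {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains(poolNameFragment)) {
                MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
                if (usage != null) {
                    used += usage.getUsed();
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        System.gc(); // ask for a collection before reading, as done by hand in jconsole
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println("pool: " + pool.getName());
        }
        System.out.println("Old Gen used: " + usedBytes("Old Gen") / (1024 * 1024) + " MB");
    }
}
```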

I realize that I'm counting _all_ the memory that's used, not only the
memory that's used to hold the map itself. However, when the app
first starts up, it uses 8.7MB. I guess I can subtract that memory (5
jvms * 8.7 MB = 43.5MB) from my previous calculation, which takes it
from 262MB to 218.5 MB. Then I can extrapolate the total heap size
needed to 218.5 * 7.5M / 65389 = ~25 GB, but I would also have to add
in at least 8.7MB overhead for each JVM I would run.

I'll keep working on this.

-erik



Talip Ozturk

Dec 10, 2009, 12:43:34 AM
to hazelcast
> The hazelcast version is 1.7.1

definitely try 1.8-snapshot.

> public class MyBean implements com.hazelcast.nio.DataSerializable {

very nice.

>    IMap<String, Data> map = Hazelcast.getMap("mymap");
>    com.hazelcast.nio.Serializer ser = new Serializer();


Why are you working directly with the Serializer?
Why not map.put(bean.s, bean)?

Here is a simple test that I ran. It is putting 65389 entries into a
map with 1100 byte values.

public static void main(String[] args) throws Exception {
    IMap map = Hazelcast.getMap("default");
    for (int i = 0; i < 65389; i++) {
        map.put("StringKey" + i, new byte[1100]);
    }
    Runtime r = Runtime.getRuntime();
    while (true) {
        System.out.println("map size " + map.size());
        r.gc();
        System.out.println("Used " + ((r.maxMemory() - r.freeMemory()) / 1024 / 1024) + " MB");
        Thread.sleep(3000);
    }
}


With the latest code in SVN, it prints

map size 65389
Used 113 MB

-talip

erikf

Dec 10, 2009, 5:23:03 PM
to Hazelcast
> why are you working directly with the Serializer.
> why not map.put (bean.s, bean)

I was trying to store the object in serialized form in the map,
thinking that it would take up less memory that way. I probably need
to put byte[] values instead of Data values to achieve that.

> With the latest code in SVN, it prints
>
> map size 65389
> Used 113 MB

I tried this test program on a few different JVMs, and on all of them
I get much higher values than you do. To make the platform more
evident, I added these lines to your program:

System.out.println(System.getProperty("java.vm.name") + " " +
System.getProperty("java.runtime.version"));
System.out.println(System.getProperty("os.name") + " " +
System.getProperty("os.version"));
System.out.println(System.getProperty("sun.management.compiler") +
" data.model=" + System.getProperty("sun.arch.data.model"));

Then I ran your test program with the latest 1.8 snapshot:

When I run it inside eclipse:

Java HotSpot(TM) Server VM 1.6.0_12-b04
Linux 2.6.31.6-162.fc12.i686
HotSpot Tiered Compilers data.model=32
map size 65389
Used 703 MB

When I run it on the command line:

Java HotSpot(TM) Server VM 1.6.0_12-b04
Linux 2.6.31.6-162.fc12.i686
HotSpot Tiered Compilers data.model=32
map size 65389
Used 881 MB

Command line with Java5:

Java HotSpot(TM) Server VM 1.5.0_16-b02
Linux 2.6.31.6-162.fc12.i686
HotSpot Server Compiler data.model=32
map size 65389
Used 940 MB

Now I switch to a big server: (this is the environment where I would
run my real app)

Java HotSpot(TM) Server VM 1.6.0_14-b08
Linux 2.6.18-128.el5
HotSpot Tiered Compilers data.model=32
map size 65389
Used 815 MB

On the big server with 64-bit JVM:

Java HotSpot(TM) 64-Bit Server VM 1.6.0_11-b03
Linux 2.6.18-128.el5
HotSpot 64-Bit Server Compiler data.model=64
map size 65389
Used 3298 MB

On a Windows Terminal server that I have access to:

Java HotSpot(TM) 64-Bit Server VM 1.6.0_13-b03
Windows Server 2008 6.0
HotSpot 64-Bit Server Compiler data.model=64
map size 65389
Used 3268 MB

My buddy tried it on his Windows7 box:

Java HotSpot(TM) Client VM 1.6.0_17-b04
Windows 7 6.1
HotSpot Client Compiler data.model=32
map size 65389
Used 1400 MB


So, my lowest reading was 703 MB which is over six times higher than
what you are seeing. I don't know how to explain that.

What environment did you do your measurement in?


-erik

erik.fo...@gmail.com

Dec 11, 2009, 2:26:54 PM
to Hazelcast
> > With the latest code in SVN, it prints
> >
> > map size 65389
> > Used 113 MB

I enhanced the test program, so I can choose between using Hazelcast and just a ConcurrentHashMap to store the test data, and I get about the same memory usage with both options. So Hazelcast appears to be optimal, while the JVM is wasting memory.


My test program follows...

-erik




import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.IMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Talip {

    public static void main(String[] args) throws Exception {
        Map map;
        if (args.length > 0 && args[0].equals("h")) {
            System.out.println("Using Hazelcast for storage");
            map = Hazelcast.getMap("default");
        } else {
            System.out.println("Using ConcurrentHashMap for storage");
            map = new ConcurrentHashMap();
        }
        for (int i = 0; i < 65389; i++) {
            map.put("StringKey" + i, new byte[1100]);
        }
        System.out.println(System.getProperty("java.vm.name") + " " + System.getProperty("java.runtime.version"));
        System.out.println(System.getProperty("os.name") + " " + System.getProperty("os.version"));
        System.out.println(System.getProperty("sun.management.compiler") + " data.model=" + System.getProperty("sun.arch.data.model"));
        Runtime r = Runtime.getRuntime();
        for (int i = 1; i < 5; i++) {
            System.out.println("map size " + map.size());
            r.gc();
            System.out.println("Used " + ((r.maxMemory() - r.freeMemory()) / 1024 / 1024) + " MB");
            Thread.sleep(3000);
        }
        Hazelcast.shutdown();
        System.out.println("bye.");
    }
}

erik.fo...@gmail.com

Dec 15, 2009, 4:38:04 PM
to Hazelcast
A couple of findings related to this thread:

- The calculation of used memory in the test program is incorrect. Instead of Runtime.maxMemory() it works better to use Runtime.totalMemory(), so the computation becomes:
System.out.println("Used " + ((r.totalMemory() - r.freeMemory()) / 1024 / 1024) + " MB");

- Enabling Hazelcast's JMX agent makes it significantly slower and more memory-hungry. The test program (with the above change) run without JMX enabled takes 7.8 seconds to populate the map and uses 111MB of memory on my box. When adding the JVM parameter -Dcom.sun.management.jmxremote the same program takes 25.2 seconds to populate the map and uses 333 MB of memory. That's roughly a 3x factor for both time and memory. Looking at the source code, it looks like Hazelcast's JMX agent is making a managed bean out of each map entry, which naturally uses more resources. In the future, this level of detail could be reduced via the method boolean ManagementService.showDetails(), but that's not implemented yet.
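
On the first finding: maxMemory() is the most the heap may ever grow to (roughly -Xmx), totalMemory() is what the JVM has currently reserved, and freeMemory() is the unused part of that reserved amount. So maxMemory() - freeMemory() also counts heap the JVM has not reserved yet, which inflates the reading. A minimal sketch, with a class name made up for illustration:

```java
public class UsedHeap {
    /** Heap currently in use: the reserved heap minus the free part of it. */
    public static long usedBytes(Runtime r) {
        return r.totalMemory() - r.freeMemory();
    }

    public static void main(String[] args) {
        Runtime r = Runtime.getRuntime();
        long mb = 1024 * 1024;
        r.gc(); // only a hint, the JVM may ignore it
        System.out.println("max   " + r.maxMemory() / mb + " MB");
        System.out.println("total " + r.totalMemory() / mb + " MB");
        System.out.println("free  " + r.freeMemory() / mb + " MB");
        System.out.println("used  " + usedBytes(r) / mb + " MB");
        // The original calculation, shown for comparison:
        System.out.println("max - free " + (r.maxMemory() - r.freeMemory()) / mb + " MB");
    }
}
```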

Hazelcast's JMX agent is enabled either by setting the system property hazelcast.jmx to "true" or by specifying the system property com.sun.management.jmxremote, which enables JMX in general. The second case means that any application that uses JMX will also get the Hazelcast JMX agent. In my case, I want to use JMX but not enable Hazelcast's JMX agent, i.e. run my application with " -Dcom.sun.management.jmxremote -Dhazelcast.jmx=false ". To get that to work the way I want, I made this change to the ManagementService.init() method:

- if (!("TRUE".equalsIgnoreCase(System.getProperty(ENABLE_JMX))
- || System.getProperties().containsKey("com.sun.management.jmxremote"))) {
- // JMX disabled
- return false;
- }

+ String enableJmx = System.getProperty(ENABLE_JMX);
+ if (enableJmx != null) {
+ if (!"TRUE".equalsIgnoreCase(enableJmx)) return false;
+ } else {
+ // user has not configured hazelcast jmx specifically, so fall back to overall JVM setting
+ if (! System.getProperties().containsKey("com.sun.management.jmxremote")) return false;
+ }
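
The decision in the patched version can be tried out in isolation. The sketch below is not Hazelcast code: the class and method names are made up, and it assumes ENABLE_JMX holds the property name "hazelcast.jmx" mentioned above.

```java
import java.util.Properties;

public class JmxSwitch {
    // Assumed value of ENABLE_JMX, based on the property name given in this post.
    static final String ENABLE_JMX = "hazelcast.jmx";
    static final String JVM_JMX = "com.sun.management.jmxremote";

    /** An explicit hazelcast.jmx setting wins; otherwise fall back to the JVM-wide JMX flag. */
    public static boolean hazelcastJmxEnabled(Properties props) {
        String enableJmx = props.getProperty(ENABLE_JMX);
        if (enableJmx != null) {
            return "TRUE".equalsIgnoreCase(enableJmx);
        }
        return props.containsKey(JVM_JMX);
    }

    public static void main(String[] args) {
        // Try: java -Dcom.sun.management.jmxremote -Dhazelcast.jmx=false JmxSwitch
        System.out.println("Hazelcast JMX enabled: " + hazelcastJmxEnabled(System.getProperties()));
    }
}
```

With both -Dcom.sun.management.jmxremote and -Dhazelcast.jmx=false set, this returns false, which is the behavior I was after.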


-erik




Talip Ozturk

Dec 15, 2009, 4:48:27 PM
to hazelcast
> - The calculation of used memory in the test program is incorrect. Instead
> of Runtime.maxMemory() it works better to use Runtime.totalMemory(), so the
> computation becomes:
> System.out.println("Used " + ((r.totalMemory() -r.freeMemory()) / 1024 /
> 1024) + " MB");

Right. With the totalMemory - freeMemory calculation, when JMX is disabled,
I am getting 113MB.
How about you?


> - Enabling Hazelcast's JMX agent makes it significantly slower and more
> memory-hungry. The test program (with the above change) run without JMX
> enabled takes 7.8 seconds to populate the map and uses 111MB of memory on my
> box. When adding the JVM parameter -Dcom.sun.management.jmxremote the same
> program takes 25.2 seconds to populate the map and uses 333 MB of memory.
> That's a 3x factor for both time and memory. Looking at the source code, it
> looks like Hazelcast jmx is making a managed bean out of each map entry,
> which naturally uses more resources. In the future, this level of detail
> could be reduced via the method boolean ManagementService.showDetails(), but
> that's not implemented yet.

Right. JMX by default watches all map entry updates, which costs a lot.
That shouldn't be the default.

> Hazelcast's JMX agent is enabled either by setting the system property
> hazelcast.jmx to "true" or by specifying the system property
> com.sun.management.jmxremote that enabled JMX in general. The second case
> means that any application that uses JMX will also get the Hazelcast JMX
> agent. In my case, I want to use JMX, but not enable Hazelcast's JMX agent,
> i.e. Run my application with " -Dcom.sun.management.jmxremote
> -Dhazelcast.jmx=false " To get that to work the way I want, I made this
> change to the ManagementService.init() method:
>
> - if (!("TRUE".equalsIgnoreCase(System.getProperty(ENABLE_JMX))
> - || System.getProperties().containsKey("com.sun.management.jmxremote"))) {
> - // JMX disabled
> - return false;
> - }
>
> + String enableJmx = System.getProperty(ENABLE_JMX);
> + if (enableJmx != null) {
> + if (!"TRUE".equalsIgnoreCase(enableJmx)) return false;
> + } else {
> + // user has not configured hazelcast jmx specifically, so fall back to
> overall JVM setting
> + if (! System.getProperties().containsKey("com.sun.management.jmxremote"))
> return false;
> + }

ok. we will look into that.

thanks,
-talip