I'm back again :-)
Running the same benchmark as before, on 2 JVMs resident on the same server, throughput drops from 63k msg/sec to 5.9k msg/sec: more than 10 times slower. Below is the full report (I stopped it after a while). You can see that as soon as the 2nd node joins, throughput slows down immediately. I was expecting a slowdown, but not more than 10x on the same server.
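(For reference, the per-window rates in the report below appear to be just messages per elapsed window. A hypothetical helper, not part of the benchmark code, assuming that is how the figures are computed:)

```java
// Hypothetical helper, not part of the benchmark: reproduces the
// per-window msg/sec figures printed in the report below.
public class Throughput {
    static double msgPerSec(long messages, long elapsedMs) {
        // messages per window divided by elapsed seconds
        return messages * 1000.0 / elapsedMs;
    }

    public static void main(String[] args) {
        // First window of the 2-node run: 100000 msgs in 1926 ms
        System.out.println(msgPerSec(100_000, 1926)); // ~51921 msg/sec
    }
}
```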
How can we improve this?
Thanks,
Lvc@
set 24, 2014 11:04:20 AM com.hazelcast.config.XmlConfigLocator
Loading 'hazelcast-default.xml' from classpath.
set 24, 2014 11:04:20 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Prefer IPv4 stack is true.
set 24, 2014 11:04:20 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Picked Address[10.3.15.55]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
set 24, 2014 11:04:20 AM com.hazelcast.spi.impl.BasicOperationScheduler
[10.3.15.55]:5701 [dev] [3.3] Starting with 16 generic operation threads and 16 partition operation threads.
set 24, 2014 11:04:20 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Hazelcast 3.3 (20140907) starting at Address[10.3.15.55]:5701
set 24, 2014 11:04:20 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Copyright (C) 2008-2014 Hazelcast.com
set 24, 2014 11:04:20 AM com.hazelcast.instance.Node
[10.3.15.55]:5701 [dev] [3.3] Creating MulticastJoiner
set 24, 2014 11:04:20 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTING
set 24, 2014 11:04:28 AM com.hazelcast.cluster.MulticastJoiner
[10.3.15.55]:5701 [dev] [3.3]
Members [1] {
Member [10.3.15.55]:5701 this
}
set 24, 2014 11:04:28 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTED
1 Start sending msgs
1 Start receiving msgs
set 24, 2014 11:04:28 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] Initializing cluster partition table first arrangement...
set 24, 2014 11:04:28 AM com.hazelcast.nio.tcp.SocketAcceptor
[10.3.15.55]:5701 [dev] [3.3] Accepting socket connection from /10.3.15.55:52568
set 24, 2014 11:04:28 AM com.hazelcast.nio.tcp.TcpIpConnectionManager
<- messages 100000/1000000 = 1926ms (51921,078125 msg/sec)
-> messages 100000/1000000 = 1926ms (51921,078125 msg/sec)
-> messages 200000/1000000 = 1563ms (63979,527344 msg/sec)
<- messages 200000/1000000 = 1563ms (63979,527344 msg/sec)
-> messages 300000/1000000 = 1555ms (64308,683594 msg/sec)
<- messages 300000/1000000 = 1555ms (64308,683594 msg/sec)
set 24, 2014 11:04:34 AM com.hazelcast.cluster.ClusterService
[10.3.15.55]:5701 [dev] [3.3]
Members [2] {
Member [10.3.15.55]:5701 this
Member [10.3.15.55]:5702
}
set 24, 2014 11:04:34 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] Re-partitioning cluster data... Migration queue size: 135
set 24, 2014 11:04:36 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] All migration tasks have been completed, queues are empty.
-> messages 400000/1000000 = 4496ms (22241,992188 msg/sec)
<- messages 400000/1000000 = 4500ms (22222,222656 msg/sec)
-> messages 500000/1000000 = 16450ms (6079,027344 msg/sec)
<- messages 500000/1000000 = 16452ms (6078,288574 msg/sec)
-> messages 600000/1000000 = 16766ms (5964,451660 msg/sec)
<- messages 600000/1000000 = 16763ms (5965,519531 msg/sec)
-> messages 700000/1000000 = 16696ms (5989,458496 msg/sec)
<- messages 700000/1000000 = 16694ms (5990,176270 msg/sec)
Process finished with exit code 137
On Wednesday, 24 September 2014 11:19:13 UTC+2, Luca Garulli wrote:
With 1 node only, performance is not so bad:
63k msg/sec (full output below). What's weird is the coupling between producer and consumer. I was expecting the producer to finish its loop much earlier than the consumer, but it seems the queue size is limited, so the producer is blocking. Inspecting the code, the queue's default capacity is Integer.MAX_VALUE, so that isn't the problem. But the operation is executed using Java Futures, so it goes through the implicit Executor queue. This is the relevant piece of code:
BasicInvocation.doInvokeLocal() {
  ...
  if (operationService.scheduler.isAllowedToRunInCurrentThread(op)) { // <-----
    operationService.runOperationOnCallingThread(op);
  } else {
    operationService.executeOperation(op);
  }
}
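(A minimal stdlib analogue of those two branches, names hypothetical and not Hazelcast code: when the operation runs on the calling thread, the producer is synchronously coupled to the work, which would explain the lock-step producer/consumer output above; only the executor branch decouples the producer.)

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Hazelcast code: contrasts the two branches of
// doInvokeLocal(). Running the operation on the calling thread couples the
// producer to the operation's work; handing it to an executor decouples the
// producer until the executor's queue applies backpressure.
public class CallingThreadSketch {

    // Stands in for the real operation (e.g. enqueueing one message).
    static void operation(AtomicLong counter) {
        counter.incrementAndGet();
    }

    // Like "runOperationOnCallingThread": the producer does the work
    // itself, so its loop cannot finish before the work does.
    static long runOnCallingThread(int n) {
        AtomicLong counter = new AtomicLong();
        for (int i = 0; i < n; i++) {
            operation(counter);
        }
        return counter.get();
    }

    // Like "executeOperation": work is queued on an executor, so the
    // producer's loop returns as fast as the queue absorbs tasks.
    static long runOnExecutor(int n) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        for (int i = 0; i < n; i++) {
            pool.execute(() -> operation(counter));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnCallingThread(100_000));
        System.out.println(runOnExecutor(100_000));
    }
}
```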
In the next email, the results of the same benchmark, but with 2 nodes.
Benchmark Output:
set 24, 2014 11:03:10 AM com.hazelcast.config.XmlConfigLocator
Loading 'hazelcast-default.xml' from classpath.
set 24, 2014 11:03:10 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Prefer IPv4 stack is true.
set 24, 2014 11:03:10 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Picked Address[10.3.15.55]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
set 24, 2014 11:03:10 AM com.hazelcast.spi.impl.BasicOperationScheduler
[10.3.15.55]:5701 [dev] [3.3] Starting with 16 generic operation threads and 16 partition operation threads.
set 24, 2014 11:03:10 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Hazelcast 3.3 (20140907) starting at Address[10.3.15.55]:5701
set 24, 2014 11:03:10 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Copyright (C) 2008-2014 Hazelcast.com
set 24, 2014 11:03:10 AM com.hazelcast.instance.Node
[10.3.15.55]:5701 [dev] [3.3] Creating MulticastJoiner
set 24, 2014 11:03:10 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTING
set 24, 2014 11:03:18 AM com.hazelcast.cluster.MulticastJoiner
[10.3.15.55]:5701 [dev] [3.3]
Members [1] {
Member [10.3.15.55]:5701 this
}
set 24, 2014 11:03:18 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTED
0 Start sending msgs
0 Start receiving msgs
set 24, 2014 11:03:18 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] Initializing cluster partition table first arrangement...
-> messages 100000/1000000 = 2637ms (37921,882813 msg/sec)
<- messages 100000/1000000 = 2637ms (37921,882813 msg/sec)
-> messages 200000/1000000 = 1768ms (56561,085938 msg/sec)
<- messages 200000/1000000 = 1768ms (56561,085938 msg/sec)
-> messages 300000/1000000 = 1628ms (61425,062500 msg/sec)
<- messages 300000/1000000 = 1628ms (61425,062500 msg/sec)
-> messages 400000/1000000 = 1597ms (62617,406250 msg/sec)
<- messages 400000/1000000 = 1597ms (62617,406250 msg/sec)
-> messages 500000/1000000 = 1597ms (62617,406250 msg/sec)
<- messages 500000/1000000 = 1597ms (62617,406250 msg/sec)
-> messages 600000/1000000 = 1576ms (63451,777344 msg/sec)
<- messages 600000/1000000 = 1576ms (63451,777344 msg/sec)
-> messages 700000/1000000 = 1559ms (64143,683594 msg/sec)
<- messages 700000/1000000 = 1559ms (64143,683594 msg/sec)
-> messages 800000/1000000 = 1578ms (63371,355469 msg/sec)
<- messages 800000/1000000 = 1577ms (63411,539063 msg/sec)
-> messages 900000/1000000 = 1575ms (63492,062500 msg/sec)
<- messages 900000/1000000 = 1576ms (63451,777344 msg/sec)
-> messages 1000000/1000000 = 1578ms (63371,355469 msg/sec)
<- messages 1000000/1000000 = 1578ms (63371,355469 msg/sec)
17097 Finished sending msgs
17100 Test finished
Lvc@