Hazelcast Queue performance


Luca Garulli

Sep 23, 2014, 7:49:38 PM
to haze...@googlegroups.com
Hi guys,
I'm struggling to improve the performance of HZ Queues, with no success so far. With 1 producer + 1 consumer on separate JVMs, but on the same server, I cannot go faster than 1.5k messages per second. I found an old thread (2011) about this topic, and it reported similar values.

From a quick profiling session with YourKit, I see most of the time is spent in the KQueueArrayWrapper.kevent0() method.

Do you know any trick to speed up queues?

Thanks,
Lvc@

Luca Garulli

Sep 24, 2014, 5:12:02 AM
to haze...@googlegroups.com
I have more information about this. Here is a simple test case that measures the maximum Hazelcast queue throughput.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IQueue;

public class HazelcastQueueThroughputTest {
  private static final int TOTAL = 1000000;
  private static final int LAP   = 100000;

  public static void main(String[] args) throws InterruptedException {
    final HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    final IQueue<Object> queue = hz.getQueue("test");

    final long start = System.currentTimeMillis();
    long lastLap = start;

    // Consumer thread: takes TOTAL messages, printing throughput every LAP messages.
    Thread t = new Thread() {
      long lastLap = start;

      @Override
      public void run() {
        System.out.println((System.currentTimeMillis() - lastLap) + " Start receiving msgs");
        for (int i = 1; i < TOTAL + 1; ++i) {
          try {
            Object msg = queue.take();

            if (i % LAP == 0) {
              final long lapTime = System.currentTimeMillis() - lastLap;
              System.out.printf("<- messages %d/%d = %dms (%f msg/sec)\n", i, TOTAL, lapTime, ((float) LAP * 1000 / lapTime));
              lastLap = System.currentTimeMillis();
            }

          } catch (InterruptedException e) {
            e.printStackTrace();
          }
        }
      }
    };
    t.start();

    // Producer: offers TOTAL messages from the main thread.
    System.out.println((System.currentTimeMillis() - lastLap) + " Start sending msgs");
    for (int i = 1; i < TOTAL + 1; ++i) {
      queue.offer(i);

      if (i % LAP == 0) {
        final long lapTime = System.currentTimeMillis() - lastLap;
        System.out.printf("-> messages %d/%d = %dms (%f msg/sec)\n", i, TOTAL, lapTime, ((float) LAP * 1000 / lapTime));
        lastLap = System.currentTimeMillis();
      }
    }

    System.out.println((System.currentTimeMillis() - start) + " Finished sending msgs");

    t.join();

    System.out.println((System.currentTimeMillis() - start) + " Test finished");
  }
}

Peter Veentjer

Sep 24, 2014, 5:16:09 AM
to haze...@googlegroups.com
Can you create a GitHub ticket for it?

Then my team can create a Stabilizer test for it and do some inspection ourselves. We just added Intel VTune integration, so we should be able to see well beyond what a normal profiler sees.


Luca Garulli

Sep 24, 2014, 5:23:20 AM
to haze...@googlegroups.com
I'm back again :-)
Running the same benchmark as before, but on 2 JVMs resident on the same server, throughput drops from 63k msg/sec to 5.9k msg/sec: more than 10 times slower. Below is the full report (I stopped it after a while). You can see that as soon as the 2nd node joins, throughput drops immediately. I was expecting a slowdown, but not more than 10x on the same server.

How can we improve this?

Thanks,
Lvc@


set 24, 2014 11:04:20 AM com.hazelcast.config.XmlConfigLocator
Loading 'hazelcast-default.xml' from classpath.
set 24, 2014 11:04:20 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Prefer IPv4 stack is true.
set 24, 2014 11:04:20 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Picked Address[10.3.15.55]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
set 24, 2014 11:04:20 AM com.hazelcast.spi.impl.BasicOperationScheduler
[10.3.15.55]:5701 [dev] [3.3] Starting with 16 generic operation threads and 16 partition operation threads.
set 24, 2014 11:04:20 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Hazelcast 3.3 (20140907) starting at Address[10.3.15.55]:5701
set 24, 2014 11:04:20 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Copyright (C) 2008-2014 Hazelcast.com
set 24, 2014 11:04:20 AM com.hazelcast.instance.Node
[10.3.15.55]:5701 [dev] [3.3] Creating MulticastJoiner
set 24, 2014 11:04:20 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTING
set 24, 2014 11:04:28 AM com.hazelcast.cluster.MulticastJoiner
[10.3.15.55]:5701 [dev] [3.3] 


Members [1] {
Member [10.3.15.55]:5701 this
}

set 24, 2014 11:04:28 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTED
1 Start sending msgs
1 Start receiving msgs
set 24, 2014 11:04:28 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] Initializing cluster partition table first arrangement...
set 24, 2014 11:04:28 AM com.hazelcast.nio.tcp.SocketAcceptor
[10.3.15.55]:5701 [dev] [3.3] Accepting socket connection from /10.3.15.55:52568
set 24, 2014 11:04:28 AM com.hazelcast.nio.tcp.TcpIpConnectionManager
[10.3.15.55]:5701 [dev] [3.3] Established socket connection between /10.3.15.55:5701 and /10.3.15.55:52568
<- messages 100000/1000000 = 1926ms (51921,078125 msg/sec)
-> messages 100000/1000000 = 1926ms (51921,078125 msg/sec)
-> messages 200000/1000000 = 1563ms (63979,527344 msg/sec)
<- messages 200000/1000000 = 1563ms (63979,527344 msg/sec)
-> messages 300000/1000000 = 1555ms (64308,683594 msg/sec)
<- messages 300000/1000000 = 1555ms (64308,683594 msg/sec)
set 24, 2014 11:04:34 AM com.hazelcast.cluster.ClusterService
[10.3.15.55]:5701 [dev] [3.3] 

Members [2] {
Member [10.3.15.55]:5701 this
Member [10.3.15.55]:5702
}

set 24, 2014 11:04:34 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] Re-partitioning cluster data... Migration queue size: 135
set 24, 2014 11:04:36 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] All migration tasks have been completed, queues are empty.
-> messages 400000/1000000 = 4496ms (22241,992188 msg/sec)
<- messages 400000/1000000 = 4500ms (22222,222656 msg/sec)
-> messages 500000/1000000 = 16450ms (6079,027344 msg/sec)
<- messages 500000/1000000 = 16452ms (6078,288574 msg/sec)
-> messages 600000/1000000 = 16766ms (5964,451660 msg/sec)
<- messages 600000/1000000 = 16763ms (5965,519531 msg/sec)
-> messages 700000/1000000 = 16696ms (5989,458496 msg/sec)
<- messages 700000/1000000 = 16694ms (5990,176270 msg/sec)

Process finished with exit code 137
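A likely contributor to the drop once the 2nd member joins is the queue's synchronous backup: by default each queue operation is replicated to another member before completing. As an experiment only (a configuration sketch, not a production recommendation — with backupCount 0 the queue's data is lost if the owning member dies), the backup could be disabled programmatically and the benchmark re-run to see how much of the gap it accounts for:

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.QueueConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class NoBackupQueueConfig {
  public static void main(String[] args) {
    Config config = new Config();

    // Turn off both sync and async backups for the "test" queue used
    // by the benchmark; the queue then lives only on its owning member.
    QueueConfig queueConfig = config.getQueueConfig("test");
    queueConfig.setBackupCount(0);
    queueConfig.setAsyncBackupCount(0);

    // Start the member with this configuration instead of the defaults.
    HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
  }
}
```

This is a config fragment: whether it recovers the single-node throughput would need to be measured with the same benchmark.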






On Wednesday, 24 September 2014 11:19:13 UTC+2, Luca Garulli wrote:
With 1 node only, the performance is not so bad: 63k msg/sec (full output below). What's weird is the coupling between producer and consumer. I was expecting the producer to finish its loop much earlier than the consumer, but it seems the queue size is limited, so the producer is BLOCKING. Inspecting the code, the queue by default has Integer.MAX_VALUE as its capacity, so that isn't the problem. But the operation is executed using Java Futures, so it goes through the implicit Executor queue. This is the relevant piece of code:

BasicInvocation.doInvokeLocal() {
  ...
  if (operationService.scheduler.isAllowedToRunInCurrentThread(op)) { // <-----
    operationService.runOperationOnCallingThread(op);
  } else {
    operationService.executeOperation(op);
  }
}
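Independent of the invocation path above, the per-message take() overhead can be amortized on the consumer side by draining in batches: IQueue extends java.util.concurrent.BlockingQueue, so drainTo() is available. A sketch of the pattern, shown with a plain LinkedBlockingQueue so it runs standalone (the message count and batch size are illustrative; with Hazelcast, `queue` would come from `hz.getQueue("test")` and each drainTo would cost one round trip instead of up to 100):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchDrainExample {
  public static void main(String[] args) throws InterruptedException {
    // Stand-in for the distributed queue; IQueue is also a BlockingQueue.
    BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < 1000; i++) {
      queue.offer(i);
    }

    int received = 0;
    List<Integer> batch = new ArrayList<>(100);
    while (received < 1000) {
      // Block for the first element, then grab whatever else is ready,
      // up to 100 messages per round.
      batch.add(queue.take());
      queue.drainTo(batch, 99);
      received += batch.size();
      batch.clear();
    }
    System.out.println("received=" + received);
  }
}
```

The trade-off is latency: a message may wait until its batch is processed, which matters less in a throughput benchmark like this one.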

In the next email the results of the same benchmark, but with 2 nodes.

Benchmark Output:

set 24, 2014 11:03:10 AM com.hazelcast.config.XmlConfigLocator
Loading 'hazelcast-default.xml' from classpath.
set 24, 2014 11:03:10 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Prefer IPv4 stack is true.
set 24, 2014 11:03:10 AM com.hazelcast.instance.DefaultAddressPicker
[LOCAL] [dev] [3.3] Picked Address[10.3.15.55]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
set 24, 2014 11:03:10 AM com.hazelcast.spi.impl.BasicOperationScheduler
[10.3.15.55]:5701 [dev] [3.3] Starting with 16 generic operation threads and 16 partition operation threads.
set 24, 2014 11:03:10 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Hazelcast 3.3 (20140907) starting at Address[10.3.15.55]:5701
set 24, 2014 11:03:10 AM com.hazelcast.system
[10.3.15.55]:5701 [dev] [3.3] Copyright (C) 2008-2014 Hazelcast.com
set 24, 2014 11:03:10 AM com.hazelcast.instance.Node
[10.3.15.55]:5701 [dev] [3.3] Creating MulticastJoiner
set 24, 2014 11:03:10 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTING
set 24, 2014 11:03:18 AM com.hazelcast.cluster.MulticastJoiner
[10.3.15.55]:5701 [dev] [3.3] 


Members [1] {
  Member [10.3.15.55]:5701 this
}

set 24, 2014 11:03:18 AM com.hazelcast.core.LifecycleService
[10.3.15.55]:5701 [dev] [3.3] Address[10.3.15.55]:5701 is STARTED
0 Start sending msgs
0 Start receiving msgs
set 24, 2014 11:03:18 AM com.hazelcast.partition.InternalPartitionService
[10.3.15.55]:5701 [dev] [3.3] Initializing cluster partition table first arrangement...
-> messages 100000/1000000 = 2637ms (37921,882813 msg/sec)
<- messages 100000/1000000 = 2637ms (37921,882813 msg/sec)
-> messages 200000/1000000 = 1768ms (56561,085938 msg/sec)
<- messages 200000/1000000 = 1768ms (56561,085938 msg/sec)
-> messages 300000/1000000 = 1628ms (61425,062500 msg/sec)
<- messages 300000/1000000 = 1628ms (61425,062500 msg/sec)
-> messages 400000/1000000 = 1597ms (62617,406250 msg/sec)
<- messages 400000/1000000 = 1597ms (62617,406250 msg/sec)
-> messages 500000/1000000 = 1597ms (62617,406250 msg/sec)
<- messages 500000/1000000 = 1597ms (62617,406250 msg/sec)
-> messages 600000/1000000 = 1576ms (63451,777344 msg/sec)
<- messages 600000/1000000 = 1576ms (63451,777344 msg/sec)
-> messages 700000/1000000 = 1559ms (64143,683594 msg/sec)
<- messages 700000/1000000 = 1559ms (64143,683594 msg/sec)
-> messages 800000/1000000 = 1578ms (63371,355469 msg/sec)
<- messages 800000/1000000 = 1577ms (63411,539063 msg/sec)
-> messages 900000/1000000 = 1575ms (63492,062500 msg/sec)
<- messages 900000/1000000 = 1576ms (63451,777344 msg/sec)
-> messages 1000000/1000000 = 1578ms (63371,355469 msg/sec)
<- messages 1000000/1000000 = 1578ms (63371,355469 msg/sec)
17097 Finished sending msgs
17100 Test finished

Lvc@

Luca Garulli

Sep 24, 2014, 5:29:18 AM
to haze...@googlegroups.com