Hazelcast aggregation performance

373 views

Minh Trần

Jan 31, 2018, 11:00:00 PM
to Hazelcast

Hi Hazelcast team,



I ran a test with aggregations on Hazelcast 3.9.2, and the performance was not great:


Cache size: 100,000 entries (4.18 KB/entry)

Average query time: 156 ms

After some customization of the model, I got it down to 91 ms.

My PC: 4 cores, 8 GB RAM (7 GB for heap)


Can anyone suggest a way to improve the performance of this benchmark?



The cache used is a map with the following configuration:


       
    <properties>
        <property name="hazelcast.jmx">true</property>
        <property name="hazelcast.query.predicate.parallel.evaluation">true</property>
        <property name="hazelcast.aggregation.accumulation.parallel.evaluation">true</property>
    </properties>

    <map name="tasks">
        <in-memory-format>BINARY</in-memory-format>
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <time-to-live-seconds>0</time-to-live-seconds>
        <max-idle-seconds>0</max-idle-seconds>
        <eviction-policy>NONE</eviction-policy>
        <max-size policy="PER_NODE">0</max-size>
        <eviction-percentage>25</eviction-percentage>
        <min-eviction-check-millis>100</min-eviction-check-millis>
        <merge-policy>com.hazelcast.map.merge.PutIfAbsentMapMergePolicy</merge-policy>
        <cache-deserialized-values>INDEX-ONLY</cache-deserialized-values>

        <indexes>
            <index ordered="false">priority</index>
            <index ordered="false">username</index>
            <index ordered="false">userGroup</index>
            <index ordered="false">isPending</index>
            <index ordered="false">taskDocuments[any].userId</index>
            <index ordered="false">taskDocuments[any].documentType</index>
        </indexes>
    </map>


My aggregator implementation is as follows:



import java.util.Map.Entry;

import com.hazelcast.aggregation.Aggregator;

public class TaskAggregator
        extends Aggregator<Entry<String, TaskEntity>, TaskEntity> {

    private static final long serialVersionUID = -2586341529070334236L;

    /** Best task seen so far: highest priority, ties broken by oldest createdTime. */
    private TaskEntity result;

    private final TaskInfo taskInfo;

    public TaskAggregator(final TaskInfo taskInfo) {
        this.taskInfo = taskInfo;
    }

    @Override
    public void accumulate(final Entry<String, TaskEntity> input) {
        final TaskEntity taskEntity = input.getValue();

        // Skip tasks that are still pending.
        if (taskEntity.getIsPending()) {
            return;
        }

        // If the task is assigned to a user, it must be the requesting user.
        final String username = taskEntity.getUsername();
        if ((username != null)
                && !username.equals(this.taskInfo.getUsername())) {
            return;
        }

        // The task's group must be one of the requesting user's groups.
        if (this.taskInfo.getUserGroups().stream().noneMatch(
            userGroup -> userGroup.equals(taskEntity.getUserGroup()))) {
            return;
        }

        // Skip tasks with a document owned by the user or of another type.
        if (taskEntity.getTaskDocuments().stream().anyMatch(
            taskDocument -> taskDocument.getUserId()
                .equals(this.taskInfo.getUsername())
                    || !taskDocument.getDocumentType()
                        .equals(this.taskInfo.getDocType()))) {
            return;
        }

        keepBest(taskEntity);
    }

    @Override
    public TaskEntity aggregate() {
        return this.result;
    }

    @Override
    public void combine(
        @SuppressWarnings("rawtypes") final Aggregator aggregator) {
        final TaskEntity taskEntity = ((TaskAggregator) aggregator).result;
        // The other partition's aggregator may not have produced a result.
        if (taskEntity != null) {
            keepBest(taskEntity);
        }
    }

    /** Keep the candidate if it beats the current result. */
    private void keepBest(final TaskEntity taskEntity) {
        if ((this.result == null)
                || (this.result.getPriority() < taskEntity.getPriority())
                || ((this.result.getPriority() == taskEntity.getPriority())
                        && (this.result.getCreatedTime() > taskEntity
                            .getCreatedTime()))) {
            this.result = taskEntity;
        }
    }

    public TaskEntity getResult() {
        return this.result;
    }

}
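The same "best task" rule appears in both accumulate() and combine(). As a standalone sketch of just that rule (Task here is a hypothetical stand-in for TaskEntity, reduced to the two fields the rule uses):

```java
// Sketch of the selection rule: keep the task with the highest priority;
// on equal priority, prefer the earliest createdTime.
public class BestTaskPicker {

    static final class Task {
        final int priority;
        final long createdTime;

        Task(final int priority, final long createdTime) {
            this.priority = priority;
            this.createdTime = createdTime;
        }
    }

    // Mirrors the condition used in accumulate() and combine().
    // A null candidate (e.g. an empty partition) never replaces the current best.
    static Task pick(final Task current, final Task candidate) {
        if (candidate == null) {
            return current;
        }
        if (current == null
                || candidate.priority > current.priority
                || (candidate.priority == current.priority
                        && candidate.createdTime < current.createdTime)) {
            return candidate;
        }
        return current;
    }

    public static void main(String[] args) {
        Task best = null;
        for (Task t : new Task[] {new Task(1, 100), new Task(2, 200), new Task(2, 50)}) {
            best = pick(best, t);
        }
        // Highest priority (2) wins; the tie is broken by the earlier createdTime (50).
        System.out.println(best.priority + " " + best.createdTime); // prints "2 50"
    }
}
```

Extracting this into a single private helper also removes the duplicated condition between accumulate() and combine().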



 

Thanks so much


noctarius

Feb 1, 2018, 1:03:49 AM
to haze...@googlegroups.com
How many nodes did you try? Hazelcast is not designed for single-node operation.

--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hazelcast+unsubscribe@googlegroups.com.
To post to this group, send email to haze...@googlegroups.com.
Visit this group at https://groups.google.com/group/hazelcast.
To view this discussion on the web visit https://groups.google.com/d/msgid/hazelcast/3af73fa8-5e2b-4eb7-a827-bdc71b5c09ae%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Minh Trần

Feb 1, 2018, 3:41:53 AM
to Hazelcast
I really did not know about this, Christoph. I tried with one node only. I assumed a single node must be faster than multiple nodes, because it avoids the cost of network latency and of merging results from multiple nodes.

Do you have any suggestions for me?



Noctarius

Feb 1, 2018, 4:03:05 AM
to haze...@googlegroups.com
Hey Minh,

Not in the case of Hazelcast. Data is split into partitions, but the number of partitions computed concurrently on a single node is limited, so multiple nodes can compute more partitions in parallel. Obviously, for small amounts of data the network overhead might outweigh the benefits, but you should always use at least 3 nodes (ideally on separate physical hardware).

Chris
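As a sketch of how such a multi-node test cluster could be wired up (the member addresses are placeholders): Hazelcast members discover each other via multicast by default, or can be listed explicitly via TCP/IP in the network section of hazelcast.xml:

```xml
<network>
    <join>
        <!-- Disable multicast and list the members explicitly. -->
        <multicast enabled="false"/>
        <tcp-ip enabled="true">
            <member>192.168.1.10</member>
            <member>192.168.1.11</member>
            <member>192.168.1.12</member>
        </tcp-ip>
    </join>
</network>
```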

Peter Veentjer

Feb 1, 2018, 4:24:27 AM
to haze...@googlegroups.com
Can you run with a profiler and see where the time is being spent?

Probably a huge amount of time is spent in deserialization, since you are using the BINARY in-memory format.

Chris
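If profiling does confirm that deserialization dominates, one option to try (a sketch, with the trade-off of higher heap usage per entry) is storing values in deserialized form so aggregations skip per-entry deserialization:

```xml
<map name="tasks">
    <!-- Store values as Java objects instead of serialized binary;
         aggregations then read entries without deserializing each one. -->
    <in-memory-format>OBJECT</in-memory-format>
</map>
```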


Ozan Kılıç

Feb 1, 2018, 4:34:14 AM
to haze...@googlegroups.com
Minh, 

Can you try the following system properties and let us know the difference?

hazelcast.clientengine.query.thread.count=40 (you can increase this up to cores × 20)
hazelcast.index.copy.behavior=NEVER (see the documentation)

Chris
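These could be set in the same properties block as the existing ones, e.g.:

```xml
<properties>
    <!-- Number of query threads serving client queries. -->
    <property name="hazelcast.clientengine.query.thread.count">40</property>
    <!-- Skip defensive copies of indexed values (read-only index results). -->
    <property name="hazelcast.index.copy.behavior">NEVER</property>
</properties>
```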


--
Ozan Kılıç
Solutions Architect

Ozan Kılıç

Feb 1, 2018, 4:52:21 AM
to haze...@googlegroups.com
Minh, 

If you don't use a Hazelcast client for queries, then ignore hazelcast.clientengine.query.thread.count.
You can also tune the query executor in your Hazelcast config. The default pool size is 16, so try adjusting it:

<executor-service name="hz:query">
    <statistics-enabled>true</statistics-enabled>
    <pool-size>16</pool-size>
    <queue-capacity>2147483647</queue-capacity>
</executor-service>

