Hazelcast aggregation performance

373 views

Minh Trần

Jan 31, 2018, 11:00:00 PM
to Hazelcast

Hi Hazelcast team,



I ran a test with aggregations on Hazelcast 3.9.2, and the performance was not great:


Cache size: 100,000 entries (4.18 KB/entry)

Average query time: 156 ms

After some customization of the model, I got it down to 91 ms.

My PC: 4 cores, 8 GB RAM (7 GB for heap)


Can anyone suggest a way to improve the performance of this benchmark?



The cache used is a map with the following configuration:


       
    <properties>
        <property name="hazelcast.jmx">true</property>
        <property name="hazelcast.query.predicate.parallel.evaluation">true</property>
        <property name="hazelcast.aggregation.accumulation.parallel.evaluation">true</property>
    </properties>

    <map name="tasks">
        <in-memory-format>BINARY</in-memory-format>
        <backup-count>1</backup-count>
        <async-backup-count>0</async-backup-count>
        <time-to-live-seconds>0</time-to-live-seconds>
        <max-idle-seconds>0</max-idle-seconds>
        <eviction-policy>NONE</eviction-policy>
        <max-size policy="PER_NODE">0</max-size>
        <eviction-percentage>25</eviction-percentage>
        <min-eviction-check-millis>100</min-eviction-check-millis>
        <merge-policy>com.hazelcast.map.merge.PutIfAbsentMapMergePolicy</merge-policy>
        <cache-deserialized-values>INDEX-ONLY</cache-deserialized-values>

        <indexes>
            <index ordered="false">priority</index>
            <index ordered="false">username</index>
            <index ordered="false">userGroup</index>
            <index ordered="false">isPending</index>
            <index ordered="false">taskDocuments[any].userId</index>
            <index ordered="false">taskDocuments[any].documentType</index>
        </indexes>
    </map>


My aggregator implementation is as follows:



import java.util.Map.Entry;

import com.hazelcast.aggregation.Aggregator;

public class TaskAggregator
        extends Aggregator<Entry<String, TaskEntity>, TaskEntity> {

    private static final long serialVersionUID = -2586341529070334236L;

    /** Best task seen so far: highest priority, ties broken by oldest createdTime. */
    private TaskEntity result;

    private final TaskInfo taskInfo;

    public TaskAggregator(final TaskInfo taskInfo) {
        this.taskInfo = taskInfo;
    }

    @Override
    public void accumulate(final Entry<String, TaskEntity> input) {
        final TaskEntity taskEntity = input.getValue();

        // Skip tasks that are still pending.
        if (taskEntity.getIsPending()) {
            return;
        }

        // If the task is assigned to a user, it must be the requesting user.
        final String username = taskEntity.getUsername();
        if ((username != null)
                && !username.equals(this.taskInfo.getUsername())) {
            return;
        }

        // The task's group must be one of the requesting user's groups.
        if (this.taskInfo.getUserGroups().stream().noneMatch(
            userGroup -> userGroup.equals(taskEntity.getUserGroup()))) {
            return;
        }

        // Skip tasks with a document owned by the user or of another type.
        if (taskEntity.getTaskDocuments().stream().anyMatch(
            taskDocument -> taskDocument.getUserId()
                .equals(this.taskInfo.getUsername())
                    || !taskDocument.getDocumentType()
                        .equals(this.taskInfo.getDocType()))) {
            return;
        }

        keepBest(taskEntity);
    }

    @Override
    public TaskEntity aggregate() {
        return this.result;
    }

    @Override
    public void combine(
        @SuppressWarnings("rawtypes") final Aggregator aggregator) {
        final TaskEntity taskEntity = ((TaskAggregator) aggregator).result;
        // The other partition's aggregator may not have produced a result.
        if (taskEntity != null) {
            keepBest(taskEntity);
        }
    }

    /** Keep the candidate if it beats the current result. */
    private void keepBest(final TaskEntity taskEntity) {
        if ((this.result == null)
                || (this.result.getPriority() < taskEntity.getPriority())
                || ((this.result.getPriority() == taskEntity.getPriority())
                        && (this.result.getCreatedTime() > taskEntity
                            .getCreatedTime()))) {
            this.result = taskEntity;
        }
    }

    public TaskEntity getResult() {
        return this.result;
    }

}
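The same "best task" rule appears in both accumulate() and combine(). As a standalone sketch of just that rule (Task here is a hypothetical stand-in for TaskEntity, reduced to the two fields the rule uses):

```java
// Sketch of the selection rule: keep the task with the highest priority;
// on equal priority, prefer the earliest createdTime.
public class BestTaskPicker {

    static final class Task {
        final int priority;
        final long createdTime;

        Task(final int priority, final long createdTime) {
            this.priority = priority;
            this.createdTime = createdTime;
        }
    }

    // Mirrors the condition used in accumulate() and combine().
    // A null candidate (e.g. an empty partition) never replaces the current best.
    static Task pick(final Task current, final Task candidate) {
        if (candidate == null) {
            return current;
        }
        if (current == null
                || candidate.priority > current.priority
                || (candidate.priority == current.priority
                        && candidate.createdTime < current.createdTime)) {
            return candidate;
        }
        return current;
    }

    public static void main(String[] args) {
        Task best = null;
        for (Task t : new Task[] {new Task(1, 100), new Task(2, 200), new Task(2, 50)}) {
            best = pick(best, t);
        }
        // Highest priority (2) wins; the tie is broken by the earlier createdTime (50).
        System.out.println(best.priority + " " + best.createdTime); // prints "2 50"
    }
}
```

Extracting this into a single private helper also removes the duplicated condition between accumulate() and combine().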



 

Thanks so much


noctarius

Feb 1, 2018, 1:03:49 AM
to haze...@googlegroups.com
How many nodes did you try? Hazelcast is not designed for single-node operation.

--
You received this message because you are subscribed to the Google Groups "Hazelcast" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hazelcast+unsubscribe@googlegroups.com.
To post to this group, send email to haze...@googlegroups.com.
Visit this group at https://groups.google.com/group/hazelcast.
To view this discussion on the web visit https://groups.google.com/d/msgid/hazelcast/3af73fa8-5e2b-4eb7-a827-bdc71b5c09ae%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Minh Trần

Feb 1, 2018, 3:41:53 AM
to Hazelcast
I really did not know about this, Christoph. I tried with one node only. I assumed a single node must be faster than multiple nodes, because it avoids the cost of network latency and of merging results from multiple nodes.

Do you have any suggestions for me?



Noctarius

Feb 1, 2018, 4:03:05 AM
to haze...@googlegroups.com
Hey Minh,

Not in the case of Hazelcast. Data is split into partitions, but the number of partitions computed concurrently on a single node is limited, so multiple nodes can compute more partitions in parallel. Obviously, for small amounts of data the network overhead might outweigh the benefits, but you should always use at least 3 nodes (ideally on separate physical hardware).

Chris
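As a sketch of how such a multi-node test cluster could be wired up (the member addresses are placeholders): Hazelcast members discover each other via multicast by default, or can be listed explicitly via TCP/IP in the network section of hazelcast.xml:

```xml
<network>
    <join>
        <!-- Disable multicast and list the members explicitly. -->
        <multicast enabled="false"/>
        <tcp-ip enabled="true">
            <member>192.168.1.10</member>
            <member>192.168.1.11</member>
            <member>192.168.1.12</member>
        </tcp-ip>
    </join>
</network>
```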

Peter Veentjer

Feb 1, 2018, 4:24:27 AM
to haze...@googlegroups.com
Can you run with a profiler and see where the time is being spent?

Probably a huge amount of time is spent in deserialization, since you are using the BINARY in-memory format.

Chris
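If profiling does confirm that deserialization dominates, one option to try (a sketch, with the trade-off of higher heap usage per entry) is storing values in deserialized form so aggregations skip per-entry deserialization:

```xml
<map name="tasks">
    <!-- Store values as Java objects instead of serialized binary;
         aggregations then read entries without deserializing each one. -->
    <in-memory-format>OBJECT</in-memory-format>
</map>
```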


Ozan Kılıç

Feb 1, 2018, 4:34:14 AM
to haze...@googlegroups.com
Minh, 

Can you try the following system properties and let us know the difference?

hazelcast.clientengine.query.thread.count=40 (you can increase this up to cores × 20)
hazelcast.index.copy.behavior=NEVER (see the documentation)

Chris
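These could be set in the same properties block as the existing ones, e.g.:

```xml
<properties>
    <!-- Number of query threads serving client queries. -->
    <property name="hazelcast.clientengine.query.thread.count">40</property>
    <!-- Skip defensive copies of indexed values (read-only index results). -->
    <property name="hazelcast.index.copy.behavior">NEVER</property>
</properties>
```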


--
Ozan Kılıç
Solutions Architect

Ozan Kılıç

Feb 1, 2018, 4:52:21 AM
to haze...@googlegroups.com
Minh, 

If you don't use a Hazelcast client for queries, then ignore hazelcast.clientengine.query.thread.count.
You can also tune the query executor in your Hazelcast config. The default pool size is 16, so try adjusting it:

<executor-service name="hz:query">
    <statistics-enabled>true</statistics-enabled>
    <pool-size>16</pool-size>
    <queue-capacity>2147483647</queue-capacity>
</executor-service>

