Hi,
I am using Hazelcast 4.0.2 with Oracle JDK 1.8.
My use-case is:
- I put a TreeSet of Data, ordered by the field score, into an IMap.
- The set can grow up to a million records.
- On every update of the data and score, I add to or remove from the TreeSet.
- What I want to achieve is a very low-latency read of the top X items from the TreeSet.
My data class:
@Getter
public class Data implements IdentifiedDataSerializable {
    private Long identifier;
    private Double score;

    public Data(Long identifier, Double score) {
        this.identifier = identifier;
        this.score = score;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Data)) {
            return false;
        }
        Data that = (Data) o;
        return identifier.equals(that.identifier);
    }

    @Override
    public int hashCode() {
        return Objects.hash(identifier);
    }

    @Override
    @JsonIgnore
    public int getFactoryId() {
        return SerializerConstants.SERIALIZER_FACTORY_ID_SCORED_LEAD;
    }

    @Override
    @JsonIgnore
    public int getClassId() {
        return SerializerConstants.SERIALIZER_CLASS_ID_SCORED_LEAD;
    }

    @Override
    public void writeData(ObjectDataOutput out) throws IOException {
        out.writeLong(identifier);
        out.writeDouble(score);
    }

    @Override
    public void readData(ObjectDataInput in) throws IOException {
        identifier = in.readLong();
        score = in.readDouble();
    }
}
Also, a comparator is implemented on the score.
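The comparator itself is not shown. A minimal sketch of what a score-based comparator for a TreeSet could look like is given here, assuming highest score first and ties broken by identifier. The ordering direction, the tie-break, and the names ScoreComparatorSketch, Item and BY_SCORE_DESC are all assumptions for illustration; Item is a simplified stand-in for Data with serialization omitted.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class ScoreComparatorSketch {

    // Simplified, hypothetical stand-in for the Data class (no Hazelcast serialization).
    static final class Item {
        final long identifier;
        final double score;

        Item(long identifier, double score) {
            this.identifier = identifier;
            this.score = score;
        }
    }

    // Assumed ordering: highest score first. Ties are broken by identifier,
    // because a TreeSet treats elements that compare as 0 as duplicates.
    static final Comparator<Item> BY_SCORE_DESC =
            Comparator.comparingDouble((Item i) -> i.score)
                    .reversed()
                    .thenComparingLong(i -> i.identifier);

    public static void main(String[] args) {
        TreeSet<Item> set = new TreeSet<>(BY_SCORE_DESC);
        set.add(new Item(1L, 0.5));
        set.add(new Item(2L, 0.9));
        set.add(new Item(3L, 0.9)); // same score as item 2, kept thanks to the tie-break

        for (Item item : set) {
            System.out.println(item.identifier + " -> " + item.score);
        }
    }
}
```

Without a tie-break, a comparator on score alone would make the TreeSet silently drop a second item with an equal score, even though equals() above is based on identifier.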
How I read data:
final IMap<String, TreeSet<Data>> imap = getImap("CACHE_NAME");
final TreeSet<Data> dataSet = imap.get(groupName);
if (CollectionUtils.isEmpty(dataSet)) {
    return null;
}
Set<Data> selected = dataSet.stream()
        .filter(val -> !excluding.contains(val.getIdentifier()))
        .limit(limit)
        .collect(Collectors.toSet());
...
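The filter-and-limit step can be reproduced without Hazelcast. The sketch below is a standalone version of that step only, assuming a descending-score comparator; TopXSketch, Item and topX are hypothetical names, and Item is a simplified stand-in for Data. Note that Collectors.toSet() makes no guarantee about iteration order, so this sketch collects into a LinkedHashSet to keep the TreeSet's ordering in the result.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class TopXSketch {

    // Simplified, hypothetical stand-in for the Data class (no Hazelcast serialization).
    static final class Item {
        final long identifier;
        final double score;

        Item(long identifier, double score) {
            this.identifier = identifier;
            this.score = score;
        }

        long getIdentifier() {
            return identifier;
        }
    }

    // Returns up to `limit` items in the TreeSet's own order,
    // skipping any item whose identifier is in `excluding`.
    static Set<Item> topX(TreeSet<Item> dataSet, Set<Long> excluding, int limit) {
        return dataSet.stream()
                .filter(val -> !excluding.contains(val.getIdentifier()))
                .limit(limit)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        // Assumed ordering: highest score first.
        TreeSet<Item> dataSet = new TreeSet<>(
                Comparator.comparingDouble((Item i) -> i.score).reversed());
        dataSet.add(new Item(1L, 0.1));
        dataSet.add(new Item(2L, 0.9));
        dataSet.add(new Item(3L, 0.5));
        dataSet.add(new Item(4L, 0.7));

        Set<Long> excluding = new HashSet<>(Arrays.asList(2L));
        for (Item item : topX(dataSet, excluding, 2)) {
            System.out.println(item.identifier + " -> " + item.score);
        }
    }
}
```

Because the stream is lazy and the TreeSet is already sorted, this step only visits as many elements as it needs to fill the limit once the set is in local memory; it does not address the cost of fetching the set from the IMap.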
Time to read the top 100 items from the TreeSet:
- If the TreeSet has 50,000 or fewer items, it takes 25 milliseconds.
- If the TreeSet has 100,000 or more items, it takes more than 200 milliseconds.
The issue I see is that on every call to final TreeSet<Data> dataSet = imap.get(groupName); Hazelcast deserializes the complete collection and applies the comparator again.
Is there any way to limit how much of the collection is serialized and deserialized on a read?
And is there a way to avoid applying the Comparator again when a TreeSet is read?
Thanks in advance...