Hi -
1. What does your index creation look like? I have a map with a trivial "HotelData" class that contains an array of strings called "hotels" and a timestamp. In v4.2, my index config includes 'indexConfig.addAttribute("hotels[any]");' and I use that to add a HASH index to the map. I populate the map with some dummy data, and 'l.info("keys: {}", foo.keySet(Predicates.sql("hotels[any] = h1")));' returns the correct result, as expected.
In response to your other questions, each of the predicates is optimized separately - and this is an interesting point - so multiple indexes are used. My data isn't the same as yours, but I have two index expressions:
sorted - "timestamp" and
hash - "hotels[any]"
This gives me two indexes, for a map "foo" - "foo_sorted_timestamp" and "foo_hash_hotels[any]". The names are only important if you want to dissect the local map stats.
I created three predicates:
- Predicate hotelP = Predicates.sql("hotels[any] in (h1, h2)");
- Predicate timestampP = Predicates.sql("timestamp >= 1626455173965 and timestamp < 1626455213965");
- Predicate andP = Predicates.and(hotelP, timestampP);
The third one is what I think is most similar to what you're looking for - find some hotels within a range of timestamp data.
I queried each predicate separately and printed out some local index stats in between.
foo.getLocalMapStats().getLocalIndexStats().get("foo_sorted_timestamp").getQueryCount() and
foo.getLocalMapStats().getLocalIndexStats().get("foo_hash_hotels[any]").getQueryCount()
So, the query counts each ended up at '1' after the two non-composite queries were run, and both were '2' after the composite 'and' predicate was queried, exactly as expected. This shows that both indexes were used to evaluate the logical 'and' query.
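To make the counting argument concrete, here is a minimal plain-Java sketch of the idea. It is a toy model, not Hazelcast code and not how Hazelcast implements its indexes: the class name, method names, and sample data are all hypothetical. A HashMap stands in for the hash index on "hotels[any]", a TreeMap stands in for the sorted index on "timestamp", and each lookup bumps a per-index counter the way getQueryCount() did in my test.

```java
import java.util.*;

// Toy model only: NOT Hazelcast code. All names here are hypothetical.
class IndexIntersectionSketch {
    // Stand-in for the hash index on hotels[any]: hotel code -> keys of entries containing it.
    final Map<String, Set<Integer>> hashIndex = new HashMap<>();
    // Stand-in for the sorted index on timestamp: timestamp -> keys of entries with that value.
    final TreeMap<Long, Set<Integer>> sortedIndex = new TreeMap<>();
    int hashQueryCount = 0;
    int sortedQueryCount = 0;

    // Adds one entry to both indexes; every element of the hotels array is indexed.
    void put(int key, long timestamp, String... hotels) {
        for (String h : hotels) {
            hashIndex.computeIfAbsent(h, k -> new HashSet<>()).add(key);
        }
        sortedIndex.computeIfAbsent(timestamp, k -> new HashSet<>()).add(key);
    }

    // Models: hotels[any] in (...)
    Set<Integer> hotelsIn(String... hotels) {
        hashQueryCount++;
        Set<Integer> result = new TreeSet<>();
        for (String h : hotels) {
            result.addAll(hashIndex.getOrDefault(h, Collections.emptySet()));
        }
        return result;
    }

    // Models: timestamp >= from and timestamp < to
    Set<Integer> timestampBetween(long from, long to) {
        sortedQueryCount++;
        Set<Integer> result = new TreeSet<>();
        for (Set<Integer> keys : sortedIndex.subMap(from, true, to, false).values()) {
            result.addAll(keys);
        }
        return result;
    }

    // Models the composite 'and': each side uses its own index, then the key sets are intersected.
    Set<Integer> hotelsInAndTimestampBetween(long from, long to, String... hotels) {
        Set<Integer> result = hotelsIn(hotels);
        result.retainAll(timestampBetween(from, to));
        return result;
    }

    public static void main(String[] args) {
        IndexIntersectionSketch idx = new IndexIntersectionSketch();
        idx.put(1, 100L, "h1", "h3");
        idx.put(2, 200L, "h2");
        idx.put(3, 300L, "h1");
        idx.hotelsIn("h1", "h2");
        idx.timestampBetween(100L, 300L);
        idx.hotelsInAndTimestampBetween(100L, 300L, "h1", "h2");
        System.out.println("hash=" + idx.hashQueryCount + " sorted=" + idx.sortedQueryCount);
    }
}
```

Each single-attribute query touches only its own counter, while the composite query touches both, which is the same pattern I saw in the local index stats.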
In the composite, you're correct - all query logic is executed on the members, and only the keySet() is returned over the wire.
Efficiency is always a concern, so this was a really good question, I thought.
There's another nuance that may be helpful: partition predicates. These let you control which member evaluates the query. In the example above, there are a couple of data-dependent cases where this could be used. If the query had been for a single hotel (i.e. Predicates.sql("hotels[any] = h1")), or for multiple hotels that used partition-aware keys and were on the same member (because they're on the same partition), we could have used
Predicate awareP = Predicates.partitionPredicate("h1", andP);
This would have taken the prior composite predicate and allowed us to run that query only on the member where the data resides. There may well be a non-trivial benefit, in both CPU and network, to reducing the distributed query load in a busy system.
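As a rough illustration of why that helps, here is a plain-Java sketch of the routing idea. It is a toy model, not Hazelcast code: the class and method names are hypothetical, and hashCode() is only a stand-in for the real hashing of the serialized key. The one real number in it is 271, which is the default partition count.

```java
import java.util.*;

// Toy model only: NOT Hazelcast code. All names here are hypothetical.
class PartitionRoutingSketch {
    // 271 is the default Hazelcast partition count.
    static final int PARTITION_COUNT = 271;

    // Assumption: hashCode() stands in for the real hash of the serialized partition key.
    static int partitionFor(Object partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), PARTITION_COUNT);
    }

    // With no partition key, the query is evaluated against every partition;
    // with one, only the partition that owns that key is consulted.
    static List<Integer> partitionsToQuery(Object partitionKeyOrNull) {
        List<Integer> targets = new ArrayList<>();
        if (partitionKeyOrNull != null) {
            targets.add(partitionFor(partitionKeyOrNull));
        } else {
            for (int p = 0; p < PARTITION_COUNT; p++) {
                targets.add(p);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println("plain query touches " + partitionsToQuery(null).size() + " partitions");
        System.out.println("partition-scoped query touches " + partitionsToQuery("h1").size() + " partition");
    }
}
```

The catch is the same as in the real thing: scoping the query to one partition only gives correct results if every matching entry actually lives on that partition, which is why the keys have to be partition-aware.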
I want to mention that with the 5.0 release, streaming (Jet) and caching (IMDG) operations are now supported in a single binary. If this were a reservation system, we could combine streaming events (reservations, ...) and enrichment (hotel-code to hotel-name, for example), storage in an IMap, distributed queries, streaming queries, and streaming data out to external systems. This can be done in 3.x and 4.x as well, but only by combining the two products (which was pretty seamless, anyway).
Cheers
Tom