How to iterate over an AsyncMap in Vert.x 3?


Levin So

Jun 1, 2015, 5:46:06 AM
to ve...@googlegroups.com
Hi,

is it possible to expose a method such as 'getKeySet()' on AsyncMap for iteration?

thanks

levin

bytor99999

Jun 1, 2015, 12:34:57 PM
to ve...@googlegroups.com
I don't know the answer. But inside my head I am thinking: if the map gets large, and there are mutations happening while looping through it, that might cause issues. You could get a lock, but even that has its issues. I would recommend using something else; what, I don't know. Just some things to consider.
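Mark's concern is easy to demonstrate locally with plain java.util maps (a single-JVM sketch, not Vert.x or Hazelcast code): mutating a HashMap while iterating it throws ConcurrentModificationException, while ConcurrentHashMap's weakly consistent iterator carries on but may or may not observe the concurrent write. A distributed map only amplifies this ambiguity.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IterationMutation {

    // Returns true if mutating mid-iteration threw ConcurrentModificationException.
    public static boolean mutatingHashMapFails() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (String key : map.keySet()) {
                map.put("c", 3); // structural modification during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // ConcurrentHashMap's iterator is weakly consistent: no exception is
    // thrown, but the new key may or may not be seen during this pass.
    public static boolean mutatingConcurrentMapFails() {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (String key : map.keySet()) {
                map.put("c", 3);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap threw: " + mutatingHashMapFails());               // true
        System.out.println("ConcurrentHashMap threw: " + mutatingConcurrentMapFails()); // false
    }
}
```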

Mark

Levin So

Jun 5, 2015, 12:09:07 AM
to ve...@googlegroups.com
Hazelcast provides a keySet() method on Map and MultiMap, and an iterator() method on List and Set.
Could Vert.x simply bridge these methods for us?

Jordan Halterman

Jun 5, 2015, 12:42:27 AM
to ve...@googlegroups.com
It's possible, but it wouldn't be in the form of an actual key set or iterator. AFAIK when Hazelcast returns a key set, it's actually returning a set with pointers (sort of) to data that may or may not be on that node. When you access the key set or iterator, Hazelcast makes a blocking call to fetch the key or value just as it does with any data structure. So Vert.x would have to return an asynchronous set that accounts for Hazelcast's blocking calls for set membership and iterators.
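To illustrate the shape such an asynchronous set could take, here is a minimal single-JVM sketch (all names are made up, not Vert.x API; a local ConcurrentHashMap-backed set stands in for Hazelcast's remote data): the potentially blocking membership lookup runs on a worker executor, and the result is delivered through a future rather than returned inline.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncSet<T> {
    private final Set<T> backing;   // stand-in for a remote Hazelcast set
    private final Executor worker;  // where blocking lookups run

    public AsyncSet(Set<T> backing, Executor worker) {
        this.backing = backing;
        this.worker = worker;
    }

    // Membership is answered via a future instead of a blocking boolean,
    // so the caller's (event loop) thread is never blocked.
    public CompletableFuture<Boolean> contains(T element) {
        return CompletableFuture.supplyAsync(() -> backing.contains(element), worker);
    }

    public static void main(String[] args) {
        Set<String> data = ConcurrentHashMap.newKeySet();
        data.add("k1");
        ExecutorService worker = Executors.newSingleThreadExecutor();
        AsyncSet<String> set = new AsyncSet<>(data, worker);
        System.out.println(set.contains("k1").join()); // true
        System.out.println(set.contains("k2").join()); // false
        worker.shutdown();
    }
}
```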

Also, Mark is right that there are potential consistency issues, and there are also scalability issues. Hazelcast distributes many keys across many nodes, so aggregating them all on one node is neither feasible nor performant.

But none of that is to say it's a bad idea, just pointing out that it's more complex than fetching a set of keys. There seem to be a lot of map and collection methods that remain to be exposed by Vert.x, and map keys and iterators are legitimate candidates.
--
You received this message because you are subscribed to the Google Groups "vert.x" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vertx+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ian Andrews

Jun 5, 2015, 11:38:21 AM
to ve...@googlegroups.com
The way I have gotten around this problem in the past has been to keep track of the keys myself using another map.

For example, if I had a map "my_data" that I was using to store data in the cluster, I would also create a map called "my_data_keys".  I would only use a single entry in "my_data_keys", and in that entry I would put a JsonArray that contained the keys that I was using in "my_data".

Now, this works, but I do not think I would actually recommend using it in production without some extensive testing.  I only used this pattern to avoid using an actual database during development.  That said, I haven't experienced any problems so far with the performance or consistency of the data.
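A local, single-JVM sketch of this pattern, with plain java.util maps standing in for the two cluster-wide maps and a List standing in for the JsonArray of keys (a faithful version would need a running Vert.x cluster):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TrackedMap {
    private static final String KEYS_ENTRY = "keys";

    // "my_data": the actual data; "my_data_keys": a single entry listing the keys.
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private final Map<String, List<String>> keyIndex = new ConcurrentHashMap<>();

    public void put(String key, String value) {
        String previous = data.put(key, value);
        if (previous == null) {
            // Second, separate write: the window where a failure would leave
            // the key index out of sync with the data map.
            keyIndex.computeIfAbsent(KEYS_ENTRY, k -> new ArrayList<>()).add(key);
        }
    }

    // Iterate via the tracked key list rather than the data map itself,
    // verifying each key still resolves before using it.
    public List<String> values() {
        List<String> out = new ArrayList<>();
        for (String key : keyIndex.getOrDefault(KEYS_ENTRY, List.of())) {
            String v = data.get(key);
            if (v != null) out.add(v);
        }
        return out;
    }
}
```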

Jordan Halterman

Jun 5, 2015, 9:22:24 PM
to ve...@googlegroups.com
The only risks with this are a race condition where the map has n keys but the key set has n - 1 keys before the two are synchronized, or a failure between the two operations that leaves the map and the key set inconsistent. But whether that matters depends on your application. If you use the separate key set as the source of truth for the available keys in the map, and verify that keys still exist when accessing the map, then you shouldn't have any issues, practically speaking.
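One way to make "key set as source of truth" hold up in practice is to order the two writes so that a failure in between leaves at worst an unreferenced value, never a published key without a value. A local sketch of that ordering, with plain java.util structures standing in for the cluster maps (names are illustrative only):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class OrderedWrites {
    public final Map<String, String> data = new ConcurrentHashMap<>();
    public final Set<String> keys = ConcurrentHashMap.newKeySet();

    // Insert: write the value BEFORE publishing the key. A crash between the
    // two steps leaves an unreferenced value, not a dangling key.
    public void insert(String k, String v) {
        data.put(k, v);
        keys.add(k);
    }

    // Delete: retract the key BEFORE deleting the value, for the same reason.
    public void delete(String k) {
        keys.remove(k);
        data.remove(k);
    }

    // Readers trust the key set; every published key should resolve.
    public boolean consistentForReaders() {
        for (String k : keys) {
            if (!data.containsKey(k)) return false;
        }
        return true;
    }
}
```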

Levin So

Jun 7, 2015, 11:34:14 PM
to ve...@googlegroups.com
Thanks Jordan and Ian for pointing out the technical difficulties in a clustered environment, and the workaround.
In my case, I'm not concerned about 100% consistency, so I will try Ian's approach until Vert.x can provide this.

Cosmic Interloper

Nov 19, 2015, 6:14:12 PM
to ve...@googlegroups.com
Is there a solution that avoids race conditions, where you could make the map immutable (using a shared lock) and then safely get the whole key set?

I think some use cases would be satisfied by being able to 'snapshot' maps - or freeze them, then get the key set and iterate. 

jordan.h...@gmail.com

Nov 19, 2015, 6:54:45 PM
to ve...@googlegroups.com
You can do that at the application level. Nothing wrong with that. You could use that approach of storing keys in a separate Set or something. But if you want to prevent the key set from being modified during iteration, that implies all writers have to acquire locks when writing to the map, and locks can be expensive (though maybe not so much in HA systems like Hazelcast).

Maybe it would be cool to add a `forEach(...)`-like operation to asynchronous Sets and support keySet for maps. It would probably require iterating the Hazelcast key set (which requests keys in batches) in a background thread. I think the other cluster managers should be able to support this, but handling consistency issues for iteration might still have to be left to the application. I don't think we want to require expensive locking just to get a little consistency when iterating keys, which should be a rare operation.
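As a sketch of what such a forEach(...) might look like under the hood - collect the keys in fixed-size batches, look each one back up, skip keys deleted mid-iteration - here is a single-JVM approximation using CompletableFuture for the background work (all names hypothetical, not a proposed Vert.x API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

public class AsyncForEach {

    // Walk the (possibly blocking) key set on a background thread in batches,
    // invoking the consumer per live entry; the future completes with the
    // number of entries actually visited.
    public static <K, V> CompletableFuture<Integer> forEach(
            Map<K, V> map, int batchSize, BiConsumer<K, V> consumer) {
        return CompletableFuture.supplyAsync(() -> {
            List<K> batch = new ArrayList<>(batchSize);
            int visited = 0;
            for (K key : map.keySet()) {            // stands in for batched key retrieval
                batch.add(key);
                if (batch.size() == batchSize) {
                    visited += drain(map, batch, consumer);
                }
            }
            visited += drain(map, batch, consumer); // final partial batch
            return visited;
        });
    }

    // Look each batched key back up; keys deleted mid-iteration are skipped,
    // matching the "verify keys still exist" advice earlier in the thread.
    private static <K, V> int drain(Map<K, V> map, List<K> batch, BiConsumer<K, V> consumer) {
        int n = 0;
        for (K key : batch) {
            V value = map.get(key);
            if (value != null) {
                consumer.accept(key, value);
                n++;
            }
        }
        batch.clear();
        return n;
    }
}
```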

Also, you mentioned "snapshotting" maps. This is sort of how a lot of databases work - snapshot isolation. Basically, the idea would be that once a process starts iterating over the keys in a map, those keys and their values will not change, but other processes can still update keys. Updates from other processes aren't seen by the process iterating the keys, since it's operating on a snapshot. That's an awesome idea, but it may not be practical for some cluster managers. Also, using snapshot isolation in an asynchronous framework could result in some odd behavior if, for instance, the process updates a key while iterating the snapshot. Synchronous iterators have the benefit of context, which is a bit more difficult to manage in an asynchronous environment. I think things like snapshots just have to be left to the details of specific cluster managers.
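The snapshot idea can at least be illustrated locally: copy the map once, iterate the copy, and let writers keep mutating the live map. The copy cost is also the catch - a full copy of a large distributed map is exactly the aggregation problem raised earlier in the thread. A minimal single-JVM sketch:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SnapshotIteration {

    // Point-in-time copy: iteration over the returned map is isolated from
    // later writes to the live map (but costs a full copy up front).
    public static <K, V> Map<K, V> snapshot(Map<K, V> live) {
        return new HashMap<>(live);
    }

    public static void main(String[] args) {
        Map<String, Integer> live = new ConcurrentHashMap<>();
        live.put("a", 1);
        live.put("b", 2);

        Map<String, Integer> snap = snapshot(live);

        live.put("c", 3);   // concurrent insert...
        live.put("a", 99);  // ...and an overwrite: neither is visible below

        System.out.println(snap.get("a"));         // 1
        System.out.println(snap.containsKey("c")); // false
        System.out.println(live.get("a"));         // 99
    }
}
```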

Jordan Halterman

Cosmic Interloper

Nov 30, 2015, 5:45:16 AM
to ve...@googlegroups.com
Here is my proof-of-concept implementation, keeping track of key sets with CRDTs synchronized over the event bus (Kryo serialization).

I just found this CRDT implementation lying around on GitHub (credit reference in the txt file).

I make no guarantees that this is atomic, transactional, good, safe, tested or friendly (yet).

Just an alternate idea.
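For readers unfamiliar with CRDTs, the simplest one that fits this track-the-keys use case is a grow-only set (G-Set): merge is a set union, which is commutative, associative and idempotent, so replicas converge regardless of how often or in what order their states arrive over the event bus. A minimal sketch (not the poster's actual implementation; supporting deletion would need a richer CRDT such as an OR-Set):

```java
import java.util.HashSet;
import java.util.Set;

public class GSet<T> {
    private final Set<T> elements = new HashSet<>();

    // Local operation: elements can only ever be added, never removed.
    public void add(T element) {
        elements.add(element);
    }

    // Merge a peer's state: set union. Safe to apply repeatedly and in any
    // order, which is what makes eventual convergence automatic.
    public void merge(GSet<T> other) {
        elements.addAll(other.elements);
    }

    public boolean contains(T element) {
        return elements.contains(element);
    }

    public Set<T> value() {
        return Set.copyOf(elements);
    }
}
```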


