Hi,
On Tuesday, May 22, 2012 3:17:32 PM UTC+2, Fuad Malikov wrote:
Hi Leo,
You basically should analyze MultiMap and you can start by looking at classes:
MultiMapProxyImpl
ConcurrentMapManager
CMap
PutMultiOperationHandler
I started looking into the code to see what needs to be adapted and added. The initial analysis shows that the handling of MultiMaps, as currently implemented, is rather complex and spread across quite a few places (including Record, transactions, requests, etc.), touching a lot of rather sensitive code. In many places there is a hard-coded explicit check for whether the current data structure is a MultiMap, plus special logic to handle it. Moreover, the code assumes it knows how MultiMaps are represented internally (e.g. as sets or lists) and exploits that knowledge, rather than treating MultiMaps as black boxes and delegating all the logic to a small set of MultiMap-specific classes. IMHO, based on my still rather limited knowledge of the HZ codebase, the current implementation does not make it easy to add a new data structure if it differs enough from the existing ones in terms of API (e.g. each operation taking more than one key, etc.).
With this in mind, I see the following potential approaches:
1) Copy/paste all the MultiMap-related code and introduce corresponding NestedMap-related code. This would probably work, but would introduce a lot of very similar code.
In addition, I would have to add methods with a NestedMap suffix/prefix in very many places, as is currently done for MultiMaps, which would pollute the code even more. Many of those NestedMap-suffixed methods would take (key1, key2, value) parameters, analogous to the (key, value) parameters of the MultiMap methods (e.g. MProxy would have to be extended, and so on). So it would work, but it would not be nice, easily maintainable code, and it would require considerable effort. And should someone want to add yet another data structure in the future, the situation would get even worse.
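To make the duplication concrete, here is a minimal sketch of what approach 1 implies. All names here are invented for illustration (the real proxy classes have many more methods); the point is that every MultiMap-style (key, value) entry point would get a near-identical NestedMap twin taking (key1, key2, value):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: each MultiMap-style method gains a parallel NestedMap
// twin whose body differs only in the inner container it manipulates.
class ProxySketch {
    private final Map<Object, Set<Object>> multi = new HashMap<>();
    private final Map<Object, Map<Object, Object>> nested = new HashMap<>();

    // Existing MultiMap style: one key, value added to a set.
    boolean putMulti(Object key, Object value) {
        return multi.computeIfAbsent(key, k -> new HashSet<>()).add(value);
    }

    // NestedMap twin of putMulti: two keys, value stored in an inner map.
    boolean putNested(Object key1, Object key2, Object value) {
        return nested.computeIfAbsent(key1, k -> new HashMap<>())
                     .put(key2, value) == null;
    }
}
```

Multiply this pair by every operation (get, remove, contains, locking, transactions, ...) and the maintenance cost of approach 1 becomes apparent.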
2) An alternative would be to treat, in most places in the code, the last two elements of the (K1, K2, V) tuple, i.e. (K2, V), as a compound value: a black box whose semantics are unknown to the majority of the code. Such a compound value is treated almost as an atomic value and can, for example, be sent and received internally during get/put operations; in this sense it is no different from the way ordinary simple map values are treated by Hazelcast. Only the endpoints of the chain exchanging these atoms need to know how to interpret the value when inserting it into the map. That is, instead of inserting it in the usual way (as is done, e.g., for integers), the endpoint would split it into its K2 and V parts and insert them into the local value-map belonging to the key K1. When (K1, K2, V) needs to be sent for a put operation, the opposite happens: (K2, V) is combined into one new value NV, which is serialized, and an ordinary map update operation is sent for key K1 and value NV. The other side receives it in the usual way, and only when it is about to update the entry for K1 does it check that the target entry is not an ordinary value but a Map, so that NV needs to be split into its parts and processed accordingly.
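A minimal sketch of this compound-value idea, with all class and method names invented for illustration (they do not correspond to actual Hazelcast classes): the (K2, V) pair travels through the cluster as one opaque atom NV, and only the endpoint owning the entry for K1 splits it and merges it into the nested value-map:

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The opaque atom NV: (K2, V) combined into one serializable value that the
// rest of the cluster treats like any other map value.
class CompoundValue<K2, V> implements Serializable {
    final K2 innerKey;
    final V value;
    CompoundValue(K2 innerKey, V value) {
        this.innerKey = innerKey;
        this.value = value;
    }
}

// Hypothetical endpoint logic; stands in for the partition-local storage that
// would normally hold plain values keyed by K1.
class NestedMapEndpoint<K1, K2, V> {
    private final Map<K1, Map<K2, V>> store = new ConcurrentHashMap<>();

    // Sender side: combine (K2, V) into one atom and issue an ordinary put for K1.
    Object encode(K2 innerKey, V value) {
        return new CompoundValue<>(innerKey, value);
    }

    // Receiver side: only here is the atom inspected and split into its parts.
    @SuppressWarnings("unchecked")
    void applyPut(K1 outerKey, Object atom) {
        if (atom instanceof CompoundValue) {
            CompoundValue<K2, V> cv = (CompoundValue<K2, V>) atom;
            store.computeIfAbsent(outerKey, k -> new HashMap<>())
                 .put(cv.innerKey, cv.value);
        }
        // A plain (non-compound) value would be handled exactly as today.
    }

    V get(K1 outerKey, K2 innerKey) {
        Map<K2, V> inner = store.get(outerKey);
        return inner == null ? null : inner.get(innerKey);
    }
}
```

The appeal of this design is that everything between `encode` and `applyPut` (serialization, routing, the generic put machinery) stays untouched.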
I think this approach could cover other data structures as well; e.g. MultiMap can be expressed in its terms too. Moreover, it would allow for many further custom data structures that are delta-aware in the same way as MultiMap and NestedMap.
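For instance (a hypothetical sketch, not actual Hazelcast code), a MultiMap put of (k, element) can be expressed in the same (K1, K2, V) terms by treating the element itself as the inner key K2 and a mere presence marker as V:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical: MultiMap semantics expressed via the compound-value scheme.
// put(k, element) becomes put(k, (element, TRUE)) in (K1, K2, V) terms, so the
// same endpoint machinery can serve both MultiMap and NestedMap.
class MultiMapAsNested {
    private final Map<Object, Map<Object, Boolean>> store = new HashMap<>();

    // Returns true if the element was not yet present under the key.
    boolean put(Object key, Object element) {
        return store.computeIfAbsent(key, k -> new HashMap<>())
                    .put(element, Boolean.TRUE) == null;
    }

    Collection<Object> get(Object key) {
        Map<Object, Boolean> inner = store.get(key);
        return inner == null ? Collections.emptySet() : inner.keySet();
    }
}
```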
I hope I managed to explain my ideas in a more or less understandable way ;-) I'm very eager to hear your opinions on these issues before a lot of time is spent on either of the proposed solutions.
Thanks,
Leo