Hi Yu
09.12.2018, 04:44, "yuw...@gmail.com" <yuw...@gmail.com>:
> I'm Yu from Carnegie Mellon University, and I'm now working on a database wiki that introduces thousands of different database systems, including Elliptics.
> I ran into some questions while writing the Elliptics page from the information in your open-source documentation and code, so I would like to ask whether I could do a short interview with your developers about what I have not been able to find. I hope this won't bother you guys.
>
> Here are some of my questions:
>
> 1. What is the isolation level of Elliptics? Does Elliptics support repeatable read? Repeatable read means that when two clients work on the same object, and one reads it twice while the other happens to write it between the two reads, both reads return the same value. Will the two reads get the same value? And how do you implement the isolation model? Via transactions or something else?
If you do not hold a lock across the two reads, a writer can sneak in between them, and the reads will return the old and the new content respectively.
Elliptics holds the lock only for a single operation; otherwise a client could simply disappear in the middle while still holding it.
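To illustrate the point about per-operation locking, here is a minimal sketch in Python. It is a toy in-memory model, not the Elliptics API: the class `PerOperationStore` and the function `demo_non_repeatable_read` are names made up for this example.

```python
import threading


class PerOperationStore:
    """Toy in-memory store (hypothetical): the lock covers one operation only."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def write(self, key, value):
        with self._lock:  # released as soon as this single write completes
            self._data[key] = value

    def read(self, key):
        with self._lock:  # released as soon as this single read completes
            return self._data.get(key)


def demo_non_repeatable_read():
    store = PerOperationStore()
    store.write("obj", "old")
    first = store.read("obj")    # reader: first read
    store.write("obj", "new")    # writer sneaks in between the two reads
    second = store.read("obj")   # reader: second read
    return first, second
```

Since no lock is held between the two `read` calls, the reader gets the old content first and the new content second.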
There is another side to this question, which concerns reading from multiple replicas.
That is, one client reads from one replica, another client reads from a different replica, and a writer updates the object: will both clients receive the same content, either both old or both new?
In the case of Elliptics, it is possible that one reader receives the old content while the other reader already sees the new one.
This is because Elliptics updates replicas in parallel, without maintaining a single shared 'replica' log. This was done to make Elliptics the fastest on-disk storage among those you will be able to test.
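The replica behaviour described above can be sketched roughly as follows. This is a simulation under stated assumptions, not Elliptics code: `GatedReplica` is a hypothetical class, and its `gate` event only stands in for network or disk delay on one replica.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class GatedReplica:
    """Hypothetical replica whose writes can be held back to simulate delay."""

    def __init__(self, value):
        self.value = value
        self.gate = threading.Event()
        self.gate.set()  # open by default: writes apply immediately

    def write(self, value):
        self.gate.wait()  # blocks while the simulated delay is in effect
        self.value = value
        return True       # completion status for this replica

    def read(self):
        return self.value


def demo_divergent_reads():
    fast, slow = GatedReplica("old"), GatedReplica("old")
    slow.gate.clear()  # the write to this replica will be delayed
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The writer updates both replicas in parallel, with no shared log.
        fast_done = pool.submit(fast.write, "new")
        slow_done = pool.submit(slow.write, "new")
        fast_done.result()                   # fast replica has applied the write
        during = (fast.read(), slow.read())  # two readers, two replicas
        slow.gate.set()                      # let the delayed write finish
        slow_done.result()
    after = (fast.read(), slow.read())
    return during, after
```

While the write is still in flight, the reader on the fast replica sees the new content and the reader on the slow replica sees the old one; once both completion statuses are in, the replicas agree.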
In any case, the writer always receives a completion status for every replica. If you need an atomic transaction across physically (potentially geographically) distributed replicas, you can
implement a central entry point which holds the lock and updates it (and optionally rolls back the transaction) only after all completion statuses have been received.
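A central entry point of that kind might look roughly like the sketch below. Everything here is an assumption made for illustration: `ToyReplica`, `Coordinator`, and the rollback strategy (restoring the previously stored value) are not part of Elliptics.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyReplica:
    """Hypothetical replica; `fail=True` makes every write report failure."""

    def __init__(self, fail=False):
        self.data = {}
        self.fail = fail

    def write(self, key, value):
        if self.fail:
            return False
        self.data[key] = value
        return True

    def read(self, key):
        return self.data.get(key)

    def restore(self, key, old):
        # Put back the previous value, or remove the key if there was none.
        if old is None:
            self.data.pop(key, None)
        else:
            self.data[key] = old


class Coordinator:
    """Hypothetical central entry point: one lock guards every operation."""

    def __init__(self, replicas):
        self._replicas = replicas
        self._lock = threading.Lock()

    def write(self, key, value):
        with self._lock:  # held until all completion statuses are received
            previous = [r.read(key) for r in self._replicas]
            with ThreadPoolExecutor(max_workers=len(self._replicas)) as pool:
                statuses = list(
                    pool.map(lambda r: r.write(key, value), self._replicas)
                )
            if all(statuses):
                return True
            # At least one replica failed: roll back the ones that succeeded.
            for replica, ok, old in zip(self._replicas, statuses, previous):
                if ok:
                    replica.restore(key, old)
            return False

    def read(self, key):
        with self._lock:  # readers never observe a half-applied write
            return self._replicas[0].read(key)
```

The price is that every read and write goes through one lock in one place, which is exactly the serialization Elliptics avoids by default.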
> 2. What kind of join algorithm does Elliptics support? Hash Join/Semi Join or anything else?
There is no join support in Elliptics, since it is not table-oriented and the only guaranteed atomicity level is per object within a single replica.
This decision was made for performance and scalability reasons.
> 3. At the very beginning of the Elliptics homepage, it says "With default key generation policy it implements hash table object storage." Could you specify which key generation policy you are using?
By default, the key in Elliptics is the SHA-512 hash of the string you use as a key (this is required to implement very fast O(1) on-disk lookup). This can be overridden, although in practice this is not used.
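As a rough illustration of the default policy, here is how such a key can be derived with Python's standard `hashlib`. The function name `default_key` and the UTF-8 encoding of the string are assumptions for this sketch, not something taken from the Elliptics code.

```python
import hashlib


def default_key(name: str) -> bytes:
    """Return the SHA-512 digest of a user-supplied key string.

    Assumption: the string is encoded as UTF-8 before hashing.
    """
    return hashlib.sha512(name.encode("utf-8")).digest()


# The digest has a fixed size (64 bytes) regardless of the input length,
# so it can serve directly as a hash-table key.
index = {default_key("photos/cat.jpg"): "object payload"}
found = index[default_key("photos/cat.jpg")]
```

The same string always maps to the same 64-byte key, which is what makes a direct hash-table lookup possible.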
> Thanks a lot if you are willing to help me. And I would be glad if we can use email to communicate in the future!
Sure.
Thank you!