At http://blog.basho.com/2010/04/05/why-vector-clocks-are-hard/
they say that vector clocks have to identify time passing on the client, not on the server; otherwise two clients can update concurrently, get the same clock, and silently lose data. That web page has a fine example of this.
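To make the concern concrete, here is a minimal sketch of the failure mode as I understand it from that page. ToyClock and LostUpdateDemo are hypothetical names invented for this illustration; they are not Voldemort's VectorClock or ClockEntry classes, and the string ids are an assumption made for readability.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy vector clock: maps an id to a counter. Hypothetical, not Voldemort's class. */
final class ToyClock {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    private final Map<String, Integer> counters = new HashMap<>();

    /** Returns a copy of this clock with the counter for the given id incremented. */
    ToyClock incremented(String id) {
        ToyClock copy = new ToyClock();
        copy.counters.putAll(counters);
        copy.counters.merge(id, 1, Integer::sum);
        return copy;
    }

    /** Compares this clock to the other one entry by entry (missing entries count as 0). */
    Order compare(ToyClock other) {
        boolean less = false;
        boolean greater = false;
        Set<String> ids = new HashSet<>(counters.keySet());
        ids.addAll(other.counters.keySet());
        for (String id : ids) {
            int a = counters.getOrDefault(id, 0);
            int b = other.counters.getOrDefault(id, 0);
            if (a < b) less = true;
            if (a > b) greater = true;
        }
        if (less && greater) return Order.CONCURRENT;
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.EQUAL;
    }
}

final class LostUpdateDemo {
    public static void main(String[] args) {
        // The version both clients read before updating.
        ToyClock read = new ToyClock().incremented("server1");

        // Ids name the server: both concurrent writes get the same clock,
        // so the conflict cannot be detected afterwards.
        ToyClock w1 = read.incremented("server1");
        ToyClock w2 = read.incremented("server1");
        System.out.println(w1.compare(w2)); // prints EQUAL

        // Ids name the client: the two writes are recognizably concurrent.
        ToyClock c1 = read.incremented("clientA");
        ToyClock c2 = read.incremented("clientB");
        System.out.println(c1.compare(c2)); // prints CONCURRENT
    }
}
```

With server ids the two writes are indistinguishable, so one of them can replace the other without anyone noticing; with client ids the store can at least tell that a conflict happened.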
Voldemort seems to have vector clocks that identify time passing on the server, at least in the old version I'm reading. src/voldemort/versioning/ClockEntry.java has a field "nodeId", and /voldemort/src/voldemort/cluster/Node.java seems to identify servers and have an "id" field. The former is short and the latter is int, but nevertheless I sincerely hope that nodeId is the id of a Node.
So what's going on here? Is the example wrong in some way I don't presently see, does Voldemort occasionally silently lose data when there are concurrent updates, or are the field names misleading and Voldemort clients have nodeIds?
Tim Freeman
Email: tim.f...@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536
A client supplies a vector clock when writing to the server. If clients
a and b connect to the server and write a concurrent vector clock, one
of the writes will be rejected. If you want to make sure that all
clients are able to perform a read-modify-write loop, you can use the
applyUpdate() method (as an "optimistic lock").
The nodeId in the vector clock does refer to the server, but the
coordination is done entirely by the client.
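The optimistic-lock pattern described above can be sketched roughly as follows. This is a toy stand-in, not Voldemort's API: ToyStore, Versioned, and ObsoleteVersion are hypothetical names, a single integer version replaces the vector clock, and the applyUpdate here only imitates the retry idea.

```java
import java.util.function.IntUnaryOperator;

/** Toy single-key store that rejects writes based on a stale version. Hypothetical. */
final class ToyStore {
    /** A value together with the version it was written at. */
    static final class Versioned {
        final int version;
        final int value;
        Versioned(int version, int value) { this.version = version; this.value = value; }
    }

    /** Thrown when a write was based on a version that is no longer current. */
    static final class ObsoleteVersion extends RuntimeException {}

    private Versioned current = new Versioned(0, 0);

    synchronized Versioned get() { return current; }

    /** Accepts the write only if it was based on the latest version. */
    synchronized void put(int basedOnVersion, int newValue) {
        if (basedOnVersion != current.version) throw new ObsoleteVersion();
        current = new Versioned(basedOnVersion + 1, newValue);
    }

    /** Read-modify-write loop: re-reads and retries when the write is rejected. */
    int applyUpdate(IntUnaryOperator update, int maxTries) {
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            Versioned seen = get();
            try {
                put(seen.version, update.applyAsInt(seen.value));
                return attempt; // number of attempts that were needed
            } catch (ObsoleteVersion e) {
                // Another client wrote first; loop to re-read and try again.
            }
        }
        throw new ObsoleteVersion();
    }
}

final class OptimisticLockDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        ToyStore.Versioned seenByA = store.get(); // client a reads version 0
        store.put(store.get().version, 10);       // client b writes first
        try {
            store.put(seenByA.version, 20);       // client a writes with a stale version
        } catch (ToyStore.ObsoleteVersion e) {
            System.out.println("rejected");       // prints rejected
        }
        store.applyUpdate(v -> v + 1, 3);         // client a retries via the loop
        System.out.println(store.get().value);    // prints 11
    }
}
```

The point of the loop is that a rejected write is never dropped silently: the client sees the rejection, re-reads the current value, and reapplies its change.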
Thanks,
- Alex
Just to check my understanding: I believe that if the table has a nontrivial InconsistencyResolver, using applyUpdate will use a loop to do something that could be done in one try, with no waiting, if the clocks were on the clients instead of the servers. Right?
Putting the clocks on the clients complicates things a lot, since you then have to identify the clients; the existing scheme that uses shorts for nodeIds in ClockEntry objects wouldn't work. I'm not sure anyone uses nontrivial InconsistencyResolvers in practice anyway, so the tradeoff you made there is certainly reasonable, if I have understood it correctly.
Tim Freeman